BOORU CHARS dataset 2025 (danbooru gelbooru e621 zerochan) 2560/1920/2480 px with metadata
Category: Anime
Total size: 439.70 GB
Added: 6 days ago (2025-08-10 03:50:01)
Share ratio: 6 seeders, 17 leechers
Info Hash: 4a235ac1381a9485a409423aae8557cd1896b768
Last updated: 4 hours ago (2025-08-16 03:25:15)
Description:
This is the next volume after BOORU CHARS 2024: an image stream collected from several imageboards,
based on danbooru (safe+questionable, ID 8200000…9100000 = 24.09.2024…23.04.2025),
with added “the best of” furry-related e621 and loli-enabled gelbooru for nearly the same interval,
and also unique zerochan content for ID 3960000…4430000 = 13.06.2023…06.03.2025.
As usual:
images initially filtered by Mpixels>=0.48, shorter_side>=600 px, file size>=60000 bytes, no animations
strip-like images (very tall or very wide) dropped or cropped to aspect ratio 0.4…2.1
PNG/WEBP/AVIF converted to JPG using cjpegli at 96% quality (2000000 bytes limit)
modest downsampling done to a longer side of 2560 px (landscape), 1920 px (1x1) or 2480 px (portrait)
verbose file naming used "%website% - %id% - %up_to_3_copyrights% ~ %up_to_5_characters% (%up_to_2_artists%).jpg"
files uniquely identified by “%website%+%id%”
some general image statistics obtained with ExifTool and ImageMagick
content analysis was mostly the same as in BC2023, with up-to-date software and models
the CRAFT text detector was used to estimate the total size and number of text pieces
torso components detected with a custom PyTorch model built on Ultralytics YOLOv11
clustering and within-cluster sorting implemented to place compositionally and visually similar pictures together
see the “readme” for details
images deduplicated using AntiDupl up to 3-4% similarity, together with BOORU CHARS 2024, 2023 and 2022
semi-automated quality check done as follows:
real-life photos, character-free landscapes, food and macro shots thrown away
most comics and N-koma, text-heavy images and line art filtered out
overly “questionable” images (uncensored nipples or vulva, obvious adult acts) excluded
some background crops, gamma correction, rotation, denoising and other nontrivial improvements applied
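The numeric rules from the list above (initial filter, aspect ratio window, downsampling caps) can be sketched in Python as below. This is only an illustration, not the actual pipeline of the release: it assumes 1 Mpixel = 1,000,000 pixels, aspect ratio = width / height, "1x1" = exactly square, and no upscaling. All function and constant names are hypothetical.

```python
# Thresholds quoted from the description; everything else is an assumption.
MIN_PIXELS = 480_000           # 0.48 Mpixels, assuming 1 Mpixel = 1,000,000 px
MIN_SHORT_SIDE = 600           # px
MIN_BYTES = 60_000             # file size in bytes
MIN_ASPECT, MAX_ASPECT = 0.4, 2.1
LONG_SIDE_CAP = {"landscape": 2560, "square": 1920, "portrait": 2480}


def passes_initial_filter(width, height, size_bytes, animated=False):
    """Return True if an image survives the initial size filter."""
    if animated:
        return False
    return (
        width * height >= MIN_PIXELS
        and min(width, height) >= MIN_SHORT_SIDE
        and size_bytes >= MIN_BYTES
    )


def aspect_ok(width, height):
    """Check the aspect ratio window (assumed to be width / height)."""
    return MIN_ASPECT <= width / height <= MAX_ASPECT


def target_size(width, height):
    """Downsample so the longer side fits the per-orientation cap.

    Assumption: images already within the cap are left untouched.
    """
    if width > height:
        cap = LONG_SIDE_CAP["landscape"]
    elif width < height:
        cap = LONG_SIDE_CAP["portrait"]
    else:
        cap = LONG_SIDE_CAP["square"]
    longer = max(width, height)
    if longer <= cap:
        return width, height
    scale = cap / longer
    return round(width * scale), round(height * scale)
```

For example, under these assumptions an 800x600 image of 60000 bytes just passes the initial filter, and a 5120x2880 landscape image would be reduced to 2560x1440.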
Besides images, the release contains tab-separated text files:
BC_2025.tsv file/image-related metadata, 896,142 rows
BC_2025_tags.tsv tags list with enrichment
BC_2025_yolo.tsv detailed results for torso components detection
BC_2025_yolov11m_aa22.pt PyTorch YOLOv11 model
and also an additional “readme”.
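For batch processing, the file naming pattern and the “%website%+%id%” key can be used roughly as in the Python sketch below. It is an assumption-laden illustration: the regular expression only follows the pattern quoted above and does not cover artist names containing parentheses or files with missing fields, the example file name is made up, and the TSV reader assumes UTF-8 text with a header row (check the “readme” for the real column layout). The names parse_name and iter_tsv are hypothetical.

```python
import csv
import re
from typing import Iterator, Optional, TextIO

# Follows "%website% - %id% - %copyrights% ~ %characters% (%artists%).jpg".
# Assumption: the artist group is the last parenthesized part of the name.
NAME_RE = re.compile(
    r"^(?P<website>\S+) - (?P<id>\d+) - (?P<copyrights>.*?) ~ "
    r"(?P<characters>.*) \((?P<artists>[^()]*)\)\.jpg$"
)


def parse_name(filename: str) -> Optional[dict]:
    """Split a file name into its fields and add the website+id key."""
    match = NAME_RE.match(filename)
    if match is None:
        return None  # name does not follow the assumed pattern
    fields = match.groupdict()
    fields["key"] = f"{fields['website']}+{fields['id']}"
    return fields


def iter_tsv(handle: TextIO) -> Iterator[dict]:
    """Yield rows of a tab-separated file as dicts (header row assumed).

    QUOTE_NONE keeps quote characters inside tags as literal text.
    """
    return csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


if __name__ == "__main__":
    # Made-up file name, for illustration only.
    print(parse_name("danbooru - 8200001 - vocaloid ~ hatsune miku (some artist).jpg"))
    # Reading the metadata (path and encoding are assumptions):
    # with open("BC_2025.tsv", encoding="utf-8", newline="") as f:
    #     for row in iter_tsv(f):
    #         ...
```

The character and copyright lists are left as raw strings here, since the separator used inside them is not stated in this description.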
Keep in mind that this release is, first of all,
a dataset of character-centric art in an efficient local format suited for batch processing,
and then
a representative catalog of anime/game/cartoon copyrights, characters and artists for visual evaluation,
but
it does not offer high image resolution and does not claim to be complete.
NOTE: content is a little more NSFW than in its predecessors. Such themes weren’t allowed before.