CrowdHuman
CrowdHuman is a benchmark dataset for evaluating detector performance in crowded scenes. It is large‑scale, richly annotated, and highly diverse, comprising training, validation, and test sets with a total of 470,000 annotated human instances (average 23 persons per image) covering various occlusion conditions. Each person instance is annotated with a head bounding box, a visible region box, and a full‑body box.
Description
Dataset Overview
Dataset Name
CrowdHuman
Dataset Purpose
Used to evaluate detector performance in crowded scenes.
Dataset Scale
- Training set: 15,000 images
- Validation set: 4,370 images
- Test set: 5,000 images
- Total human instances: 470K (training and validation sets)
- Average persons per image: 23
Dataset Characteristics
- Includes various occlusion situations
- Each person instance is annotated with a head box, visible box, and full‑body box
Dataset Structure
Annotation Format
- File format: `.odgt`, where each line is a JSON object containing all annotations for the corresponding image.
- JSON structure:

```
JSON{
    "ID": image_filename,
    "gtboxes": [gtbox],
}
gtbox{
    "tag": "person" or "mask",
    "vbox": [x, y, w, h],
    "fbox": [x, y, w, h],
    "hbox": [x, y, w, h],
    "extra": extra,
    "head_attr": head_attr,
}
extra{
    "ignore": 0 or 1,
    "box_id": int,
    "occ": int,
}
head_attr{
    "ignore": 0 or 1,
    "unsure": int,
    "occ": int,
}
```

- Annotation notes: a `tag` of `mask` indicates the box covers crowd/reflective/people-like objects and should be ignored (its `extra` has `ignore=1`); `vbox`, `fbox`, and `hbox` are the visible, full-body, and head boxes, respectively.
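The line-per-JSON format above can be parsed with a short helper. The sketch below is illustrative (the function names `load_odgt` and `person_fboxes` are ours, not part of the dataset tooling); it filters out ignored and `mask` boxes as described in the notes:

```python
import json

def load_odgt(path):
    """Parse a CrowdHuman .odgt file: one JSON record per line."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

def person_fboxes(record):
    """Collect full-body boxes of real persons, skipping 'mask' boxes
    and boxes whose extra.ignore flag is set."""
    boxes = []
    for gt in record.get("gtboxes", []):
        if gt.get("tag") != "person":
            continue
        if gt.get("extra", {}).get("ignore", 0) == 1:
            continue
        boxes.append(gt["fbox"])
    return boxes
```
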
Download Links
- Training set: CrowdHuman_train01.zip
- Training set: CrowdHuman_train02.zip
- Training set: CrowdHuman_train03.zip
- Validation set: CrowdHuman_val.zip
- Training annotations: annotation_train.odgt
- Validation annotations: annotation_val.odgt
- Test set: CrowdHuman_test.zip
Usage Guide
Annotation Conversion
- Use the script `crowdhuman2coco.py` to convert CrowdHuman annotations to COCO format.
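The conversion is conceptually a reshaping of per-line odgt records into COCO's `images`/`annotations`/`categories` dictionaries. The sketch below is a simplified illustration, not the actual `crowdhuman2coco.py` (which may handle image sizes, the head/visible boxes, and ignore regions differently); it maps the full-body box to COCO's `bbox` and marks ignored boxes via `iscrowd`:

```python
import json

def odgt_to_coco(odgt_path, out_path):
    """Simplified odgt -> COCO conversion sketch using the fbox only."""
    images, annotations = [], []
    ann_id = 1
    with open(odgt_path) as f:
        for img_id, line in enumerate(f, start=1):
            rec = json.loads(line)
            images.append({"id": img_id, "file_name": rec["ID"] + ".jpg"})
            for gt in rec.get("gtboxes", []):
                x, y, w, h = gt["fbox"]
                # Treat 'mask' boxes and extra.ignore==1 boxes as crowd regions.
                ignored = (gt.get("tag") != "person"
                           or gt.get("extra", {}).get("ignore", 0) == 1)
                annotations.append({
                    "id": ann_id, "image_id": img_id, "category_id": 1,
                    "bbox": [x, y, w, h], "area": w * h,
                    "iscrowd": 1 if ignored else 0,
                })
                ann_id += 1
    coco = {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "person"}]}
    with open(out_path, "w") as f:
        json.dump(coco, f)
    return coco
```
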
Dataset Classes
- Simple PyTorch and MegEngine implementations are provided for reading the CrowdHuman dataset.
- Supported return order:

```python
class CrowdHuman(VisionDataset):
    supported_order = (
        "image",
        "boxes",
        "vboxes",
        "hboxes",
        "boxes_category",
        "info",
    )
```
- Example usage:

```python
crowdhuman_dataset = CrowdHuman(
    root="path/to/CrowdHuman",
    ann_file="path/to/annotations.json",
    remove_images_without_annotations=True,
    order=["image", "boxes", "boxes_category", "info"],
)
```
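To illustrate how an order-configurable reader like the above can work, here is a minimal framework-free sketch (this is our own illustration, not the provided PyTorch/MegEngine implementation; the class name `CrowdHumanSketch` is hypothetical and image loading is stubbed out):

```python
import json

class CrowdHumanSketch:
    """Minimal sketch of an order-configurable CrowdHuman reader.
    Each item is a tuple of the requested fields, in order."""

    SUPPORTED_ORDER = ("image", "boxes", "vboxes", "hboxes", "info")

    def __init__(self, ann_file, order=("boxes", "info")):
        assert all(k in self.SUPPORTED_ORDER for k in order)
        self.order = order
        with open(ann_file) as f:
            self.records = [json.loads(line) for line in f if line.strip()]

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        rec = self.records[idx]
        # Keep only non-ignored person boxes.
        persons = [gt for gt in rec.get("gtboxes", [])
                   if gt.get("tag") == "person"
                   and not gt.get("extra", {}).get("ignore", 0)]
        fields = {
            "image": None,  # stub: load the image named by rec["ID"] here
            "boxes": [gt["fbox"] for gt in persons],
            "vboxes": [gt["vbox"] for gt in persons],
            "hboxes": [gt["hbox"] for gt in persons],
            "info": rec["ID"],
        }
        return tuple(fields[k] for k in self.order)
```
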
Source
Organization: github
Created: 12/5/2021