CrowdHuman

CrowdHuman is a benchmark dataset for evaluating detector performance in crowded scenes. It is large-scale, richly annotated, and highly diverse, and is split into training, validation, and test sets. The training and validation sets together contain about 470,000 annotated human instances (23 persons per image on average) covering a variety of occlusion conditions. Each person instance is annotated with a head bounding box, a visible-region bounding box, and a full-body bounding box.

Updated 12/5/2021

Description

Dataset Overview

Dataset Name

CrowdHuman

Dataset Purpose

Used to evaluate detector performance in crowded scenes.

Dataset Scale

  • Training set: 15,000 images
  • Validation set: 4,370 images
  • Test set: 5,000 images
  • Total human instances: 470K (training and validation sets)
  • Average persons per image: 23

Dataset Characteristics

  • Includes various occlusion situations
  • Each person instance is annotated with a head box, visible box, and full‑body box

Dataset Structure

Annotation Format

  • File format: odgt. Each line is a JSON object containing all annotations for the corresponding image.

  • JSON structure:

        JSON{
            "ID" : image_filename,
            "gtboxes" : [gtbox],
        }

        gtbox{
            "tag" : "person" or "mask",
            "vbox": [x, y, w, h],
            "fbox": [x, y, w, h],
            "hbox": [x, y, w, h],
            "extra" : extra,
            "head_attr" : head_attr,
        }

        extra{
            "ignore": 0 or 1,
            "box_id": int,
            "occ": int,
        }

        head_attr{
            "ignore": 0 or 1,
            "unsure": int,
            "occ": int,
        }

  • Annotation notes:

    • A tag of "mask" means the box covers a crowd, a reflection, or a person-like object and should be ignored (the "ignore" field in extra is 1).
    • vbox, fbox, and hbox are the visible, full-body, and head boxes, respectively, each given as [x, y, w, h].
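
The format described above can be read with the standard json module. The following is a minimal sketch, not the dataset's official loader: the helper names parse_odgt and usable_full_boxes and the sample record are made up for illustration, and the sketch assumes a missing "ignore" key can be treated as 0.

```python
import json


def parse_odgt(lines):
    """Parse odgt content: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def usable_full_boxes(record):
    """Return the fbox of every box tagged "person" that is not marked ignore."""
    boxes = []
    for gt in record.get("gtboxes", []):
        if gt.get("tag") != "person":
            continue  # "mask" boxes are meant to be ignored
        if gt.get("extra", {}).get("ignore", 0) == 1:
            continue  # assumption: a missing "ignore" key means 0
        boxes.append(gt["fbox"])
    return boxes


# Made-up sample line in the documented structure (not real dataset content).
sample_lines = [
    json.dumps({
        "ID": "img_0001",
        "gtboxes": [
            {
                "tag": "person",
                "vbox": [10, 20, 30, 60],
                "fbox": [8, 18, 36, 80],
                "hbox": [15, 20, 12, 14],
                "extra": {"box_id": 0, "occ": 1},
                "head_attr": {"ignore": 0, "unsure": 0, "occ": 0},
            },
            {
                "tag": "mask",
                "vbox": [0, 0, 5, 5],
                "fbox": [0, 0, 5, 5],
                "hbox": [0, 0, 5, 5],
                "extra": {"ignore": 1},
                "head_attr": {},
            },
        ],
    })
]

records = parse_odgt(sample_lines)
full_boxes = usable_full_boxes(records[0])
```

For a real annotation file, the lines of the opened odgt file can be passed to parse_odgt in place of sample_lines.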

Usage Guide

Annotation Conversion

  • Use script crowdhuman2coco.py to convert CrowdHuman annotations to COCO format.
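
To illustrate what such a conversion involves, here is a rough sketch of mapping one CrowdHuman record to COCO-style dictionaries. It is not the code of crowdhuman2coco.py: the function name record_to_coco, the choice of fbox as the COCO bbox, the single "person" category id, and the mapping of ignored boxes to iscrowd=1 are all assumptions made for illustration. Image width and height are not part of the odgt record, so they are passed in by the caller.

```python
def record_to_coco(record, image_id, width, height, first_ann_id=1):
    """Build a COCO-style image entry and annotation list from one record.

    Hypothetical helper: uses fbox as the bbox and flags ignored boxes
    with iscrowd=1. The actual script may make different choices.
    """
    image = {
        "id": image_id,
        # The ID is used as-is; a file extension may need to be appended.
        "file_name": record["ID"],
        "width": width,
        "height": height,
    }
    annotations = []
    for offset, gt in enumerate(record.get("gtboxes", [])):
        x, y, w, h = gt["fbox"]  # already [x, y, w, h], as COCO expects
        ignored = gt.get("tag") == "mask" or gt.get("extra", {}).get("ignore", 0) == 1
        annotations.append({
            "id": first_ann_id + offset,
            "image_id": image_id,
            "category_id": 1,  # assumption: a single "person" category
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 1 if ignored else 0,
        })
    return image, annotations


# Made-up record in the documented structure (not real dataset content).
example_record = {
    "ID": "img_0001",
    "gtboxes": [
        {"tag": "person", "fbox": [8, 18, 36, 80], "extra": {"box_id": 0}},
        {"tag": "mask", "fbox": [0, 0, 5, 5], "extra": {"ignore": 1}},
    ],
}
coco_image, coco_annotations = record_to_coco(example_record, image_id=1, width=640, height=480)
```

A full COCO file would also need top-level "images", "annotations", and "categories" lists collected across all records.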

Dataset Classes

  • Simple PyTorch and MegEngine implementations are provided for reading the CrowdHuman dataset.

  • Supported return order:

        class CrowdHuman(VisionDataset):
            supported_order = (
                "image",
                "boxes",
                "vboxes",
                "hboxes",
                "boxes_category",
                "info",
            )

  • Example usage:

        crowdhuman_dataset = CrowdHuman(
            root="path/to/CrowdHuman",
            ann_file="path/to/annotations.json",
            remove_images_without_annotations=True,
            order=["image", "boxes", "boxes_category", "info"],
        )

Topics

Crowd Detection
Benchmark Dataset

Source

Organization: github

Created: 12/5/2021
