student/FFHQ
The FFHQ (Flickr‑Faces‑HQ) dataset comprises 70,000 high‑quality PNG images at 1024 × 1024 resolution, featuring diverse ages, ethnicities, backgrounds, and accessories (glasses, hats, etc.). Images were sourced from Flickr under permissive licenses, automatically aligned and cropped using dlib, and filtered to remove non‑photos. The dataset supports research in generative adversarial networks and related fields.
Dataset Overview
Name: Flickr‑Faces‑HQ Dataset (FFHQ)
Description: FFHQ is a high‑quality human‑face image dataset containing 70,000 PNG images at 1024 × 1024 resolution. It exhibits substantial variation in age, ethnicity, and background, and includes accessories such as glasses, sunglasses, and hats. Images were crawled from Flickr, automatically aligned and cropped using dlib, and only images under permissive licenses were collected.
Features:
- Number of Images: 70,000
- Resolution: 1024 × 1024
- Format: PNG
- Diversity: Varying ages, ethnicities, backgrounds, and accessories
License:
- Individual images are released under various licenses (CC BY 2.0, CC BY‑NC 2.0, Public Domain Mark 1.0, CC0 1.0, U.S. Government Works). These allow free use, redistribution, and adaptation for non‑commercial purposes, with appropriate attribution and indication of changes where required.
- The dataset itself (metadata, download script, documentation) is provided under CC BY‑NC‑SA 4.0 by NVIDIA Corporation.
Data Structure:
- Main Folder: ffhq-dataset (2.56 TB, 210,014 files)
- Metadata: ffhq-dataset-v1.json (254 MB)
- Images: images1024x1024 (89.1 GB, 70,000 PNG files)
- Thumbnails: thumbnails128x128 (1.95 GB, 70,000 PNG files)
- Raw Images: in-the-wild-images (955 GB, 70,000 PNG files)
- TFRecords: tfrecords (273 GB, 9 files)
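As a quick consistency check on the listing above, the itemized sizes and file counts can be tallied and compared with the main-folder totals. This is a minimal sketch: the figures are copied from the list, and decimal units (1 TB = 1000 GB, 1 GB = 1000 MB) are an assumption, since the listing does not say which convention it uses.

```python
# Sizes (in GB) and file counts copied from the listing above.
# Assumption: decimal units (1 TB = 1000 GB, 1 GB = 1000 MB).
LISTED = {
    "ffhq-dataset-v1.json": (0.254, 1),
    "images1024x1024": (89.1, 70_000),
    "thumbnails128x128": (1.95, 70_000),
    "in-the-wild-images": (955.0, 70_000),
    "tfrecords": (273.0, 9),
}
TOTAL_GB = 2560.0       # main folder: 2.56 TB
TOTAL_FILES = 210_014   # main folder: 210,014 files


def tally(listed):
    """Return (total size in GB, total file count) of the itemized entries."""
    size_gb = sum(size for size, _ in listed.values())
    files = sum(count for _, count in listed.values())
    return size_gb, files


listed_gb, listed_files = tally(LISTED)
print(f"itemized:     {listed_gb:.1f} GB in {listed_files:,} files")
print(f"not itemized: {TOTAL_GB - listed_gb:.1f} GB "
      f"in {TOTAL_FILES - listed_files} files")
```

Under these assumptions the itemized entries cover 210,010 of the 210,014 files and roughly 1.32 TB of the 2.56 TB, so the main folder holds some content that this listing does not itemize.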
Download & Usage:
- Data can be downloaded directly from Google Drive or via the provided download_ffhq.py script, which handles verification, retries, and parallel downloading.
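The verify-and-retry pattern mentioned above can be sketched as follows. This does not reproduce the logic of download_ffhq.py: `fetch_verified`, `md5_of`, the `fetch` callback, and the choice of MD5 as the checksum are all hypothetical and used only for illustration.

```python
import hashlib
import time


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_verified(fetch, dest, expected_md5, attempts=3, delay=1.0):
    """Call fetch(dest) until the file at dest matches expected_md5.

    `fetch` is a caller-supplied function (hypothetical) that writes the
    file to `dest`. Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        try:
            fetch(dest)
            if md5_of(dest) == expected_md5:
                return attempt
        except OSError:
            pass  # treat I/O or network errors as a failed attempt
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(f"could not fetch a verified copy of {dest}")
```

Passing the transfer step in as a callback keeps the checksum and retry logic independent of how the file is actually fetched.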
Training & Validation Split:
- The first 60,000 images are designated for training; the remaining 10,000 are for validation.
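The split above can be expressed as index ranges. This is a minimal sketch that assumes images are indexed from 0 to 69,999 in dataset order; the names `split_of`, `train_ids`, and `val_ids` are illustrative and not part of the dataset tooling.

```python
NUM_IMAGES = 70_000
NUM_TRAIN = 60_000  # first 60,000 images: training; remaining 10,000: validation

# Assumption: zero-based image indices in dataset order.
train_ids = range(0, NUM_TRAIN)
val_ids = range(NUM_TRAIN, NUM_IMAGES)


def split_of(index):
    """Return which split a zero-based image index belongs to."""
    if not 0 <= index < NUM_IMAGES:
        raise ValueError(f"index {index} is outside 0..{NUM_IMAGES - 1}")
    return "training" if index < NUM_TRAIN else "validation"


print(len(train_ids), len(val_ids))
print(split_of(59_999), split_of(60_000))
```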
Metadata Details:
- Each image entry includes original Flickr information, aligned image details, thumbnail information, and raw image data, all recorded in ffhq-dataset-v1.json.
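To see which fields an entry actually carries, the metadata file can be loaded and its first entry inspected. This is a minimal sketch that avoids hard-coding field names, because the exact schema is not given here. It assumes only that the file holds a JSON object or array with one entry per image; `first_entry_fields` is a hypothetical helper.

```python
import json


def first_entry_fields(path):
    """Return (number of entries, sorted field names of the first entry).

    Assumption: the top level is either an object keyed by image index
    or an array, with one entry per image.
    """
    with open(path, "r", encoding="utf-8") as f:
        metadata = json.load(f)
    entries = metadata.values() if isinstance(metadata, dict) else metadata
    first = next(iter(entries))
    fields = sorted(first) if isinstance(first, dict) else []
    return len(metadata), fields


# Usage, assuming the file sits in the working directory:
# count, fields = first_entry_fields("ffhq-dataset-v1.json")
# print(count, fields)
```

At 254 MB the file can be parsed in one pass on most machines, although the resulting Python objects will occupy noticeably more memory than the file itself.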
Acknowledgements:
- Thanks to contributors and researchers who assisted with data collection, alignment, and release.
Contact:
- For business inquiries: researchinquiries@nvidia.com
- For press inquiries: hmarinez@nvidia.com