Flickr-Faces-HQ (FFHQ)
Flickr‑Faces‑HQ (FFHQ) is a high‑quality face image dataset originally created as a benchmark for Generative Adversarial Networks (GANs). It contains 70,000 PNG images at a resolution of 1024×1024, with considerable variation in age, race, and background, as well as accessories such as glasses, sunglasses, and hats. The images were crawled from Flickr and therefore inherit that site's biases; they were automatically aligned and cropped using dlib. Only images with appropriate licenses were collected. Various automatic filters were applied, and Amazon Mechanical Turk was then used to remove the occasional statue, painting, or other non‑photographic content.
Description
Dataset Overview
Name: Flickr‑Faces‑HQ Dataset (FFHQ)
Description: FFHQ is a high‑quality face image dataset containing 70,000 PNG images at 1024×1024 resolution. The dataset exhibits significant diversity in age, race, and background, and includes accessories such as glasses, sunglasses, and hats. Images are sourced from Flickr and have been automatically aligned and cropped.
Usage: Primarily intended for research on Generative Adversarial Networks (GANs); not for development or improvement of facial recognition technologies.
Dataset Content
- Number of Images: 70,000
- Image Format: PNG
- Resolution: 1024×1024
- Dataset Size: 2.56 TB
Dataset Structure
- Root Folder: ffhq-dataset
- Subfolders and Contents:
  - ffhq-dataset-v2.json: Metadata (including copyright information, URLs, etc.) – 255 MB
  - images1024x1024: Aligned and cropped 1024×1024 images – 89.1 GB
  - thumbnails128x128: 128×128 thumbnails – 1.95 GB
  - in-the-wild-images: Original Flickr images – 955 GB
  - tfrecords: Multi‑resolution data for StyleGAN and StyleGAN2 – 273 GB
  - zips: ZIP archives of each folder's contents – 1.28 TB
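As a quick consistency check, the per-folder sizes listed above can be summed and compared with the stated total of 2.56 TB. The sketch below assumes the listing uses binary prefixes (1 TB = 1024 GB, 1 GB = 1024 MB); that convention is not stated in the source, and the names `SIZES_GB` and `total_size_tb` are made up for illustration.

```python
# Folder sizes as listed above, expressed in GB.
# Assumption (not stated in the source): binary prefixes,
# i.e. 1 TB = 1024 GB and 1 GB = 1024 MB.
SIZES_GB = {
    "ffhq-dataset-v2.json": 255 / 1024,  # 255 MB
    "images1024x1024": 89.1,
    "thumbnails128x128": 1.95,
    "in-the-wild-images": 955.0,
    "tfrecords": 273.0,
    "zips": 1.28 * 1024,                 # 1.28 TB
}


def total_size_tb(sizes_gb):
    """Sum per-folder sizes (in GB) and convert the result to TB."""
    return sum(sizes_gb.values()) / 1024


if __name__ == "__main__":
    print(f"Total: {total_size_tb(SIZES_GB):.2f} TB")
```

Under this assumption the sum lands within about 0.01 TB of the stated 2.56 TB, which is what one would expect given that each listed size is itself rounded. The zips folder alone accounts for roughly half of the total, in line with its description as archives of the other folders' contents.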
Dataset Usage
- Download Script: The download_ffhq.py script is provided for automated download and verification.
- Training & Validation: The first 60,000 images are used for training; the remaining 10,000 for validation.
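The training/validation split described above can be sketched in a few lines. This is a minimal illustration, not the dataset's own tooling: it assumes images are identified by consecutive integer indices from 0 to 69,999, and both the helper `split_ffhq` and the zero-padded file naming in `image_filename` are hypothetical.

```python
def split_ffhq(image_ids, n_train=60_000):
    """Split ids into (train, validation): the first n_train ids in
    sorted order go to training, the remainder to validation."""
    ordered = sorted(image_ids)
    return ordered[:n_train], ordered[n_train:]


def image_filename(image_id):
    """Hypothetical naming scheme (zero-padded index); adjust this to
    match the file names in your local copy of the dataset."""
    return f"{image_id:05d}.png"


# Assumed id range: 0..69999 (70,000 images in total).
train_ids, val_ids = split_ffhq(range(70_000))

if __name__ == "__main__":
    print(len(train_ids), "training images,", len(val_ids), "validation images")
    print("First validation file (hypothetical name):", image_filename(val_ids[0]))
```

Sorting before slicing keeps the split deterministic even if the ids arrive in arbitrary order, for example from a directory listing.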
Copyright & License
- Image Licenses: Various Creative Commons licenses that permit free use, redistribution, and adaptation, with some requiring attribution and indication of changes.
- Dataset License: Released by NVIDIA Corporation under Creative Commons BY‑NC‑SA 4.0, allowing non‑commercial use, redistribution, and adaptation provided the original paper is cited and changes are noted; derivative works must use the same license.
Privacy Protection
- The dataset only includes photos whose authors have explicitly permitted free use and redistribution.
- Mechanisms are provided for users to check whether their photos are included and request removal.
Source
Organization: github
Created: 2/4/2019