Synthetic Faces High Quality (SFHQ) dataset
The dataset comprises approximately 425,000 carefully selected high‑quality synthetic face images at 1024 × 1024 resolution. The images were generated by transforming inspiration images such as paintings, sketches, 3D model renders, and text‑to‑image generator outputs into realistic faces. The dataset also includes facial landmarks (an extended set of 110 points) and semantic segmentation masks for face parsing.
Updated 12/20/2022
github
Description
Dataset Overview
Dataset Name
- Synthetic Faces High Quality (SFHQ) dataset
Dataset Composition
- Total Images: ~425,000
- Resolution: 1024 × 1024
- Four Parts:
  - Part 1: 89,785 images inspired by the Artstation‑Artistic‑face‑HQ (AAHQ), Close‑Up Humans, and UIBVFED datasets.
  - Part 2: 91,361 images inspired by the Face Synthetics dataset and by Stable Diffusion v1.4 outputs.
  - Part 3: 118,358 images generated via the StyleGAN2 mapping network.
  - Part 4: 125,754 images generated via Stable Diffusion v2.1.
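As a quick sanity check on the composition above, the four part sizes can be summed and compared with the stated approximate total. This is a minimal sketch that uses only the counts listed in this overview; the name `PART_SIZES` is illustrative and not part of the dataset's tooling.

```python
# Image counts per part, copied from the composition list in this overview.
PART_SIZES = {
    "Part 1": 89_785,
    "Part 2": 91_361,
    "Part 3": 118_358,
    "Part 4": 125_754,
}

# Exact number of images across all four parts.
total = sum(PART_SIZES.values())

# Share of the whole dataset contributed by each part, in percent.
shares = {name: 100 * count / total for name, count in PART_SIZES.items()}

if __name__ == "__main__":
    print(f"Total images: {total:,}")
    for name, pct in shares.items():
        print(f"{name}: {pct:.1f}%")
```

The exact sum is 425,258, which is consistent with the stated figure of roughly 425,000 images.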
Generation Process
- Inspiration Sources: paintings, 3D models, text‑to‑image generators, etc.
- Image Processing: StyleGAN2 latent‑space encoding and fine‑tuning to produce photo‑realistic images.
- Selection: Semi‑automatic and manual filtering using a visual taste approximator tool.
Additional Information
- Facial Features: 110 facial landmark points and semantic segmentation maps.
- Tools: explore_dataset.py script provided for accessing landmarks, masks, and text‑based search.
- Privacy & License: All images are synthetic; no privacy or copyright concerns.
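To make the landmark description concrete, the sketch below checks that a landmark set has the expected 110 points and rescales it to the unit square. It is an illustration under stated assumptions, not the dataset's own tooling: it assumes landmarks are available as (x, y) pixel coordinates in the 1024 × 1024 image frame, which this overview does not confirm, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

NUM_LANDMARKS = 110  # size of the extended landmark set stated in the overview
IMAGE_SIZE = 1024    # images are 1024 x 1024 pixels

Point = Tuple[float, float]


def validate_landmarks(points: Sequence[Point]) -> None:
    """Raise ValueError if the landmark set does not contain exactly 110 points.

    ASSUMPTION: each landmark is an (x, y) pair in pixel coordinates.
    The actual storage format used by the dataset may differ.
    """
    if len(points) != NUM_LANDMARKS:
        raise ValueError(
            f"expected {NUM_LANDMARKS} landmarks, got {len(points)}"
        )


def normalize_landmarks(points: Sequence[Point]) -> List[Point]:
    """Rescale pixel coordinates to fractions of the image width and height."""
    validate_landmarks(points)
    return [(x / IMAGE_SIZE, y / IMAGE_SIZE) for x, y in points]
```

Normalized coordinates stay valid when an image is resized, which can be convenient when training at a resolution lower than 1024 × 1024.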
Use Cases
- Training machine‑learning models, especially generative adversarial networks (e.g., StyleGAN).
- Provides extensive diversity across identity, ethnicity, age, pose, expression, lighting, hairstyle, and hair color.
Download
- Available on Kaggle as separate parts.
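Because the parts are downloaded separately, it can help to confirm that each one was extracted completely before training. The sketch below counts image files per part folder so the results can be compared with the published part sizes. It assumes each part was extracted into its own subfolder of a common root directory; the folder layout, folder names, and file extensions are assumptions, since this overview does not specify them.

```python
from pathlib import Path
from typing import Dict

# ASSUMPTION: images use one of these extensions; adjust to the actual files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_part(root: Path) -> Dict[str, int]:
    """Count image files under each immediate subdirectory of `root`.

    ASSUMPTION: every dataset part lives in its own subfolder of `root`.
    Files are counted recursively, so nested folders are included.
    """
    counts: Dict[str, int] = {}
    for part_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[part_dir.name] = sum(
            1
            for f in part_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    # "sfhq" is a hypothetical root directory name used for illustration.
    for part, n in count_images_per_part(Path("sfhq")).items():
        print(f"{part}: {n:,} images")
```

A count lower than the published size for a part (for example, fewer than 89,785 files for Part 1) suggests an incomplete download or extraction.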
Topics
Face Recognition
Image Processing
Source
Organization: github
Created: 9/4/2022