Dataset asset | Open Source Community | Model Training | Multi‑View Images
WebVi3D
WebVi3D is a multi‑view image dataset containing 320 M frames extracted from 16 M video clips, used for training See3D models. The clips are curated by an automatic pipeline that filters out videos with inconsistent multi‑view information or insufficient observations, yielding a high‑quality, diverse multi‑view image collection.
Source
github
Created
Dec 9, 2024
Updated
Dec 10, 2024
Overview
Dataset description and usage context
See3D Dataset Overview
Dataset Summary
See3D is a visual‑conditional multi‑view diffusion model trained on large‑scale internet video data for open‑world 3D creation. The model acquires 3D knowledge solely from the visual content of video data.
Dataset Features
- WebVi3D: 320 M image frames from 16 M video clips, used for multi‑view training.
- Data Curation: Automatic filtering removes clips with inconsistent multi‑view cues or insufficient observation, producing a high‑quality, diverse dataset.
- Pose‑Free: The model is conditioned on a purely 2D visual signal, generated by adding time‑dependent noise to masked video data, which eliminates the need for explicit pose annotations.
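As a rough illustration of the pose‑free idea above, the sketch below corrupts a randomly masked frame with Gaussian noise whose strength depends on a timestep `t`. It is not the See3D implementation: the linear beta schedule, the pixel‑level random mask, the flat‑list frame representation, and all function names (`linear_alpha_bar`, `mask_frame`, `noisy_visual_condition`) are assumptions chosen for illustration only.

```python
import math
import random


def linear_alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal-retention factor at step t for a linear beta schedule.

    Assumption: a DDPM-style linear schedule; the source does not state
    which schedule See3D uses.
    """
    alpha_bar = 1.0
    for s in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        alpha_bar *= 1.0 - beta
    return alpha_bar


def mask_frame(frame, keep_prob, rng):
    """Zero out each pixel independently with probability 1 - keep_prob.

    `frame` is a flat list of floats (hypothetical stand-in for an image).
    """
    return [p if rng.random() < keep_prob else 0.0 for p in frame]


def noisy_visual_condition(frame, t, keep_prob=0.5, seed=0):
    """Build a time-dependent noisy version of a masked frame.

    Larger t means less of the masked frame survives and more noise is mixed in.
    """
    rng = random.Random(seed)
    masked = mask_frame(frame, keep_prob, rng)
    a = linear_alpha_bar(t)
    signal_scale = math.sqrt(a)
    noise_scale = math.sqrt(1.0 - a)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in masked]


if __name__ == "__main__":
    toy_frame = [0.5] * 8  # hypothetical 8-pixel "frame"
    for step in (0, 500, 999):
        print(step, round(linear_alpha_bar(step), 6))
    print(noisy_visual_condition(toy_frame, t=500))
```

At small `t` the masked frame passes through almost unchanged, while at large `t` the signal is close to pure noise. The conditioning signal therefore carries only 2D visual content, and no camera pose appears anywhere in this sketch.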
Applications
- 3D Generation: Supports object‑level and scene‑level 3D generation, including sparse‑view‑to‑3D, text/image‑to‑3D, and 3D editing.
- High‑Fidelity 3D: Integrating See3D into a warping‑based pipeline yields high‑fidelity 3D outputs.
Dataset Download
- Pre‑trained Models & Test Data: Available from Google Drive.
Citation
If you use the See3D dataset, please cite:
@article{Ma2024See3D,
  title   = {You See it, You Got it: Learning 3D Creation on Pose‑Free Videos at Scale},
  author  = {Baorui Ma and Huachen Gao and Haoge Deng and Zhengxiong Luo and Tiejun Huang and Lulu Tang and Xinlong Wang},
  journal = {arXiv preprint arXiv:2412.06699},
  year    = {2024}
}