
MV-Video

The MV-Video dataset is a large‑scale multi‑view video dataset consisting of 53 K rendered animated 3D objects. It is used to train the Animate3D model ([Animate3D: Animating Any 3D Model with Multi‑view Video Diffusion](https://animate3d.github.io/)).

Source: Hugging Face
Created: Oct 21, 2024
Updated: Oct 22, 2024

MV-Video Dataset

Overview

MV-Video is a large‑scale multi‑view video dataset rendered from 53 K animated 3D objects. The dataset is used to train Animate3D: Animating Any 3D Model with Multi‑view Video Diffusion.

Rendering Details

Each object is rendered from 16 viewpoints uniformly distributed in azimuth, i.e. 22.5° apart.
Elevation (elv) is randomly sampled between 0° and 30°, and the starting azimuth (azi_start) is perturbed by ±11.25°, which is half the spacing between adjacent views.
Each video is 2 seconds long at 24 fps. For animations lasting 2 to 4 seconds, only the first 2 seconds are rendered; for longer animations, both the first 2 seconds and the last 2 seconds are rendered.
For objects with more than 6 animations, 6 are randomly sampled to avoid overfitting.
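
The rendering rules above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' rendering script: the function names are hypothetical, uniform sampling is assumed for both angles, a single elevation is assumed to be shared by all 16 views of a clip, and the handling of animations shorter than 2 seconds is a guess because the description does not cover it.

```python
import random

NUM_VIEWS = 16
AZI_STEP = 360.0 / NUM_VIEWS      # 22.5 degrees between adjacent views
MAX_ANIMS_PER_OBJECT = 6
CLIP_SECONDS = 2.0


def sample_camera_angles(rng):
    """Return (elevation, azimuths) in degrees for one multi-view clip."""
    # Assumption: uniform sampling, one elevation shared by all 16 views.
    elv = rng.uniform(0.0, 30.0)
    azi_start = rng.uniform(-AZI_STEP / 2, AZI_STEP / 2)  # within +/-11.25
    azimuths = [(azi_start + i * AZI_STEP) % 360.0 for i in range(NUM_VIEWS)]
    return elv, azimuths


def clip_windows(duration_s):
    """Return the (start, end) time windows, in seconds, that get rendered."""
    # Animations shorter than 2 s are not described in the source; this
    # sketch gives them the same single window as 2-4 s animations.
    if duration_s <= 4.0:
        return [(0.0, CLIP_SECONDS)]
    return [(0.0, CLIP_SECONDS), (duration_s - CLIP_SECONDS, duration_s)]


def subsample_animations(animation_ids, rng):
    """Keep at most 6 animations per object, chosen at random."""
    if len(animation_ids) <= MAX_ANIMS_PER_OBJECT:
        return list(animation_ids)
    return rng.sample(list(animation_ids), MAX_ANIMS_PER_OBJECT)
```

Passing a seeded `random.Random` instance, for example `sample_camera_angles(random.Random(0))`, keeps the sampled angles reproducible across runs.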

Data Structure

The dataset is distributed as multiple multi_view_video_*.tar.gz archives. After extraction, the directory structure is:

videos/
├── [UID1]/
│   ├── 00/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   ├── 01/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   └── ...
├── [UID2]/
│   ├── 00/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   └── ...
└── ...

A uid_info_dict.json file is provided, containing metadata for each 3D object.
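
Given the layout above, a small helper can group the extracted view files by object and animation. The sketch below is only one way to do it: `index_clips` and `load_uid_info` are hypothetical names, the numeric ordering of `view_N.mp4` files is inferred from the file names, and the internal schema of uid_info_dict.json is not documented here, so it is loaded as an opaque JSON object.

```python
import json
from pathlib import Path


def index_clips(videos_root):
    """Map (uid, animation_id) to its view files, sorted by view number."""
    index = {}
    # Layout assumed from the tree above: videos/<UID>/<anim>/view_<N>.mp4
    for view_path in Path(videos_root).glob("*/*/view_*.mp4"):
        uid, anim = view_path.parts[-3], view_path.parts[-2]
        index.setdefault((uid, anim), []).append(view_path)
    for views in index.values():
        # Sort numerically so that view_10 comes after view_2.
        views.sort(key=lambda p: int(p.stem.split("_")[1]))
    return index


def load_uid_info(json_path):
    """Load uid_info_dict.json; its per-object fields are not assumed."""
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)
```

For example, `index_clips("videos")[("some_uid", "00")]` would return the list of view paths for the first animation of that object, where `some_uid` stands in for a real UID directory name.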

Notes

Approximately 500 animation models were filtered out during data inspection, so the number of objects provided is slightly lower than the number reported in the paper.
About 7.7 K objects are labelled as high quality; their UIDs are listed in high_quality_uid.txt.
Text prompts were generated with MiniGPT4-Video; some prompts may be inaccurate, and users are encouraged to re-annotate with more advanced video captioning models.
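
To restrict training or evaluation to the high-quality subset, the UID list can be matched against the extracted directories. The following is a minimal sketch with hypothetical function names; it assumes that high_quality_uid.txt holds one UID per line and that each UID matches a directory name under videos/, neither of which is confirmed by the description above.

```python
from pathlib import Path


def load_high_quality_uids(txt_path):
    """Read UIDs, assuming one per line; blank lines are ignored."""
    with open(txt_path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def filter_high_quality(videos_root, uids):
    """Return the object directories whose name is in the UID set."""
    return sorted(
        d for d in Path(videos_root).iterdir()
        if d.is_dir() and d.name in uids
    )
```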

License

The dataset is released under the ODC-By v1.0 license. The rendered objects carry the following licenses, listed with the number of objects under each:

CC‑BY 4.0 – 50,000
CC‑BY‑NC 4.0 – ~1,500
CC‑BY‑SA 4.0 – ~400
CC‑BY‑NC‑SA 4.0 – ~400
CC0 1.0 – ~100

Citation

@article{jiang2024animate3d,
  title={Animate3D: Animating Any 3D Model with Multi-view Video Diffusion},
  author={Yanqin Jiang and Chaohui Yu and Chenjie Cao and Fan Wang and Weiming Hu and Jin Gao},
  journal={arXiv},
  year={2024}
}
