
MV-Video

The MV-Video dataset is a large‑scale multi‑view video dataset consisting of 53 K rendered animated 3D objects. It is used to train the Animate3D model ([Animate3D: Animating Any 3D Model with Multi‑view Video Diffusion](https://animate3d.github.io/)).

Source
huggingface
Created
Oct 21, 2024
Updated
Oct 22, 2024

MV-Video Dataset

Overview

MV-Video is a large‑scale multi‑view video dataset rendered from 53 K animated 3D objects. The dataset is used to train Animate3D: Animating Any 3D Model with Multi‑view Video Diffusion.

Rendering Details

  • Each object is rendered from 16 viewpoints uniformly distributed in azimuth.
  • Elevation (elv) is randomly sampled between 0° and 30°, and the starting azimuth (azi_start) is randomly perturbed within ±11.25°, i.e. half of the 22.5° spacing between adjacent views.
  • Each video lasts 2 seconds at 24 fps (48 frames). For animations lasting 2‑4 seconds, only the first 2 seconds are rendered; for longer animations, both the first 2 seconds and the last 2 seconds are rendered.
  • Objects with more than 6 animations are randomly sampled down to 6 to avoid overfitting.
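
The rendering rules above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' rendering script: the function names are hypothetical, the perturbation is assumed to be uniform, the base azimuth is assumed to be 0°, and one elevation is assumed to be shared by all 16 views of a clip (the description does not say whether it is sampled per view). Animations shorter than 2 seconds are not covered by the description, so the sketch simply returns no clip for them.

```python
import random

NUM_VIEWS = 16        # viewpoints per object, uniformly spaced in azimuth
FPS = 24              # frames per second of each rendered video
CLIP_SECONDS = 2.0    # length of each rendered clip


def sample_camera_poses(rng):
    """Return 16 (azimuth_deg, elevation_deg) pairs for one clip.

    Assumptions: uniform sampling, base azimuth 0, one shared elevation.
    """
    step = 360.0 / NUM_VIEWS                      # 22.5 degrees between views
    azi_start = rng.uniform(-step / 2, step / 2)  # perturbation within +/-11.25
    elv = rng.uniform(0.0, 30.0)                  # elevation in [0, 30] degrees
    return [((azi_start + i * step) % 360.0, elv) for i in range(NUM_VIEWS)]


def clip_windows(duration):
    """Return the (start, end) windows in seconds rendered for an animation."""
    if duration < CLIP_SECONDS:
        return []  # assumption: this case is not described in the source
    if duration <= 2 * CLIP_SECONDS:
        return [(0.0, CLIP_SECONDS)]              # 2-4 s: first 2 s only
    return [(0.0, CLIP_SECONDS),                  # longer: first 2 s ...
            (duration - CLIP_SECONDS, duration)]  # ... and last 2 s


def cap_animations(animation_ids, rng, max_per_object=6):
    """Randomly keep at most `max_per_object` animations of one object."""
    ids = list(animation_ids)
    if len(ids) <= max_per_object:
        return ids
    return rng.sample(ids, max_per_object)
```

For example, `clip_windows(3.0)` yields the single window `(0.0, 2.0)`, while `clip_windows(10.0)` yields `(0.0, 2.0)` and `(8.0, 10.0)`. Each window corresponds to `int(CLIP_SECONDS * FPS)` = 48 frames per view.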

Data Structure

The dataset contains multiple multi_view_video_*.tar.gz files. After extraction the structure is:

videos/
├── [UID1]/
│   ├── 00/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   ├── 01/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   └── ...
├── [UID2]/
│   ├── 00/
│   │   ├── view_0.mp4
│   │   ├── view_1.mp4
│   │   └── ...
│   └── ...
└── ...
  • A uid_info_dict.json file is provided, containing metadata for each 3D object.
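
A minimal sketch of walking the extracted tree, assuming the archives have already been unpacked into a local folder that contains `videos/`. The helper names `index_videos` and `load_uid_info` are hypothetical. The schema of uid_info_dict.json is not described here, so the sketch loads it as plain JSON without assuming any keys, and its location is passed in explicitly.

```python
import json
from pathlib import Path


def index_videos(root):
    """Map each UID to its animations and their per-view video paths.

    Returns {uid: {animation_id: [Path to view_0.mp4, view_1.mp4, ...]}}.
    """
    index = {}
    videos_dir = Path(root) / "videos"
    for uid_dir in sorted(p for p in videos_dir.iterdir() if p.is_dir()):
        animations = {}
        for anim_dir in sorted(p for p in uid_dir.iterdir() if p.is_dir()):
            # Sort numerically so that view_10 comes after view_9, not view_1.
            views = sorted(
                anim_dir.glob("view_*.mp4"),
                key=lambda p: int(p.stem.split("_")[1]),
            )
            if views:
                animations[anim_dir.name] = views
        if animations:
            index[uid_dir.name] = animations
    return index


def load_uid_info(path):
    """Load the per-object metadata file as an ordinary Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With this index, `index[uid]["00"]` lists the view videos of the first animation of an object, in view order.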

Notes

  1. Approximately 500 animation models were filtered out during data inspection, so the provided quantity is slightly lower than reported in the paper.
  2. About 7.7 K objects are labelled as high quality, listed in high_quality_uid.txt.
  3. Text prompts were generated with MiniGPT4‑Video; some prompts may be inaccurate, and users are encouraged to re‑annotate them with more advanced video captioning models.
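
To restrict training or evaluation to the high-quality subset, the UID list can be read and applied as a filter. The sketch below assumes that high_quality_uid.txt holds one UID per line, which is not stated above and should be checked against the actual file. Both function names are hypothetical.

```python
from pathlib import Path


def load_high_quality_uids(path):
    """Read UIDs into a set (assumes one UID per line; blank lines skipped)."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_high_quality(all_uids, high_quality_uids):
    """Keep only the UIDs marked as high quality, preserving input order."""
    return [uid for uid in all_uids if uid in high_quality_uids]
```

UIDs that appear in the list but not in the extracted data are simply ignored, which is convenient given that some models were filtered out after the list of objects was compiled.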

License

The dataset is released under the ODC‑By v1.0 license. Rendered object licenses are:

Citation

@article{jiang2024animate3d,
  title={Animate3D: Animating Any 3D Model with Multi-view Video Diffusion},
  author={Yanqin Jiang and Chaohui Yu and Chenjie Cao and Fan Wang and Weiming Hu and Jin Gao},
  journal={arXiv},
  year={2024}
}