conglu/vd4rl
The V‑D4RL benchmark provides pixel‑based analogues of the D4RL benchmark tasks, derived from the dm_control suite, and extends two state‑of‑the‑art online pixel‑based continuous‑control algorithms, DrQ‑v2 and DreamerV2, to the offline setting. It includes data of varying difficulty across multiple environments such as walker_walk, cheetah_run, and humanoid_walk, along with corresponding benchmarks and algorithm evaluations.
V‑D4RL Dataset Overview
Dataset Structure
The dataset is stored under the vd4rl_data directory with the following layout:
vd4rl_data
├── main
│   ├── walker_walk
│   │   ├── random
│   │   │   ├── 64px
│   │   │   └── 84px
│   │   └── medium_replay
│   │       ...
│   ├── cheetah_run
│   │   ...
│   └── humanoid_walk
│       ...
├── distracting
│   ...
└── multitask
    ...
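Since the layout follows a regular suite/environment/type/resolution pattern, data paths can be assembled programmatically. A minimal sketch; the dataset_dir helper below is hypothetical, not part of the repository:

```python
from pathlib import Path

def dataset_dir(root: str, suite: str, env: str, dtype: str, res: int) -> Path:
    # Hypothetical helper: builds a path such as
    # vd4rl_data/main/walker_walk/random/64px from its components.
    return Path(root) / suite / env / dtype / f"{res}px"

print(dataset_dir("vd4rl_data", "main", "walker_walk", "random", 64))
# → vd4rl_data/main/walker_walk/random/64px
```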
Benchmarks
Environment Setup
Each environment folder contains a conda_env.yml file specifying the required dependencies. Create the environment with:
conda env create -f conda_env.yml
A Dockerfile is also provided in the dockerfiles directory; replace <<USER_ID>> with your user ID.
Evaluation Command Examples
Below are example commands for various algorithms. Set ENVNAME to one of [walker_walk, cheetah_run, humanoid_walk] and TYPE to one of [random, medium_replay, medium, medium_expert, expert].
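The settings above form a 3 × 5 grid of environment/type pairs. A short sketch enumerating every combination, e.g. to generate offline_dir arguments for a sweep (the path layout follows the directory structure above; the exact resolution depends on the algorithm):

```python
from itertools import product

ENVS = ["walker_walk", "cheetah_run", "humanoid_walk"]
TYPES = ["random", "medium_replay", "medium", "medium_expert", "expert"]

# Print one offline_dir per environment/type pair (84px, as used by DrQ+BC).
for env, dtype in product(ENVS, TYPES):
    print(f"offline_dir=vd4rl_data/main/{env}/{dtype}/84px")
```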
Offline DV2
python offlinedv2/train_offline.py \
--configs dmc_vision \
--task dmc_${ENVNAME} \
--offline_dir vd4rl_data/main/${ENVNAME}/${TYPE}/64px \
--offline_penalty_type meandis \
--offline_lmbd_cons 10 \
--seed 0
DrQ+BC
python drqbc/train.py \
task_name=offline_${ENVNAME}_${TYPE} \
offline_dir=vd4rl_data/main/${ENVNAME}/${TYPE}/84px \
nstep=3 seed=0
DrQ+CQL
python drqbc/train.py \
task_name=offline_${ENVNAME}_${TYPE} \
offline_dir=vd4rl_data/main/${ENVNAME}/${TYPE}/84px \
algo=cql cql_importance_sample=false min_q_weight=10 seed=0
BC
python drqbc/train.py \
task_name=offline_${ENVNAME}_${TYPE} \
offline_dir=vd4rl_data/main/${ENVNAME}/${TYPE}/84px algo=bc seed=0
Distracting and Multitask Experiments
Run distracting or multitask experiments by changing the offline_dir argument in the commands above to point at the corresponding subdirectory of vd4rl_data/distracting or vd4rl_data/multitask.
Data Collection & Format
The data‑collection pipeline is described in Appendix B of the paper. Conversion scripts are located in the conversion_scripts directory.
- Offline DV2 stores data as *.npz files with 64 px images.
- DrQ+BC uses *.hdf5 files with 84 px images.
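Both formats can be read with standard tools. A minimal sketch for the .npz side, round-tripping a synthetic episode; the field names ("image", "action", "reward") and shapes are illustrative assumptions, not the exact keys produced by the conversion scripts:

```python
import numpy as np

# Write a toy episode in the same container format Offline DV2 uses:
# an .npz archive of per-step arrays. Keys and shapes are assumptions.
episode = {
    "image": np.zeros((11, 64, 64, 3), dtype=np.uint8),
    "action": np.zeros((11, 6), dtype=np.float32),
    "reward": np.zeros((11,), dtype=np.float32),
}
np.savez_compressed("/tmp/episode_demo.npz", **episode)

# Read it back: np.load returns an archive keyed by array name.
with np.load("/tmp/episode_demo.npz") as data:
    for key in data.files:
        print(key, data[key].shape, data[key].dtype)
```

The *.hdf5 files used by DrQ+BC can be read analogously with h5py, iterating over the file's top-level datasets.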
Acknowledgements
V‑D4RL builds upon several open‑source repositories for offline reinforcement learning and online pixel‑based control. Special thanks to the authors of those projects.
Contact
For questions, contact Cong Lu or Philip Ball. Contributions and suggestions are welcome!
Source
Organization: hugging_face