NineRec
NineRec is a TransRec dataset suite that includes a large-scale source-domain recommendation dataset and nine different target-domain recommendation datasets. Each item is accompanied by descriptive text and a high-resolution cover image.
NineRec Dataset Overview
Dataset Introduction
NineRec is a benchmark suite for evaluating transferable recommendation systems, published in TPAMI 2024. The suite supports multimodal, foundation‑model, transfer learning, and recommendation tasks.
Data Download
The complete dataset is publicly available via the following links:
- Google Drive: Source Dataset, Downstream Datasets
Data Format (Example: QB Dataset)
- QB_cover: Original images; filenames correspond to item IDs.
- QB_behaviour.tsv: User‑item interaction sequences; first column = user ID, second column = item ID sequence.
- QB_pair.csv: User‑item interaction pairs; columns: user ID, item ID, timestamp.
- QB_item.csv: Original text; columns: item ID, Chinese text, English text.
- QB_url.csv: Item URLs; columns: item ID, URL.
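As a rough illustration of the QB_pair.csv layout described above, the following Python sketch parses rows of (user ID, item ID, timestamp). It is not part of the NineRec code base. The comma delimiter, the absence of a header row, and the names `Interaction` and `read_pairs` are assumptions made for this example, so check them against the downloaded file before relying on it.

```python
import csv
import io
from typing import List, NamedTuple


class Interaction(NamedTuple):
    """One user-item interaction (hypothetical container for this sketch)."""
    user_id: str
    item_id: str
    timestamp: str  # kept as text; the on-disk timestamp format is not assumed


def read_pairs(handle) -> List[Interaction]:
    """Parse rows of (user ID, item ID, timestamp) from an open text handle.

    Assumes comma-separated values and no header row.
    """
    rows = []
    for row in csv.reader(handle):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        rows.append(Interaction(row[0], row[1], row[2]))
    return rows


# In-memory sample standing in for a file such as QB_pair.csv.
sample = io.StringIO(
    "u1,i10,1600000000\n"
    "u1,i11,1600000100\n"
    "u2,i10,1600000050\n"
)
pairs = read_pairs(sample)
print(len(pairs), pairs[0])
```

To read a real file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.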
Citation
If you use this dataset, please cite:
@article{zhang2023ninerec,
title={NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation},
author={Jiaqi Zhang and Yu Cheng and Yongxin Ni and Yunzhu Pan and Zheng Yuan and Junchen Fu and Youhua Li and Jie Wang and Fajie Yuan},
journal={arXiv preprint arXiv:2309.07705},
year={2023}
}
Code Environment
- Pytorch==1.12.1
- cudatoolkit==11.2.1
- sklearn==1.2.0
- python==3.9.12
Data Preparation
Run get_lmdb.py to generate an LMDB database for image loading. Run get_behaviour.py to convert user‑item pairs into item‑sequence format.
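The pair-to-sequence conversion can be pictured with the minimal sketch below. It is not the contents of get_behaviour.py. It assumes that interactions are ordered by timestamp within each user and that the sequence column is written as space-separated item IDs after a tab. The function names `pairs_to_sequences` and `format_behaviour_line` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pairs_to_sequences(
    pairs: Iterable[Tuple[str, str, int]]
) -> Dict[str, List[str]]:
    """Group (user ID, item ID, timestamp) pairs into per-user item sequences.

    Items are ordered by timestamp (an assumption of this sketch); ties keep
    their input order because Python's sort is stable.
    """
    by_user = defaultdict(list)
    for user_id, item_id, timestamp in pairs:
        by_user[user_id].append((timestamp, item_id))

    sequences = {}
    for user_id, events in by_user.items():
        events.sort(key=lambda event: event[0])
        sequences[user_id] = [item_id for _, item_id in events]
    return sequences


def format_behaviour_line(user_id: str, items: List[str]) -> str:
    """Render one line: user ID, a tab, then the item IDs.

    The space separator inside the sequence is assumed, not confirmed.
    """
    return user_id + "\t" + " ".join(items)


# Toy interactions, deliberately out of chronological order.
toy_pairs = [
    ("u1", "i11", 200),
    ("u2", "i10", 150),
    ("u1", "i10", 100),
]
sequences = pairs_to_sequences(toy_pairs)
for user_id, items in sequences.items():
    print(format_behaviour_line(user_id, items))
```

Sorting inside each user group, rather than sorting the whole table first, keeps the sketch simple when the input arrives in arbitrary order.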
Experiment Execution
Run train.py for pre‑training and transfer. Run test.py for evaluation.