High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

Sort:

Browse by Category

BUAADreamer/llava-med-zh-instruct-60k

This Chinese dataset is translated from llava‑med, built using the Qwen1.5‑14B‑Chat model, and contains 60 k medical visual instruction data points. Features include a messages‑and‑images structure: messages consist of role and content fields; images are sequences. The dataset provides a training split of 56,649 samples (size: 6.66 GB) with a download size of 6.57 GB. Task categories are visual question answering and image‑to‑text. The language is Chinese, tags involve medical and biology, and the scale lies between 10 K and 100 K.

hugging_face

View Details