BUAADreamer/llava-med-zh-instruct-60k
This Chinese dataset is translated from llava‑med, built using the Qwen1.5‑14B‑Chat model, and contains 60 k medical visual instruction data points. Features include a messages‑and‑images structure: messages consist of role and content fields; images are sequences. The dataset provides a training split of 56,649 samples (size: 6.66 GB) with a download size of 6.57 GB. Task categories are visual question answering and image‑to‑text. The language is Chinese, tags involve medical and biology, and the scale lies between 10 K and 100 K.
Description
Dataset Overview
Basic Information
- License: Apache‑2.0
- Language: Chinese
- Tags: Medical, Biology, llama‑factory
- Size Category: 10K < size < 100K
Dataset Content
- Features:
- messages:
- role: string
- content: string
- images: image sequences
- messages:
Dataset Split
- Training set:
- Sample count: 56,649
- Data size: 6,664,412,158.42 bytes
- Download size: 6,567,484,534 bytes
Task Types
- Visual Question Answering
- Image‑to‑Text
Configuration
- Default configuration:
- Data files:
- Split: train
- Path: data/train-*
- Data files:
AI studio
Generate PPTs instantly with Nano Banana Pro.
Generate PPT NowAccess Dataset
Please login to view download links and access full dataset details.
Topics
Source
Organization: hugging_face
Created: Unknown
Power Your Data Analysis with Premium AI Models
Supporting GPT-5, Claude-4, DeepSeek v3, Gemini and more.
Enjoy a free trial and save 20%+ compared to official pricing.