JUHE API Marketplace
DATASET
Open Source Community

BUAADreamer/llava-med-zh-instruct-60k

This Chinese dataset is translated from llava‑med, built using the Qwen1.5‑14B‑Chat model, and contains 60 k medical visual instruction data points. Features include a messages‑and‑images structure: messages consist of role and content fields; images are sequences. The dataset provides a training split of 56,649 samples (size: 6.66 GB) with a download size of 6.57 GB. Task categories are visual question answering and image‑to‑text. The language is Chinese, tags involve medical and biology, and the scale lies between 10 K and 100 K.

Updated 5/21/2024
hugging_face

Description

Dataset Overview

Basic Information

  • License: Apache‑2.0
  • Language: Chinese
  • Tags: Medical, Biology, llama‑factory
  • Size Category: 10K < size < 100K

Dataset Content

  • Features:
    • messages:
      • role: string
      • content: string
    • images: image sequences

Dataset Split

  • Training set:
    • Sample count: 56,649
    • Data size: 6,664,412,158.42 bytes
    • Download size: 6,567,484,534 bytes

Task Types

  • Visual Question Answering
  • Image‑to‑Text

Configuration

  • Default configuration:
    • Data files:
      • Split: train
      • Path: data/train-*

AI studio

Generate PPTs instantly with Nano Banana Pro.

Generate PPT Now

Access Dataset

Login to Access

Please login to view download links and access full dataset details.

Topics

Medical Visual Question Answering
Image to Text

Source

Organization: hugging_face

Created: Unknown

Power Your Data Analysis with Premium AI Models

Supporting GPT-5, Claude-4, DeepSeek v3, Gemini and more.

Enjoy a free trial and save 20%+ compared to official pricing.