Dataset asset · Open Source Community · Natural Language Processing · Medical

FreedomIntelligence/DxBench

DxBench is a disease-diagnosis benchmark for text-generation and label-classification tasks in the medical domain, available in English and Chinese. It has three configurations (DxBench, Dxy, Muzhi), each with one English and one Chinese data file.

Source: hugging_face
Created: Nov 28, 2025
Updated: Aug 23, 2024
Overview

Dataset description and usage context

Disease Diagnostic Benchmark Dataset Overview

Basic Information

  • License: Apache 2.0
  • Task Categories:
    • Text Generation
    • Label Classification
  • Languages:
    • English
    • Chinese
  • Domain: Medical

Dataset Configurations

  • Configuration Name: DxBench

    • Data Files:
      • Split: English
        • Path: English/DxBench_en.json
      • Split: Chinese
        • Path: Chinese/DxBench_zh.json
  • Configuration Name: Dxy

    • Data Files:
      • Split: English
        • Path: English/Dxy_en.json
      • Split: Chinese
        • Path: Chinese/Dxy_zh.json
  • Configuration Name: Muzhi

    • Data Files:
      • Split: English
        • Path: English/Muzhi_en.json
      • Split: Chinese
        • Path: Chinese/Muzhi_zh.json
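
The configuration listing above follows a regular pattern: a language folder, then the configuration name with an `en` or `zh` suffix. The sketch below rebuilds those paths and, optionally, downloads one file from the Hugging Face Hub. It rests on a few assumptions: the helper names `data_file_path` and `load_records` are hypothetical, the repository id is taken from the dataset title, and each file is assumed to be a single JSON document rather than JSON Lines. The download step needs the `huggingface_hub` package and network access, so it is kept in a separate function.

```python
import json

# Repository id as given in the dataset title.
REPO_ID = "FreedomIntelligence/DxBench"

# Language folder -> file-name suffix, as listed in the configurations above.
LANG_SUFFIX = {"English": "en", "Chinese": "zh"}
CONFIGS = ("DxBench", "Dxy", "Muzhi")


def data_file_path(config: str, language: str) -> str:
    """Return the repository-relative path for a configuration and language."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    if language not in LANG_SUFFIX:
        raise ValueError(f"unknown language: {language!r}")
    return f"{language}/{config}_{LANG_SUFFIX[language]}.json"


def load_records(config: str, language: str):
    """Download one data file from the Hub and parse it.

    Assumption: the file is a single JSON document (not JSON Lines).
    Requires `pip install huggingface_hub` and network access.
    """
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=data_file_path(config, language),
        repo_type="dataset",
    )
    with open(local_path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # List every path named in the configuration section.
    for config in CONFIGS:
        for language in LANG_SUFFIX:
            print(data_file_path(config, language))
```

Calling `load_records("Dxy", "Chinese")` would fetch `Chinese/Dxy_zh.json`. The record schema is not described on this page, so inspect the first few entries before building on any field names.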