pubmed-en-quality-annotations-7

This dataset has the following features: id, french_translation, educational_score, domain, and document_type. The domain and document_type features are categorical, with three and four categories respectively. The dataset is split into a training set of 358,199 samples and a validation set of 39,800 samples, 397,999 in total. The total download size is 245,314,153 bytes, and the overall dataset size is 438,787,962 bytes.
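As a quick sanity check on the figures above, the following sketch derives the total sample count, the validation share, and the sizes in MiB. It uses only the numbers quoted in the summary; the variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the summary above.
TRAIN_SAMPLES = 358_199
VALIDATION_SAMPLES = 39_800
DOWNLOAD_BYTES = 245_314_153
OVERALL_BYTES = 438_787_962

# Total number of samples across both splits.
total_samples = TRAIN_SAMPLES + VALIDATION_SAMPLES

# Share of samples held out for validation.
validation_fraction = VALIDATION_SAMPLES / total_samples

# Convert byte counts to mebibytes (1 MiB = 1024 * 1024 bytes).
MIB = 1024 ** 2
download_mib = DOWNLOAD_BYTES / MIB
overall_mib = OVERALL_BYTES / MIB

print(f"total samples: {total_samples}")
print(f"validation fraction: {validation_fraction:.3f}")
print(f"download: {download_mib:.1f} MiB, overall: {overall_mib:.1f} MiB")
```

The split works out to roughly 90% training and 10% validation.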

Updated 12/12/2024

huggingface

Description

Dataset Overview

Dataset Information

Features:
- id: data type int32.
- french_translation: data type string.
- educational_score: data type int32.
- domain: data type class_label, with the following categories:
  - 0: biomedical
  - 1: clinical
  - 2: other
- document_type: data type class_label, with the following categories:
  - 0: Study
  - 1: Other
  - 2: Review
  - 3: Clinical case
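The integer codes above can be mapped back to their names with a small lookup. The following is a minimal sketch in plain Python, assuming records arrive as dictionaries keyed by the feature names listed above; the helper `decode_labels` and the sample record are hypothetical and shown for illustration only.

```python
# Label names in index order, copied from the feature list above.
DOMAIN_NAMES = ["biomedical", "clinical", "other"]
DOCUMENT_TYPE_NAMES = ["Study", "Other", "Review", "Clinical case"]


def decode_labels(record):
    """Return a copy of `record` with class-label integers replaced by names.

    Hypothetical helper: assumes `domain` and `document_type` hold the
    integer codes listed in the feature description.
    """
    decoded = dict(record)
    decoded["domain"] = DOMAIN_NAMES[record["domain"]]
    decoded["document_type"] = DOCUMENT_TYPE_NAMES[record["document_type"]]
    return decoded


# Made-up record for illustration; not taken from the dataset.
example = {
    "id": 0,
    "french_translation": "Texte d'exemple.",
    "educational_score": 3,
    "domain": 1,
    "document_type": 3,
}
print(decode_labels(example))
```

If the dataset is loaded through a library that understands class_label features, that library's own label-to-name conversion is likely preferable to a hand-written table like this one.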

Topics

Document Classification

Quality Assessment

Source

Organization: huggingface

Created: 12/12/2024
