Dataset asset | Open Source Community | Quality Assessment | Document Classification
pubmed-en-quality-annotations-7
Each record in this dataset includes the features id, french_translation, educational_score, domain, and document_type. The domain and document_type features are categorical, with three and four categories respectively. The dataset is split into a training set of 358,199 samples and a validation set of 39,800 samples. The download size is 245,314,153 bytes, and the overall dataset size is 438,787,962 bytes.
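The split and size figures in the description can be checked with a little arithmetic. The following is a minimal sketch in plain Python that uses only the numbers stated above; the helper names `split_fractions` and `to_mib` are illustrative and not part of any dataset tooling.

```python
# Figures copied from the dataset description.
TRAIN_SAMPLES = 358_199
VALIDATION_SAMPLES = 39_800
DOWNLOAD_BYTES = 245_314_153
DATASET_BYTES = 438_787_962


def split_fractions(train: int, validation: int) -> dict[str, float]:
    """Return each split's share of the total sample count."""
    total = train + validation
    return {"train": train / total, "validation": validation / total}


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    fractions = split_fractions(TRAIN_SAMPLES, VALIDATION_SAMPLES)
    print(f"total samples:   {TRAIN_SAMPLES + VALIDATION_SAMPLES:,}")
    print(f"train share:     {fractions['train']:.4f}")
    print(f"validation share:{fractions['validation']:.4f}")
    print(f"download size:   {to_mib(DOWNLOAD_BYTES):.1f} MiB")
    print(f"dataset size:    {to_mib(DATASET_BYTES):.1f} MiB")
```

The validation set works out to roughly one tenth of all samples, and the download is a little over half the size of the full dataset.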
Source
huggingface
Created
Dec 12, 2024
Updated
Dec 12, 2024
Signals
81 views
Availability
Linked source ready
Dataset Overview
Dataset Information
- Features:
  - id: data type int32.
  - french_translation: data type string.
  - educational_score: data type int32.
  - domain: data type class_label, with the following categories:
    - 0: biomedical
    - 1: clinical
    - 2: other
  - document_type: data type class_label, with the following categories:
    - 0: Study
    - 1: Other
    - 2: Review
    - 3: Clinical case
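Because domain and document_type are stored as integer class labels, a consumer typically needs to map them back to their names. The sketch below does this in plain Python using the category lists given above. The `decode_record` and `label_name` helpers and the example record are hypothetical, and the example's field values (including the educational_score of 3) are invented for illustration only.

```python
# Category names in label order, as listed in the feature description.
DOMAIN_NAMES = ["biomedical", "clinical", "other"]
DOCUMENT_TYPE_NAMES = ["Study", "Other", "Review", "Clinical case"]


def label_name(names: list[str], index: int) -> str:
    """Map an integer class label to its name, rejecting out-of-range values."""
    if not 0 <= index < len(names):
        raise ValueError(f"label {index} is outside 0..{len(names) - 1}")
    return names[index]


def decode_record(record: dict) -> dict:
    """Return a copy of the record with class labels replaced by names."""
    decoded = dict(record)  # shallow copy so the input is not modified
    decoded["domain"] = label_name(DOMAIN_NAMES, record["domain"])
    decoded["document_type"] = label_name(
        DOCUMENT_TYPE_NAMES, record["document_type"]
    )
    return decoded


# Hypothetical record: every value here is made up for illustration.
example = {
    "id": 0,
    "french_translation": "Texte d'exemple.",
    "educational_score": 3,
    "domain": 1,
    "document_type": 2,
}

if __name__ == "__main__":
    print(decode_record(example))
```

If the dataset is loaded with the Hugging Face `datasets` library, its `ClassLabel` feature provides an equivalent `int2str` method, so a hand-written mapping like this is mainly useful when working with exported files.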