rag-datasets/rag-mini-bioasq
This dataset is primarily used for question answering and sentence similarity tasks in the biomedical domain. It provides two configurations, text-corpus and question-answer-passages, each backed by its own data files. The dataset is derived from the training set of BioASQ Task 11b, and its subsets were generated with the `generate.py` script.
Description
Dataset Overview
License
- This dataset is released under the CC-BY-2.5 license.
Task Categories
- Question Answering (question-answering)
- Sentence Similarity (sentence-similarity)
Language
- English (en)
Tags
- RAG
- DPR
- Information Retrieval (information-retrieval)
- Question Answering (question-answering)
- Biomedical (biomedical)
Configurations
- Configuration Name: text-corpus
  - Data File:
    - Split: passages
    - Path: "data/passages.parquet/*"
- Configuration Name: question-answer-passages
  - Data File:
    - Split: test
    - Path: "data/test.parquet/*"
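The two configurations above could be loaded roughly as follows. This is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the ID `rag-datasets/rag-mini-bioasq`. The `CONFIGS` mapping and the `load_config` helper are hypothetical names introduced here for illustration; the configuration names, splits, and paths are copied from the list above.

```python
# Configuration names, splits, and file paths as listed on this card.
CONFIGS = {
    "text-corpus": {
        "split": "passages",
        "path": "data/passages.parquet/*",
    },
    "question-answer-passages": {
        "split": "test",
        "path": "data/test.parquet/*",
    },
}


def load_config(name, repo_id="rag-datasets/rag-mini-bioasq"):
    """Load one configuration of the dataset (hypothetical helper).

    Assumes the `datasets` library is installed and network access is
    available. The import is done lazily so that the mapping above can
    be inspected without the dependency.
    """
    if name not in CONFIGS:
        raise KeyError(f"unknown configuration: {name!r}")
    from datasets import load_dataset

    # Each configuration exposes a single split, so request it directly.
    return load_dataset(repo_id, name, split=CONFIGS[name]["split"])


if __name__ == "__main__":
    # Downloads data on first use.
    passages = load_config("text-corpus")
    questions = load_config("question-answer-passages")
    print(passages)
    print(questions)
```

Note that the split names differ between configurations (`passages` versus `test`), so a split has to be chosen per configuration rather than assumed to be `train`.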
Source
Organization: hugging_face
Created: Unknown