iamnguyen/edu_child_01
The dataset is intended primarily for text analysis and processing. Each record holds text content, structured metadata, and a vector. The metadata records the answer to a question, an identifier, a prefix, the question itself, a school ID, a sequence number, the source, a tokenized form of the question, a URL, and vector data. The dataset is suited to training models for text understanding and related tasks.
Updated 12/18/2023
Description
Dataset Overview
Dataset Features
- content: Data type is string.
- metadata: Structured data containing the following fields:
  - answer: Data type is string.
  - id: Data type is string.
  - prefix: Data type is string.
  - question: Data type is string.
  - school_id: Data type is string.
  - seq_num: Data type is integer (int64).
  - source: Data type is string.
  - tokenized_question: Data type is string.
  - url: Data type is string.
  - vector: Data type is a sequence of floats (float64).
- vector: Data type is a sequence of floats (float64).
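The feature list above can be read as a per-record schema. The following is a minimal sketch of a type check for one record, assuming records arrive as plain Python dictionaries with the field names listed above. The helper names (`schema_errors`, `is_float_sequence`, `example_record`) and every value in the example record are hypothetical placeholders, not data from the dataset.

```python
from typing import Any

# Expected Python types for each metadata field, taken from the feature list.
METADATA_FIELDS: dict[str, type] = {
    "answer": str,
    "id": str,
    "prefix": str,
    "question": str,
    "school_id": str,
    "seq_num": int,
    "source": str,
    "tokenized_question": str,
    "url": str,
    "vector": list,
}


def is_float_sequence(value: Any) -> bool:
    """Return True if value is a list whose items are all floats."""
    return isinstance(value, list) and all(isinstance(x, float) for x in value)


def schema_errors(record: dict[str, Any]) -> list[str]:
    """Collect human-readable schema violations for one record."""
    errors: list[str] = []
    if not isinstance(record.get("content"), str):
        errors.append("content must be a string")
    if not is_float_sequence(record.get("vector")):
        errors.append("vector must be a sequence of floats")
    metadata = record.get("metadata")
    if not isinstance(metadata, dict):
        errors.append("metadata must be a mapping")
        return errors
    for name, expected in METADATA_FIELDS.items():
        value = metadata.get(name)
        if name == "vector":
            ok = is_float_sequence(value)
        elif expected is int:
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = isinstance(value, int) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            errors.append(f"metadata.{name} has an unexpected type")
    return errors


def example_record() -> dict[str, Any]:
    """Build a record with invented placeholder values (not real data)."""
    return {
        "content": "placeholder content",
        "metadata": {
            "answer": "placeholder answer",
            "id": "placeholder-id",
            "prefix": "placeholder prefix",
            "question": "placeholder question",
            "school_id": "placeholder-school",
            "seq_num": 0,
            "source": "placeholder source",
            "tokenized_question": "placeholder question",
            "url": "https://example.com/placeholder",
            "vector": [0.0, 0.5],
        },
        "vector": [0.0, 0.5],
    }


if __name__ == "__main__":
    print(schema_errors(example_record()))
```

The check is deliberately shallow: it confirms types only and says nothing about vector length or string contents, which the feature list does not specify.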
Dataset Split
- train: Contains 1,015 samples, occupying 18,574,718 bytes.
Dataset Size
- Download size: 12,148,966 bytes.
- Dataset size: 18,574,718 bytes.
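As a quick sanity check on these figures, the sketch below derives the average size per sample and the ratio of download size to dataset size from the numbers listed above. The variable names are illustrative; only the three constants come from this page. The result is roughly 18,300 bytes per sample, with the download at about 65% of the dataset size.

```python
# Figures copied from the split and size listings above.
NUM_SAMPLES = 1_015
DATASET_BYTES = 18_574_718
DOWNLOAD_BYTES = 12_148_966

# Average footprint of one sample in the prepared dataset.
avg_bytes_per_sample = DATASET_BYTES / NUM_SAMPLES

# Fraction of the dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"average bytes per sample: {avg_bytes_per_sample:,.0f}")
print(f"download / dataset size:  {download_ratio:.1%}")
```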
Configuration
- default: Includes training data files, located at data/train-*.
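With a single default configuration and one train split, loading could look like the sketch below. It assumes the dataset is publicly reachable on the Hugging Face Hub under the repository ID shown at the top of this page and that the `datasets` library is installed; neither is confirmed here. The function name `load_train_split` is hypothetical.

```python
REPO_ID = "iamnguyen/edu_child_01"


def load_train_split():
    """Load the train split from the Hub (needs network access)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # No config name is passed, so the default configuration is used.
    return load_dataset(REPO_ID, split="train")


if __name__ == "__main__":
    train = load_train_split()
    print(train.num_rows)   # the page lists 1,015 samples for this split
    print(train.features)   # should mirror the feature list on this page
```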
Topics
Education
Question Answering Systems
Source
Organization: hugging_face
Created: Unknown