Back to datasets
Dataset assetOpen Source CommunityChatbotDialogue Data
NemoSheng/codefuse_fc_v1_sharegpt
The dataset contains dialogues and tool information, primarily for training and testing models. Dialogue information is stored as a list, each dialogue having a source and content field. Tool information is stored as a string. The dataset is split into training and test sets, with 72,032 training examples and 1,250 test examples. Download size 193,720,278 bytes, total size 1,002,393,963 bytes.
Source
hugging_face
Created
Nov 28, 2025
Updated
Jul 18, 2024
Signals
50 views
Availability
Linked source ready
Overview
Dataset description and usage context
Dataset Overview
Data Features
- conversations:
- from: string type
- value: string type
- tools: string type
Data Splits
- train:
- Bytes: 999,501,804.0
- Samples: 72,032
- test:
- Bytes: 2,892,159.0
- Samples: 1,250
Dataset Size
- Download size: 193,720,278
- Total size: 1,002,393,963.0
Configuration
- default:
- train: data file path
data/train-* - test: data file path
data/test-*
- train: data file path
Need downstream help?
Pair the dataset with AI analysis and content workflows.
Once the source passes your review, move straight into summarization, transformation, report drafting, or presentation generation with the JuheAI toolchain.