Dataset assetOpen Source CommunityChatbotDialogue Data

NemoSheng/codefuse_fc_v1_sharegpt

The dataset contains dialogues and tool information, primarily for training and testing models. Dialogue information is stored as a list, each dialogue having a source and content field. Tool information is stored as a string. The dataset is split into training and test sets, with 72,032 training examples and 1,250 test examples. Download size 193,720,278 bytes, total size 1,002,393,963 bytes.

Source

hugging_face

Created

Nov 28, 2025

Updated

Jul 18, 2024

Signals

50 views

Availability

Linked source ready

Overview

Dataset description and usage context

Dataset Overview

Data Features

conversations:
- from: string type
- value: string type
tools: string type

Data Splits

train:
- Bytes: 999,501,804.0
- Samples: 72,032
test:
- Bytes: 2,892,159.0
- Samples: 1,250

Dataset Size

Download size: 193,720,278
Total size: 1,002,393,963.0

Configuration

default:
- train: data file path data/train-*
- test: data file path data/test-*

Need downstream help?

Pair the dataset with AI analysis and content workflows.

Once the source passes your review, move straight into summarization, transformation, report drafting, or presentation generation with the JuheAI toolchain.

Explore AI studio