Dataset asset, Open Source Community, Sentiment Analysis, Text Classification

sst2_combined

The dataset includes three primary features: 'sentence' (string), 'label' (categorical with two classes: 0 for negative sentiment, 1 for positive sentiment), and 'idx' (integer index). The training set has 68,221 samples, the validation set 872 samples, and the test set 1,821 samples. Total download size is 3,403,184 bytes; total dataset size is 5,110,747 bytes.

Source: Hugging Face
Created: Dec 14, 2024
Updated: Dec 14, 2024

Dataset Overview

Dataset Information

  • Features:
    • sentence: string, the sentence text.
    • label: categorical, with two classes:
      • 0: negative sentiment.
      • 1: positive sentiment.
    • idx: integer, the example index.
  • Dataset Split:
    • train: training set, 68,221 samples, 4,787,855 bytes.
    • validation: validation set, 872 samples, 106,252 bytes.
    • test: test set, 1,821 samples, 216,640 bytes.
  • Dataset Size:
    • Download size: 3,403,184 bytes.
    • Total dataset size: 5,110,747 bytes (the sum of the three split sizes).
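
The schema and split figures above can be captured in a few lines of Python and used as a sanity check after download. This is a minimal sketch, not part of the dataset's own tooling: the names `Example`, `validate`, `SPLITS`, and `split_bytes_total` are hypothetical, and the label check assumes every row carries one of the two listed classes (the listing does not say whether the test split's labels are populated).

```python
from dataclasses import dataclass

# Split statistics copied from the listing above.
SPLITS = {
    "train": {"samples": 68_221, "bytes": 4_787_855},
    "validation": {"samples": 872, "bytes": 106_252},
    "test": {"samples": 1_821, "bytes": 216_640},
}
TOTAL_DATASET_BYTES = 5_110_747

# Label ids and their meaning, as described in the feature list.
LABEL_NAMES = {0: "negative", 1: "positive"}


@dataclass(frozen=True)
class Example:
    """One row with the three listed features (hypothetical container)."""
    sentence: str
    label: int
    idx: int


def validate(example: Example) -> Example:
    """Raise if a row does not match the documented schema."""
    if not isinstance(example.sentence, str):
        raise TypeError("sentence must be a string")
    if example.label not in LABEL_NAMES:
        raise ValueError(f"label must be 0 or 1, got {example.label!r}")
    if not isinstance(example.idx, int):
        raise TypeError("idx must be an integer")
    return example


def split_bytes_total() -> int:
    """Sum the per-split byte counts."""
    return sum(split["bytes"] for split in SPLITS.values())


if __name__ == "__main__":
    # The per-split sizes should add up to the reported total dataset size.
    assert split_bytes_total() == TOTAL_DATASET_BYTES
    row = validate(Example(sentence="a charming film", label=1, idx=0))
    print(LABEL_NAMES[row.label])
```

Comparing row counts of the loaded splits against `SPLITS` in the same way is a cheap guard against a truncated or partial download.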

Configuration

  • Configuration name: default
  • Data files:
    • train: path data/train-*
    • validation: path data/validation-*
    • test: path data/test-*
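
The data-file entries above are glob patterns, so each split is whatever set of files matches its pattern. The sketch below illustrates that mapping with the standard library only. The function `split_for` and the shard file names in the usage lines are hypothetical, chosen for illustration; the actual file names in the repository are not given in this listing.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the "default" configuration above.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches `path`, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


if __name__ == "__main__":
    # Hypothetical shard names, used only to exercise the patterns.
    for name in ("data/train-00000-of-00001.parquet",
                 "data/test-00000-of-00001.parquet",
                 "README.md"):
        print(name, "->", split_for(name))
```

Because the patterns end in a wildcard, a split may be stored as one file or as several shards without any change to the configuration.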