C-MTEB/TNews-classification
---
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: text
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': '100'
          '1': '101'
          '2': '102'
          '3': '103'
          '4': '104'
          '5': '106'
          '6': '107'
          '7': '108'
          '8': '109'
          '9': '110'
          '10': '112'
          '11': '113'
          '12': '114'
          '13': '115'
          '14': '116'
  - name: idx
    dtype: int32
  splits:
  - name: test
    num_bytes: 810970
    num_examples: 10000
  - name: train
    num_bytes: 4245677
    num_examples: 53360
  - name: validation
    num_bytes: 797922
    num_examples: 10000
  download_size: 4697191
  dataset_size: 5854569
---
# Dataset Card for "TNews-classification"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Dataset description and usage context
Dataset Overview
Configuration Information
- Default Configuration:
  - Data Files:
    - Test Set: path data/test-*
    - Training Set: path data/train-*
    - Validation Set: path data/validation-*
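
The configuration above can be exercised with a short loading sketch. It assumes the Hugging Face `datasets` library is installed and that the repository id is the one shown in the title line; `load_split` is a hypothetical helper introduced here for illustration, not something the card defines.

```python
REPO_ID = "C-MTEB/TNews-classification"  # taken from the card's title line
SPLIT_NAMES = ("train", "validation", "test")  # splits listed in the default config


def load_split(split: str):
    """Load one split of the dataset from the Hugging Face Hub.

    Hypothetical helper: requires the `datasets` package and network access.
    """
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLIT_NAMES}")
    # Imported lazily so the constants above are usable without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (downloads data, so it is left commented out):
# test_set = load_split("test")
# print(test_set.features)  # text, label and idx columns
# print(test_set[0])        # first example as a dict
```

Because only one configuration (default) is declared, no configuration name needs to be passed; the `split` argument selects which set of data files is read.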
Dataset Information
- Features:
  - Text: string type
  - Label: categorical type with the following names:
    - 0: 100
    - 1: 101
    - 2: 102
    - 3: 103
    - 4: 104
    - 5: 106
    - 6: 107
    - 7: 108
    - 8: 109
    - 9: 110
    - 10: 112
    - 11: 113
    - 12: 114
    - 13: 115
    - 14: 116
  - Index: int32 type
- Dataset Splits:
  - Test Set:
    - Bytes: 810,970
    - Samples: 10,000
  - Training Set:
    - Bytes: 4,245,677
    - Samples: 53,360
  - Validation Set:
    - Bytes: 797,922
    - Samples: 10,000
- Dataset Size:
  - Download Size: 4,697,191 bytes
  - Dataset Size: 5,854,569 bytes
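
The figures above can be cross-checked with a small, self-contained sketch. It only restates the metadata from this card; the helper names `label_name` and `label_index` are hypothetical and introduced here for illustration. Note that the label names skip 105 and 111, so a class index is not simply the name minus 100.

```python
# Label names in class-index order, copied from the card.
LABEL_NAMES = [
    "100", "101", "102", "103", "104", "106", "107", "108",
    "109", "110", "112", "113", "114", "115", "116",
]

# Split metadata copied from the card.
SPLITS = {
    "train": {"num_bytes": 4_245_677, "num_examples": 53_360},
    "validation": {"num_bytes": 797_922, "num_examples": 10_000},
    "test": {"num_bytes": 810_970, "num_examples": 10_000},
}
DATASET_SIZE = 5_854_569  # bytes, as stated on the card


def label_name(class_index: int) -> str:
    """Map a class index (0-14) to its label name."""
    return LABEL_NAMES[class_index]


def label_index(name: str) -> int:
    """Map a label name such as "106" back to its class index."""
    return LABEL_NAMES.index(name)


total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

# The per-split byte counts should add up to the stated dataset size.
assert total_bytes == DATASET_SIZE

for split, info in SPLITS.items():
    share = info["num_examples"] / total_examples
    avg_bytes = info["num_bytes"] / info["num_examples"]
    print(f"{split:<10} {share:6.1%} of examples, {avg_bytes:5.1f} bytes per example")
```

The per-split byte counts sum exactly to the stated dataset size, and the three splits together hold 73,360 examples.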