AnanthZeke/oscar_tamil_clean

The oscar_tamil_clean dataset appears to contain cleaned or preprocessed Tamil text, with text and sentence-token features.

Updated 4/5/2023

hugging_face

Dataset Overview

Training Set:
- Number of Samples: 1263180
- Data Size: 19533337624 bytes (about 19.5 GB)
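
The figures above work out to roughly 15 KB per sample, so streaming may be more practical than a full download. The following is a minimal sketch, not a documented recipe. It assumes the dataset is published on the Hugging Face Hub under the ID `AnanthZeke/oscar_tamil_clean` with a `train` split, and that the `datasets` library is installed. The helper names `size_summary` and `peek` are hypothetical. The exact column names are not listed here, so the sketch returns raw records and leaves their keys to be inspected.

```python
NUM_SAMPLES = 1_263_180           # training samples, from the overview
DATA_SIZE_BYTES = 19_533_337_624  # training data size in bytes, from the overview


def size_summary(num_samples: int, size_bytes: int) -> dict:
    """Derive human-readable size figures from the raw counts."""
    return {
        "size_gb": size_bytes / 1e9,      # decimal gigabytes
        "size_gib": size_bytes / 2**30,   # binary gibibytes
        "avg_bytes_per_sample": size_bytes / num_samples,
    }


def peek(n: int = 3, repo_id: str = "AnanthZeke/oscar_tamil_clean") -> list:
    """Stream the first n training records without downloading everything.

    Assumption: repo_id and the "train" split exist on the Hub.
    """
    # Imported lazily so size_summary works without the library installed.
    from datasets import load_dataset

    stream = load_dataset(repo_id, split="train", streaming=True)
    return list(stream.take(n))


summary = size_summary(NUM_SAMPLES, DATA_SIZE_BYTES)
print(summary)
```

Calling `peek()` needs network access; printing `record.keys()` for each returned record is a quick way to confirm the actual feature names before relying on them.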


Tamil

Natural Language Processing

Organization: hugging_face

Created: Unknown
