dutta18/omcs_dataset_full_with_embeds

---
dataset_info:
  features:
  - name: fact
    dtype: string
  - name: count
    dtype: int64
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 4951309139
    num_examples: 1578238
  download_size: 5895178326
  dataset_size: 4951309139
---

# Dataset Card for "omcs_dataset_full_with_embeds"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

Updated 11/27/2023
hugging_face

Description

Dataset Overview

Dataset Name

  • Name: omcs_dataset_full_with_embeds

Dataset Features

  • Feature List:
    • fact: string
    • count: 64-bit integer (int64)
    • embeddings: sequence of 32-bit floats (sequence: float32)
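
The three fields above can be checked with a small validator. This is a minimal sketch rather than anything from the dataset card: the function name `validate_record` and the sample row are hypothetical, and the embedding length is left unconstrained because the card does not state it.

```python
from numbers import Real

# Range of a signed 64-bit integer, matching the card's int64 dtype
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def validate_record(record: dict) -> bool:
    """Return True if `record` has the fields listed on the card:
    fact (string), count (int64), embeddings (sequence of floats)."""
    fact = record.get("fact")
    count = record.get("count")
    embeddings = record.get("embeddings")

    if not isinstance(fact, str):
        return False
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(count, bool) or not isinstance(count, int):
        return False
    if not INT64_MIN <= count <= INT64_MAX:
        return False
    if not isinstance(embeddings, (list, tuple)):
        return False
    # Every element must be a real number (length is not checked)
    return all(
        isinstance(x, Real) and not isinstance(x, bool) for x in embeddings
    )


# Hypothetical row, for illustration only (not taken from the dataset)
sample = {"fact": "a dog is an animal", "count": 3, "embeddings": [0.12, -0.5, 0.33]}
print(validate_record(sample))
```

If the dataset is read with the Hugging Face `datasets` library, for example `load_dataset("dutta18/omcs_dataset_full_with_embeds", split="train", streaming=True)`, each row should arrive as a dictionary with these three keys, so a check like this could be applied row by row without downloading the full file first.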

Dataset Split

  • Training Set (train):
    • Size: 4,951,309,139 bytes
    • Examples: 1,578,238

Dataset Size

  • Download Size: 5,895,178,326 bytes (about 5.90 GB)
  • Dataset Size (dataset_size, as reported in the card metadata): 4,951,309,139 bytes (about 4.95 GB)
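
To put the byte counts above in more familiar terms, the following sketch converts them to GiB and derives the average size of one example. The constant and function names are illustrative, and the final figure is only a rough upper bound on the average embedding length: it assumes 4 bytes per float32 value and ignores the space taken by `fact`, `count`, and any storage overhead.

```python
# Figures copied from the dataset card metadata
DOWNLOAD_BYTES = 5_895_178_326
DATASET_BYTES = 4_951_309_139
NUM_EXAMPLES = 1_578_238


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


# Average on-disk footprint of one row in the train split
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Assumption: float32 takes 4 bytes, so this bounds the average number of
# embedding values per row from above (other fields are ignored).
max_avg_floats_per_example = avg_bytes_per_example / 4

print(f"download size: {to_gib(DOWNLOAD_BYTES):.2f} GiB")
print(f"dataset size:  {to_gib(DATASET_BYTES):.2f} GiB")
print(f"average bytes per example: {avg_bytes_per_example:.1f}")
print(f"upper bound on average floats per example: {max_avg_floats_per_example:.0f}")
```

Note that the download is larger than the reported dataset size, so disk planning should budget for both figures rather than the smaller one alone.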

Topics

Natural Language Processing
Text Embedding

Source

Organization: hugging_face

Created: Unknown
