
fine-tuned/NFCorpus-512-192-gpt-4o-2024-05-13-43315

The "news articles" dataset is a synthetically generated dataset designed to support the development of domain‑specific embedding models for retrieval tasks.

Updated 5/28/2024
hugging_face

Description

NFCorpus-512-192-gpt-4o-2024-05-13-43315 Dataset

Overview

  • Name: news articles
  • License: Apache-2.0
  • Language: English
  • Task Categories:
    • Feature Extraction
    • Sentence Similarity
  • Tags:
    • sentence-transformers
    • feature-extraction
    • sentence-similarity
    • mteb
    • News
    • Articles
    • Journalism
    • Media
    • Current Events
  • Size Category: n<1K

Dataset Description

This dataset is a synthetic collection created specifically to enable the development of domain‑specific embedding models, primarily for retrieval tasks.

Associated Model

The dataset is used to train the NFCorpus-512-192-gpt-4o-2024-05-13-43315 model.

Usage

To train or evaluate models with this dataset, load it via the Hugging Face datasets library:

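from datasets import load_dataset

# Download the dataset from the Hugging Face Hub; the result is keyed by split name
dataset = load_dataset("fine-tuned/NFCorpus-512-192-gpt-4o-2024-05-13-43315")
# Print the first example of the "test" split
print(dataset["test"][0])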
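
The snippet above assumes a split named "test" exists, which this page does not confirm. As a minimal sketch, a small helper can pick the first non-empty split from a list of candidates. The helper name first_example, the candidate split names, and the "text" field in the stand-in data are hypothetical and used only for illustration; the helper relies only on dict-style access, which the object returned by load_dataset also supports.

```python
def first_example(splits, preferred=("test", "train", "validation")):
    """Return (split_name, first_row) for the first non-empty preferred split.

    `splits` is any mapping from split name to an indexable collection of rows.
    The default split names are assumptions, not taken from the dataset card.
    """
    for name in preferred:
        if name in splits and len(splits[name]) > 0:
            return name, splits[name][0]
    raise KeyError(f"none of the splits {preferred} is present and non-empty")


# Stand-in data with a hypothetical "text" field, so the sketch runs offline.
demo_splits = {"train": [{"text": "example row"}], "test": []}
split_name, row = first_example(demo_splits)
print(split_name, row)
```

With the real dataset, the same call would be first_example(dataset), where dataset is the value returned by load_dataset above.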

Topics

Natural Language Processing
News Articles

Source

Organization: hugging_face

Created: Unknown
