NFCorpus-512-192-gpt-4o-2024-05-13-43315 Dataset

Overview

Name: news articles
License: Apache-2.0
Language: English
Task Categories:
- Feature Extraction
- Sentence Similarity
Tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
- News
- Articles
- Journalism
- Media
- Current Events
Size Category: n<1K

Dataset Description

This is a synthetic dataset built to support the development of domain-specific embedding models, primarily for retrieval tasks.

Associated Model

The dataset is used to train the NFCorpus-512-192-gpt-4o-2024-05-13-43315 model.

Usage

To train or evaluate models with this dataset, load it via the Hugging Face datasets library:

from datasets import load_dataset

dataset = load_dataset("fine-tuned/NFCorpus-512-192-gpt-4o-2024-05-13-43315")
# Split names are string keys; print the first record of the "test" split
print(dataset["test"][0])
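
The card does not list which splits or columns the dataset ships with, so it may help to inspect them before indexing. The following is a minimal sketch, not part of the original card: `first_record` and `peek` are hypothetical helper names, and the fallback to the first available split is an assumption about what a reader would want when "test" is missing.

```python
def first_record(dataset, preferred_split="test"):
    """Return (split_name, first_row) from a mapping of split names to rows.

    Falls back to the first available split when `preferred_split` is absent.
    (Hypothetical helper, for illustration only.)
    """
    split_names = list(dataset.keys())
    if not split_names:
        raise ValueError("dataset has no splits")
    split = preferred_split if preferred_split in dataset else split_names[0]
    return split, dataset[split][0]


def peek(repo_id):
    """Load a dataset from the Hugging Face Hub and show its layout.

    Requires the `datasets` package and network access.
    """
    from datasets import load_dataset

    ds = load_dataset(repo_id)
    print(ds)  # shows each split with its column names and row count
    split, row = first_record(ds)
    print(split, row)
    return ds
```

Calling `peek("fine-tuned/NFCorpus-512-192-gpt-4o-2024-05-13-43315")` would print the available splits and one sample row, which makes it clear whether a "test" split actually exists before relying on it.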
