cdminix/libritts-aligned
The LibriTTS Corpus with Forced Alignments dataset is a speech dataset for automatic speech recognition (ASR) and text‑to‑speech (TTS) tasks. It includes audio files, corresponding transcripts, phonemes, and their durations. The dataset provides pre‑processed alignment information so users do not need to run the Montreal Forced Aligner locally. A data collator is also provided to create training batches. The dataset is divided into several subsets (train, dev, test, etc.) corresponding to different subsets of LibriSpeech.
Description
Dataset Overview
Name: LibriTTS Corpus with Forced Alignments
Description: This dataset contains forced‑alignment information for speech data, suitable for ASR and TTS tasks.
Dataset Details
Language: English (en)
Tags:
- speech
- audio
- automatic-speech-recognition
- text-to-speech
License: CC-BY-4.0
Task Categories:
- Automatic Speech Recognition
- Text‑to‑Speech
Dataset Contents:
- Each entry includes an audio file ID, speaker information, transcript, start and end times, phonemes with durations, and the audio file path.
- Phonemes are represented using the International Phonetic Alphabet (IPA) and durations are given in frames.
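Because durations are given in frames rather than seconds, converting them to time intervals requires the frame hop length and the sampling rate. The following is a minimal sketch of that conversion. The default values for SAMPLE_RATE and HOP_LENGTH are assumptions, not values stated by this card, and the example phones and durations are made up for illustration.

```python
# Assumed settings; NOT taken from the dataset card. Adjust to your own setup.
SAMPLE_RATE = 22050  # samples per second (assumption)
HOP_LENGTH = 256     # samples per frame (assumption)


def frames_to_seconds(frames, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Convert a frame count to seconds."""
    return frames * hop_length / sample_rate


def phone_intervals(phones, durations, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Return a list of (phone, start_sec, end_sec) from per-phone frame durations."""
    if len(phones) != len(durations):
        raise ValueError("phones and durations must have the same length")
    intervals = []
    frame_cursor = 0  # running frame offset from the start of the utterance
    for phone, n_frames in zip(phones, durations):
        start = frames_to_seconds(frame_cursor, sample_rate, hop_length)
        frame_cursor += n_frames
        end = frames_to_seconds(frame_cursor, sample_rate, hop_length)
        intervals.append((phone, start, end))
    return intervals


# Hypothetical entry, for illustration only.
example_phones = ["h", "ə", "l", "oʊ"]
example_durations = [5, 7, 6, 12]

for phone, start, end in phone_intervals(example_phones, example_durations):
    print(f"{phone}\t{start:.3f}\t{end:.3f}")
```

Keeping a running frame offset and converting only at the boundaries avoids accumulating rounding error across phones.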
Dataset Splits:
- train: All training data except one sample per speaker reserved for validation.
- dev: One sample per speaker for validation.
- train.clean.100, train.clean.360, train.other.500: Training data extracted from different LibriSpeech subsets.
- dev.clean, dev.other: Validation data extracted from different LibriSpeech subsets.
- test.clean, test.other: Test data extracted from different LibriSpeech subsets.
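The train/dev scheme described above (one sample per speaker held out for validation) can be sketched as follows. This illustrates the idea only and is not the dataset's own implementation: the card does not say which sample is reserved, so this sketch picks the first one encountered for each speaker, and the "speaker" and "id" field names are hypothetical.

```python
def split_train_dev(entries, speaker_key="speaker"):
    """Hold out one entry per speaker for dev; everything else goes to train.

    Assumption: the first entry seen for each speaker is the one held out.
    """
    train, dev = [], []
    seen_speakers = set()
    for entry in entries:
        speaker = entry[speaker_key]
        if speaker not in seen_speakers:
            seen_speakers.add(speaker)
            dev.append(entry)
        else:
            train.append(entry)
    return train, dev


# Hypothetical entries, for illustration only.
entries = [
    {"id": "utt1", "speaker": "spk_a"},
    {"id": "utt2", "speaker": "spk_a"},
    {"id": "utt3", "speaker": "spk_b"},
    {"id": "utt4", "speaker": "spk_a"},
    {"id": "utt5", "speaker": "spk_b"},
]

train, dev = split_train_dev(entries)
print(len(train), len(dev))
```

With this scheme every speaker in dev also appears in train (provided the speaker has more than one sample), so dev measures generalization to unseen utterances rather than unseen speakers.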
Environment Variables:
- LIBRITTS_VERBOSE: Controls verbosity of the dataset creation process.
- LIBRITTS_MAX_WORKERS: Sets the maximum number of worker threads for alignment creation.
- LIBRITTS_PATH: Sets the download path for LibriTTS data.
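These variables would typically be set before the dataset is created, either in the shell or from Python. The sketch below sets them from Python. The variable names come from this card, but the accepted value formats are not documented here, so the values shown ("1", "4", and the target directory) are assumptions, and configure_libritts is a hypothetical helper.

```python
import os


def configure_libritts(verbose, max_workers, data_path):
    """Set the LibriTTS-related environment variables for the current process.

    Value formats are assumptions: environment variables are always strings,
    so the arguments are converted with str().
    """
    settings = {
        "LIBRITTS_VERBOSE": str(verbose),
        "LIBRITTS_MAX_WORKERS": str(max_workers),
        "LIBRITTS_PATH": str(data_path),
    }
    os.environ.update(settings)
    return settings


# Illustrative values only; call this before the dataset is loaded or created.
applied = configure_libritts(
    verbose=1,
    max_workers=4,
    data_path=os.path.join(os.getcwd(), "libritts_data"),
)
for name, value in applied.items():
    print(f"{name}={value}")
```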
Usage Requirements
Software Dependencies:
- pip install alignments phones (required)
- pip install speech-collator (optional)
Data Collator:
- A data collator is provided for creating training batches.
- Install via pip install speech-collator; supports custom speaker2idx and phone2idx mappings.
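A custom speaker2idx or phone2idx mapping is a dictionary from a label to an integer index. The sketch below shows one way to build such mappings from dataset entries. It does not call the speech-collator package, whose API is not described in this card. The "speaker" and "phones" field names and the example entries are hypothetical.

```python
def build_index(values):
    """Map each distinct value to a stable integer index (sorted order)."""
    return {value: idx for idx, value in enumerate(sorted(set(values)))}


# Hypothetical entries, for illustration only.
entries = [
    {"speaker": "spk_b", "phones": ["h", "ə", "l", "oʊ"]},
    {"speaker": "spk_a", "phones": ["l", "oʊ"]},
]

speaker2idx = build_index(entry["speaker"] for entry in entries)
phone2idx = build_index(phone for entry in entries for phone in entry["phones"])

print(speaker2idx)
print(len(phone2idx), "distinct phones")
```

Sorting before enumerating keeps the indices reproducible across runs, which matters when a trained model's embedding tables depend on them.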
Source
Organization: hugging_face
Created: Unknown