cdminix/libritts-aligned
The LibriTTS Corpus with Forced Alignments dataset is a speech dataset for automatic speech recognition (ASR) and text‑to‑speech (TTS) tasks. It includes audio files, corresponding transcripts, phonemes, and their durations. The dataset provides pre‑processed alignment information so users do not need to run the Montreal Forced Aligner locally. A data collator is also provided to create training batches. The dataset is divided into several subsets (train, dev, test, etc.) corresponding to different subsets of LibriSpeech.
Dataset description and usage context
Dataset Overview
Name: LibriTTS Corpus with Forced Alignments
Description: This dataset contains forced‑alignment information for speech data, suitable for ASR and TTS tasks.
Dataset Details
Language: English (en)
Tags:
- speech
- audio
- automatic-speech-recognition
- text-to-speech
License: CC-BY-4.0
Task Categories:
- Automatic Speech Recognition
- Text‑to‑Speech
Dataset Contents:
- Each entry includes an audio file ID, speaker information, transcript, start and end times, phonemes with durations, and the audio file path.
- Phonemes are represented using the International Phonetic Alphabet (IPA) and durations are given in frames.
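Because durations are given in frames, turning them into timestamps requires the sampling rate and the hop length (samples per frame). Neither is stated in this card, so the values below are placeholders to replace with the ones used by your feature extraction. A minimal sketch of the conversion:

```python
# Placeholder values: this card does not state the sampling rate or hop length.
SAMPLE_RATE = 22050  # assumed samples per second
HOP_LENGTH = 256     # assumed samples per frame


def frames_to_seconds(frames, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Convert a frame count to seconds."""
    return frames * hop_length / sample_rate


def phone_intervals(phones, durations, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Return (phone, start_s, end_s) tuples from per-phone frame durations."""
    if len(phones) != len(durations):
        raise ValueError("phones and durations must have the same length")
    intervals = []
    cursor = 0  # running frame offset from the start of the utterance
    for phone, n_frames in zip(phones, durations):
        start = frames_to_seconds(cursor, sample_rate, hop_length)
        cursor += n_frames
        end = frames_to_seconds(cursor, sample_rate, hop_length)
        intervals.append((phone, start, end))
    return intervals


# Made-up example values, not taken from the dataset.
example = phone_intervals(["h", "ə", "l", "oʊ"], [5, 3, 4, 10])
```

Each interval starts where the previous one ends, so the end of the last interval equals the total duration of the aligned phones.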
Dataset Splits:
- train: All training data except one sample per speaker reserved for validation.
- dev: One sample per speaker for validation.
- train.clean.100, train.clean.360, train.other.500: Training data extracted from different LibriSpeech subsets.
- dev.clean, dev.other: Validation data extracted from different LibriSpeech subsets.
- test.clean, test.other: Test data extracted from different LibriSpeech subsets.
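The split names listed above can be passed to a loader. The sketch below assumes the Hugging Face `datasets` library is installed and that the repository loads by its identifier; the helper `load_split` is hypothetical and only wraps `load_dataset` with a name check. Depending on the installed `datasets` version, additional arguments may be required.

```python
# Split names as listed in this card.
SPLITS = (
    "train", "dev",
    "train.clean.100", "train.clean.360", "train.other.500",
    "dev.clean", "dev.other",
    "test.clean", "test.other",
)


def load_split(name, repo_id="cdminix/libritts-aligned"):
    """Load one split by name, rejecting names not listed in the card."""
    if name not in SPLITS:
        raise ValueError(f"unknown split {name!r}; expected one of {SPLITS}")
    # Imported lazily so the name check works without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(repo_id, split=name)
```

Calling `load_split("dev")` would start the download and preparation of that split, so the name check runs first to fail fast on typos.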
Environment Variables:
- LIBRITTS_VERBOSE: Controls verbosity of the dataset creation process.
- LIBRITTS_MAX_WORKERS: Sets the maximum number of worker threads for alignment creation.
- LIBRITTS_PATH: Sets the download path for LibriTTS data.
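These variables presumably need to be set before the dataset is created. One way to do that from Python is sketched below; the helper `configure_libritts` and the example values are hypothetical, and the accepted value formats (for instance, how verbosity is switched on) are not documented in this card.

```python
import os


def configure_libritts(path=None, max_workers=None, verbose=None):
    """Set LIBRITTS_* variables for this process; None leaves a variable untouched."""
    settings = {
        "LIBRITTS_PATH": path,
        "LIBRITTS_MAX_WORKERS": max_workers,
        "LIBRITTS_VERBOSE": verbose,
    }
    applied = {}
    for key, value in settings.items():
        if value is not None:
            os.environ[key] = str(value)  # environment values must be strings
            applied[key] = os.environ[key]
    return applied


# Hypothetical values; replace with a real directory and worker count.
applied = configure_libritts(path="/data/libritts", max_workers=4)
```

Setting the variables through `os.environ` only affects the current process and its children; exporting them in the shell works equally well.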
Usage Requirements
Software Dependencies:
- pip install alignments phones (required)
- pip install speech-collator (optional)
Data Collator:
- A data collator is provided for creating training batches.
- Install via pip install speech-collator; supports custom speaker2idx and phone2idx mappings.
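A custom speaker2idx or phone2idx mapping is a dictionary from a symbol to an integer index. The sketch below shows one way to build such mappings in plain Python; the entries and the field names `speaker` and `phones` are made up for illustration, and how speech-collator consumes the mappings is not described in this card.

```python
def build_index(symbols):
    """Map each distinct symbol to a stable integer id (sorted order)."""
    return {symbol: idx for idx, symbol in enumerate(sorted(set(symbols)))}


# Made-up entries; the field names "speaker" and "phones" are hypothetical.
entries = [
    {"speaker": "1034", "phones": ["h", "ə", "l", "oʊ"]},
    {"speaker": "2156", "phones": ["l", "oʊ"]},
    {"speaker": "1034", "phones": ["h", "aɪ"]},
]

speaker2idx = build_index(entry["speaker"] for entry in entries)
phone2idx = build_index(p for entry in entries for p in entry["phones"])
```

Sorting before enumerating keeps the indices reproducible across runs, which matters when a trained model's embedding tables depend on them.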