Dataset asset | Open Source Community | Dataset | Speech Synthesis
LibriSeVoc
We provide the LibriSeVoc dataset, which contains self‑vocoded samples generated by six state‑of‑the‑art neural vocoders. The goal is to highlight and exploit vocoder‑induced artifacts. The underlying real data are sourced from LibriTTS, and the files follow the LibriTTS naming convention.
Source
github
Created
Apr 4, 2023
Updated
May 23, 2024
Dataset Overview
Dataset Name
- LibriSeVoc Dataset
Description
- Designed to identify and exploit artifacts introduced by neural vocoders in synthetic speech.
- Contains self‑vocoded samples generated by six cutting‑edge vocoders, emphasizing vocoder‑produced signal artifacts.
Composition
- Detailed composition is shown in the accompanying table image.
Source Data
- The real (non‑vocoded) recordings come from LibriTTS, and file names follow the LibriTTS naming convention.
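Because the files follow the LibriTTS naming convention, a real recording and its vocoded counterparts can be matched by file name. The sketch below assumes the usual LibriTTS stem layout of four underscore-separated numeric fields (speaker, chapter, segment, utterance). The directory names `gt` and `vocoder_a` and the `_gen` suffix are hypothetical placeholders, not confirmed details of the LibriSeVoc archive; adjust them to the layout you actually download.

```python
from pathlib import Path
from typing import Dict, Iterable, List, NamedTuple, Optional


class UtteranceId(NamedTuple):
    """Fields of a LibriTTS-style file stem (assumed layout)."""
    speaker: str
    chapter: str
    segment: str
    utterance: str


def parse_libritts_stem(stem: str) -> Optional[UtteranceId]:
    """Parse e.g. '84_121123_000007_000001' into its four numeric fields.

    Any extra trailing parts (such as a hypothetical '_gen' suffix on
    vocoded copies) are ignored. Returns None if the stem does not start
    with four numeric fields.
    """
    parts = stem.split("_")
    if len(parts) < 4 or not all(p.isdigit() for p in parts[:4]):
        return None
    return UtteranceId(*parts[:4])


def group_by_utterance(paths: Iterable[str]) -> Dict[UtteranceId, List[Path]]:
    """Group audio paths that share the same utterance identifier."""
    groups: Dict[UtteranceId, List[Path]] = {}
    for raw in paths:
        path = Path(raw)
        uid = parse_libritts_stem(path.stem)
        if uid is None:
            continue  # skip files that do not follow the convention
        groups.setdefault(uid, []).append(path)
    return groups


if __name__ == "__main__":
    # Hypothetical paths: one real recording and one vocoded copy.
    example = [
        "gt/84_121123_000007_000001.wav",
        "vocoder_a/84_121123_000007_000001_gen.wav",
    ]
    for uid, files in group_by_utterance(example).items():
        print(uid.speaker, uid.chapter, [str(f) for f in files])
```

Grouping by utterance identifier rather than by full file name keeps the pairing robust if the vocoded copies carry an extra suffix or live in separate per-vocoder folders.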
Usage
- Intended for detecting synthetic human speech by exposing the artifacts that neural vocoders leave in the signal. Detection based on these artifacts is reported to improve on the RawNet2 baseline and lower its error rates.
Access
- More information and download link: Dataset Download