LibriSeVoc

We provide the LibriSeVoc dataset, which contains self‑vocoded samples generated by six state‑of‑the‑art neural vocoders. The goal is to highlight and exploit vocoder‑induced artifacts. The underlying real data are sourced from LibriTTS, following its naming convention.

Source

github

Created

Apr 4, 2023

Updated

May 23, 2024

Overview

Dataset description and usage context

Dataset Overview

Dataset Name

LibriSeVoc Dataset

Description

Built to expose and exploit the signal artifacts that neural vocoders leave in synthetic speech.
Contains self-vocoded samples, that is, real utterances re-synthesized by each of six neural vocoders, so that vocoder artifacts are the main difference between the real and generated audio.

Composition

Detailed composition is shown in the accompanying table image.

Source Data

Real data originates from LibriTTS and follows its naming logic.
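
Because the files follow the LibriTTS naming logic, a file name can be split back into its identifying fields. The sketch below is illustrative only: it assumes the usual LibriTTS stem layout of four underscore-separated numeric fields (speaker, chapter, utterance, segment), and the names `UtteranceId` and `parse_stem` are hypothetical helpers, not part of the dataset's tooling. Check the pattern against the downloaded files before relying on it.

```python
import re
from typing import NamedTuple, Optional

# Assumed LibriTTS-style stem: <speaker>_<chapter>_<utterance>_<segment>
_STEM = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)$")


class UtteranceId(NamedTuple):
    """Fields recovered from a file name (hypothetical helper type)."""
    speaker: int
    chapter: int
    utterance: str  # kept as a string to preserve zero padding
    segment: str    # kept as a string to preserve zero padding


def parse_stem(filename: str) -> Optional[UtteranceId]:
    """Return the parsed fields, or None if the name does not match."""
    stem = filename.rsplit("/", 1)[-1]  # drop any directory part
    if stem.endswith(".wav"):
        stem = stem[: -len(".wav")]
    match = _STEM.match(stem)
    if match is None:
        return None
    speaker, chapter, utterance, segment = match.groups()
    return UtteranceId(int(speaker), int(chapter), utterance, segment)
```

Grouping files by the parsed fields is one way to pair each real utterance with its vocoded counterparts, provided the generated files reuse the same stem.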

Usage

Intended for detecting synthetic human speech by exposing neural-vocoder artifacts. The dataset description states that this approach improves on the RawNet2 baseline and lowers detection error rates.

Access

More information and download link: Dataset Download
