Dataset asset: Open Source Community, Dataset, Speech Synthesis

LibriSeVoc

LibriSeVoc is a dataset of self‑vocoded speech samples generated by six state‑of‑the‑art neural vocoders, built to highlight and exploit vocoder‑induced artifacts. The underlying real recordings come from LibriTTS, and file names follow the LibriTTS naming convention.

Source: GitHub
Created: Apr 4, 2023
Updated: May 23, 2024

Dataset Overview

Dataset Name

  • LibriSeVoc Dataset

Description

  • Designed to identify and exploit artifacts introduced by neural vocoders in synthetic speech.
  • Contains self‑vocoded samples generated by six cutting‑edge vocoders, emphasizing vocoder‑produced signal artifacts.

Composition

  • Detailed composition is shown in the accompanying table image.

Source Data

  • Real data originates from LibriTTS and follows its naming logic.
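Because file names follow LibriTTS, utterance metadata can be recovered from the name alone. The sketch below assumes the common LibriTTS pattern of four underscore-separated numeric fields (for example `103_1241_000000_000001.wav`). The field labels and the function name `parse_libritts_name` are illustrative choices, not part of the dataset, and the tolerance for an extra trailing suffix such as `_gen` is an assumption about how vocoded copies might be named.

```python
from pathlib import Path
from typing import NamedTuple


class UtteranceId(NamedTuple):
    """Four ID fields of a LibriTTS-style file name (labels are illustrative)."""
    speaker: str
    chapter: str
    segment: str
    utterance: str


def parse_libritts_name(path: str) -> UtteranceId:
    """Split a LibriTTS-style file name into its four numeric ID fields.

    Assumes <speaker>_<chapter>_<segment>_<utterance>.<ext>. Any extra
    trailing fields (e.g. a hypothetical "_gen" suffix) are ignored.
    """
    fields = Path(path).stem.split("_")
    if len(fields) < 4 or not all(f.isdigit() for f in fields[:4]):
        raise ValueError(f"not a LibriTTS-style name: {path!r}")
    return UtteranceId(*fields[:4])


if __name__ == "__main__":
    print(parse_libritts_name("some/dir/103_1241_000000_000001.wav"))
```

Keeping the fields as strings preserves the zero padding, so a real recording and its vocoded counterparts can be matched by comparing the parsed tuples directly.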

Usage

  • Intended for detecting synthetic human speech from the artifacts that neural vocoders leave in the signal. The approach is reported to improve on the RawNet2 baseline and to lower error rates.
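The page does not say which error rate is meant. Equal error rate (EER) is a common metric for synthetic-speech detection, so the sketch below assumes it. It takes detector scores where a higher value means "more likely synthetic" and labels where 1 marks synthetic and 0 marks real. The function name `equal_error_rate` and the threshold sweep are illustrative, not the evaluation code used for the dataset.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping every observed score as a threshold.

    scores: higher means "more likely synthetic" (assumed convention).
    labels: 1 = synthetic, 0 = real.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one real and one synthetic sample")

    # Candidate thresholds: each distinct score, plus one above the maximum
    # so that "flag nothing as synthetic" is also considered.
    thresholds = sorted(set(scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, best_eer = None, None
    for t in thresholds:
        far = sum(s >= t for s in neg) / len(neg)  # real flagged as synthetic
        frr = sum(s < t for s in pos) / len(pos)   # synthetic missed
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    print(equal_error_rate([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1]))
```

The sweep is quadratic in the number of samples, which is fine for a quick check. A larger evaluation would typically sort the scores once and accumulate the two error counts in a single pass.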

Access

Related Paper
