VCTK Corpus

This repository provides full‑context label files for the VCTK corpus. The label files were created by following the preprocessing steps in r9y9/deepvoice3_pytorch. Both full‑context and mono label files are included; each file lists the time‑aligned phone segments of one utterance.

Updated 6/23/2022
github

Description

Dataset Overview

Dataset Name

  • Full‑context label for VCTK‑Corpus

Dataset Content

  • Provides full‑context label files for the VCTK‑Corpus.

Dataset Structure

├── lab
│   ├── full
│   │   ├── p225
│   │   │   ├── p225_001.lab
│   │   │   ├── p225_002.lab
│   │   │   ├── p225_003.lab
│   │   │   ├── p225_004.lab
│   │   │   ├── p225_005.lab
│   │   │   ...
│   ├── mono
│   │   ├── p225
│   │   │   ├── p225_001.lab
│   │   │   ├── p225_002.lab
│   │   │   ├── p225_003.lab
│   │   │   ├── p225_004.lab
│   │   │   ├── p225_005.lab
│   │   │   ...
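
The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `index_labels` and the `root` argument (the directory that contains `lab/`) are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def index_labels(root, kind="full"):
    """Map each speaker ID to a sorted list of utterance IDs.

    `root` is assumed to be the directory containing `lab/`;
    `kind` is either "full" or "mono", matching the tree above.
    """
    index = defaultdict(list)
    # Files are laid out as lab/<kind>/<speaker>/<utterance>.lab
    for path in sorted(Path(root, "lab", kind).glob("*/*.lab")):
        index[path.parent.name].append(path.stem)
    return dict(index)
```

Calling `index_labels(".", "mono")` from a checkout of the repository would then return a dictionary keyed by speaker directory names such as `p225`.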

Missing Files

  • lab/*/p315/*.lab (p315 has no txt transcript files)
  • lab/mono/p295/p295_047.lab (alignment failed)
  • lab/mono/p305/p305_423.lab (alignment failed)
  • lab/mono/p317/p317_424.lab (alignment failed)
  • lab/mono/p345/p345_387.lab (alignment failed)
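
Because some mono labels are absent, code that pairs full and mono files should check for gaps first. The sketch below lists utterances that have a full label but no mono label. It assumes the full label exists for the four utterances whose alignment failed, which the list above suggests but does not state outright; `missing_mono` and `root` are hypothetical names.

```python
from pathlib import Path


def missing_mono(root):
    """Return relative paths present under lab/full but absent under lab/mono."""
    lab = Path(root, "lab")
    full = {p.relative_to(lab / "full") for p in (lab / "full").glob("*/*.lab")}
    mono = {p.relative_to(lab / "mono") for p in (lab / "mono").glob("*/*.lab")}
    # as_posix() keeps the output identical across operating systems
    return sorted(p.as_posix() for p in full - mono)
```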

Label Format

Mono label

         0     850000 pau
    850000    2850000 pau
   2850000    3600000 p
   3600000    3900000 l
   3900000    6000000 iy
   6000000    8450000 z
   8450000    8600000 k
   8600000   11300000 ao
  11300000   11450000 l
  11450000   12800000 s
  12800000   13099999 t
  13099999   15800000 eh
  15800000   16050000 l
  16050000   17600000 ax
  17600000   20400000 pau
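
Each mono label line holds a start time, an end time, and a phone symbol. The sketch below parses such text into tuples with times in seconds. It assumes the times use the HTK convention of 100 ns units, which is consistent with the sample above (20400000 units would be 2.04 s) but is not stated in the source; `parse_mono_label` is a hypothetical helper name.

```python
HTK_UNIT_SECONDS = 1e-7  # assumption: HTK-style label times in 100 ns units


def parse_mono_label(text):
    """Parse mono label text into (start_sec, end_sec, phone) tuples."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, phone = parts
        segments.append(
            (int(start) * HTK_UNIT_SECONDS, int(end) * HTK_UNIT_SECONDS, phone)
        )
    return segments
```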

Full context label

         0     850000 x^x-pau+pau=p@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:0+0+0/D:0_0/E:x+x@x+x&x+x#x+x/F:0_0/G:0_0/H:x=x@1=1|0/I:0=0/J:4+3-1
    850000    2850000 x^pau-pau+p=l@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:1+1+4/D:0_0/E:x+x@x+x&x+x#x+x/F:content_1/G:0_0/H:x=x@1=1|0/I:4=3/J:4+3-1
   2850000    3600000 pau^pau-p+l=iy@1_4/A:0_0_0/B:1-1-4@1-1&1-4#1-3$1-4!0-1;0-1|iy/C:1+1+3/D:0_0/E:content+1@1+3&1+2#0+1/F:content_1/G:0_0/H:4=3@1=1|L-L%/I:0=0/J:4+3-1
   ...
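
In the sample lines above, the context string opens with five phones joined as `ll^l-c+r=rr@`, where `c` matches the phone in the mono label for the same segment. The sketch below pulls out the times and that five-phone window. The group names (`ll`, `l`, `c`, `r`, `rr`) and the function name are chosen here for illustration, and the remaining fields (`/A:` through `/J:`) are left unparsed.

```python
import re

# Leading five-phone window of a full-context label: ll^l-c+r=rr@...
QUINPHONE = re.compile(
    r"^(?P<ll>[^\^]+)\^(?P<l>[^-]+)-(?P<c>[^+]+)\+(?P<r>[^=]+)=(?P<rr>[^@]+)@"
)


def parse_full_label_line(line):
    """Return (start, end, phones) for one full-context label line.

    `start` and `end` are the raw integer times from the file;
    `phones` is a dict with keys ll, l, c, r, rr.
    """
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}")
    start, end, context = parts
    match = QUINPHONE.match(context)
    if match is None:
        raise ValueError("context does not start with a ll^l-c+r=rr@ window")
    return int(start), int(end), match.groupdict()
```

Applied to the third sample line, the centre phone `c` is `p`, with `pau` on the left and `l` on the right.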

Topics

Speech Recognition
Speech Synthesis

Source

Organization: github

Created: 3/8/2020