Dataset asset · Open Source Community · Speech Recognition · Anti-spoofing

encoded_LA_2021

This dataset is the ASVspoof 2021 LA (Logical Access) subset, stored with two features: 'label' and 'input_values'. 'label' is a categorical label with two classes, 'fake' and 'real'. 'input_values' is a sequence of floating-point numbers. The data is split into training (16,464 samples), validation (16,926 samples), and test (148,176 samples) sets. There is a single configuration, 'default', which maps each split to its own set of data files. The license is 'odc-by', and the dataset name is 'ASVspoof 2021 LA'.
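
As a rough illustration of the record layout described above, the sketch below decodes the integer label and streams one split with the Hugging Face `datasets` library. The integer encoding (0 = fake, 1 = real) is taken from the Feature Information list on this page. `repo_id` is a placeholder, because the page does not give the full repository path, and the helper names `decode_label` and `stream_split` are hypothetical.

```python
from __future__ import annotations

from typing import Any, Iterable

# Class names in index order, as listed on this page: 0 = fake, 1 = real.
LABEL_NAMES = ("fake", "real")


def decode_label(label_id: int) -> str:
    """Map the integer 'label' feature to its class name."""
    if label_id not in (0, 1):
        raise ValueError(f"unexpected label id: {label_id!r}")
    return LABEL_NAMES[label_id]


def stream_split(repo_id: str, split: str = "train") -> Iterable[dict[str, Any]]:
    """Lazily iterate over one split without downloading the whole dataset.

    repo_id is an assumption: pass the full Hugging Face path of the
    dataset (owner/name), which this page does not state.
    """
    # Imported here so decode_label works without `datasets` installed.
    from datasets import load_dataset

    # 'default' is the only configuration listed on this page; streaming
    # avoids fetching the full download up front.
    return load_dataset(repo_id, name="default", split=split, streaming=True)
```

If the repository exposes the features as listed, each streamed record should be a dict with `label` and `input_values` keys, so `decode_label(record["label"])` would give the class name.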

Source: Hugging Face
Created: Aug 20, 2024
Updated: Aug 20, 2024
Overview

Dataset Information

  • Feature Information:

    • label: categorical label with two classes:
      • 0: fake
      • 1: real
    • input_values: sequence of float32 values
  • Data Splits:

    • train: 16,464 samples, 2,461,510,888 bytes
    • validation: 16,926 samples, 1,849,172,416 bytes
    • test: 148,176 samples, 22,112,409,484 bytes
  • Dataset Size:

    • Download size: 23,303,764,736 bytes
    • Total size: 26,423,092,788 bytes
  • Configuration:

    • config_name: default
    • data_files:
      • train: data/train-*
      • validation: data/validation-*
      • test: data/test-*
  • License: odc-by (Open Data Commons Attribution License)

  • Dataset Name: ASVspoof 2021 LA
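
The split figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the numbers are copied from the listing, while the helper names and the 4-bytes-per-value estimate (float32, ignoring label and storage overhead) are assumptions made for illustration.

```python
from __future__ import annotations

# Split statistics copied from the Data Splits and Dataset Size entries.
SPLITS = {
    "train": {"samples": 16_464, "bytes": 2_461_510_888},
    "validation": {"samples": 16_926, "bytes": 1_849_172_416},
    "test": {"samples": 148_176, "bytes": 22_112_409_484},
}
TOTAL_SIZE_BYTES = 26_423_092_788
DOWNLOAD_SIZE_BYTES = 23_303_764_736

# Assumption: every stored value is a 4-byte float32 and overhead is negligible.
FLOAT32_BYTES = 4


def total_samples(splits: dict[str, dict[str, int]]) -> int:
    """Sum the sample counts over all splits."""
    return sum(info["samples"] for info in splits.values())


def total_bytes(splits: dict[str, dict[str, int]]) -> int:
    """Sum the per-split byte sizes."""
    return sum(info["bytes"] for info in splits.values())


def mean_values_per_sample(splits: dict[str, dict[str, int]]) -> dict[str, float]:
    """Estimate the average number of float32 values per sample in each split."""
    return {
        name: info["bytes"] / info["samples"] / FLOAT32_BYTES
        for name, info in splits.items()
    }


if __name__ == "__main__":
    print("samples:", total_samples(SPLITS))
    print("bytes match listed total:", total_bytes(SPLITS) == TOTAL_SIZE_BYTES)
    for name, values in mean_values_per_sample(SPLITS).items():
        print(f"{name}: about {values:,.0f} values per sample")
```

The per-split byte counts add up exactly to the listed total size of 26,423,092,788 bytes, and the three splits together hold 181,566 samples.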

Source

  • Source: ASVspoof 2021 LA subset
  • Copyright: Derived from the ASVspoof 2021 challenge; licensed under ODC‑BY and available via Zenodo.