speech-recognition-dataset

This dataset consists of video recordings of people uttering different phrases. It originates from the State University of Nizhny Novgorod in Russia and is distinguished by its library of Russian phrases. Most of the phrases come from classic Russian literature and other publicly available texts. Participants sat in front of a phone or laptop screen and spoke the phrases from various distances. Each video shows one person uttering a single phrase from the full phrase list. Videos are recorded in MP4 format.

Updated 5/13/2023

Description

Dataset Overview

Name: Speech recognition dataset
Content: Video recordings of people reading sentences, mainly from Russian literary works and other public texts.
Features: The dataset is distinguished by its database of Russian sentences.
Video format: mp4

Current Status

Number of speakers: 46
Number of video recordings: 1194
Number of sentences: 221

Organization

File naming format: {speakerID}.{sentenceID}.mp4, e.g., 43.168.mp4
Sentence text: Included in a file named “Фразы” (Russian for “Phrases”)
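
The naming format above can be decoded programmatically. The following is a minimal sketch, assuming both IDs are plain decimal integers as in the example 43.168.mp4; the names Recording and parse_recording_name are hypothetical helpers introduced here for illustration and are not part of the dataset.

```python
import re
from typing import NamedTuple, Optional


class Recording(NamedTuple):
    """Speaker and sentence IDs decoded from a recording's file name."""
    speaker_id: int
    sentence_id: int


# Assumption: both IDs are decimal integers, as in the example "43.168.mp4".
_NAME_PATTERN = re.compile(r"(\d+)\.(\d+)\.mp4")


def parse_recording_name(filename: str) -> Optional[Recording]:
    """Return the IDs encoded in a file name, or None if it does not match."""
    match = _NAME_PATTERN.fullmatch(filename)
    if match is None:
        return None
    return Recording(int(match.group(1)), int(match.group(2)))


print(parse_recording_name("43.168.mp4"))
# prints: Recording(speaker_id=43, sentence_id=168)
```

Returning None for non-matching names lets a caller skip other files in the download, such as the sentence list, without raising an error.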

Access

Download link: Yandex.Disk

License

Type: Creative Commons Attribution 4.0 International License
Link: Creative Commons License

Topics

Speech Recognition

Russian

Source

Organization: github

Created: 5/13/2020
