DATASET
Open Source Community
speech-commands
This dataset is used to train speech recognition models and contains 35 words divided into numeric, directional, command, animal, and other categories.
Updated 10/25/2024
github
Description
Simple Speech Recognition System
Dataset
- Dataset Name: speech-commands
- Number of Recognizable Words: 35
- Word Categories:
- Numeric: zero, one, two, three, four, five, six, seven, eight, nine
- Directional: left, right, forward, backward, up, down
- Command: go, stop, yes, no, on, off, follow
- Animal: bird, cat, dog
- Other: bed, house, happy, tree, wow, learn, visual, sheila, marvin
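As a quick check on the list above, the categories can be written as a lookup table. This is only a sketch: the names `WORD_CATEGORIES`, `LABELS`, `LABEL_TO_INDEX`, and `category_of` are illustrative, and the alphabetical label order is an assumption. The index order actually used by train.py is not stated here.

```python
# Word categories copied from the list above. All names in this block are illustrative.
WORD_CATEGORIES = {
    "numeric": ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"],
    "directional": ["left", "right", "forward", "backward", "up", "down"],
    "command": ["go", "stop", "yes", "no", "on", "off", "follow"],
    "animal": ["bird", "cat", "dog"],
    "other": ["bed", "house", "happy", "tree", "wow",
              "learn", "visual", "sheila", "marvin"],
}

# Flatten into one label list. Alphabetical order is an assumption,
# not necessarily the order the trained model uses.
LABELS = sorted(word for words in WORD_CATEGORIES.values() for word in words)
LABEL_TO_INDEX = {word: index for index, word in enumerate(LABELS)}


def category_of(word: str) -> str:
    """Return the category name that contains the given word."""
    for category, words in WORD_CATEGORIES.items():
        if word in words:
            return category
    raise KeyError(word)
```

The five categories add up to the 35 recognizable words stated above (10 + 6 + 7 + 3 + 9).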
Model Files
- Model File: speech_commands_model_epoch_20_9621--64mel.pth
- Test Set Accuracy: 96.05%
Training Code
- Training Code File: train.py
- Function: Train a speech recognition model using the speech-commands dataset
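Before training, the audio files have to be indexed and split. The sketch below is not taken from train.py. It assumes the on-disk layout of the public Speech Commands release: one folder per word, a `_background_noise_` folder, and optional `testing_list.txt` and `validation_list.txt` files listing relative paths. The function names `read_list` and `index_dataset` are hypothetical.

```python
from pathlib import Path


def read_list(path):
    """Return the set of relative paths in a split file, or an empty set if it is missing."""
    path = Path(path)
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def index_dataset(root):
    """Build (wav_path, label_index) pairs for the train, val, and test splits.

    Assumed layout: <root>/<word>/<clip>.wav, with split files at <root>.
    """
    root = Path(root)
    test_files = read_list(root / "testing_list.txt")
    val_files = read_list(root / "validation_list.txt")

    # Folders starting with "_" (such as _background_noise_) are not word classes.
    labels = sorted(
        d.name for d in root.iterdir() if d.is_dir() and not d.name.startswith("_")
    )

    splits = {"train": [], "val": [], "test": []}
    for index, label in enumerate(labels):
        for wav in sorted((root / label).glob("*.wav")):
            relative = f"{label}/{wav.name}"
            if relative in test_files:
                split = "test"
            elif relative in val_files:
                split = "val"
            else:
                split = "train"
            splits[split].append((wav, index))
    return labels, splits
```

Calling `index_dataset("path/to/speech-commands")` would return the sorted label list and three lists of `(path, label_index)` pairs that a training loop could consume.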
Inference Code
- Inference Code File: Inference.ipynb
- Functions:
- Recognize the word corresponding to a single .wav audio file
- Recognize the words corresponding to all .wav files in a folder
- Record for 2 seconds and recognize the spoken word
- Continuously record and recognize a series of spoken words, providing the (start time, end time) for each word
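The continuous mode reports a (start time, end time) pair for each word. How Inference.ipynb finds those boundaries is not described here, so the sketch below substitutes a simple energy threshold over fixed-size frames. The function names and the `frame_ms`, `threshold`, and `min_gap_ms` parameters are assumptions for illustration.

```python
import array
import math
import sys
import wave


def read_mono_16bit(path):
    """Read a 16-bit mono PCM .wav file and return (sample_rate, samples)."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return rate, samples


def find_word_spans(samples, rate, frame_ms=20, threshold=500.0, min_gap_ms=200):
    """Return a list of (start_seconds, end_seconds) for loud regions.

    A frame is "active" when its RMS amplitude reaches the threshold.
    A span closes once at least min_gap_ms of quiet audio follows it.
    """
    frame_len = max(1, rate * frame_ms // 1000)
    min_gap = rate * min_gap_ms // 1000
    spans = []
    start = end = None  # sample indices of the span currently open
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            if start is None:
                start = offset
            end = offset + len(frame)
        elif start is not None and offset + len(frame) - end >= min_gap:
            spans.append((start / rate, end / rate))
            start = None
    if start is not None:
        spans.append((start / rate, end / rate))
    return spans
```

Each returned span could then be cut out of the recording and passed to the classifier, which gives one recognized word per (start time, end time) pair. The threshold would need tuning for the microphone and the background noise level.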
Topics
Speech Recognition
Word Classification
Source
Organization: github
Created: 10/25/2024