Parkinson's Disease Speech Dataset
This dataset originates from the University of Oxford and contains 195 instances, of which 147 are Parkinson's disease patients and 48 are non‑patients. It includes 22 features such as frequency, pitch, amplitude/period of the waveform, etc., and a label where 1 indicates Parkinson's disease and 0 indicates non‑Parkinson's.
Description
Dataset Overview
Dataset Name
A Machine Learning Approach for the Diagnosis of Parkinson's Disease via Speech Analysis
Research Date
March 2022
Dataset Source
University of Oxford
Dataset Composition
- Number of Instances: 195
  - 147 Parkinson's subjects
  - 48 without Parkinson's
- Number of Features: 22
  - Includes features such as frequency, pitch, amplitude/period of the waveform, etc.
- Label: 1 represents Parkinson’s, 0 represents non‑Parkinson’s
Algorithms Used
- Logistic Regression (LR)
- Linear Discriminant Analysis (LDA)
- k-Nearest Neighbors (KNN)
- Decision Tree (DT)
- Neural Network (NN)
- Naive Bayes (NB)
- Gradient Boosting (GB)
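The seven algorithms listed above could be set up as follows. This is a minimal sketch assuming scikit-learn; the project's actual library, hyperparameters, and preprocessing are not stated here, so the function name `build_models`, the feature scaling step, and every parameter value below are illustrative assumptions.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Return one untuned estimator per listed algorithm (hypothetical helper).

    Scaling is added for the distance- and gradient-based models as an
    assumption; the source does not describe its preprocessing.
    """
    return {
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "LDA": LinearDiscriminantAnalysis(),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "DT": DecisionTreeClassifier(random_state=seed),
        "NN": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed),
        ),
        "NB": GaussianNB(),
        "GB": GradientBoostingClassifier(random_state=seed),
    }


models = build_models()
```

Each entry exposes the usual `fit` and `predict` methods, so the same training and evaluation loop can be run over all seven models.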
Engineering Goal
Develop a machine‑learning model for Parkinson’s diagnosis, achieving at least 90 % accuracy and/or a Matthews Correlation Coefficient of at least 0.9.
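The two target metrics can be computed from the confusion matrix. The sketch below uses the standard definitions of accuracy and the Matthews Correlation Coefficient; the helper names are hypothetical, and the "always predict Parkinson's" predictions are an illustrative baseline built from the class counts above, not results from the project.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = Parkinson's)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Class counts from the dataset description: 147 positive, 48 negative.
y_true = [1] * 147 + [0] * 48

# Illustrative baseline: label every instance as Parkinson's.
always_positive = [1] * len(y_true)
baseline_acc = accuracy(y_true, always_positive)
baseline_mcc = mcc(y_true, always_positive)
```

On the unbalanced data this baseline is correct on 147 of 195 instances (about 75 % accuracy) while its MCC is 0, which is why MCC is a useful second criterion alongside accuracy.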
Data Analysis Results
After the dataset was rebalanced, a 75‑25 train‑test split yielded the best performance. k-Nearest Neighbors and the Neural Network both reached the highest accuracy, 98 %.
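The rebalancing method is not specified, so the following is only a sketch under stated assumptions: a stratified 75‑25 split followed by random oversampling of the minority class. The oversampling is applied to the training partition only, so that duplicated rows cannot appear in both partitions; the project may have rebalanced differently or before splitting. The function names are hypothetical, and the code works on row indices so it does not depend on the feature values.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split row indices so each class contributes ~test_fraction to the test set."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def oversample_minority(indices, labels, seed=0):
    """Resample smaller classes with replacement until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Class counts from the dataset description: 147 positive, 48 negative.
labels = [1] * 147 + [0] * 48
train_idx, test_idx = stratified_split(labels)
balanced_train = oversample_minority(train_idx, labels)
```

With these counts the split gives 146 training and 49 test rows, and oversampling raises the training set to 220 rows, 110 per class. The test set keeps the original class proportions.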
Conclusion
The project concludes that machine learning can diagnose Parkinson’s disease from speech more accurately than existing methods, reaching 98 % accuracy. Accurate diagnosis is crucial for effective treatment.
Source
Organization: GitHub
Created: 12/17/2023