Tags: Open Source Community, Parkinson’s Disease Diagnosis, Speech Analysis

Parkinson's Disease Speech Dataset

This dataset originates from the University of Oxford and contains 195 instances, of which 147 are Parkinson's disease patients and 48 are non‑patients. It includes 22 features such as frequency, pitch, amplitude/period of the waveform, etc., and a label where 1 indicates Parkinson's disease and 0 indicates non‑Parkinson's.

Source: GitHub
Created: Dec 17, 2023
Updated: Dec 17, 2023

Dataset Overview

Dataset Name

A Machine Learning Approach for the Diagnosis of Parkinson's Disease via Speech Analysis

Research Date

March 2022

Dataset Source

University of Oxford

Dataset Composition

  • Number of Instances: 195
    • 147 Parkinson's subjects
    • 48 without Parkinson's
  • Number of Features: 22
    • Includes features such as frequency, pitch, amplitude/period of the waveform, etc.
  • Label: 1 represents Parkinson’s, 0 represents non‑Parkinson’s
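
The class counts above imply a noticeable imbalance. As a minimal sketch using only the numbers listed in this section (the variable names are illustrative), the arithmetic below shows the accuracy a model would reach by always predicting label 1, which is a useful reference point when reading any reported accuracy:

```python
# Class counts taken from the composition listed above.
n_parkinsons = 147
n_healthy = 48
n_total = n_parkinsons + n_healthy

# Share of the positive class; this is also the accuracy of a model
# that always predicts label 1.
majority_baseline = n_parkinsons / n_total

# Ratio of the majority class to the minority class.
imbalance_ratio = n_parkinsons / n_healthy

print(f"total instances: {n_total}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")  # 0.754
print(f"majority-to-minority ratio: {imbalance_ratio:.4f}")
```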

Algorithms Used

  1. Logistic Regression (LR)
  2. Linear Discriminant Analysis (LDA)
  3. k-Nearest Neighbors (KNN)
  4. Decision Tree (DT)
  5. Neural Network (NN)
  6. Naive Bayes (NB)
  7. Gradient Boosting (GB)
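
Of the algorithms listed, k-nearest neighbors is the simplest to sketch without external libraries. The example below is an illustration of the general technique, not the project's implementation: the function name `knn_predict`, the choice of Euclidean distance, `k=3`, and the toy rows are all assumptions made for demonstration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training rows."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest rows and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up two-feature rows for illustration only (not values from the dataset).
toy_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
toy_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(toy_X, toy_y, (0.82, 0.88)))  # 1
print(knn_predict(toy_X, toy_y, (0.12, 0.18)))  # 0
```

Because the vote depends on raw distances, features measured on very different scales would normally be standardized before applying this to real data.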

Engineering Goal

Develop a machine-learning model for Parkinson’s diagnosis that achieves at least 90% accuracy and/or a Matthews Correlation Coefficient (MCC) of at least 0.9.
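
The two targets measure different things, which matters on an imbalanced dataset. The sketch below computes both from confusion-matrix counts using the standard definitions; the function names are illustrative, and the second set of counts is hypothetical, not a result from the project.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Always predicting label 1 on the full 147/48 dataset.
always_positive = dict(tp=147, tn=0, fp=48, fn=0)
print(f"{accuracy(**always_positive):.3f}")  # 0.754
print(f"{mcc(**always_positive):.3f}")       # 0.000

# Hypothetical counts for a 49-row test set (illustration only).
example = dict(tp=35, tn=12, fp=1, fn=1)
print(f"{accuracy(**example):.3f}")  # 0.959
print(f"{mcc(**example):.3f}")       # 0.895
```

In the hypothetical case the accuracy target is met while the MCC falls just short of 0.9, which shows that the "and/or" in the goal is not a formality.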

Data Analysis Results

After rebalancing the dataset, a 75-25 train-test split yielded the best performance. The k-Nearest Neighbors and Neural Network models reached the highest accuracy, 98%.
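
The source does not say how the dataset was rebalanced. As a hedged sketch, the code below assumes random oversampling of the minority class and a stratified 75-25 split; the function names, the seed, and the label list built from the stated class counts are all illustrative.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def oversample_minority(indices, labels, seed=0):
    """Duplicate randomly chosen indices of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend(members + extra)
    return balanced


# Labels reconstructed from the stated class counts (147 positive, 48 negative).
labels = [1] * 147 + [0] * 48

train_idx, test_idx = stratified_split(labels)
balanced_train = oversample_minority(train_idx, labels)

print(len(train_idx), len(test_idx))                # 146 49
print(Counter(labels[i] for i in balanced_train))
```

This sketch splits first and oversamples only the training portion, so that a duplicated row cannot appear in both parts. The source describes rebalancing before the split and does not give its exact procedure.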

Conclusion

The project reports that its best machine-learning models reached 98% accuracy on this dataset, which it presents as a significant improvement over existing diagnostic methods. Accurate diagnosis is crucial for effective treatment.
