Daily and Sports Activities
This dataset originates from the UCI Machine Learning Repository and contains motion data for daily and sports activities. The data were collected with multiple sensor units worn on different body parts and then preprocessed (feature extraction and normalization) for training and testing machine learning models.
Updated 12/2/2022
Description
Dataset Overview
Dataset Name
- Name: sports_activities_dataset
- Source: UCI ML repository
- Link: Daily and Sports Activities
Dataset Files
- Main file: csir_cdri_test.ipynb
- Auxiliary files: prediction-pca.ipynb, pytorch-model.ipynb
Data Preprocessing
- Input: Patient activity data segmented into 5-second windows; each window contains 125 observations (25 per second) with 45 features (5 sensor units × 9 sensor axes).
- Processing steps:
  - Step 1: 225 features: the min, max, mean, skewness, and kurtosis of each of the 45 signals (9 sensor axes on each of the five units).
  - Step 2: 225 features: the top 5 DFT peak magnitudes of each of the 45 signals.
  - Step 3: 225 features: the frequencies corresponding to the peaks from Step 2.
  - Step 4: 495 features: 11 selected autocorrelation values of each of the 45 time series.
- Output: 1,170 features per window (225 + 225 + 225 + 495); the features in each file are normalized to the range [0, 1], and each file also includes the patient ID and activity ID.
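The four steps above can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the notebooks' actual code: the function names, the choice of lags 1 to 11 for the "11 selected autocorrelation values", the use of the five largest DFT bins as "peaks" (DC bin included), and the non-excess kurtosis definition are all hypothetical. The 25 Hz rate follows from 125 observations per 5-second window.

```python
import numpy as np

SAMPLING_HZ = 25  # 125 observations per 5-second window


def extract_features(window, n_peaks=5, lags=range(1, 12)):
    """Map one (125, 45) window to a 1,170-element feature vector."""
    mean = window.mean(axis=0)
    centered = window - mean
    var = centered.var(axis=0)
    var_safe = np.where(var > 0, var, 1.0)  # guard against constant signals
    z = centered / np.sqrt(var_safe)

    # Step 1: min, max, mean, skewness, kurtosis (5 x 45 = 225 values).
    # Kurtosis here is the plain fourth standardized moment (an assumption).
    stats = np.stack([window.min(axis=0), window.max(axis=0), mean,
                      (z ** 3).mean(axis=0), (z ** 4).mean(axis=0)])

    # Steps 2 and 3: the 5 largest DFT magnitudes per signal and their
    # frequencies in Hz (2 x 5 x 45 = 450 values). Taking the largest bins
    # rather than true local maxima is a simplification.
    spectrum = np.abs(np.fft.rfft(window, axis=0))
    freqs = np.fft.rfftfreq(window.shape[0], d=1.0 / SAMPLING_HZ)
    top = np.argsort(spectrum, axis=0)[::-1][:n_peaks]
    peak_mags = np.take_along_axis(spectrum, top, axis=0)
    peak_freqs = freqs[top]

    # Step 4: normalized autocorrelation at 11 lags (11 x 45 = 495 values).
    # Which lags the original selected is not stated; 1..11 is hypothetical.
    acf = np.stack([(centered[:-k] * centered[k:]).mean(axis=0) / var_safe
                    for k in lags])

    return np.concatenate([stats.ravel(), peak_mags.ravel(),
                           peak_freqs.ravel(), acf.ravel()])


def min_max_normalize(features):
    """Scale every feature column to the range [0, 1]."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # leave constant columns at 0
    return (features - lo) / span


# Random stand-in for four windows; not real sensor data.
rng = np.random.default_rng(0)
windows = rng.normal(size=(4, 125, 45))
features = np.stack([extract_features(w) for w in windows])
normalized = min_max_normalize(features)
print(features.shape)  # (4, 1170)
```

Applied to all 9,120 windows, the same routine would yield the 9,120 × 1,170 matrix, to which the patient ID and activity ID columns are then appended.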
Dataset Applications
- Model testing: Performed in two parts, one on the full 9,120 × 1,170 feature matrix and one on a reduced 9,120 × 30 matrix.
- PCA application: PCA was performed on the original matrix, but it did not achieve the expected results.
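As a rough illustration of the reduction from 1,170 to 30 columns, the sketch below computes PCA through an SVD in NumPy and reports the fraction of variance kept by the retained components. The function name `pca_reduce` and the random input are assumptions for illustration; the source does not say how PCA was configured or why it fell short of expectations.

```python
import numpy as np


def pca_reduce(X, n_components=30):
    """Project X onto its first n_components principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    scores = X_centered @ Vt[:n_components].T
    return scores, explained_ratio[:n_components]


# Random stand-in with the same number of columns as the feature matrix.
rng = np.random.default_rng(0)
X = rng.random((200, 1170))
scores, ratio = pca_reduce(X)
print(scores.shape)
print(f"variance retained by 30 components: {ratio.sum():.3f}")
```

Checking the retained variance is one way to judge whether 30 components are enough; a low value would suggest that much of the information in the 1,170 features is discarded.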
Model Performance
Actual dataset (9,120 × 1,172: the 1,170 features plus patient ID and activity ID)
- Activity prediction:
  - Best model: Gradient Boosting Classifier, Accuracy: 0.9368
  - Other models: Bagging Classifier, Random Forest Classifier, ExtraTrees Classifier, Decision Tree
- Patient and activity prediction:
  - Best model: Bagging Classifier, Accuracy: 0.8245
  - Other models: Gradient Boosting Classifier, Random Forest Classifier, Decision Tree, kNN (k=3)
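A comparison of this kind can be sketched with scikit-learn as below. The data are a small synthetic stand-in, and the hyperparameters, the 75/25 split, and the variable names are assumptions, so the printed accuracies say nothing about the figures reported above.

```python
import numpy as np
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 300 rows, 20 features, 3 classes with shifted means.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=300)
X = rng.normal(size=(300, 20)) + y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

models = {
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=20, random_state=0),
    "Bagging": BaggingClassifier(n_estimators=20, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=20, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=20, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "kNN (k=3)": KNeighborsClassifier(n_neighbors=3),
}

# Fit each model on the training split and record held-out accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in sorted(accuracies.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.4f}")
```

For the real data, `X` would be the 1,170 feature columns and `y` the activity ID column (or a label derived from both ID columns for the patient and activity task).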
PCA dataset (9,120 × 32)
- Activity prediction:
  - Best model: ExtraTrees Classifier, Accuracy: 0.8767
  - Other models: Gradient Boosting Classifier, Random Forest Classifier, Bagging Classifier, Neural Networks (DNN)
- Neural Network (PyTorch):
  - Best model: With Adam Optimizer, Accuracy: 0.8951
  - Other models: With Adam Optimizer + Karpathy constant, With RMSProp optimizer, With SGD Optimizer
- Neural Network (scikit‑learn):
  - Best model: MLP with Adam + ReLU, Accuracy: 0.8065
  - Other models: MLP with Adam + Sigmoid
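The two scikit-learn configurations can be compared with a sketch like the one below. In `MLPClassifier`, the sigmoid activation is selected with `activation="logistic"`. The hidden layer size, iteration limit, and synthetic data are assumptions; the source does not list the network architecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in with 30 columns, mirroring the PCA feature count.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=300)
X = rng.normal(size=(300, 30)) + y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

mlp_accuracies = {}
for label, activation in [("Adam + ReLU", "relu"),
                          ("Adam + Sigmoid", "logistic")]:
    # Same optimizer and layout; only the activation function changes.
    mlp = MLPClassifier(hidden_layer_sizes=(32,), activation=activation,
                        solver="adam", max_iter=300, random_state=0)
    mlp.fit(X_train, y_train)
    mlp_accuracies[label] = mlp.score(X_test, y_test)

for label, acc in mlp_accuracies.items():
    print(f"MLP with {label}: {acc:.4f}")
```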
Additional task (PyTorch on PCA dataset)
- Model: With RMSProp optimizer, Accuracy: 0.5043
Topics
Sports Activity Data
Machine Learning
Source
Organization: github
Created: 1/15/2018