
Daily and Sports Activities

This dataset originates from the UCI Machine Learning Repository and contains recordings of daily and sports activities. The data was collected with multiple sensor units that capture movement from different body parts. It was then preprocessed with feature extraction and normalization so that it can be used to train and test machine learning models.

Updated 12/2/2022
github

Description

Dataset Overview

Dataset Name: Daily and Sports Activities

Dataset Files

  • Main file: csir_cdri_test.ipynb
  • Auxiliary files: prediction-pca.ipynb, pytorch-model.ipynb

Data Preprocessing

  • Input: Patient activity data split into 5‑second windows; each window contains 125 observations with 45 features (five sensor units × 9 sensor axes).
  • Processing steps (applied to each window):
    • Step 1: 225 features: the minimum, maximum, mean, skewness, and kurtosis of each of the 45 signals (five units × 9 sensor axes).
    • Step 2: 225 features: the magnitudes of the top 5 DFT peaks of each of the 45 signals.
    • Step 3: 225 features: the frequencies corresponding to the peaks from Step 2.
    • Step 4: 495 features: 11 selected autocorrelation values of each of the 45 time series.
  • Output: 1,170 features per window (225 + 225 + 225 + 495); each feature file is normalized to the range [0,1] and includes the patient ID and activity ID.
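
The four steps above can be sketched for a single window as follows. This is a minimal illustration using NumPy, not the code from the notebooks. Several details are assumptions: the function names `window_features` and `minmax_scale` are hypothetical, the 25 Hz rate is derived from 125 observations per 5 seconds, the "top 5 DFT peaks" are taken as the five largest spectrum magnitudes rather than true local maxima, and the 11 autocorrelation values are assumed to be every fifth lag from 0 to 50.

```python
import numpy as np

SAMPLE_RATE_HZ = 25  # derived from 125 observations per 5-second window


def window_features(window, n_peaks=5, lags=tuple(range(0, 51, 5))):
    """Turn one (125, 45) window into a flat vector of 1,170 features."""
    x = np.asarray(window, dtype=float)
    n = x.shape[0]
    mean = x.mean(axis=0)
    centered = x - mean
    std = x.std(axis=0)
    z = centered / np.where(std > 0, std, 1.0)  # guard against constant signals

    # Step 1: min, max, mean, skewness, kurtosis per signal (5 x 45 = 225)
    stats = np.concatenate([
        x.min(axis=0), x.max(axis=0), mean,
        (z ** 3).mean(axis=0), (z ** 4).mean(axis=0),
    ])

    # Steps 2 and 3: five largest DFT magnitudes and their frequencies (225 each)
    spectrum = np.abs(np.fft.rfft(centered, axis=0))
    top = np.argsort(spectrum, axis=0)[::-1][:n_peaks]
    peak_mags = np.take_along_axis(spectrum, top, axis=0)
    peak_freqs = np.fft.rfftfreq(n, d=1.0 / SAMPLE_RATE_HZ)[top]

    # Step 4: autocorrelation at the assumed lags (11 x 45 = 495)
    autocorr = np.stack([
        (centered[: n - k] * centered[k:]).sum(axis=0) / (n - k) for k in lags
    ])

    return np.concatenate(
        [stats, peak_mags.ravel(), peak_freqs.ravel(), autocorr.ravel()]
    )


def minmax_scale(features):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    f = np.asarray(features, dtype=float)
    lo, hi = f.min(axis=0), f.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (f - lo) / span


# Synthetic stand-in for real sensor windows, for illustration only.
rng = np.random.default_rng(0)
windows = rng.normal(size=(8, 125, 45))
F = np.stack([window_features(w) for w in windows])
F_scaled = minmax_scale(F)
print(F.shape)  # (8, 1170)
```

In this sketch the patient ID and activity ID would be appended as two extra columns after scaling, so that the labels themselves are not normalized.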

Dataset Applications

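  • Model testing: Done in two parts, one using the full 9,120 × 1,170 feature matrix and the other using a PCA-reduced 9,120 × 30 matrix. With the patient ID and activity ID columns added, these become the 9,120 × 1,172 and 9,120 × 32 datasets reported under Model Performance.
  • PCA application: PCA was performed on the original matrix, but it did not achieve the expected results.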
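
A reduction from 1,170 features to 30 components can be sketched with a plain SVD-based PCA. This is only an illustration under stated assumptions: the function name `pca_reduce` is hypothetical, the notebooks may well use a library implementation instead, and the demo matrix is small random data rather than the real 9,120 × 1,170 matrix.

```python
import numpy as np


def pca_reduce(X, n_components=30):
    """Project X onto its first n_components principal components.

    Returns the projected data and the explained-variance ratio of the
    kept components.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # PCA requires centered data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    ratio = (S ** 2) / np.sum(S ** 2)
    return scores, ratio[:n_components]


# Small synthetic matrix standing in for the feature matrix.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 60))
scores, ratio = pca_reduce(X_demo, n_components=30)
print(scores.shape)   # (200, 30)
print(ratio.sum())    # share of variance retained by the 30 components
```

Checking the retained variance share is one way to judge whether 30 components are enough before comparing model accuracy on the reduced data.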

Model Performance

Actual dataset (9,120 × 1,172)

  • Activity prediction:
    • Best model: Gradient Boosting Classifier, Accuracy: 0.9368
    • Other models: Bagging Classifier, Random Forest Classifier, ExtraTrees Classifier, Decision Tree
  • Patient and activity prediction:
    • Best model: Bagging Classifier, Accuracy: 0.8245
    • Other models: Gradient Boosting Classifier, Random Forest Classifier, Decision Tree, kNN (k=3)
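
For reference, the accuracy figures in this section are fractions of correctly predicted labels. The sketch below shows how such a figure could be computed for the kNN (k=3) model on a held-out split. It is an assumption-laden illustration: the evaluation protocol used in the notebooks is not stated here, the helper names `knn_predict` and `accuracy` are hypothetical, and the data is synthetic.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k=3):
    """Predict each test label by majority vote among the k nearest neighbors."""
    # Squared Euclidean distances, shape (n_test, n_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]                     # shape (n_test, k)
    return np.array([np.bincount(row).argmax() for row in votes])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))


# Synthetic three-class data in 30 dimensions, for illustration only.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 30))
y = np.repeat(np.arange(3), 40)
X = centers[y] + rng.normal(size=(120, 30))

# Hold out 30 of the 120 samples for testing (an assumed 75/25 split).
idx = rng.permutation(120)
train, test = idx[:90], idx[90:]
y_pred = knn_predict(X[train], y[train], X[test], k=3)
acc = accuracy(y[test], y_pred)
print(acc)
```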

PCA dataset (9,120 × 32)

  • Activity prediction:
    • Best model: ExtraTrees Classifier, Accuracy: 0.8767
    • Other models: Gradient Boosting Classifier, Random Forest Classifier, Bagging Classifier, Neural Networks (DNN)
  • Neural Network (PyTorch):
    • Best model: Adam optimizer, Accuracy: 0.8951
    • Other models: Adam optimizer with the Karpathy constant, RMSProp optimizer, SGD optimizer
  • Neural Network (scikit‑learn):
    • Best model: MLP with Adam + ReLU, Accuracy: 0.8065
    • Other models: MLP with Adam + Sigmoid

Additional task (PyTorch on PCA dataset)

  • Model: RMSProp optimizer, Accuracy: 0.5043


Topics

Sports Activity Data
Machine Learning

Source

Organization: github

Created: 1/15/2018
