Tags: Open Source Community, Machine Learning, Traffic Management
PEMS_SF UCI Machine learning dataset
The PEMS_SF UCI Machine Learning Dataset provides separate training and test files for developing and validating machine learning models on traffic sensor occupancy data. The main files are PEMS_train and PEMS_test.txt, together with their label files.
Source: GitHub
Created: Nov 18, 2024
Updated: Nov 21, 2024
Overview
Dataset description and usage context
PEMSF_Project Dataset Overview
Dataset Files
- PEMS_train: Training data file; due to its large size it is not uploaded to GitLab. It can be downloaded from the following link: PEMS_train
- PEMS_trainlabels.txt: Training data label file
- PEMS_test.txt: Test data file
- PEMS_testlabels.txt: Test data label file
- First_Day_Guess_label.txt: First‑day guess label file
- First_Day_Guess_test.txt: First‑day guess test file
- Second_Day_Guess_label.txt: Second‑day guess label file
- Second_Day_Guess_test.txt: Second‑day guess test file
- Third_Day_Guess_label.txt: Third‑day guess label file
- Third_Day_Guess_test.txt: Third‑day guess test file
- stations_list.txt: Text file containing all sensor IDs for data extraction
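The file list above does not describe the on-disk format. The sketch below assumes the layout documented for the UCI PEMS-SF release, where each line of a data file is one MATLAB-style matrix such as `[a b c; d e f]` and each label file holds one bracketed list of integers. That layout is an assumption to check against the downloaded files, and the helper names `parse_matrix_line`, `parse_label_line`, and `load_series` are hypothetical, not part of the project notebooks.

```python
from pathlib import Path


def parse_matrix_line(line: str) -> list[list[float]]:
    """Parse one '[a b c; d e f]' line into a list of rows of floats.

    Assumes MATLAB-style syntax: rows separated by ';', values by spaces.
    """
    body = line.strip().lstrip("[").rstrip("]")
    return [
        [float(value) for value in row.split()]
        for row in body.split(";")
        if row.strip()  # skip empty fragments, e.g. after a trailing ';'
    ]


def parse_label_line(line: str) -> list[int]:
    """Parse a '[1 2 3 ...]' label line into a list of integers."""
    body = line.strip().lstrip("[").rstrip("]")
    return [int(token) for token in body.split()]


def load_series(path: str) -> list[list[list[float]]]:
    """Read a data file such as PEMS_test.txt, one matrix per non-empty line."""
    with Path(path).open() as handle:
        return [parse_matrix_line(line) for line in handle if line.strip()]
```

For example, `load_series("PEMS_test.txt")` would return one matrix per sample if the file follows the assumed layout. Plain lists keep the sketch dependency-free; converting the result to a NumPy array is a natural next step for model training.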
Code Files
- project_group2.ipynb: Python notebook for model training
- Group2_Project_Prototype.ipynb: Prototype notebook for the project
- Project_Data_Extractions.ipynb: Python notebook for extracting occupancy data from https://pems.dot.ca.gov
Usage Instructions
- Run project_group2.ipynb:
  - Download PEMS_train and place it in the same directory as the notebook.
  - Download PEMS_trainlabels.txt, PEMS_test.txt, and PEMS_testlabels.txt and ensure they are in the same directory.
- Run Group2_Project_Prototype.ipynb:
  - Download the notebook and the associated files (PEMS_test.txt, PEMS_trainlabels.txt, First_Day_Guess_label.txt, First_Day_Guess_test.txt, Second_Day_Guess_label.txt, Second_Day_Guess_test.txt, Third_Day_Guess_label.txt, Third_Day_Guess_test.txt) and place them alongside the notebook.
- Run Project_Data_Extractions.ipynb:
  - Create an account at https://pems.dot.ca.gov and enter the username and password on lines 110‑111 of the notebook.
  - Download stations_list.txt and keep it in the notebook’s directory.
  - Execute the notebook to collect and preprocess occupancy sensor data into self_test.txt.
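Each notebook expects its input files to sit next to it, so a quick check before running can save a failed run. The following is a minimal sketch of such a check. The file names are taken from the instructions above, while the `REQUIRED_FILES` mapping and the `missing_files` helper are hypothetical and not part of the project.

```python
from pathlib import Path

# Files each notebook expects in its own directory, per the usage instructions.
REQUIRED_FILES = {
    "project_group2.ipynb": [
        "PEMS_train",
        "PEMS_trainlabels.txt",
        "PEMS_test.txt",
        "PEMS_testlabels.txt",
    ],
    "Group2_Project_Prototype.ipynb": [
        "PEMS_test.txt",
        "PEMS_trainlabels.txt",
        "First_Day_Guess_label.txt",
        "First_Day_Guess_test.txt",
        "Second_Day_Guess_label.txt",
        "Second_Day_Guess_test.txt",
        "Third_Day_Guess_label.txt",
        "Third_Day_Guess_test.txt",
    ],
    "Project_Data_Extractions.ipynb": ["stations_list.txt"],
}


def missing_files(notebook: str, directory: str = ".") -> list[str]:
    """Return the required files for `notebook` that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES[notebook] if not (base / name).is_file()]


if __name__ == "__main__":
    for notebook in REQUIRED_FILES:
        missing = missing_files(notebook)
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{notebook}: {status}")
```

Running the script from the notebook directory prints one status line per notebook.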
Future Work
Project_Data_Extractions.ipynb is under development to automate the entire data collection and organization process for seamless model ingestion.
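The extraction notebook currently expects the PeMS username and password to be typed into the notebook itself. One possible step toward automated collection is to read them from environment variables instead. The sketch below assumes that approach. The variable names `PEMS_USERNAME` and `PEMS_PASSWORD` and the helper `load_pems_credentials` are hypothetical and do not appear in the notebook.

```python
import os
from collections.abc import Mapping


def load_pems_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (username, password) read from the given environment mapping.

    Raises RuntimeError naming the missing variable, so an unattended run
    fails early instead of sending empty credentials.
    """
    try:
        return env["PEMS_USERNAME"], env["PEMS_PASSWORD"]
    except KeyError as exc:
        raise RuntimeError(
            f"Set the {exc.args[0]} environment variable before running the notebook"
        ) from exc
```

With both variables set in the shell, the notebook could call `username, password = load_pems_credentials()` in place of hard-coded values, which also keeps credentials out of version control.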