PEMS_SF UCI Machine Learning Dataset
The PEMS_SF UCI Machine Learning Dataset is a collection of traffic occupancy sensor data for training and testing machine learning models. It provides separate training and test files, each with a matching label file, for model development and validation. Files include PEMS_train and PEMS_test.txt, among others.
Updated 11/21/2024
github
Description
PEMSF_Project Dataset Overview
Dataset Files
- PEMS_train: Training data file; due to its large size it is not uploaded to GitLab. It can be downloaded from the following link: PEMS_train
- PEMS_trainlabels.txt: Training data label file
- PEMS_test.txt: Test data file
- PEMS_testlabels.txt: Test data label file
- First_Day_Guess_label.txt: First‑day guess label file
- First_Day_Guess_test.txt: First‑day guess test file
- Second_Day_Guess_label.txt: Second‑day guess label file
- Second_Day_Guess_test.txt: Second‑day guess test file
- Third_Day_Guess_label.txt: Third‑day guess label file
- Third_Day_Guess_test.txt: Third‑day guess test file
- stations_list.txt: Text file containing all sensor IDs for data extraction
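Before opening any of the notebooks, it can help to confirm which of the files listed above are already in the working directory. The following is a minimal sketch, not part of the project: the helper name `missing_files` and the constant `DATASET_FILES` are hypothetical, and the file names are copied from the list above (PEMS_train is listed there without an extension, so none is assumed here).

```python
from pathlib import Path

# File names exactly as they appear in the "Dataset Files" list.
# PEMS_train has no extension in that list, so none is added here.
DATASET_FILES = [
    "PEMS_train",
    "PEMS_trainlabels.txt",
    "PEMS_test.txt",
    "PEMS_testlabels.txt",
    "First_Day_Guess_label.txt",
    "First_Day_Guess_test.txt",
    "Second_Day_Guess_label.txt",
    "Second_Day_Guess_test.txt",
    "Third_Day_Guess_label.txt",
    "Third_Day_Guess_test.txt",
    "stations_list.txt",
]


def missing_files(directory, expected=DATASET_FILES):
    """Return the expected file names that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All listed dataset files are present.")
```

Passing a shorter `expected` list restricts the check to the files a single notebook needs.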
Code Files
- project_group2.ipynb: Python notebook for model training
- Group2_Project_Prototype.ipynb: Prototype notebook for the project
- Project_Data_Extractions.ipynb: Python notebook for extracting occupancy data from https://pems.dot.ca.gov
Usage Instructions
- Run project_group2.ipynb:
  - Download PEMS_train and place it in the same directory as the notebook.
  - Download PEMS_trainlabels.txt, PEMS_test.txt, and PEMS_testlabels.txt and ensure they are in the same directory.
- Run Group2_Project_Prototype.ipynb:
  - Download the notebook and the associated files (PEMS_test.txt, PEMS_trainlabels.txt, First_Day_Guess_label.txt, First_Day_Guess_test.txt, Second_Day_Guess_label.txt, Second_Day_Guess_test.txt, Third_Day_Guess_label.txt, Third_Day_Guess_test.txt) and place them alongside the notebook.
- Run Project_Data_Extractions.ipynb:
  - Create an account at https://pems.dot.ca.gov and enter the username and password on lines 110‑111 of the notebook.
  - Download stations_list.txt and keep it in the notebook’s directory.
  - Execute the notebook to collect and preprocess occupancy sensor data into self_test.txt.
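The extraction notebook expects the PeMS username and password to be typed into the notebook itself. As an alternative that keeps credentials out of the file, the sketch below reads them from environment variables and loads the sensor IDs from stations_list.txt. This is an illustration only, not the notebook's actual code: the variable names `PEMS_USERNAME` and `PEMS_PASSWORD` and both helper functions are hypothetical, and the layout of stations_list.txt (IDs separated by whitespace or commas) is an assumption that should be checked against the real file.

```python
import os
from pathlib import Path


def load_credentials(env=os.environ):
    """Read PeMS login details from environment variables.

    PEMS_USERNAME and PEMS_PASSWORD are hypothetical names chosen for
    this sketch; the notebook itself does not define them.
    """
    username = env.get("PEMS_USERNAME")
    password = env.get("PEMS_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set PEMS_USERNAME and PEMS_PASSWORD before running.")
    return username, password


def load_station_ids(path="stations_list.txt"):
    """Return sensor IDs as strings.

    Assumes the IDs are separated by whitespace and/or commas; adjust
    the parsing if the real file uses a different layout.
    """
    text = Path(path).read_text()
    return text.replace(",", " ").split()
```

The returned values could then be assigned to the variables the notebook defines on its credential lines, so the password never has to be saved with the notebook.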
Future Work
Project_Data_Extractions.ipynb is under development to automate the entire data collection and organization process for seamless model ingestion.
Topics
Traffic Management
Machine Learning
Source
Organization: github
Created: 11/18/2024