Dataset asset: Open Source Community, Machine Learning, Power System Security
Electrical Substations Cybersecurity Dataset
This dataset is used to train and evaluate machine‑learning models for power substation network cybersecurity, containing network capture data for IEC 61850 and IEC 104 protocols.
Source
GitHub
Created
Oct 6, 2024
Updated
Oct 7, 2024
Dataset Overview
Research Background
This dataset is part of the research project "Training Dataset for Machine‑Learning‑Based Power Substation Intrusion Detection System" and is currently awaiting publication approval. It is intended to train and evaluate machine‑learning models for power substation network security.
Data Format
- File formats: PCAP or PCAPNG
- Protocols captured: IEC 61850 and IEC 60870‑5‑104 (also known as IEC 104)
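Because a capture may arrive in either format, it can help to check which one a file is before processing it. The following is a minimal sketch, not part of the dataset's own scripts: it inspects the first four bytes of a file against the well-known PCAP and PCAPNG magic numbers. The function name `detect_capture_format` is hypothetical.

```python
# Magic numbers for classic PCAP (both byte orders, microsecond and
# nanosecond timestamp variants) and for the PCAPNG Section Header Block.
PCAP_MAGICS = {
    bytes.fromhex("a1b2c3d4"),
    bytes.fromhex("d4c3b2a1"),
    bytes.fromhex("a1b23c4d"),
    bytes.fromhex("4d3cb2a1"),
}
PCAPNG_MAGIC = bytes.fromhex("0a0d0d0a")


def detect_capture_format(header: bytes) -> str:
    """Return 'pcap', 'pcapng', or 'unknown' based on the first four bytes."""
    magic = header[:4]
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    if magic in PCAP_MAGICS:
        return "pcap"
    return "unknown"


def detect_file_format(path: str) -> str:
    """Read the first four bytes of a file and classify it."""
    with open(path, "rb") as handle:
        return detect_capture_format(handle.read(4))
```

Checking the header rather than the file extension avoids surprises when a `.pcap` file actually contains PCAPNG data.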
Data‑Processing Tools
- tshark: used by the preprocessing scripts
- Sanicap: used for anonymization
- Cicflowmeter: used for feature extraction
Data Processing Workflow
- Filtering & Splitting: Use Wireshark’s tshark tool to filter and split large files into 10 GB units.
- Merging: Merge the split files.
- Anonymization: Apply Sanicap to anonymize PCAPNG files.
- CSV Generation: Extract features based on IEC 104 and IEC 61850 protocols and generate CSV files.
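The filtering and merging steps above can be sketched as command builders. This is an illustration under stated assumptions, not the project's actual preprocessing script: the display filter names (`iec60870_104`, `goose`, `mms`, `sv`) are assumed Wireshark protocol names that should be checked against your tshark version, and `mergecap` (shipped with Wireshark) is assumed as the merge tool because the source does not name one. The 10 GB splitting step is left out, since the source does not say how it is done.

```python
import shlex
from typing import List, Sequence

# Assumed display filter covering IEC 104 and common IEC 61850 protocols.
# Verify these protocol names against your own tshark installation.
ASSUMED_FILTER = "iec60870_104 or goose or mms or sv"


def build_filter_command(src: str, dst: str,
                         display_filter: str = ASSUMED_FILTER) -> List[str]:
    """Build a tshark command that keeps only packets matching the filter."""
    # -r reads a capture file, -Y applies a display filter, -w writes output.
    return ["tshark", "-r", src, "-Y", display_filter, "-w", dst]


def build_merge_command(parts: Sequence[str], dst: str) -> List[str]:
    """Build a mergecap command that merges several captures into one."""
    if not parts:
        raise ValueError("at least one input capture is required")
    # -w names the merged output file; the inputs follow as positional args.
    return ["mergecap", "-w", dst, *parts]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    print(shlex.join(build_filter_command("raw.pcapng", "filtered.pcapng")))
    print(shlex.join(build_merge_command(["a.pcapng", "b.pcapng"],
                                         "merged.pcapng")))
```

Returning argument lists rather than shell strings means the commands can be passed directly to `subprocess.run` without quoting issues. The file names shown are placeholders.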
Dataset Usage
- Machine‑Learning Algorithm Testing: Run Python scripts to test machine‑learning algorithms on the dataset.
- Label Generation: Derive each file's label from the text after the final "‑" in its filename; this text indicates the attack type or no attack.
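The labeling rule can be sketched in a few lines. This is a minimal example, assuming the separator in the actual filenames is an ASCII hyphen and that the file extension is not part of the label. The function name and the example filenames are hypothetical.

```python
from pathlib import Path


def label_from_filename(filename: str) -> str:
    """Return the text after the final '-' in the filename, without extension."""
    stem = Path(filename).stem  # drop directories and the file extension
    _, separator, label = stem.rpartition("-")
    if not separator:
        # No hyphen at all: the naming rule does not apply to this file.
        raise ValueError(f"no '-' found in filename: {filename!r}")
    return label.strip()


if __name__ == "__main__":
    # Hypothetical filenames, used only to illustrate the rule.
    for name in ["substation-capture-dos.csv", "captures/day1-normal.pcapng"]:
        print(name, "->", label_from_filename(name))
```

Raising an error for filenames without a hyphen makes unlabeled files visible early instead of silently assigning them a wrong class.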
Environment Requirements
- Python and related libraries
- GPU (optional but recommended)
Installation & Execution
- Install Tools: Install tshark, Sanicap, and Cicflowmeter.
- Install Python & Libraries: Create a Conda virtual environment and install required Python packages.
- Run IDS: Activate the environment and execute pycaret_ids.py to test the dataset.