
Electrical Substations Cybersecurity Dataset

This dataset contains network captures of IEC 61850 and IEC 104 traffic, intended for training and evaluating machine‑learning models for power substation network cybersecurity.

Updated 10/7/2024
github

Description

Dataset Overview

Research Background

This dataset is part of the research project "Training Dataset for Machine‑Learning‑Based Power Substation Intrusion Detection System" and is currently awaiting publication approval. It is intended to train and evaluate machine‑learning models for power substation network security.

Data Format

  • File formats: PCAP or PCAPNG
  • Protocols captured: IEC 61850 and IEC 60870‑5‑104 (also known as IEC 104)
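
Because the captures may be either PCAP or PCAPNG, a loader sometimes needs to tell the two apart before choosing a tool. The sketch below is one way to do that from the first four bytes of a file, using the well-known magic numbers of both formats. The function names are hypothetical and are not part of the dataset's scripts.

```python
# Classic PCAP magic numbers, in both byte orders, for microsecond
# and nanosecond timestamp resolution.
PCAP_MAGICS = {
    bytes.fromhex("a1b2c3d4"),
    bytes.fromhex("d4c3b2a1"),
    bytes.fromhex("a1b23c4d"),
    bytes.fromhex("4d3cb2a1"),
}

# A PCAPNG file starts with a Section Header Block whose block type
# is 0x0A0D0D0A (the same bytes in either byte order).
PCAPNG_MAGIC = bytes.fromhex("0a0d0d0a")


def detect_capture_format(header: bytes) -> str:
    """Return 'pcap', 'pcapng', or 'unknown' from a file's leading bytes."""
    magic = header[:4]
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    if magic in PCAP_MAGICS:
        return "pcap"
    return "unknown"


def detect_file_format(path: str) -> str:
    """Hypothetical helper: read the first four bytes of a file and classify it."""
    with open(path, "rb") as fh:
        return detect_capture_format(fh.read(4))
```

Checking the content rather than the file extension avoids surprises when a PCAPNG capture has been saved with a .pcap name.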

Data‑Processing Tools

  • tshark: used by the preprocessing scripts
  • Sanicap: used for anonymization
  • Cicflowmeter: used for feature extraction

Data Processing Workflow

  1. Filtering & Splitting: Use Wireshark’s tshark tool to filter and split large files into 10 GB units.
  2. Merging: Merge the split files.
  3. Anonymization: Apply Sanicap to anonymize PCAPNG files.
  4. CSV Generation: Extract features based on IEC 104 and IEC 61850 protocols and generate CSV files.
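
The filtering and merging steps above can be sketched as command builders driven from Python. This is a minimal illustration, not the project's actual preprocessing script: the display filter, the function names, and the use of mergecap (which ships with Wireshark) for the merge step are all assumptions. The filter relies only on standard identifiers: TCP port 2404 for IEC 104, TCP port 102 for MMS, and EtherTypes 0x88b8 and 0x88ba for GOOSE and Sampled Values.

```python
# Assumed traffic selector, not taken from the dataset's scripts.
# VLAN-tagged GOOSE/SV frames may need a different expression.
SUBSTATION_FILTER = (
    "tcp.port == 2404 || tcp.port == 102 || "
    "eth.type == 0x88b8 || eth.type == 0x88ba"
)


def tshark_filter_cmd(src, dst, display_filter=SUBSTATION_FILTER):
    """Build a tshark command that keeps only matching packets.

    -r reads a capture, -Y applies a display filter,
    -F selects the output format, -w writes the result.
    """
    return [
        "tshark", "-r", str(src),
        "-Y", display_filter,
        "-F", "pcapng",
        "-w", str(dst),
    ]


def mergecap_cmd(parts, dst):
    """Build a mergecap command that joins several captures into one file."""
    return ["mergecap", "-w", str(dst), *[str(p) for p in parts]]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once tshark and mergecap are on the PATH. Building the command as a list avoids shell quoting problems with the filter expression.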

Dataset Usage

  • Machine‑Learning Algorithm Testing: Run Python scripts to test machine‑learning algorithms on the dataset.
  • Label Generation: Derive each label from the text after the final "‑" in the filename; this text names the attack type or indicates that no attack is present.
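
The labeling rule above can be sketched in a few lines. The function name and the example filenames in the usage note are hypothetical; the only behavior taken from the description is that the label is the text after the final hyphen in the filename. The sketch assumes the file extension is not part of the label and treats a non-breaking hyphen the same as an ASCII one.

```python
from pathlib import Path


def label_from_filename(path: str) -> str:
    """Return the text after the final hyphen of the file's base name."""
    # Drop the directory and the last extension, and normalize a
    # non-breaking hyphen (U+2011) to a plain ASCII hyphen.
    stem = Path(path).stem.replace("\u2011", "-")
    _, sep, label = stem.rpartition("-")
    if not sep or not label:
        raise ValueError(f"no label suffix found in {path!r}")
    return label
```

For a hypothetical file named `captures/station1-2024-dos.pcapng` this returns `dos`, and for `station1-2024-normal.csv` it returns `normal`. A name with no hyphen raises `ValueError` instead of silently producing a wrong label.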

Environment Requirements

  • Python and related libraries
  • GPU (optional but recommended)

Installation & Execution

  • Install Tools: Install tshark, Sanicap, and Cicflowmeter.
  • Install Python & Libraries: Create a Conda virtual environment and install required Python packages.
  • Run IDS: Activate the environment and execute pycaret_ids.py to test the dataset.


Topics

Power System Security
Machine Learning

Source

Organization: github

Created: 10/6/2024
