Dataset asset | Open Source Community | Network Traffic Analysis | IoT Security
IoT-23
The IoT-23 dataset, provided by the Stratosphere Laboratory, contains labeled IoT network traffic covering both benign and malicious flows.
Source
GitHub
Created
Apr 17, 2024
Updated
Apr 17, 2024
Dataset Overview
Dataset Name
- IoT-23
Source
- Provider: Stratosphere Laboratory
Content
- Labeled IoT network traffic, including both benign and malicious flows.
- The lightweight version excludes PCAP files and contains only the labeled flows.
Size
- Compressed download: 8.8 GB
- Extracted size: ~44 GB
Use Cases
- Research on network traffic anomaly detection and classification.
- After download and extraction, the data are processed with the accompanying Python scripts (see Processing Steps).
Processing Steps
- Data Extraction: Extract traffic from scenario files, producing separate benign and attack flow files.
- Data Shuffling: Split large files and shuffle randomly to improve sample reliability.
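The two steps above can be sketched as follows. This is a minimal illustration, not the project's actual scripts: it assumes the labeled flows are Zeek-style `conn.log.labeled` text files in which header lines start with `#` and the label appears as its own whitespace-separated token (`Benign` or `Malicious`). The function names `split_by_label` and `shuffle_in_chunks`, the output file names, and the chunk size are all hypothetical.

```python
import random
from pathlib import Path
from typing import List, Tuple


def split_by_label(labeled_log: Path, benign_out: Path, attack_out: Path) -> Tuple[int, int]:
    """Append benign and attack flows from one labeled log to separate files."""
    n_benign = n_attack = 0
    with labeled_log.open() as src, \
            benign_out.open("a") as benign, \
            attack_out.open("a") as attack:
        for line in src:
            # Skip Zeek-style header/footer comments and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            # Assumption: the label is a standalone token somewhere in the line.
            tokens = {token.lower() for token in line.split()}
            if "benign" in tokens:
                benign.write(line)
                n_benign += 1
            elif "malicious" in tokens:
                attack.write(line)
                n_attack += 1
    return n_benign, n_attack


def shuffle_in_chunks(src: Path, out_dir: Path, chunk_size: int, seed: int = 0) -> List[Path]:
    """Split a large flow file into parts of chunk_size lines and shuffle each part."""
    rng = random.Random(seed)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: List[Path] = []
    chunk: List[str] = []

    def flush() -> None:
        # Shuffle the buffered lines and write them as the next numbered part.
        rng.shuffle(chunk)
        part = out_dir / f"{src.stem}_part{len(parts):04d}.txt"
        part.write_text("".join(chunk))
        parts.append(part)
        chunk.clear()

    with src.open() as f:
        for line in f:
            chunk.append(line)
            if len(chunk) == chunk_size:
                flush()
    if chunk:
        flush()
    return parts
```

Note that this shuffles records only within each part, which keeps memory use bounded by `chunk_size`; a fully global shuffle of a multi-gigabyte file would need a different approach, such as assigning each line to a random part on a first pass.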
Dependencies
- Programming language: Python 3.8.8
- ML tools: scikit-learn 0.24.1
- Scientific computing: NumPy 1.19.5
- Data analysis: pandas 1.2.2
- Visualization: matplotlib 3.3.4, seaborn 0.11.1
- System info: psutil 5.8.0
- Model serialization: pickle (part of the Python standard library)
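One way to confirm that an environment matches the versions listed above is to compare them against the installed package metadata. The sketch below uses `importlib.metadata`, which is available in the standard library from Python 3.8; the `PINNED` table and the `check_versions` helper are illustrative names, not part of the dataset's tooling.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the dependency list above.
PINNED = {
    "scikit-learn": "0.24.1",
    "numpy": "1.19.5",
    "pandas": "1.2.2",
    "matplotlib": "3.3.4",
    "seaborn": "0.11.1",
    "psutil": "5.8.0",
}


def check_versions(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Return {package: (wanted, installed or None, matches)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    print(f"Python {sys.version.split()[0]} (listed: 3.8.8)")
    for name, (wanted, found, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:14s} wanted {wanted:8s} found {found or 'not installed':14s} {status}")
```

A mismatch does not necessarily mean the scripts will fail, but pickled models in particular are often sensitive to the scikit-learn version that produced them.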
Configuration
- Config file: config.py (set dataset and experiment paths).
- Config check: run the provided script to verify correctness.
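The contents of config.py and the check script are not shown here, so the following is only a guess at their shape. The setting names `DATASET_DIR` and `EXPERIMENTS_DIR`, the example paths, and the `check_config` helper are all hypothetical; the real file may use different names and perform different checks.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical settings; adjust to wherever the archive was extracted.
DATASET_DIR = Path("/data/iot23/raw")
EXPERIMENTS_DIR = Path("/data/iot23/experiments")


def check_config(paths: Dict[str, Path]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all paths look usable."""
    problems = []
    for name, path in paths.items():
        if not path.exists():
            problems.append(f"{name}: {path} does not exist")
        elif not path.is_dir():
            problems.append(f"{name}: {path} is not a directory")
    return problems


if __name__ == "__main__":
    issues = check_config({"DATASET_DIR": DATASET_DIR, "EXPERIMENTS_DIR": EXPERIMENTS_DIR})
    if issues:
        for issue in issues:
            print("config problem:", issue)
    else:
        print("config looks fine")
```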
Experiments
- Demo: Quick verification run on 10,000 records.
- Full: Processes more than 20 million records and takes roughly 24 hours.
- Custom: TODO
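How the demo's 10,000 records are selected is not stated. If a uniform random subset of a large flow file is wanted without loading the whole file into memory, reservoir sampling (Algorithm R) is one option; the sketch below is an illustration of that technique, and `reservoir_sample` is a hypothetical helper rather than part of the provided scripts.

```python
import random
from typing import Iterable, List


def reservoir_sample(lines: Iterable[str], k: int, seed: int = 0) -> List[str]:
    """Draw up to k items uniformly at random from a stream in a single pass."""
    rng = random.Random(seed)
    sample: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = line
    return sample
```

Because the function accepts any iterable, an open file handle can be passed directly, for example `reservoir_sample(open(path), 10_000)`, where `path` points at one of the extracted flow files.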