Dataset asset: Open Source Community, Cybersecurity, Network Intrusion Detection

CICIDS2018

The dataset comprises labeled network traffic data, encompassing various attacks (e.g., DoS, brute‑force, SQL injection, botnet) and normal traffic.

Source: GitHub
Created: Oct 2, 2024
Updated: Oct 3, 2024

Dataset Overview

Dataset Information

Dataset Name

  • CICIDS2018 Dataset

Dataset Description

  • Description: This dataset contains labeled network traffic data covering multiple attack types (e.g., DoS, brute‑force, SQL injection, botnet) and normal traffic.
  • Link: The dataset can be downloaded here.
  • Size: Split across multiple CSV files; the total exceeds several hundred MB.
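
Because the data is spread over several large CSV files, it can help to stream rows rather than load everything at once. The following is a minimal sketch using only the Python standard library. The column name `Label` and the helper names `iter_rows` and `count_labels` are assumptions for illustration and are not taken from the project's scripts.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Iterable, Iterator


def iter_rows(csv_paths: Iterable[Path]) -> Iterator[dict]:
    """Yield rows one at a time from several CSV files.

    Only one row is held in memory at a time, which keeps memory use
    flat even when the files total hundreds of MB.
    """
    for path in csv_paths:
        with open(path, newline="", encoding="utf-8") as handle:
            yield from csv.DictReader(handle)


def count_labels(csv_paths: Iterable[Path], label_column: str = "Label") -> Counter:
    """Count how often each class label appears across all files.

    `label_column` is an assumed column name; adjust it to match the
    header of the files you actually downloaded.
    """
    counts: Counter = Counter()
    for row in iter_rows(csv_paths):
        counts[row[label_column].strip()] += 1
    return counts
```

A call such as `count_labels(sorted(Path("dataset").glob("*.csv")))` would then give a quick view of the class distribution, which is worth checking before training because attack classes in intrusion data are often far rarer than normal traffic.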

Dataset Usage

  • Training Data: dataset/train_data.csv
  • Test Data: dataset/test.csv
  • Training Data Version: artifacts/train_data.csv

Dataset Processing

Data Ingestion

  • Script: src/components/data_ingestion.py

Data Transformation

  • Script: src/components/data_transformation.py

Model Training

  • Script: src/components/model_trainer.py

Model Performance

Accuracy

  • Test Accuracy: 89.75%
  • Training Accuracy: 89.87%

F1 Score

  • Test F1 Score: 88.27%
  • Training F1 Score: 88.40%

Recall

  • Test Recall: 89.75%
  • Training Recall: 89.87%

Precision

  • Test Precision: 89.08%
  • Training Precision: 89.31%

Balanced Accuracy

  • Balanced Accuracy: 86.55%

ROC AUC

  • Test ROC AUC: 99.17%
  • Training ROC AUC: 99.21%
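
The reported recall values match the accuracy values exactly (89.75% on test, 89.87% on training). That is what one would expect if recall were averaged across classes weighted by class support, since support-weighted recall is mathematically equal to accuracy; the source does not state which averaging was used, so this is an assumption. Balanced accuracy, by contrast, is the unweighted mean of per-class recall, so it falls below accuracy when rare classes are classified less reliably. The sketch below illustrates both relationships on a small made-up label set; the function names and the toy data are illustrative only.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Recall for each true class: correct predictions / class support."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / n for label, n in support.items()}


def weighted_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Per-class recall averaged with weights equal to class support."""
    support = Counter(y_true)
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls[c] * support[c] for c in support) / len(y_true)


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class recall."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy, imbalanced example (made-up labels, not drawn from the dataset).
y_true = ["Benign"] * 6 + ["DoS"] * 3 + ["Bot"] * 1
y_pred = ["Benign"] * 6 + ["DoS", "DoS", "Benign"] + ["Benign"]

acc = accuracy(y_true, y_pred)
w_rec = weighted_recall(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.4f} weighted_recall={w_rec:.4f} balanced_accuracy={bal:.4f}")
```

In this toy case accuracy and weighted recall coincide, while balanced accuracy is lower because the rarest class is never predicted correctly. The gap between 89.75% accuracy and 86.55% balanced accuracy reported above is consistent with the same effect, though per-class figures would be needed to confirm it.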