JUHE API Marketplace
High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

Jetlime/NF-CSE-CIC-IDS2018-v2

Cybersecurity
Intrusion Detection

The NF‑CSE‑CIC‑IDS2018‑v2 dataset is a NetFlow version derived from the original CSE‑CIC‑IDS2018 pcaps, intended for network intrusion detection systems. It includes 18,893,708 flow records, of which 2,258,141 (11.95%) are attack samples and 16,635,567 (88.05%) are benign. The dataset is stratified by attack type and split into training (95%) and testing (5%) sets. Features include source/destination IPs, ports, protocol, byte/packet counts, flow duration, and many derived statistics.

## Dataset Structure

- **Classes**: Benign, BruteForce, Bot, DoS, DDoS, Infiltration, Web Attacks, etc.
- **Feature List**: Includes fields such as IPV4_SRC_ADDR, IPV4_DST_ADDR, L4_SRC_PORT, PROTOCOL, IN_BYTES, OUT_BYTES, FLOW_DURATION_MILLISECONDS, TCP_FLAGS, and many others.
- **Splits**: Train (≈17.9 M samples), Test (≈0.94 M samples).

The dataset is publicly available for academic research; commercial use requires author permission.
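The stratified 95/5 split described above can be illustrated with a minimal sketch. This is not the dataset authors' procedure: the function name `stratified_split`, the seed, and the toy labels are assumptions made for illustration, and only the split ratio and the approximate benign/attack proportions come from the description.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.05, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both sets."""
    # Group record indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Toy labels (hypothetical) mirroring the stated 88.05% / 11.95% benign/attack ratio.
labels = ["Benign"] * 8805 + ["Attack"] * 1195
train_idx, test_idx = stratified_split(labels)
```

On real data the labels would be the attack-type column, so that each attack class appears in both splits in proportion to its overall frequency.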

hugging_face

CSIC 2010 Dataset

Cybersecurity
Intrusion Detection

The CSIC 2010 dataset is a collection of HTTP request logs that includes both normal and malicious traffic. It is designed for network intrusion detection research and contains attack types such as SQL injection, buffer overflow, and directory traversal.

github

NSL-KDD

Cybersecurity
Intrusion Detection

NSL‑KDD is an improved dataset designed to address several inherent problems of the KDD99 dataset. It removes redundant records from the training set, eliminates duplicate records from the test set, and provides a moderate number of records, enabling consistent and comparable evaluation across different research works.
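As a rough illustration of what removing redundant records means, the sketch below drops exact duplicate rows while keeping the first occurrence of each. It is not the procedure used to build NSL‑KDD; the helper name `drop_duplicate_records` and the sample rows are hypothetical.

```python
def drop_duplicate_records(records):
    """Keep the first occurrence of each record, preserving the original order."""
    # dict.fromkeys keeps insertion order and discards repeated keys.
    return list(dict.fromkeys(records))


# Hypothetical connection records: (protocol, service, bytes, label).
rows = [
    ("tcp", "http", 181, "normal"),
    ("tcp", "http", 181, "normal"),  # exact duplicate of the first row
    ("udp", "private", 105, "attack"),
]
unique_rows = drop_duplicate_records(rows)
```

Records must be hashable (for example tuples) for this approach to work.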

github

Jetlime/NF-UNSW-NB15-v2

Cybersecurity
Intrusion Detection

The NF‑UNSW‑NB15‑v2 dataset is an extension of the UNSW‑NB15 dataset in NetFlow format, adding extra NetFlow features and labeling corresponding attack categories. It contains 2,390,275 flows, of which 95,053 (3.98%) are attack samples and 2,295,222 (96.02%) are benign. Attack samples are divided into nine sub‑categories: Fuzzers, Analysis, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. The dataset is primarily used for network‑traffic intrusion detection system research.
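The class shares quoted above follow directly from the flow counts. The short check below recomputes them; the variable names are illustrative, and only the counts come from the description.

```python
# Flow counts as stated in the dataset description.
attack_flows = 95_053
benign_flows = 2_295_222
total_flows = attack_flows + benign_flows

# Percentage share of each class, rounded to two decimals.
attack_share = round(100 * attack_flows / total_flows, 2)
benign_share = round(100 * benign_flows / total_flows, 2)

print(f"total={total_flows:,} attack={attack_share}% benign={benign_share}%")
```

With fewer than 4% attack flows, the dataset is heavily imbalanced, which is worth keeping in mind when choosing evaluation metrics.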

hugging_face