Credit Card Fraud Detection Dataset

The dataset contains transactions made by European credit‑card holders in September 2013. It covers 284,807 transactions over two days, of which 492 are fraudulent. The dataset is highly imbalanced: fraud (the positive class) accounts for 0.172% of all transactions. All input variables are numerical. Features V1–V28 are principal components obtained through a PCA transformation; Time and Amount are the only features left untransformed. Time records the seconds elapsed between each transaction and the first transaction in the dataset. Amount is the transaction amount and can be used for example‑dependent cost‑sensitive learning. Class is the response variable, set to 1 for fraud and 0 otherwise.
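
To make the cost-sensitive use of Amount concrete, here is a minimal sketch of one possible per-transaction cost scheme. The scheme itself is an assumption rather than part of the dataset: a missed fraud is taken to cost the full transaction amount, and every flagged transaction is taken to cost a fixed investigation fee. `ADMIN_COST`, `Transaction`, `example_cost`, and `total_cost` are hypothetical names, and the sample values are made up.

```python
from dataclasses import dataclass

# Hypothetical fixed cost of investigating one flagged transaction.
# This value is an illustration, not something the dataset provides.
ADMIN_COST = 5.0


@dataclass
class Transaction:
    amount: float  # the Amount column
    label: int     # the Class column: 1 = fraud, 0 = legitimate


def example_cost(tx, predicted, admin_cost=ADMIN_COST):
    """Cost of one prediction under the assumed cost scheme."""
    if predicted == 1:
        # Any flagged transaction triggers an investigation, fraud or not.
        return admin_cost
    if tx.label == 1:
        # Missed fraud: the full transaction amount is assumed lost.
        return tx.amount
    return 0.0  # Legitimate transaction that was correctly accepted.


def total_cost(transactions, predictions, admin_cost=ADMIN_COST):
    """Sum the per-transaction costs for a list of 0/1 predictions."""
    return sum(
        example_cost(tx, pred, admin_cost)
        for tx, pred in zip(transactions, predictions)
    )


# Made-up transactions for illustration only.
sample = [
    Transaction(amount=12.0, label=0),
    Transaction(amount=900.0, label=1),
    Transaction(amount=3.0, label=1),
]

flag_nothing = total_cost(sample, [0, 0, 0])  # misses both frauds
flag_large = total_cost(sample, [0, 1, 0])    # flags only the large fraud
print(flag_nothing, flag_large)
```

Under these assumptions, leaving the 3.0 fraud unflagged is cheaper than investigating it, which is the kind of trade-off a single fixed misclassification cost cannot express.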

Source: GitHub
Created: Apr 14, 2020
Updated: Mar 9, 2023

Dataset Overview

Dataset Name

Credit Card Fraud Detection

Dataset Description

This dataset contains transactions made by European credit‑card holders in September 2013. It covers two days of activity: 284,807 transactions in total, of which 492 are fraudulent. The classes are extremely imbalanced, with fraudulent transactions (the positive class) making up 0.172% of all records.
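
The arithmetic behind the imbalance can be checked directly from the two counts quoted above. The short sketch below also computes the accuracy of a trivial model that labels every transaction as legitimate, which helps show why plain accuracy says little on data like this. The function names are illustrative, not part of any library.

```python
# Counts quoted in the dataset description.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492


def fraud_rate(n_fraud, n_total):
    """Return the fraction of transactions labeled as fraud."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_fraud / n_total


def majority_baseline_accuracy(n_fraud, n_total):
    """Accuracy of a trivial model that labels every transaction legitimate."""
    return 1.0 - fraud_rate(n_fraud, n_total)


rate = fraud_rate(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
baseline = majority_baseline_accuracy(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
legit_per_fraud = (TOTAL_TRANSACTIONS - FRAUD_TRANSACTIONS) / FRAUD_TRANSACTIONS

print(f"fraud rate: {rate:.5%}")
print(f"all-legitimate baseline accuracy: {baseline:.3%}")
print(f"legitimate transactions per fraud: {legit_per_fraud:.0f}")
```

Because the trivial baseline already scores above 99.8% accuracy while catching no fraud at all, metrics such as precision, recall, or the area under the precision-recall curve are usually more informative for this dataset.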

Dataset Features

  • Input Variables: All input variables are numerical. Most are the result of a PCA transformation; the original features and further background information are withheld for privacy reasons.
  • Feature Details:
    • V1–V28: Principal components obtained via PCA.
    • Time: Seconds elapsed between each transaction and the first transaction in the dataset.
    • Amount: Transaction amount (not PCA‑transformed), usable for example‑dependent cost‑sensitive learning.
    • Class: Response variable; 1 indicates fraud, 0 indicates legitimate transaction.
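
The feature list above can be read as a 31-column schema. Here is a minimal sketch of parsing rows in that layout using only the Python standard library. It assumes a CSV file whose header uses exactly the column names listed above; the page does not state the file format, so treat that as an assumption. The rows are synthetic, and `parse_transactions` and `make_row` are hypothetical helpers.

```python
import csv
import io

# Column names taken from the feature list: Time, V1..V28, Amount, Class.
EXPECTED_COLUMNS = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount", "Class"]


def parse_transactions(handle):
    """Yield one dict per CSV row with numeric fields converted from text."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        record = {name: float(row[name]) for name in EXPECTED_COLUMNS[:-1]}
        record["Class"] = int(row["Class"])  # 1 = fraud, 0 = legitimate
        # Time is seconds since the first transaction; derive elapsed hours.
        record["hours_elapsed"] = record["Time"] / 3600.0
        yield record


def make_row(time_s, amount, label):
    # Made-up values; every principal component is set to 0.0 for brevity.
    return [str(time_s)] + ["0.0"] * 28 + [str(amount), str(label)]


# Build a tiny in-memory CSV so the sketch runs without the real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(EXPECTED_COLUMNS)
writer.writerow(make_row(0, 25.5, 0))
writer.writerow(make_row(7200, 100.0, 1))
buffer.seek(0)

records = list(parse_transactions(buffer))
print(records[1]["hours_elapsed"], records[1]["Class"])
```

To run this against a real copy of the data, replace the in-memory buffer with an open file handle, for example `open(path, newline="")`, where `path` points to wherever the CSV is stored.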

Data Source

The dataset was collected and analyzed jointly by Worldline and the Machine Learning Group at Université Libre de Bruxelles (ULB) for big‑data mining and fraud‑detection research.
