Dataset asset, Open Source Community, Fraud Detection, Simulated Data
PaySim
The PaySim dataset contains over 6 million simulated transactions, each with 9 features, generated by the PaySim retail simulation software. It is used for fraud and anomaly detection: in the simulation, fraudulent agents try to profit by transferring funds and then withdrawing cash from the system.
Source: GitHub
Created: Feb 23, 2018
Updated: May 7, 2024
Dataset Overview
Dataset Name
Fraud and Anomaly Detection using Synthetic Transactional Data
Dataset Goal
Develop a method to minimize false negatives when evaluating new data points.
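One way to make the goal above concrete is to measure the false negative rate (the share of actual fraud cases a model misses) and choose a decision threshold that keeps it within a budget. The sketch below is only an illustration under stated assumptions: the function names, the `max_fnr` parameter, and the example scores are hypothetical and do not come from the dataset or its authors.

```python
from typing import Sequence


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual fraud cases (label 1) that were predicted as non-fraud (0)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = fn + tp
    return fn / positives if positives else 0.0


def pick_threshold(y_true: Sequence[int], scores: Sequence[float], max_fnr: float) -> float:
    """Return the highest threshold whose false negative rate is <= max_fnr.

    A transaction is flagged when its score is >= the threshold. Higher
    thresholds flag fewer transactions, so the highest acceptable one
    limits false alarms while staying within the miss budget.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if max_fnr < 0:
        raise ValueError("max_fnr must be non-negative")
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        if false_negative_rate(y_true, y_pred) <= max_fnr:
            return threshold
    # Not reached: the lowest score flags everything, giving a rate of 0.
    return min(scores)


# Hypothetical labels and model scores, for illustration only.
example_labels = [0, 0, 1, 0, 1, 1]
example_scores = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
strict_threshold = pick_threshold(example_labels, example_scores, max_fnr=0.0)
```

Setting `max_fnr=0.0` forces every known fraud case to be flagged, at the cost of more false alarms; a looser budget allows a higher threshold.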
Dataset Source
PaySim dataset, generated by PaySim Retail Simulation Software, containing over 6 million data points.
Dataset Location
Dataset Features
- type: Transaction type, including CASH-IN, CASH-OUT, DEBIT, PAYMENT, and TRANSFER.
- amount: Transaction amount in local currency.
- nameOrig: Customer initiating the transaction.
- oldbalanceOrg: Sender's balance before the transaction.
- newbalanceOrig: Sender's balance after the transaction.
- nameDest: Customer receiving the transaction.
- oldbalanceDest: Recipient's balance before the transaction. Note: recipients whose name starts with M (merchants) do not have this information.
- newbalanceDest: Recipient's balance after the transaction. Note: recipients whose name starts with M (merchants) do not have this information.
Target Variable
- isFraud: Flag marking transactions made by fraudulent agents in the simulation.
Additional Feature
- step: Synthetic timestamp of the transaction.
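The columns listed above could be read and checked along the following lines. This is a minimal sketch that assumes the data is a CSV file with a header row using the column spellings given here; the helper names `parse_row` and `load_transactions`, the derived `destIsMerchant` field, and the two sample rows are hypothetical (the rows are made up for illustration and are not records from PaySim).

```python
import csv
import io

# Column names as spelled in the feature list above (assumed to match the file header).
EXPECTED_COLUMNS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud",
]
NUMERIC_COLUMNS = [
    "amount", "oldbalanceOrg", "newbalanceOrig", "oldbalanceDest", "newbalanceDest",
]

# Made-up rows for illustration only; these are NOT taken from the dataset.
SAMPLE_CSV = """step,type,amount,nameOrig,oldbalanceOrg,newbalanceOrig,nameDest,oldbalanceDest,newbalanceDest,isFraud
1,PAYMENT,100.0,C0001,500.0,400.0,M0001,0.0,0.0,0
1,TRANSFER,250.0,C0002,250.0,0.0,C0003,0.0,0.0,1
"""


def parse_row(raw):
    """Convert one CSV row (all strings) into typed values."""
    row = dict(raw)
    row["step"] = int(raw["step"])
    for col in NUMERIC_COLUMNS:
        row[col] = float(raw[col])
    row["isFraud"] = int(raw["isFraud"])
    # Recipients prefixed with "M" are merchants, which carry no balance
    # information, so their balance fields are treated as missing.
    row["destIsMerchant"] = raw["nameDest"].startswith("M")
    if row["destIsMerchant"]:
        row["oldbalanceDest"] = None
        row["newbalanceDest"] = None
    return row


def load_transactions(handle):
    """Read transactions from an open text file, checking the header first."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [parse_row(raw) for raw in reader]


rows = load_transactions(io.StringIO(SAMPLE_CSV))
fraud_count = sum(row["isFraud"] for row in rows)
```

Treating merchant balances as missing rather than as zero keeps a model from reading a placeholder value as a real balance. To run against an actual file, pass an open file handle to `load_transactions` instead of the in-memory sample.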