
PaySim

The PaySim dataset contains over 6 million data points, each with 9 features, generated by the PaySim retail simulation software. It is used for fraud and anomaly detection. In the simulation, fraudulent agents try to profit by transferring funds to other accounts and then withdrawing the cash from the system.

Updated 5/7/2024

Description

Dataset Overview

Dataset Name

Fraud and Anomaly Detection using Synthetic Transactional Data

Dataset Goal

Develop a method that minimizes false negatives (fraudulent transactions classified as legitimate) when evaluating new data points.
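
To make the goal concrete, the sketch below counts false negatives and computes recall for a set of binary predictions. It is a minimal illustration, not a method from the dataset authors: the function name false_negative_summary and the example labels are hypothetical, and it assumes 1 means fraud and 0 means legitimate.

```python
from typing import Dict, Sequence


def false_negative_summary(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Count confusion-matrix cells for binary labels (assumed: 1 = fraud, 0 = legitimate)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, passed

    positives = tp + fn
    # Recall is the share of actual fraud that was caught; the false-negative
    # rate is its complement. Both are reported as 0.0 when there is no fraud.
    recall = tp / positives if positives else 0.0
    fn_rate = fn / positives if positives else 0.0
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn, "recall": recall, "fn_rate": fn_rate}


# Made-up labels for illustration only; they are not taken from PaySim.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
summary = false_negative_summary(y_true, y_pred)
print(summary)
```

Minimizing false negatives corresponds to pushing recall toward 1, usually at the cost of more false positives, so both counts are worth tracking together.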

Dataset Source

PaySim dataset, generated by PaySim Retail Simulation Software, containing over 6 million data points.

Dataset Location

Kaggle

Dataset Features

  1. type: Transaction type, including CASH-IN, CASH-OUT, DEBIT, PAYMENT, and TRANSFER.
  2. amount: Transaction amount in local currency.
  3. nameOrig: Customer initiating the transaction.
  4. oldbalanceOrg: Originating customer's balance before the transaction.
  5. newbalanceOrig: Originating customer's balance after the transaction.
  6. nameDest: Recipient customer of the transaction.
  7. oldbalanceDest: Recipient's balance before the transaction. Note: recipients whose names start with M (merchants) do not have this information.
  8. newbalanceDest: Recipient's balance after the transaction. Note: recipients whose names start with M (merchants) do not have this information.
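
The feature list above can be sketched as a typed record. The example below parses rows with the eight listed columns using only the Python standard library and treats the destination balances of merchant recipients as missing. The Transaction class, the parse_row helper, and the sample rows are hypothetical. How the real file encodes the missing merchant balances is not stated here, so the placeholder values in the sample are an assumption.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    type: str
    amount: float
    nameOrig: str
    oldbalanceOrg: float
    newbalanceOrig: float
    nameDest: str
    oldbalanceDest: Optional[float]  # None for merchant recipients
    newbalanceDest: Optional[float]  # None for merchant recipients


def parse_row(row: dict) -> Transaction:
    # Assumption: a recipient is a merchant when nameDest starts with "M",
    # and whatever is stored in its balance columns should be ignored.
    is_merchant = row["nameDest"].startswith("M")
    return Transaction(
        type=row["type"],
        amount=float(row["amount"]),
        nameOrig=row["nameOrig"],
        oldbalanceOrg=float(row["oldbalanceOrg"]),
        newbalanceOrig=float(row["newbalanceOrig"]),
        nameDest=row["nameDest"],
        oldbalanceDest=None if is_merchant else float(row["oldbalanceDest"]),
        newbalanceDest=None if is_merchant else float(row["newbalanceDest"]),
    )


# Made-up rows for illustration only; names and amounts are not from PaySim.
SAMPLE = """type,amount,nameOrig,oldbalanceOrg,newbalanceOrig,nameDest,oldbalanceDest,newbalanceDest
PAYMENT,100.0,C0001,500.0,400.0,M0001,0.0,0.0
TRANSFER,250.0,C0002,250.0,0.0,C0003,0.0,250.0
"""

transactions = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for tx in transactions:
    print(tx.type, tx.nameDest, tx.oldbalanceDest, tx.newbalanceDest)
```

Representing the merchant balances as None rather than a number keeps them from being mistaken for real zero balances in later feature engineering.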

Target Variable

  1. isFraud: Marks transactions conducted by fraudulent agents in the simulation.

Additional Feature

  1. step: Synthetic timestamp indicating when the transaction occurred.
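
Because the step value orders transactions in time, one common way to evaluate a detector on "new" data is to train on earlier steps and test on later ones. The sketch below shows such a chronological split. The split_by_step helper, the cutoff value, and the sample rows are hypothetical, and the code assumes only that a larger step means a later transaction.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def split_by_step(rows: List[Row], cutoff_step: int) -> Tuple[List[Row], List[Row]]:
    """Split rows chronologically: steps <= cutoff go to train, later steps to test."""
    train = [r for r in rows if r["step"] <= cutoff_step]
    test = [r for r in rows if r["step"] > cutoff_step]
    return train, test


# Made-up rows for illustration only; they are not taken from PaySim.
rows = [
    {"step": 1, "isFraud": 0},
    {"step": 2, "isFraud": 1},
    {"step": 3, "isFraud": 0},
    {"step": 5, "isFraud": 1},
]

train, test = split_by_step(rows, cutoff_step=2)
print(len(train), "training rows;", len(test), "test rows")
```

Compared with a random split, a split on step keeps later transactions out of the training set, which is closer to how a model would meet unseen data.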

Topics

Fraud Detection
Simulated Data

Source

Organization: github

Created: 2/23/2018
