Vehicle Claim

This dataset is synthetic data created from the DVI dataset for vehicle claim auditing. It includes attributes such as vehicle make, model, color, registration year, body type, mileage, engine size, transmission type, fuel type, price, seat count, door count, damage type, specific damage, repair complexity, repair hours, and repair cost.

Source

GitHub

Created

Sep 26, 2022

Updated

Dec 24, 2022

Dataset Overview

Dataset List

Vehicle Claim - Synthetic dataset created using the DVI dataset.
Car Insurance - Dataset from Kaggle, link: Car Insurance.
Vehicle Insurance - Dataset from GitHub, link: Vehicle Insurance.

Vehicle Claim dataset details

Creation code: Code to create the dataset.
Dataset storage location: Dataset storage location.
Attribute list:
- Maker - Categorical, vehicle make.
- GenModel - Categorical, vehicle model.
- Color - Categorical, vehicle color.
- Reg_Year - Categorical, registration year.
- Body_Type - Categorical, e.g., SUV, Convertible.
- Runned_Miles - Numerical, vehicle mileage.
- Engin_Size - Categorical, engine size.
- GearBox - Categorical, automatic or manual.
- FuelType - Categorical, gasoline or diesel.
- Price - Numerical, vehicle price.
- Seat_num - Numerical, number of seats.
- Door_num - Numerical, number of doors.
- issue - Categorical, damage type.
- issue_id - Categorical, specific damage.
- repair_complexity - Categorical, repair difficulty.
- repair_hours - Numerical, repair time required.
- repair_cost - Numerical, repair cost.
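
The attribute list above can be turned into a small schema for reading the data. The following is a minimal sketch using only the Python standard library; the column names and their categorical/numerical split come from the list, while the `parse_row` helper and the sample row are hypothetical and made up for illustration (they are not taken from the dataset or its creation code).

```python
import csv
import io

# Column groups exactly as given in the attribute list.
CATEGORICAL = [
    "Maker", "GenModel", "Color", "Reg_Year", "Body_Type", "Engin_Size",
    "GearBox", "FuelType", "issue", "issue_id", "repair_complexity",
]
NUMERICAL = [
    "Runned_Miles", "Price", "Seat_num", "Door_num",
    "repair_hours", "repair_cost",
]


def parse_row(row):
    """Cast one CSV row (a dict of strings) according to the attribute list.

    Hypothetical helper: categorical values stay as strings,
    numerical values are converted to float.
    """
    parsed = {col: row[col] for col in CATEGORICAL}
    parsed.update({col: float(row[col]) for col in NUMERICAL})
    return parsed


# Made-up example row for illustration only; the values are assumptions.
SAMPLE_CSV = (
    ",".join(CATEGORICAL + NUMERICAL) + "\n"
    "ExampleMake,ExampleModel,Blue,2015,SUV,2.0L,Manual,Diesel,"
    "damage_a,damage_a_1,2,"
    "42000,15000,5,4,3.5,250\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(rows[0]["Maker"], rows[0]["Price"])
```

Note that Reg_Year and Engin_Size are kept as strings here because the list marks them as categorical, even though their values may look numeric.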

Training and Evaluation Parameters

Training parameters:
- dataset - Training dataset selection (vehicle_claims, car_insurance, vehicle_insurance).
- data - Data type (normal or mixed).
- encoding - Categorical feature encoding method.
- numerical - Whether to use only numerical features.
- batch_size - Batch size.
- epoch - Number of training epochs.
- latent_dim - Latent space dimension.
Evaluation parameters:
- threshold - Evaluation threshold.

Citation

Paper citation:

@article{
  Author = {Ajay Chawda and Stefanie Grimm and Marius Kloft},
  Title = {Unsupervised Anomaly Detection for Auditing Data and Impact of Categorical Encodings},
  Journal = {https://arxiv.org/abs/2210.14056},
  Year = {2022},
}
