DATASET
Open Source Community
Credit Approval Dataset
This dataset contains 16 attributes, A1 to A16, mixing continuous and categorical data, with missing values represented by "?". The target is a binary class label stored in attribute A16 as "+" or "-".
Updated 4/23/2020
github
Description
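Dataset Overview
Dataset Name
- Title / Machine Learning Model Test
- Credit Approval Dataset
Dataset Source
- UCI Machine Learning Lab
- UCI data repository
Dataset Link
Dataset Contents
- Number of attributes: 16 (A1 to A16)
- Data types: both continuous and non-continuous (categorical) data
- Missing values: present, represented by "?"
- Target: binary class label, stored in A16 as "+" or "-"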
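As an illustration of the layout described above, here is a minimal sketch that parses rows into the attributes A1 to A16 and treats "?" as a missing value. It assumes a comma-separated file with no header row, which the card does not state. The sample rows are made up for the example, and `read_rows` and `missing_counts` are hypothetical helper names.

```python
import csv
import io

COLUMNS = [f"A{i}" for i in range(1, 17)]  # A1 .. A16
MISSING = "?"

# Made-up sample rows for illustration only; not taken from the real file.
SAMPLE = """\
b,30.5,1.25,u,g,w,v,2.0,t,t,1,f,g,200,0,+
a,?,4.0,u,g,q,h,0.5,t,f,0,f,g,?,100,-
?,22.0,0.75,y,p,c,v,1.0,f,f,0,t,s,120,5,-
"""


def read_rows(text):
    """Parse comma-separated text into dicts keyed A1..A16, mapping '?' to None."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        row = {}
        for name, value in zip(COLUMNS, fields):
            value = value.strip()
            row[name] = None if value == MISSING else value
        rows.append(row)
    return rows


def missing_counts(rows):
    """Count missing values per attribute; attributes with none are omitted."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            if value is None:
                counts[name] = counts.get(name, 0) + 1
    return counts


rows = read_rows(SAMPLE)
print(missing_counts(rows))
```

Counting the missing values per attribute before cleaning shows how many rows a given cleaning strategy would affect.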
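Data Preprocessing Steps
- Data cleaning: handle the missing values represented by "?"
- Type conversion: convert the continuous attributes A2, A3, A8, A11, A14, and A15 to float
- Target conversion: convert the "+" and "-" values in A16 to 0 and 1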
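The three steps above could be sketched as follows. The card does not say how missing values are handled or which symbol maps to which number, so this sketch makes two assumptions: incomplete rows are dropped rather than imputed, and "+" and "-" pair with 0 and 1 in the order they are listed. The sample rows are made up, with "?" already replaced by `None`, and `make_row` and `preprocess` are hypothetical helper names.

```python
CONTINUOUS = ("A2", "A3", "A8", "A11", "A14", "A15")
TARGET = "A16"
# Assumption: "+" and "-" pair with 0 and 1 in the order the text lists them.
# Swap the values if the positive class should be encoded as 1.
TARGET_MAP = {"+": 0, "-": 1}


def make_row(values):
    """Build a dict keyed A1..A16 from a list of 16 raw values."""
    return {f"A{i}": value for i, value in enumerate(values, start=1)}


def preprocess(rows):
    """Drop incomplete rows, cast continuous attributes to float, encode the target."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # assumption: rows with missing values are dropped
        out = dict(row)
        for name in CONTINUOUS:
            out[name] = float(out[name])
        out[TARGET] = TARGET_MAP[out[TARGET]]
        cleaned.append(out)
    return cleaned


# Made-up rows for illustration; None marks a value that was "?" in the file.
rows = [
    make_row(["b", "30.5", "1.25", "u", "g", "w", "v", "2.0",
              "t", "t", "1", "f", "g", "200", "0", "+"]),
    make_row(["a", None, "4.0", "u", "g", "q", "h", "0.5",
              "t", "f", "0", "f", "g", None, "100", "-"]),
    make_row(["a", "41.0", "0.75", "y", "p", "c", "v", "1.0",
              "f", "f", "0", "t", "s", "120", "5", "-"]),
]
cleaned = preprocess(rows)
print(len(cleaned), cleaned[0]["A2"], cleaned[0]["A16"], cleaned[1]["A16"])
```

Dropping rows is the simplest choice. Imputing a value such as the column mean or mode would keep more data, but it changes the distribution the models see.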
Model Evaluation Method
- 10-fold cross-validation
- Metrics: Accuracy, Precision, Recall, F1 Score
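To make the evaluation procedure concrete, the sketch below splits the data into 10 folds, trains on nine, scores the held-out fold on the four metrics, and averages the scores across folds. Several details are assumptions not stated in the card: the data is shuffled once with a fixed seed, the folds are not stratified, class 1 is treated as the positive class, and a metric with a zero denominator is reported as 0.0. The threshold classifier and the toy data are hypothetical stand-ins for the real models and dataset.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one set of predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and average over the k folds."""
    totals = {}
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = [predict(model, X[i]) for i in held_out]
        scores = binary_metrics([y[i] for i in held_out], y_pred)
        for name, value in scores.items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: total / k for name, total in totals.items()}


# Hypothetical stand-in model: threshold halfway between the two class means.
def fit_threshold(X_train, y_train):
    pos = [x[0] for x, t in zip(X_train, y_train) if t == 1]
    neg = [x[0] for x, t in zip(X_train, y_train) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0


# Toy, well-separated data: class 0 in 0..19, class 1 in 100..119.
X = [[float(v)] for v in range(20)] + [[float(v)] for v in range(100, 120)]
y = [0] * 20 + [1] * 20
results = cross_validate(X, y, fit_threshold, predict_threshold)
print(results)
```

Averaging per-fold scores is one common convention. Pooling all held-out predictions and computing each metric once is another, and the two can give slightly different precision, recall, and F1 values.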
Model Evaluation Results
- Highest accuracy: Random Forest, 0.902
- Highest precision: Random Forest, 0.855
- Highest recall: CNN, 0.870
- Highest F1 score: CNN, 0.830
Conclusion
- Best-performing models: Random Forest, CNN, Decision Tree
- Lower-performing model: Multilayer Perceptron
Topics
Credit Approval
Binary Classification
Source
Organization: github
Created: 12/16/2019