Abstract
Among machine learning techniques, classification is useful for a wide range of business applications, but classification algorithms perform poorly on imbalanced data. In this study, we propose a technique that improves binary classification performance on both the minority and majority classes of imbalanced structured data. The proposed framework consists of three steps. First, a balanced training set is created via under-sampling. Second, each example is converted into an image depicting a line graph. Third, a Convolutional Neural Network (CNN) is trained on the images. In the experiments, we selected six datasets from the UCI Repository and applied the proposed framework to them. The proposed model achieved the best receiver operating characteristic (ROC) curve on all six datasets and the best Balanced Accuracy (BA) on five of them. This demonstrates that combining under-sampling with CNNs is a viable approach to imbalanced structured data classification.
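The first two steps of the framework, along with the Balanced Accuracy metric, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the under-sampling strategy, the image size, or how the line graph is drawn, so random under-sampling, a 16x16 binary raster, and features pre-scaled to [0, 1] are all assumed here. The function names `undersample`, `line_graph_image`, and `balanced_accuracy` are hypothetical, and the CNN training step is omitted.

```python
import random


def undersample(X, y, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class.

    Assumption: plain random under-sampling; the paper may use another variant.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


def line_graph_image(features, height=16, width=16):
    """Rasterize one example as a line graph on a height x width grid of 0/1 pixels.

    Assumption: feature values are already scaled to [0, 1]; feature index runs
    along the x-axis and feature value along the y-axis (row 0 is the top).
    """
    img = [[0] * width for _ in range(height)]
    n = len(features)
    prev_row = None
    for col in range(width):
        # Map the pixel column to a fractional position along the feature axis.
        t = col * (n - 1) / (width - 1) if width > 1 else 0.0
        lo = int(t)
        hi = min(lo + 1, n - 1)
        frac = t - lo
        # Linear interpolation between neighbouring feature values.
        value = features[lo] * (1 - frac) + features[hi] * frac
        value = min(max(value, 0.0), 1.0)
        row = (height - 1) - round(value * (height - 1))
        # Fill vertical gaps so steep segments stay connected.
        if prev_row is not None:
            step = 1 if row >= prev_row else -1
            for r in range(prev_row, row, step):
                img[r][col] = 1
        img[row][col] = 1
        prev_row = row
    return img


def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    pos = sum(1 for t, _ in pairs if t == positive)
    neg = len(pairs) - pos
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return 0.5 * (tp / pos + tn / neg)
```

Each image returned by `line_graph_image` would then serve as one training input to a CNN. Balanced Accuracy averages the per-class recall, which is why it is less flattering than plain accuracy when a model ignores the minority class.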
Original language | English |
---|---|
Pages (from-to) | 1795-1809 |
Number of pages | 15 |
Journal | Information Systems Frontiers |
Volume | 24 |
Issue number | 6 |
State | Published - Dec 2022 |
Scopus Subject Areas
- Theoretical Computer Science
- Software
- Information Systems
- Computer Networks and Communications
Keywords
- Class imbalance
- Convolutional neural network
- Deep learning
- Structured data