Framework for the Classification of Imbalanced Structured Data Using Under-sampling and Convolutional Neural Network

Yoon Sang Lee, Chulhwan Chris Bang

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Among machine learning techniques, classification techniques are useful for various business applications, but classification algorithms perform poorly with imbalanced data. In this study, we propose a classification technique with improved binary classification performance on both the minority and majority classes of imbalanced structured data. The proposed framework is composed of three steps. In the first step, a balanced training set is created via under-sampling. In the second step, each example is converted into an image depicting a line graph. In the last step, a Convolutional Neural Network (CNN) is trained using the images. In the experiments, we selected six datasets from the UCI Repository and applied the proposed framework to them. The proposed model achieved the best receiver operating characteristic (ROC) curve on all six datasets and the best Balanced Accuracy (BA) on five of them. This demonstrates that the combination of under-sampling and CNNs is a viable approach for classifying imbalanced structured data.
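
The first two steps of the framework, together with the Balanced Accuracy metric, can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation: the abstract does not specify the under-sampling variant, the image size, or how the line graph is drawn, so random under-sampling, one pixel column per feature, a fixed value range of [0, 1], and the function names `undersample`, `line_graph_image`, and `balanced_accuracy` are all assumptions made for illustration. The CNN training step is omitted.

```python
import random

def undersample(X, y, seed=0):
    """Random under-sampling (assumed variant): sample every class down
    to the size of the smallest class, so all minority rows are kept."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, n):
            Xb.append(row)
            yb.append(label)
    return Xb, yb

def line_graph_image(row, height=16, lo=0.0, hi=1.0):
    """Rasterize one example as a binary image (list of pixel rows).
    Assumed layout: x axis = feature index, y axis = value scaled
    from [lo, hi] (requires hi > lo); pixel row 0 is the top."""
    width = len(row)
    img = [[0] * width for _ in range(height)]

    def to_pixel(v):
        frac = (min(max(v, lo), hi) - lo) / (hi - lo)
        return (height - 1) - round(frac * (height - 1))

    ys = [to_pixel(v) for v in row]
    for x, y_px in enumerate(ys):
        img[y_px][x] = 1
        if x + 1 < width:
            # Fill this column toward the next point so the line is connected.
            step = 1 if ys[x + 1] >= y_px else -1
            for r in range(y_px, ys[x + 1], step):
                img[r][x] = 1
    return img

def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity for binary labels.
    Assumes both classes occur in y_true."""
    pos = [p for t, p in zip(y_true, y_pred) if t == positive]
    neg = [p for t, p in zip(y_true, y_pred) if t != positive]
    sensitivity = sum(p == positive for p in pos) / len(pos)
    specificity = sum(p != positive for p in neg) / len(neg)
    return (sensitivity + specificity) / 2
```

In this sketch, the images produced by `line_graph_image` for the balanced set returned by `undersample` would be the CNN's training input. Balanced Accuracy weights errors on the two classes equally, which is why it is more informative than plain accuracy on imbalanced test data.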

Original language: English
Pages (from-to): 1795-1809
Number of pages: 15
Journal: Information Systems Frontiers
Volume: 24
Issue number: 6
State: Published - Dec 2022

Scopus Subject Areas

  • Theoretical Computer Science
  • Software
  • Information Systems
  • Computer Networks and Communications

Keywords

  • Class imbalance
  • Convolutional neural network
  • Deep learning
  • Structured data
