Tackling class imbalance problem in binary classification using augmented Neighborhood cleaning algorithm

Nadyah Obaid Al Abdouli, Zeyar Aung, Wei Lee Woon, Davor Svetinovic

Research output: Contribution to journal (Article, peer-reviewed)

5 Scopus citations

Abstract

Many natural processes generate some observations more frequently than others. These processes produce imbalanced distributions, which cause classifiers to be biased toward the majority class because most classifiers assume a normal distribution. To address the problem of class imbalance, a number of data preprocessing techniques have been proposed over the years; they can be broadly categorized into over-sampling and under-sampling methods. The Neighborhood cleaning rule (NCL) method proposed by Laurikkala is among the most popular under-sampling methods. In this paper, we augment the original NCL algorithm by cleaning the unwanted samples with the CHC evolutionary algorithm instead of the simple nearest-neighbor-based cleaning used in NCL. We call the augmented algorithm NCL+. The performance of NCL+ is compared to that of NCL on 9 imbalanced datasets using 11 different classifiers. Experimental results show noticeable accuracy improvements of NCL+ over NCL. Moreover, NCL+ is also compared to a popular over-sampling method, the Synthetic minority over-sampling technique (SMOTE), and is found to offer better results as well.
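To make the nearest-neighbor-based cleaning mentioned in the abstract concrete, the following is a minimal sketch of an NCL-style cleaning step for the binary case. It is not the authors' implementation and it does not implement NCL+ (the CHC evolutionary search is omitted). The function names `k_nearest` and `ncl_clean`, the choice of k = 3, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
import math


def k_nearest(idx, X, k=3):
    """Indices of the k points closest to X[idx] (Euclidean), excluding itself."""
    others = [j for j in range(len(X)) if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return others[:k]


def ncl_clean(X, y, minority, k=3):
    """Return the indices kept after a simplified NCL-style cleaning.

    Hypothetical helper for illustration only: neighbors are always
    computed on the original, uncleaned data.
    """
    remove = set()
    for i in range(len(X)):
        nn = k_nearest(i, X, k)
        predicted = Counter(y[j] for j in nn).most_common(1)[0][0]
        if predicted == y[i]:
            continue  # the neighborhood agrees with the label; nothing to clean
        if y[i] != minority:
            # Majority sample contradicted by its neighbors: treat it as noise.
            remove.add(i)
        else:
            # Minority sample contradicted by its neighbors: drop the
            # majority-class neighbors that crowd it, keep the sample itself.
            remove.update(j for j in nn if y[j] != minority)
    return [i for i in range(len(X)) if i not in remove]


# Toy data (assumed): five majority points near the origin, three minority
# points near (5, 5), and one majority point (index 5) inside the minority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5.1, 5.0),
     (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

kept = ncl_clean(X, y, minority=1)
print(kept)
```

In this toy example only the majority point at index 5 is removed, because all three of its nearest neighbors belong to the minority class. Only majority samples are ever discarded, which is what makes this kind of cleaning an under-sampling method.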

Original language: British English
Pages (from-to): 827-834
Number of pages: 8
Journal: Lecture Notes in Electrical Engineering
Volume: 339
State: Published - 2015

Keywords

  • Class Imbalance
  • Data Preprocessing
  • Evolutionary Algorithm
  • Neighborhood Cleaning
  • Under-Sampling
