Handling class imbalance in customer behavior prediction

Nengbao Liu, Wei Lee Woon, Zeyar Aung, Afshin Afshari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Class imbalance is a common problem in real-world applications and significantly affects prediction accuracy. This study investigates how to better handle class imbalance in customer behavior prediction. Using a more appropriate evaluation metric (AUC), we measured the performance gains of under-sampling and two machine learning algorithms (weighted Random Forests and RUSBoost) over a benchmark of plain Random Forests. The results show that under-sampling is the most effective way to deal with class imbalance. RUSBoost, an algorithm designed specifically for class imbalance, is also effective but not as good as under-sampling. Weighted Random Forests, a cost-sensitive learner, improves performance on only one of the three classification problems, namely appetency.
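As a rough illustration of two ideas named in the abstract, the sketch below shows random under-sampling of the majority class and a pairwise computation of AUC. It is not the authors' code and trains no Random Forest: the function names `undersample` and `auc`, the toy data, and the 5% positive rate are all assumptions made for this example.

```python
import random

def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]

def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This is the quadratic-time pairwise form,
    fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 100 positives and 1900 negatives, one noisy feature.
rng = random.Random(42)
labels = [1] * 100 + [0] * 1900
rows = [[rng.gauss(1.0 if y == 1 else 0.0, 1.0)] for y in labels]

# Balance the training set before fitting any classifier on it.
bal_rows, bal_labels = undersample(rows, labels)

# A model that always predicts the majority class looks accurate
# but has no ranking ability, which AUC exposes.
constant_scores = [0.0] * len(labels)
accuracy = sum(1 for y in labels if y == 0) / len(labels)
constant_auc = auc(labels, constant_scores)
```

On this toy set the constant majority-class predictor reaches 95% accuracy yet an AUC of 0.5, which is one way to see why AUC suits imbalanced problems better than accuracy. In practice the balanced `bal_rows` and `bal_labels` would be passed to a classifier of choice, and AUC would be computed on an untouched, still imbalanced test set.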

Original language: British English
Title of host publication: 2014 International Conference on Collaboration Technologies and Systems, CTS 2014
Publisher: IEEE Computer Society
Pages: 100-103
Number of pages: 4
ISBN (Print): 9781479951567
State: Published - 2014
Event: 2014 15th International Conference on Collaboration Technologies and Systems, CTS 2014 - Minneapolis, MN, United States
Duration: 19 May 2014 → 23 May 2014

Publication series

Name: 2014 International Conference on Collaboration Technologies and Systems, CTS 2014

Conference

Conference: 2014 15th International Conference on Collaboration Technologies and Systems, CTS 2014
Country/Territory: United States
City: Minneapolis, MN
Period: 19/05/14 → 23/05/14

Keywords

  • Class imbalance
  • Customer behavior
  • Prediction
  • Random forests
  • RUSBoost
  • Under-sampling
