An over-sampling technique with rejection for imbalanced class learning

Jaedong Lee, Noo Ri Kim, Jee Hyong Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Scopus citations

Abstract

In imbalanced data, samples are unequally distributed across classes. This usually poses a challenge for classification methods: the minority class is hard to learn and predict because it has far fewer instances than the majority class. One approach to imbalanced class problems is to oversample by generating synthetic samples around given minority instances based on their nearest neighbors, so that the numbers of majority and minority instances are balanced. However, if the nearest neighbors are poorly chosen, oversampling may cause overfitting or underfitting. We propose a novel oversampling method for handling imbalanced data problems efficiently. The method generates synthetic samples and decides whether to accept or reject each one by considering its location. With the proposed method, we observed improved results on real-world imbalanced datasets. In addition, the proposed method is less sensitive than existing approaches to how the nearest neighbors used for generating synthetic samples are chosen.
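The generate-then-accept-or-reject idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not state the rejection rule, so the rule used here (accept a candidate only if its nearest original instance is not a majority instance) is an assumption, and the function name `smote_with_rejection` and its parameters are hypothetical.

```python
import numpy as np


def smote_with_rejection(X_min, X_maj, n_new, k=5, seed=0, max_tries=10000):
    """SMOTE-style oversampling with an ASSUMED location-based rejection rule.

    X_min, X_maj: 2-D arrays of minority / majority instances (X_min needs >= 2 rows).
    n_new: number of synthetic minority samples requested.
    Returns an array with at most n_new accepted synthetic samples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    X_maj = np.asarray(X_maj, dtype=float)
    k = min(k, len(X_min) - 1)

    # k nearest minority neighbors of every minority instance (brute force).
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]

    accepted, tries = [], 0
    while len(accepted) < n_new and tries < max_tries:
        tries += 1
        i = rng.integers(len(X_min))
        j = neighbors[i, rng.integers(k)]
        # Standard SMOTE step: interpolate between an instance and a neighbor.
        candidate = X_min[i] + rng.random() * (X_min[j] - X_min[i])

        # Assumed rejection rule (not taken from the paper): reject the
        # candidate if a majority instance is strictly closer to it than
        # any minority instance.
        d_min = np.linalg.norm(X_min - candidate, axis=1).min()
        d_maj = np.linalg.norm(X_maj - candidate, axis=1).min()
        if d_min <= d_maj:
            accepted.append(candidate)

    return np.array(accepted).reshape(-1, X_min.shape[1])
```

Calling `smote_with_rejection(X_min, X_maj, n_new=len(X_maj) - len(X_min))` would aim to balance the two classes. The `max_tries` cap is a design choice for this sketch: it means fewer samples than requested may come back when most candidates fall in majority-dominated regions.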

Original language: English
Title of host publication: ACM IMCOM 2015 - Proceedings
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450333771
State: Published - 8 Jan 2015
Event: 9th International Conference on Ubiquitous Information Management and Communication, ACM IMCOM 2015 - Bali, Indonesia
Duration: 8 Jan 2015 to 10 Jan 2015

Publication series

Name: ACM IMCOM 2015 - Proceedings

Conference

Conference: 9th International Conference on Ubiquitous Information Management and Communication, ACM IMCOM 2015
Country/Territory: Indonesia
City: Bali
Period: 8/01/15 to 10/01/15

Keywords

  • Data distribution
  • Imbalanced Problem
  • Rejection Rule
  • Synthetic Minority Oversampling Technique
