Ensemble Approaches for Class Imbalance Problem: A Review

International Journal of Research in Signal Processing, Computing & Communication System Design

Volume 5 Issue 1 & 2

Published: 2019
Author(s) Name: Anjana Gosain and Arushi Gupta | Author(s) Affiliation: Department of Information Technology, USICT, GGSIP University, Dwarka, Delhi, India.

Abstract

In data mining, classifying data with a skewed class distribution is a challenging problem. Traditional Classification Techniques (TCT) work efficiently on data with a balanced class distribution, as their internal design favors balanced datasets. The Class Imbalance Problem (CIP) occurs when the instances of one class outnumber the instances of the other classes. Factors that contribute to the problem include noisy data, borderline samples, the degree of class overlap, and small disjuncts. In machine learning, ensembles are built to improve on the performance and accuracy of a single classifier by training multiple classifiers and combining their outputs into a single class label. In this paper, we review ensemble learning methods for the two-class problem. We propose a categorization of ensemble learning methods into levels: the data level, the algorithm level, and the base classifier.

Keywords: Bagging, Boosting, Classification, Class imbalance problem, Oversampling, Skewed data distribution, Undersampling.
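
As an illustration of the data-level idea named in the abstract and keywords, the sketch below combines random undersampling with bagging on a two-class problem. It is a minimal sketch under stated assumptions, not the method reviewed in the paper: the function names, the one-dimensional features, and the toy nearest-centroid base classifier are all hypothetical choices made to keep the example self-contained.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_sample(X, y, rng):
    """Randomly undersample every class down to the minority-class size."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(rows) for rows in by_class.values())
    Xs, ys = [], []
    for label, rows in by_class.items():
        for xi in rng.sample(rows, n_min):  # sampling without replacement
            Xs.append(xi)
            ys.append(label)
    return Xs, ys


def fit_centroids(X, y):
    """Toy base classifier (an assumption): one mean value per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        sums[yi] = sums.get(yi, 0.0) + xi
    return {label: sums[label] / counts[label] for label in sums}


def predict_one(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def fit_ensemble(X, y, n_estimators=11, seed=0):
    """Train each base classifier on its own balanced subsample."""
    rng = random.Random(seed)
    return [fit_centroids(*balanced_sample(X, y, rng))
            for _ in range(n_estimators)]


def predict_ensemble(models, x):
    """Combine the base classifiers by majority vote into one label."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic two-class data: 90 majority (label 0) vs 10 minority (label 1).
X = [i / 90 for i in range(90)] + [3 + i / 10 for i in range(10)]
y = [0] * 90 + [1] * 10
models = fit_ensemble(X, y)
```

Each base classifier sees all minority instances but only a random subset of the majority class, so no single model is trained on the skewed distribution, and the majority vote produces the single class label that the abstract describes.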
