On relationships between imbalance and overlapping of datasets

10 pages. Published: March 9, 2020

Abstract

The paper deals with problems commonly encountered with imbalanced and overlapping datasets. Performance indicators such as accuracy, precision and recall on imbalanced datasets, both with and without overlapping, are discussed and compared with the same indicators on balanced datasets with overlapping. Three popular classification algorithms, namely Decision Tree, KNN (k-Nearest Neighbors) and SVM (Support Vector Machines), are analyzed and compared.
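
As a rough illustration of why accuracy, precision and recall are reported together for imbalanced data, the following minimal sketch computes the three indicators from predicted and true labels. The label vectors, the 95/5 class split and the function names are hypothetical and are not taken from the paper; the sketch does not reproduce the authors' experiments or any of the three classifiers.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def indicators(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall); undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical imbalanced labels: 95 majority-class (0) and 5 minority-class (1) samples.
y_true = [0] * 95 + [1] * 5

# Hypothetical predictor A: always outputs the majority class.
y_majority = [0] * 100

# Hypothetical predictor B: 4 false alarms, finds 3 of the 5 minority samples.
y_other = [0] * 91 + [1] * 4 + [1] * 3 + [0] * 2

acc_a, prec_a, rec_a = indicators(y_true, y_majority)
acc_b, prec_b, rec_b = indicators(y_true, y_other)

print(f"A: accuracy={acc_a:.2f} precision={prec_a:.2f} recall={rec_a:.2f}")
print(f"B: accuracy={acc_b:.2f} precision={prec_b:.2f} recall={rec_b:.2f}")
```

In this made-up case, predictor A has slightly higher accuracy than predictor B while never detecting the minority class, which is the kind of effect that makes accuracy alone uninformative on imbalanced datasets.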

Keyphrases: classification algorithms, imbalance data, machine learning, overlapping classes, oversampling algorithms, undersampling algorithms

In: Gordon Lee and Ying Jin (editors). Proceedings of 35th International Conference on Computers and Their Applications, vol 69, pages 141-150.

BibTeX entry
@inproceedings{CATA2020:relationships_between_imbalance_overlapping,
  author    = {Waleed Almutairi and Ryszard Janicki},
  title     = {On relationships between imbalance and overlapping of datasets},
  booktitle = {Proceedings of 35th International Conference on Computers and Their Applications},
  editor    = {Gordon Lee and Ying Jin},
  series    = {EPiC Series in Computing},
  volume    = {69},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-7340},
  url       = {/publications/paper/Xk8r},
  doi       = {10.29007/h71z},
  pages     = {141-150},
  year      = {2020}}