Evaluating LLM Performance on Imbalanced Event Data

EasyChair Preprint 15140

19 pages
Date: September 28, 2024

Abstract

Evaluating the performance of Large Language Models (LLMs) on imbalanced event data presents unique challenges, as these models often struggle with accurately detecting minority class events. Imbalanced datasets, where certain events are underrepresented, are common in real-world scenarios such as fraud detection, medical diagnosis, and anomaly detection. While LLMs excel in natural language processing tasks, their ability to generalize across imbalanced event distributions is less understood.

 

This study investigates the performance of LLMs in handling imbalanced event data by comparing them against traditional machine learning models and evaluating the effectiveness of various imbalance mitigation techniques. We assess LLMs using a range of metrics—F1-score, recall, PR-AUC, and ROC-AUC—focusing on their ability to detect minority class events. We explore both data-level strategies (oversampling, undersampling, and augmentation) and algorithm-level strategies (cost-sensitive learning, transfer learning) to mitigate class imbalance.
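As a minimal sketch of the evaluation setup the abstract describes, the snippet below computes the four named metrics (F1, recall, PR-AUC, ROC-AUC) on synthetic imbalanced data, after applying naive random oversampling of the minority class as one data-level mitigation. The data, model choice (logistic regression), and threshold are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical illustration: metrics for minority-class detection on
# imbalanced data, with random oversampling as a data-level remedy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (f1_score, recall_score,
                             average_precision_score, roc_auc_score)

rng = np.random.default_rng(0)

# Synthetic imbalanced event data: only a few percent positive (minority) class.
n = 2000
X = rng.normal(size=(n, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 2.2).astype(int)

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# Data-level mitigation: duplicate minority-class rows until classes balance.
pos = np.flatnonzero(y_train == 1)
extra = rng.choice(pos, size=(y_train == 0).sum() - len(pos), replace=True)
X_bal = np.vstack([X_train, X_train[extra]])
y_bal = np.concatenate([y_train, y_train[extra]])

clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
prob = clf.predict_proba(X_test)[:, 1]   # scores for PR-AUC / ROC-AUC
pred = (prob >= 0.5).astype(int)         # hard labels for F1 / recall

print(f"F1      = {f1_score(y_test, pred):.3f}")
print(f"Recall  = {recall_score(y_test, pred):.3f}")
print(f"PR-AUC  = {average_precision_score(y_test, prob):.3f}")
print(f"ROC-AUC = {roc_auc_score(y_test, prob):.3f}")
```

Note that PR-AUC and ROC-AUC are computed from the predicted probabilities, while F1 and recall depend on the chosen decision threshold; under heavy imbalance, PR-AUC is typically the more informative of the two area metrics.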

Keyphrases: Imbalanced Event Data, Large Language Models (LLMs), Model Interpretability, Rare Event Detection

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:15140,
  author    = {Docas Akinyele and Godwin Olaoye},
  title     = {Evaluating LLM Performance on Imbalanced Event Data},
  howpublished = {EasyChair Preprint 15140},
  year      = {EasyChair, 2024}}