Reduced Robust Random Cut Forest for Out-of-Distribution Detection in Machine Learning Models

EasyChair Preprint 7247, 8 pages. Date: December 20, 2021

Abstract: Most machine-learning-based regressors extract information from data collected via past observations of limited length to make predictions about the future. When the input to such a trained model has statistical properties significantly different from those of the training data, there is no guarantee of accurate prediction. Using these models on out-of-distribution input may therefore produce a predicted outcome completely different from the desired one, which is not only erroneous but can also be hazardous in some cases. Successful deployment of these machine learning models in any system requires a detection mechanism that can distinguish out-of-distribution data from in-distribution data (i.e., data similar to the training data). In this paper, we introduce a novel approach to this detection process using the Reduced Robust Random Cut Forest (RRRCF) data structure, which can be used on both small and large data sets. Like the Robust Random Cut Forest (RRCF), the RRRCF is a structured but reduced representation of the training-data subspace in the form of cut trees. Empirical results of this method on both low- and high-dimensional data show that inference about whether data lie in or out of the training distribution can be made efficiently, and that the model is easy to train, with no difficult hyper-parameter tuning. The paper discusses two different use cases for testing and validating the results.

Keyphrases: CARLA, Interpretable intelligence, Random Cut Forest, Reinforcement Learning, cyber-physical system, machine learning, out-of-distribution detection, robust random cut forest
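The core idea behind random-cut-forest-based detection, which the abstract relies on, is that a point far from the training distribution is separated from the training data by very few random cuts, while an in-distribution point needs many. The sketch below is not the paper's RRRCF (the abstract gives no implementation details); it is a minimal, standard-library-only illustration of RRCF-style cuts, where the cut dimension is chosen with probability proportional to its coordinate range. All names (`isolation_depth`, `avg_depth`) and the threshold-free comparison are hypothetical choices for this toy example.

```python
import random

def isolation_depth(points, target, rng, depth=0):
    # Depth at which `target` is isolated from `points` by random cuts.
    # RRCF-style rule: pick the cut dimension with probability
    # proportional to that dimension's range, then cut uniformly
    # within the range; recurse into the side containing `target`.
    if len(points) == 1:
        return depth
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    spans = [hi[d] - lo[d] for d in range(dims)]
    d = rng.choices(range(dims), weights=spans)[0]
    cut = rng.uniform(lo[d], hi[d])
    if target[d] <= cut:
        side = [p for p in points if p[d] <= cut]
    else:
        side = [p for p in points if p[d] > cut]
    return isolation_depth(side, target, rng, depth + 1)

def avg_depth(train, query, num_trees=25, seed=0):
    # Average isolation depth over a forest of independent random-cut
    # trees; shallow average depth suggests out-of-distribution input.
    rng = random.Random(seed)
    data = train + [query]
    return sum(isolation_depth(data, query, rng)
               for _ in range(num_trees)) / num_trees

# Toy in-distribution training set: a 2-D Gaussian cluster.
gen = random.Random(42)
train = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]

inlier = (0.1, -0.2)   # near the training cluster
outlier = (8.0, 8.0)   # far from the training cluster

# The outlier is isolated after far fewer cuts than the inlier.
print(avg_depth(train, inlier), avg_depth(train, outlier))
```

In a real detector one would calibrate a threshold on the depth (or on RRCF's collusive-displacement score) using held-out in-distribution data, and flag inputs whose score crosses it.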