From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors

Paper title

From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors

Authors

Firas Bayram, Bestoun S. Ahmed, Andreas Kassler

Abstract

The dynamic nature of real-world systems poses a significant challenge to deployed predictive machine learning (ML) models. Changes in the system on which an ML model was trained may degrade its performance over the system's life cycle. Recent work on non-stationary environments has mainly focused on identifying and addressing such changes, which are caused by a phenomenon called concept drift. In the literature, different terms have been used for the same type of concept drift, and the same term has been used for different types. This lack of unified terminology creates confusion when distinguishing between concept drift variants. In this paper, we start by grouping concept drift types by their mathematical definitions and survey the different terms used in the literature to build a consolidated taxonomy of the field. We also review and classify performance-based concept drift detection methods proposed in the last decade. These methods use the degradation of the predictive model's performance to signal substantial changes in the system. The classification is outlined in a hierarchical diagram to provide orderly navigation among the methods. We present a comprehensive analysis of the main attributes and strategies for tracking and evaluating the model's performance in the predictive system. The paper concludes by discussing open research challenges and possible research directions.
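
To illustrate the performance-based detection that the abstract describes, the following is a minimal Python sketch in the style of the Drift Detection Method (DDM), which monitors a classifier's running error rate and raises an alarm when it rises well above its best observed level. It is not taken from the paper. The class name `ErrorRateDriftDetector`, the helpers `first_drift_index` and `synthetic_mistakes`, the default thresholds (30 warm-up samples, 2 and 3 standard deviations), and the synthetic error stream are all assumptions chosen for illustration.

```python
import math


class ErrorRateDriftDetector:
    """DDM-style sketch: watches the running error rate of a deployed model.

    All names and default thresholds here are illustrative assumptions.
    """

    def __init__(self, min_samples=30, warn_k=2.0, drift_k=3.0):
        self.min_samples = min_samples  # warm-up length before any alarm
        self.warn_k = warn_k            # warning level, in standard deviations
        self.drift_k = drift_k          # drift level, in standard deviations
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after the model has been retrained."""
        self.n = 0
        self.errors = 0
        self.best_p = float("inf")
        self.best_s = float("inf")

    def update(self, mistake):
        """Feed one outcome (True if the prediction was wrong).

        Returns "stable", "warning" or "drift".
        """
        self.n += 1
        self.errors += int(mistake)
        p = self.errors / self.n              # running error rate
        s = math.sqrt(p * (1 - p) / self.n)   # its binomial standard deviation

        # Skip the warm-up phase and the degenerate zero-error baseline.
        if self.n < self.min_samples or self.errors == 0:
            return "stable"

        # Remember the best (lowest) level of p + s seen so far.
        if p + s < self.best_p + self.best_s:
            self.best_p, self.best_s = p, s

        if p + s > self.best_p + self.drift_k * self.best_s:
            self.reset()
            return "drift"
        if p + s > self.best_p + self.warn_k * self.best_s:
            return "warning"
        return "stable"


def first_drift_index(mistakes, detector=None):
    """Return the 1-based step of the first drift alarm, or None."""
    detector = detector or ErrorRateDriftDetector()
    for i, mistake in enumerate(mistakes, start=1):
        if detector.update(mistake) == "drift":
            return i
    return None


def synthetic_mistakes(n_steps=1000, change_at=500):
    """Hypothetical stream: 10% errors up to `change_at`, 50% afterwards."""
    return [(i % 10 == 0) if i <= change_at else (i % 2 == 0)
            for i in range(1, n_steps + 1)]


if __name__ == "__main__":
    step = first_drift_index(synthetic_mistakes())
    print("first drift signalled at step:", step)
```

On this synthetic stream the detector stays silent while the error rate is steady and signals drift shortly after the error rate jumps at step 500. The detector only needs a stream of correct/incorrect outcomes, so it depends on labels becoming available after prediction; that dependence is a property of this sketch, not a claim about every method the survey covers.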
