Paper title
An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear Analysis
Paper authors
Abstract
Convolutional neural networks have been shown to achieve superior performance on image segmentation tasks. However, convolutional neural networks, operating as black-box systems, generally do not provide a reliable measure of the confidence of their decisions. This leads to various problems in industrial settings, including inadequate levels of user trust in the model's outputs and non-compliance with current policy guidelines (e.g., the EU AI Strategy). To address these issues, we use uncertainty measures based on Monte-Carlo dropout in the context of a human-in-the-loop system to increase the system's transparency and performance. In particular, we demonstrate these benefits on a real-world multi-class image segmentation task of wear analysis in the machining industry. Following previous work, we show that the quality of a prediction correlates with the model's uncertainty. Additionally, we demonstrate that a multiple linear regression using the model's uncertainties as independent variables significantly explains the quality of a prediction (\(R^2=0.718\)). Within the uncertainty-based human-in-the-loop system, the multiple regression aims to identify failed predictions at the image level. The system relies on a human expert to label these failed predictions manually. A simulation study demonstrates that, across different levels of human involvement, the uncertainty-based human-in-the-loop system achieves higher performance than a random-based human-in-the-loop system. To ensure generalizability, we show that the presented approach achieves similar results on the publicly available Cityscapes dataset.
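The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: the image-level uncertainty feature is taken to be the mean predictive entropy over Monte-Carlo dropout passes, the quality score is any per-image metric in [0, 1] (for example a Dice score), and the function names `predictive_entropy`, `fit_quality_regression`, `predict_quality`, and `select_for_human` are hypothetical.

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Image-level uncertainty from T stochastic forward passes.

    mc_probs: array of shape (T, H, W, C) holding softmax outputs
    collected while dropout stays active at inference time.
    Returns the mean per-pixel entropy of the averaged prediction
    (an assumed uncertainty feature, not necessarily the paper's).
    """
    mean_probs = mc_probs.mean(axis=0)  # average over passes -> (H, W, C)
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return float(entropy.mean())

def fit_quality_regression(features, quality):
    """Ordinary least squares fit of quality on uncertainty features.

    features: (N, K) array of image-level uncertainty features.
    quality:  (N,) array of per-image quality scores.
    Returns K + 1 coefficients, intercept first.
    """
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, quality, rcond=None)
    return coef

def predict_quality(coef, features):
    """Predicted quality for new images from their uncertainty features."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ coef

def select_for_human(predicted_quality, budget):
    """Indices of the `budget` images with the lowest predicted quality."""
    return np.argsort(predicted_quality)[:budget]
```

In this sketch, the images returned by `select_for_human` would be handed to the human expert for manual labeling, while the remaining predictions are accepted as they are. A random baseline would draw the same number of indices uniformly at random instead.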