Robustness Guarantees for Mode Estimation with an Application to Bandits

Paper Title

Robustness Guarantees for Mode Estimation with an Application to Bandits

Authors

Aldo Pacchiano, Heinrich Jiang, Michael I. Jordan

Abstract

Mode estimation is a classical problem in statistics with a wide range of applications in machine learning. Despite this, there is little understanding of its robustness properties under possibly adversarial data contamination. In this paper, we give precise robustness guarantees as well as privacy guarantees under simple randomization. We then introduce a theory for multi-armed bandits where the values are the modes of the reward distributions rather than the means. We prove regret guarantees for the problems of top arm identification, top m-arms identification, contextual modal bandits, and top arm recovery with infinitely many continuous arms. We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences, which makes modal bandits an attractive choice in situations where the rewards may have outliers or adversarial corruptions.
