Paper title
Unifying Clustered and Non-stationary Bandits
Paper authors
Paper abstract
Non-stationary bandits and online clustering of bandits relax the restrictive assumptions of contextual bandits and provide solutions to many important real-world scenarios. Although the two problems overlap considerably in how they are solved, they have been studied independently. In this paper, we connect these two strands of bandit research under the notion of a test of homogeneity, which seamlessly addresses change detection for non-stationary bandits and cluster identification for online clustering of bandits in a unified solution framework. Rigorous regret analysis and extensive empirical evaluations demonstrate the value of our proposed solution, especially its flexibility in handling various environment assumptions.
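To make the idea of a test of homogeneity concrete, the following is a minimal sketch of one common form of such a test for linear reward models: a Wald-type statistic that compares the least-squares estimates fitted on two sets of observations. The abstract does not specify the test the authors use, so the function name `homogeneity_statistic`, the parameters `noise_var` and `ridge`, and the chi-square threshold are all assumptions made for illustration. The same comparison could serve both purposes named above: old versus recent data from one user (change detection), or data from two different users (cluster identification).

```python
import numpy as np

def homogeneity_statistic(X1, y1, X2, y2, noise_var=1.0, ridge=1e-6):
    """Wald-type statistic for H0: both datasets share one linear parameter.

    Hypothetical helper, not taken from the paper. Under H0 with Gaussian
    noise of variance `noise_var`, the statistic is approximately
    chi-square distributed with d degrees of freedom (d = feature dimension).
    """
    d = X1.shape[1]
    # Regularized Gram matrices; the small ridge term keeps them invertible.
    A1 = X1.T @ X1 + ridge * np.eye(d)
    A2 = X2.T @ X2 + ridge * np.eye(d)
    # Separate least-squares estimates for each dataset.
    theta1 = np.linalg.solve(A1, X1.T @ y1)
    theta2 = np.linalg.solve(A2, X2.T @ y2)
    diff = theta1 - theta2
    # Covariance of the difference of two independent estimates.
    cov = noise_var * (np.linalg.inv(A1) + np.linalg.inv(A2))
    return float(diff @ np.linalg.solve(cov, diff))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n, sigma = 3, 200, 0.1
    theta_a = np.array([1.0, -0.5, 0.2])
    theta_b = np.array([0.0, 0.5, 1.2])  # a clearly different parameter

    X1, X2 = rng.normal(size=(n, d)), rng.normal(size=(n, d))
    y_same_1 = X1 @ theta_a + sigma * rng.normal(size=n)
    y_same_2 = X2 @ theta_a + sigma * rng.normal(size=n)
    y_diff_2 = X2 @ theta_b + sigma * rng.normal(size=n)

    # 0.95 quantile of the chi-square distribution with 3 degrees of freedom.
    threshold = 7.815
    s_same = homogeneity_statistic(X1, y_same_1, X2, y_same_2, noise_var=sigma**2)
    s_diff = homogeneity_statistic(X1, y_same_1, X2, y_diff_2, noise_var=sigma**2)
    print("same parameter:     ", s_same, "reject H0:", s_same > threshold)
    print("different parameter:", s_diff, "reject H0:", s_diff > threshold)
```

A statistic above the threshold suggests that the two datasets were generated by different parameters. In this sketch the noise variance is treated as known, and a practical system would have to estimate it or bound it.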