Paper Title

Unifying Clustered and Non-stationary Bandits

Authors

Chuanhao Li, Qingyun Wu, Hongning Wang

Abstract

Non-stationary bandits and online clustering of bandits lift the restrictive assumptions in contextual bandits and provide solutions to many important real-world scenarios. Though the essence of solving these two problems overlaps considerably, they have been studied independently. In this paper, we connect these two strands of bandit research under the notion of a test of homogeneity, which seamlessly addresses change detection for non-stationary bandits and cluster identification for online clustering of bandits in a unified solution framework. Rigorous regret analysis and extensive empirical evaluations demonstrate the value of our proposed solution, especially its flexibility in handling various environment assumptions.
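To make the idea of a test of homogeneity concrete, here is a minimal sketch, not the authors' algorithm. It assumes rewards are linear in the context with Gaussian noise of known scale `sigma`, and asks whether two batches of observations could share one parameter vector. The batches can be an old and a recent window for one user (change detection) or the histories of two users (cluster identification). The function names, the ridge term `reg`, and the threshold are hypothetical choices for illustration.

```python
import numpy as np

def homogeneity_statistic(X1, y1, X2, y2, sigma=1.0, reg=1e-6):
    """Wald-style statistic for H0: both batches share one linear parameter.

    Assumption (not from the abstract): y = X @ theta + Gaussian noise with
    standard deviation sigma. Under H0 the statistic is approximately
    chi-square distributed with d = X1.shape[1] degrees of freedom.
    """
    d = X1.shape[1]
    # Regularized Gram matrices of the two batches.
    A1 = X1.T @ X1 + reg * np.eye(d)
    A2 = X2.T @ X2 + reg * np.eye(d)
    # Per-batch (ridge) least-squares estimates.
    theta1 = np.linalg.solve(A1, X1.T @ y1)
    theta2 = np.linalg.solve(A2, X2.T @ y2)
    diff = theta1 - theta2
    # Covariance of (theta1 - theta2) under H0 is sigma^2 * (A1^-1 + A2^-1).
    cov = np.linalg.inv(A1) + np.linalg.inv(A2)
    return float(diff @ np.linalg.solve(cov, diff)) / sigma**2

def is_homogeneous(X1, y1, X2, y2, threshold, sigma=1.0):
    """Accept H0 when the statistic stays below a caller-chosen threshold,
    e.g. a chi-square quantile for d degrees of freedom."""
    return homogeneity_statistic(X1, y1, X2, y2, sigma) < threshold
```

In a bandit loop, a rejected test on a single user's old and recent data would signal a change point, while an accepted test between two users would suggest they can share a model. How the threshold is set and how the outcomes are used are design choices this sketch leaves open.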
