论文标题
独立抽样的秘书问题
The Secretary Problem with Independent Sampling
论文作者
论文摘要
在秘书问题中,我们面临着具有价值的在线元素。看到一个元素后,我们必须做出不可撤销的承担或决定的决定。目标是最大化选择最大值元素的概率。问题的最经典版本是,元素以随机顺序到达且其值是任意的。但是,通过改变可用信息,出现了新的有趣问题。同样,到达顺序是对抗性而不是随机的情况,导致文献中已经考虑的有趣变体。 在本文中,我们研究了随机秩序和对抗秩序秘书的问题,并具有额外的扭曲。值是任意的,但是在启动在线序列之前,我们用固定的概率$ p $独立采样了每个元素。采样元素成为我们的信息或历史记录集,并且游戏在其余元素上进行。我们将这些问题称为随机订单秘书问题,其中$ p $ -smpling(Short Ros $ p $)和$ p $ -smpling的对抗订单秘书问题(AOS $ p $ for Short)。我们的主要结果是获得有关问题的最佳算法和$ p $的所有值。随着$ p $的增长到1,所获得的保证会汇总到完整信息案例中的最佳保证。在对抗顺序设置中,最佳算法原来是一种简单的固定阈值算法,其中最佳阈值仅是$ p $的函数。在随机顺序设置中,我们证明,最佳算法的特征是固定的时间阈值序列,这决定了在此时间点,我们应该开始接受一个既是在线序列的最大值,又在采样元素中具有给定排名的值。
In the secretary problem we are faced with an online sequence of elements with values. Upon seeing an element we have to make an irrevocable take-it-or-leave-it decision. The goal is to maximize the probability of picking the element of maximum value. The most classic version of the problem is that in which the elements arrive in random order and their values are arbitrary. However, by varying the available information, new interesting problems arise. Also the case in which the arrival order is adversarial instead of random leads to interesting variants that have been considered in the literature. In this paper we study both the random order and adversarial order secretary problems with an additional twist. The values are arbitrary, but before starting the online sequence we independently sample each element with a fixed probability $p$. The sampled elements become our information or history set and the game is played over the remaining elements. We call these problems the random order secretary problem with $p$-sampling (ROS$p$ for short) and the adversarial order secretary problem with $p$-sampling (AOS$p$ for short). Our main result is to obtain best possible algorithms for both problems and all values of $p$. As $p$ grows to 1 the obtained guarantees converge to the optimal guarantees in the full information case. In the adversarial order setting, the best possible algorithm turns out to be a simple fixed threshold algorithm in which the optimal threshold is a function of $p$ only. In the random order setting we prove that the best possible algorithm is characterized by a fixed sequence of time thresholds, dictating at which point in time we should start accepting a value that is both a maximum of the online sequence and has a given ranking within the sampled elements.