Speaker Diarization as a Fully Online Learning Problem in MiniVox

Paper Title

Speaker Diarization as a Fully Online Learning Problem in MiniVox

Authors

Lin, Baihan; Zhang, Xinxin

Abstract

We propose a novel machine learning framework for real-time multi-speaker diarization and recognition that requires no prior registration or pretraining and operates in a fully online learning setting. Our contributions are twofold. First, we propose a new benchmark for evaluating the rarely studied problem of fully online speaker diarization. Building on existing datasets of real-world utterances, we automatically curate MiniVox, an experimental environment that generates infinite configurations of continuous multi-speaker speech streams. Second, we consider the practical problem of online learning with episodically revealed rewards and introduce a solution based on semi-supervised and self-supervised learning methods. Additionally, we provide a workable web-based recognition system that interactively handles the cold-start problem of adding new users by transferring representations of old arms to new ones with an extendable contextual bandit. We demonstrate that the proposed method achieves robust performance in the online MiniVox framework.

Paper Title