Paper Title
Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
Paper Authors
Paper Abstract
Recent research efforts have enabled the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit the training data in seen environments and fail to generalize well to previously unseen environments. To close the gap between seen and unseen environments, we aim to learn a generalized navigation model from two novel perspectives: (1) we introduce a multitask navigation model that can be seamlessly trained on both Vision-Language Navigation (VLN) and Navigation from Dialog History (NDH) tasks, which benefits from richer natural language guidance and effectively transfers knowledge across tasks; (2) we propose to learn environment-agnostic representations for the navigation policy that are invariant across the environments seen during training, thus generalizing better to unseen environments. Extensive experiments show that environment-agnostic multitask learning significantly reduces the performance gap between seen and unseen environments, and that the navigation agent trained in this way outperforms baselines on unseen environments by 16% (relative success rate) on VLN and 120% (goal progress) on NDH. Our submission to the CVDN leaderboard establishes a new state of the art for the NDH task on the holdout test set. Code is available at https://github.com/google-research/valan.
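The two ideas in the abstract can be sketched concretely. Below is a minimal, hypothetical PyTorch sketch, not the authors' VALAN implementation: a single policy trunk shared by VLN and NDH batches, plus an adversarial environment classifier behind a gradient-reversal layer, so that the reversed gradient pushes the shared representation to carry no signal about which training environment a sample came from. All module names, dimensions, and loss weights here are illustrative assumptions.

# Minimal, hypothetical sketch of the two ideas in the abstract, not the
# authors' VALAN implementation: (1) one shared policy trained on mixed
# VLN/NDH batches, and (2) an environment classifier behind a
# gradient-reversal layer so the shared representation becomes invariant
# to environment identity. All names, dimensions, and weights are assumed.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lam on backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class MultitaskNavAgent(nn.Module):
    def __init__(self, feat_dim=128, hidden=256, n_actions=6, n_envs=61):
        super().__init__()
        # Shared trunk encoding fused language+vision features for both tasks.
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # next-action logits
        self.env_head = nn.Linear(hidden, n_envs)        # adversarial environment classifier

    def forward(self, feats, lam=1.0):
        h = self.encoder(feats)
        # The classifier learns to predict the environment; the reversed
        # gradient trains the encoder to erase that signal.
        return self.policy_head(h), self.env_head(GradReverse.apply(h, lam))

# Hypothetical training step on a mixed batch drawn from both VLN and NDH.
agent = MultitaskNavAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-4)
feats = torch.randn(8, 128)               # placeholder fused features
actions = torch.randint(0, 6, (8,))       # supervised next actions
env_ids = torch.randint(0, 61, (8,))      # index of the training house per sample
action_logits, env_logits = agent(feats)
loss = (nn.functional.cross_entropy(action_logits, actions)
        + nn.functional.cross_entropy(env_logits, env_ids))
opt.zero_grad()
loss.backward()
opt.step()

Because both tasks share the encoder and policy head, a single optimizer step over a mixed batch transfers knowledge across VLN and NDH, while the adversarial head discourages the encoder from memorizing environment-specific cues.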