Paper Title
Collaborative Attention Mechanism for Multi-View Action Recognition
Paper Authors
Abstract
Multi-view action recognition (MVAR) leverages complementary temporal information from different views to improve learning performance. Obtaining informative view-specific representations plays an essential role in MVAR. Attention has been widely adopted as an effective strategy for discovering discriminative cues underlying temporal data. However, most existing MVAR methods only utilize attention to extract a representation for each view individually, ignoring the potential to mine latent patterns from mutual-support information in the attention space. To this end, we propose a collaborative attention mechanism (CAM) for solving the MVAR problem in this paper. The proposed CAM detects attention differences among multiple views and adaptively integrates frame-level information so that the views benefit each other. Specifically, we extend the long short-term memory (LSTM) to a Mutual-Aid RNN (MAR) to realize the multi-view collaboration process. CAM takes advantage of view-specific attention patterns to guide the other views and uncover potential information that is hard for a view to explore on its own. It paves a novel way to leverage attention information and enhances multi-view representation learning. Extensive experiments on four action datasets illustrate that the proposed CAM achieves better results for each view and also boosts multi-view performance.
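To make the collaboration idea concrete, below is a minimal two-view sketch in PyTorch. The abstract does not give the MAR cell's equations, so the difference-based guidance between the views (boosting frames one view finds salient but the other under-weights) and all names (`CollaborativeAttentionSketch`, `hidden_dim`) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CollaborativeAttentionSketch(nn.Module):
    """Two-view collaborative attention over frame-level features.

    Each view scores its own frames (view-specific attention); the
    attention difference between views then re-weights the partner
    view's frames, so cues salient in one view can guide the other.
    Hidden sizes and the difference-based gating are illustrative
    choices, not the paper's exact Mutual-Aid RNN formulation.
    """

    def __init__(self, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.rnn_a = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.rnn_b = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.score_a = nn.Linear(hidden_dim, 1)  # per-frame score, view A
        self.score_b = nn.Linear(hidden_dim, 1)  # per-frame score, view B

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor):
        # x_a, x_b: (batch, frames, feat_dim), temporally aligned views.
        h_a, _ = self.rnn_a(x_a)
        h_b, _ = self.rnn_b(x_b)
        att_a = F.softmax(self.score_a(h_a).squeeze(-1), dim=1)  # (B, T)
        att_b = F.softmax(self.score_b(h_b).squeeze(-1), dim=1)

        # Collaboration step: frames that one view finds salient but the
        # other under-weights get boosted in the partner view.
        boost_b = F.relu(att_a - att_b)          # view A guides view B
        boost_a = F.relu(att_b - att_a)          # view B guides view A
        w_a = F.softmax(att_a + boost_a, dim=1)  # refined frame weights
        w_b = F.softmax(att_b + boost_b, dim=1)

        # Attention-pooled view-specific representations.
        rep_a = torch.bmm(w_a.unsqueeze(1), h_a).squeeze(1)  # (B, hidden)
        rep_b = torch.bmm(w_b.unsqueeze(1), h_b).squeeze(1)
        return rep_a, rep_b
```

In this reading, each view keeps its own recurrent branch and classifier input, and the only cross-view traffic is in attention space, which matches the abstract's claim that view-specific attention patterns guide the other view rather than fusing raw features.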