Paper title
Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification
Paper authors
Paper abstract
Hierarchical classification (HC) assigns each object multiple labels that are organized into a hierarchical structure. Existing deep-learning-based HC methods usually classify an instance by starting at the root node and descending until a leaf node is reached. In the real world, however, images degraded by noise, occlusion, blur, or low resolution may not provide enough information for classification at subordinate levels. To address this issue, we propose a novel semantic guided level-category hybrid prediction network (SGLCHPN) that jointly performs level and category prediction in an end-to-end manner. SGLCHPN comprises two modules: a visual transformer that extracts feature vectors from the input images, and a semantic guided cross-attention module that uses category word embeddings as queries to guide the learning of category-specific representations. To evaluate the proposed method, we construct two new datasets whose images span a broad range of quality and are therefore labeled to different levels (depths) of the hierarchy according to their individual quality. Experimental results demonstrate the effectiveness of the proposed HC method.
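The cross-attention idea in the abstract, with category word embeddings acting as queries over image features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes plain single-head scaled dot-product attention, and the function name, projection matrices, dimensions, and random inputs are all hypothetical. It does not cover the level prediction part of SGLCHPN, which the abstract does not describe in detail.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def semantic_cross_attention(word_emb, patch_feats, w_q, w_k, w_v):
    """Single-head cross-attention (illustrative assumption, not the paper's code).

    word_emb:    (C, d_word) one word embedding per category, used as queries
    patch_feats: (N, d_vis)  visual feature vectors, used as keys and values
    Returns category-specific representations (C, d_model) and the
    attention weights (C, N).
    """
    q = word_emb @ w_q        # (C, d_model)
    k = patch_feats @ w_k     # (N, d_model)
    v = patch_feats @ w_v     # (N, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (C, N) scaled dot products
    attn = softmax(scores, axis=-1)          # each category attends over patches
    return attn @ v, attn


# Hypothetical sizes and random stand-ins for real embeddings and features.
rng = np.random.default_rng(0)
num_categories, num_patches = 5, 16
d_word, d_vis, d_model = 300, 768, 64

word_emb = rng.normal(size=(num_categories, d_word))
patch_feats = rng.normal(size=(num_patches, d_vis))
w_q = rng.normal(size=(d_word, d_model)) * 0.01
w_k = rng.normal(size=(d_vis, d_model)) * 0.01
w_v = rng.normal(size=(d_vis, d_model)) * 0.01

category_repr, attn = semantic_cross_attention(word_emb, patch_feats, w_q, w_k, w_v)
print(category_repr.shape, attn.shape)
```

Each row of `category_repr` is a weighted average of the projected visual features, with weights determined by how well each image region matches that category's word embedding. In a trained model the projection matrices would be learned, and a classifier would presumably operate on these per-category vectors.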