Paper title
II-20: Intelligent and pragmatic analytic categorization of image collections
Paper authors
Paper abstract
We introduce II-20 (Image Insight 2020), a multimedia analytics approach for analytic categorization of image collections. Advanced visualizations for image collections exist, but they need tight integration with a machine model to support analytic categorization. Directly employing computer vision and interactive learning techniques gravitates towards search. Analytic categorization, however, is not machine classification (the difference between the two is called the pragmatic gap): a human adds, redefines, and deletes categories of relevance on the fly to build insight, whereas a machine classifier is rigid and non-adaptive. Analytic categorization that brings the user to insight requires a flexible machine model that allows dynamic sliding on the exploration-search axis, as well as semantic interactions. II-20 makes three major contributions to multimedia analytics on image collections and towards closing the pragmatic gap. First, a machine model that closely follows the user's interactions and dynamically models her categories of relevance. In addition to matching and exceeding the state of the art with respect to relevance, II-20's model allows the user to slide dynamically on the exploration-search axis without additional input from her side. Second, the dynamic, one-image-at-a-time Tetris metaphor, which synergizes with the model: it allows the model to analyze the collection by itself with minimal interaction from the user, and it complements the classic grid metaphor. Third, the fast-forward interaction, which lets the user harness the model to quickly expand ("fast-forward") the categories of relevance and thereby extends the multimedia analytics semantic interaction dictionary. Automated experiments show that II-20's model outperforms the state of the art and demonstrate Tetris's analytic quality. User studies confirm that II-20 is an intuitive, efficient, and effective multimedia analytics tool.