论文标题

向我展示更多详细信息:从半结构网络数据中发现过程的层次结构

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

论文作者

Zhou, Shuyan, Zhang, Li, Yang, Yue, Lyu, Qing, Yin, Pengcheng, Callison-Burch, Chris, Neubig, Graham

论文摘要

过程本质上是分层的。要“制作视频”,可能需要“购买相机”,这反过来可能需要一个“设定预算”。尽管这种分层知识对于关于复杂程序的推理至关重要,但大多数现有工作都将程序视为浅层结构,而无需对亲子关系进行建模。在这项工作中,我们尝试构建基于Wikihow的程序的开放域层次知识基础(KB),该程序是一个包含超过110k的教学文章的网站,每个网站都记录了执行复杂过程的步骤。为此,我们开发了一种简单有效的方法,该方法在文章中将步骤(例如,“购买相机”)链接到具有相似目标(例如“如何选择摄像机”)的其他文章中,递归构建KB。根据自动评估,人类判断以及在下游任务(例如教学视频检索)中,我们的方法大大优于几个强大的基线。 可以在https://wikihow-hierarchy.github.io上找到具有部分数据的演示。代码和数据位于https://github.com/shuyanzhou/wikihow_hierarchy。

Procedures are inherently hierarchical. To "make videos", one may need to "purchase a camera", which in turn may require one to "set a budget". While such hierarchical knowledge is critical for reasoning about complex procedures, most existing work has treated procedures as shallow structures without modeling the parent-child relation. In this work, we attempt to construct an open-domain hierarchical knowledge-base (KB) of procedures based on wikiHow, a website containing more than 110k instructional articles, each documenting the steps to carry out a complex procedure. To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB. Our method significantly outperforms several strong baselines according to automatic evaluation, human judgment, and application to downstream tasks such as instructional video retrieval. A demo with partial data can be found at https://wikihow-hierarchy.github.io. The code and the data are at https://github.com/shuyanzhou/wikihow_hierarchy.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源