图书馆学习的自上而下的合成

论文标题

图书馆学习的自上而下的合成

Top-Down Synthesis for Library Learning

论文作者

Bowers, Matthew, Olausson, Theo X., Wong, Lionel, Grand, Gabriel, Tenenbaum, Joshua B., Ellis, Kevin, Solar-Lezama, Armando

论文摘要

本文引入了语料库指导的自上而下的合成，作为合成库函数的机制，可从域特定语言（DSL）中捕获程序的共同功能。该算法使用中间抽象的句法模式匹配直接从初始DSL原始图中构建抽象，从而智能修剪搜索空间并指导该算法朝着在语料库中最大程度地捕获共享结构的抽象。我们在一种名为Stitch的工具中介绍了该方法的实现，并根据Dreamcoder的最先进的演绎图书馆学习算法对其进行了评估。我们的评估表明，针迹的数量级比3-4个数量级，并且在保持可比较或更好的库质量（通过压缩性测量）的同时，使用2个数量级。我们还展示了针迹对Corpora的可伸缩性，其中包含数百个复杂的程序，这些程序与先前的演绎方法相处棘手，并从经验上表明，尽早终止搜索程序是强大的 - 进一步允许其通过早期停止来扩展到具有挑战性的数据集。

This paper introduces corpus-guided top-down synthesis as a mechanism for synthesizing library functions that capture common functionality from a corpus of programs in a domain specific language (DSL). The algorithm builds abstractions directly from initial DSL primitives, using syntactic pattern matching of intermediate abstractions to intelligently prune the search space and guide the algorithm towards abstractions that maximally capture shared structures in the corpus. We present an implementation of the approach in a tool called Stitch and evaluate it against the state-of-the-art deductive library learning algorithm from DreamCoder. Our evaluation shows that Stitch is 3-4 orders of magnitude faster and uses 2 orders of magnitude less memory while maintaining comparable or better library quality (as measured by compressivity). We also demonstrate Stitch's scalability on corpora containing hundreds of complex programs that are intractable with prior deductive approaches and show empirically that it is robust to terminating the search procedure early -- further allowing it to scale to challenging datasets by means of early stopping.

下载PDF全文

下载文献需遵守相关版权规定

论文标题