Paper Title
Semi-Supervised Neural Architecture Search
Paper Authors
Paper Abstract
Neural architecture search (NAS) relies on a good controller to generate better architectures or predict the accuracy of given architectures. However, training the controller requires abundant, high-quality pairs of architectures and their accuracies, while evaluating an architecture to obtain its accuracy is costly. In this paper, we propose SemiNAS, a semi-supervised NAS approach that leverages numerous unlabeled architectures (which require no evaluation and thus incur nearly no cost). Specifically, SemiNAS 1) trains an initial accuracy predictor with a small set of architecture-accuracy data pairs; 2) uses the trained accuracy predictor to predict the accuracy of a large number of architectures (without evaluation); and 3) adds the generated data pairs to the original data to further improve the predictor. The trained accuracy predictor can be applied to various NAS algorithms by predicting the accuracy of candidate architectures for them. SemiNAS has two advantages: 1) It reduces the computational cost under the same accuracy guarantee. On the NASBench-101 benchmark dataset, it achieves accuracy comparable to the gradient-based method while using only 1/7 of the architecture-accuracy pairs. 2) It achieves higher accuracy under the same computational cost. It achieves 94.02% test accuracy on NASBench-101, outperforming all the baselines when using the same number of architectures. On ImageNet, it achieves a 23.5% top-1 error rate (under a 600M FLOPS constraint) using 4 GPU days for search. We further apply it to the LJSpeech text-to-speech task, where it achieves a 97% intelligibility rate in the low-resource setting and a 15% test error rate in the robustness setting, improvements of 9% and 7% over the baseline, respectively.
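The abstract describes a three-step self-training loop for the accuracy predictor. The sketch below illustrates that loop under simplifying assumptions; it is not the authors' released implementation. The Predictor MLP, the fit helper, the random architecture encodings, and all hyperparameters are hypothetical placeholders standing in for the actual predictor model and architecture encodings used in SemiNAS.

```python
# Minimal sketch of a SemiNAS-style semi-supervised predictor loop (assumed PyTorch setup).
import torch
import torch.nn as nn

class Predictor(nn.Module):
    """Hypothetical stand-in: maps an encoded architecture to a predicted accuracy in [0, 1]."""
    def __init__(self, arch_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(arch_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def fit(predictor, archs, accs, epochs=100, lr=1e-3):
    """Regress predicted accuracy onto (pseudo-)labels with an MSE loss."""
    opt = torch.optim.Adam(predictor.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(predictor(archs), accs)
        loss.backward()
        opt.step()
    return predictor

arch_dim = 32

# Step 1: train an initial predictor on a small set of evaluated architectures.
# (Random tensors here are placeholders for real architecture encodings and measured accuracies.)
labeled_archs = torch.rand(100, arch_dim)
labeled_accs = torch.rand(100)
predictor = fit(Predictor(arch_dim), labeled_archs, labeled_accs)

# Step 2: predict accuracy for many unlabeled architectures (no training/evaluation cost).
unlabeled_archs = torch.rand(10000, arch_dim)
with torch.no_grad():
    pseudo_accs = predictor(unlabeled_archs)

# Step 3: merge real and pseudo-labeled pairs and retrain the predictor on the enlarged set.
all_archs = torch.cat([labeled_archs, unlabeled_archs])
all_accs = torch.cat([labeled_accs, pseudo_accs])
predictor = fit(predictor, all_archs, all_accs)

# The improved predictor can then rank candidate architectures inside any NAS search loop.
```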