Stabilizing Differentiable Architecture Search via Perturbation-based Regularization

Paper title

Stabilizing Differentiable Architecture Search via Perturbation-based Regularization

Authors

Xiangning Chen, Cho-Jui Hsieh

Abstract

Differentiable architecture search (DARTS) is a prevailing neural architecture search (NAS) solution for identifying architectures. Based on a continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the search cost. However, its stability has been challenged because it yields deteriorating architectures as the search proceeds. We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when distilling the final architecture, is an essential factor that causes instability. Based on this observation, we propose a perturbation-based regularization, SmoothDARTS (SDARTS), to smooth the loss landscape and improve the generalizability of DARTS-based methods. In particular, our new formulations stabilize DARTS-based methods by either random smoothing or adversarial attack. The search trajectory on NAS-Bench-1Shot1 demonstrates the effectiveness of our approach, and thanks to the improved stability, we achieve performance gains across various search spaces on four datasets. Furthermore, we mathematically show that SDARTS implicitly regularizes the Hessian norm of the validation loss, which accounts for a smoother loss landscape and improved performance.
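
The link between random perturbation and curvature mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a quadratic stand-in loss L(alpha) = 0.5 * alpha^T H alpha and a uniform perturbation delta drawn from [-epsilon, epsilon] in each coordinate, and the names `quadratic_loss` and `smoothed_loss` are hypothetical. Under those assumptions, averaging the loss over perturbations adds (epsilon^2 / 6) * trace(H) to the unperturbed loss, so a sharper landscape is penalized more.

```python
import numpy as np


def quadratic_loss(alpha, hessian):
    """Toy stand-in for a validation loss: 0.5 * alpha^T H alpha."""
    return 0.5 * alpha @ hessian @ alpha


def smoothed_loss(alpha, hessian, epsilon, num_samples, rng):
    """Monte Carlo estimate of E[L(alpha + delta)], delta ~ U[-epsilon, epsilon]^d."""
    deltas = rng.uniform(-epsilon, epsilon, size=(num_samples, alpha.size))
    perturbed = alpha + deltas
    # Evaluate 0.5 * x^T H x for every perturbed point (one row per sample).
    losses = 0.5 * np.einsum("ni,ij,nj->n", perturbed, hessian, perturbed)
    return losses.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha = np.array([0.3, -0.2, 0.1])  # hypothetical architecture weights
    epsilon = 0.5                       # hypothetical perturbation radius

    landscapes = {
        "flat": np.diag([1.0, 1.0, 1.0]),
        "sharp": np.diag([10.0, 10.0, 10.0]),
    }
    for name, hessian in landscapes.items():
        gap = smoothed_loss(alpha, hessian, epsilon, 200_000, rng) - quadratic_loss(alpha, hessian)
        predicted = epsilon ** 2 / 6 * np.trace(hessian)
        print(f"{name}: measured gap {gap:.4f}, predicted {predicted:.4f}")
```

The measured gap should track the predicted curvature term up to Monte Carlo noise. The adversarial variant named in the abstract would pick a worst-case perturbation instead of a random one; it is not sketched here.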
