Paper Title
BNAS: An Efficient Neural Architecture Search Approach Using Broad Scalable Architecture
Paper Authors
Paper Abstract
In this paper, we propose Broad Neural Architecture Search (BNAS), in which we elaborately design a broad scalable architecture, dubbed Broad Convolutional Neural Network (BCNN), to solve the above issue. On one hand, the proposed broad scalable architecture trains quickly owing to its shallow topology. Moreover, we adopt the reinforcement learning and parameter sharing used in ENAS as the optimization strategy of BNAS. Hence, the proposed approach achieves higher search efficiency. On the other hand, the broad scalable architecture extracts multi-scale features and enhancement representations and feeds them into a global average pooling layer to yield more reasonable and comprehensive representations, so the performance of the broad scalable architecture can be guaranteed. In particular, we also develop two variants of BNAS that modify the topology of BCNN. To verify the effectiveness of BNAS, several experiments are performed, and the results show that 1) BNAS delivers a search cost of 0.19 days, which is 2.37x less than that of ENAS, which ranks best among reinforcement learning-based NAS approaches; 2) compared with small-size (0.5 million parameters) and medium-size (1.1 million parameters) models, the architecture learned by BNAS obtains state-of-the-art performance (3.58% and 3.24% test error, respectively) on CIFAR-10; and 3) the learned architecture achieves 25.3% top-1 error on ImageNet using only 3.9 million parameters.
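To make the broad-topology idea concrete, the following is a minimal PyTorch sketch, not the authors' released code, of a BCNN-style network as described in the abstract: a few shallow convolution blocks produce feature maps at several scales, an enhancement block further transforms the deepest features, and all of these representations are globally average-pooled and concatenated before a single classifier. The block widths, depths, and the exact enhancement mapping are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BroadCNNSketch(nn.Module):
    # Shallow, "broad" network: multi-scale convolution blocks plus an
    # enhancement block, all pooled and concatenated for one classifier head.
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.block3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Assumed form of the enhancement mapping over the deepest features.
        self.enhance = nn.Sequential(nn.Conv2d(128, 128, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(32 + 64 + 128 + 128, num_classes)

    def forward(self, x):
        f1 = self.block1(x)   # early-scale features
        f2 = self.block2(f1)  # mid-scale features
        f3 = self.block3(f2)  # deepest convolutional features
        e1 = self.enhance(f3) # enhancement representation
        # Global average pooling of every representation, then concatenation.
        pooled = [F.adaptive_avg_pool2d(f, 1).flatten(1) for f in (f1, f2, f3, e1)]
        return self.classifier(torch.cat(pooled, dim=1))

model = BroadCNNSketch()
print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10]) for CIFAR-10-sized input

Because every block is shallow and the classifier sees all scales at once, each forward/backward pass is cheap, which is the property the abstract exploits to accelerate the architecture search.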