Paper Title
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging
Paper Authors
Paper Abstract
Fine-tuning models on edge devices like mobile phones would enable privacy-preserving personalization over sensitive data. However, edge training has historically been limited to relatively small models with simple architectures because training is both memory- and energy-intensive. We present POET, an algorithm that enables training large neural networks on memory-scarce, battery-operated edge devices. POET jointly optimizes the integrated search space of rematerialization and paging, two techniques for reducing the memory consumption of backpropagation. Given a memory budget and a run-time constraint, we formulate a mixed-integer linear program (MILP) for energy-optimal training. Our approach enables training significantly larger models on embedded devices while reducing energy consumption and without modifying the mathematical correctness of backpropagation. We demonstrate that it is possible to fine-tune both ResNet-18 and BERT within the memory constraints of a Cortex-M class embedded device while outperforming current edge training methods in energy efficiency. POET is an open-source project available at https://github.com/ShishirPatil/poet
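To make the joint optimization concrete, the following is a minimal, hypothetical sketch (not the authors' formulation) of a per-activation MILP in the spirit of the abstract, written with PuLP: binary variables decide whether each activation is kept resident, rematerialized, or paged, energy is minimized, and memory and runtime budgets appear as constraints. All layer costs and the `MEM_BUDGET` / `RUNTIME_BUDGET` values are illustrative placeholders, not numbers from the paper.

```python
# Toy MILP sketch: choose, per activation, whether to keep it in SRAM,
# rematerialize it, or page it out, minimizing energy under memory and
# runtime budgets. Costs below are made-up placeholders for illustration.
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum

layers    = range(4)            # toy network with 4 activations
mem       = [8, 16, 16, 8]      # KB held if an activation stays resident
e_compute = [5, 9, 9, 5]        # energy (mJ) to rematerialize it
e_page    = [3, 3, 3, 3]        # energy (mJ) to page it out and back in
t_compute = [2, 4, 4, 2]        # added latency (ms) to rematerialize
t_page    = [6, 6, 6, 6]        # added latency (ms) to page
MEM_BUDGET, RUNTIME_BUDGET = 24, 14  # device SRAM (KB) and latency slack (ms)

prob  = LpProblem("toy_poet_milp", LpMinimize)
remat = [LpVariable(f"remat_{i}", cat=LpBinary) for i in layers]
page  = [LpVariable(f"page_{i}",  cat=LpBinary) for i in layers]

# Objective: pay recomputation or paging energy for every evicted activation.
prob += lpSum(e_compute[i] * remat[i] + e_page[i] * page[i] for i in layers)

# Each activation is evicted at most one way (rematerialized or paged).
for i in layers:
    prob += remat[i] + page[i] <= 1

# Peak memory: only activations that are neither rematerialized nor paged
# stay resident, and their total footprint must fit the SRAM budget.
prob += lpSum(mem[i] * (1 - remat[i] - page[i]) for i in layers) <= MEM_BUDGET

# Runtime: total added latency must stay within the deadline slack.
prob += lpSum(t_compute[i] * remat[i] + t_page[i] * page[i]
              for i in layers) <= RUNTIME_BUDGET

prob.solve()
print({v.name: v.value() for v in prob.variables()})
```

With these placeholder numbers the solver prefers paging the two large activations over rematerializing them, since paging costs less energy while still meeting the latency slack; the actual POET formulation schedules these decisions per training step rather than per activation in isolation.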