Actor-Critic-Based Learning for Zero-touch Joint Resource and Energy Control in Network Slicing

Title

Actor-Critic-Based Learning for Zero-touch Joint Resource and Energy Control in Network Slicing

Authors

Farhad Rezazadeh, Hatim Chergui, Loizos Christofi, Christos Verikoukis

Abstract

To harness the full potential of beyond 5G (B5G) communication systems, zero-touch network slicing (NS) is viewed as a promising fully automated management and orchestration (MANO) system. This paper proposes a novel knowledge plane (KP)-based MANO framework, termed KB5G, that accommodates and exploits recent NS technologies. Specifically, we deliberate on algorithmic innovation and artificial intelligence (AI) in KB5G. We invoke a continuous model-free deep reinforcement learning (DRL) method to minimize energy consumption and virtual network function (VNF) instantiation cost. We present a novel Actor-Critic-based NS approach to stabilize learning, called the twin-delayed double-Q soft Actor-Critic (TDSAC) method. TDSAC enables the central unit (CU) to learn continuously and to accumulate the knowledge learned in the past in order to minimize future NS costs. Finally, we present numerical results to showcase the gain of the adopted approach and verify its performance in terms of energy consumption, CPU utilization, and time efficiency.

