Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA Networks

Paper Title

Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA Networks

Authors

Tiankui Zhang, Ziduan Wang, Yuanwei Liu, Wenjun Xu, Arumugam Nallanathan

Abstract

This article investigates cache-enabling unmanned aerial vehicle (UAV) cellular networks with massive access capability supported by non-orthogonal multiple access (NOMA). A mobile UAV base station assists the delivery of a large volume of multimedia content to ground users and caches some popular content to offload traffic from the wireless backhaul link. In cache-enabling UAV NOMA networks, caching placement in the content caching phase and radio resource allocation in the content delivery phase are crucial for network performance. To cope with the dynamic UAV locations and content requests in practical scenarios, we formulate the long-term caching placement and resource allocation optimization problem for content delivery delay minimization as a Markov decision process (MDP). The UAV acts as an agent that takes actions for caching placement and resource allocation, which includes the user scheduling of content requests and the power allocation of NOMA users. To tackle the MDP, we propose a Q-learning based caching placement and resource allocation algorithm, in which the UAV learns and selects actions with a soft $\varepsilon$-greedy strategy to search for the optimal match between actions and states. Since the action-state table size of Q-learning grows with the number of states in dynamic networks, we also propose a function approximation based algorithm that combines stochastic gradient descent and deep neural networks and is suitable for large-scale networks. Finally, numerical results show that the proposed algorithms provide considerable performance gains compared with benchmark algorithms and obtain a trade-off between network performance and computational complexity.
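The Q-learning idea in the abstract can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the two UAV zones, the per-zone popularity table `POPULARITY`, the cache size of one content, the delay constants `HIT_DELAY` and `MISS_DELAY`, and the zone transition rule. User scheduling and NOMA power allocation are left out, and plain ε-greedy exploration stands in for the soft ε-greedy strategy, which the abstract does not define. The agent observes its zone (state), picks which content to cache (action), and receives the negative delivery delay as reward.

```python
import random

# Hypothetical toy setting (not the paper's system model).
# POPULARITY[z][c] is the probability that content c is requested in zone z.
POPULARITY = [
    [0.7, 0.2, 0.1],  # zone 0: content 0 is most popular
    [0.1, 0.7, 0.2],  # zone 1: content 1 is most popular
]
NUM_ZONES = len(POPULARITY)
NUM_CONTENTS = len(POPULARITY[0])
HIT_DELAY, MISS_DELAY = 1.0, 5.0  # assumed delays: cache hit vs. backhaul fetch
STAY_PROB = 0.8                   # assumed probability the UAV stays in its zone


def step(rng, zone, cached):
    """Simulate one time slot and return (reward, next_zone)."""
    request = rng.choices(range(NUM_CONTENTS), weights=POPULARITY[zone])[0]
    delay = HIT_DELAY if request == cached else MISS_DELAY
    if rng.random() < STAY_PROB:
        next_zone = zone
    else:
        # Jump to a uniformly random zone (possibly the same one).
        next_zone = rng.randrange(NUM_ZONES)
    return -delay, next_zone


def epsilon_greedy(rng, q_row, epsilon):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def train(steps=50_000, alpha=0.02, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning; q[zone][content] estimates the discounted return."""
    rng = random.Random(seed)
    q = [[0.0] * NUM_CONTENTS for _ in range(NUM_ZONES)]
    zone = 0
    for _ in range(steps):
        action = epsilon_greedy(rng, q[zone], epsilon)
        reward, next_zone = step(rng, zone, action)
        # Standard Q-learning target: reward plus discounted best next value.
        target = reward + gamma * max(q[next_zone])
        q[zone][action] += alpha * (target - q[zone][action])
        zone = next_zone
    return q


def greedy_policy(q):
    """Return the highest-valued content to cache for each zone."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = train()
    print("learned caching policy per zone:", greedy_policy(q_table))
```

The table `q` has one entry per state-action pair, so it grows with the number of states and actions. This is the scaling issue that motivates replacing the table with a function approximator for large-scale networks.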

