NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

Paper Title

NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

Authors

Wanqi Xue, Bo An, Chai Kiat Yeo

Abstract

How resources are deployed to secure critical targets in networks can be modelled by Network Security Games (NSGs). While recent advances in deep learning (DL) provide a powerful approach to dealing with large-scale NSGs, DL methods such as NSG-NFSP suffer from data inefficiency. Furthermore, due to centralized control, they cannot scale to scenarios with a large number of resources. In this paper, we propose a novel DL-based method, NSGZero, to learn a non-exploitable policy in NSGs. NSGZero improves data efficiency by performing planning with neural Monte Carlo Tree Search (MCTS). Our main contributions are threefold. First, we design deep neural networks (DNNs) to perform neural MCTS in NSGs. Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources. Third, we provide an efficient learning paradigm to achieve joint training of the DNNs in NSGZero. Compared to state-of-the-art algorithms, our method achieves significantly better data efficiency and scalability.

