Paper Title
Scaling Up Maximal k-plex Enumeration
Authors
Abstract
Finding all maximal $k$-plexes in a network is a fundamental problem in graph analysis, with important applications such as community detection and biological graph analysis. A $k$-plex is a subgraph in which every vertex is adjacent to all but at most $k$ vertices within the subgraph. In this paper, we study the problem of enumerating all large maximal $k$-plexes of a graph and develop several new and efficient techniques to solve it. Specifically, we first propose several novel upper-bounding techniques to prune unnecessary computations during the enumeration procedure. We show that the proposed upper bounds can be computed in linear time. Then, we develop a new branch-and-bound algorithm with a carefully designed pivot re-selection strategy that enumerates all maximal $k$-plexes in $O(n^2 \gamma_k^n)$ time, where $n$ is the number of vertices of the graph and $\gamma_k$ is strictly smaller than 2. In addition, we develop a parallel version of the proposed algorithm to scale to large real-world graphs. Finally, extensive experimental results show that the proposed sequential algorithm achieves a $2\times$ to $100\times$ speedup over the state-of-the-art sequential algorithms on most benchmark graphs. The results also demonstrate the high scalability of the proposed parallel algorithm. For example, on a large real-world graph with more than 200 million edges, our parallel algorithm finishes the computation within two minutes, while the state-of-the-art parallel algorithm does not terminate within 24 hours.
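To make the $k$-plex definition in the abstract concrete, the following minimal Python sketch checks whether a vertex set is a $k$-plex and whether it is maximal. It is an illustration of the definition only, not the paper's branch-and-bound algorithm: the adjacency-dictionary representation and the function names `is_kplex` and `is_maximal_kplex` are assumptions made for this example. It also assumes the usual convention that a vertex counts itself among its non-neighbors, so a set $S$ is a $k$-plex when every vertex in $S$ has at least $|S| - k$ neighbors in $S$ (a 1-plex is then a clique).

```python
from typing import Dict, Iterable, Set

# Hypothetical representation: an undirected graph as vertex -> set of neighbors.
Graph = Dict[int, Set[int]]


def is_kplex(adj: Graph, vertices: Iterable[int], k: int) -> bool:
    """Return True if every vertex of S has at least |S| - k neighbors inside S."""
    s = set(vertices)
    return all(len(adj[v] & s) >= len(s) - k for v in s)


def is_maximal_kplex(adj: Graph, vertices: Iterable[int], k: int) -> bool:
    """Return True if S is a k-plex and no single outside vertex can be added.

    Checking single-vertex extensions suffices because every subset of a
    k-plex is itself a k-plex.
    """
    s = set(vertices)
    if not is_kplex(adj, s, k):
        return False
    return not any(is_kplex(adj, s | {u}, k) for u in adj if u not in s)


# Example graph (an assumption for illustration): the 4-cycle 0-1-2-3-0.
square: Graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

On this 4-cycle, the full vertex set is a 2-plex but not a 1-plex (it is not a clique), and the path `{0, 1, 2}` is a 2-plex that is not maximal because vertex 3 can still be added. The maximality check here tests every outside vertex by brute force; the upper bounds and pivot re-selection described in the abstract exist precisely to avoid this kind of exhaustive work during enumeration.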