论文标题
网络持续图的还原算法:Coraltda和Prunit
Reduction Algorithms for Persistence Diagrams of Networks: CoralTDA and PrunIT
论文作者
论文摘要
拓扑数据分析(TDA)提供了有关传统方法无法访问的数据的内在属性的宝贵和互补信息。但是,高计算成本仍然是阻碍TDA在实际研究中成功应用的主要障碍,尤其是在大型复杂网络上的机器学习中。 确实,大多数现代网络(例如引文,区块链和在线社交网络)通常具有数十万个顶点,这使得现有的TDA方法的应用不可行。我们开发了两个新的,非常简单但有效的算法,以计算大图的确切持久图,以解决这一主要TDA限制。首先,我们证明$(k+1)$ - 图形的核心$ \ MATHCAL {g} $足以计算其$ k^{th} $持久图,$ pd_k(\ Mathcal {g})$。其次,我们引入了用于图形的修剪算法,以通过删除主导的顶点来计算其持久图。我们在大型网络上的实验表明,我们的新方法可以实现高达95%的计算增长。 开发的框架提供了图理论与TDA之间的第一个桥梁,并在机器学习中的应用中提供了大型复杂网络的应用。我们的实施可从https://github.com/cakcora/persistenthomologywithcoralprunit获得
Topological data analysis (TDA) delivers invaluable and complementary information on the intrinsic properties of data inaccessible to conventional methods. However, high computational costs remain the primary roadblock hindering the successful application of TDA in real-world studies, particularly with machine learning on large complex networks. Indeed, most modern networks such as citation, blockchain, and online social networks often have hundreds of thousands of vertices, making the application of existing TDA methods infeasible. We develop two new, remarkably simple but effective algorithms to compute the exact persistence diagrams of large graphs to address this major TDA limitation. First, we prove that $(k+1)$-core of a graph $\mathcal{G}$ suffices to compute its $k^{th}$ persistence diagram, $PD_k(\mathcal{G})$. Second, we introduce a pruning algorithm for graphs to compute their persistence diagrams by removing the dominated vertices. Our experiments on large networks show that our novel approach can achieve computational gains up to 95%. The developed framework provides the first bridge between the graph theory and TDA, with applications in machine learning of large complex networks. Our implementation is available at https://github.com/cakcora/PersistentHomologyWithCoralPrunit