Paper Title
Extended Stochastic Block Models with Application to Criminal Networks
Authors
Abstract
Reliably learning group structures among nodes in network data is challenging in several applications. We are particularly motivated by studying covert networks that encode relationships among criminals. These data are subject to measurement errors, and exhibit a complex combination of an unknown number of core-periphery, assortative and disassortative structures that may unveil key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability of routinely used community detection algorithms, and requires extensions of model-based solutions to realistically characterize the node partition process, incorporate information from node attributes, and provide improved strategies for estimation and uncertainty quantification. To address these gaps, we develop a new class of extended stochastic block models (ESBM) that infer groups of nodes having common connectivity patterns via Gibbs-type priors on the partition process. This choice encompasses many realistic priors for criminal networks, covering solutions with a fixed, random or infinite number of possible groups, and facilitates the inclusion of node attributes in a principled manner. Among the new alternatives in our class, we focus on the Gnedin process as a realistic prior that allows the number of groups to be finite, random and subject to a reinforcement process coherent with criminal networks. A collapsed Gibbs sampler is proposed for the whole ESBM class, and refined strategies for estimation, prediction, uncertainty quantification and model selection are outlined. ESBM performance is illustrated in realistic simulations and in an application to an Italian mafia network, where we unveil key complex block structures, mostly hidden from state-of-the-art alternatives.
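To give a concrete sense of the Gnedin process prior mentioned in the abstract, the sketch below simulates a random node partition from its sequential urn scheme. The abstract does not state the allocation rule, so the weights here are an assumption taken from the standard formulation of the Gnedin process with parameter gamma in (0, 1): with n nodes already placed in H groups of sizes n_h, the next node joins group h with weight (n_h + 1)(n - H + gamma) and opens a new group with weight H^2 - gamma * H. The function name `gnedin_partition` and its arguments are hypothetical, and this illustrates only the prior on partitions, not the collapsed Gibbs sampler of the paper.

```python
import random


def gnedin_partition(n_nodes, gamma=0.5, seed=None):
    """Sample group labels for n_nodes from a Gnedin process urn scheme.

    Assumed predictive rule (not stated in the abstract): with n nodes
    already allocated to H groups of sizes n_h,
      weight(join group h) = (n_h + 1) * (n - H + gamma)
      weight(new group)    = H**2 - gamma * H
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")

    rng = random.Random(seed)
    labels = [0]  # the first node always opens group 0
    sizes = [1]   # sizes[h] = number of nodes currently in group h

    for n in range(1, n_nodes):  # n = number of nodes already allocated
        n_groups = len(sizes)
        # Weights for joining each existing group (reinforcement by size).
        weights = [(s + 1) * (n - n_groups + gamma) for s in sizes]
        # Weight for opening a new group; it shrinks relative to the
        # others as n grows, so the number of groups stays finite.
        weights.append(n_groups ** 2 - gamma * n_groups)

        choice = rng.choices(range(n_groups + 1), weights=weights)[0]
        if choice == n_groups:
            sizes.append(1)
        else:
            sizes[choice] += 1
        labels.append(choice)

    return labels, sizes


if __name__ == "__main__":
    labels, sizes = gnedin_partition(80, gamma=0.5, seed=1)
    print("number of groups:", len(sizes))
    print("group sizes:", sizes)
```

Under this rule the weights sum to n(n + gamma), so they define a proper probability distribution at every step. Repeating the simulation over many seeds gives an empirical picture of the prior distribution on the number of groups that such a model would place on a network of a given size.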