Paper title
Context-dependent self-exciting point processes: models, methods, and risk bounds in high dimensions
Paper authors
Paper abstract
High-dimensional autoregressive point processes model how current events trigger or inhibit future events, such as how activity by one member of a social network can affect the future activity of his or her neighbors. While past work has focused on estimating the underlying network structure based solely on the times at which events occur on each node of the network, this paper examines the more nuanced problem of estimating context-dependent networks that reflect how features associated with an event (such as the content of a social media post) modulate the strength of influences among nodes. Specifically, we leverage ideas from compositional time series and regularization methods in machine learning to conduct network estimation for high-dimensional marked point processes. Two models and corresponding estimators are considered in detail: an autoregressive multinomial model suited to categorical marks, and a logistic-normal model suited to marks with mixed membership in different categories. Importantly, the logistic-normal model leads to a convex negative log-likelihood objective and captures dependence across categories. We provide theoretical guarantees for both estimators, which we validate with simulations and a synthetic data-generating model. We further validate our methods on two real data examples and demonstrate the advantages and disadvantages of both approaches.
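To make the idea of an autoregressive multinomial model for categorical marks concrete, the following is a minimal simulation sketch. The abstract does not state the paper's parameterization, so everything here is an assumption made for illustration: time is discretized, each node at each step either has no event or emits one event in one of K categories, and the category probabilities are a multinomial-logistic function of the previous step's marks with "no event" as the baseline. The names `event_probabilities`, `simulate`, `A`, and `nu` are hypothetical and not taken from the paper.

```python
import math
import random


def event_probabilities(prev_marks, A, nu, m):
    """Return [P(no event), P(category 0), ..., P(category K-1)] for node m.

    prev_marks[m2] is None (no event) or the category index of the event
    that node m2 emitted at the previous time step.
    A[m][k][m2][k2] is the assumed influence of a category-k2 event at
    node m2 on the log-odds of node m emitting a category-k event.
    nu[m][k] is the assumed baseline log-odds (intercept).
    """
    K = len(nu[m])
    eta = []
    for k in range(K):
        score = nu[m][k]
        for m2, k2 in enumerate(prev_marks):
            if k2 is not None:
                score += A[m][k][m2][k2]
        eta.append(score)
    # Multinomial logistic link with "no event" as the reference outcome.
    denom = 1.0 + sum(math.exp(e) for e in eta)
    return [1.0 / denom] + [math.exp(e) / denom for e in eta]


def simulate(A, nu, T, seed=0):
    """Simulate T time steps; returns a list of per-step mark lists."""
    rng = random.Random(seed)
    M = len(nu)
    marks = [None] * M  # start with no events anywhere
    history = []
    for _ in range(T):
        new_marks = []
        for m in range(M):
            probs = event_probabilities(marks, A, nu, m)
            outcome = rng.choices(range(len(probs)), weights=probs)[0]
            # outcome 0 means "no event"; outcome k + 1 means category k
            new_marks.append(None if outcome == 0 else outcome - 1)
        marks = new_marks
        history.append(marks)
    return history


# Toy network: 2 nodes, 2 categories. A category-0 event at node 0
# excites category-0 events at node 1; all other influences are zero.
M, K = 2, 2
nu = [[-2.0] * K for _ in range(M)]
A = [[[[0.0] * K for _ in range(M)] for _ in range(K)] for _ in range(M)]
A[1][0][0][0] = 3.0
history = simulate(A, nu, T=50, seed=1)
```

In this toy setup the influence depends on the category of the triggering event, which is the sense in which the network is context-dependent: a category-1 event at node 0 leaves node 1's probabilities at their baseline, while a category-0 event raises them. A negative entry in `A` would model inhibition instead.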