Paper Title

Biconvex Clustering

Authors

Saptarshi Chakraborty, Jason Xu

Abstract

Convex clustering has recently garnered increasing interest due to its attractive theoretical and computational properties, but its merits become limited in the face of high-dimensional data. In such settings, pairwise affinity terms that rely on $k$-nearest neighbors become poorly specified and Euclidean measures of fit provide weaker discriminating power. To surmount these issues, we propose to modify the convex clustering objective so that feature weights are optimized jointly with the centroids. The resulting problem becomes biconvex, and as such remains well-behaved statistically and algorithmically. In particular, we derive a fast algorithm with closed form updates and convergence guarantees, and establish finite-sample bounds on its prediction error. Under interpretable regularity conditions, the error bound analysis implies consistency of the proposed estimator. Biconvex clustering performs feature selection throughout the clustering task: as the learned weights change the effective feature representation, pairwise affinities can be updated adaptively across iterations rather than precomputed within a dubious feature space. We validate the contributions on real and simulated data, showing that our method effectively addresses the challenges of dimensionality while reducing dependence on carefully tuned heuristics typical of existing approaches.
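The point about adaptive affinities can be illustrated with a small sketch. It is not the paper's algorithm: the function names, the Gaussian kernel, the bandwidth, and the hand-picked weight vectors are all assumptions made for illustration. It only shows how $k$-nearest-neighbor affinities change when distances are computed in a feature-weighted space instead of the raw Euclidean one.

```python
import math

def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance with per-feature weights w (hypothetical helper)."""
    return sum(wl * (a - b) ** 2 for a, b, wl in zip(x, y, w))

def knn_affinities(X, w, k=1, bandwidth=1.0):
    """Symmetric k-NN Gaussian affinities computed in the weighted feature space.

    The kernel and bandwidth are illustrative assumptions, not taken from the paper.
    """
    n = len(X)
    phi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Distances from point i to every other point, sorted ascending
        dists = sorted(
            (weighted_sq_dist(X[i], X[j], w), j) for j in range(n) if j != i
        )
        for dist, j in dists[:k]:
            a = math.exp(-dist / bandwidth)
            phi[i][j] = a
            phi[j][i] = a  # symmetrize
    return phi

# Toy data: feature 0 separates the groups {0, 1} and {2, 3}; feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [3.0, 4.9], [3.1, -4.1]]

# Equal weights: the noisy feature dominates and links points across groups.
uniform = knn_affinities(X, [0.5, 0.5])

# Weights concentrated on the informative feature (chosen by hand here,
# standing in for weights that a method would learn).
learned = knn_affinities(X, [1.0, 0.0])
```

With equal weights, point 0 is linked to point 2, which lies in the other group. Once the weight moves onto the informative feature, it is linked to point 1 instead. In an iterative scheme, affinities like these would be recomputed each time the weights are updated.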
