Paper title

Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning

Authors

Vivien Cabannes, Loucas Pillaud-Vivien, Francis Bach, Alessandro Rudi

Abstract

As annotations of data can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning. This is the aim of semi-supervised learning. To benefit from access to unlabelled data, it is natural to smoothly diffuse knowledge from labelled data to unlabelled data. This leads to the use of Laplacian regularization. Yet, current implementations of Laplacian regularization suffer from several drawbacks, notably the well-known curse of dimensionality. In this paper, we provide a statistical analysis to overcome these issues, and unveil a large body of spectral filtering methods that exhibit desirable behaviors. They are implemented through (reproducing) kernel methods, for which we provide realistic computational guidelines in order to make our method usable with large amounts of data.
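
To make the idea of diffusing labels through unlabelled data concrete, here is a minimal sketch of classical graph-Laplacian regularization, i.e. the kind of existing approach the abstract says suffers from the curse of dimensionality. It is not the spectral filtering kernel method proposed in the paper. The function name `laplacian_regularized_fit`, the Gaussian similarity graph, and the parameters `bandwidth` and `lam` are all assumptions chosen for illustration.

```python
import numpy as np


def laplacian_regularized_fit(X, y_labelled, labelled_idx, bandwidth=1.0, lam=0.1):
    """Illustrative graph-Laplacian regularization (hypothetical helper).

    Minimizes  sum_{i labelled} (f_i - y_i)^2 + lam * f^T L f
    over the values f of the predictor at all n sample points, where L is
    the unnormalized Laplacian of a Gaussian similarity graph on X.
    """
    n = X.shape[0]

    # Pairwise squared distances and Gaussian edge weights (assumed graph).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    # S is a diagonal selector of labelled points; b holds their labels.
    S = np.zeros((n, n))
    S[labelled_idx, labelled_idx] = 1.0
    b = np.zeros(n)
    b[labelled_idx] = y_labelled

    # First-order optimality condition: (S + lam * L) f = b.
    return np.linalg.solve(S + lam * L, b)


# Two well-separated 1-D clusters with one labelled point in each.
X = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])
f = laplacian_regularized_fit(X, y_labelled=[-1.0, 1.0], labelled_idx=[0, 3])
predicted = np.sign(f)  # labels diffused to the unlabelled points
```

In this toy example the two labels spread to the unlabelled points of their own cluster because the smoothness penalty `f^T L f` discourages large differences between strongly connected points. The dense `n x n` solve is only practical for small `n`.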
