Paper title

On a Generalization of the Average Distance Classifier

Authors

Sarbojit Roy, Soham Sarkar, Subhajit Dutta

Abstract

In high dimension, low sample size (HDLSS) settings, the simple average distance classifier based on the Euclidean distance performs poorly if differences between the locations are masked by the scale differences. To rectify this issue, modifications to the average distance classifier were proposed by Chan and Hall (2009). However, the existing classifiers cannot discriminate when the populations differ in aspects other than location and scale. In this article, we propose some simple transformations of the average distance classifier to tackle this issue. The resulting classifiers perform quite well even when the underlying populations have the same location and scale. The high-dimensional behaviour of the proposed classifiers is studied theoretically. Numerical experiments with a variety of simulated as well as real data sets exhibit the usefulness of the proposed methodology.
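
For readers unfamiliar with the baseline the abstract refers to, here is a minimal sketch of a plain two-class average distance classifier based on the Euclidean distance. It is an illustration only, not the authors' proposed method and not the Chan and Hall (2009) modification. The function names (`average_distance`, `average_distance_classify`), the tie-breaking rule, and the toy data are assumptions made for this example.

```python
import numpy as np


def average_distance(z, sample):
    """Mean Euclidean distance from point z to each row of sample."""
    return np.linalg.norm(sample - z, axis=1).mean()


def average_distance_classify(z, class_a, class_b):
    """Return 0 if z is closer on average to class_a, else 1.

    Ties are assigned to class_b (label 1); this is an arbitrary
    choice made for the sketch.
    """
    dist_a = average_distance(z, class_a)
    dist_b = average_distance(z, class_b)
    return 0 if dist_a < dist_b else 1


# Toy two-dimensional training samples (hypothetical data).
class_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
class_b = np.array([[10.0, 10.0], [11.0, 10.0], [10.0, 11.0]])

label_near_a = average_distance_classify(np.array([0.5, 0.5]), class_a, class_b)
label_near_b = average_distance_classify(np.array([10.5, 10.5]), class_a, class_b)
print(label_near_a, label_near_b)
```

In this toy example the two classes are well separated in location and have similar spread, so the rule works as expected. The failure mode described in the abstract concerns the HDLSS regime, where a large difference in scale between the classes can dominate the average distances and mask a difference in location.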
