Paper title
k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)
Paper authors
Abstract
Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier: classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance are not such a problem these days, given the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods.
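The classification step described in the abstract (find the nearest neighbours of a query, then let them determine its class) can be illustrated with a minimal sketch. This is not the code from the paper's Appendix. The function name `knn_classify`, the choice of Euclidean distance, the unweighted majority vote, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(query, examples, labels, k=3):
    """Return the majority label among the k stored examples closest to query."""
    # Brute-force Euclidean distance from the query to every stored example.
    distances = [math.dist(query, x) for x in examples]
    # Indices of the k examples with the smallest distances.
    nearest = sorted(range(len(examples)), key=distances.__getitem__)[:k]
    # Unweighted majority vote over the neighbours' labels.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy two-class data, invented for this sketch.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_classify((1.1, 1.0), X, y, k=3))
```

The brute-force scan compares the query against every stored example, which is the run-time cost the abstract refers to. Swapping `math.dist` for another distance function changes the similarity measure without touching the rest of the sketch.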