ParIS+: Data Series Indexing on Multi-Core Architectures

Paper Title

ParIS+: Data Series Indexing on Multi-Core Architectures

Authors

Botao Peng, Panagiota Fatourou, Themis Palpanas

Abstract

Data series similarity search is a core operation for several data series analysis applications across many different domains. Nevertheless, even state-of-the-art techniques cannot provide the time performance required for large data series collections. We propose ParIS and ParIS+, the first disk-based data series indices carefully designed to inherently take advantage of multi-core architectures, in order to accelerate similarity search processing times. Our experiments demonstrate that ParIS+ completely removes the CPU latency during index construction for disk-resident data, and for exact query answering is up to 1 order of magnitude faster than the current state-of-the-art index scan method, and up to 3 orders of magnitude faster than the optimized serial scan method. ParIS+ (which is an evolution of the ADS+ index) owes its efficiency to the effective use of multi-core and multi-socket architectures, in order to distribute and execute in parallel both index construction and query answering, and to the exploitation of the Single Instruction Multiple Data (SIMD) capabilities of modern CPUs, in order to further parallelize the execution of instructions inside each core.
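
To make the idea of distributing exact query answering across cores more concrete, the following is a minimal sketch of a parallel scan for the exact nearest neighbor of a query series. It is not the ParIS+ algorithm: it uses no index and no SIMD, and it only illustrates splitting the collection among workers and merging their local results. The function names (`squared_distance`, `scan_chunk`, `parallel_exact_nn`), the `num_workers` parameter, and the choice of Euclidean distance with early abandoning are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def squared_distance(query, series, best_so_far):
    """Squared Euclidean distance with early abandoning.

    Returns math.inf as soon as the running sum exceeds best_so_far,
    since such a series cannot be the nearest neighbor.
    """
    total = 0.0
    for q, s in zip(query, series):
        diff = q - s
        total += diff * diff
        if total > best_so_far:
            return math.inf
    return total


def scan_chunk(query, collection, start, end):
    """Return (best squared distance, index) within collection[start:end]."""
    best_dist, best_idx = math.inf, -1
    for i in range(start, end):
        d = squared_distance(query, collection[i], best_dist)
        if d < best_dist:
            best_dist, best_idx = d, i
    return best_dist, best_idx


def parallel_exact_nn(query, collection, num_workers=4):
    """Exact 1-NN search: each worker scans one contiguous chunk."""
    n = len(collection)
    if n == 0:
        raise ValueError("collection is empty")
    chunk = math.ceil(n / num_workers)
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(lambda b: scan_chunk(query, collection, *b), bounds)
        # Merge the per-chunk answers; ties go to the smaller index.
        best_dist, best_idx = min(results)
    return best_idx, math.sqrt(best_dist)


if __name__ == "__main__":
    data = [[0, 0, 0], [1, 1, 1], [5, 5, 5], [2, 2, 2], [9, 9, 9]]
    print(parallel_exact_nn([2, 2, 1], data, num_workers=2))
```

In CPython, threads running pure-Python arithmetic are serialized by the global interpreter lock, so this sketch shows the partition-and-merge structure rather than a real speedup. A practical implementation would more likely use native threads in a compiled language, which is also where SIMD instructions could be applied to the inner distance loop.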
