加速域传播：稀疏矩阵上有效的GPU平行算法

论文标题

加速域传播：稀疏矩阵上有效的GPU平行算法

Accelerating Domain Propagation: an Efficient GPU-Parallel Algorithm over Sparse Matrices

论文作者

Sofranac, Boro, Gleixner, Ambros, Pokutta, Sebastian

论文摘要

线性约束的快速域传播已成为当今最佳算法和求解器的关键组成部分，用于混合整数编程和伪树 - 树状优化，以实现峰值解决性能。输入数据中动态算法行为，依赖性结构和稀疏模式的形式的不规则性使得对GPU的域传播有效实现，更广泛地对并行体系结构具有挑战性。这是最新求解器中的域传播仅是单线线程的主要原因之一。在本文中，我们提出了一种用于域传播的新算法，该算法（a）避免了这些问题并允许对GPU进行有效的实现，并且（b）能够完全在GPU上运行传播，而无需任何同步或与CPU进行交流。我们提出了广泛的计算结果，这些结果证明了我们方法的有效性，并表明在实际相关问题上可能有足够的加速：在最新的GPU上，我们对合理大型实例的几何平均加速度约为10倍至20倍，并且在有利的实例上可能高达180倍。

Fast domain propagation of linear constraints has become a crucial component of today's best algorithms and solvers for mixed integer programming and pseudo-boolean optimization to achieve peak solving performance. Irregularities in the form of dynamic algorithmic behaviour, dependency structures, and sparsity patterns in the input data make efficient implementations of domain propagation on GPUs and, more generally, on parallel architectures challenging. This is one of the main reasons why domain propagation in state-of-the-art solvers is single thread only. In this paper, we present a new algorithm for domain propagation which (a) avoids these problems and allows for an efficient implementation on GPUs, and is (b) capable of running propagation rounds entirely on the GPU, without any need for synchronization or communication with the CPU. We present extensive computational results which demonstrate the effectiveness of our approach and show that ample speedups are possible on practically relevant problems: on state-of-the-art GPUs, our geometric mean speed-up for reasonably-large instances is around 10x to 20x and can be as high as 180x on favorably-large instances.

下载PDF全文

下载文献需遵守相关版权规定

论文标题