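Probabilistic performance estimators for computational chemistry methods: Systematic Improvement Probability and Ranking Probability Matrix. I. Theory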

Paper title

Probabilistic performance estimators for computational chemistry methods: Systematic Improvement Probability and Ranking Probability Matrix. I. Theory

Authors

Pascal Pernot, Andreas Savin

Abstract

The comparison of benchmark error sets is an essential tool for the evaluation of theories in computational chemistry. The standard ranking of methods by their Mean Unsigned Error is unsatisfactory for several reasons linked to the non-normality of the error distributions and the presence of underlying trends. Complementary statistics have recently been proposed to palliate such deficiencies, such as quantiles of the absolute error distribution or the mean prediction uncertainty. We introduce here a new score, the systematic improvement probability (SIP), based on the direct system-wise comparison of absolute errors. Independently of the chosen scoring rule, the uncertainty of the statistics due to the incompleteness of the benchmark data sets is also generally overlooked. However, this uncertainty is essential to appreciate the robustness of rankings. In the present article, we develop two indicators based on robust statistics to address this problem: P_{inv}, the inversion probability between two values of a statistic, and \mathbf{P}_{r}, the ranking probability matrix. We also demonstrate the essential contribution of the correlations between error sets in these score comparisons.
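
The abstract gives no formulas, so the following is only a minimal sketch of how the three quantities might be estimated, not the authors' definitions. It assumes that SIP is the fraction of systems on which one method has a strictly smaller absolute error than another, and that P_{inv} and \mathbf{P}_{r} can be approximated by a paired bootstrap in which the same resampled systems are used for every method, so that correlations between error sets are preserved. The plain Mean Unsigned Error stands in for the robust statistics mentioned in the abstract, and all function names (`sip`, `inversion_probability`, `ranking_probability_matrix`, `paired_resamples`) are hypothetical.

```python
import random


def mue(errors):
    """Mean Unsigned Error of one error set (placeholder statistic)."""
    return sum(abs(e) for e in errors) / len(errors)


def sip(errors_a, errors_b):
    """Assumed SIP: fraction of systems where B has a smaller |error| than A."""
    wins = sum(1 for a, b in zip(errors_a, errors_b) if abs(b) < abs(a))
    return wins / len(errors_a)


def paired_resamples(n_systems, n_boot, seed):
    """Yield bootstrap index lists; each list is shared by all methods."""
    rng = random.Random(seed)
    for _ in range(n_boot):
        yield [rng.randrange(n_systems) for _ in range(n_systems)]


def inversion_probability(errors_a, errors_b, statistic=mue, n_boot=2000, seed=0):
    """Bootstrap frequency with which the sign of S(A) - S(B) flips."""
    observed = statistic(errors_a) - statistic(errors_b)
    flips = 0
    for idx in paired_resamples(len(errors_a), n_boot, seed):
        diff = (statistic([errors_a[i] for i in idx])
                - statistic([errors_b[i] for i in idx]))
        if diff * observed < 0:
            flips += 1
    return flips / n_boot


def ranking_probability_matrix(error_sets, statistic=mue, n_boot=2000, seed=0):
    """P[m][k]: bootstrap frequency with which method m gets rank k (0 = best)."""
    n_methods = len(error_sets)
    counts = [[0] * n_methods for _ in range(n_methods)]
    for idx in paired_resamples(len(error_sets[0]), n_boot, seed):
        scores = [statistic([errs[i] for i in idx]) for errs in error_sets]
        order = sorted(range(n_methods), key=lambda m: scores[m])
        for rank, m in enumerate(order):
            counts[m][rank] += 1
    return [[c / n_boot for c in row] for row in counts]
```

Resampling system indices once per bootstrap draw, rather than independently per method, is the design choice that lets correlated error sets yield a smaller inversion probability than uncorrelated ones with the same marginal spread.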
