Paper Title
Towards Benchmarking and Evaluating Deepfake Detection
Paper Authors
Abstract
Deepfake detection automatically identifies manipulated media by analyzing the differences between manipulated and unaltered videos. It is natural to ask which existing deepfake detection approaches perform best, both to identify promising research directions and to provide practical guidance. Unfortunately, it is difficult to conduct a sound benchmark comparison of existing detection approaches using results from the literature, because evaluation conditions are inconsistent across studies. Our objective is to establish a comprehensive and consistent benchmark, to develop a repeatable evaluation procedure, and to measure the performance of a range of detection approaches so that the results can be compared soundly. We collected a challenging dataset of manipulated samples generated by more than 13 different methods, and implemented and evaluated 11 popular detection approaches (9 algorithms) from the existing literature using 6 fair and practical evaluation metrics. In total, 92 models were trained and 644 experiments were performed for the evaluation. The results, together with the shared data and evaluation methodology, constitute a benchmark for comparing deepfake detection approaches and measuring progress.