Paper Title
Deep Learning for Classification of Thyroid Nodules on Ultrasound: Validation on an Independent Dataset
Authors
Abstract
Objectives: To apply a previously validated deep learning algorithm to a new thyroid nodule ultrasound image dataset and compare its performance with that of radiologists. Methods: A prior study presented an algorithm that detects thyroid nodules and then classifies them for malignancy using two ultrasound images. A multi-task deep convolutional neural network was trained on 1278 nodules and originally tested on 99 separate nodules. The results were comparable to those of radiologists. The algorithm was further tested on 378 nodules imaged with ultrasound machines of different manufacturers and product types than those used for the training cases. Four experienced radiologists were asked to evaluate the nodules for comparison with deep learning. Results: The area under the curve (AUC) of the deep learning algorithm and of each of the four radiologists was calculated with parametric, binormal estimation. For the deep learning algorithm, the AUC was 0.69 (95% CI: 0.64 - 0.75). The AUCs of the radiologists were 0.63 (95% CI: 0.59 - 0.67), 0.66 (95% CI: 0.61 - 0.71), 0.65 (95% CI: 0.60 - 0.70), and 0.63 (95% CI: 0.58 - 0.67). Conclusion: On the new testing dataset, the deep learning algorithm achieved performance similar to that of all four radiologists. The relative performance difference between the algorithm and the radiologists was not significantly affected by differences in ultrasound scanner.
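The parametric, binormal AUC mentioned in the Results can be illustrated with a minimal Python sketch. It assumes that scores for benign and malignant nodules are each normally distributed and fits the two distributions by sample mean and standard deviation. That moment-based fit is a simplification chosen here for illustration; the abstract does not state which fitting procedure or software was used. The function name binormal_auc and the example scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def binormal_auc(neg_scores, pos_scores):
    """AUC under the binormal model, fitted by sample moments (an assumption)."""
    mu0, sd0 = mean(neg_scores), stdev(neg_scores)
    mu1, sd1 = mean(pos_scores), stdev(pos_scores)
    # Binormal ROC parameters: a is the standardized separation of the two
    # score distributions, b is the ratio of their standard deviations.
    a = (mu1 - mu0) / sd1
    b = sd0 / sd1
    # Closed form for the area under a binormal ROC curve:
    # AUC = Phi(a / sqrt(1 + b^2)), with Phi the standard normal CDF.
    return NormalDist().cdf(a / sqrt(1 + b * b))


# Hypothetical malignancy scores, not data from the study.
benign = [0.10, 0.25, 0.30, 0.45, 0.50]
malignant = [0.35, 0.50, 0.60, 0.70, 0.85]
auc = binormal_auc(benign, malignant)
print(f"binormal AUC = {auc:.2f}")
```

Under this model, an AUC of 0.5 corresponds to identical score distributions for the two classes, and values closer to 1 indicate better separation of malignant from benign nodules.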