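A neural network model that learns differences in diagnosis strategies among radiologists has an improved area under the curve for aneurysm status classification in magnetic resonance angiography image series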

Paper title

A neural network model that learns differences in diagnosis strategies among radiologists has an improved area under the curve for aneurysm status classification in magnetic resonance angiography image series

Authors

Tachibana, Yasuhiko; Nishimori, Masataka; Kitamura, Naoyuki; Umehara, Kensuke; Ota, Junko; Obata, Takayuki; Higashi, Tatsuya

Abstract

Purpose: To construct a neural network model that can learn the different diagnosing strategies of radiologists to better classify aneurysm status in magnetic resonance angiography images.

Materials and methods: This retrospective study included 3423 time-of-flight brain magnetic resonance angiography image series (subjects: male 1843 [mean age, 50.2 +/- 11.7 years], female 1580 [50.8 +/- 11.3 years]) recorded from November 2017 through January 2019. The image series were read independently for aneurysm status by one of four board-certified radiologists, who were assisted by an established deep learning-based computer-assisted diagnosis (CAD) system. The constructed neural networks were trained to classify the aneurysm status of zero to five aneurysm-suspicious areas suggested by the CAD system for each image series, and any additional aneurysm areas added by the radiologists, and this classification was compared with the judgment of the annotating radiologist. Image series were randomly allocated to training and testing data in an 8:2 ratio. The accuracy of the classification was compared by receiver operating characteristic analysis between the control model that accepted only image data as input and the proposed model that additionally accepted the information of who the annotating radiologist was. The DeLong test was used to compare areas under the curves (P < 0.05 was considered significant).

Results: The area under the curve was larger in the proposed model (0.845) than in the control model (0.793), and the difference was significant (P < 0.0001).

Conclusion: The proposed model improved classification accuracy by learning the diagnosis strategies of individual annotating radiologists.
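
The comparison described in the abstract can be illustrated with a toy sketch. It is not the authors' network: the abstract does not say how the radiologist's identity is encoded, so the one-hot encoding, the function names (`one_hot`, `proposed_score`, `roc_auc`), the per-reader offsets, and all data below are assumptions made up for illustration. The numbers are synthetic and unrelated to the reported AUCs of 0.845 and 0.793. The sketch only shows why a score that knows who the reader is can rank that reader's own judgments better than a score based on the image alone.

```python
from typing import List, Sequence


def one_hot(reader_id: int, n_readers: int = 4) -> List[float]:
    """Encode the annotating radiologist as a one-hot vector (assumed encoding)."""
    if not 0 <= reader_id < n_readers:
        raise ValueError("reader_id out of range")
    return [1.0 if i == reader_id else 0.0 for i in range(n_readers)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def proposed_score(suspicion: float, reader_id: int,
                   reader_offsets: Sequence[float]) -> float:
    """Image-derived suspicion shifted by a per-reader offset.

    The offset stands in for what a trained model might learn from the
    reader-identity input; here it is hard-coded, not learned.
    """
    encoding = one_hot(reader_id, len(reader_offsets))
    offset = sum(e * w for e, w in zip(encoding, reader_offsets))
    return suspicion - offset


# Synthetic data: two hypothetical readers see the same four candidate areas.
# Reader 0 calls an area an aneurysm above suspicion 3, reader 1 above 7.
thresholds = [3, 7]
suspicions = [2, 4, 6, 8]
samples = [(s, r) for r in range(len(thresholds)) for s in suspicions]
labels = [1 if s > thresholds[r] else 0 for s, r in samples]

# Control: image information only. Proposed: image plus reader identity.
control_scores = [float(s) for s, _ in samples]
proposed_scores = [proposed_score(s, r, thresholds) for s, r in samples]

control_auc = roc_auc(labels, control_scores)
proposed_auc = roc_auc(labels, proposed_scores)
print(f"control AUC:  {control_auc:.3f}")   # control AUC:  0.875
print(f"proposed AUC: {proposed_auc:.3f}")  # proposed AUC: 1.000
```

In this toy setting the control score cannot separate the cases where the two readers disagree (suspicion 4 and 6), while the reader-aware score can. In a real study the two AUCs would be compared with a significance test such as the DeLong test named in the abstract, which is not implemented here.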
