Authors: Qiu Z, Rivaz H, Xiao Y
Background: As visual inspection is an inherent process during radiological screening, the associated eye gaze data can provide valuable insights into relevant clinical decision processes and facilitate computer-assisted diagnosis. However, the relevant techniques are still under-explored.
Purpose: With deep learning becoming the state-of-the-art for computer-assisted diagnosis, integrating human behavior, such as eye gaze data, into these systems is instrumental in guiding machine predictions with clinical diagnostic criteria, thus enhancing the quality of automatic radiological diagnosis. In addition, the ability to predict a radiologist's gaze saliency from a clinical scan, alongside the automatic diagnostic result, could be valuable for end users.
Methods: We propose a novel deep learning framework for joint disease diagnosis and prediction of corresponding radiological gaze saliency maps for chest x-ray scans. Specifically, we introduce a new dual-encoder multitask UNet, which leverages both a DenseNet201 backbone and a residual, squeeze-and-excitation (SE) block-based encoder to extract diverse features for visual saliency map prediction, together with a multiscale feature-fusion classifier to perform disease classification. To tackle the issue of asynchronous training schedules of the individual tasks in multitask learning, we propose a multistage cooperative learning strategy, with contrastive learning for feature encoder pretraining to boost performance.
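To make the dual-encoder multitask design concrete, the sketch below shows one way such a network could be organized in PyTorch: a DenseNet201 feature encoder and a lightweight residual-plus-SE encoder whose outputs are fused at the bottleneck, feeding both a UNet-style upsampling decoder for the gaze saliency map and a pooled feature-fusion head for disease classification. All module names, channel sizes, and the fusion/decoder layout are illustrative assumptions rather than the authors' exact implementation.

```python
# Minimal sketch of a dual-encoder multitask UNet-style model (assumed layout,
# not the paper's exact architecture). Requires torch and torchvision.
import torch
import torch.nn as nn
from torchvision.models import densenet201


class SEResBlock(nn.Module):
    """Residual block gated by a squeeze-and-excitation (SE) channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return torch.relu(x + y * self.se(y))  # residual path + SE-gated branch


class DualEncoderMultitaskUNet(nn.Module):
    """Two encoders -> fused bottleneck -> (a) saliency decoder, (b) classifier."""
    def __init__(self, num_classes=3):
        super().__init__()
        # Encoder 1: DenseNet201 backbone (1920 channels at 1/32 resolution).
        self.enc_dense = densenet201(weights=None).features
        # Encoder 2: small residual + SE encoder, also ending at 1/32 resolution.
        self.enc_se = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.MaxPool2d(2), SEResBlock(64),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), SEResBlock(128),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), SEResBlock(256),
            nn.MaxPool2d(2),
        )
        fused = 1920 + 256
        # Saliency branch: upsample fused features back to input size, 1-channel map.
        self.decoder = nn.Sequential(
            nn.Conv2d(fused, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 1), nn.Sigmoid(),
        )
        # Classification branch: fuse globally pooled features from both encoders.
        self.classifier = nn.Sequential(
            nn.Linear(fused, 512), nn.ReLU(inplace=True),
            nn.Dropout(0.3), nn.Linear(512, num_classes),
        )

    def forward(self, x):
        f = torch.cat([self.enc_dense(x), self.enc_se(x)], dim=1)
        saliency = self.decoder(f)                    # predicted gaze saliency map
        logits = self.classifier(f.mean(dim=(2, 3)))  # disease classification logits
        return logits, saliency


if __name__ == "__main__":
    model = DualEncoderMultitaskUNet(num_classes=3)
    logits, saliency = model(torch.randn(1, 3, 224, 224))
    print(logits.shape, saliency.shape)  # [1, 3] and [1, 1, 224, 224]
```

In the described multistage cooperative scheme, such a model would not be trained end-to-end in one pass: the encoders could first be pretrained with a contrastive objective, after which the saliency and classification heads are trained with their task-specific losses, alternating or weighting stages so that neither task's schedule dominates the other. The staging details here are, again, an assumption based on the abstract.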
Results: Our proposed method is shown to significantly outperform existing techniques in both chest radiography diagnosis (AUC = 0.93) and the quality of visual saliency map prediction (correlation coefficient = 0.58).
Conclusion: Enabled by the proposed multitask, multistage cooperative learning, our technique demonstrates the value of integrating clinicians' eye gaze into radiological AI systems to boost performance and, potentially, explainability.
Keywords: computer-assisted diagnosis; contrastive learning; multitask learning; visual attention; x-ray
PubMed: https://pubmed.ncbi.nlm.nih.gov/40665596/
DOI: 10.1002/mp.17977