CXR-MultiTaskNet: a unified deep learning framework for joint disease localization and classification in chest radiographs
Authors
Affiliations (2)
- Research Scholar, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Aziz Nagar, Hyderabad, Telangana, 500075, India. [email protected].
- Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Aziz Nagar, Hyderabad, Telangana, 500075, India.
Abstract
Automated chest X-ray (CXR) analysis is a challenging problem in medical diagnosis, where complex visual patterns of thoracic diseases must be precisely identified through multi-label classification and lesion localization. Current approaches typically treat classification and localization in isolation, yielding piecemeal systems that do not exploit shared representations, offer limited clinical interpretability, and handle multi-label disease patterns poorly. Although multi-task learning frameworks such as DeepChest and CLN appear to meet this goal, they suffer from task interference and poor explainability, which limits their practical application in real-world clinical workflows. To address these limitations, we present a unified multi-task deep learning framework, CXR-MultiTaskNet, for simultaneously classifying thoracic diseases and localizing lesions in chest X-rays. Our framework comprises a shared ResNet50 feature extractor, two task-specific heads for multi-task learning, and a Grad-CAM-based explainability module, providing accurate predictions together with clinically meaningful visual explanations. We formulate a joint loss that balances the classification and localization objectives, compensating for extreme class imbalance and the varying detectability of different disease manifestations. Recent deep learning methods show promise for disease identification in chest X-ray images; however, their performance for complete analysis is limited by a lack of interpretability, inherent weaknesses of convolutional neural networks (CNNs), and pipelines that learn image-level classification before localizing the disease. In this paper, we propose a dual-attention-based hierarchical feature extraction approach that addresses these challenges of deep learning for disease detection in chest X-ray images. Through the use of visual attention maps, the detection steps can be better tracked, making the entire process more interpretable than a traditional CNN-embedding model. The framework also produces both disease-level and pixel-level predictions, enabling explainable, comprehensive analysis of each image and localization of each detected abnormality region. The proposed approach is further optimized for X-ray images by weighting the objective losses during training so that smaller lesions receive greater significance. Experimental evaluations on a benchmark chest X-ray dataset demonstrate the potential of the proposed approach, which achieves a macro F1-score of 0.965 (micro F1-score of 0.968) for disease classification and a mean IoU of 0.851 (mAP@0.5 of 0.927) for lesion localization. These results consistently exceed state-of-the-art single-task and multi-task baselines. The presented framework provides a unified, clinically useful, interpretable, and scalable approach to automated chest X-ray analysis, supporting more efficient diagnostic pathways and enhanced clinical decision-making. This single framework can serve as a basis for next-generation explainable AI in radiology.
Keywords: Model interpretability, Chest X-ray image disease detection, Detection region localization, Weakly supervised transfer learning, Lesion localization
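
To make the architecture described in the abstract concrete, the sketch below shows one way a shared ResNet50 backbone with two task-specific heads and a weighted joint loss could be wired up in PyTorch. It is a minimal illustration only: the class name MultiTaskCXRNet, the per-disease box parameterization, the number of disease labels, and the loss weights lambda_cls and lambda_loc are assumptions, not the published model's exact configuration.

```python
# Minimal sketch of a shared-backbone, two-head multi-task model in the spirit
# of CXR-MultiTaskNet. Head shapes, box parameterization, and loss weights are
# illustrative assumptions, not the authors' exact design.
import torch
import torch.nn as nn
from torchvision import models


class MultiTaskCXRNet(nn.Module):
    def __init__(self, num_diseases: int = 14):
        super().__init__()
        backbone = models.resnet50(weights=None)  # shared ResNet50 feature extractor
        # Keep convolutional stages only (drop avgpool/fc) to retain spatial feature maps
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Task head 1: multi-label disease classification (one logit per disease)
        self.cls_head = nn.Linear(2048, num_diseases)
        # Task head 2: coarse lesion localization, one (x, y, w, h) box per disease
        self.loc_head = nn.Linear(2048, num_diseases * 4)

    def forward(self, x):
        fmap = self.features(x)                # B x 2048 x H' x W' (Grad-CAM target layer)
        feat = self.pool(fmap).flatten(1)      # B x 2048 shared representation
        cls_logits = self.cls_head(feat)       # B x num_diseases
        boxes = self.loc_head(feat).view(x.size(0), -1, 4).sigmoid()  # normalized boxes
        return cls_logits, boxes, fmap


def joint_loss(cls_logits, boxes, labels, target_boxes, pos_weight,
               lambda_cls=1.0, lambda_loc=1.0):
    """Weighted joint objective (assumed form): imbalance-aware BCE for
    classification plus an L1 box term for localization."""
    cls_loss = nn.functional.binary_cross_entropy_with_logits(
        cls_logits, labels, pos_weight=pos_weight)  # pos_weight counters class imbalance
    loc_loss = nn.functional.l1_loss(boxes, target_boxes)
    return lambda_cls * cls_loss + lambda_loc * loc_loss
```

In practice the two lambda weights would be tuned (or learned) to trade off the classification and localization objectives, and Grad-CAM explanations would be computed against the returned convolutional feature map.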