Scaling Chest X-ray Foundation Models from Mixed Supervisions for Dense Prediction.

Wang F, Yu L

PubMed · Jul 16, 2025
Foundation models have revolutionized chest X-ray diagnosis through their ability to transfer across diseases and tasks. However, previous works have relied predominantly on self-supervised learning from medical image-text pairs; this coarse pair-level supervision falls short on dense medical prediction tasks, limiting applicability to detailed diagnostics. In this paper, we introduce a Dense Chest X-ray Foundation Model (DCXFM), which uses mixed supervision types (i.e., text, labels, and segmentation masks) to significantly enhance the scalability of foundation models across various medical tasks. Our model involves two training stages: we first employ a novel self-distilled multimodal pretraining paradigm to exploit text and label supervision, along with local-to-global self-distillation and soft cross-modal contrastive alignment strategies to enhance localization capabilities. Subsequently, we introduce an efficient cost aggregation module, comprising spatial and class aggregation mechanisms, to further advance dense prediction on densely annotated datasets. Comprehensive evaluations on three tasks (phrase grounding, zero-shot semantic segmentation, and zero-shot classification) demonstrate DCXFM's superior performance over other state-of-the-art medical image-text pretraining models. Remarkably, DCXFM exhibits powerful zero-shot capabilities across various datasets in phrase grounding and zero-shot semantic segmentation, underscoring its superior generalization in dense prediction tasks.
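
A minimal sketch of the soft cross-modal contrastive alignment idea mentioned above, assuming precomputed image and text embeddings and a label-derived soft similarity matrix; the function name, the soft-target construction, and the temperature are illustrative assumptions, not the published DCXFM objective.

```python
import torch
import torch.nn.functional as F

def soft_contrastive_loss(img_emb, txt_emb, label_sim, temperature=0.07):
    """img_emb, txt_emb: (N, D) embeddings; label_sim: (N, N) non-negative
    soft targets, e.g. label overlap between pairs (hypothetical construction)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature              # (N, N) similarities
    t_i2t = label_sim / label_sim.sum(dim=1, keepdim=True)    # soft image->text targets
    t_t2i = label_sim.t() / label_sim.t().sum(dim=1, keepdim=True)
    loss_i2t = -(t_i2t * F.log_softmax(logits, dim=1)).sum(1).mean()
    loss_t2i = -(t_t2i * F.log_softmax(logits.t(), dim=1)).sum(1).mean()
    return 0.5 * (loss_i2t + loss_t2i)                        # symmetric soft InfoNCE
```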

Interpreting Radiologist's Intention from Eye Movements in Chest X-ray Diagnosis

Trong-Thang Pham, Anh Nguyen, Zhigang Deng, Carol C. Wu, Hien Van Nguyen, Ngan Le

arXiv preprint · Jul 16, 2025
Radiologists rely on eye movements to navigate and interpret medical images. A trained radiologist possesses knowledge about the potential diseases that may be present in the images and, when searching, follows a mental checklist to locate them using their gaze. This is a key observation, yet existing models fail to capture the underlying intent behind each fixation. In this paper, we introduce a deep learning-based approach, RadGazeIntent, designed to model this behavior: having an intention to find something and actively searching for it. Our transformer-based architecture processes both the temporal and spatial dimensions of gaze data, transforming fine-grained fixation features into coarse, meaningful representations of diagnostic intent to interpret radiologists' goals. To capture the nuances of radiologists' varied intention-driven behaviors, we process existing medical eye-tracking datasets to create three intention-labeled subsets: RadSeq (Systematic Sequential Search), RadExplore (Uncertainty-driven Exploration), and RadHybrid (Hybrid Pattern). Experimental results demonstrate RadGazeIntent's ability to predict which findings radiologists are examining at specific moments, outperforming baseline methods across all intention-labeled datasets.
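
As a rough illustration of the architecture described above, a transformer over temporal gaze data that maps per-fixation features to intent labels, here is a hedged PyTorch sketch; the feature layout (x, y, duration), dimensions, and classification head are assumptions, not the published RadGazeIntent design.

```python
import torch
import torch.nn as nn

class GazeIntentModel(nn.Module):
    def __init__(self, in_dim=3, d_model=128, n_heads=4, n_layers=2, n_intents=5):
        super().__init__()
        self.embed = nn.Linear(in_dim, d_model)           # (x, y, duration) -> d_model
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_intents)         # per-fixation intent logits

    def forward(self, fixations, pad_mask=None):
        # fixations: (B, T, 3); pad_mask: (B, T), True where padded
        h = self.encoder(self.embed(fixations), src_key_padding_mask=pad_mask)
        return self.head(h)                               # (B, T, n_intents)

model = GazeIntentModel()
logits = model(torch.randn(2, 40, 3))                     # two scanpaths, 40 fixations
```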

Collaborative Integration of AI and Human Expertise to Improve Detection of Chest Radiograph Abnormalities.

Awasthi A, Le N, Deng Z, Wu CC, Nguyen HV

PubMed · Jul 16, 2025
<i>"Just Accepted" papers have undergone full peer review and have been accepted for publication in <i>Radiology: Artificial Intelligence</i>. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.</i> Purpose To develop a collaborative AI system that integrates eye gaze data and radiology reports to improve diagnostic accuracy in chest radiograph interpretation by identifying and correcting perceptual errors. Materials and Methods This retrospective study utilized public datasets REFLACX and EGD-CXR to develop a collaborative AI solution, named Collaborative Radiology Expert (CoRaX). It employs a large multimodal model to analyze image embeddings, eye gaze data, and radiology reports, aiming to rectify perceptual errors in chest radiology. The proposed system was evaluated using two simulated error datasets featuring random and uncertain alterations of five abnormalities. Evaluation focused on the system's referral-making process, the quality of referrals, and its performance within collaborative diagnostic settings. Results In the random masking-based error dataset, 28.0% (93/332) of abnormalities were altered. The system successfully corrected 21.3% (71/332) of these errors, with 6.6% (22/332) remaining unresolved. The accuracy of the system in identifying the correct regions of interest for missed abnormalities was 63.0% [95% CI: 59.0%, 68.0%], and 85.7% (240/280) of interactions with radiologists were deemed satisfactory, meaning that the system provided diagnostic aid to radiologists. In the uncertainty-masking-based error dataset, 43.9% (146/332) of abnormalities were altered. The system corrected 34.6% (115/332) of these errors, with 9.3% (31/332) unresolved. The accuracy of predicted regions of missed abnormalities for this dataset was 58.0% [95% CI: 55.0%, 62.0%], and 78.4% (233/297) of interactions were satisfactory. Conclusion The CoRaX system can collaborate efficiently with radiologists and address perceptual errors across various abnormalities in chest radiographs. ©RSNA, 2025.

Artificial intelligence-based diabetes risk prediction from longitudinal DXA bone measurements.

Khan S, Shah Z

PubMed · Jul 16, 2025
Diabetes mellitus (DM) is a serious global health concern that poses a significant threat to human life. Beyond its direct impact, diabetes substantially increases the risk of severe complications such as hypertension, cardiovascular disease, and musculoskeletal disorders like arthritis and osteoporosis. Diabetes classification has advanced significantly with the use of diverse data modalities and sophisticated tools, but predicting diabetes before its onset, particularly from longitudinal multi-modal data, remains relatively underexplored. To better understand the risk factors associated with diabetes development among Qatari adults, this longitudinal study investigates dual-energy X-ray absorptiometry (DXA)-derived whole-body and regional bone composition measures as potential predictors of diabetes onset. We conducted a retrospective case-control study of 1,382 participants: 725 male (cases: 146, controls: 579) and 657 female (cases: 133, controls: 524), excluding participants with incomplete data. To handle class imbalance, we augmented the data using the Synthetic Minority Over-sampling Technique (SMOTE) and SMOTEENN (SMOTE with Edited Nearest Neighbors), and to further investigate the association between bone features and diabetes status we applied ANOVA. For diabetes onset prediction, we employed both conventional and deep learning (DL) models, and we used SHAP and probabilistic methods to examine the association of identified risk factors with diabetes. We found that bone mineral density (BMD) and bone mineral content (BMC) in the hip, femoral neck, trochanteric area, and lumbar spine showed an upward trend in diabetic patients with [Formula: see text]. Meanwhile, patients with abnormal glucose metabolism had increased Ward's area BMD and BMC with lower Z-scores than healthy participants. The diabetic group thus showed better bone health than the control group in this cohort, exhibiting higher BMD, muscle mass, and bone area across most body regions. In the age-distribution analysis, the prediction rate was higher for healthy participants in the younger 20-40-year group, while with increasing age the model's predictions became more accurate for diabetic participants, especially in the older 56-69-year group. Male participants also demonstrated higher susceptibility to diabetes onset than female participants. Shallow models outperformed the DL models, achieving higher accuracy (91.08%), AUROC (96%), and recall (91%). This approach highlights the significant potential of DXA scans for rapid and minimally invasive early detection of diabetes.
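
The resampling step described above maps directly onto imbalanced-learn; a minimal sketch on stand-in tabular features (the DXA feature matrix and labels here are synthetic placeholders).

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.combine import SMOTEENN

X = np.random.rand(200, 12)                  # stand-in for DXA bone features
y = np.r_[np.zeros(170), np.ones(30)]        # imbalanced case/control labels

X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)       # oversample minority class
X_se, y_se = SMOTEENN(random_state=0).fit_resample(X, y)    # oversample + ENN cleaning
print(np.bincount(y_sm.astype(int)), np.bincount(y_se.astype(int)))
```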

Deep learning-assisted comparison of different models for predicting maxillary canine impaction on panoramic radiography.

Zhang C, Zhu H, Long H, Shi Y, Guo J, You M

PubMed · Jul 16, 2025
The panoramic radiograph is the most commonly used imaging modality for predicting maxillary canine impaction. Several prediction models have been constructed based on panoramic radiographs. This study aimed to compare the prediction accuracy of existing models in an external validation facilitated by an automatic landmark detection system based on deep learning. Patients aged 7-14 years who underwent panoramic radiographic examinations and received a diagnosis of impacted canines were included in the study. An automatic landmark localization system was employed to assist the measurement of geometric parameters on the panoramic radiographs, followed by the calculated prediction of the canine impaction. Three prediction models constructed by Arnautska, Alqerban et al, and Margot et al were evaluated. The metrics of accuracy, sensitivity, specificity, precision, and area under the receiver operating characteristic curve (AUC) were used to compare the performance of different models. A total of 102 panoramic radiographs with 102 impacted canines and 102 nonimpacted canines were analyzed in this study. The prediction outcomes indicated that the model by Margot et al achieved the highest performance, with a sensitivity of 95% and a specificity of 86% (AUC, 0.97), followed by the model by Arnautska, with a sensitivity of 93% and a specificity of 71% (AUC, 0.94). The model by Alqerban et al showed poor performance with an AUC of only 0.20. Two of the existing predictive models exhibited good diagnostic accuracy, whereas the third model demonstrated suboptimal performance. Nonetheless, even the most effective model is constrained by several limitations, such as logical and computational challenges, which necessitate further refinement.
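
For readers replicating this kind of external validation, the reported metrics follow from each model's predicted impaction probabilities on the 204 canines; a hedged scikit-learn sketch on synthetic scores (the 0.5 decision threshold is an assumption).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)
y_true = np.r_[np.ones(102), np.zeros(102)]            # impacted vs nonimpacted canines
y_prob = np.clip(y_true * 0.7 + rng.normal(0.3, 0.2, 204), 0, 1)  # toy model scores

y_pred = (y_prob >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("AUC:", roc_auc_score(y_true, y_prob))
```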

Site-Level Fine-Tuning with Progressive Layer Freezing: Towards Robust Prediction of Bronchopulmonary Dysplasia from Day-1 Chest Radiographs in Extremely Preterm Infants

Sybelle Goedicke-Fritz, Michelle Bous, Annika Engel, Matthias Flotho, Pascal Hirsch, Hannah Wittig, Dino Milanovic, Dominik Mohr, Mathias Kaspar, Sogand Nemat, Dorothea Kerner, Arno Bücker, Andreas Keller, Sascha Meyer, Michael Zemlin, Philipp Flotho

arXiv preprint · Jul 16, 2025
Bronchopulmonary dysplasia (BPD) is a chronic lung disease affecting 35% of extremely low birth weight infants. Defined by oxygen dependence at 36 weeks postmenstrual age, it causes lifelong respiratory complications. However, preventive interventions carry severe risks, including neurodevelopmental impairment, ventilator-induced lung injury, and systemic complications. Early BPD prognosis and prediction of BPD outcome are therefore crucial to avoid unnecessary toxicity in low-risk infants. Admission radiographs of extremely preterm infants are routinely acquired within 24 h of life and could serve as a non-invasive prognostic tool. In this work, we developed and investigated a deep learning approach using chest X-rays from 163 extremely low birth weight infants (≤32 weeks gestation, 401-999 g) obtained within 24 hours of birth. We fine-tuned a ResNet-50 pretrained specifically on adult chest radiographs, employing progressive layer freezing with discriminative learning rates to prevent overfitting, and evaluated CutMix augmentation and linear probing. For moderate/severe BPD outcome prediction, our best performing model, with progressive freezing, linear probing, and CutMix, achieved an AUROC of 0.78 ± 0.10, a balanced accuracy of 0.69 ± 0.10, and an F1-score of 0.67 ± 0.11. In-domain pretraining significantly outperformed ImageNet initialization (p = 0.031), confirming that domain-specific pretraining is important for BPD outcome prediction. Routine IRDS grades showed limited prognostic value (AUROC 0.57 ± 0.11), confirming the need for learned markers. Our approach demonstrates that domain-specific pretraining enables accurate BPD prediction from routine day-1 radiographs. Through progressive freezing and linear probing, the method remains computationally feasible for site-level implementation and future federated learning deployments.
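
A hedged sketch of progressive layer freezing with discriminative learning rates on a torchvision ResNet-50, in the spirit of the approach above; the in-domain checkpoint, learning rates, and unfreezing schedule are illustrative assumptions, not the authors' exact recipe.

```python
import torch
from torchvision.models import resnet50

model = resnet50(weights=None)               # swap in a CXR-pretrained checkpoint here
model.fc = torch.nn.Linear(model.fc.in_features, 1)   # binary BPD outcome head

stages = [model.layer1, model.layer2, model.layer3, model.layer4, model.fc]
for p in model.parameters():
    p.requires_grad = False                  # start fully frozen (linear probing)
for p in model.fc.parameters():
    p.requires_grad = True

# Discriminative learning rates: earlier stages learn more slowly than the head.
lrs = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3]
optimizer = torch.optim.AdamW(
    [{"params": s.parameters(), "lr": lr} for s, lr in zip(stages, lrs)]
)

def unfreeze_through(stage_idx):
    """Progressively unfreeze stages from the head backwards as training advances."""
    for s in stages[stage_idx:]:
        for p in s.parameters():
            p.requires_grad = True
```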

Generate to Ground: Multimodal Text Conditioning Boosts Phrase Grounding in Medical Vision-Language Models

Felix Nützel, Mischa Dombrowski, Bernhard Kainz

arXiv preprint · Jul 16, 2025
Phrase grounding, i.e., mapping natural language phrases to specific image regions, holds significant potential for disease localization in medical imaging through clinical reports. While current state-of-the-art methods rely on discriminative, self-supervised contrastive models, we demonstrate that generative text-to-image diffusion models, leveraging cross-attention maps, can achieve superior zero-shot phrase grounding performance. Contrary to prior assumptions, we show that fine-tuning diffusion models with a frozen, domain-specific language model, such as CXR-BERT, substantially outperforms domain-agnostic counterparts. This setup achieves remarkable improvements, with mIoU scores doubling those of current discriminative methods. These findings highlight the underexplored potential of generative models for phrase grounding tasks. To further enhance performance, we introduce Bimodal Bias Merging (BBM), a novel post-processing technique that aligns text and image biases to identify regions of high certainty. BBM refines cross-attention maps, achieving even greater localization accuracy. Our results establish generative approaches as a more effective paradigm for phrase grounding in the medical imaging domain, paving the way for more robust and interpretable applications in clinical practice. The source code and model weights are available at https://github.com/Felix-012/generate_to_ground.
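
To make the cross-attention mechanism concrete, here is a minimal sketch that turns a stack of cross-attention maps into a phrase-grounding heatmap; it illustrates the general idea only, and the paper's Bimodal Bias Merging step is not reproduced (tensor shapes and the layer-averaging scheme are assumptions).

```python
import torch
import torch.nn.functional as F

def phrase_heatmap(attn_maps, phrase_token_ids, image_size=(224, 224)):
    """attn_maps: list of (heads, n_patches, n_tokens) cross-attention tensors.
    phrase_token_ids: indices of the phrase's tokens in the text sequence."""
    per_layer = []
    for a in attn_maps:
        m = a[:, :, phrase_token_ids].mean(dim=(0, 2))    # average heads and tokens
        per_layer.append(m)
    m = torch.stack(per_layer).mean(0)                    # average over layers
    side = int(m.numel() ** 0.5)                          # assume a square patch grid
    m = m.reshape(1, 1, side, side)
    m = F.interpolate(m, size=image_size, mode="bilinear", align_corners=False)
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)        # normalize to [0, 1]
    return m[0, 0]
```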

Evaluation of Artificial Intelligence-based diagnosis for facial fractures, advantages compared with conventional imaging diagnosis: a systematic review and meta-analysis.

Ju J, Qu Z, Qing H, Ding Y, Peng L

PubMed · Jul 15, 2025
Convolutional neural networks (CNNs) have emerged as a highly promising tool for AI-based medical imaging diagnosis. In particular, AI-assisted diagnosis holds significant potential for orthopedic and emergency department physicians by improving diagnostic efficiency and enhancing the overall patient experience. This systematic review and meta-analysis assesses the application of AI to the diagnosis of facial fractures and evaluates its diagnostic performance. The study adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and PRISMA-Diagnostic Test Accuracy (PRISMA-DTA) guidelines. A comprehensive literature search was conducted in the PubMed, Cochrane Library, and Web of Science databases to identify original articles published up to December 2024. The risk of bias and applicability of the included studies were assessed using the QUADAS-2 tool, and results were analyzed using a summary receiver operating characteristic (SROC) curve. A total of 16 studies were included, with contingency tables extracted from 11 of them. The pooled sensitivity was 0.889 (95% CI: 0.844-0.922) and the pooled specificity was 0.888 (95% CI: 0.834-0.926); the area under the SROC curve was 0.911. In the subgroup analysis, the pooled sensitivity for nasal fractures was 0.851 (95% CI: 0.806-0.887) with a pooled specificity of 0.883 (95% CI: 0.862-0.902); for mandibular fractures, the pooled sensitivity was 0.905 (95% CI: 0.836-0.947) with a pooled specificity of 0.895 (95% CI: 0.824-0.940). AI can be developed as an auxiliary tool to assist clinicians in diagnosing facial fractures: the results demonstrate high overall sensitivity and specificity, along with robust performance reflected by the high area under the SROC curve. This study was prospectively registered on PROSPERO (CRD42024618650, registered 10 Dec 2024): https://www.crd.york.ac.uk/PROSPERO/view/CRD42024618650
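
The pooling step can be illustrated with a simple fixed-effect, inverse-variance pooling of logit-transformed sensitivities on toy 2x2 counts; the review itself likely used a bivariate SROC model, so this is a didactic sketch, not the authors' computation.

```python
import numpy as np

studies = [(45, 5), (88, 12), (30, 3)]       # toy (true positives, false negatives)
logits, weights = [], []
for tp, fn in studies:
    sens = (tp + 0.5) / (tp + fn + 1.0)      # continuity-corrected sensitivity
    var = 1 / (tp + 0.5) + 1 / (fn + 0.5)    # approximate variance of the logit
    logits.append(np.log(sens / (1 - sens)))
    weights.append(1 / var)
pooled = np.average(logits, weights=weights)  # inverse-variance weighted mean
print("pooled sensitivity:", 1 / (1 + np.exp(-pooled)))
```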

Semantically Informed Salient Regions Guided Radiology Report Generation

Zeyi Hou, Zeqiang Wei, Ruixin Yan, Ning Lang, Xiuzhuang Zhou

arXiv preprint · Jul 15, 2025
Recent advances in automated radiology report generation from chest X-rays using deep learning have the potential to significantly reduce the arduous workload of radiologists. However, because of the substantial inherent data bias in radiology images, where abnormalities are typically subtle and sparsely distributed, existing methods often produce fluent yet medically inaccurate reports, limiting their applicability in clinical practice. To address this issue, we propose a Semantically Informed Salient Regions-guided (SISRNet) report generation method. Our approach explicitly identifies salient regions with medically critical characteristics using fine-grained cross-modal semantics. SISRNet then systematically focuses on these high-information regions during both image modeling and report generation, effectively capturing subtle abnormal findings, mitigating the negative impact of data bias, and ultimately generating clinically accurate reports. Compared to its peers, SISRNet demonstrates superior performance on the widely used IU-Xray and MIMIC-CXR datasets.
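
As a rough sketch of identifying salient regions via fine-grained cross-modal semantics, one can score image patch tokens against a report embedding and keep the top-k; the scoring rule and all names below are placeholders, not the SISRNet method.

```python
import torch
import torch.nn.functional as F

def salient_patch_indices(patch_tokens, text_emb, k=16):
    """patch_tokens: (n_patches, D); text_emb: (D,). Returns top-k patch indices."""
    sims = F.normalize(patch_tokens, dim=-1) @ F.normalize(text_emb, dim=0)
    return sims.topk(k).indices               # indices of the most report-relevant patches

patches = torch.randn(196, 256)               # e.g. a 14x14 ViT patch grid
report = torch.randn(256)                     # pooled report embedding (placeholder)
print(salient_patch_indices(patches, report, k=8))
```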