Page 32 of 92915 results

AI-powered disease progression prediction in multiple sclerosis using magnetic resonance imaging: a systematic review and meta-analysis.

Houshi S, Khodakarami Z, Shaygannejad A, Khosravi F, Shaygannejad V

PubMed · Papers · Jul 12 2025
Disability progression despite disease-modifying therapy remains a major challenge in multiple sclerosis (MS). Artificial intelligence (AI) models exploiting magnetic resonance imaging (MRI) promise personalized prognostication, yet their real-world accuracy is uncertain. This study aimed to systematically review and meta-analyze MRI-based AI studies predicting future disability progression in MS. Five databases were searched from inception to 17 May 2025 following PRISMA. Eligible studies used MRI in an AI model to forecast changes in the Expanded Disability Status Scale (EDSS) or equivalent metrics. Two reviewers conducted study selection, data extraction, and QUADAS-2 assessment. Random-effects meta-analysis was applied when ≥3 studies reported compatible regression statistics. Twenty-one studies with 12,252 MS patients met inclusion criteria. Five used regression on continuous EDSS, fourteen classification, one time-to-event, and one both. Conventional machine learning predominated (57%), followed by deep learning (38%). Median classification area under the curve (AUC) was 0.78 (range 0.57-0.86); median regression root-mean-square-error (RMSE) was 1.08 EDSS points. Pooled RMSE across regression studies was 1.31 (95% CI 1.02-1.60; I² = 95%). Deep learning conferred only marginal, non-significant gains over classical algorithms. External validation appeared in six studies; calibration, decision-curve analysis and code releases were seldom reported. QUADAS-2 indicated generally low patient-selection bias but frequent index-test concerns. MRI-driven AI models predict MS disability progression with moderate accuracy, but error margins that exceed one EDSS point limit individual-level utility. Harmonized endpoints, larger multicenter cohorts, rigorous external validation, and prospective clinician-in-the-loop trials are essential before routine clinical adoption.
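To make the random-effects pooling mentioned in the abstract above more concrete, here is a minimal sketch in Python. It assumes the DerSimonian-Laird estimator, which the abstract does not specify, and the function name and input values are hypothetical illustrations, not data from the review.

```python
import math


def random_effects_pool(estimates, std_errors):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    Assumption: the review's actual estimator is not stated in the abstract.
    Returns (pooled estimate, 95% CI tuple, tau^2, I^2 in percent).
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q measures dispersion around the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    # I^2: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2, i2


# Hypothetical per-study RMSE values and standard errors (not from the review)
pooled, ci, tau2, i2 = random_effects_pool([0.9, 1.3, 1.7], [0.1, 0.1, 0.1])
```

A high I² in such a calculation, like the 95% reported above, means most of the spread between studies reflects genuine heterogeneity and not sampling error.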

Accuracy of large language models in generating differential diagnosis from clinical presentation and imaging findings in pediatric cases.

Jung J, Phillipi M, Tran B, Chen K, Chan N, Ho E, Sun S, Houshyar R

PubMed · Papers · Jul 12 2025
Large language models (LLM) have shown promise in assisting medical decision-making. However, there is limited literature exploring the diagnostic accuracy of LLMs in generating differential diagnoses from text-based image descriptions and clinical presentations in pediatric radiology. To examine the performance of multiple proprietary LLMs in producing accurate differential diagnoses for text-based pediatric radiological cases without imaging. One hundred sixty-four cases were retrospectively selected from a pediatric radiology textbook and converted into two formats: (1) image description only, and (2) image description with clinical presentation. The ChatGPT-4 V, Claude 3.5 Sonnet, and Gemini 1.5 Pro algorithms were given these inputs and tasked with providing a top 1 diagnosis and a top 3 differential diagnoses. Accuracy of responses was assessed by comparison with the original literature. Top 1 accuracy was defined as whether the top 1 diagnosis matched the textbook, and top 3 differential accuracy was defined as the number of diagnoses in the model-generated top 3 differential that matched any of the top 3 diagnoses in the textbook. McNemar's test, Cochran's Q test, Friedman test, and Wilcoxon signed-rank test were used to compare algorithms and assess the impact of added clinical information, respectively. There was no significant difference in top 1 accuracy between ChatGPT-4 V, Claude 3.5 Sonnet, and Gemini 1.5 Pro when only image descriptions were provided (56.1% [95% CI 48.4-63.5], 64.6% [95% CI 57.1-71.5], 61.6% [95% CI 54.0-68.7]; P = 0.11). Adding clinical presentation to image description significantly improved top 1 accuracy for ChatGPT-4 V (64.0% [95% CI 56.4-71.0], P = 0.02) and Claude 3.5 Sonnet (80.5% [95% CI 73.8-85.8], P < 0.001). For image description and clinical presentation cases, Claude 3.5 Sonnet significantly outperformed both ChatGPT-4 V and Gemini 1.5 Pro (P < 0.001). 
For top 3 differential accuracy, no significant differences were observed between ChatGPT-4 V, Claude 3.5 Sonnet, and Gemini 1.5 Pro, regardless of whether the cases included only image descriptions (1.29 [95% CI 1.16-1.41], 1.35 [95% CI 1.23-1.48], 1.37 [95% CI 1.25-1.49]; P = 0.60) or both image descriptions and clinical presentations (1.33 [95% CI 1.20-1.45], 1.52 [95% CI 1.41-1.64], 1.48 [95% CI 1.36-1.59]; P = 0.72). Only Claude 3.5 Sonnet performed significantly better when clinical presentation was added (P < 0.001). Commercial LLMs performed similarly on pediatric radiology cases in providing top 1 accuracy and top 3 differential accuracy when only a text-based image description was used. Adding clinical presentation significantly improved top 1 accuracy for ChatGPT-4 V and Claude 3.5 Sonnet, with Claude showing the largest improvement. Claude 3.5 Sonnet outperformed both ChatGPT-4 V and Gemini 1.5 Pro in top 1 accuracy when both image and clinical data were provided. No significant differences were found in top 3 differential accuracy across models in any condition.
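The two scoring rules defined in the abstract above (a yes/no top 1 match, and a 0 to 3 count of overlapping top 3 diagnoses) can be sketched as follows. Exact string matching after normalization is an assumption made for illustration; the abstract does not say how matches were adjudicated, and the example cases are hypothetical.

```python
def _norm(diagnosis):
    # Assumption: case- and whitespace-insensitive string match stands in
    # for whatever adjudication the study actually used.
    return " ".join(diagnosis.lower().split())


def top1_correct(model_top1, textbook_top1):
    """True if the model's single top diagnosis matches the textbook diagnosis."""
    return _norm(model_top1) == _norm(textbook_top1)


def top3_matches(model_top3, textbook_top3):
    """Number (0-3) of model diagnoses that appear in the textbook top 3."""
    return len({_norm(d) for d in model_top3} & {_norm(d) for d in textbook_top3})


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical cases, not taken from the study
cases = [
    {"model": ["Intussusception", "Midgut volvulus", "Appendicitis"],
     "textbook": ["Intussusception", "Appendicitis", "Meckel diverticulum"]},
    {"model": ["Croup", "Epiglottitis", "Foreign body aspiration"],
     "textbook": ["Epiglottitis", "Croup", "Retropharyngeal abscess"]},
]

top1_accuracy = mean(top1_correct(c["model"][0], c["textbook"][0]) for c in cases)
top3_score = mean(top3_matches(c["model"], c["textbook"]) for c in cases)
```

Under this reading, the reported top 3 values of roughly 1.3 to 1.5 are mean counts per case on a 0 to 3 scale, not percentages.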

Vision-language model for report generation and outcome prediction in CT pulmonary angiogram.

Zhong Z, Wang Y, Wu J, Hsu WC, Somasundaram V, Bi L, Kulkarni S, Ma Z, Collins S, Baird G, Ahn SH, Feng X, Kamel I, Lin CT, Greineder C, Atalay M, Jiao Z, Bai H

PubMed · Papers · Jul 12 2025
Accurate and comprehensive interpretation of pulmonary embolism (PE) from Computed Tomography Pulmonary Angiography (CTPA) scans remains a clinical challenge due to the limited specificity and structure of existing AI tools. We propose an agent-based framework that integrates Vision-Language Models (VLMs) for detecting 32 PE-related abnormalities and Large Language Models (LLMs) for structured report generation. Trained on over 69,000 CTPA studies from 24,890 patients across Brown University Health (BUH), Johns Hopkins University (JHU), and the INSPECT dataset from Stanford, the model demonstrates strong performance in abnormality classification and report generation. For abnormality classification, it achieved AUROC scores of 0.788 (BUH), 0.754 (INSPECT), and 0.710 (JHU), with corresponding BERT-F1 scores of 0.891, 0.829, and 0.842. The abnormality-guided reporting strategy consistently outperformed the organ-based and holistic captioning baselines. For survival prediction, a multimodal fusion model that incorporates imaging, clinical variables, diagnostic outputs, and generated reports achieved concordance indices of 0.863 (BUH) and 0.731 (JHU), outperforming traditional PESI scores. This framework provides a clinically meaningful and interpretable solution for end-to-end PE diagnosis, structured reporting, and outcome prediction.

Semi-supervised Medical Image Segmentation Using Heterogeneous Complementary Correction Network and Confidence Contrastive Learning.

Li L, Xue M, Li S, Dong Z, Liao T, Li P

PubMed · Papers · Jul 11 2025
Semi-supervised medical image segmentation techniques have demonstrated significant potential and effectiveness in clinical diagnosis. The prevailing approaches using the mean-teacher (MT) framework achieve promising image segmentation results. However, due to the unreliability of the pseudo labels generated by the teacher model, existing methods still have some inherent limitations that must be considered and addressed. In this paper, we propose an innovative semi-supervised method for medical image segmentation by combining the heterogeneous complementary correction network and confidence contrastive learning (HC-CCL). Specifically, we develop a triple-branch framework by integrating a heterogeneous complementary correction (HCC) network into the MT framework. HCC serves as an auxiliary branch that corrects prediction errors in the student model and provides complementary information. To improve the capacity for feature learning in our proposed model, we introduce a confidence contrastive learning (CCL) approach with a novel sampling strategy. Furthermore, we develop a momentum style transfer (MST) method to narrow the gap between labeled and unlabeled data distributions. In addition, we introduce a Cutout-style augmentation for unsupervised learning to enhance performance. Three medical image datasets (the left atrial (LA) dataset, the NIH pancreas dataset, and the BraTS-2019 dataset) were employed to rigorously evaluate HC-CCL. Quantitative results demonstrate significant performance advantages over existing approaches, achieving state-of-the-art performance across all metrics. The implementation will be released at https://github.com/xxmmss/HC-CCL .
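For readers unfamiliar with the mean-teacher (MT) framework that the abstract above builds on, the core idea is that the teacher's weights are an exponential moving average (EMA) of the student's weights. The sketch below shows only that generic update on plain Python dictionaries; it is not the HC-CCL implementation, and the parameter names and the decay value are hypothetical.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return teacher weights updated as an EMA of the student weights.

    teacher, student: dicts mapping parameter name -> float value
    alpha: EMA decay (hypothetical default; the abstract gives no value)
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


# Hypothetical single-parameter example: the teacher drifts slowly
# toward the student over repeated training steps.
teacher = {"w": 1.0}
student = {"w": 0.0}
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.9)
```

Because the teacher changes slowly, its predictions on unlabeled images serve as pseudo labels for the student; the abstract's concern is that these pseudo labels can still be unreliable.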

Incremental diagnostic value of AI-derived coronary artery calcium in 18F-flurpiridaz PET Myocardial Perfusion Imaging

Barrett, O., Shanbhag, A., Zaid, R., Miller, R. J., Lemley, M., Builoff, V., Liang, J., Kavanagh, P., Buckley, C., Dey, D., Berman, D. S., Slomka, P.

medRxiv · Preprint · Jul 11 2025
Background: Positron Emission Tomography (PET) myocardial perfusion imaging (MPI) is a powerful tool for predicting coronary artery disease (CAD). Coronary artery calcium (CAC) provides incremental risk stratification to PET-MPI and enhances diagnostic accuracy. We assessed additive value of CAC score, derived from PET/CT attenuation maps to stress TPD results using the novel 18F-flurpiridaz tracer in detecting significant CAD. Methods and Results: Patients from 18F-flurpiridaz phase III clinical trial who underwent PET/CT MPI with 18F-flurpiridaz tracer, had available CT attenuation correction (CTAC) scans for CAC scoring, and underwent invasive coronary angiography (ICA) within a 6-month period between 2011 and 2013, were included. Total perfusion deficit (TPD) was quantified automatically, and CAC scores from CTAC scans were assessed using artificial intelligence (AI)-derived segmentation and manual scoring. Obstructive CAD was defined as ≥50% stenosis in Left Main (LM) artery, or 70% or more stenosis in any of the other major epicardial vessels. Prediction performance for CAD was assessed by comparing the area under receiver operating characteristic curve (AUC) for stress TPD alone and in combination with CAC score. Among 498 patients (72% males, median age 63 years) 30.1% had CAD. Incorporating CAC score resulted in a greater AUC: manual scoring (AUC=0.87, 95% Confidence Interval [CI] 0.34-0.90; p=0.015) and AI-based scoring (AUC=0.88, 95%CI 0.85-0.90; p=0.002) compared to stress TPD alone (AUC 0.84, 95% CI 0.80-0.92). Conclusions: Combining automatically derived TPD and CAC score enhances 18F-flurpiridaz PET MPI accuracy in detecting significant CAD, offering a method that can be routinely used with PET/CT scanners without additional scanning or technologist time.
Condensed Abstract. Background: We assessed the added value of CAC score from hybrid PET/CT CTAC scans combined with stress TPD for detecting significant CAD using novel 18F-flurpiridaz tracer. Methods and Results: Patients from the 18F-flurpiridaz phase III clinical trial (n=498, 72% male, median age 63) who underwent PET/CT MPI and ICA within 6-months were included. TPD was quantified automatically, and CAC scores were assessed by AI and manual methods. Adding CAC score to TPD improved AUC for manual (0.87) and AI-based (0.88) scoring versus TPD alone (0.84). Conclusions: Combining TPD and CAC score enhances 18F-flurpiridaz PET MPI accuracy for CAD detection. Graphical Abstract: Overview of the study design.
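The reference-standard rule stated in the abstract above (at least 50% stenosis in the left main artery, or at least 70% in any other major epicardial vessel) can be written as a small predicate. This is a sketch under assumptions: the vessel labels and the dictionary layout are hypothetical, not from the study.

```python
def is_obstructive_cad(stenosis_pct):
    """Apply the abstract's angiographic definition of obstructive CAD.

    stenosis_pct: dict mapping a vessel label to percent stenosis,
    e.g. {"LM": 40, "LAD": 75}. Labels are hypothetical; "LM" denotes
    the left main artery, and missing vessels are treated as 0%.
    """
    left_main = stenosis_pct.get("LM", 0)
    other_vessels = [pct for vessel, pct in stenosis_pct.items() if vessel != "LM"]
    return left_main >= 50 or any(pct >= 70 for pct in other_vessels)


# Hypothetical patients
print(is_obstructive_cad({"LM": 55, "LAD": 20}))             # left main meets the 50% threshold
print(is_obstructive_cad({"LM": 30, "LAD": 65, "RCA": 60}))  # no vessel meets its threshold
```

The lower left main threshold means a 55% left main lesion counts as obstructive while a 65% lesion elsewhere does not.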

An integrated strategy based on radiomics and quantum machine learning: diagnosis and clinical interpretation of pulmonary ground-glass nodules.

Huang X, Xu F, Zhu W, Yao L, He J, Su J, Zhao W, Hu H

PubMed · Papers · Jul 11 2025
Accurate classification of pulmonary pure ground-glass nodules (pGGNs) is essential for distinguishing invasive adenocarcinoma (IVA) from adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA), which significantly influences treatment decisions. This study aims to develop a high-precision integrated strategy by combining radiomics-based feature extraction, Quantum Machine Learning (QML) models, and SHapley Additive exPlanations (SHAP) analysis to improve diagnostic accuracy and interpretability in pGGN classification. A total of 322 pGGNs from 275 patients were retrospectively analyzed. The CT images were randomly divided into training and testing cohorts (80:20), with radiomic features extracted from the training cohort. Three QML models (Quantum Support Vector Classifier (QSVC), Pegasos QSVC, and Quantum Neural Network (QNN)) were developed and compared with a classical Support Vector Machine (SVM). SHAP analysis was applied to interpret the contribution of radiomic features to the models' predictions. All three QML models outperformed the classical SVM, with the QNN model achieving the highest improvements ([Formula: see text]) in classification metrics, including accuracy (89.23%, 95% CI: 81.54% - 95.38%), sensitivity (96.55%, 95% CI: 89.66% - 100.00%), specificity (83.33%, 95% CI: 69.44% - 94.44%), and area under the curve (AUC) (0.937, 95% CI: 0.871 - 0.983). SHAP analysis identified Low Gray Level Run Emphasis (LGLRE), Gray Level Non-uniformity (GLN), and Size Zone Non-uniformity (SZN) as the most critical features influencing classification. This study demonstrates that the proposed integrated strategy, combining radiomics, QML models, and SHAP analysis, significantly enhances the accuracy and interpretability of pGGN classification, particularly in small-sample datasets. It offers a promising tool for early, non-invasive lung cancer diagnosis and helps clinicians make more informed treatment decisions.

Interpretable MRI Subregional Radiomics-Deep Learning Model for Preoperative Lymphovascular Invasion Prediction in Rectal Cancer: A Dual-Center Study.

Huang T, Zeng Y, Jiang R, Zhou Q, Wu G, Zhong J

PubMed · Papers · Jul 11 2025
To develop a fusion model based on explainable machine learning, combining multiparametric MRI subregional radiomics and deep learning, to preoperatively predict the lymphovascular invasion status in rectal cancer. We collected data from rectal cancer (RC) patients with histopathological confirmation from two medical centers, with 301 patients used as a training set and 75 patients as an external validation set. Using K-means clustering, we divided the tumor areas into multiple subregions and extracted radiomic features from them. Additionally, we employed a Vision Transformer (ViT) deep learning model to extract features. These features were integrated to construct the SubViT model. To better understand the decision-making process of the model, we used the SHapley Additive exPlanations (SHAP) tool to evaluate the model's interpretability. Finally, we assessed the performance of the SubViT model through receiver operating characteristic (ROC) curves, decision curve analysis (DCA), and the DeLong test, comparing it with other models. In this study, the SubViT model demonstrated outstanding predictive performance in the training set, achieving an area under the curve (AUC) of 0.934 (95% confidence interval: 0.9074 to 0.9603). It also performed well in the external validation set, with an AUC of 0.884 (95% confidence interval: 0.8055 to 0.9616), outperforming both subregion radiomics and imaging-based models. Furthermore, DCA indicated that the SubViT model provides higher clinical utility compared to other models. As an advanced composite model, the SubViT model demonstrated its efficiency in the non-invasive assessment of lymphovascular invasion (LVI) in rectal cancer.

Advancing Rare Neurological Disorder Diagnosis: Addressing Challenges with Systematic Reviews and AI-Driven MRI Meta-Trans Learning Framework for NeuroDegenerative Disorders.

Gupta A, Malhotra D

PubMed · Papers · Jul 11 2025
Neurological Disorders (ND) affect a large portion of the global population, impacting the brain, spinal cord, and nerves. These disorders fall into categories such as NeuroDevelopmental (NDD), NeuroBiological (NBD), and NeuroDegenerative (ND<sub>e</sub>) disorders, which range from common to rare conditions. While Artificial Intelligence (AI) has advanced healthcare diagnostics, training Machine Learning (ML) and Deep Learning (DL) models for early detection of rare neurological disorders remains a challenge due to limited patient data. This data scarcity poses a significant public health issue. Meta-Trans Learning (M<sub>TA</sub>L), which integrates Meta-Learning (M<sub>t</sub>L) and Transfer Learning (TL), offers a promising solution by leveraging small datasets to extract expert patterns, generalize findings, and reduce AI bias in healthcare. This research systematically reviews studies from 2018 to 2024 to explore how ML and M<sub>TA</sub>L techniques are applied in diagnosing NDD, NBD, and ND<sub>e</sub> disorders. It also provides statistical and parametric analysis of ML and DL methods for neurological disorder diagnosis. Lastly, the study introduces an MRI-based ND<sub>e</sub>-M<sub>TA</sub>L framework to aid healthcare professionals in early detection of rare neurological disorders, aiming to enhance diagnostic accuracy and advance healthcare practices.

Generalizable 7T T1-map Synthesis from 1.5T and 3T T1 MRI with an Efficient Transformer Model

Zach Eidex, Mojtaba Safari, Tonghe Wang, Vanessa Wildman, David S. Yu, Hui Mao, Erik Middlebrooks, Aparna Kesewala, Xiaofeng Yang

arXiv · Preprint · Jul 11 2025
Purpose: Ultra-high-field 7T MRI offers improved resolution and contrast over standard clinical field strengths (1.5T, 3T). However, 7T scanners are costly, scarce, and introduce additional challenges such as susceptibility artifacts. We propose an efficient transformer-based model (7T-Restormer) to synthesize 7T-quality T1-maps from routine 1.5T or 3T T1-weighted (T1W) images. Methods: Our model was validated on 35 1.5T and 108 3T T1W MRI paired with corresponding 7T T1 maps of patients with confirmed MS. A total of 141 patient cases (32,128 slices) were randomly divided into 105 (25; 80) training cases (19,204 slices), 19 (5; 14) validation cases (3,476 slices), and 17 (5; 14) test cases (3,145 slices) where (X; Y) denotes the patients with 1.5T and 3T T1W scans, respectively. The synthetic 7T T1 maps were compared against the ResViT and ResShift models. Results: The 7T-Restormer model achieved a PSNR of 26.0 +/- 4.6 dB, SSIM of 0.861 +/- 0.072, and NMSE of 0.019 +/- 0.011 for 1.5T inputs, and 25.9 +/- 4.9 dB, and 0.866 +/- 0.077 for 3T inputs, respectively. Using 10.5M parameters, our model reduced NMSE by 64% relative to the 56.7M-parameter ResShift (0.019 vs 0.052, p < .001) and by 41% relative to the 70.4M-parameter ResViT (0.019 vs 0.032, p < .001) at 1.5T, with similar advantages at 3T (0.021 vs 0.060 and 0.033; p < .001). Training with a mixed 1.5T + 3T corpus was superior to single-field strategies. Restricting the model to 1.5T increased the 1.5T NMSE from 0.019 to 0.021 (p = 1.1E-3) while training solely on 3T resulted in lower performance on input 1.5T T1W MRI. Conclusion: We propose a novel method for predicting quantitative 7T MP2RAGE maps from 1.5T and 3T T1W scans with higher quality than existing state-of-the-art methods. Our approach makes the benefits of 7T MRI more accessible to standard clinical workflows.
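The relative NMSE reductions quoted in the abstract above follow from simple arithmetic on the reported values. The short Python sketch below recomputes them from the rounded NMSE figures; the function name is a hypothetical helper, and because the inputs are rounded, the results can differ from the published percentages by about a point.

```python
def relative_reduction(ours, baseline):
    """Percent reduction of `ours` relative to `baseline` (lower is better)."""
    return (baseline - ours) / baseline * 100.0


# NMSE values as reported in the abstract for 1.5T inputs
vs_resshift = relative_reduction(0.019, 0.052)  # abstract reports 64%
vs_resvit = relative_reduction(0.019, 0.032)    # abstract reports 41%

print(f"vs ResShift: {vs_resshift:.1f}%")
print(f"vs ResViT:   {vs_resvit:.1f}%")
```

With the rounded inputs this gives roughly 63.5% and 40.6%, consistent with the reported 64% and 41%; the small gap for ResShift may stem from the authors using unrounded NMSE values.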

Breast lesion classification via colorized mammograms and transfer learning in a novel CAD framework.

Hussein AA, Valizadeh M, Amirani MC, Mirbolouk S

PubMed · Papers · Jul 11 2025
Medical imaging sciences and diagnostic techniques for Breast Cancer (BC) imaging have advanced tremendously, particularly with the use of mammography images; however, radiologists may still misinterpret medical images of the breast, resulting in limitations and flaws in the screening process. As a result, Computer-Aided Diagnosis (CAD) systems have become increasingly popular due to their ability to operate independently of human analysis. Current CAD systems use grayscale analysis, which lacks the contrast needed to differentiate benign from malignant lesions. As part of this study, an innovative CAD system is presented that transforms standard grayscale mammography images into RGB color images through a three-path preprocessing framework developed for noise reduction, lesion highlighting, and tumor-centric intensity adjustment using a data-driven transfer function. In contrast to a generic approach, this approach statistically tailors colorization in order to emphasize malignant regions, thus enhancing the ability of both machines and humans to recognize cancerous areas. As a consequence of this conversion, breast tumors with anomalies become more visible, which allows us to extract more accurate features about them. In a subsequent step, Machine Learning (ML) algorithms are employed to classify these tumors as malignant or benign cases. A pre-trained model is used to extract comprehensive features from colored mammography images by employing this approach. A variety of techniques are implemented in the pre-processing section to minimize noise and improve image perception; however, the most challenging methodology is the application of creative techniques to adjust pixels' intensity values in mammography images using a data-driven transfer function derived from tumor intensity histograms. This adjustment serves to draw attention to tumors while reducing the brightness of other areas in the breast image.
Measuring criteria such as accuracy, sensitivity, specificity, precision, F1-Score, and Area Under the Curve (AUC) are used to evaluate the efficacy of the employed methodologies. This work employed and tested a variety of pre-training and ML techniques. However, the combination of EfficientNetB0 pre-training with ML Support Vector Machines (SVM) produced optimal results with accuracy, sensitivity, specificity, precision, F1-Score, and AUC of 99.4%, 98.7%, 99.1%, 99%, 98.8%, and 100%, respectively. It is clear from these results that the developed method not only advances the state-of-the-art in technical terms, but also provides radiologists with a practical tool to aid in the reduction of diagnostic errors and increase the detection of early breast cancer.