Latest Papers on Radiology AI. Tags: GenAI

Fusion-Based Brain Tumor Classification Using Deep Learning and Explainable AI, and Rule-Based Reasoning

Melika Filvantorkaman, Mohsen Piri, Maral Filvan Torkaman, Ashkan Zabihi, Hamidreza Moradi

•preprint•Aug 9 2025

Accurate and interpretable classification of brain tumors from magnetic resonance imaging (MRI) is critical for effective diagnosis and treatment planning. This study presents an ensemble-based deep learning framework that combines MobileNetV2 and DenseNet121 convolutional neural networks (CNNs) using a soft voting strategy to classify three common brain tumor types: glioma, meningioma, and pituitary adenoma. The models were trained and evaluated on the Figshare dataset using a stratified 5-fold cross-validation protocol. To enhance transparency and clinical trust, the framework integrates an Explainable AI (XAI) module employing Grad-CAM++ for class-specific saliency visualization, alongside a symbolic Clinical Decision Rule Overlay (CDRO) that maps predictions to established radiological heuristics. The ensemble classifier achieved superior performance compared to individual CNNs, with an accuracy of 91.7%, precision of 91.9%, recall of 91.7%, and F1-score of 91.6%. Grad-CAM++ visualizations revealed strong spatial alignment between model attention and expert-annotated tumor regions, supported by Dice coefficients up to 0.88 and IoU scores up to 0.78. Clinical rule activation further validated model predictions in cases with distinct morphological features. A human-centered interpretability assessment involving five board-certified radiologists yielded high Likert-scale scores for both explanation usefulness (mean = 4.4) and heatmap-region correspondence (mean = 4.0), reinforcing the framework's clinical relevance. Overall, the proposed approach offers a robust, interpretable, and generalizable solution for automated brain tumor classification, advancing the integration of deep learning into clinical neurodiagnostics.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab GenAI

Ultrasound-Based Machine Learning and SHapley Additive exPlanations Method Evaluating Risk of Gallbladder Cancer: A Bicentric and Validation Study.

Chen B, Zhong H, Lin J, Lyu G, Su S

•papers•Aug 9 2025

This study aims to construct and evaluate 8 machine learning models by integrating ultrasound imaging features, clinical characteristics, and serological features to assess the risk of gallbladder cancer (GBC) occurrence in patients. A retrospective analysis was conducted on ultrasound and clinical data of 300 suspected GBC patients who visited the Second Affiliated Hospital of Fujian Medical University from January 2020 to January 2024 and 69 patients who visited the Zhongshan Hospital Affiliated to Xiamen University from January 2024 to January 2025. Key relevant features were selected using Least Absolute Shrinkage and Selection Operator (LASSO) regression. Predictive models were constructed using XGBoost, logistic regression, support vector machine, k-nearest neighbors, random forest, decision tree, naive Bayes, and neural network, with the SHapley Additive exPlanations (SHAP) method employed to explain model interpretability. The LASSO regression demonstrated that gender, age, alkaline phosphatase (ALP), clarity of interface with liver, stratification of the gallbladder wall, intracapsular anechoic lesions, and intracapsular punctiform strong lesions were key features for GBC. The XGBoost model demonstrated an area under receiver operating characteristic curve (AUC) of 0.934, 0.916, and 0.813 in the training, validating, and test sets. SHAP analysis revealed the importance ranking of factors as clarity of interface with liver, stratification of the gallbladder wall, intracapsular anechoic lesions, and intracapsular punctiform strong lesions, ALP, gender, and age. Personalized prediction explanations through SHAP values demonstrated the contribution of each feature to the final prediction, enhancing result interpretability. Furthermore, decision plots were generated to display the influence trajectory of each feature on model predictions, aiding in analyzing which features had the greatest impact on these mispredictions; thereby facilitating further model optimization or feature adjustment. This study proposed a GBC ML model based on ultrasound, clinical, and serological characteristics, indicating the superior performance of the XGBoost model and enhancing the interpretability of the model through the SHAP method.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Emerging trends in NanoTheranostics: Integrating imaging and therapy for precision health care.

Fahmy HM, Bayoumi L, Helal NF, Mohamed NRA, Emarh Y, Ahmed AM

•papers•Aug 9 2025

Nanotheranostics has garnered significant interest for its capacity to improve customized healthcare via targeted and efficient treatment alternatives. Nanotheranostics promises an innovative approach to precision medicine by integrating therapeutic and diagnostic capabilities into nanoscale devices. Nanotheranostics provides an integrated approach that improves diagnosis and facilitates real-time, tailored treatment, revolutionizing patient care. Through the application of nanotheranostic devices, outcomes can be modified for patients on an individualized therapeutic level by taking into consideration individual differences in disease manifestation as well as treatment response. In this review, no aspect of imaging in nanotheranostics is excluded, thus including MRI and CT as well as PET and OI, which are essential for comprehensive analysis needed in medical decision making. Integration of AI and ML into theranostics facilitates predicting treatment outcomes and personalizing the approaches to the methods, which significantly enhances reproducibility in medicine. In addition, several nanoparticles such as lipid-based and polymeric particles, iron oxide, quantum dots, and mesoporous silica have shown promise in diagnosis and targeted drug delivery. These nanoparticles are capable of treating multiple diseases such as cancers, some other neurological disorders, and infectious diseases. While having potential, the field of nanotheranostics still encounters issues regarding clinical applicability, alongside some regulatory hurdles pertaining to new therapeutic agents. Advanced research in this sphere is bound to enhance existing perspectives and fundamentally aid the integration of nanomedicine into conventional health procedures, especially relating to efficacy and the growing emphasis on safe, personalized healthcare.

Mixed Modality Classification Review Concept GenAI

Transformer-Based Explainable Deep Learning for Breast Cancer Detection in Mammography: The MammoFormer Framework

Ojonugwa Oluwafemi Ejiga Peter, Daniel Emakporuena, Bamidele Dayo Tunde, Maryam Abdulkarim, Abdullahi Bn Umar

•preprint•Aug 8 2025

Breast cancer detection through mammography interpretation remains difficult because of the minimal nature of abnormalities that experts need to identify alongside the variable interpretations between readers. The potential of CNNs for medical image analysis faces two limitations: they fail to process both local information and wide contextual data adequately, and do not provide explainable AI (XAI) operations that doctors need to accept them in clinics. The researcher developed the MammoFormer framework, which unites transformer-based architecture with multi-feature enhancement components and XAI functionalities within one framework. Seven different architectures consisting of CNNs, Vision Transformer, Swin Transformer, and ConvNext were tested alongside four enhancement techniques, including original images, negative transformation, adaptive histogram equalization, and histogram of oriented gradients. The MammoFormer framework addresses critical clinical adoption barriers of AI mammography systems through: (1) systematic optimization of transformer architectures via architecture-specific feature enhancement, achieving up to 13% performance improvement, (2) comprehensive explainable AI integration providing multi-perspective diagnostic interpretability, and (3) a clinically deployable ensemble system combining CNN reliability with transformer global context modeling. The combination of transformer models with suitable feature enhancements enables them to achieve equal or better results than CNN approaches. ViT achieves 98.3% accuracy alongside AHE while Swin Transformer gains a 13.0% advantage through HOG enhancements

Mammography Classification Breast Methodology In Silico GenAI Ethics

GPT-4 vs. Radiologists: who advances mediastinal tumor classification better across report quality levels? A cohort study.

Wen R, Li X, Chen K, Sun M, Zhu C, Xu P, Chen F, Ji C, Mi P, Li X, Deng X, Yang Q, Song W, Shang Y, Huang S, Zhou M, Wang J, Zhou C, Chen W, Liu C

•papers•Aug 8 2025

Accurate mediastinal tumor classification is crucial for treatment planning, but diagnostic performance varies with radiologists' experience and report quality. To evaluate GPT-4's diagnostic accuracy in classifying mediastinal tumors from radiological reports compared to radiologists of different experience levels using radiological reports of varying quality. We conducted a retrospective study of 1,494 patients from five tertiary hospitals with mediastinal tumors diagnosed via chest CT and pathology. Radiological reports were categorized into low-, medium-, and high-quality based on predefined criteria assessed by experienced radiologists. Six radiologists (two residents, two attending radiologists, and two associate senior radiologists) and GPT-4 evaluated the chest CT reports. Diagnostic performance was analyzed overall, by report quality, and by tumor type using Wald χ2 tests and 95% CIs calculated via the Wilson method. GPT-4 achieved an overall diagnostic accuracy of 73.3% (95% CI: 71.0-75.5), comparable to associate senior radiologists (74.3%, 95% CI: 72.0-76.5; p >0.05). For low-quality reports, GPT-4 outperformed associate senior radiologists (60.8% vs. 51.1%, p<0.001). In high-quality reports, GPT-4 was comparable to attending radiologists (80.6% vs.79.4%, p>0.05). Diagnostic performance varied by tumor type: GPT-4 was comparable to radiology residents for neurogenic tumors (44.9% vs. 50.3%, p>0.05), similar to associate senior radiologists for teratomas (68.1% vs. 65.9%, p>0.05), and superior in diagnosing lymphoma (75.4% vs. 60.4%, p<0.001). GPT-4 demonstrated interpretation accuracy comparable to Associate Senior Radiologists, excelling in low-quality reports and outperforming them in diagnosing lymphoma. These findings underscore GPT-4's potential to enhance diagnostic performance in challenging diagnostic scenarios.

CT Classification Chest Retrospective Clinical In Silico Academic Lab GenAI

Synthesized myelin and iron stainings from 7T multi-contrast MRI via deep learning.

Pittayapong S, Hametner S, Bachrata B, Endmayr V, Bogner W, Höftberger R, Grabner G

•papers•Aug 8 2025

Iron and myelin are key biomarkers for studying neurodegenerative and demyelinating brain diseases. Multi-contrast MRI techniques, such as R2* and QSM, are commonly used for iron assessment, with histology as the reference standard, but non-invasive myelin assessment remains challenging. To address this, we developed a deep learning model to generate iron and myelin staining images from in vivo multi-contrast MRI data, with a resolution comparable to ex vivo histology macro-scans. A cadaver head was scanned using a 7T MR scanner to acquire T1-weighted and multi-echo GRE data for R2*, and QSM processing, followed by histological staining for myelin and iron. To evaluate the generalizability of the model, a second cadaver head and two in vivo MRI datasets were included. After MRI-to-histology registration in the training subject, a self-attention generative adversarial network (GAN) was trained to synthesize myelin and iron staining images using various combinations of MRI contrast. The model achieved optimal myelin prediction when combining T1w, R2*, and QSM images. Incorporating the synthesized myelin images improved the subsequent prediction of iron staining. The generated images displayed fine details similar to those in histology data and demonstrated generalizability across healthy control subjects. Synthesized myelin images clearly differentiated myelin concentration between white and gray matter, while synthesized iron staining presented distinct patterns such as particularly high deposition in deep gray matter. This study shows that deep learning can transform MRI data into histological feature images, offering ex vivo insights from in vivo data and contributing to advancements in brain histology research.

MRI Image Synthesis Neurological Methodology In Silico Academic Lab GenAI

GPT-4 for automated sequence-level determination of MRI protocols based on radiology request forms from clinical routine.

Terzis R, Kaya K, Schömig T, Janssen JP, Iuga AI, Kottlors J, Lennartz S, Gietzen C, Gözdas C, Müller L, Hahnfeldt R, Maintz D, Dratsch T, Pennig L

•papers•Aug 8 2025

This study evaluated GPT-4's accuracy in MRI sequence selection based on radiology request forms (RRFs), comparing its performance to radiology residents. This retrospective study included 100 RRFs across four subspecialties (cardiac imaging, neuroradiology, musculoskeletal, and oncology). GPT-4 and two radiology residents (R1: 2 years, R2: 5 years MRI experience) selected sequences based on each patient's medical history and clinical questions. Considering imaging society guidelines, five board-certified specialized radiologists assessed protocols based on completeness, quality, and utility in consensus, using 5-point Likert scales. Clinical applicability was rated binarily by the institution's lead radiographer. GPT-4 achieved median scores of 3 (1-5) for completeness, 4 (1-5) for quality, and 4 (1-5) for utility, comparable to R1 (3 (1-5), 4 (1-5), 4 (1-5); each p > 0.05) but inferior to R2 (4 (1-5), 5 (1-5); p < 0.01, respectively, and 5 (1-5); p < 0.001). Subspecialty protocol quality varied: GPT-4 matched R1 (4 (2-4) vs. 4 (2-5), p = 0.20) and R2 (4 (2-5); p = 0.47) in cardiac imaging; showed no differences in neuroradiology (all 5 (1-5), p > 0.05); scored lower than R1 and R2 in musculoskeletal imaging (3 (2-5) vs. 4 (3-5); p < 0.01, and 5 (3-5); p < 0.001); and matched R1 (4 (1-5) vs. 2 (1-4), p = 0.12) as well as R2 (5 (2-5); p = 0.20) in oncology. GPT-4-based protocols were clinically applicable in 95% of cases, comparable to R1 (95%) and R2 (96%). GPT-4 generated MRI protocols with notable completeness, quality, utility, and clinical applicability, excelling in standardized subspecialties like cardiac and neuroradiology imaging while yielding lower accuracy in musculoskeletal examinations. Question Long MRI acquisition times limit patient access, making accurate protocol selection crucial for efficient diagnostics, though it's time-consuming and error-prone, especially for inexperienced residents. Findings GPT-4 generated MRI protocols of remarkable yet inconsistent quality, performing on par with an experienced resident in standardized fields, but moderately in musculoskeletal examinations. Clinical relevance The large language model can assist less experienced radiologists in determining detailed MRI protocols and counteract increasing workloads. The model could function as a semi-automatic tool, generating MRI protocols for radiologists' confirmation, optimizing resource allocation, and improving diagnostics and cost-effectiveness.

MRI Triage Retrospective Clinical In Silico Academic Lab GenAI

Advanced dynamic ensemble framework with explainability driven insights for precision brain tumor classification across datasets.

Singh R, Gupta S, Ibrahim AO, Gabralla LA, Bharany S, Rehman AU, Hussen S

•papers•Aug 8 2025

Accurate detection of brain tumors remains a significant challenge due to the diversity of tumor types along with human interventions during diagnostic process. This study proposes a novel ensemble deep learning system for accurate brain tumor classification using MRI data. The proposed system integrates fine-tuned Convolutional Neural Network (CNN), ResNet-50 and EfficientNet-B5 to create a dynamic ensemble framework that addresses existing challenges. An adaptive dynamic weight distribution strategy is employed during training to optimize the contribution of each networks in the framework. To address class imbalance and improve model generalization, a customized weighted cross-entropy loss function is incorporated. The model obtains improved interpretability through explainabile artificial intelligence (XAI) techniques, including Grad-CAM, SHAP, SmoothGrad, and LIME, providing deeper insights into prediction rationale. The proposed system achieves a classification accuracy of 99.4% on the test set, 99.48% on the validation set, and 99.31% in cross-dataset validation. Furthermore, entropy-based uncertainty analysis quantifies prediction confidence, yielding an average entropy of 0.3093 and effectively identifying uncertain predictions to mitigate diagnostic errors. Overall, the proposed framework demonstrates high accuracy, robustness, and interpretability, highlighting its potential for integration into automated brain tumor diagnosis systems.

MRI Classification Neurological Methodology In Silico GenAI

Enhancing B-mode-based breast cancer diagnosis via cross-attention fusion of H-scan and Nakagami imaging with multi-CAM-QUS-driven XAI.

Mondol SS, Hasan MK

•papers•Aug 8 2025

B-mode ultrasound is widely employed for breast lesion diagnosis due to its affordability, widespread availability, and effectiveness, particularly in cases of dense breast tissue where mammography may be less sensitive. However, it disregards critical tissue information embedded in raw radiofrequency (RF) data. While both modalities have demonstrated promise in Computer-Aided Diagnosis (CAD), their combined potential remains largely unexplored.Approach.This paper presents an automated breast lesion classification network that utilizes H-scan and Nakagami parametric images derived from RF ultrasound signals, combined with machine-generated B-mode images, seamlessly integrated through a Multi Modal Cross Attention Fusion (MM-CAF) mechanism to extract complementary information. The proposed architecture also incorporates an attention-guided modified InceptionV3 for feature extraction, a Knowledge-Guided Cross-Modality Learning (KGCML) module for inter‑modal knowledge sharing, and Attention-Driven Context Enhancement (ADCE) modules to improve contextual understanding and fusion with the classification network. The network employs categorical cross-entropy loss, a Multi-CAM-based loss to guide learning toward accurate lesion-specific features, and a Multi-QUS-based loss to embed clinically meaningful domain knowledge and effectively distinguishing between benign and malignant lesions, all while supporting explainable AI (XAI) principles.Main results. Experiments conducted on multi-center breast ultrasound datasets--BUET-BUSD, ATL, and OASBUD--characterized by demographic diversity, demonstrate the effectiveness of the proposed approach, achieving classification accuracies of 92.54%, 89.93%, and 90.0%, respectively, along with high interpretability and trustworthiness. These results surpass those of existing methods based on B-mode and/or RF data, highlighting the superior performance and robustness of the proposed technique. By integrating complementary RF‑derived information with B‑mode imaging with pseudo‑segmentation and domain‑informed loss functions, our method significantly boosts lesion classification accuracy-enabling fully automated, explainable CAD and paving the way for widespread clinical adoption of AI‑driven breast screening.

Ultrasound Classification Breast Retrospective Clinical In Silico Academic Lab GenAI

LLM-Based Extraction of Imaging Features from Radiology Reports: Automating Disease Activity Scoring in Crohn's Disease.

Dehdab R, Mankertz F, Brendel JM, Maalouf N, Kaya K, Afat S, Kolahdoozan S, Radmard AR

•papers•Aug 8 2025

Large Language Models (LLMs) offer a promising solution for extracting structured clinical information from free-text radiology reports. The Simplified Magnetic Resonance Index of Activity (sMARIA) is a validated scoring system used to quantify Crohn's disease (CD) activity based on Magnetic Resonance Enterography (MRE) findings. This study aims to evaluate the performance of two advanced LLMs in extracting key imaging features and computing sMARIA scores from free-text MRE reports. This retrospective study included 117 anonymized free-text MRE reports from patients with confirmed CD. ChatGPT (GPT-4o) and DeepSeek (DeepSeek-R1) were prompted using a structured input designed to extract four key radiologic features relevant to sMARIA: bowel wall thickness, mural edema, perienteric fat stranding, and ulceration. LLM outputs were evaluated against radiologist annotations at both the segment and feature levels. Segment-level agreement was assessed using accuracy, mean absolute error (MAE) and Pearson correlation. Feature-level performance was evaluated using sensitivity, specificity, precision, and F1-score. Errors including confabulations were recorded descriptively. ChatGPT achieved a segment-level accuracy of 98.6%, MAE of 0.17, and Pearson correlation of 0.99. DeepSeek achieved 97.3% accuracy, MAE of 0.51, and correlation of 0.96. At the feature level, ChatGPT yielded an F1-score of 98.8% (precision 97.8%, sensitivity 99.9%), while DeepSeek achieved 97.9% (precision 96.0%, sensitivity 99.8%). LLMs demonstrate near-human accuracy in extracting structured information and computing sMARIA scores from free-text MRE reports. This enables automated assessment of CD activity without altering current reporting workflows, supporting longitudinal monitoring and large-scale research. Integration into clinical decision support systems may be feasible in the future, provided appropriate human oversight and validation are ensured.

MRI LLM Radiology Report Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Filter Papers

Tags

Fusion-Based Brain Tumor Classification Using Deep Learning and Explainable AI, and Rule-Based Reasoning

Ultrasound-Based Machine Learning and SHapley Additive exPlanations Method Evaluating Risk of Gallbladder Cancer: A Bicentric and Validation Study.

Emerging trends in NanoTheranostics: Integrating imaging and therapy for precision health care.

Transformer-Based Explainable Deep Learning for Breast Cancer Detection in Mammography: The MammoFormer Framework

GPT-4 vs. Radiologists: who advances mediastinal tumor classification better across report quality levels? A cohort study.

Synthesized myelin and iron stainings from 7T multi-contrast MRI via deep learning.

GPT-4 for automated sequence-level determination of MRI protocols based on radiology request forms from clinical routine.

Advanced dynamic ensemble framework with explainability driven insights for precision brain tumor classification across datasets.

Enhancing B-mode-based breast cancer diagnosis via cross-attention fusion of H-scan and Nakagami imaging with multi-CAM-QUS-driven XAI.

LLM-Based Extraction of Imaging Features from Radiology Reports: Automating Disease Activity Scoring in Crohn's Disease.

Ready to Sharpen Your Edge?