Latest Papers on Radiology AI. Tags: Other, Order: Best Match, Limit: 10.

Dual-Domain deep prior guided sparse-view CT reconstruction with multi-scale fusion attention.

Wu J, Lin J, Jiang X, Zheng W, Zhong L, Pang Y, Meng H, Li Z

•papers•May 15 2025

Sparse-view CT reconstruction is a challenging ill-posed inverse problem, where insufficient projection data leads to degraded image quality with increased noise and artifacts. Recent deep learning approaches have shown promising results in CT reconstruction. However, existing methods often neglect projection data constraints and rely heavily on convolutional neural networks, resulting in limited feature extraction capabilities and inadequate adaptability. To address these limitations, we propose a Dual-domain deep Prior-guided Multi-scale fusion Attention (DPMA) model for sparse-view CT reconstruction, aiming to enhance reconstruction accuracy while ensuring data consistency and stability. First, we establish a residual regularization strategy that applies constraints on the difference between the prior image and target image, effectively integrating deep learning-based priors with model-based optimization. Second, we develop a multi-scale fusion attention mechanism that employs parallel pathways to simultaneously model global context, regional dependencies, and local details in a unified framework. Third, we incorporate a physics-informed consistency module based on range-null space decomposition to ensure adherence to projection data constraints. Experimental results demonstrate that DPMA achieves improved reconstruction quality compared to existing approaches, particularly in noise suppression, artifact reduction, and fine detail preservation.

CT Reconstruction Methodology In Silico Academic Lab
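
The range-null space decomposition named in the abstract above can be illustrated on a toy problem. This is a minimal sketch, not the DPMA implementation: it assumes a small hypothetical measurement matrix `A` with orthonormal rows (so its pseudo-inverse is just its transpose) and a made-up prior image `x_prior`. The measured data fix the range-space part of the estimate, and the prior fills in only the null-space part, so the result reproduces the measurements exactly.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def range_null_fuse(A, A_pinv, y, x_prior):
    """Return A^+ y + (I - A^+ A) x_prior.

    The first term is consistent with the measurements y; the second
    keeps only the component of the prior that A cannot observe.
    """
    range_part = matvec(A_pinv, y)
    prior_seen_by_A = matvec(A_pinv, matvec(A, x_prior))
    return [r + p - q for r, p, q in zip(range_part, x_prior, prior_seen_by_A)]


# Hypothetical 2x4 operator with orthonormal rows (an assumption made
# so that the pseudo-inverse equals the transpose).
A = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5]]
A_pinv = transpose(A)

x_true = [1.0, 2.0, 3.0, 4.0]      # toy ground truth
y = matvec(A, x_true)              # under-determined measurements
x_prior = [0.9, 2.2, 2.7, 4.1]     # stand-in for a network's output

x_hat = range_null_fuse(A, A_pinv, y, x_prior)
residual = [a - b for a, b in zip(matvec(A, x_hat), y)]
print(x_hat, residual)
```

For a real CT system matrix the pseudo-inverse is not available in closed form, so an approximate inverse would have to stand in for `A_pinv`; that choice is outside what the abstract specifies.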

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li, Antoni B. Chan

•preprint•May 15 2025

Advances in medical imaging and deep learning have propelled progress in whole slide image (WSI) analysis, with multiple instance learning (MIL) showing promise for efficient and accurate diagnostics. However, conventional MIL models often lack adaptability to evolving datasets, as they rely on static training that cannot incorporate new information without extensive retraining. Applying continual learning (CL) to MIL models is a possible solution, but often sees limited improvements. In this paper, we analyze CL in the context of attention MIL models and find that the model forgetting is mainly concentrated in the attention layers of the MIL model. Using the results of this analysis, we propose two components for improving CL on MIL: Attention Knowledge Distillation (AKD) and the Pseudo-Bag Memory Pool (PMP). AKD mitigates catastrophic forgetting by focusing on retaining attention layer knowledge between learning sessions, while PMP reduces the memory footprint by selectively storing only the most informative patches, or "pseudo-bags", from WSIs. Experimental evaluations demonstrate that our method significantly improves both accuracy and memory efficiency on diverse WSI datasets, outperforming current state-of-the-art CL methods. This work provides a foundation for CL in large-scale, weakly annotated clinical datasets, paving the way for more adaptable and resilient diagnostic models.

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA
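
To make the attention-layer idea in the abstract above concrete, here is a minimal sketch of one common form of attention pooling for MIL (each instance is scored as `w . tanh(V h)`, followed by a softmax over the bag), plus a KL-divergence penalty between the attention weights of an old and a new model. The abstract does not give the paper's architecture or the exact form of AKD, so the scoring function, the KL penalty, and all parameter values here are assumptions for illustration only.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(bag, V, w):
    """Score each instance h as w . tanh(V h), then softmax over the bag."""
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    return softmax(scores)


def pool(bag, weights):
    """Attention-weighted sum of instance features (the bag embedding)."""
    dim = len(bag[0])
    return [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two attention distributions over the same bag."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Toy bag of three 2-D patch features (made-up numbers).
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical attention parameters before and after a new learning session.
V_old, w_old = [[0.5, -0.2], [0.1, 0.3]], [1.0, -1.0]
V_new, w_new = [[0.4, -0.1], [0.2, 0.3]], [0.8, -1.2]

a_old = attention_weights(bag, V_old, w_old)
a_new = attention_weights(bag, V_new, w_new)
bag_embedding = pool(bag, a_new)

# Assumed distillation term: penalize drift of the new attention
# distribution away from the old one.
distill_loss = kl_divergence(a_old, a_new)
print(a_old, a_new, distill_loss)
```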

Scientific Evidence for Clinical Text Summarization Using Large Language Models: Scoping Review.

Bednarczyk L, Reichenpfader D, Gaudet-Blavignac C, Ette AK, Zaghir J, Zheng Y, Bensahla A, Bjelogrlic M, Lovis C

•papers•May 15 2025

Information overload in electronic health records requires effective solutions to alleviate clinicians' administrative tasks. Automatically summarizing clinical text has gained significant attention with the rise of large language models. While individual studies show optimism, a structured overview of the research landscape is lacking. This study aims to present the current state of the art on clinical text summarization using large language models, evaluate the level of evidence in existing research and assess the applicability of performance findings in clinical settings. This scoping review complied with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. Literature published between January 1, 2019, and June 18, 2024, was identified from 5 databases: PubMed, Embase, Web of Science, IEEE Xplore, and ACM Digital Library. Studies were excluded if they did not describe transformer-based models, did not focus on clinical text summarization, did not engage with free-text data, were not original research, were nonretrievable, were not peer-reviewed, or were not in English, French, Spanish, or German. Data related to study context and characteristics, scope of research, and evaluation methodologies were systematically collected and analyzed by 3 authors independently. A total of 30 original studies were included in the analysis. All used observational retrospective designs, mainly using real patient data (n=28, 93%). The research landscape demonstrated a narrow research focus, often centered on summarizing radiology reports (n=17, 57%), primarily involving data from the intensive care unit (n=15, 50%) of US-based institutions (n=19, 73%), in English (n=26, 87%). This focus aligned with the frequent reliance on the open-source Medical Information Mart for Intensive Care dataset (n=15, 50%). 
Summarization methodologies predominantly involved abstractive approaches (n=17, 57%) on single-document inputs (n=4, 13%) with unstructured data (n=13, 43%), yet reporting on methodological details remained inconsistent across studies. Model selection involved both open-source models (n=26, 87%) and proprietary models (n=7, 23%). Evaluation frameworks were highly heterogeneous. All studies conducted internal validation, but external validation (n=2, 7%), failure analysis (n=6, 20%), and patient safety risk analysis (n=1, 3%) were infrequent, and none reported bias assessment. Most studies used both automated metrics and human evaluation (n=16, 53%), while 10 (33%) used only automated metrics, and 4 (13%) only human evaluation. Key barriers hinder the translation of current research into trustworthy, clinically valid applications. Current research remains exploratory and limited in scope, with many applications yet to be explored. Performance assessments often lack reliability, and clinical impact evaluations are insufficient, raising concerns about model utility, safety, fairness, and data privacy. Advancing the field requires more robust evaluation frameworks, a broader research scope, and a stronger focus on real-world applicability.

Mixed Modality LLM Radiology Report Review Concept Academic Lab GenAI Policy Benchmark SOTA

Leveraging Vision Transformers in Multimodal Models for Retinal OCT Analysis.

Feretzakis G, Karakosta C, Gkoulalas-Divanis A, Bisoukis A, Boufeas IZ, Bazakidou E, Sakagianni A, Kalles D, Verykios VS

•papers•May 15 2025

Optical Coherence Tomography (OCT) has become an indispensable imaging modality in ophthalmology, providing high-resolution cross-sectional images of the retina. Accurate classification of OCT images is crucial for diagnosing retinal diseases such as Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME). This study explores the efficacy of various deep learning models, including convolutional neural networks (CNNs) and Vision Transformers (ViTs), in classifying OCT images. We also investigate the impact of integrating metadata (patient age, sex, eye laterality, and year) into the classification process, even when a significant portion of metadata is missing. Our results demonstrate that multimodal models leveraging both image and metadata inputs, such as the Multimodal ResNet18, can achieve competitive performance compared to image-only models, such as DenseNet121. Notably, DenseNet121 and Multimodal ResNet18 achieved the highest accuracy of 95.16%, with DenseNet121 showing a slightly higher F1-score of 0.9313. The multimodal ViT-based model also demonstrated promising results, achieving an accuracy of 93.22%, indicating the potential of Vision Transformers (ViTs) in medical image analysis, especially for handling complex multimodal data.

OCT Classification Methodology In Silico Academic Lab GenAI

Participatory Co-Creation of an AI-Supported Patient Information System: A Multi-Method Qualitative Study.

Heizmann C, Gleim P, Kellmeyer P

•papers•May 15 2025

In radiology and other medical fields, informed consent often relies on paper-based forms, which can overwhelm patients with complex terminology. These forms are also resource-intensive. The KIPA project addresses these challenges by developing an AI-assisted patient information system to streamline the consent process, improve patient understanding, and reduce healthcare workload. The KIPA system uses natural language processing (NLP) to provide real-time, accessible explanations, answer questions, and support informed consent. KIPA follows an 'ethics-by-design' approach, integrating user feedback to align with patient and clinician needs. Interviews and usability testing identified requirements, such as simplified language and support for varying digital literacy. The study presented here explores the participatory co-creation of the KIPA system, focusing on improving informed consent in radiology through a multi-method qualitative approach. Preliminary results suggest that KIPA improves patient engagement and reduces insecurities by providing proactive guidance and tailored information. Future work will extend testing to other stakeholders and assess the impact of the system on clinical workflow.

Mixed Modality Report Generation Methodology Prototype Academic Lab Ethics

Deep learning MRI-based radiomic models for predicting recurrence in locally advanced nasopharyngeal carcinoma after neoadjuvant chemoradiotherapy: a multi-center study.

Hu C, Xu C, Chen J, Huang Y, Meng Q, Lin Z, Huang X, Chen L

•papers•May 15 2025

Local recurrence and distant metastasis are common manifestations of locoregionally advanced nasopharyngeal carcinoma (LA-NPC) after neoadjuvant chemoradiotherapy (NACT). This study aimed to validate the clinical value of deep learning MRI-based radiomic models for predicting recurrence in LA-NPC patients. A total of 328 NPC patients from four hospitals were retrospectively included and randomly divided into training (n = 229) and validation (n = 99) cohorts. A total of 975 traditional radiomic features and 1000 deep radiomic features were extracted from contrast-enhanced T1-weighted (T1WI + C) and T2-weighted (T2WI) sequences, respectively. Least absolute shrinkage and selection operator (LASSO) was applied for feature selection. Five machine learning classifiers were used to develop three models for LA-NPC prediction in the training cohort: Model I, based on traditional radiomic features; Model II, which combined deep radiomic features with Model I; and Model III, which combined Model II with clinical features. The predictive performance of these models was evaluated by receiver operating characteristic (ROC) curve analysis, area under the curve (AUC), accuracy, sensitivity, and specificity in both cohorts. The clinical characteristics of the two cohorts showed no significant differences. Fifteen radiomic features and 6 deep radiomic features were selected from T1WI + C, and 9 radiomic features and 6 deep radiomic features were selected from T2WI. On T2WI, Model II based on random forest (RF) (AUC = 0.87) performed best among the models in the validation cohort. The traditional radiomic model combined with deep radiomic features showed excellent predictive performance and could be used to assist clinicians in predicting the curative effect for LA-NPC patients after NACT.

MRI Classification Retrospective Clinical In Silico Academic Lab
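
The evaluation metrics listed in the abstract above (AUC, accuracy, sensitivity, specificity) can be computed as in the sketch below. The labels and scores are made-up toy values, not data from the study, and the 0.5 decision threshold is an assumption. AUC is computed here as the fraction of positive/negative pairs that the scores rank correctly, with ties counted as half.

```python
def confusion_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold."""
    tp = fp = tn = fn = 0
    for t, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if t == 1 and predicted == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true, y_score):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy recurrence labels (1 = recurrence) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = confusion_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

The pairwise AUC is quadratic in the number of samples, which is fine for cohorts of a few hundred patients; a rank-based formula would be preferable for much larger sets.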

Uncertainty Co-estimator for Improving Semi-Supervised Medical Image Segmentation.

Zeng X, Xiong S, Xu J, Du G, Rong Y

•papers•May 15 2025

Recently, combining the strategy of consistency regularization with uncertainty estimation has shown promising performance on semi-supervised medical image segmentation tasks. However, most existing methods estimate the uncertainty solely based on the outputs of a single neural network, which results in imprecise uncertainty estimations and eventually degrades the segmentation performance. In this paper, we propose a novel Uncertainty Co-estimator (UnCo) framework to deal with this problem. Inspired by the co-training technique, UnCo establishes two different mean-teacher modules (i.e., two pairs of teacher and student models), and estimates three types of uncertainty from the multi-source predictions generated by these models. Through combining these uncertainties, their differences will help to filter out incorrect noise in each estimate, thus allowing the final fused uncertainty maps to be more accurate. These resulting maps are then used to enhance a cross-consistency regularization imposed between the two modules. In addition, UnCo also designs an internal consistency regularization within each module, so that the student models can aggregate diverse feature information from both modules, thus promoting the semi-supervised segmentation performance. Finally, an adversarial constraint is introduced to maintain the model diversity. Experimental results on four medical image datasets indicate that UnCo can achieve new state-of-the-art performance on both 2D and 3D semi-supervised segmentation tasks. The source code will be available at https://github.com/z1010x/UnCo.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA Open Code
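
Two building blocks behind the abstract above can be sketched in a few lines: the mean-teacher update, in which teacher weights are an exponential moving average (EMA) of student weights, and a per-pixel uncertainty score. The abstract does not say which three uncertainty types UnCo uses, so predictive entropy is shown here only as one common choice; the decay `alpha` and all numbers are assumptions, and this is not the authors' code.

```python
import math


def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher step: teacher <- alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def entropy(probs, eps=1e-12):
    """Predictive entropy of one pixel's class probabilities (in nats)."""
    return -sum(p * math.log(p + eps) for p in probs)


# Flattened toy weight vectors for one teacher/student pair.
teacher = [1.0, 0.0]
student = [0.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)

# A confident pixel versus an ambiguous one (two-class probabilities).
confident = entropy([1.0, 0.0])
ambiguous = entropy([0.5, 0.5])
print(teacher, confident, ambiguous)
```

In an uncertainty-guided consistency loss, pixels with high entropy would typically be down-weighted or masked out before comparing teacher and student predictions.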

An Annotated Multi-Site and Multi-Contrast Magnetic Resonance Imaging Dataset for the study of the Human Tongue Musculature.

Ribeiro FL, Zhu X, Ye X, Tu S, Ngo ST, Henderson RD, Steyn FJ, Kiernan MC, Barth M, Bollmann S, Shaw TB

•papers•May 14 2025

This dataset provides the first annotated, openly available MRI-based imaging dataset for investigations of tongue musculature, including multi-contrast and multi-site MRI data from non-disease participants. The present dataset includes 47 participants collated from three studies: BeLong (four participants; T2-weighted images), EATT4MND (19 participants; T2-weighted images), and BMC (24 participants; T1-weighted images). We provide manually corrected segmentations of five key tongue muscles: the superior longitudinal, combined transverse/vertical, genioglossus, and inferior longitudinal muscles. Other phenotypic measures, including age, sex, weight, height, and tongue muscle volume, are also available for use. This dataset will benefit researchers across domains interested in the structure and function of the tongue in health and disease. For instance, researchers can use this data to train new machine learning models for tongue segmentation, which can be leveraged for segmentation and tracking of different tongue muscles engaged in speech formation in health and disease. Altogether, this dataset gives the scientific community the means to investigate the intricate tongue musculature and its role in physiological processes and speech production.

MRI Segmentation Dataset Release In Silico Academic Lab Open Dataset

Predicting response to anti-VEGF therapy in neovascular age-related macular degeneration using random forest and SHAP algorithms.

Zhang P, Duan J, Wang C, Li X, Su J, Shang Q

•papers•May 14 2025

This study aimed to establish and validate a prediction model based on machine learning methods and the SHAP algorithm to predict response to anti-vascular endothelial growth factor (VEGF) therapy in neovascular age-related macular degeneration (AMD). In this retrospective study, we extracted data including demographic characteristics, laboratory test results, and imaging features from optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA). Eight machine learning methods (Logistic Regression, Gradient Boosting Decision Tree, Random Forest, CatBoost, Support Vector Machine, XGBoost, LightGBM, and K-Nearest Neighbors) were employed to develop the predictive model. The machine learning method with optimal performance was selected for further interpretation. Finally, the SHAP algorithm was applied to explain the model's predictions. The study included 145 patients with neovascular AMD. Among the eight models developed, the Random Forest model demonstrated the best overall performance, achieving an accuracy of 75.86% and the highest area under the receiver operating characteristic curve (AUC) value of 0.91. In this model, the features identified as significant contributors to the response to anti-VEGF therapy in neovascular AMD patients included fractal dimension, total number of end points, total number of junctions, total vessels length, vessels area, average lacunarity, choroidal neovascularization (CNV) type, age, duration, and logMAR BCVA. SHAP analysis and visualization provided interpretation at both the factor level and the individual level. The Random Forest model for predicting response to anti-VEGF therapy in neovascular AMD, interpreted with the SHAP algorithm, proved to be feasible and effective. OCTA imaging features, such as fractal dimension and total number of end points, among others, were the most effective predictive factors.

OCT Classification Retrospective Clinical In Silico Academic Lab
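
SHAP explanations, as used in the abstract above, are built on Shapley values: each feature's contribution is its average marginal effect on the prediction over all orders in which features can be switched from a baseline to their actual values. The sketch below computes exact Shapley values by brute force for a tiny hypothetical model. It does not use the SHAP library or a random forest, and the model, feature values, and baseline are invented for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature ordering."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)     # start with all features "absent"
        prev = model(current)
        for i in order:
            current[i] = x[i]        # switch feature i to its real value
            new = model(current)
            phi[i] += new - prev     # marginal contribution in this order
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(f):
    """Hypothetical score with one interaction term between f[0] and f[1]."""
    return 2.0 * f[0] + 1.0 * f[1] - 3.0 * f[2] + f[0] * f[1]


x = [1.0, 2.0, 1.0]          # made-up feature values for one patient
baseline = [0.0, 0.0, 0.0]   # assumed reference input

phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Enumeration grows factorially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.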

Multi-Task Deep Learning for Predicting Metabolic Syndrome from Retinal Fundus Images in a Japanese Health Checkup Dataset

Itoh, T., Nishitsuka, K., Fukuma, Y., Wada, S.

•preprint•May 14 2025

Background: Retinal fundus images provide a noninvasive window into systemic health, offering opportunities for early detection of metabolic disorders such as metabolic syndrome (METS). Objective: This study aimed to develop a deep learning model to predict METS from fundus images obtained during routine health checkups, leveraging a multi-task learning approach. Methods: We retrospectively analyzed 5,000 fundus images from Japanese health checkup participants. Convolutional neural network (CNN) models were trained to classify METS status, incorporating fundus-specific data augmentation strategies and auxiliary regression tasks targeting clinical parameters such as abdominal circumference (AC). Model performance was evaluated using validation accuracy, test accuracy, and the area under the receiver operating characteristic curve (AUC). Results: Models employing fundus-specific augmentation demonstrated more stable convergence and superior validation accuracy compared to general-purpose augmentation. Incorporating AC as an auxiliary task further enhanced performance across architectures. The final ensemble model with test-time augmentation achieved a test accuracy of 0.696 and an AUC of 0.73178. Conclusion: Combining multi-task learning, fundus-specific data augmentation, and ensemble prediction substantially improves deep learning-based METS classification from fundus images. This approach may offer a practical, noninvasive screening tool for metabolic syndrome in general health checkup settings.

OCT Classification Retrospective Clinical In Silico Academic Lab
