Latest Papers on Radiology AI. Tags: Other, Order: Best Match, Limit: 10.

LiteMIL: A Computationally Efficient Transformer-Based MIL for Cancer Subtyping on Whole Slide Images.

Kussaibi, H.

•preprint•May 12 2025

Purpose: Accurate cancer subtyping is crucial for effective treatment; however, it presents challenges due to overlapping morphology and variability among pathologists. Although deep learning (DL) methods have shown potential, their application to gigapixel whole slide images (WSIs) is often hindered by high computational demands and the need for efficient, context-aware feature aggregation. This study introduces LiteMIL, a computationally efficient transformer-based multiple instance learning (MIL) network combined with Phikon, a pathology-tuned self-supervised feature extractor, for robust and scalable cancer subtyping on WSIs. Methods: Initially, patches were extracted from the TCGA-THYM dataset (242 WSIs, six subtypes) and subsequently fed in real time to Phikon for feature extraction. To train MILs, features were arranged into uniform bags using a chunking strategy that maintains tissue context while increasing training data. LiteMIL utilizes a learnable query vector within an optimized multi-head attention module for effective feature aggregation. The model's performance was evaluated against established MIL methods on the thymic dataset and three additional TCGA datasets (breast, lung, and kidney cancer). Results: LiteMIL achieved an F1 score of 0.89 ± 0.01 and an AUC of 0.99 on the thymic dataset, outperforming other MILs. LiteMIL demonstrated strong generalizability across the external datasets, scoring the best on breast and kidney cancer datasets. Compared to TransMIL, LiteMIL significantly reduces training time and GPU memory usage. Ablation studies confirmed the critical role of the learnable query and layer normalization in enhancing performance and stability. Conclusion: LiteMIL offers a resource-efficient, robust solution. Its streamlined architecture, combined with the compact Phikon features, makes it suitable for integrating into routine histopathological workflows, particularly in resource-limited settings.

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA
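
The two ideas named in the LiteMIL abstract, chunking patch features into uniform bags and pooling each bag with a learnable query, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single attention head with no key/value projections or layer normalization, the query is random rather than trained, and the function names `chunk_into_bags` and `attention_pool`, the bag size, and the feature dimension are all hypothetical choices for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def chunk_into_bags(features, bag_size):
    """Split ordered patch features into consecutive equal-sized bags.

    Assumption: a trailing remainder shorter than bag_size is dropped.
    """
    last_start = len(features) - bag_size
    return [features[i:i + bag_size] for i in range(0, last_start + 1, bag_size)]


def attention_pool(query, bag):
    """Pool a bag of feature vectors into one vector with a single query.

    Scores are scaled dot products between the query and each patch feature;
    the pooled vector is the attention-weighted mean of the features.
    """
    dim = len(query)
    scores = [sum(q * x for q, x in zip(query, feat)) / math.sqrt(dim) for feat in bag]
    weights = softmax(scores)
    pooled = [sum(w * feat[j] for w, feat in zip(weights, bag)) for j in range(dim)]
    return pooled, weights


random.seed(0)
# Hypothetical slide: 10 patches, each with an 8-dimensional feature vector.
slide_features = [[random.gauss(0, 1) for _ in range(8)] for _ in range(10)]
bags = chunk_into_bags(slide_features, bag_size=4)
query = [random.gauss(0, 1) for _ in range(8)]  # would be learned in training
pooled, weights = attention_pool(query, bags[0])
```

In a trained model the query would be a parameter updated by backpropagation, and the pooled vector would feed a classification head.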

Automated scout-image-based estimation of contrast agent dosing: a deep learning approach

Schirrmeister, R., Taleb, L., Friemel, P., Reisert, M., Bamberg, F., Weiss, J., Rau, A.

•preprint•May 12 2025

We developed and tested a deep-learning-based algorithm for the approximation of contrast agent dosage based on computed tomography (CT) scout images. We prospectively enrolled 817 patients undergoing clinically indicated CT imaging, predominantly of the thorax and/or abdomen. Patient weight was collected by study staff prior to the examination 1) with a weight scale and 2) as self-reported. Based on the scout images, we developed an EfficientNet convolutional neural network pipeline to estimate the optimal contrast agent dose based on patient weight and provide a browser-based user interface as a versatile open-source tool to account for different contrast agent compounds. We additionally analyzed the body-weight-informative CT features by synthesizing representative examples for different weights using in-context learning and dataset distillation. The cohort consisted of 533 thoracic, 70 abdominal and 229 thoracic-abdominal CT scout scans. Self-reported patient weight was statistically significantly lower than manual measurements (75.13 kg vs. 77.06 kg; p < 10⁻⁵, Wilcoxon signed-rank test). Our pipeline predicted patient weight with a mean absolute error of 3.90 ± 0.20 kg (corresponding to a roughly 4.48 - 11.70 ml difference in contrast agent depending on the agent) in 5-fold cross-validation and is publicly available at https://tinyurl.com/ct-scout-weight. Interpretability analysis revealed that both larger anatomical shape and higher overall attenuation were predictive of body weight. Our open-source deep learning pipeline allows for the automatic estimation of accurate contrast agent dosing based on scout images in routine CT imaging studies. This approach has the potential to streamline contrast agent dosing workflows, improve efficiency, and enhance patient safety by providing quick and accurate weight estimates without additional measurements or reliance on potentially outdated records.
The model's performance may vary depending on patient positioning and scout image quality, and the approach requires validation on larger patient cohorts and at other clinical centers. Author Summary: Automation of medical workflows using AI has the potential to increase reproducibility while saving costs and time. Here, we investigated automating the estimation of the required contrast agent dosage for CT examinations. We trained a deep neural network to predict the body weight from the initial 2D CT scout images that are required prior to the actual CT examination. The predicted weight is then converted to a contrast agent dosage based on contrast-agent-specific conversion factors. To facilitate application in clinical routine, we developed a user-friendly browser-based user interface that allows clinicians to select a contrast agent or input a custom conversion factor to receive dosage suggestions, with local data processing in the browser. We also investigate what image characteristics predict body weight and find plausible relationships such as higher attenuation and larger anatomical shapes correlating with higher body weights. Our work goes beyond prior work by implementing a single model for a variety of anatomical regions, providing an accessible user interface and investigating the predictive characteristics of the images.

CT Classification Prospective In Silico Academic Lab Open Code
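
The weight-to-dose conversion described in the scout-image abstract can be checked with a short worked example. The sketch assumes the conversion is a plain multiplication of body weight by a per-agent factor in ml/kg, which the abstract implies but does not state outright. The two factors are back-calculated from the reported figures (a 3.90 kg error corresponding to 4.48 to 11.70 ml); the function name `contrast_volume_ml` and the 1.5 ml/kg example factor are hypothetical.

```python
def contrast_volume_ml(weight_kg, ml_per_kg):
    """Convert body weight to contrast volume, assuming a linear ml/kg factor."""
    if weight_kg <= 0 or ml_per_kg <= 0:
        raise ValueError("weight and conversion factor must be positive")
    return weight_kg * ml_per_kg


# Figures reported in the abstract.
mae_kg = 3.90
volume_error_low_ml = 4.48
volume_error_high_ml = 11.70

# Implied conversion factors under the linear assumption.
low_factor = volume_error_low_ml / mae_kg    # about 1.15 ml/kg
high_factor = volume_error_high_ml / mae_kg  # about 3.0 ml/kg

# Hypothetical agent at 1.5 ml/kg for the mean measured weight in the cohort.
example_volume = contrast_volume_ml(77.06, 1.5)
print(f"implied factors: {low_factor:.2f} to {high_factor:.2f} ml/kg")
print(f"example volume: {example_volume:.1f} ml")
```

Under this assumption a weight error translates into a volume error in direct proportion to the factor, which is why the reported volume range is wide: agents dosed at higher ml/kg amplify the same 3.90 kg error.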

Benchmarking Radiology Report Generation From Noisy Free-Texts.

Yuan Y, Zheng Y, Qu L

•papers•May 12 2025

Automatic radiology report generation can enhance diagnostic efficiency and accuracy. However, clean open-source imaging scan-report pairs are limited in scale and variety. Moreover, the vast amount of radiological texts available online is often too noisy to be directly employed. To address this challenge, we introduce a novel task called Noisy Report Refinement (NRR), which generates radiology reports from noisy free-texts. To achieve this, we propose a report refinement pipeline that leverages large language models (LLMs) enhanced with guided self-critique and report selection strategies. To address the inability of existing radiology report generation metrics to measure cleanliness, radiological usefulness, and factual correctness across various modalities of reports in the NRR task, we introduce a new benchmark, NRRBench, for NRR evaluation. This benchmark includes two online-sourced datasets and four clinically explainable LLM-based metrics: two metrics evaluate the matching rate of radiology entities and modality-specific template attributes respectively, one metric assesses report cleanliness, and a combined metric evaluates overall NRR performance. Experiments demonstrate that guided self-critique and report selection strategies significantly improve the quality of refined reports. Additionally, our proposed metrics show a much higher correlation with the noisy rate and error count of reports than radiology report generation metrics in evaluating NRR.

Mixed Modality LLM Radiology Report Methodology In Silico Benchmark SOTA GenAI

Biological markers and psychosocial factors predict chronic pain conditions.

Fillingim M, Tanguay-Sabourin C, Parisien M, Zare A, Guglietti GV, Norman J, Petre B, Bortsov A, Ware M, Perez J, Roy M, Diatchenko L, Vachon-Presseau E

•papers•May 12 2025

Chronic pain is a multifactorial condition presenting significant diagnostic and prognostic challenges. Biomarkers for the classification and the prediction of chronic pain are therefore critically needed. Here, in this multidataset study of over 523,000 participants, we applied machine learning to multidimensional biological data from the UK Biobank to identify biomarkers for 35 medical conditions associated with pain (for example, rheumatoid arthritis and gout) or self-reported chronic pain (for example, back pain and knee pain). Biomarkers derived from blood immunoassays, brain and bone imaging, and genetics were effective in predicting medical conditions associated with chronic pain (area under the curve (AUC) 0.62-0.87) but not self-reported pain (AUC 0.50-0.62). Notably, all biomarkers worked in synergy with psychosocial factors, accurately predicting both medical conditions (AUC 0.69-0.91) and self-reported pain (AUC 0.71-0.92). These findings underscore the necessity of adopting a holistic approach in the development of biomarkers to enhance their clinical utility.

Mixed Modality Classification Retrospective Clinical In Silico Academic Lab

Evaluating the reference accuracy of large language models in radiology: a comparative study across subspecialties.

Güneş YC, Cesur T, Çamur E

•papers•May 12 2025

This study aimed to compare six large language models (LLMs) [Chat Generative Pre-trained Transformer (ChatGPT) o1-preview, ChatGPT-4o, ChatGPT-4o with canvas, Google Gemini 1.5 Pro, Claude 3.5 Sonnet, and Claude 3 Opus] in generating radiology references, assessing accuracy, fabrication, and bibliographic completeness. In this cross-sectional observational study, 120 open-ended questions were administered across eight radiology subspecialties (neuroradiology, abdominal, musculoskeletal, thoracic, pediatric, cardiac, head and neck, and interventional radiology), with 15 questions per subspecialty. Each question prompted the LLMs to provide responses containing four references with in-text citations and complete bibliographic details (authors, title, journal, publication year/month, volume, issue, page numbers, and PubMed Identifier). References were verified using Medline, Google Scholar, the Directory of Open Access Journals, and web searches. Each bibliographic element was scored for correctness, and a composite final score [(FS): 0-36] was calculated by summing the correct elements and multiplying this by a 5-point verification score for content relevance. The FS values were then categorized into a 5-point Likert scale reference accuracy score (RAS: 0 = fabricated; 4 = fully accurate). Non-parametric tests (Kruskal-Wallis, Tamhane's T2, Wilcoxon signed-rank test with Bonferroni correction) were used for statistical comparisons. Claude 3.5 Sonnet demonstrated the highest reference accuracy, with 80.8% fully accurate references (RAS 4) and a fabrication rate of 3.1%, significantly outperforming all other models (P < 0.001). Claude 3 Opus ranked second, achieving 59.6% fully accurate references and a fabrication rate of 18.3% (P < 0.001). ChatGPT-based models (ChatGPT-4o, ChatGPT-4o with canvas, and ChatGPT o1-preview) exhibited moderate accuracy, with fabrication rates ranging from 27.7% to 52.9% and <8% fully accurate references.
Google Gemini 1.5 Pro had the lowest performance, achieving only 2.7% fully accurate references and the highest fabrication rate of 60.6% (P < 0.001). Reference accuracy also varied by subspecialty, with neuroradiology and cardiac radiology outperforming pediatric and head and neck radiology. Claude 3.5 Sonnet significantly outperformed all other models in generating verifiable radiology references, and Claude 3 Opus showed moderate performance. In contrast, ChatGPT models and Google Gemini 1.5 Pro delivered substantially lower accuracy with higher rates of fabricated references, highlighting current limitations in automated academic citation generation. The high accuracy of Claude 3.5 Sonnet can improve radiology literature reviews, research, and education with dependable references. The poor performance of other models, with high fabrication rates, risks misinformation in clinical and academic settings and highlights the need for refinement to ensure safe and effective use.

LLM Radiology Report Retrospective Clinical In Silico Academic Lab GenAI Policy
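
The composite final score (FS) from the reference-accuracy abstract can be sketched as follows. The abstract gives the formula (number of correct elements multiplied by a 5-point verification score) and the range 0-36, but not the element count or the scale endpoints. This sketch assumes nine binary elements (year and month counted separately) and a verification score from 0 to 4, which together reproduce the stated maximum of 36. The element names and the function `final_score` are hypothetical, and the mapping from FS to the RAS categories is omitted because its thresholds are not given.

```python
# Assumed element list: nine binary checks, so that 9 * 4 = 36.
ELEMENTS = (
    "authors", "title", "journal", "year", "month",
    "volume", "issue", "pages", "pmid",
)


def final_score(correct, verification):
    """Compute FS = (number of correct elements) * (verification score).

    correct: dict mapping element name to True/False; missing names count as False.
    verification: integer content-relevance score, assumed to run from 0 to 4.
    """
    if verification not in range(5):
        raise ValueError("verification score must be an integer from 0 to 4")
    unknown = set(correct) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    n_correct = sum(1 for name in ELEMENTS if correct.get(name, False))
    return n_correct * verification


fully_accurate = final_score({name: True for name in ELEMENTS}, 4)
fabricated = final_score({}, 0)
partially_correct = final_score({"authors": True, "title": True, "journal": True}, 2)
```

Because the score is a product, a reference with perfect bibliographic details still scores zero if its content is judged irrelevant, and vice versa.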

Preoperative prediction of malignant transformation in sinonasal inverted papilloma: a novel MRI-based deep learning approach.

Ding C, Wen B, Han Q, Hu N, Kang Y, Wang Y, Wang C, Zhang L, Xian J

•papers•May 12 2025

To develop a novel MRI-based deep learning (DL) diagnostic model, utilizing multicenter large-sample data, for the preoperative differentiation of sinonasal inverted papilloma (SIP) from SIP-transformed squamous cell carcinoma (SIP-SCC). This study included 568 patients from four centers with confirmed SIP (n = 421) and SIP-SCC (n = 147). Deep learning models were built using T1WI, T2WI, and CE-T1WI. A combined model was constructed by integrating these features through an attention mechanism. The diagnostic performance of radiologists, both with and without the model's assistance, was compared. Model performance was evaluated through receiver operating characteristic (ROC) analysis, calibration curves, and decision curve analysis (DCA). The combined model demonstrated superior performance in differentiating SIP from SIP-SCC, achieving AUCs of 0.954, 0.897, and 0.859 in the training, internal validation, and external validation cohorts, respectively. It showed optimal accuracy, stability, and clinical benefit, as confirmed by Brier scores and calibration curves. The diagnostic performance of radiologists, especially for less experienced ones, was significantly improved with model assistance. The MRI-based deep learning model enhances the capability to predict malignant transformation of sinonasal inverted papilloma before surgery. By facilitating earlier diagnosis and promoting timely pathological examination or surgical intervention, this approach holds the potential to enhance patient prognosis. Questions: Sinonasal inverted papilloma (SIP) is prone to malignant transformation locally, leading to poor prognosis; current diagnostic methods are invasive and inaccurate, necessitating effective preoperative differentiation. Findings: The MRI-based deep learning model accurately diagnoses malignant transformations of SIP, enabling junior radiologists to achieve greater clinical benefits with the assistance of the model.
Clinical relevance: A novel MRI-based deep learning model enhances the capability of preoperative diagnosis of malignant transformation in sinonasal inverted papilloma, providing a non-invasive tool for personalized treatment planning.

MRI Classification Retrospective Clinical In Silico Academic Lab

Learning-based multi-material CBCT image reconstruction with ultra-slow kV switching.

Ma C, Zhu J, Zhang X, Cui H, Tan Y, Guo J, Zheng H, Liang D, Su T, Sun Y, Ge Y

•papers•May 11 2025

Objective: The purpose of this study is to perform multiple (≥3) material decomposition with a deep learning method for spectral cone-beam CT (CBCT) imaging based on ultra-slow kV switching. Approach: In this work, a novel deep neural network called SkV-Net is developed to reconstruct multiple material density images from the ultra-sparse spectral CBCT projections acquired using the ultra-slow kV switching technique. In particular, the SkV-Net has a backbone structure of U-Net, and a multi-head axial attention module is adopted to enlarge the perceptual field. It takes the CT images reconstructed from each kV as input and outputs the basis material images automatically based on their energy-dependent attenuation characteristics. Numerical simulations and experimental studies are carried out to evaluate the performance of this new approach. Main Results: It is demonstrated that the SkV-Net is able to generate four different material density images, i.e., fat, muscle, bone and iodine, from five spans of kV switched spectral projections. Physical experiments show that the decomposition errors of iodine and CaCl₂ are less than 6%, indicating high precision of this novel approach in distinguishing materials. Significance: SkV-Net provides a promising multi-material decomposition approach for spectral CBCT imaging systems implemented with the ultra-slow kV switching scheme.

CT Reconstruction Methodology In Silico Academic Lab

The March to Harmonized Imaging Standards for Retinal Imaging.

Gim N, Ferguson AN, Blazes M, Lee CS, Lee AY

•papers•May 11 2025

The adoption of standardized imaging protocols in retinal imaging is critical to overcoming challenges posed by fragmented data formats across devices and manufacturers. The lack of standardization hinders clinical interoperability, collaborative research, and the development of artificial intelligence (AI) models that depend on large, high-quality datasets. The Digital Imaging and Communication in Medicine (DICOM) standard offers a robust solution for ensuring interoperability in medical imaging. Although DICOM is widely utilized in radiology and cardiology, its adoption in ophthalmology remains limited. Retinal imaging modalities such as optical coherence tomography (OCT), fundus photography, and OCT angiography (OCTA) have revolutionized retinal disease management but are constrained by proprietary and non-standardized formats. This review underscores the necessity for harmonized imaging standards in ophthalmology, detailing DICOM standards for retinal imaging including ophthalmic photography (OP), OCT, and OCTA, and their requisite metadata information. Additionally, the potential of DICOM standardization for advancing AI applications in ophthalmology is explored. A notable example is the Artificial Intelligence Ready and Equitable Atlas for Diabetes Insights (AI-READI) dataset, the first publicly available standards-compliant DICOM retinal imaging dataset. This dataset encompasses diverse retinal imaging modalities, including color fundus photography, infrared, autofluorescence, OCT, and OCTA. By leveraging multimodal retinal imaging, AI-READI provides a transformative resource for studying diabetes and its complications, setting a blueprint for future datasets aimed at harmonizing imaging formats and enabling AI-driven breakthroughs in ophthalmology. Our manuscript also addresses challenges in retinal imaging for diabetic patients, retinal imaging-based AI applications for studying diabetes, and potential advancements in retinal imaging standardization.

OCT Review Academic Lab Open Dataset

Improving Generalization of Medical Image Registration Foundation Model

Jing Hu, Kaiwei Yu, Hongjiang Xian, Shu Hu, Xin Wang

•preprint•May 10 2025

Deformable registration is a fundamental task in medical image processing, aiming to achieve precise alignment by establishing nonlinear correspondences between images. Traditional methods offer good adaptability and interpretability but are limited by computational efficiency. Although deep learning approaches have significantly improved registration speed and accuracy, they often lack flexibility and generalizability across different datasets and tasks. In recent years, foundation models have emerged as a promising direction, leveraging large and diverse datasets to learn universal features and transformation patterns for image registration, thus demonstrating strong cross-task transferability. However, these models still face challenges in generalization and robustness when encountering novel anatomical structures, varying imaging conditions, or unseen modalities. To address these limitations, this paper incorporates Sharpness-Aware Minimization (SAM) into foundation models to enhance their generalization and robustness in medical image registration. By optimizing the flatness of the loss landscape, SAM improves model stability across diverse data distributions and strengthens its ability to handle complex clinical scenarios. Experimental results show that foundation models integrated with SAM achieve significant improvements in cross-dataset registration performance, offering new insights for the advancement of medical image registration technology. Our code is available at https://github.com/Promise13/fm_sam.

Mixed Modality Registration Methodology In Silico Open Code
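
The Sharpness-Aware Minimization step mentioned in the registration abstract can be sketched on a toy problem. This is the generic two-gradient SAM update (perturb the weights along the normalized gradient by a radius rho, then descend using the gradient taken at the perturbed point), not the code from the linked repository. The quadratic loss, the learning rate, the value of rho, and the names `sam_step`, `toy_loss`, and `toy_grad` are assumptions made for illustration.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a list of float weights.

    1. Compute the gradient at the current weights.
    2. Move to the approximate worst-case point within an L2 ball of radius rho.
    3. Apply the gradient from that perturbed point to the original weights.
    """
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(weights)  # stationary point: no ascent direction to follow
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]
    sharp_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sharp_grad)]


def toy_loss(weights):
    """Hypothetical quadratic loss with a different curvature per coordinate."""
    return sum((i + 1) * w * w for i, w in enumerate(weights))


def toy_grad(weights):
    """Analytic gradient of toy_loss."""
    return [2 * (i + 1) * w for i, w in enumerate(weights)]


weights = [1.0, -2.0]
initial_loss = toy_loss(weights)
for _ in range(50):
    weights = sam_step(weights, toy_grad)
final_loss = toy_loss(weights)
```

Each step costs two gradient evaluations instead of one, which is the usual price of SAM; in a deep learning framework the same pattern is applied to the model parameters around a standard optimizer.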

Deeply Explainable Artificial Neural Network

David Zucker

•preprint•May 10 2025

While deep learning models have demonstrated remarkable success in numerous domains, their black-box nature remains a significant limitation, especially in critical fields such as medical image analysis and inference. Existing explainability methods, such as SHAP, LIME, and Grad-CAM, are typically applied post hoc, adding computational overhead and sometimes producing inconsistent or ambiguous results. In this paper, we present the Deeply Explainable Artificial Neural Network (DxANN), a novel deep learning architecture that embeds explainability ante hoc, directly into the training process. Unlike conventional models that require external interpretation methods, DxANN is designed to produce per-sample, per-feature explanations as part of the forward pass. Built on a flow-based framework, it enables both accurate predictions and transparent decision-making, and is particularly well-suited for image-based tasks. While our focus is on medical imaging, the DxANN architecture is readily adaptable to other data modalities, including tabular and sequential data. DxANN marks a step forward toward intrinsically interpretable deep learning, offering a practical solution for applications where trust and accountability are essential.

Mixed Modality Classification Methodology Concept Ethics
