Latest Papers on Radiology AI. Tags: Other, Order: Best Match, Limit: 10.

An incremental algorithm for non-convex AI-enhanced medical image processing

Elena Morotti

•preprint•May 13 2025

Solving non-convex regularized inverse problems is challenging due to their complex optimization landscapes and multiple local minima. However, these models remain widely studied as they often yield high-quality, task-oriented solutions, particularly in medical imaging, where the goal is to enhance clinically relevant features rather than merely minimizing global error. We propose incDG, a hybrid framework that integrates deep learning with incremental model-based optimization to efficiently approximate the $\ell_0$-optimal solution of imaging inverse problems. Built on the Deep Guess strategy, incDG exploits a deep neural network to generate effective initializations for a non-convex variational solver, which refines the reconstruction through regularized incremental iterations. This design combines the efficiency of Artificial Intelligence (AI) tools with the theoretical guarantees of model-based optimization, ensuring robustness and stability. We validate incDG on TpV-regularized optimization tasks, demonstrating its effectiveness in medical image deblurring and tomographic reconstruction across diverse datasets, including synthetic images, brain CT slices, and chest-abdomen scans. Results show that incDG outperforms both conventional iterative solvers and deep learning-based methods, achieving superior accuracy and stability. Moreover, we confirm that training incDG without ground truth does not significantly degrade performance, making it a practical and powerful tool for solving non-convex inverse problems in imaging and beyond.

CT Reconstruction Methodology In Silico Reproducibility

Unsupervised Out-of-Distribution Detection in Medical Imaging Using Multi-Exit Class Activation Maps and Feature Masking

Yu-Jen Chen, Xueyang Li, Yiyu Shi, Tsung-Yi Ho

•preprint•May 13 2025

Out-of-distribution (OOD) detection is essential for ensuring the reliability of deep learning models in medical imaging applications. This work is motivated by the observation that class activation maps (CAMs) for in-distribution (ID) data typically emphasize regions that are highly relevant to the model's predictions, whereas OOD data often lacks such focused activations. By masking input images with inverted CAMs, the feature representations of ID data undergo more substantial changes compared to those of OOD data, offering a robust criterion for differentiation. In this paper, we introduce a novel unsupervised OOD detection framework, Multi-Exit Class Activation Map (MECAM), which leverages multi-exit CAMs and feature masking. By utilizing multi-exit networks that combine CAMs from varying resolutions and depths, our method captures both global and local feature representations, thereby enhancing the robustness of OOD detection. We evaluate MECAM on multiple ID datasets, including ISIC19 and PathMNIST, and test its performance against three medical OOD datasets, RSNA Pneumonia, COVID-19, and HeadCT, and one natural image OOD dataset, iSUN. Comprehensive comparisons with state-of-the-art OOD detection methods validate the effectiveness of our approach. Our findings emphasize the potential of multi-exit networks and feature masking for advancing unsupervised OOD detection in medical imaging, paving the way for more reliable and interpretable models in clinical practice.
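The masking criterion described in this abstract can be illustrated with a toy sketch. It assumes that "inverted CAM" means multiplying each pixel by (1 - CAM) with the CAM already scaled to [0, 1], and that the feature change is measured as a Euclidean distance. The function name `masking_shift`, the identity feature extractor, and the toy values are hypothetical and are not taken from the paper, which additionally combines CAMs from several network exits.

```python
import math

def masking_shift(image, cam, extract_features):
    """Feature change caused by masking an image with its inverted CAM.

    image, cam: flat sequences of equal length; cam values assumed in [0, 1].
    extract_features: any callable mapping an image to a feature vector.
    A larger shift suggests in-distribution data under the stated criterion.
    """
    if len(image) != len(cam):
        raise ValueError("image and cam must have the same length")
    # Suppress the regions the CAM highlights (inverted-CAM masking).
    masked = [px * (1.0 - c) for px, c in zip(image, cam)]
    original_feat = extract_features(image)
    masked_feat = extract_features(masked)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original_feat, masked_feat)))

# Toy feature extractor (assumption): the pixel values themselves.
identity_features = lambda img: list(img)

image = [0.9, 0.8, 0.1, 0.1]
focused_cam = [1.0, 1.0, 0.0, 0.0]   # ID-like: activation concentrated on the object
diffuse_cam = [0.2, 0.2, 0.2, 0.2]   # OOD-like: no focused activation

shift_focused = masking_shift(image, focused_cam, identity_features)
shift_diffuse = masking_shift(image, diffuse_cam, identity_features)
```

In this toy case the focused CAM removes the informative pixels and produces the larger shift, which is the direction of the effect the abstract relies on.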

Mixed Modality Classification Methodology In Silico Benchmark SOTA

Development of a deep learning method for phase retrieval image enhancement in phase contrast microcomputed tomography.

Ding XF, Duan X, Li N, Khoz Z, Wu FX, Chen X, Zhu N

•papers•May 13 2025

Propagation-based imaging (one method of X-ray phase contrast imaging) with microcomputed tomography (PBI-µCT) offers the potential to visualise low-density materials, such as soft tissues and hydrogel constructs, which are difficult to identify with conventional absorption-based contrast µCT. Conventional µCT reconstruction produces edge-enhanced contrast (EEC) images which preserve sharp boundaries but are susceptible to noise and do not provide consistent grey value representation for the same material. Meanwhile, phase retrieval (PR) algorithms can convert edge-enhanced contrast to area contrast to improve signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) but usually result in over-smoothing, thus creating inaccuracies in quantitative analysis. To alleviate these problems, this study developed a deep learning-based method called edge view enhanced phase retrieval (EVEPR), by strategically integrating the complementary spatial features of denoised EEC and PR images, and further applied this method to segment the hydrogel constructs in vivo and ex vivo. EVEPR used paired denoised EEC and PR images to train a deep convolutional neural network (CNN) on a dataset-to-dataset basis. The CNN was trained on important high-frequency details, for example, edges and boundaries from the EEC image and area contrast from PR images. The CNN-predicted result showed enhanced area contrast beyond conventional PR algorithms while improving SNR and CNR. The enhanced CNR especially allowed for the image to be segmented with greater efficiency. EVEPR was applied to in vitro and ex vivo PBI-µCT images of low-density hydrogel constructs. The enhanced visibility and consistency of hydrogel constructs were essential for segmenting such material, which usually exhibits extremely poor contrast. The EVEPR images allowed for more accurate segmentation with reduced manual adjustments.
The efficiency in segmentation allowed for the generation of a sizeable database of segmented hydrogel scaffolds which were used in conventional data-driven segmentation applications. EVEPR was demonstrated to be a robust post-image processing method capable of significantly enhancing image quality by training a CNN on paired denoised EEC and PR images. This method not only addressed the common issues of over-smoothing and noise susceptibility in conventional PBI-µCT image processing but also allowed for efficient and accurate in vitro and ex vivo image processing applications of low-density materials.

CT Segmentation Methodology In Silico Academic Lab

Diagnosis of thyroid cartilage invasion by laryngeal and hypopharyngeal cancers based on CT with deep learning.

Takano Y, Fujima N, Nakagawa J, Dobashi H, Shimizu Y, Kanaya M, Kano S, Homma A, Kudo K

•papers•May 13 2025

To develop a convolutional neural network (CNN) model to diagnose thyroid cartilage invasion by laryngeal and hypopharyngeal cancers observed on computed tomography (CT) images and evaluate the model's diagnostic performance. We retrospectively analyzed 91 cases of laryngeal or hypopharyngeal cancer treated surgically at our hospital during the period April 2010 through May 2023, and we divided the cases into datasets for training (n = 61) and testing (n = 30). We reviewed the CT images and pathological diagnoses in all cases to determine the invasion-positive or -negative status as a ground truth. We trained the new CNN model to classify thyroid cartilage invasion-positive or -negative status from the pre-treatment axial CT images by transfer learning from Residual Network 101 (ResNet101), using the training dataset. We then used the test dataset to evaluate the model's performance. Two radiologists, one with extensive head and neck imaging experience (senior reader) and the other with less experience (junior reader), reviewed the CT images of the test dataset to determine whether thyroid cartilage invasion was present. The following were obtained by the CNN model with the test dataset: area under the curve (AUC), 0.82; accuracy, 90%; sensitivity, 80%; and specificity, 95%. The CNN model showed a significant difference in AUCs compared to the junior reader (p = 0.035) but not the senior reader (p = 0.61). The CNN-based diagnostic model can be a useful supportive tool for the assessment of thyroid cartilage invasion in patients with laryngeal or hypopharyngeal cancer.
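The reported accuracy, sensitivity, and specificity follow directly from a confusion matrix. The abstract gives only the test-set size (n = 30), not the class split, so the counts below are an assumption: 10 invasion-positive and 20 invasion-negative cases is one split that reproduces the reported percentages. The helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    accuracy = (tp + tn) / (positives + negatives)
    sensitivity = tp / positives   # true-positive rate
    specificity = tn / negatives   # true-negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts (not stated in the abstract), chosen to total 30 cases.
accuracy, sensitivity, specificity = binary_metrics(tp=8, fn=2, tn=19, fp=1)
```

With these assumed counts, 27 of 30 cases are classified correctly, 8 of 10 positives are detected, and 19 of 20 negatives are correctly excluded.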

CT Classification Retrospective Clinical In Silico Academic Lab

DEMAC-Net: A Dual-Encoder Multiattention Collaborative Network for Cervical Nerve Pathway and Adjacent Anatomical Structure Segmentation.

Cui H, Duan J, Lin L, Wu Q, Guo W, Zang Q, Zhou M, Fang W, Hu Y, Zou Z

•papers•May 13 2025

Currently, cervical anesthesia is performed using three main approaches: superficial cervical plexus block, deep cervical plexus block, and intermediate plexus nerve block. However, each technique carries inherent risks and demands significant clinical expertise. Ultrasound imaging, known for its real-time visualization capabilities and accessibility, is widely used in both diagnostic and interventional procedures. Nevertheless, accurate segmentation of small and irregularly shaped structures such as the cervical and brachial plexuses remains challenging due to image noise, complex anatomical morphology, and limited annotated training data. This study introduces DEMAC-Net, a dual-encoder, multiattention collaborative network, to significantly improve the segmentation accuracy of these neural structures. By precisely identifying the cervical nerve pathway (CNP) and adjacent anatomical tissues, DEMAC-Net aims to assist clinicians, especially those less experienced, in effectively guiding anesthesia procedures and accurately identifying optimal needle insertion points. Consequently, this improvement is expected to enhance clinical safety, reduce procedural risks, and streamline decision-making during ultrasound-guided regional anesthesia. DEMAC-Net combines a dual-encoder architecture with the Spatial Understanding Convolution Kernel (SUCK) and the Spatial-Channel Attention Module (SCAM) to extract multi-scale features effectively. Additionally, a Global Attention Gate (GAG) and inter-layer fusion modules refine relevant features while suppressing noise. A novel dataset, Neck Ultrasound Dataset (NUSD), was introduced, containing 1,500 annotated ultrasound images across seven anatomical regions. Extensive experiments were conducted on both NUSD and the BUSI public dataset, comparing DEMAC-Net to state-of-the-art models using metrics such as Dice Similarity Coefficient (DSC) and Intersection over Union (IoU).
On the NUSD dataset, DEMAC-Net achieved a mean DSC of 93.3%, outperforming existing models. For external validation on the BUSI dataset, it demonstrated superior generalization, achieving a DSC of 87.2% and a mean IoU of 77.4%, surpassing other advanced methods. Notably, DEMAC-Net displayed consistent segmentation stability across all tested structures. The proposed DEMAC-Net significantly improves segmentation accuracy for small nerves and complex anatomical structures in ultrasound images, outperforming existing methods in terms of accuracy and computational efficiency. This framework holds great potential for enhancing ultrasound-guided procedures, such as peripheral nerve blocks, by providing more precise anatomical localization, ultimately improving clinical outcomes.
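The two overlap metrics named in this abstract can be computed from a pair of binary masks as sketched below. This is a minimal illustration on flat 0/1 sequences. The function name `dice_and_iou` and the convention of returning 1.0 for two empty masks are assumptions, not details from the paper.

```python
def dice_and_iou(pred, target):
    """Dice Similarity Coefficient and Intersection over Union for binary masks.

    pred, target: flat sequences of 0/1 labels with the same length.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou

# Toy masks: 3 predicted pixels, 3 reference pixels, 2 in common.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 1, 0, 0, 0, 1]
dice, iou = dice_and_iou(pred_mask, true_mask)
```

For a single pair of masks the two metrics are tied by IoU = DSC / (2 - DSC). The identity does not hold exactly for values averaged over many images, which is why a mean DSC and a mean IoU are usually reported separately.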

Ultrasound Segmentation Methodology In Silico Academic Lab Open Dataset Benchmark SOTA

Automatic deep learning segmentation of mandibular periodontal bone topography on cone-beam computed tomography images.

Palkovics D, Molnar B, Pinter C, García-Mato D, Diaz-Pinto A, Windisch P, Ramseier CA

•papers•May 13 2025

This study evaluated the performance of a multi-stage Segmentation Residual Network (SegResNet)-based deep learning (DL) model for the automatic segmentation of cone-beam computed tomography (CBCT) images of patients with stage III and IV periodontitis. Seventy pre-processed CBCT scans from patients undergoing periodontal rehabilitation were used for training and validation. The model was tested on 10 CBCT scans independent from the training dataset by comparing results with semi-automatic (SA) segmentations. Segmentation accuracy was assessed using the Dice similarity coefficient (DSC), Intersection over Union (IoU), and Hausdorff distance 95th percentile (HD95). Linear periodontal measurements were performed on four tooth surfaces to assess the validity of the DL segmentation in the periodontal region. The DL model achieved a mean DSC of 0.9650 ± 0.0097, with an IoU of 0.9340 ± 0.0180 and HD95 of 0.4820 mm ± 0.1269 mm, showing strong agreement with SA segmentation. Linear measurements revealed high statistical correlations between the mesial, distal, and lingual surfaces, with intraclass correlation coefficients (ICC) of 0.9442 (p<0.0001), 0.9232 (p<0.0001), and 0.9598 (p<0.0001), respectively, while buccal measurements revealed lower consistency, with an ICC of 0.7481 (p<0.0001). The DL method reduced the segmentation time 47-fold compared to the SA method. Acquired 3D models may enable precise treatment planning in cases where conventional diagnostic modalities are insufficient. However, the robustness of the model must be increased to improve its general reliability and consistency at the buccal aspect of the periodontal region. This study presents a DL model for the CBCT-based segmentation of periodontal defects, demonstrating high accuracy and a 47-fold time reduction compared to SA methods, thus improving the feasibility of 3D diagnostics for advanced periodontitis.

CT Segmentation Retrospective Clinical In Silico Startup

Automated scout-image-based estimation of contrast agent dosing: a deep learning approach

Schirrmeister, R., Taleb, L., Friemel, P., Reisert, M., Bamberg, F., Weiss, J., Rau, A.

•preprint•May 12 2025

We developed and tested a deep-learning-based algorithm for the approximation of contrast agent dosage based on computed tomography (CT) scout images. We prospectively enrolled 817 patients undergoing clinically indicated CT imaging, predominantly of the thorax and/or abdomen. Patient weight was collected by study staff prior to the examination 1) with a weight scale and 2) as self-reported. Based on the scout images, we developed an EfficientNet convolutional neural network pipeline to estimate the optimal contrast agent dose based on patient weight and provide a browser-based user interface as a versatile open-source tool to account for different contrast agent compounds. We additionally analyzed the body-weight-informative CT features by synthesizing representative examples for different weights using in-context learning and dataset distillation. The cohort consisted of 533 thoracic, 70 abdominal and 229 thoracic-abdominal CT scout scans. Self-reported patient weight was statistically significantly lower than manual measurements (75.13 kg vs. 77.06 kg; p < 10^-5, Wilcoxon signed-rank test). Our pipeline predicted patient weight with a mean absolute error of 3.90 ± 0.20 kg (corresponding to a roughly 4.48 to 11.70 ml difference in contrast agent depending on the agent) in 5-fold cross-validation and is publicly available at https://tinyurl.com/ct-scout-weight. Interpretability analysis revealed that both larger anatomical shape and higher overall attenuation were predictive of body weight. Our open-source deep learning pipeline allows for the automatic estimation of accurate contrast agent dosing based on scout images in routine CT imaging studies. This approach has the potential to streamline contrast agent dosing workflows, improve efficiency, and enhance patient safety by providing quick and accurate weight estimates without additional measurements or reliance on potentially outdated records.
The model's performance may vary depending on patient positioning and scout image quality, and the approach requires validation on larger patient cohorts and other clinical centers. Author Summary: Automation of medical workflows using AI has the potential to increase reproducibility while saving costs and time. Here, we investigated automating the estimation of the required contrast agent dosage for CT examinations. We trained a deep neural network to predict the body weight from the initial 2D CT scout images that are required prior to the actual CT examination. The predicted weight is then converted to a contrast agent dosage based on contrast-agent-specific conversion factors. To facilitate application in clinical routine, we developed a user-friendly browser-based user interface that allows clinicians to select a contrast agent or input a custom conversion factor to receive dosage suggestions, with local data processing in the browser. We also investigate what image characteristics predict body weight and find plausible relationships such as higher attenuation and larger anatomical shapes correlating with higher body weights. Our work goes beyond prior work by implementing a single model for a variety of anatomical regions, providing an accessible user interface and investigating the predictive characteristics of the images.
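The weight-to-dose conversion described in this abstract is a linear scaling, and the reported numbers allow a quick consistency check. The sketch below is an illustration of that arithmetic only. The conversion factors are back-calculated here by dividing the reported volume differences (4.48 and 11.70 ml) by the reported mean absolute error (3.90 kg); they are not stated by the authors, and the function name `contrast_dose_ml` is hypothetical.

```python
MAE_KG = 3.90  # reported mean absolute error of the weight estimate

def contrast_dose_ml(weight_kg, factor_ml_per_kg):
    """Convert a body weight to a contrast volume with a linear factor."""
    if weight_kg <= 0 or factor_ml_per_kg <= 0:
        raise ValueError("weight and factor must be positive")
    return weight_kg * factor_ml_per_kg

# Factors implied by the abstract's own numbers (back-calculated assumption).
implied_factor_low = 4.48 / MAE_KG     # about 1.15 ml/kg
implied_factor_high = 11.70 / MAE_KG   # about 3.0 ml/kg

# Effect of a 3.90 kg overestimate for a patient at the mean measured weight.
measured_weight_kg = 77.06
dose_from_measured = contrast_dose_ml(measured_weight_kg, implied_factor_high)
dose_from_predicted = contrast_dose_ml(measured_weight_kg + MAE_KG, implied_factor_high)
dose_gap_ml = dose_from_predicted - dose_from_measured   # about 11.7 ml
```

Because the conversion is linear, the dose error depends only on the weight error and the factor, not on the patient's actual weight.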

CT Classification Prospective In Silico Academic Lab Open Code

LiteMIL: A Computationally Efficient Transformer-Based MIL for Cancer Subtyping on Whole Slide Images.

Kussaibi, H.

•preprint•May 12 2025

Purpose: Accurate cancer subtyping is crucial for effective treatment; however, it presents challenges due to overlapping morphology and variability among pathologists. Although deep learning (DL) methods have shown potential, their application to gigapixel whole slide images (WSIs) is often hindered by high computational demands and the need for efficient, context-aware feature aggregation. This study introduces LiteMIL, a computationally efficient transformer-based multiple instance learning (MIL) network combined with Phikon, a pathology-tuned self-supervised feature extractor, for robust and scalable cancer subtyping on WSIs. Methods: Initially, patches were extracted from the TCGA-THYM dataset (242 WSIs, six subtypes) and subsequently fed in real time to Phikon for feature extraction. To train MILs, features were arranged into uniform bags using a chunking strategy that maintains tissue context while increasing training data. LiteMIL utilizes a learnable query vector within an optimized multi-head attention module for effective feature aggregation. The model's performance was evaluated against established MIL methods on the Thymic dataset and three additional TCGA datasets (breast, lung, and kidney cancer). Results: LiteMIL achieved 0.89 ± 0.01 F1 score and 0.99 AUC on the Thymic dataset, outperforming other MILs. LiteMIL demonstrated strong generalizability across the external datasets, scoring the best on breast and kidney cancer datasets. Compared to TransMIL, LiteMIL significantly reduces training time and GPU memory usage. Ablation studies confirmed the critical role of the learnable query and layer normalization in enhancing performance and stability. Conclusion: LiteMIL offers a resource-efficient, robust solution. Its streamlined architecture, combined with the compact Phikon features, makes it suitable for integrating into routine histopathological workflows, particularly in resource-limited settings.
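The idea of aggregating a bag of patch features with a query vector can be sketched as single-head scaled dot-product attention pooling. This is a simplified illustration, not the LiteMIL module: it uses one head, omits the key and value projections and layer normalization, and the query is a fixed vector here, whereas in a trained model it would be learned. The function name `attention_pool` is hypothetical.

```python
import math

def attention_pool(query, instances):
    """Pool a bag of instance vectors into one vector using a query.

    query: vector of length d (learned in practice, fixed in this sketch).
    instances: non-empty list of vectors of length d (one per patch).
    Returns (pooled_vector, attention_weights).
    """
    dim = len(query)
    if not instances or any(len(inst) != dim for inst in instances):
        raise ValueError("instances must be non-empty and match the query size")
    scale = math.sqrt(dim)
    # Scaled dot-product score of the query against every instance.
    scores = [sum(q * x for q, x in zip(query, inst)) / scale for inst in instances]
    # Numerically stable softmax over the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of instances gives the bag-level representation.
    pooled = [sum(w * inst[d] for w, inst in zip(weights, instances)) for d in range(dim)]
    return pooled, weights

query = [1.0, 0.0]
bag = [[1.0, 0.0], [0.0, 1.0]]   # two toy patch features
pooled, weights = attention_pool(query, bag)
```

The instance aligned with the query receives the larger weight, so the pooled vector is dominated by the patches the query attends to. A bag-level classifier would then operate on the pooled vector.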

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA

Evaluating the reference accuracy of large language models in radiology: a comparative study across subspecialties.

Güneş YC, Cesur T, Çamur E

•papers•May 12 2025

This study aimed to compare six large language models (LLMs) [Chat Generative Pre-trained Transformer (ChatGPT) o1-preview, ChatGPT-4o, ChatGPT-4o with canvas, Google Gemini 1.5 Pro, Claude 3.5 Sonnet, and Claude 3 Opus] in generating radiology references, assessing accuracy, fabrication, and bibliographic completeness. In this cross-sectional observational study, 120 open-ended questions were administered across eight radiology subspecialties (neuroradiology, abdominal, musculoskeletal, thoracic, pediatric, cardiac, head and neck, and interventional radiology), with 15 questions per subspecialty. Each question prompted the LLMs to provide responses containing four references with in-text citations and complete bibliographic details (authors, title, journal, publication year/month, volume, issue, page numbers, and PubMed Identifier). References were verified using Medline, Google Scholar, the Directory of Open Access Journals, and web searches. Each bibliographic element was scored for correctness, and a composite final score [(FS): 0-36] was calculated by summing the correct elements and multiplying this by a 5-point verification score for content relevance. The FS values were then categorized into a 5-point Likert scale reference accuracy score (RAS: 0 = fabricated; 4 = fully accurate). Non-parametric tests (Kruskal-Wallis, Tamhane's T2, Wilcoxon signed-rank test with Bonferroni correction) were used for statistical comparisons. Claude 3.5 Sonnet demonstrated the highest reference accuracy, with 80.8% fully accurate references (RAS 4) and a fabrication rate of 3.1%, significantly outperforming all other models (P < 0.001). Claude 3 Opus ranked second, achieving 59.6% fully accurate references and a fabrication rate of 18.3% (P < 0.001). ChatGPT-based models (ChatGPT-4o, ChatGPT-4o with canvas, and ChatGPT o1-preview) exhibited moderate accuracy, with fabrication rates ranging from 27.7% to 52.9% and <8% fully accurate references.
Google Gemini 1.5 Pro had the lowest performance, achieving only 2.7% fully accurate references and the highest fabrication rate of 60.6% (P < 0.001). Reference accuracy also varied by subspecialty, with neuroradiology and cardiac radiology outperforming pediatric and head and neck radiology. Claude 3.5 Sonnet significantly outperformed all other models in generating verifiable radiology references, and Claude 3 Opus showed moderate performance. In contrast, ChatGPT models and Google Gemini 1.5 Pro delivered substantially lower accuracy with higher rates of fabricated references, highlighting current limitations in automated academic citation generation. The high accuracy of Claude 3.5 Sonnet can improve radiology literature reviews, research, and education with dependable references. The poor performance of other models, with high fabrication rates, risks misinformation in clinical and academic settings and highlights the need for refinement to ensure safe and effective use.

LLM Radiology Report Retrospective Clinical In Silico Academic Lab GenAI Policy

Benchmarking Radiology Report Generation From Noisy Free-Texts.

Yuan Y, Zheng Y, Qu L

•papers•May 12 2025

Automatic radiology report generation can enhance diagnostic efficiency and accuracy. However, clean open-source imaging scan-report pairs are limited in scale and variety. Moreover, the vast amount of radiological texts available online is often too noisy to be directly employed. To address this challenge, we introduce a novel task called Noisy Report Refinement (NRR), which generates radiology reports from noisy free-texts. To achieve this, we propose a report refinement pipeline that leverages large language models (LLMs) enhanced with guided self-critique and report selection strategies. To address the inability of existing radiology report generation metrics to measure cleanliness, radiological usefulness, and factual correctness across various modalities of reports in the NRR task, we introduce a new benchmark, NRRBench, for NRR evaluation. This benchmark includes two online-sourced datasets and four clinically explainable LLM-based metrics: two metrics evaluate the matching rate of radiology entities and modality-specific template attributes respectively, one metric assesses report cleanliness, and a combined metric evaluates overall NRR performance. Experiments demonstrate that guided self-critique and report selection strategies significantly improve the quality of refined reports. Additionally, our proposed metrics show a much higher correlation with the noisy rate and error count of reports than radiology report generation metrics in evaluating NRR.

Mixed Modality LLM Radiology Report Methodology In Silico Benchmark SOTA GenAI
