MMIS-Net for Retinal Fluid Segmentation and Detection

Nchongmaje Ndipenoch, Alina Miron, Kezhi Wang, Yongmin Li

arXiv preprint · Aug 19, 2025
Purpose: Deep learning methods have shown promising results in the segmentation and detection of diseases in medical images. However, most methods are trained and tested on data from a single source, modality, organ, or disease type, overlooking the combined potential of other available annotated data. Numerous small annotated medical image datasets from various modalities, organs, and diseases are publicly available. In this work, we aim to leverage the synergistic potential of these datasets to improve performance on unseen data. Approach: To this end, we propose a novel algorithm called MMIS-Net (MultiModal Medical Image Segmentation Network), which features Similarity Fusion blocks that utilize supervision and pixel-wise similarity knowledge selection for feature map fusion. Additionally, to address inconsistent class definitions and label contradictions, we created a one-hot label space to handle classes absent in one dataset but annotated in another. MMIS-Net was trained on 10 datasets encompassing 19 organs across 2 modalities to build a single model. Results: The algorithm was evaluated on the RETOUCH grand challenge hidden test set, outperforming large foundation models for medical image segmentation and other state-of-the-art algorithms. We achieved the best mean Dice score of 0.83 and an absolute volume difference of 0.035 for the fluid segmentation task, as well as a perfect Area Under the Curve of 1 for the fluid detection task. Conclusion: The quantitative results highlight the effectiveness of our proposed model, owing to the incorporation of Similarity Fusion blocks into the network's backbone for supervision and similarity knowledge selection, and the use of a one-hot label space to address label class inconsistencies and contradictions.
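
A one-hot label space of the kind described can be approximated with a per-dataset class mask, so that classes a source dataset never annotates contribute nothing to the loss. The following PyTorch sketch is only illustrative; the class names, dataset mapping, and masking rule are assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

# Unified label space across all training datasets (assumed example classes).
CLASSES = ["background", "IRF", "SRF", "PED", "optic_disc", "vessel"]

# Which classes each source dataset actually annotates (hypothetical).
DATASET_CLASSES = {
    "retouch": {"background", "IRF", "SRF", "PED"},
    "drive":   {"background", "vessel"},
}

def masked_bce_loss(logits, one_hot_target, dataset_name):
    """Binary cross-entropy over the unified one-hot label space,
    ignoring channels the source dataset never annotated."""
    mask = torch.tensor(
        [c in DATASET_CLASSES[dataset_name] for c in CLASSES],
        dtype=torch.bool, device=logits.device,
    )
    # Keep only annotated channels: (B, C, H, W) -> (B, C_kept, H, W)
    logits = logits[:, mask]
    target = one_hot_target[:, mask]
    return F.binary_cross_entropy_with_logits(logits, target)

# Toy usage: batch of 2 images, 6-channel predictions on 64x64 maps.
logits = torch.randn(2, len(CLASSES), 64, 64)
target = torch.randint(0, 2, (2, len(CLASSES), 64, 64)).float()
print(masked_bce_loss(logits, target, "retouch"))
```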

Improving Deep Learning for Accelerated MRI With Data Filtering

Kang Lin, Anselm Krainovic, Kun Wang, Reinhard Heckel

arXiv preprint · Aug 19, 2025
Deep neural networks achieve state-of-the-art results for accelerated MRI reconstruction. Most research on deep learning-based imaging focuses on improving neural network architectures trained and evaluated on fixed, homogeneous training and evaluation data. In this work, we investigate data curation strategies for improving MRI reconstruction. We assemble a large dataset of raw k-space data from 18 public sources consisting of 1.1M images and construct a diverse evaluation set comprising 48 test sets, capturing variations in anatomy, contrast, number of coils, and other key factors. We propose and study different data filtering strategies to enhance the performance of current state-of-the-art neural networks for accelerated MRI reconstruction. Our experiments show that filtering the training data leads to consistent, albeit modest, performance gains. These gains are robust across different training set sizes and accelerations, and we find that filtering is particularly beneficial when the proportion of in-distribution data in the unfiltered training set is low.
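
One way to realize a data filtering strategy of this kind is to score each training example by its similarity to a small in-distribution reference set and keep only the top-scoring fraction. The sketch below is a hedged illustration; the embedding features, cosine criterion, and keep fraction are placeholders, not the paper's actual filtering rule.

```python
import numpy as np

def filter_training_set(train_feats, reference_feats, keep_fraction=0.7):
    """Keep the training examples closest (cosine similarity) to the
    mean embedding of an in-distribution reference set.

    train_feats:     (N, D) array of per-example feature vectors
    reference_feats: (M, D) array of features from the target distribution
    """
    ref_center = reference_feats.mean(axis=0)
    ref_center /= np.linalg.norm(ref_center)
    norms = np.linalg.norm(train_feats, axis=1, keepdims=True)
    scores = (train_feats / norms) @ ref_center          # cosine similarity
    n_keep = int(len(scores) * keep_fraction)
    keep_idx = np.argsort(scores)[::-1][:n_keep]         # highest first
    return keep_idx

# Toy usage with random features standing in for k-space image embeddings.
rng = np.random.default_rng(0)
idx = filter_training_set(rng.normal(size=(1000, 128)),
                          rng.normal(size=(50, 128)))
print(len(idx), "of 1000 examples kept")
```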

Pixels Under Pressure: Exploring Fine-Tuning Paradigms for Foundation Models in High-Resolution Medical Imaging

Zahra TehraniNasab, Amar Kumar, Tal Arbel

arXiv preprint · Aug 19, 2025
Advancements in diffusion-based foundation models have improved text-to-image generation, yet most efforts have been limited to low-resolution settings. As high-resolution image synthesis becomes increasingly essential for various applications, particularly in medical imaging domains, fine-tuning emerges as a crucial mechanism for adapting these powerful pre-trained models to task-specific requirements and data distributions. In this work, we present a systematic study examining the impact of various fine-tuning techniques on image generation quality when scaling to a high resolution of 512x512 pixels. We benchmark a diverse set of fine-tuning methods, including full fine-tuning strategies and parameter-efficient fine-tuning (PEFT). We dissect how different fine-tuning methods influence key quality metrics, including Fréchet Inception Distance (FID), Vendi score, and prompt-image alignment. We also evaluate the utility of generated images in a downstream classification task under data-scarce conditions, demonstrating that specific fine-tuning strategies improve both generation fidelity and downstream performance when synthetic images are used for classifier training and evaluation on real images. Our code is accessible through the project website: https://tehraninasab.github.io/PixelUPressure/.
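
Parameter-efficient fine-tuning of the kind benchmarked here is commonly implemented as low-rank adapters (LoRA) on the attention projections of the diffusion backbone. Below is a minimal, self-contained PyTorch sketch of a LoRA-wrapped linear layer; the rank and scaling values are illustrative defaults, not the settings used in the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze pretrained weights
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # start as an identity update
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

# Toy usage: wrap one attention projection of a pretrained model.
pretrained_proj = nn.Linear(768, 768)
adapted = LoRALinear(pretrained_proj, r=8, alpha=16.0)
out = adapted(torch.randn(4, 768))
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(out.shape, f"{trainable} trainable parameters")
```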

Improving risk stratification of PI-RADS 3 + 1 lesions of the peripheral zone: expert lexicon of terms, multi-reader performance and contribution of artificial intelligence.

Glemser PA, Netzer N, Ziener CH, Wilhelm M, Hielscher T, Sun Zhang K, Görtz M, Schütz V, Stenzinger A, Hohenfellner M, Schlemmer HP, Bonekamp D

PubMed paper · Aug 19, 2025
According to PI-RADS v2.1, peripheral PI-RADS 3 lesions are upgraded to PI-RADS 4 if dynamic contrast-enhanced MRI is positive (3+1 lesions); however, these lesions are radiologically challenging. We aimed to define criteria by expert consensus, test their applicability by other radiologists for significant prostate cancer (sPC) prediction in PI-RADS 3+1 lesions, and determine their value in integrated regression models. From consecutive 3 Tesla MR examinations performed between 08/2016 and 12/2018, we identified 85 MRI examinations from 83 patients with a total of 94 PI-RADS 3+1 lesions in the official clinical report. Lesions were retrospectively assessed by expert consensus with construction of a newly devised feature catalogue, which was subsequently used by two additional radiologists specialized in prostate MRI for independent lesion assessment. With reference to extended fused targeted and systematic TRUS/MRI-biopsy histopathological correlation, relevant catalogue features were identified by univariate analysis and placed in context with typically available clinical features and automated AI image assessment using lasso-penalized logistic regression models, also focusing on the contribution of DCE imaging (feature-based, bi- and multiparametric AI-enhanced, and solely bi- and multiparametric AI-driven). The feature catalogue enabled image-based lesional risk stratification for all readers. Expert consensus provided 3 significant features in univariate analysis (adj. p-value <0.05; most relevant feature T2w configuration: "irregular/microlobulated/spiculated", OR 9.0 (95%CI 2.3-44.3); adj. p-value: 0.016). These remained after lasso-penalized regression-based feature reduction, while the only selected clinical feature was prostate volume (OR<1), enabling nomogram construction. While DCE-derived consensus features did not enhance model performance (bootstrapped AUC), there was a trend toward increased performance when including multiparametric AI, but not biparametric AI, in the models, both for combined and AI-only models. PI-RADS 3+1 lesions can be risk-stratified using lexicon terms and a key-feature nomogram. AI potentially benefits more from DCE imaging than experienced prostate radiologists do.
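
Lasso-penalized logistic regression for feature reduction, as used above, can be reproduced with scikit-learn's L1-penalized solver: features whose coefficients are driven to zero are dropped, and the rest feed a nomogram. The sketch below uses hypothetical catalogue terms and random toy data, not the study's actual features or cohort.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical lesion features: lexicon terms, clinical values, AI score.
feature_names = ["t2w_irregular_margin", "dce_focal_enhancement",
                 "adc_min", "psa_density", "prostate_volume", "ai_score"]

rng = np.random.default_rng(0)
X = rng.normal(size=(94, len(feature_names)))   # 94 lesions (toy data)
y = rng.integers(0, 2, size=94)                 # sPC yes/no (toy labels)

# The L1 (lasso) penalty drives uninformative coefficients exactly to zero.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
)
model.fit(X, y)

coefs = model.named_steps["logisticregression"].coef_.ravel()
selected = [n for n, c in zip(feature_names, coefs) if abs(c) > 1e-6]
print("features retained by the lasso:", selected)
```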

Multimodal imaging deep learning model for predicting extraprostatic extension in prostate cancer using mpMRI and 18F-PSMA-PET/CT.

Yao F, Lin H, Xue YN, Zhuang YD, Bian SY, Zhang YY, Yang YJ, Pan KH

PubMed paper · Aug 19, 2025
This study aimed to construct a multimodal imaging deep learning (DL) model integrating mpMRI and 18F-PSMA-PET/CT for the prediction of extraprostatic extension (EPE) in prostate cancer, and to assess its effectiveness in enhancing the diagnostic accuracy of radiologists. Clinical and imaging data were retrospectively collected from patients with pathologically confirmed prostate cancer (PCa) who underwent radical prostatectomy (RP). Data were collected from a primary institution (Center 1, n = 197) between January 2019 and June 2022 and an external institution (Center 2, n = 36) between July 2021 and November 2022. A multimodal DL model incorporating mpMRI and 18F-PSMA-PET/CT was developed to support radiologists in assessing EPE using the EPE-grade scoring system. The predictive performance of the DL model was compared with that of single-modality models, as well as with radiologist assessments with and without model assistance. Clinical net benefit of the model was also assessed. For patients in Center 1, the area under the curve (AUC) for predicting EPE was 0.76 (0.72-0.80), 0.77 (0.70-0.82), and 0.82 (0.78-0.87) for the mpMRI-based DL model, the PET/CT-based DL model, and the combined mpMRI + PET/CT multimodal DL model, respectively. In the external test set (Center 2), the AUCs for these models were 0.75 (0.60-0.88), 0.77 (0.72-0.88), and 0.81 (0.63-0.97), respectively. The multimodal DL model demonstrated superior predictive accuracy compared to single-modality models in both internal and external validations. The deep learning-assisted EPE-grade scoring model significantly improved AUC and sensitivity compared to radiologist EPE-grade scoring alone (P < 0.05), with a modest reduction in specificity. Additionally, the deep learning-assisted scoring model provided greater clinical net benefit than the EPE-grade score used by radiologists alone. The multimodal imaging deep learning model, integrating mpMRI and 18F-PSMA-PET/CT, demonstrates promising predictive performance for EPE in prostate cancer and enhances the accuracy of radiologists in EPE assessment. The model holds potential as a supportive tool for more individualized and precise therapeutic decision-making.
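
A common way to combine two imaging streams in a multimodal model like this is late fusion: per-modality encoders produce feature vectors that are concatenated before a small classification head. The abstract does not specify the fusion mechanism, so the PyTorch sketch below is only an illustrative stand-in with assumed feature dimensions.

```python
import torch
import torch.nn as nn

class LateFusionEPEClassifier(nn.Module):
    """Concatenate mpMRI and PET/CT feature vectors and predict EPE."""

    def __init__(self, mri_dim=512, pet_dim=512, hidden=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(mri_dim + pet_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, 1),          # logit for EPE present / absent
        )

    def forward(self, mri_feat, pet_feat):
        fused = torch.cat([mri_feat, pet_feat], dim=1)
        return self.head(fused).squeeze(1)

# Toy usage: features from two hypothetical per-modality encoders.
model = LateFusionEPEClassifier()
logits = model(torch.randn(8, 512), torch.randn(8, 512))
probs = torch.sigmoid(logits)
print(probs.shape)   # one EPE probability per patient
```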

Application of deep learning reconstruction at prone position chest scanning of early interstitial lung disease.

Zhao R, Wang Y, Wang J, Wang Z, Xiao R, Ming Y, Piao S, Wang J, Song L, Xu Y, Ma Z, Fan P, Sui X, Song W

PubMed paper · Aug 19, 2025
Timely intervention in interstitial lung disease (ILD) is promising for attenuating lung function decline and improving clinical outcomes. Prone-position HRCT is essential for the early diagnosis of ILD but is limited by its high radiation exposure. This study aimed to explore whether deep learning reconstruction (DLR) could preserve image quality while reducing the radiation dose compared with hybrid iterative reconstruction (HIR) in prone-position scanning for patients with early-stage ILD. This study prospectively enrolled 21 patients with early-stage ILD. All patients underwent high-resolution CT (HRCT) and low-dose CT (LDCT) scans. HRCT images were reconstructed with HIR using standard settings, and LDCT images were reconstructed with DLR (lung/bone kernel) in a mild, standard, or strong setting. Overall image quality, image noise, streak artifacts, and visualization of normal and abnormal ILD features were analysed. The effective dose of LDCT was 1.22 ± 0.09 mSv, 63.7% less than the HRCT dose. The objective noise of the LDCT DLR images was 35.9-112.6% that of the HRCT HIR images. The LDCT DLR was comparable to the HRCT HIR in terms of overall image quality. LDCT DLR (bone, strong) visualization of bronchiectasis and/or bronchiolectasis was significantly weaker than that of HRCT HIR (p = 0.046). The LDCT DLR (all settings) did not differ significantly from the HRCT HIR in the evaluation of other abnormal features, including ground glass opacities (GGOs), architectural distortion, reticulation, and honeycombing. With a 63.7% reduction in radiation dose, the overall image quality of LDCT DLR was comparable to that of HRCT HIR in prone scanning for patients with early ILD. This study supports DLR as a promising approach for maintaining image quality at a lower radiation dose in prone scanning, and it offers valuable insights for selecting image reconstruction algorithms for the diagnosis and follow-up of early ILD.
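
Objective noise comparisons of this kind are typically the standard deviation of HU values in a homogeneous region of interest, and dose reduction is the relative drop in effective dose. A small numpy sketch of both calculations follows; the ROI placement is an assumption, and the HRCT dose below is back-calculated from the reported 63.7% reduction rather than taken from the paper.

```python
import numpy as np

def roi_noise(ct_slice_hu, center, size=20):
    """Objective image noise: SD of HU values in a square ROI placed in a
    homogeneous region (e.g., the descending aorta)."""
    r, c = center
    h = size // 2
    roi = ct_slice_hu[r - h:r + h, c - h:c + h]
    return float(roi.std())

def dose_reduction(hrct_msv, ldct_msv):
    """Relative reduction in effective dose, in percent."""
    return 100.0 * (hrct_msv - ldct_msv) / hrct_msv

# Toy usage with a synthetic slice; 3.36 mSv is an inferred HRCT dose.
slice_hu = np.random.normal(loc=40.0, scale=12.0, size=(512, 512))
print(f"ROI noise: {roi_noise(slice_hu, (256, 256)):.1f} HU")
print(f"Dose reduction: {dose_reduction(3.36, 1.22):.1f}%")
```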

Lung adenocarcinoma subtype classification based on contrastive learning model with multimodal integration.

Wang C, Liu L, Fan C, Zhang Y, Mai Z, Li L, Liu Z, Tian Y, Hu J, Elazab A

PubMed paper · Aug 19, 2025
Accurately identifying the stages of lung adenocarcinoma is essential for selecting the most appropriate treatment plans. Nonetheless, this task is complicated by challenges such as integrating diverse data, similarities among subtypes, and the need to capture contextual features, making precise differentiation difficult. We address these challenges and propose a multimodal deep neural network that integrates computed tomography (CT) images, annotated lesion bounding boxes, and electronic health records. Our model first combines bounding boxes carrying precise lesion location data with CT scans, generating a richer semantic representation through feature extraction from regions of interest to enhance localization accuracy using a vision transformer module. Beyond imaging data, the model also incorporates clinical information encoded using a fully connected encoder. Features extracted from both CT and clinical data are optimized for cosine similarity using a contrastive language-image pre-training module, ensuring they are cohesively integrated. In addition, we introduce an attention-based feature fusion module that harmonizes these features into a unified representation, further fusing features from the different modalities. This integrated feature set is then fed into a classifier that effectively distinguishes among the three types of adenocarcinomas. Finally, we employ focal loss to mitigate the effects of unbalanced classes and a contrastive learning loss to enhance feature representation and improve the model's performance. Our experiments on public and proprietary datasets demonstrate the efficiency of our model, achieving a superior validation accuracy of 81.42% and an area under the curve of 0.9120. These results significantly outperform recent multimodal classification approaches. The code is available at https://github.com/fancccc/LungCancerDC.
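
The cosine-similarity alignment between imaging and clinical features and the focal loss for class imbalance can both be sketched compactly in PyTorch. The temperature and gamma values below are conventional defaults and the feature dimensions are arbitrary; this is an illustration of the two loss terms, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def clip_style_contrastive_loss(img_feats, ehr_feats, temperature=0.07):
    """Symmetric InfoNCE loss that pulls matched CT/EHR pairs together
    in cosine-similarity space (CLIP-style alignment)."""
    img = F.normalize(img_feats, dim=1)
    ehr = F.normalize(ehr_feats, dim=1)
    logits = img @ ehr.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def focal_loss(logits, labels, gamma=2.0):
    """Focal loss for the 3-way subtype classifier, down-weighting easy
    examples to cope with class imbalance."""
    log_probs = F.log_softmax(logits, dim=1)
    pt = log_probs.exp().gather(1, labels.unsqueeze(1)).squeeze(1)
    ce = -log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    return ((1.0 - pt) ** gamma * ce).mean()

# Toy usage on a batch of 8 cases with 3 subtype classes.
img_f, ehr_f = torch.randn(8, 256), torch.randn(8, 256)
logits, labels = torch.randn(8, 3), torch.randint(0, 3, (8,))
print(clip_style_contrastive_loss(img_f, ehr_f), focal_loss(logits, labels))
```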

A Multimodal Large Language Model as an End-to-End Classifier of Thyroid Nodule Malignancy Risk: Usability Study.

Sng GGR, Xiang Y, Lim DYZ, Tung JYM, Tan JH, Chng CL

PubMed paper · Aug 19, 2025
Thyroid nodules are common, with ultrasound imaging as the primary modality for their assessment. Risk stratification systems like the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) have been developed but suffer from interobserver variability and low specificity. Artificial intelligence, particularly large language models (LLMs) with multimodal capabilities, presents opportunities for efficient end-to-end diagnostic processes. However, their clinical utility remains uncertain. This study evaluates the accuracy and consistency of multimodal LLMs for thyroid nodule risk stratification using the ACR TI-RADS system, examining the effects of model fine-tuning, image annotation, and prompt engineering, and comparing open-source versus commercial models. In total, 3 multimodal vision-language models were evaluated: Microsoft's open-source Large Language and Vision Assistant (LLaVA) model, its medically fine-tuned variant (Large Language and Vision Assistant for bioMedicine [LLaVA-Med]), and OpenAI's commercial o3 model. A total of 192 thyroid nodules from publicly available ultrasound image datasets were assessed. Each model was evaluated using 2 prompts (basic and modified) and 2 image scenarios (unlabeled vs radiologist-annotated), yielding 6912 responses. Model outputs were compared with expert ratings for accuracy and consistency. Statistical comparisons included Chi-square tests, Mann-Whitney U tests, and Fleiss' kappa for interrater reliability. Overall, 88.4% (6110/6912) of responses were valid, with the o3 model producing the highest validity rate (2273/2304, 98.6%), followed by LLaVA (2108/2304, 91.5%) and LLaVA-Med (1729/2304, 75.0%; P<.001). The o3 model demonstrated the highest accuracy overall, achieving up to 57.3% accuracy in Thyroid Imaging Reporting and Data System (TI-RADS) classification, although this still remained suboptimal. Labeled images improved accuracy marginally in nodule margin assessment only when evaluating LLaVA models (407/768, 53.0% vs 447/768, 58.2%; P=.04). Prompt engineering improved accuracy for composition (649/1152, 56.3% vs 483/1152, 41.9%; P<.001), but significantly reduced accuracy for shape, margins, and overall classification. Consistency was highest with the o3 model (up to 85.4%), was comparable for LLaVA, and improved significantly with image labeling and modified prompts across multiple TI-RADS categories (P<.001). Subgroup analysis for o3 alone showed that prompt engineering did not significantly affect accuracy but markedly improved consistency across all TI-RADS categories (up to 97.1% for shape, P<.001). Interrater reliability was consistently poor across all combinations (Fleiss' kappa<0.60). The study demonstrates the comparative advantages and limitations of multimodal LLMs for thyroid nodule risk stratification. While the commercial model (o3) consistently outperformed the open-source models in accuracy and consistency, even the best-performing model's outputs remained suboptimal for direct clinical deployment. Prompt engineering significantly enhanced output consistency, particularly in the commercial model. These findings underline the importance of strategic model optimization techniques and highlight areas requiring further development before multimodal LLMs can be reliably used in clinical thyroid imaging workflows.
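
Interrater reliability of the kind reported above (Fleiss' kappa across raters or repeated model runs) can be computed with statsmodels once the categorical ratings are tabulated per nodule. The sketch below uses made-up ratings purely to show the mechanics.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical TI-RADS categories (1-5) assigned to 6 nodules by 3 raters
# (or by 3 repeated runs of the same model).
ratings = np.array([
    [3, 3, 4],
    [2, 2, 2],
    [5, 4, 5],
    [3, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
])

# aggregate_raters converts subject-by-rater labels into the
# subject-by-category count table expected by fleiss_kappa.
counts, _categories = aggregate_raters(ratings)
print(f"Fleiss' kappa = {fleiss_kappa(counts):.2f}")
```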

Machine Learning in Venous Thromboembolism - Why and What Next?

Gurumurthy G, Kisiel F, Reynolds L, Thomas W, Othman M, Arachchillage DJ, Thachil J

PubMed paper · Aug 19, 2025
Venous thromboembolism (VTE) remains a leading cause of cardiovascular morbidity and mortality, despite advances in imaging and anticoagulation. VTE arises from diverse and overlapping risk factors, such as inherited thrombophilia, immobility, malignancy, surgery or trauma, pregnancy, hormonal therapy, obesity, chronic medical conditions (e.g., heart failure, inflammatory disease), and advancing age. Clinicians, therefore, face challenges in balancing the benefits of thromboprophylaxis against the bleeding risk. Existing clinical risk scores often exhibit only modest discrimination and calibration across heterogeneous patient populations. Machine learning (ML) has emerged as a promising tool to address these limitations. In imaging, convolutional neural networks and hybrid algorithms can detect VTE on CT pulmonary angiography with areas under the curve (AUCs) of 0.85 to 0.96. In surgical cohorts, gradient-boosting models outperform traditional risk scores, achieving AUCs between 0.70 and 0.80 in predicting postoperative VTE. In cancer-associated venous thrombosis, advanced ML models demonstrate AUCs between 0.68 and 0.82. However, concerns about bias and external validation persist. Bleeding risk prediction models remain challenging in extended anticoagulation settings, often only matching conventional models. Neural networks for predicting recurrent VTE showed AUCs of 0.93 to 0.99 in initial studies, but these models lack transparency and prospective validation. Most ML models suffer from limited external validation, "black box" algorithms, and integration hurdles within clinical workflows. Future efforts should focus on standardized reporting (e.g., Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis [TRIPOD]-ML), transparent model interpretation, prospective impact assessments, and seamless incorporation into electronic health records to realize the full potential of ML in VTE.

LGFFM: A Localized and Globalized Frequency Fusion Model for Ultrasound Image Segmentation.

Luo X, Wang Y, Ou-Yang L

PubMed paper · Aug 19, 2025
Accurate segmentation of ultrasound images plays a critical role in disease screening and diagnosis. Recently, neural network-based methods have garnered significant attention for their potential in improving ultrasound image segmentation. However, these methods still face significant challenges, primarily due to inherent issues in ultrasound images, such as low resolution, speckle noise, and artifacts. Additionally, ultrasound image segmentation encompasses a wide range of scenarios, including organ segmentation (e.g., cardiac and fetal head) and lesion segmentation (e.g., breast cancer and thyroid nodules), making the task highly diverse and complex. Existing methods are often designed for specific segmentation scenarios, which limits their flexibility and ability to meet the diverse needs across various scenarios. To address these challenges, we propose a novel Localized and Globalized Frequency Fusion Model (LGFFM) for ultrasound image segmentation. Specifically, we first design a Parallel Bi-Encoder (PBE) architecture that integrates Local Feature Blocks (LFB) and Global Feature Blocks (GLB) to enhance feature extraction. Additionally, we introduce a Frequency Domain Mapping Module (FDMM) to capture texture information, particularly high-frequency details such as edges. Finally, a Multi-Domain Fusion (MDF) method is developed to effectively integrate features across different domains. We conduct extensive experiments on eight representative public ultrasound datasets across four different types. The results demonstrate that LGFFM outperforms current state-of-the-art methods in both segmentation accuracy and generalization performance.
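
A frequency-domain mapping of the sort described can be built from a 2D FFT that isolates high-frequency content (edges, fine texture) before it is fused back with spatial features. The PyTorch sketch below is a generic high-pass filter in the Fourier domain, not the authors' exact FDMM design; the cutoff ratio is an assumed hyperparameter.

```python
import torch

def high_frequency_map(image, cutoff_ratio=0.1):
    """Suppress low frequencies around the spectrum centre and return the
    high-frequency component, which emphasises edges and fine texture.

    image: (B, 1, H, W) ultrasound tensor."""
    B, C, H, W = image.shape
    spectrum = torch.fft.fftshift(torch.fft.fft2(image), dim=(-2, -1))

    # Build a centred low-frequency mask and zero it out (high-pass filter).
    yy, xx = torch.meshgrid(
        torch.arange(H, device=image.device),
        torch.arange(W, device=image.device),
        indexing="ij",
    )
    dist = ((yy - H / 2) ** 2 + (xx - W / 2) ** 2).sqrt()
    high_pass = (dist > cutoff_ratio * min(H, W)).float()
    filtered = spectrum * high_pass

    out = torch.fft.ifft2(torch.fft.ifftshift(filtered, dim=(-2, -1)))
    return out.real

# Toy usage on a random "ultrasound" batch.
x = torch.rand(2, 1, 256, 256)
print(high_frequency_map(x).shape)
```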