
Rethinking boundary detection in deep learning-based medical image segmentation.

Lin Y, Zhang D, Fang X, Chen Y, Cheng KT, Chen H

PubMed · Jul 1, 2025
Medical image segmentation is a pivotal task within the realms of medical image analysis and computer vision. While current methods have shown promise in accurately segmenting major regions of interest, the precise segmentation of boundary areas remains challenging. In this study, we propose a novel network architecture named CTO, which combines Convolutional Neural Networks (CNNs), Vision Transformer (ViT) models, and explicit edge detection operators to tackle this challenge. CTO surpasses existing methods in terms of segmentation accuracy and strikes a better balance between accuracy and efficiency, without the need for additional data inputs or label injections. Specifically, CTO adheres to the canonical encoder-decoder network paradigm, with a dual-stream encoder network comprising a mainstream CNN stream for capturing local features and an auxiliary StitchViT stream for integrating long-range dependencies. Furthermore, to enhance the model's ability to learn boundary areas, we introduce a boundary-guided decoder network that employs binary boundary masks generated by dedicated edge detection operators to provide explicit guidance during the decoding process. We validate the performance of CTO through extensive experiments conducted on seven challenging medical image segmentation datasets, namely ISIC 2016, PH2, ISIC 2018, CoNIC, LiTS17, BraTS, and BTCV. Our experimental results unequivocally demonstrate that CTO achieves state-of-the-art accuracy on these datasets while maintaining competitive model complexity. The code has been released at: CTO.
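
As a concrete illustration of the boundary-mask idea, the sketch below applies a fixed Sobel operator to a predicted foreground probability map and thresholds the gradient magnitude into a binary boundary mask. The choice of Sobel and the threshold value are illustrative assumptions; the abstract only specifies "dedicated edge detection operators", and this is not the released CTO code.

```python
# Hypothetical sketch: Sobel-based binary boundary masks from a
# foreground probability map. Operator choice and threshold are
# assumptions, not the CTO implementation.
import torch
import torch.nn.functional as F

def sobel_boundary_mask(prob_map: torch.Tensor, thresh: float = 0.1) -> torch.Tensor:
    """prob_map: (B, 1, H, W) foreground probabilities in [0, 1]."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    kernels = torch.stack([kx, ky]).unsqueeze(1)             # (2, 1, 3, 3)
    grad = F.conv2d(prob_map, kernels.to(prob_map), padding=1)
    magnitude = grad.pow(2).sum(dim=1, keepdim=True).sqrt()  # (B, 1, H, W)
    return (magnitude > thresh).float()                      # binary boundary mask
```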

Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation.

Silva-Rodríguez J, Dolz J, Ben Ayed I

PubMed · Jul 1, 2025
The recent popularity of foundation models and the pre-train-and-adapt paradigm, where a large-scale model is transferred to downstream tasks, is gaining attention for volumetric medical image segmentation. However, current strategies devoted to full fine-tuning may require significant resources and yield sub-optimal results when labeled data for the target task is scarce. This limits applicability in real clinical settings, since institutions are usually constrained in the data and computational resources needed to develop proprietary solutions. To address this challenge, we formalize Few-Shot Efficient Fine-Tuning (FSEFT), a novel and realistic scenario for adapting medical image segmentation foundation models. This setting considers the key role of both data- and parameter-efficiency during adaptation. Building on a foundation model pre-trained on open-access CT organ segmentation sources, we propose leveraging Parameter-Efficient Fine-Tuning and black-box Adapters to address such challenges. Furthermore, novel efficient adaptation methodologies are introduced in this work, including Spatial black-box Adapters, which are more appropriate for dense prediction tasks, and constrained transductive inference, which leverages task-specific prior knowledge. Our comprehensive transfer learning experiments confirm the suitability of foundation models in medical image segmentation and unveil the limitations of popular fine-tuning strategies in few-shot scenarios. The project code is available at: https://github.com/jusiro/fewshot-finetuning.
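
To make the parameter-efficient idea concrete, here is a minimal sketch of a spatial adapter: the pre-trained backbone is frozen and only small residual bottleneck convolutions are trained. The module shape, 3D convolutions, and zero-initialization are illustrative assumptions, not the FSEFT implementation.

```python
# Minimal sketch of a parameter-efficient spatial adapter; sizes and
# initialization are assumptions for illustration.
import torch
import torch.nn as nn

class SpatialAdapter(nn.Module):
    """Residual 1x1 bottleneck adapter for dense (per-voxel) features."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        hidden = max(channels // reduction, 4)
        self.down = nn.Conv3d(channels, hidden, kernel_size=1)
        self.act = nn.GELU()
        self.up = nn.Conv3d(hidden, channels, kernel_size=1)
        nn.init.zeros_(self.up.weight)  # adapter starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

# Usage idea: freeze the pre-trained backbone and train only the adapters.
# for p in backbone.parameters():
#     p.requires_grad = False
```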

CALIMAR-GAN: An unpaired mask-guided attention network for metal artifact reduction in CT scans.

Scardigno RM, Brunetti A, Marvulli PM, Carli R, Dotoli M, Bevilacqua V, Buongiorno D

PubMed · Jul 1, 2025
High-quality computed tomography (CT) scans are essential for accurate diagnostic and therapeutic decisions, but the presence of metal objects within the body can produce distortions that lower image quality. Deep learning (DL) approaches using image-to-image translation for metal artifact reduction (MAR) show promise over traditional methods but often introduce secondary artifacts. Additionally, most rely on paired simulated data due to the limited availability of real paired clinical data, restricting evaluation on clinical scans to qualitative analysis. This work presents CALIMAR-GAN, a generative adversarial network (GAN) model that employs a guided attention mechanism and the linear interpolation algorithm for targeted artifact reduction, trained on unpaired simulated and clinical data. Quantitative evaluations on simulated images demonstrated superior performance, achieving a PSNR of 31.7, SSIM of 0.877, and Fréchet inception distance (FID) of 22.1, outperforming state-of-the-art methods. On real clinical images, CALIMAR-GAN achieved the lowest FID (32.7), validated as a valuable complement to qualitative assessments through correlation with pixel-based metrics (r=-0.797 with PSNR, p<0.01; r=-0.767 with MS-SSIM, p<0.01). This work advances DL-based artifact reduction into clinical practice with high-fidelity reconstructions that enhance diagnostic accuracy and therapeutic outcomes. Code is available at https://github.com/roberto722/calimar-gan.
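
For readers unfamiliar with the linear interpolation step the abstract references, a hedged sketch of the classic LI-MAR baseline follows: for each projection angle, sinogram bins flagged as metal are replaced by linear interpolation from the surrounding clean bins. Variable names are illustrative, not taken from the CALIMAR-GAN code.

```python
# Sketch of classic linear-interpolation (LI) metal artifact reduction
# on a sinogram; an assumed baseline, not the CALIMAR-GAN pipeline.
import numpy as np

def li_mar(sinogram: np.ndarray, metal_trace: np.ndarray) -> np.ndarray:
    """sinogram: (angles, detectors); metal_trace: same shape, boolean."""
    corrected = sinogram.copy()
    bins = np.arange(sinogram.shape[1])
    for a in range(sinogram.shape[0]):
        bad = metal_trace[a]
        if bad.any() and not bad.all():
            # Fill metal-corrupted bins from their clean neighbours.
            corrected[a, bad] = np.interp(bins[bad], bins[~bad], sinogram[a, ~bad])
    return corrected
```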

MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis

Jianhao Xie, Ziang Zhang, Zhenyu Weng, Yuesheng Zhu, Guibo Luo

arXiv preprint · Jul 1, 2025
Recent advancements in deep learning for medical image segmentation are often limited by the scarcity of high-quality training data. While diffusion models provide a potential solution by generating synthetic images, their effectiveness in medical imaging remains constrained due to their reliance on large-scale medical datasets and the need for higher image quality. To address these challenges, we present MedDiff-FT, a controllable medical image generation method that fine-tunes a diffusion foundation model to produce medical images with structural dependency and domain specificity in a data-efficient manner. During inference, a dynamic adaptive guiding mask enforces spatial constraints to ensure anatomically coherent synthesis, while a lightweight stochastic mask generator enhances diversity through hierarchical randomness injection. Additionally, an automated quality assessment protocol filters suboptimal outputs using feature-space metrics, followed by mask corrosion to refine fidelity. Evaluated on five medical segmentation datasets, MedDiff-FT's synthetic image-mask pairs improve a state-of-the-art method's segmentation performance by an average of 1% in Dice score. The framework effectively balances generation quality, diversity, and computational efficiency, offering a practical solution for medical data augmentation. The code is available at https://github.com/JianhaoXie1/MedDiff-FT.
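
The "mask corrosion" step can be read as morphological erosion of the generated binary masks; a small sketch under that reading follows, with the structuring element and iteration count as assumptions for illustration.

```python
# Sketch of mask corrosion as binary morphological erosion; the
# structuring element and iteration count are illustrative assumptions.
import numpy as np
from scipy.ndimage import binary_erosion

def corrode_mask(mask: np.ndarray, iterations: int = 2) -> np.ndarray:
    """Shrink a binary mask to trim unreliable boundary pixels."""
    structure = np.ones((3, 3), dtype=bool)
    return binary_erosion(mask.astype(bool), structure=structure,
                          iterations=iterations)
```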

Synthetic Versus Classic Data Augmentation: Impacts on Breast Ultrasound Image Classification.

Medghalchi Y, Zakariaei N, Rahmim A, Hacihaliloglu I

PubMed · Jul 1, 2025
The effectiveness of deep neural networks (DNNs) for ultrasound image analysis depends on the availability and accuracy of the training data. However, large-scale data collection and annotation, particularly in medical fields, are often costly and time-consuming, especially when healthcare professionals are already burdened with their clinical responsibilities. Ensuring that a model remains robust across different imaging conditions, such as variations in ultrasound devices and manual transducer operation, is crucial in ultrasound image analysis. Data augmentation is a widely used solution, as it increases both the size and diversity of datasets, thereby enhancing the generalization performance of DNNs. With the advent of generative networks such as generative adversarial networks (GANs) and diffusion-based models, synthetic data generation has emerged as a promising augmentation technique. However, comprehensive studies comparing classic and generative augmentation methods are lacking, particularly in ultrasound-based breast cancer imaging, where variability in breast density, tumor morphology, and operator skill poses significant challenges. This study aims to compare the effectiveness of classic and generative network-based data augmentation techniques in improving the performance and robustness of breast ultrasound image classification models. Specifically, we seek to determine whether the computational intensity of generative networks is justified in data augmentation. This analysis will provide valuable insights into the role and benefits of each technique in enhancing the diagnostic accuracy of DNNs for breast cancer diagnosis. The code for this work will be available at: https://github.com/yasamin-med/SCDA.git.
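
For context, a typical "classic" augmentation pipeline of the kind being compared against generative augmentation might look like the following; the specific transforms and parameters are assumptions, since the study's exact configuration is not given in the abstract.

```python
# Illustrative classic augmentation pipeline for ultrasound images;
# transform choices and parameters are assumptions, not the study's setup.
from torchvision import transforms

classic_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])
```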

ADAptation: Reconstruction-based Unsupervised Active Learning for Breast Ultrasound Diagnosis

Yaofei Duan, Yuhao Huang, Xin Yang, Luyi Han, Xinyu Xie, Zhiyuan Zhu, Ping He, Ka-Hou Chan, Ligang Cui, Sio-Kei Im, Dong Ni, Tao Tan

arXiv preprint · Jul 1, 2025
Deep learning-based diagnostic models often suffer performance drops due to distribution shifts between training (source) and test (target) domains. Collecting and labeling sufficient target-domain data for model retraining represents an optimal solution but is limited by time and scarce resources. Active learning (AL) offers an efficient approach to reduce annotation costs while maintaining performance, but struggles to handle the challenge posed by distribution variations across different datasets. In this study, we propose a novel unsupervised Active learning framework for Domain Adaptation, named ADAptation, which efficiently selects informative samples from multi-domain data pools under a limited annotation budget. As a fundamental step, our method first utilizes the distribution homogenization capabilities of diffusion models to bridge cross-dataset gaps by translating target images into the source-domain style. We then introduce two key innovations: (a) a hypersphere-constrained contrastive learning network for compact feature clustering, and (b) a dual-scoring mechanism that quantifies and balances sample uncertainty and representativeness. Extensive experiments on four breast ultrasound datasets (three public and one in-house, multi-center) across five common deep classifiers demonstrate that our method surpasses existing strong AL-based competitors, validating its effectiveness and generalization for clinical domain adaptation. The code is available at the anonymized link: https://github.com/miccai25-966/ADAptation.
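
One plausible reading of the dual-scoring mechanism in (b) is sketched below: uncertainty from predictive entropy and representativeness from proximity to a feature-space cluster centre, traded off by a weight alpha. This is an illustrative interpretation of the abstract, not the ADAptation code.

```python
# Hedged sketch of a dual-scoring active-learning criterion; the exact
# scores and their combination in ADAptation are not specified here.
import numpy as np

def dual_score(probs: np.ndarray, feats: np.ndarray,
               centres: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """probs: (N, C) softmax outputs; feats: (N, D); centres: (K, D)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)   # uncertainty
    dists = np.linalg.norm(feats[:, None, :] - centres[None], axis=-1)
    representativeness = -dists.min(axis=1)                  # near a centre
    # In practice the two terms would be normalized to comparable scales.
    return alpha * entropy + (1 - alpha) * representativeness

# Selection idea: annotate the B highest-scoring samples.
# chosen = np.argsort(dual_score(probs, feats, centres))[-B:]
```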

Mind the Detail: Uncovering Clinically Relevant Image Details in Accelerated MRI with Semantically Diverse Reconstructions

Jan Nikolas Morshuis, Christian Schlarmann, Thomas Küstner, Christian F. Baumgartner, Matthias Hein

arXiv preprint · Jul 1, 2025
In recent years, accelerated MRI reconstruction based on deep learning has led to significant improvements in image quality with impressive results for high acceleration factors. However, from a clinical perspective image quality is only secondary; much more important is that all clinically relevant information is preserved in the reconstruction from heavily undersampled data. In this paper, we show that existing techniques, even when considering resampling for diffusion-based reconstruction, can fail to reconstruct small and rare pathologies, thus leading to potentially wrong diagnosis decisions (false negatives). To uncover the potentially missing clinical information we propose "Semantically Diverse Reconstructions" (SDR), a method which, given an original reconstruction, generates novel reconstructions with enhanced semantic variability while all of them are fully consistent with the measured data. To evaluate SDR automatically we train an object detector on the fastMRI+ dataset. We show that SDR significantly reduces the chance of false-negative diagnoses (higher recall) and improves mean average precision compared to the original reconstructions. The code is available on https://github.com/NikolasMorshuis/SDR
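
The requirement that every reconstruction be "fully consistent with the measured data" is typically enforced with a hard data-consistency step: sampled k-space locations are overwritten with the acquired values. The sketch below shows this standard formulation as an assumption; it is not taken from the SDR repository.

```python
# Standard hard data-consistency projection for undersampled MRI;
# assumed formulation, not the SDR code.
import torch

def data_consistency(recon: torch.Tensor, kspace_meas: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """recon: (H, W) complex image; kspace_meas, mask: (H, W)."""
    k = torch.fft.fft2(recon)
    k = torch.where(mask.bool(), kspace_meas, k)  # keep measured samples
    return torch.fft.ifft2(k)  # complex image; take .abs() or .real as needed
```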

Unsupervised Cardiac Video Translation Via Motion Feature Guided Diffusion Model

Swakshar Deb, Nian Wu, Frederick H. Epstein, Miaomiao Zhang

arXiv preprint · Jul 1, 2025
This paper presents a novel motion feature guided diffusion model for unpaired video-to-video translation (MFD-V2V), designed to synthesize dynamic, high-contrast cine cardiac magnetic resonance (CMR) from lower-contrast, artifact-prone displacement encoding with stimulated echoes (DENSE) CMR sequences. To achieve this, we first introduce a Latent Temporal Multi-Attention (LTMA) registration network that effectively learns more accurate and consistent cardiac motions from cine CMR image videos. A multi-level motion feature guided diffusion model, equipped with a specialized Spatio-Temporal Motion Encoder (STME) to extract fine-grained motion conditioning, is then developed to improve synthesis quality and fidelity. We evaluate our method, MFD-V2V, on a comprehensive cardiac dataset, demonstrating superior performance over the state-of-the-art in both quantitative metrics and qualitative assessments. Furthermore, we show the benefits of our synthesized cine CMRs improving downstream clinical and analytical tasks, underscoring the broader impact of our approach. Our code is publicly available at https://github.com/SwaksharDeb/MFD-V2V.
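
Very schematically, motion-feature guidance can be read as feeding extracted motion features to the denoiser alongside the noisy frame; the sketch below illustrates that reading with simple concatenation-based conditioning. The shapes and the conditioning scheme are assumptions, not the actual LTMA/STME design.

```python
# Schematic motion-conditioned denoiser; channel counts and the
# concatenation scheme are illustrative assumptions.
import torch
import torch.nn as nn

class MotionConditionedDenoiser(nn.Module):
    def __init__(self, img_ch: int = 1, motion_ch: int = 16, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(img_ch + motion_ch, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, img_ch, 3, padding=1),
        )

    def forward(self, noisy_frame, motion_feats):
        # motion_feats: (B, motion_ch, H, W) from a motion encoder
        return self.net(torch.cat([noisy_frame, motion_feats], dim=1))
```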

Multi-site, multi-vendor development and validation of a deep learning model for liver stiffness prediction using abdominal biparametric MRI.

Ali R, Li H, Zhang H, Pan W, Reeder SB, Harris D, Masch W, Aslam A, Shanbhogue K, Bernieh A, Ranganathan S, Parikh N, Dillman JR, He L

PubMed · Jul 1, 2025
Chronic liver disease (CLD) is a substantial cause of morbidity and mortality worldwide. Liver stiffness, as measured by MR elastography (MRE), is well accepted as a surrogate marker of liver fibrosis. We aimed to develop and validate deep learning (DL) models for predicting MRE-derived liver stiffness using routine clinical non-contrast abdominal T1-weighted (T1w) and T2-weighted (T2w) data from multiple institutions/system manufacturers in pediatric and adult patients. We identified pediatric and adult patients with known or suspected CLD from four institutions, who underwent clinical MRI with MRE from 2011 to 2022. We used T1w and T2w data to train DL models for liver stiffness classification. Patients were categorized into two groups for binary classification using liver stiffness thresholds (≥ 2.5 kPa, ≥ 3.0 kPa, ≥ 3.5 kPa, ≥ 4 kPa, or ≥ 5 kPa), reflecting various degrees of liver stiffening. We identified 4695 MRI examinations from 4295 patients (mean ± SD age, 47.6 ± 18.7 years; 428 (10.0%) pediatric; 2159 (50.2%) male). With a primary liver stiffness threshold of 3.0 kPa, our model correctly classified patients into no/minimal (< 3.0 kPa) vs moderate/severe (≥ 3.0 kPa) liver stiffness with AUROCs of 0.83 (95% CI: 0.82, 0.84) in our internal multi-site cross-validation (CV) experiment, 0.82 (95% CI: 0.80, 0.84) in our temporal hold-out validation experiment, and 0.79 (95% CI: 0.75, 0.81) in our external leave-one-site-out CV experiment. The developed model is publicly available ( https://github.com/almahdir1/Multi-channel-DeepLiverNet2.0.git ). Our DL models exhibited reasonable diagnostic performance for categorical classification of liver stiffness on a large, diverse dataset using T1w and T2w MRI data.

Question: Can DL models accurately predict liver stiffness using routine clinical biparametric MRI in pediatric and adult patients with CLD?

Findings: DeepLiverNet2.0 used biparametric MRI data to classify liver stiffness, achieving AUROCs of 0.83, 0.82, and 0.79 for multi-site CV, hold-out validation, and external CV.

Clinical relevance: Our DeepLiverNet2.0 AI model can categorically classify the severity of liver stiffening using anatomic biparametric MR images in children and young adults. Model refinements and incorporation of clinical features may decrease the need for MRE.
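
The binary labelling protocol described above amounts to dichotomizing continuous MRE stiffness at a chosen threshold and scoring the classifier with AUROC; a small sketch with placeholder variables follows.

```python
# Sketch of the thresholded labelling and AUROC evaluation; variable
# names are placeholders, not the DeepLiverNet2.0 code.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_at_threshold(stiffness_kpa: np.ndarray,
                          model_scores: np.ndarray,
                          threshold: float = 3.0) -> float:
    labels = (stiffness_kpa >= threshold).astype(int)  # no/minimal vs moderate/severe
    return roc_auc_score(labels, model_scores)
```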

CXR-LLaVA: a multimodal large language model for interpreting chest X-ray images.

Lee S, Youn J, Kim H, Kim M, Yoon SH

PubMed · Jul 1, 2025
This study aimed to develop an open-source multimodal large language model (CXR-LLaVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists. For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After pre-training a vision transformer with Dataset 1, we integrated it with an LLM influenced by the LLaVA network. Then, the model was fine-tuned, primarily using Dataset 2. The model's diagnostic performance for major pathological findings was evaluated, along with the acceptability of its radiologic reports to human radiologists, to gauge its potential for autonomous reporting. The model demonstrated impressive performance on the test sets, achieving an average F1 score of 0.81 for six major pathological findings in the MIMIC internal test set and 0.56 for six major pathological findings in the external test set. The model's F1 scores surpassed those of GPT-4-vision and Gemini-Pro-Vision in both test sets. In human radiologist evaluations of the external test set, the model achieved a 72.7% success rate in autonomous reporting, slightly below the 84.0% rate of ground-truth reports. This study highlights the significant potential of multimodal LLMs for CXR interpretation, while also acknowledging the performance limitations. Despite these challenges, we believe that making our model open source will catalyze further research, expanding its effectiveness and applicability in various clinical contexts.

Question: How can a multimodal large language model be adapted to interpret chest X-rays and generate radiologic reports?

Findings: The developed CXR-LLaVA model effectively detects major pathological findings in chest X-rays and generates radiologic reports with higher accuracy than general-purpose models.

Clinical relevance: This study demonstrates the potential of multimodal large language models to support radiologists by autonomously generating chest X-ray reports, potentially reducing diagnostic workloads and improving radiologist efficiency.
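
The reported average F1 over six major findings corresponds to computing a per-finding F1 and taking the mean; a small sketch with placeholder variables follows. The finding list and evaluation code are not from the CXR-LLaVA repository.

```python
# Sketch of macro-averaged F1 over six binary findings; placeholder
# variables, not the CXR-LLaVA evaluation code.
import numpy as np
from sklearn.metrics import f1_score

def mean_f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """y_true, y_pred: (N, 6) binary matrices, one column per finding."""
    per_finding = [f1_score(y_true[:, j], y_pred[:, j])
                   for j in range(y_true.shape[1])]
    return float(np.mean(per_finding))
```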