Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation

Andrea Moschetto, Lemuel Puglisi, Alec Sargood, Pierluigi Dell'Acqua, Francesco Guarnera, Sebastiano Battiato, Daniele Ravì

arXiv preprint · Jul 19 2025
Magnetic Resonance Imaging (MRI) enables the acquisition of multiple image contrasts, such as T1-weighted (T1w) and T2-weighted (T2w) scans, each offering distinct diagnostic insights. However, acquiring all desired modalities increases scan time and cost, motivating research into computational methods for cross-modal synthesis. To address this, recent approaches aim to synthesize missing MRI contrasts from those already acquired, reducing acquisition time while preserving diagnostic quality. Image-to-image (I2I) translation provides a promising framework for this task. In this paper, we present a comprehensive benchmark of generative models – specifically, Generative Adversarial Networks (GANs), diffusion models, and flow matching (FM) techniques – for T1w-to-T2w 2D MRI I2I translation. All frameworks are implemented with comparable settings and evaluated on three publicly available MRI datasets of healthy adults. Our quantitative and qualitative analyses show that the GAN-based Pix2Pix model outperforms diffusion and FM-based methods in terms of structural fidelity, image quality, and computational efficiency. Consistent with existing literature, these results suggest that flow-based models are prone to overfitting on small datasets and simpler tasks, and may require more data to match or surpass GAN performance. These findings offer practical guidance for deploying I2I translation techniques in real-world MRI workflows and highlight promising directions for future research in cross-modal medical image synthesis. Code and models are publicly available at https://github.com/AndreaMoschetto/medical-I2I-benchmark.
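
The benchmark's strongest baseline, Pix2Pix, pairs a conditional adversarial loss with an L1 reconstruction term on paired slices. The sketch below illustrates that standard training step with placeholder `G` and `D` networks and optimizers; it shows the generic Pix2Pix objective, not the authors' implementation or hyperparameters.

```python
# Minimal sketch of one paired Pix2Pix-style training step for T1w -> T2w
# translation (adversarial + L1 objective). Networks, data loading, and
# hyperparameters are placeholders, not the paper's code.
import torch
import torch.nn as nn

def pix2pix_step(G, D, opt_G, opt_D, t1w, t2w, lambda_l1=100.0):
    bce = nn.BCEWithLogitsLoss()
    l1 = nn.L1Loss()

    # Discriminator update: real (t1w, t2w) vs fake (t1w, G(t1w)) pairs.
    fake_t2w = G(t1w).detach()
    d_real = D(torch.cat([t1w, t2w], dim=1))
    d_fake = D(torch.cat([t1w, fake_t2w], dim=1))
    loss_D = 0.5 * (bce(d_real, torch.ones_like(d_real)) +
                    bce(d_fake, torch.zeros_like(d_fake)))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator update: fool D while staying close to the true T2w in L1.
    fake_t2w = G(t1w)
    d_fake = D(torch.cat([t1w, fake_t2w], dim=1))
    loss_G = bce(d_fake, torch.ones_like(d_fake)) + lambda_l1 * l1(fake_t2w, t2w)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```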

Depthwise-Dilated Convolutional Adapters for Medical Object Tracking and Segmentation Using the Segment Anything Model 2

Guoping Xu, Christopher Kabat, You Zhang

arXiv preprint · Jul 19 2025
Recent advances in medical image segmentation have been driven by deep learning; however, most existing methods remain limited by modality-specific designs and exhibit poor adaptability to dynamic medical imaging scenarios. The Segment Anything Model 2 (SAM2) and its related variants, which introduce a streaming memory mechanism for real-time video segmentation, present new opportunities for prompt-based, generalizable solutions. Nevertheless, adapting these models to medical video scenarios typically requires large-scale datasets for retraining or transfer learning, leading to high computational costs and the risk of catastrophic forgetting. To address these challenges, we propose DD-SAM2, an efficient adaptation framework for SAM2 that incorporates a Depthwise-Dilated Adapter (DD-Adapter) to enhance multi-scale feature extraction with minimal parameter overhead. This design enables effective fine-tuning of SAM2 on medical videos with limited training data. Unlike existing adapter-based methods focused solely on static images, DD-SAM2 fully exploits SAM2's streaming memory for medical video object tracking and segmentation. Comprehensive evaluations on TrackRad2025 (tumor segmentation) and EchoNet-Dynamic (left ventricle tracking) datasets demonstrate superior performance, achieving Dice scores of 0.93 and 0.97, respectively. To the best of our knowledge, this work provides an initial attempt at systematically exploring adapter-based SAM2 fine-tuning for medical video segmentation and tracking. Code, datasets, and models will be publicly available at https://github.com/apple1986/DD-SAM2.
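
To make the adapter idea concrete, the sketch below shows one plausible form of a depthwise-dilated adapter: parallel depthwise convolutions with different dilation rates inside a small bottleneck, added residually so frozen backbone features pass through unchanged. Channel widths, dilation rates, and placement are assumptions for illustration, not the DD-SAM2 code.

```python
# Minimal sketch of a depthwise-dilated adapter: multi-scale context from
# parallel depthwise convolutions with increasing dilation, fused and added
# residually to the (frozen) backbone features.
import torch
import torch.nn as nn

class DepthwiseDilatedAdapter(nn.Module):
    def __init__(self, channels, bottleneck=64, dilations=(1, 2, 4)):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, kernel_size=1)
        self.branches = nn.ModuleList([
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=d,
                      dilation=d, groups=bottleneck)  # depthwise, dilated
            for d in dilations
        ])
        self.up = nn.Conv2d(bottleneck, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):
        h = self.act(self.down(x))
        h = sum(branch(h) for branch in self.branches)
        return x + self.up(self.act(h))  # residual keeps the backbone intact
```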

DUSTrack: Semi-automated point tracking in ultrasound videos

Praneeth Namburi, Roger Pallarès-López, Jessica Rosendorf, Duarte Folgado, Brian W. Anthony

arXiv preprint · Jul 18 2025
Ultrasound technology enables safe, non-invasive imaging of dynamic tissue behavior, making it a valuable tool in medicine, biomechanics, and sports science. However, accurately tracking tissue motion in B-mode ultrasound remains challenging due to speckle noise, low edge contrast, and out-of-plane movement. These challenges complicate the task of tracking anatomical landmarks over time, which is essential for quantifying tissue dynamics in many clinical and research applications. This manuscript introduces DUSTrack (Deep learning and optical flow-based toolkit for UltraSound Tracking), a semi-automated framework for tracking arbitrary points in B-mode ultrasound videos. We combine deep learning with optical flow to deliver high-quality and robust tracking across diverse anatomical structures and motion patterns. The toolkit includes a graphical user interface that streamlines the generation of high-quality training data and supports iterative model refinement. It also implements a novel optical-flow-based filtering technique that reduces high-frequency frame-to-frame noise while preserving rapid tissue motion. DUSTrack demonstrates superior accuracy compared to contemporary zero-shot point trackers and performs on par with specialized methods, establishing its potential as a general and foundational tool for clinical and biomechanical research. We demonstrate DUSTrack's versatility through three use cases: cardiac wall motion tracking in echocardiograms, muscle deformation analysis during reaching tasks, and fascicle tracking during ankle plantarflexion. As an open-source solution, DUSTrack offers a powerful, flexible framework for point tracking to quantify tissue motion from ultrasound videos. DUSTrack is available at https://github.com/praneethnamburi/DUSTrack.
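
As a point of reference for the optical-flow half of the toolkit, the sketch below propagates user-seeded points through a B-mode sequence with OpenCV's pyramidal Lucas-Kanade tracker; DUSTrack's deep-learning component and its novel noise-filtering step are not shown.

```python
# Minimal sketch of optical-flow point propagation in a B-mode ultrasound
# video using OpenCV's pyramidal Lucas-Kanade tracker. This is only the
# optical-flow part of a semi-automated pipeline, not the DUSTrack toolkit.
import cv2
import numpy as np

def track_points(frames, seed_points):
    """frames: list of grayscale uint8 images; seed_points: (N, 2) float32."""
    lk_params = dict(winSize=(21, 21), maxLevel=3,
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT,
                               30, 0.01))
    pts = seed_points.reshape(-1, 1, 2).astype(np.float32)
    tracks = [pts.reshape(-1, 2).copy()]
    for prev, curr in zip(frames[:-1], frames[1:]):
        pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                                  **lk_params)
        tracks.append(pts.reshape(-1, 2).copy())
    return np.stack(tracks)  # (num_frames, N, 2) point trajectories
```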

Software architecture and manual for novel versatile CT image analysis toolbox -- AnatomyArchive

Lei Xu, Torkel B Brismar

arXiv preprint · Jul 18 2025
We have developed a novel CT image analysis package named AnatomyArchive, built on top of the recent full-body segmentation model TotalSegmentator. It provides automatic target volume selection and deselection according to user-configured anatomies serving as volumetric upper and lower bounds. It includes a knowledge-graph-based, time-efficient tool for anatomy segmentation mask management and medical image database maintenance. AnatomyArchive enables automatic body volume cropping, as well as automatic arm detection and exclusion, for more precise body composition analysis in both 2D and 3D formats. It provides robust voxel-based radiomic feature extraction, feature visualization, and an integrated toolchain for statistical tests and analysis. A Python-based, GPU-accelerated, segmentation-integrated cinematic rendering module producing nearly photo-realistic composites is also included. We present here its software architecture, illustrate its workflow and the working principles of its algorithms, and provide a few examples of how the software can be used to assist the development of modern machine learning models. Open-source code will be released at https://github.com/lxu-medai/AnatomyArchive for research and educational purposes only.
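
A minimal sketch of the kind of workflow AnatomyArchive automates, assuming TotalSegmentator's Python API and illustrative file paths and anatomy labels (not the package's own code): segment a CT volume, then crop it to the axial range spanned by user-chosen bounding anatomies.

```python
# Sketch: run TotalSegmentator on a CT volume, then crop the body volume to
# the axial slices between two user-chosen bounding anatomies. Paths and
# label names are placeholders for illustration only.
import numpy as np
import nibabel as nib
from totalsegmentator.python_api import totalsegmentator  # assumed available

ct_path, seg_dir = "ct.nii.gz", "segmentations"
totalsegmentator(ct_path, seg_dir)  # writes one mask file per anatomy

ct = nib.load(ct_path)
upper = nib.load(f"{seg_dir}/vertebrae_T12.nii.gz").get_fdata() > 0
lower = nib.load(f"{seg_dir}/vertebrae_L4.nii.gz").get_fdata() > 0

# Keep only the axial slices covered by the upper/lower bound anatomies.
z = np.where(upper.any(axis=(0, 1)) | lower.any(axis=(0, 1)))[0]
cropped = ct.get_fdata()[:, :, z.min():z.max() + 1]
print("cropped volume shape:", cropped.shape)
```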

Divide and Conquer: A Large-Scale Dataset and Model for Left-Right Breast MRI Segmentation

Maximilian Rokuss, Benjamin Hamm, Yannick Kirchhoff, Klaus Maier-Hein

arXiv preprint · Jul 18 2025
We introduce the first publicly available breast MRI dataset with explicit left and right breast segmentation labels, encompassing more than 13,000 annotated cases. Alongside this dataset, we provide a robust deep-learning model trained for left-right breast segmentation. This work addresses a critical gap in breast MRI analysis and offers a valuable resource for the development of advanced tools in women's health. The dataset and trained model are publicly available at: www.github.com/MIC-DKFZ/BreastDivider

SegMamba-V2: Long-range Sequential Modeling Mamba For General 3D Medical Image Segmentation.

Xing Z, Ye T, Yang Y, Cai D, Gai B, Wu XJ, Gao F, Zhu L

PubMed paper · Jul 18 2025
The Transformer architecture has demonstrated remarkable results in 3D medical image segmentation due to its capability of modeling global relationships. However, it poses a significant computational burden when processing high-dimensional medical images. Mamba, as a State Space Model (SSM), has recently emerged as a notable approach for modeling long-range dependencies in sequential data. Although a substantial amount of Mamba-based research has focused on natural language and 2D image processing, few studies explore the capability of Mamba on 3D medical images. In this paper, we propose SegMamba-V2, a novel 3D medical image segmentation model, to effectively capture long-range dependencies within whole-volume features at each scale. To achieve this goal, we first devise a hierarchical scale downsampling strategy to enhance the receptive field and mitigate information loss during downsampling. Furthermore, we design a novel tri-orientated spatial Mamba block that extends the global dependency modeling process from one plane to three orthogonal planes to improve feature representation capability. Moreover, we collect and annotate a large-scale dataset (named CRC-2000) with fine-grained categories to facilitate benchmarking evaluation in 3D colorectal cancer (CRC) segmentation. We evaluate the effectiveness of our SegMamba-V2 on CRC-2000 and three other large-scale 3D medical image segmentation datasets, covering various modalities, organs, and segmentation targets. Experimental results demonstrate that our SegMamba-V2 outperforms state-of-the-art methods by a significant margin, which indicates the universality and effectiveness of the proposed model on 3D medical image segmentation tasks. The code for SegMamba-V2 is publicly available at: https://github.com/ge-xing/SegMamba-V2.
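
One reading of the tri-orientated idea is sketched below: the 3D feature volume is flattened into sequences along three orthogonal axis orderings, a caller-supplied sequence layer processes each, and the outputs are fused. This is an illustrative interpretation, not the SegMamba-V2 implementation.

```python
# Sketch of a tri-orientated scanning block over a (B, C, D, H, W) feature
# volume. The sequence model is injected by the caller; any module mapping
# (B, L, C) -> (B, L, C) works.
import torch
import torch.nn as nn

class TriOrientatedBlock(nn.Module):
    def __init__(self, seq_layer: nn.Module):
        super().__init__()
        self.seq_layer = seq_layer  # e.g. an SSM layer; maps (B, L, C) -> (B, L, C)

    def _scan(self, x, dims):
        # Flatten the spatial axes in the order given by `dims`, run the
        # sequence layer, then restore the original (B, C, D, H, W) layout.
        b, c = x.shape[:2]
        y = x.permute(0, *[d + 2 for d in dims], 1).reshape(b, -1, c)
        y = self.seq_layer(y)
        y = y.reshape(b, *[x.shape[d + 2] for d in dims], c)
        return y.permute(0, 4, *[dims.index(i) + 1 for i in range(3)])

    def forward(self, x):
        # Scan along three orthogonal orderings and fuse by averaging.
        return (self._scan(x, (0, 1, 2)) +
                self._scan(x, (1, 2, 0)) +
                self._scan(x, (2, 0, 1))) / 3.0
```

Passing `nn.Identity()` as the sequence layer is a quick shape sanity check; in practice a Mamba layer (e.g., `mamba_ssm.Mamba(d_model=C)`, assuming that package is used) would take its place.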

UGPL: Uncertainty-Guided Progressive Learning for Evidence-Based Classification in Computed Tomography

Shravan Venkatraman, Pavan Kumar S, Rakesh Raj Madavan, Chandrakala S

arXiv preprint · Jul 18 2025
Accurate classification of computed tomography (CT) images is essential for diagnosis and treatment planning, but existing methods often struggle with the subtle and spatially diverse nature of pathological features. Current approaches typically process images uniformly, limiting their ability to detect localized abnormalities that require focused analysis. We introduce UGPL, an uncertainty-guided progressive learning framework that performs a global-to-local analysis by first identifying regions of diagnostic ambiguity and then conducting detailed examination of these critical areas. Our approach employs evidential deep learning to quantify predictive uncertainty, guiding the extraction of informative patches through a non-maximum suppression mechanism that maintains spatial diversity. This progressive refinement strategy, combined with an adaptive fusion mechanism, enables UGPL to integrate both contextual information and fine-grained details. Experiments across three CT datasets demonstrate that UGPL consistently outperforms state-of-the-art methods, achieving improvements of 3.29%, 2.46%, and 8.08% in accuracy for kidney abnormality, lung cancer, and COVID-19 detection, respectively. Our analysis shows that the uncertainty-guided component provides substantial benefits, with performance dramatically increasing when the full progressive learning pipeline is implemented. Our code is available at: https://github.com/shravan-18/UGPL
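
The uncertainty signal that drives patch selection can be illustrated with the standard evidential deep learning formulation, in which non-negative evidence parameterizes a Dirichlet distribution over class probabilities; the sketch below shows that generic computation, not the paper's exact code.

```python
# Minimal sketch of evidential uncertainty: Dirichlet evidence from network
# outputs, with uncertainty u = K / S flagging samples (or regions) that
# would warrant the detailed local analysis described above.
import torch
import torch.nn.functional as F

def evidential_uncertainty(logits):
    """logits: (B, K) raw class outputs -> expected probs and uncertainty."""
    evidence = F.softplus(logits)          # non-negative evidence per class
    alpha = evidence + 1.0                 # Dirichlet concentration parameters
    strength = alpha.sum(dim=1, keepdim=True)
    probs = alpha / strength               # expected class probabilities
    k = logits.shape[1]
    uncertainty = k / strength.squeeze(1)  # in (0, 1]; 1 = total ignorance
    return probs, uncertainty
```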

Precision Diagnosis and Treatment Monitoring of Glioma via PET Radiomics.

Zhou C, Ji P, Gong B, Kou Y, Fan Z, Wang L

PubMed paper · Jul 17 2025
Glioma, the most common primary intracranial tumor, poses significant challenges to precision diagnosis and treatment due to its heterogeneity and invasiveness. With the introduction of the 2021 WHO classification standard based on molecular biomarkers, the role of imaging in non-invasive subtyping and therapeutic monitoring of gliomas has become increasingly crucial. While conventional MRI shows limitations in assessing metabolic status and differentiating tumor recurrence, positron emission tomography (PET) combined with radiomics and artificial intelligence technologies offers a novel paradigm for precise diagnosis and treatment monitoring through quantitative extraction of multimodal imaging features (e.g., intensity, texture, dynamic parameters). This review systematically summarizes the technical workflow of PET radiomics (including tracer selection, image segmentation, feature extraction, and model construction) and its applications in predicting molecular subtypes (such as IDH mutation and MGMT methylation), distinguishing recurrence from treatment-related changes, and prognostic stratification. Studies demonstrate that amino acid tracers (e.g., ¹⁸F-FET, ¹¹C-MET) combined with multimodal radiomics models significantly outperform traditional parametric analysis in diagnostic efficacy. Nevertheless, current research still faces challenges including data heterogeneity, insufficient model interpretability, and lack of clinical validation. Future advancements require multicenter standardized protocols, open-source algorithm frameworks, and multi-omics integration to facilitate the transformative clinical translation of PET radiomics from research to practice.
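
A minimal sketch of the feature-extraction stage of such a pipeline, assuming pyradiomics and placeholder file paths: quantitative intensity and texture features are computed from a PET volume and tumor mask, yielding a table that downstream models (e.g., for IDH or MGMT prediction) would consume.

```python
# Sketch of PET radiomics feature extraction with pyradiomics. Paths are
# placeholders; in practice the resulting feature table feeds a classifier
# (e.g., logistic regression or random forest) for molecular subtyping.
import pandas as pd
from radiomics import featureextractor

extractor = featureextractor.RadiomicsFeatureExtractor()

def extract_case(pet_path, mask_path):
    """Return one row of radiomic features for a PET volume + tumor mask."""
    result = extractor.execute(pet_path, mask_path)
    # Keep the numeric feature values, dropping the diagnostics entries.
    return {k: float(v) for k, v in result.items() if k.startswith("original_")}

cases = [("case1_pet.nii.gz", "case1_mask.nii.gz")]  # placeholder case list
features = pd.DataFrame([extract_case(p, m) for p, m in cases])
print(features.shape)
```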

Insights into a radiology-specialised multimodal large language model with sparse autoencoders

Kenza Bouzid, Shruthi Bannur, Felix Meissen, Daniel Coelho de Castro, Anton Schwaighofer, Javier Alvarez-Valle, Stephanie L. Hyland

arXiv preprint · Jul 17 2025
Interpretability can improve the safety, transparency and trust of AI models, which is especially important in healthcare applications where decisions often carry significant consequences. Mechanistic interpretability, particularly through the use of sparse autoencoders (SAEs), offers a promising approach for uncovering human-interpretable features within large transformer-based models. In this study, we apply Matryoshka-SAE to the radiology-specialised multimodal large language model, MAIRA-2, to interpret its internal representations. Using large-scale automated interpretability of the SAE features, we identify a range of clinically relevant concepts - including medical devices (e.g., line and tube placements, pacemaker presence), pathologies such as pleural effusion and cardiomegaly, longitudinal changes and textual features. We further examine the influence of these features on model behaviour through steering, demonstrating directional control over generations with mixed success. Our results reveal practical and methodological challenges, yet they offer initial insights into the internal concepts learned by MAIRA-2 - marking a step toward deeper mechanistic understanding and interpretability of a radiology-adapted multimodal large language model, and paving the way for improved model transparency. We release the trained SAEs and interpretations: https://huggingface.co/microsoft/maira-2-sae.
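
The building block behind this analysis is a sparse autoencoder trained on transformer activations; the minimal ReLU SAE with an L1 sparsity penalty sketched below conveys the idea (the paper uses a Matryoshka-SAE variant, and the sizes here are arbitrary). Steering then amounts to adding a scaled decoder direction for a chosen feature back into the model's activations.

```python
# Minimal sketch of a sparse autoencoder over transformer activations:
# an overcomplete ReLU feature layer trained to reconstruct activations
# under an L1 sparsity penalty. Sizes and the loss weighting are arbitrary.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=4096, d_features=32768):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, acts):
        f = torch.relu(self.encoder(acts))   # sparse feature activations
        recon = self.decoder(f)
        return recon, f

def sae_loss(recon, acts, f, l1_coeff=1e-3):
    # Reconstruction error plus a penalty encouraging few active features.
    return ((recon - acts) ** 2).mean() + l1_coeff * f.abs().mean()
```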

DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model

Han Zhang, Xiangde Luo, Yong Chen, Kang Li

arXiv preprint · Jul 17 2025
Annotation variability remains a substantial challenge in medical image segmentation, stemming from ambiguous imaging boundaries and diverse clinical expertise. Traditional deep learning methods producing single deterministic segmentation predictions often fail to capture these annotator biases. Although recent studies have explored multi-rater segmentation, existing methods typically focus on a single perspective – either generating a probabilistic "gold standard" consensus or preserving expert-specific preferences – thus struggling to provide a more omni view. In this study, we propose DiffOSeg, a two-stage diffusion-based framework, which aims to simultaneously achieve both consensus-driven (combining all experts' opinions) and preference-driven (reflecting experts' individual assessments) segmentation. Stage I establishes population consensus through a probabilistic consensus strategy, while Stage II captures expert-specific preference via adaptive prompts. Demonstrated on two public datasets (LIDC-IDRI and NPC-170), our model outperforms existing state-of-the-art methods across all evaluated metrics. Source code is available at https://github.com/string-ellipses/DiffOSeg.
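
As a rough illustration of the preference-driven stage, the sketch below shows one training step of a mask-denoising diffusion model conditioned on the image and an expert-identity embedding; the denoiser, noise schedule, and prompt design are placeholders rather than the DiffOSeg code.

```python
# Sketch of one training step for a diffusion model over segmentation masks,
# conditioned on the image and an expert-identity prompt embedding. All
# components (denoiser, embedding, schedule) are illustrative placeholders.
import torch
import torch.nn.functional as F

def diffusion_mask_step(denoiser, expert_embed, image, mask, expert_id, T=1000):
    t = torch.randint(0, T, (mask.shape[0],), device=mask.device)
    alpha_bar = torch.cos(0.5 * torch.pi * t.float() / T) ** 2  # cosine schedule
    alpha_bar = alpha_bar.view(-1, 1, 1, 1)
    noise = torch.randn_like(mask)
    noisy_mask = alpha_bar.sqrt() * mask + (1 - alpha_bar).sqrt() * noise
    # Predict the noise given the noisy mask, the image, and the expert prompt.
    pred = denoiser(noisy_mask, image, t, expert_embed(expert_id))
    return F.mse_loss(pred, noise)
```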