
Steps Adaptive Decay DPSGD: Enhancing Performance on Imbalanced Datasets with Differential Privacy with HAM10000

Xiaobo Huang, Fang Xie

arXiv preprint · Jul 9, 2025
When applying machine learning to medical image classification, data leakage is a critical issue. Previous methods, such as adding noise to gradients for differential privacy, work well on large datasets like MNIST and CIFAR-100, but fail on small, imbalanced medical datasets like HAM10000. This is because the imbalanced distribution causes gradients from minority classes to be clipped and lose crucial information, while majority classes dominate. This leads the model to fall into suboptimal solutions early. To address this, we propose SAD-DPSGD, which uses a linear decaying mechanism for noise and clipping thresholds. By allocating more privacy budget and using higher clipping thresholds in the initial training phases, the model avoids suboptimal solutions and enhances performance. Experiments show that SAD-DPSGD outperforms Auto-DPSGD on HAM10000, improving accuracy by 2.15% under $\epsilon = 3.0$, $\delta = 10^{-3}$.
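The linear decaying mechanism described above can be sketched as a per-step hyperparameter schedule; this is a minimal illustration under assumed hyperparameter values (`sigma_init`, `clip_init`, etc. are hypothetical), not the authors' released implementation:

```python
def linear_decay(initial, final, step, total_steps):
    """Linearly interpolate from initial to final over total_steps steps."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return initial + (final - initial) * frac

def dp_sgd_schedule(step, total_steps,
                    sigma_init=0.8, sigma_final=1.2,
                    clip_init=4.0, clip_final=1.0):
    """Per-step DP-SGD noise multiplier and clipping threshold.

    Early training spends more privacy budget (lower noise) with a looser
    clip so minority-class gradients are not truncated; both tighten linearly
    as training stabilizes.
    """
    sigma = linear_decay(sigma_init, sigma_final, step, total_steps)
    clip = linear_decay(clip_init, clip_final, step, total_steps)
    return sigma, clip
```

Each optimizer step would then clip per-sample gradients to `clip` and add Gaussian noise scaled by `sigma * clip` before averaging, as in standard DP-SGD.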

4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu, Xinrui Jiang, Renjie Li, Jian Wang, Yide Zhang, Gengchen Mai, Lihong V. Wang, James Zou, Xiaoyu Wang, Ming-Hsuan Yang, Zhengzhong Tu

arXiv preprint · Jul 9, 2025
We present 4KAgent, a unified agentic super-resolution generalist system designed to universally upscale any image to 4K resolution (and even higher, if applied iteratively). Our system can transform images from extremely low resolutions with severe degradations, for example, highly distorted inputs at 256x256, into crystal-clear, photorealistic 4K outputs. 4KAgent comprises three core components: (1) Profiling, a module that customizes the 4KAgent pipeline based on bespoke use cases; (2) A Perception Agent, which leverages vision-language models alongside image quality assessment experts to analyze the input image and make a tailored restoration plan; and (3) A Restoration Agent, which executes the plan, following a recursive execution-reflection paradigm, guided by a quality-driven mixture-of-expert policy to select the optimal output for each step. Additionally, 4KAgent embeds a specialized face restoration pipeline, significantly enhancing facial details in portrait and selfie photos. We rigorously evaluate our 4KAgent across 11 distinct task categories encompassing a total of 26 diverse benchmarks, setting new state-of-the-art on a broad spectrum of imaging domains. Our evaluations cover natural images, portrait photos, AI-generated content, satellite imagery, fluorescence microscopy, and medical imaging like fundoscopy, ultrasound, and X-ray, demonstrating superior performance in terms of both perceptual (e.g., NIQE, MUSIQ) and fidelity (e.g., PSNR) metrics. By establishing a novel agentic paradigm for low-level vision tasks, we aim to catalyze broader interest and innovation within vision-centric autonomous agents across diverse research communities. We will release all the code, models, and results at: https://4kagent.github.io.
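The recursive execution-reflection paradigm with quality-driven selection can be sketched roughly as follows; `tools` and `score` are hypothetical stand-ins for the restoration experts and image-quality-assessment models, and this is a simplified sketch rather than the 4KAgent pipeline:

```python
def restore(image, tools, score, max_rounds=3):
    """Execution-reflection loop: each round applies every candidate tool
    (execution), keeps the best-scoring output (reflection via a
    quality-driven policy), and stops early when quality stops improving."""
    best, best_q = image, score(image)
    for _ in range(max_rounds):
        cand = max((tool(best) for tool in tools), key=score)  # execution
        if score(cand) <= best_q:                              # reflection
            break
        best, best_q = cand, score(cand)
    return best
```

In the real system each "tool" would be a restoration model (denoiser, deblurrer, upscaler) and `score` a no-reference quality metric such as NIQE or MUSIQ.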

AI Revolution in Radiology, Radiation Oncology and Nuclear Medicine: Transforming and Innovating the Radiological Sciences.

Carriero S, Canella R, Cicchetti F, Angileri A, Bruno A, Biondetti P, Colciago RR, D'Antonio A, Della Pepa G, Grassi F, Granata V, Lanza C, Santicchia S, Miceli A, Piras A, Salvestrini V, Santo G, Pesapane F, Barile A, Carrafiello G, Giovagnoni A

PubMed · Jul 9, 2025
The integration of artificial intelligence (AI) into clinical practice, particularly within radiology, nuclear medicine and radiation oncology, is transforming diagnostic and therapeutic processes. AI-driven tools, especially in deep learning and machine learning, have shown remarkable potential in enhancing image recognition, analysis and decision-making. This technological advancement allows for the automation of routine tasks, improved diagnostic accuracy, and the reduction of human error, leading to more efficient workflows. Moreover, the successful implementation of AI in healthcare requires comprehensive education and training for young clinicians, with a pressing need to incorporate AI into residency programmes, ensuring that future specialists are equipped with traditional skills and a deep understanding of AI technologies and their clinical applications. This includes knowledge of software, data analysis, imaging informatics and ethical considerations surrounding AI use in medicine. By fostering interdisciplinary integration and emphasising AI education, healthcare professionals can fully harness AI's potential to improve patient outcomes and advance the field of medical imaging and therapy. This review aims to evaluate how AI influences radiology, nuclear medicine and radiation oncology, while highlighting the necessity for specialised AI training in medical education to ensure its successful clinical integration.

Assessment of T2-weighted MRI-derived synthetic CT for the detection of suspected lumbar facet arthritis: a comparative analysis with conventional CT.

Cao G, Wang H, Xie S, Cai D, Guo J, Zhu J, Ye K, Wang Y, Xia J

PubMed · Jul 8, 2025
We evaluated synthetic CT (sCT) generated from T2-weighted imaging (T2WI) using deep learning techniques to detect structural lesions in lumbar facet arthritis, with conventional CT as the reference standard. This single-center retrospective study included 40 patients who had lumbar MRI and CT within 1 week (September 2020 to August 2021). A Pix2Pix-GAN framework generated CT images from MRI data, and image quality was assessed using the structural similarity index (SSIM), mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and Dice similarity coefficient (DSC). Two senior radiologists evaluated 15 anatomical landmarks. Sensitivity, specificity, and accuracy for detecting bone erosion, osteosclerosis, and joint space alterations were analyzed for sCT, T2-weighted MRI, and conventional CT. Forty participants (21 men, 19 women) were enrolled, with a mean age of 39 ± 16.9 years. sCT showed strong agreement with conventional CT, with SSIM values of 0.888 for axial and 0.889 for sagittal views. PSNR and MAE values were 24.56 dB and 0.031 for axial and 23.75 dB and 0.038 for sagittal views, respectively. DSC values were 0.935 for axial and 0.876 for sagittal views. sCT showed excellent intra- and inter-reader reliability, with intraclass correlation coefficients of 0.953-0.995 and 0.839-0.983, respectively. sCT had higher sensitivity (57.9% vs. 5.3%), specificity (98.8% vs. 84.6%), and accuracy (93.0% vs. 73.3%) for bone erosion than T2-weighted MRI and outperformed it for osteosclerosis and joint space changes. sCT outperformed conventional T2-weighted MRI in detecting structural lesions indicative of lumbar facet arthritis, with conventional CT as the reference standard.
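The image-quality metrics reported above (MAE, PSNR, Dice) have standard definitions that can be computed as follows; a minimal NumPy sketch, not the study's evaluation code:

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two same-shape, same-scale images."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB, relative to the image value range."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10((data_range ** 2) / mse))

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    total = int(mask_a.sum()) + int(mask_b.sum())
    return float(2.0 * inter / total) if total else 1.0
```

SSIM additionally involves local luminance, contrast, and structure terms; in practice a library implementation such as `skimage.metrics.structural_similarity` is used.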

Mamba Goes HoME: Hierarchical Soft Mixture-of-Experts for 3D Medical Image Segmentation

Szymon Płotka, Maciej Chrabaszcz, Gizem Mert, Ewa Szczurek, Arkadiusz Sitek

arXiv preprint · Jul 8, 2025
In recent years, artificial intelligence has significantly advanced medical image segmentation. However, challenges remain, including efficient 3D medical image processing across diverse modalities and handling data variability. In this work, we introduce Hierarchical Soft Mixture-of-Experts (HoME), a two-level token-routing layer for efficient long-context modeling, specifically designed for 3D medical image segmentation. Built on the Mamba state-space model (SSM) backbone, HoME enhances sequential modeling through sparse, adaptive expert routing. The first stage employs a Soft Mixture-of-Experts (SMoE) layer to partition input sequences into local groups, routing tokens to specialized per-group experts for localized feature extraction. The second stage aggregates these outputs via a global SMoE layer, enabling cross-group information fusion and global context refinement. This hierarchical design, combining local expert routing with global expert refinement, improves generalizability and segmentation performance, surpassing state-of-the-art results across datasets from the three most commonly used 3D medical imaging modalities and varying data quality.
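A single soft mixture-of-experts routing step, of the kind stacked hierarchically in HoME, can be sketched as follows; this is a simplified NumPy illustration of Soft MoE dispatch/combine weights, not the paper's implementation, and `phi` stands in for a learned slot-projection matrix:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_layer(tokens, phi, expert_fns):
    """One Soft Mixture-of-Experts layer over a token sequence.

    tokens:     (n, d) token embeddings
    phi:        (d, s) slot projection, s = num_experts * slots_per_expert
    expert_fns: one callable per expert, each mapping (k, d) -> (k, d)
    """
    logits = tokens @ phi                   # (n, s) token-slot affinities
    dispatch = softmax(logits, axis=0)      # each slot mixes over tokens
    combine = softmax(logits, axis=1)       # each token mixes over slots
    slots = dispatch.T @ tokens             # (s, d) soft slot inputs
    s_per = slots.shape[0] // len(expert_fns)
    outs = np.concatenate(
        [f(slots[i * s_per:(i + 1) * s_per]) for i, f in enumerate(expert_fns)],
        axis=0)                             # experts process their slots
    return combine @ outs                   # (n, d) mixed back to tokens
```

Unlike hard top-k routing, every token contributes to every slot with a soft weight, which keeps the layer fully differentiable and load-balanced by construction.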

Just Say Better or Worse: A Human-AI Collaborative Framework for Medical Image Segmentation Without Manual Annotations

Yizhe Zhang

arXiv preprint · Jul 8, 2025
Manual annotation of medical images is a labor-intensive and time-consuming process, posing a significant bottleneck in the development and deployment of robust medical imaging AI systems. This paper introduces a novel Human-AI collaborative framework for medical image segmentation that substantially reduces the annotation burden by eliminating the need for explicit manual pixel-level labeling. The core innovation lies in a preference learning paradigm, where human experts provide minimal, intuitive feedback -- simply indicating whether an AI-generated segmentation is better or worse than a previous version. The framework comprises four key components: (1) an adaptable foundation model (FM) for feature extraction, (2) label propagation based on feature similarity, (3) a clicking agent that learns from human better-or-worse feedback to decide where to click and with which label, and (4) a multi-round segmentation learning procedure that trains a state-of-the-art segmentation network using pseudo-labels generated by the clicking agent and FM-based label propagation. Experiments on three public datasets demonstrate that the proposed approach achieves competitive segmentation performance using only binary preference feedback, without requiring experts to directly manually annotate the images.

A novel framework for fully-automated co-registration of intravascular ultrasound and optical coherence tomography imaging data

Xingwei He, Kit Mills Bransby, Ahmet Emir Ulutas, Thamil Kumaran, Nathan Angelo Lecaros Yap, Gonul Zeren, Hesong Zeng, Yaojun Zhang, Andreas Baumbach, James Moon, Anthony Mathur, Jouke Dijkstra, Qianni Zhang, Lorenz Raber, Christos V Bourantas

arXiv preprint · Jul 8, 2025
Aims: To develop a deep-learning (DL) framework that will allow fully automated longitudinal and circumferential co-registration of intravascular ultrasound (IVUS) and optical coherence tomography (OCT) images. Methods and results: Data from 230 patients (714 vessels) with acute coronary syndrome that underwent near-infrared spectroscopy (NIRS)-IVUS and OCT imaging in their non-culprit vessels were included in the present analysis. The lumen borders annotated by expert analysts in 61,655 NIRS-IVUS and 62,334 OCT frames, and the side branches and calcific tissue identified in 10,000 NIRS-IVUS frames and 10,000 OCT frames, were used to train DL solutions for the automated extraction of these features. The trained DL solutions were used to process NIRS-IVUS and OCT images, and their output was used by a dynamic time warping algorithm to co-register the NIRS-IVUS and OCT images longitudinally, while the circumferential registration of the IVUS and OCT was optimized through dynamic programming. On a test set of 77 vessels from 22 patients, the DL method showed high concordance with the expert analysts for the longitudinal and circumferential co-registration of the two imaging sets (concordance correlation coefficient >0.99 for the longitudinal and >0.90 for the circumferential co-registration). The Williams Index was 0.96 for longitudinal and 0.97 for circumferential co-registration, indicating a comparable performance to the analysts. The time needed for the DL pipeline to process imaging data from a vessel was <90 s. Conclusion: The fully automated, DL-based framework introduced in this study for the co-registration of IVUS and OCT is fast and provides estimations that compare favorably to the expert analysts. These features render it useful for the analysis of large-scale data collected in studies that incorporate multimodality imaging to characterize plaque composition.
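The longitudinal step relies on dynamic time warping to align frame sequences from the two pullbacks. A minimal DTW sketch over 1-D per-frame features (e.g. lumen areas, a hypothetical choice for illustration), not the study's pipeline:

```python
import numpy as np

def dtw_path(a, b):
    """Dynamic time warping between two 1-D sequences.

    Returns the total alignment cost and the frame-to-frame path as a list
    of (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(D[n, m]), path[::-1]
```

In the framework described above, the per-frame features fed to DTW would come from the trained DL models (lumen borders, side branches, calcific tissue) rather than raw intensities.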

The future of multimodal artificial intelligence models for integrating imaging and clinical metadata: a narrative review.

Simon BD, Ozyoruk KB, Gelikman DG, Harmon SA, Türkbey B

PubMed · Jul 8, 2025
With the ongoing revolution of artificial intelligence (AI) in medicine, the impact of AI in radiology is more pronounced than ever. An increasing number of technical and clinical AI-focused studies are published each day. As these tools inevitably affect patient care and physician practices, it is crucial that radiologists become more familiar with the leading strategies and underlying principles of AI. Multimodal AI models can combine both imaging and clinical metadata and are quickly becoming a popular approach that is being integrated into the medical ecosystem. This narrative review covers major concepts of multimodal AI through the lens of recent literature. We discuss emerging frameworks, including graph neural networks, which allow for explicit learning from non-Euclidean relationships, and transformers, which allow for parallel computation that scales, highlighting existing literature and advocating for a focus on emerging architectures. We also identify key pitfalls in current studies, including issues with taxonomy, data scarcity, and bias. By informing radiologists and biomedical AI experts about existing practices and challenges, we hope to guide the next wave of imaging-based multimodal AI research.

Integrating Machine Learning into Myositis Research: a Systematic Review.

Juarez-Gomez C, Aguilar-Vazquez A, Gonzalez-Gauna E, Garcia-Ordoñez GP, Martin-Marquez BT, Gomez-Rios CA, Becerra-Jimenez J, Gaspar-Ruiz A, Vazquez-Del Mercado M

PubMed · Jul 8, 2025
Idiopathic inflammatory myopathies (IIM) are a group of autoimmune rheumatic diseases characterized by proximal muscle weakness and extramuscular manifestations. Since 1975, these IIM have been classified into different clinical phenotypes. Each clinical phenotype is associated with a better or worse prognosis and a particular physiopathology. Machine learning (ML) is a fascinating field of knowledge with worldwide applications in different fields. In IIM, ML is an emerging tool assessed in very specific clinical contexts as a complementary tool for research purposes, including transcriptome profiles in muscle biopsies and differential diagnosis using magnetic resonance imaging (MRI) and ultrasound (US). Given the cancer-associated risk and the predisposing factors for interstitial lung disease (ILD) development, this systematic review evaluates 23 original studies using supervised learning models, including logistic regression (LR), random forest (RF), support vector machines (SVM), and convolutional neural networks (CNN), with performance assessed primarily through the area under the receiver operating characteristic curve (AUC-ROC).
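AUC-ROC, the primary performance measure across the reviewed studies, can be computed directly from pairwise ranks without tracing the ROC curve; a minimal sketch:

```python
import numpy as np

def auc_roc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count half).
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))
```

In practice the reviewed studies would use a library routine such as `sklearn.metrics.roc_auc_score`, which agrees with this rank-based definition.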

Foundation models for radiology: fundamentals, applications, opportunities, challenges, risks, and prospects.

Akinci D'Antonoli T, Bluethgen C, Cuocolo R, Klontzas ME, Ponsiglione A, Kocak B

PubMed · Jul 8, 2025
Foundation models (FMs) represent a significant evolution in artificial intelligence (AI), impacting diverse fields. Within radiology, this evolution offers greater adaptability, multimodal integration, and improved generalizability compared with traditional narrow AI. Utilizing large-scale pre-training and efficient fine-tuning, FMs can support diverse applications, including image interpretation, report generation, integrative diagnostics combining imaging with clinical/laboratory data, and synthetic data creation, holding significant promise for advancements in precision medicine. However, clinical translation of FMs faces several substantial challenges. Key concerns include the inherent opacity of model decision-making processes, environmental and social sustainability issues, risks to data privacy, complex ethical considerations, such as bias and fairness, and navigating the uncertainty of regulatory frameworks. Moreover, rigorous validation is essential to address inherent stochasticity and the risk of hallucination. This international collaborative effort provides a comprehensive overview of the fundamentals, applications, opportunities, challenges, and prospects of FMs, aiming to guide their responsible and effective adoption in radiology and healthcare.
