Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects.

Kumari P, Chauhan J, Bozorgpour A, Huang B, Azad R, Merhof D

PubMed · Jul 28, 2025
Medical image analysis has witnessed remarkable advancements, even surpassing human-level performance in recent years, driven by the rapid development of advanced deep-learning algorithms. However, when the inference dataset slightly differs from what the model has seen during one-time training, the model performance is greatly compromised. The situation requires restarting the training process using both the old and the new data, which is computationally costly, does not align with the human learning process, and imposes storage constraints and privacy concerns. Alternatively, continual learning has emerged as a crucial approach for developing unified and sustainable deep models to deal with new classes, tasks, and the drifting nature of data in non-stationary environments for various application areas. Continual learning techniques enable models to adapt and accumulate knowledge over time, which is essential for maintaining performance on evolving datasets and novel tasks. Owing to its popularity and promising performance, it is an active and emerging research topic in the medical field and hence demands a survey and taxonomy to clarify the current research landscape of continual learning in medical image analysis. This systematic review paper provides a comprehensive overview of the state-of-the-art in continual learning techniques applied to medical image analysis. We present an extensive survey of existing research, covering topics including catastrophic forgetting, data drifts, stability, and plasticity requirements. Further, an in-depth discussion of key components of a continual learning framework, such as continual learning scenarios, techniques, evaluation schemes, and metrics, is provided. Continual learning techniques encompass various categories, including rehearsal, regularization, architectural, and hybrid strategies. We assess the popularity and applicability of continual learning categories in various medical sub-fields like radiology and histopathology. Our exploration considers unique challenges in the medical domain, including costly data annotation, temporal drift, and the crucial need for benchmarking datasets to ensure consistent model evaluation. The paper also addresses current challenges and looks ahead to potential future research directions.
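
Of the four strategy families the review covers (rehearsal, regularization, architectural, hybrid), rehearsal is the easiest to make concrete: keep a small buffer of past examples and interleave them when training on new data. Below is a minimal sketch assuming a PyTorch training loop; the class, capacity, and sampling scheme are illustrative, not taken from the review.

```python
import random

import torch


class ReplayBuffer:
    """Reservoir-sampling rehearsal buffer (illustrative, not from the review).

    Keeps a bounded sample of past (image, label) pairs so a model trained
    on a new task can replay old examples and reduce catastrophic forgetting.
    """

    def __init__(self, capacity: int = 512):
        self.capacity = capacity
        self.buffer: list[tuple[torch.Tensor, torch.Tensor]] = []
        self.seen = 0  # total examples offered to the buffer so far

    def add(self, image: torch.Tensor, label: torch.Tensor) -> None:
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((image, label))
        else:
            # Reservoir sampling: every example seen so far is retained
            # with equal probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = (image, label)

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, min(batch_size, len(self.buffer)))
        images, labels = zip(*batch)
        return torch.stack(images), torch.stack(labels)


# During training on a new task, one would mix current and replayed batches:
# loss = criterion(model(x_new), y_new) + criterion(model(x_old), y_old)
```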

The evolving role of multimodal imaging, artificial intelligence and radiomics in the radiologic assessment of immune related adverse events.

Das JP, Ma HY, DeJong D, Prendergast C, Baniasadi A, Braumuller B, Giarratana A, Khonji S, Paily J, Shobeiri P, Yeh R, Dercle L, Capaccione KM

PubMed · Jul 28, 2025
Immunotherapy, in particular checkpoint blockade, has revolutionized the treatment of many advanced cancers. Imaging plays a critical role in assessing both treatment response and the development of immune toxicities. Both conventional imaging and molecular imaging techniques can be used to evaluate multisystemic immune related adverse events (irAEs), including thoracic, abdominal and neurologic irAEs. As artificial intelligence (AI) proliferates in medical imaging, radiologic assessment of irAEs will become more efficient, improving the diagnosis, prognosis, and management of patients affected by immune-related toxicities. This review addresses some of the advancements in medical imaging including the potential future role of radiomics in evaluating irAEs, which may facilitate clinical decision-making and improvements in patient care.

Implicit Spatiotemporal Bandwidth Enhancement Filter by Sine-activated Deep Learning Model for Fast 3D Photoacoustic Tomography

I Gede Eka Sulistyawan, Takuro Ishii, Riku Suzuki, Yoshifumi Saijo

arXiv preprint · Jul 28, 2025
3D photoacoustic tomography (3D-PAT) using high-frequency hemispherical transducers offers near-omnidirectional reception and enhanced sensitivity to the finer structural details encoded in the high-frequency components of the broadband photoacoustic (PA) signal. However, practical constraints, such as a limited number of channels with a bandlimited sampling rate, often result in sparse, bandlimited sensors that degrade image quality. To address this, we revisit the 2D deep learning (DL) approach applied directly to sensor-wise PA radio-frequency (PARF) data. Specifically, we introduce sine activation into the DL model to restore the broadband nature of PARF signals from the observed bandlimited, high-frequency PARF data. Given the scarcity of 3D training data, we employ a simplified training strategy based on simulating random spherical absorbers. This combination of a sine-activated model and randomized training is designed to emphasize bandwidth learning over dataset memorization. Our model was evaluated on a leaf-skeleton phantom, a micro-CT-verified 3D spiral phantom, and in-vivo human palm vasculature. The results showed that the proposed training mechanism for the sine-activated model generalized well across the different tests, effectively increasing the sensor density and recovering the spatiotemporal bandwidth. Qualitatively, the sine-activated model uniquely enhanced high-frequency content, producing clearer vascular structure with fewer artefacts. Quantitatively, the sine-activated model exhibits full bandwidth at the -12 dB spectrum level and a significantly higher contrast-to-noise ratio with minimal loss of structural similarity index. Lastly, we optimized our approach to enable fast, enhanced 3D-PAT at 2 volumes per second for more practical imaging of free-moving targets.
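
The abstract does not give the model's architecture, but a sine-activated layer in the SIREN style suggests what the basic building block might look like. A minimal sketch; w0 and the initialization follow common SIREN practice and are assumptions, not the paper's settings.

```python
import math

import torch
from torch import nn


class SineLayer(nn.Module):
    """Linear layer followed by a sine activation, in the style of SIREN.

    Illustrative only: the abstract does not specify the paper's exact
    architecture; w0 and the init scheme are common SIREN defaults.
    """

    def __init__(self, in_features: int, out_features: int, w0: float = 30.0):
        super().__init__()
        self.w0 = w0
        self.linear = nn.Linear(in_features, out_features)
        # Uniform init keeps pre-activations in a range where sin() stays
        # expressive across a broad band of frequencies.
        bound = math.sqrt(6.0 / in_features) / w0
        nn.init.uniform_(self.linear.weight, -bound, bound)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sin(self.w0 * self.linear(x))
```

A periodic activation lets the network represent high-frequency signal content that ReLU-style activations tend to low-pass filter away, which is the property the paper leverages to restore broadband PARF signals.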

A novel multimodal medical image fusion model for Alzheimer's and glioma disease detection based on hybrid fusion strategies in non-subsampled shearlet transform domain.

Alabduljabbar A, Khan SU, Altherwy YN, Almarshad F, Alsuhaibani A

PubMed · Jul 27, 2025
Background: Medical professionals may increase diagnostic accuracy using multimodal medical image fusion techniques to peer inside organs and tissues.
Objective: This research work aims to propose a solution for diverse medical diagnostic challenges.
Methods: We propose a dual-purpose model. Initially, we developed a pair of images using the intensity, hue, and saturation (IHS) approach. Next, we applied non-subsampled shearlet transform (NSST) decomposition to these images to obtain the low-frequency and high-frequency coefficients. We then enhanced the structure and background details of the low-frequency coefficients using a novel structure feature modification technique. For the high-frequency coefficients, we utilized the layer-weighted pulse coupled neural network fusion technique to acquire complementary pixel-level information. Finally, we employed the inverse NSST and IHS transforms to generate the fused resulting image.
Results: The proposed approach has been verified on 1350 image sets from two different diseases, Alzheimer's and glioma, across numerous imaging modalities. Our proposed method outperforms existing cutting-edge models in both qualitative and quantitative evaluations and provides valuable information for medical diagnosis. In the majority of cases, our proposed method performed well in terms of entropy, structure similarity index, standard deviation, average distance, and average pixel intensity due to the careful selection of unique fusion strategies in our model. However, in a few cases, NSSTSIPCA performs better than our proposed work in terms of intensity variations (mean absolute error and average distance).
Conclusions: This research work utilized various fusion strategies in the NSST domain to efficiently enhance structural, anatomical, and spectral information.
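
NSST implementations are not available in standard Python libraries, so the sketch below substitutes a single-level wavelet transform (PyWavelets) to show the general low-/high-frequency fusion pattern the Methods describe; the averaging and max-absolute rules here are simple stand-ins for the paper's structure-feature and PCNN fusion rules.

```python
import numpy as np
import pywt


def fuse_pair(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    """Toy two-image fusion in a multiscale transform domain.

    Stand-in for the paper's pipeline: a single-level DWT replaces NSST,
    averaging replaces the structure-feature rule for low frequencies,
    and max-absolute selection replaces the PCNN rule for high frequencies.
    """
    cA_a, (cH_a, cV_a, cD_a) = pywt.dwt2(img_a, "db2")
    cA_b, (cH_b, cV_b, cD_b) = pywt.dwt2(img_b, "db2")

    # Low-frequency band: blend to preserve overall structure and background.
    cA = 0.5 * (cA_a + cA_b)

    # High-frequency bands: keep the stronger detail coefficient from either image.
    def pick(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        return np.where(np.abs(a) >= np.abs(b), a, b)

    fused = (cA, (pick(cH_a, cH_b), pick(cV_a, cV_b), pick(cD_a, cD_b)))
    return pywt.idwt2(fused, "db2")
```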

Synomaly noise and multi-stage diffusion: A novel approach for unsupervised anomaly detection in medical images.

Bi Y, Huang L, Clarenbach R, Ghotbi R, Karlas A, Navab N, Jiang Z

PubMed · Jul 26, 2025
Anomaly detection in medical imaging plays a crucial role in identifying pathological regions across various imaging modalities, such as brain MRI, liver CT, and carotid ultrasound (US). However, training fully supervised segmentation models is often hindered by the scarcity of expert annotations and the complexity of diverse anatomical structures. To address these issues, we propose a novel unsupervised anomaly detection framework based on a diffusion model that incorporates a synthetic anomaly (Synomaly) noise function and a multi-stage diffusion process. Synomaly noise introduces synthetic anomalies into healthy images during training, allowing the model to effectively learn anomaly removal. The multi-stage diffusion process is introduced to progressively denoise images, preserving fine details while improving the quality of anomaly-free reconstructions. The generated high-fidelity counterfactual healthy images can further enhance the interpretability of the segmentation models, as well as provide a reliable baseline for evaluating the extent of anomalies and supporting clinical decision-making. Notably, the unsupervised anomaly detection model is trained purely on healthy images, eliminating the need for anomalous training samples and pixel-level annotations. We validate the proposed approach on brain MRI, liver CT, and carotid US datasets. The experimental results demonstrate that the proposed framework outperforms existing state-of-the-art unsupervised anomaly detection methods, achieving performance comparable to fully supervised segmentation models on the US dataset. Ablation studies further highlight the contributions of Synomaly noise and the multi-stage diffusion process in improving anomaly segmentation. These findings underscore the potential of our approach as a robust and annotation-efficient alternative for medical anomaly detection. Code: https://github.com/yuan-12138/Synomaly.
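
The exact Synomaly noise function is not specified in the abstract. As a rough sketch of the underlying idea of corrupting healthy training images with synthetic anomalies, one might inject random smooth Gaussian blobs; the blob shape, parameter ranges, and the function itself are assumptions, not the paper's design.

```python
import torch


def add_synthetic_anomalies(healthy: torch.Tensor, n_blobs: int = 3) -> torch.Tensor:
    """Corrupt a healthy (C, H, W) image with random blob-shaped anomalies.

    Illustrative stand-in for the paper's Synomaly noise: smooth Gaussian
    blobs with random centers, sizes, and signs are added to the image.
    """
    _, h, w = healthy.shape
    ys = torch.arange(h).view(h, 1).float()
    xs = torch.arange(w).view(1, w).float()
    corrupted = healthy.clone()
    for _ in range(n_blobs):
        cy = torch.randint(0, h, (1,)).item()
        cx = torch.randint(0, w, (1,)).item()
        sigma = torch.empty(1).uniform_(2.0, 12.0).item()
        amp = torch.empty(1).uniform_(-0.8, 0.8).item()
        blob = torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma**2))
        corrupted = corrupted + amp * blob  # broadcasts over channels
    return corrupted.clamp(0.0, 1.0)


# Training pair: the diffusion model learns to map `corrupted` back to
# `healthy`, so at test time genuine anomalies are removed the same way.
```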

FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation

Ze Rong, ZiYue Zhao, Zhaoxin Wang, Lei Ma

arXiv preprint · Jul 26, 2025
Accurate medical image segmentation remains challenging due to blurred lesion boundaries (LBA), loss of high-frequency details (LHD), and difficulty in modeling long-range anatomical structures (DC-LRSS). Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies, thereby substantially mitigating DC-LRSS. However, its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect, resulting in Local High-frequency Information Capture Deficiency (LHICD) and two-dimensional Spatial Structure Degradation (2D-SSD), which in turn exacerbate LBA and LHD. In this work, we propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules. A Multi-Scale Frequency Transform Module (MSFM) restores attenuated high-frequency cues by isolating and reconstructing multi-band spectra via wavelet, cosine, and Fourier transforms. A Self-Supervised Reconstruction Auxiliary Encoder (SSRAE) enforces pixel-level reconstruction on the shared Mamba encoder to recover full 2D spatial correlations, enhancing both fine textures and global context. Extensive evaluations on CAMUS echocardiography, MRI-based Mouse-cochlea, and Kvasir-Seg endoscopy demonstrate that FaRMamba consistently outperforms competitive CNN-Transformer hybrids and existing Mamba variants, delivering superior boundary accuracy, detail preservation, and global coherence without prohibitive computational overhead. This work provides a flexible frequency-aware framework for future segmentation models that directly mitigates core challenges in medical imaging.
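
As a hint of what one branch of the MSFM might do, here is a minimal Fourier-domain high-pass sketch; the paper's module also uses wavelet and cosine transforms and reconstructs multi-band spectra, and the single cutoff here is an assumed hyperparameter.

```python
import torch
from torch import nn


class FrequencyHighPassBranch(nn.Module):
    """Isolate high-frequency image content in the Fourier domain.

    Simplified sketch of one branch of a multi-scale frequency module;
    the cutoff ratio is an assumption, not the paper's setting.
    """

    def __init__(self, cutoff_ratio: float = 0.25):
        super().__init__()
        self.cutoff_ratio = cutoff_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # (B, C, H, W)
        freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
        _, _, h, w = x.shape
        cy, cx = h // 2, w // 2
        ry, rx = int(h * self.cutoff_ratio / 2), int(w * self.cutoff_ratio / 2)
        # Zero out the centered low-frequency band; keep everything else.
        mask = torch.ones(h, w, device=x.device)
        mask[cy - ry : cy + ry, cx - rx : cx + rx] = 0.0
        high = freq * mask
        return torch.fft.ifft2(torch.fft.ifftshift(high, dim=(-2, -1))).real
```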

A triple pronged approach for ulcerative colitis severity classification using multimodal, meta, and transformer based learning.

Ahmed MN, Neogi D, Kabir MR, Rahman S, Momen S, Mohammed N

PubMed · Jul 26, 2025
Ulcerative colitis (UC) is a chronic inflammatory disorder necessitating precise severity stratification to facilitate optimal therapeutic interventions. This study harnesses a triple-pronged deep learning methodology-including multimodal inference pipelines that eliminate domain-specific training, few-shot meta-learning, and Vision Transformer (ViT)-based ensembling-to classify UC severity within the HyperKvasir dataset. We systematically evaluate multiple vision transformer architectures, discovering that a Swin-Base model achieves an accuracy of 90%, while a soft-voting ensemble of diverse ViT backbones boosts performance to 93%. In parallel, we leverage multimodal pre-trained frameworks (e.g., CLIP, BLIP, FLAVA) integrated with conventional machine learning algorithms, yielding an accuracy of 83%. To address limited annotated data, we deploy few-shot meta-learning approaches (e.g., Matching Networks), attaining 83% accuracy in a 5-shot context. Furthermore, interpretability is enhanced via SHapley Additive exPlanations (SHAP), which interpret both local and global model behaviors, thereby fostering clinical trust in the model's inferences. These findings underscore the potential of contemporary representation learning and ensemble strategies for robust UC severity classification, highlighting the pivotal role of model transparency in facilitating medical image analysis.
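
The 93% result comes from soft voting, which is simple to state precisely: average the per-class probabilities of all backbones, then take the argmax. A minimal sketch; the specific ViT backbones and any per-model weighting are assumptions.

```python
import torch


@torch.no_grad()
def soft_vote(models, images: torch.Tensor) -> torch.Tensor:
    """Average class probabilities across backbones (soft voting).

    `models` is any iterable of classifiers returning logits of shape
    (batch, n_classes); returns the ensemble's predicted class indices.
    """
    probs = torch.stack(
        [torch.softmax(m(images), dim=-1) for m in models]
    )  # (n_models, batch, n_classes)
    return probs.mean(dim=0).argmax(dim=-1)
```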

Leveraging Fine-Tuned Large Language Models for Interpretable Pancreatic Cystic Lesion Feature Extraction and Risk Categorization

Ebrahim Rasromani, Stella K. Kang, Yanqi Xu, Beisong Liu, Garvit Luhadia, Wan Fung Chui, Felicia L. Pasadyn, Yu Chih Hung, Julie Y. An, Edwin Mathieu, Zehui Gu, Carlos Fernandez-Granda, Ammar A. Javed, Greg D. Sacks, Tamas Gonda, Chenchan Huang, Yiqiu Shen

arXiv preprint · Jul 26, 2025
Background: Manual extraction of pancreatic cystic lesion (PCL) features from radiology reports is labor-intensive, limiting large-scale studies needed to advance PCL research. Purpose: To develop and evaluate large language models (LLMs) that automatically extract PCL features from MRI/CT reports and assign risk categories based on guidelines. Materials and Methods: We curated a training dataset of 6,000 abdominal MRI/CT reports (2005-2024) from 5,134 patients that described PCLs. Labels were generated by GPT-4o using chain-of-thought (CoT) prompting to extract PCL and main pancreatic duct features. Two open-source LLMs were fine-tuned using QLoRA on GPT-4o-generated CoT data. Features were mapped to risk categories per institutional guideline based on the 2017 ACR White Paper. Evaluation was performed on 285 held-out human-annotated reports. Model outputs for 100 cases were independently reviewed by three radiologists. Feature extraction was evaluated using exact match accuracy, risk categorization with macro-averaged F1 score, and radiologist-model agreement with Fleiss' Kappa. Results: CoT fine-tuning improved feature extraction accuracy for LLaMA (80% to 97%) and DeepSeek (79% to 98%), matching GPT-4o (97%). Risk categorization F1 scores also improved (LLaMA: 0.95; DeepSeek: 0.94), closely matching GPT-4o (0.97), with no statistically significant differences. Radiologist inter-reader agreement was high (Fleiss' Kappa = 0.888) and showed no statistically significant difference with the addition of DeepSeek-FT-CoT (Fleiss' Kappa = 0.893) or GPT-CoT (Fleiss' Kappa = 0.897), indicating that both models achieved agreement levels on par with radiologists. Conclusion: Fine-tuned open-source LLMs with CoT supervision enable accurate, interpretable, and efficient phenotyping for large-scale PCL research, achieving performance comparable to GPT-4o.
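
The fine-tuning recipe (QLoRA on CoT-labeled reports) maps onto the standard Hugging Face peft/transformers stack. A minimal sketch; the base checkpoint, LoRA rank, and target modules below are placeholders, not the paper's settings.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder; paper's checkpoint not stated

# 4-bit quantized base model: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(BASE)

# Low-rank adapters are the only trainable weights.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed; varies by architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# Each training example would pair a report with a GPT-4o chain-of-thought
# label, e.g. "<report>\nReasoning: ...\nFeatures: {...}\nRisk: ..." .
```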

All-in-One Medical Image Restoration with Latent Diffusion-Enhanced Vector-Quantized Codebook Prior

Haowei Chen, Zhiwen Yang, Haotian Hou, Hui Zhang, Bingzheng Wei, Gang Zhou, Yan Xu

arXiv preprint · Jul 26, 2025
All-in-one medical image restoration (MedIR) aims to address multiple MedIR tasks using a unified model, concurrently recovering various high-quality (HQ) medical images (e.g., MRI, CT, and PET) from low-quality (LQ) counterparts. However, all-in-one MedIR presents significant challenges due to the heterogeneity across different tasks. Each task involves distinct degradations, leading to diverse information losses in LQ images. Existing methods struggle to handle these diverse information losses associated with different tasks. To address these challenges, we propose a latent diffusion-enhanced vector-quantized codebook prior and develop DiffCode, a novel framework leveraging this prior for all-in-one MedIR. Specifically, to compensate for diverse information losses associated with different tasks, DiffCode constructs a task-adaptive codebook bank to integrate task-specific HQ prior features across tasks, capturing a comprehensive prior. Furthermore, to enhance prior retrieval from the codebook bank, DiffCode introduces a latent diffusion strategy that utilizes the diffusion model's powerful mapping capabilities to iteratively refine the latent feature distribution, estimating more accurate HQ prior features during restoration. With the help of the task-adaptive codebook bank and latent diffusion strategy, DiffCode achieves superior performance in both quantitative metrics and visual quality across three MedIR tasks: MRI super-resolution, CT denoising, and PET synthesis.
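
Stripped of DiffCode's task-adaptive bank and diffusion refinement, the core codebook-prior step is a nearest-neighbor lookup into a learned set of high-quality feature vectors. A minimal sketch, with sizes chosen arbitrarily:

```python
import torch
from torch import nn


class CodebookLookup(nn.Module):
    """Nearest-neighbor retrieval from a vector-quantized codebook.

    Each degraded latent vector is replaced by its closest HQ codebook
    entry; DiffCode's task-adaptive bank and latent diffusion refinement
    are omitted in this sketch.
    """

    def __init__(self, n_codes: int = 1024, dim: int = 256):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:  # (B, N, dim)
        # Distance from every latent vector to every codebook entry.
        codes = self.codebook.weight.unsqueeze(0).expand(z.size(0), -1, -1)
        dists = torch.cdist(z, codes)            # (B, N, n_codes)
        idx = dists.argmin(dim=-1)               # (B, N)
        quantized = self.codebook(idx)           # (B, N, dim)
        # Straight-through estimator so gradients still reach the encoder.
        return z + (quantized - z).detach()
```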

CLT-MambaSeg: An integrated model of Convolution, Linear Transformer and Multiscale Mamba for medical image segmentation.

Uppal D, Prakash S

PubMed · Jul 26, 2025
Recent advances in deep learning have significantly enhanced the performance of medical image segmentation. However, maintaining a balanced integration of feature localization, global context modeling, and computational efficiency remains a critical research challenge. Convolutional Neural Networks (CNNs) effectively capture fine-grained local features through hierarchical convolutions; however, they often struggle to model long-range dependencies due to their limited receptive field. Transformers address this limitation by leveraging self-attention mechanisms to capture global context, but they are computationally intensive and require large-scale data for effective training. The Mamba architecture has emerged as a promising approach, effectively capturing long-range dependencies while maintaining low computational overhead and high segmentation accuracy. Based on this, we propose a method named CLT-MambaSeg that integrates Convolution, Linear Transformer, and Multiscale Mamba architectures to capture local features, model global context, and improve computational efficiency for medical image segmentation. It utilizes a convolution-based Spatial Representation Extraction (SREx) module to capture intricate spatial relationships and dependencies. Further, it comprises a Mamba Vision Linear Transformer (MVLTrans) module to capture multiscale context, spatial and sequential dependencies, and enhanced global context. In addition, to address the problem of limited data, we propose a novel Memory-Guided Augmentation Generative Adversarial Network (MeGA-GAN) that generates synthetic realistic images to further enhance the segmentation performance. We conduct extensive experiments and ablation studies on the five benchmark datasets, namely CVC-ClinicDB, Breast UltraSound Images (BUSI), PH2, and two datasets from the International Skin Imaging Collaboration (ISIC), namely ISIC-2016 and ISIC-2017. Experimental results demonstrate the efficacy of the proposed CLT-MambaSeg compared to other state-of-the-art methods.
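
Of the three ingredients, the linear transformer is the most self-contained to sketch: kernelized attention re-associates the (QK^T)V product so it runs in time linear in sequence length. The ELU+1 feature map below is the common choice from the linear-attention literature (Katharopoulos et al.), not necessarily what MVLTrans uses.

```python
import torch
from torch import nn
from torch.nn import functional as F


class LinearAttention(nn.Module):
    """Kernelized linear attention, O(N) in sequence length.

    Sketch of the "Linear Transformer" ingredient; the paper's MVLTrans
    module is more involved (multiscale context, Mamba integration).
    """

    def __init__(self, dim: int):
        super().__init__()
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # (B, N, dim)
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # Positive feature map so attention weights stay non-negative.
        q, k = F.elu(q) + 1, F.elu(k) + 1
        # Associativity trick: (Q K^T) V == Q (K^T V), computed right-first.
        kv = torch.einsum("bnd,bne->bde", k, v)              # (B, dim, dim)
        norm = torch.einsum("bnd,bd->bn", q, k.sum(dim=1))   # (B, N)
        out = torch.einsum("bnd,bde->bne", q, kv) / norm.unsqueeze(-1)
        return self.proj(out)
```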
