
Abbasi AA, Farooqi AH

PubMed · Sep 25, 2025
Deep learning has significantly advanced medical imaging, particularly computed tomography (CT), which is vital for diagnosing cardiac and oncologic patients, evaluating treatments, and tracking disease progression. High-quality CT images enhance clinical decision-making, making image reconstruction a key research focus. This study develops a framework to improve CT image quality while minimizing reconstruction time. The proposed four-step medical image analysis framework includes reconstruction, preprocessing, segmentation, and image description. Initially, a Radon transform is applied to the raw projection data to generate a sinogram, which is then used to reconstruct a CT image of the pelvis. A convolutional neural network (CNN) ensures high-quality reconstruction. A bilateral filter reduces noise while preserving critical anatomical features. If required, a medical expert can review the image. The K-means clustering algorithm segments the preprocessed image, isolating the pelvis and removing irrelevant structures. Finally, the FuseCap model generates an automated textual description to assist radiologists. The framework's effectiveness is evaluated using peak signal-to-noise ratio (PSNR), normalized mean square error (NMSE), and structural similarity index measure (SSIM). The achieved values (PSNR 30.784, NMSE 0.032, and SSIM 0.877) demonstrate superior performance compared to existing methods. The proposed framework reconstructs high-quality CT images from raw projection data, integrating segmentation and automated descriptions to provide a decision-support tool for medical experts. By enhancing image clarity, segmenting outputs, and providing descriptive insights, this research aims to reduce the workload of frontline medical professionals and improve diagnostic efficiency.
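
The reconstruction, denoising, and segmentation steps described above all have standard open-source counterparts. Below is a minimal sketch that chains sinogram formation (Radon transform), filtered back-projection, bilateral denoising, and K-means segmentation on a toy Shepp-Logan phantom in place of real pelvis projection data; the CNN reconstruction and FuseCap captioning stages are omitted, and all parameter values are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
import cv2
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon
from sklearn.cluster import KMeans

# Forward-project a toy phantom to obtain a sinogram, then reconstruct by filtered back-projection.
image = shepp_logan_phantom()
angles = np.linspace(0.0, 180.0, 180, endpoint=False)
sinogram = radon(image, theta=angles)
recon = iradon(sinogram, theta=angles)

# Edge-preserving noise reduction with a bilateral filter (OpenCV expects float32 input).
denoised = cv2.bilateralFilter(recon.astype(np.float32), d=9, sigmaColor=0.1, sigmaSpace=5)

# K-means clustering on intensities to separate the anatomy of interest from background.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    denoised.reshape(-1, 1)
).reshape(denoised.shape)
print(recon.shape, labels.shape)
```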

Abhishek Moturu, Anna Goldenberg, Babak Taati

arXiv preprint · Sep 25, 2025
Training deep neural networks in the presence of noisy labels and data heterogeneity is a major challenge. We introduce Lightweight Learnable Adaptive Weighting (LiLAW), a novel method that dynamically adjusts the loss weight of each training sample based on its evolving difficulty level, categorized as easy, moderate, or hard. Using only three learnable parameters, LiLAW adaptively prioritizes informative samples throughout training by updating these weights using a single mini-batch gradient descent step on the validation set after each training mini-batch, without requiring excessive hyperparameter tuning or a clean validation set. Extensive experiments across multiple general and medical imaging datasets, noise levels and types, loss functions, and architectures with and without pretraining demonstrate that LiLAW consistently enhances performance, even in high-noise environments. It is effective without heavy reliance on data augmentation or advanced regularization, highlighting its practicality. It offers a computationally efficient solution to boost model generalization and robustness in any neural network training setup.
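
The description above suggests a compact mechanism: three learnable weights, one per difficulty bucket, re-weight per-sample losses and are themselves updated by one gradient step on a validation mini-batch. A minimal PyTorch sketch of that idea follows; the quantile-based bucketing rule, the virtual-update meta-gradient, and all hyperparameters are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

class DifficultyWeights(torch.nn.Module):
    """Three learnable loss weights, one per difficulty bucket (easy/moderate/hard)."""
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.ones(3))

    def forward(self, per_sample_loss):
        # Assign each sample to a bucket by loss quantiles within the batch (assumed rule).
        q = torch.quantile(per_sample_loss.detach(), torch.tensor([0.33, 0.66]))
        bucket = (per_sample_loss.detach() > q[0]).long() + (per_sample_loss.detach() > q[1]).long()
        return (torch.softmax(self.w, dim=0)[bucket] * per_sample_loss).mean()

model = torch.nn.Linear(10, 2)
weighter = DifficultyWeights()
opt_model = torch.optim.SGD(model.parameters(), lr=1e-2)
opt_w = torch.optim.SGD(weighter.parameters(), lr=1e-2)
lr = 1e-2

x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))    # training mini-batch
xv, yv = torch.randn(32, 10), torch.randint(0, 2, (32,))  # validation mini-batch

# Virtual one-step model update under the current weights (keeps the graph for the meta-step).
weighted = weighter(F.cross_entropy(model(x), y, reduction="none"))
grads = torch.autograd.grad(weighted, list(model.parameters()), create_graph=True)
w_fast, b_fast = [p - lr * g for p, g in zip(model.parameters(), grads)]

# Single gradient step on the three weights using the validation mini-batch.
val_loss = F.cross_entropy(F.linear(xv, w_fast, b_fast), yv)
opt_w.zero_grad(); val_loss.backward(); opt_w.step()

# Actual model update with the re-weighted training loss.
loss = weighter(F.cross_entropy(model(x), y, reduction="none"))
opt_model.zero_grad(); loss.backward(); opt_model.step()
```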

Suaiba Amina Salahuddin, Teresa Dorszewski, Marit Almenning Martiniussen, Tone Hovda, Antonio Portaluri, Solveig Thrun, Michael Kampffmeyer, Elisabeth Wetzer, Kristoffer Wickstrøm, Robert Jenssen

arXiv preprint · Sep 25, 2025
Understanding what deep learning (DL) models learn is essential for the safe deployment of artificial intelligence (AI) in clinical settings. While previous work has focused on pixel-based explainability methods, less attention has been paid to the textual concepts learned by these models, which may better reflect the reasoning used by clinicians. We introduce Mammo-CLIP Dissect, the first concept-based explainability framework for systematically dissecting DL vision models trained for mammography. Leveraging a mammography-specific vision-language model (Mammo-CLIP) as a "dissector," our approach labels neurons at specified layers with human-interpretable textual concepts and quantifies their alignment to domain knowledge. Using Mammo-CLIP Dissect, we investigate three key questions: (1) how concept learning differs between DL vision models trained on general image datasets versus mammography-specific datasets; (2) how fine-tuning for downstream mammography tasks affects concept specialisation; and (3) which mammography-relevant concepts remain underrepresented. We show that models trained on mammography data capture more clinically relevant concepts and align more closely with radiologists' workflows than models not trained on mammography data. Fine-tuning for task-specific classification enhances the capture of certain concept categories (e.g., benign calcifications) but can reduce coverage of others (e.g., density-related features), indicating a trade-off between specialisation and generalisation. Our findings show that Mammo-CLIP Dissect provides insights into how convolutional neural networks (CNNs) capture mammography-specific knowledge. By comparing models across training data and fine-tuning regimes, we reveal how domain-specific training and task-specific adaptation shape concept learning. Code and concept set are available: https://github.com/Suaiba/Mammo-CLIP-Dissect.
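
The "dissection" step can be pictured as matching each neuron's activation profile over a probing set against concept-image similarity profiles from a CLIP-style model, then labeling the neuron with the best-matching concept. The sketch below illustrates this with random placeholder activations and similarities and a simple correlation score; the paper's exact alignment metric and concept set are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_neurons, n_concepts = 500, 64, 20
concepts = [f"concept_{i}" for i in range(n_concepts)]  # e.g. "benign calcification", "density"

# Hypothetical precomputed inputs:
#   acts[i, j] = spatially pooled activation of neuron j on probing image i
#   sims[i, k] = vision-language similarity between probing image i and concept text k
acts = rng.standard_normal((n_images, n_neurons))
sims = rng.standard_normal((n_images, n_concepts))

# Correlate every neuron's activation profile with every concept's similarity profile.
acts_z = (acts - acts.mean(0)) / (acts.std(0) + 1e-8)
sims_z = (sims - sims.mean(0)) / (sims.std(0) + 1e-8)
corr = acts_z.T @ sims_z / n_images  # (n_neurons, n_concepts)

# Each neuron is labeled with its best-aligned concept and that alignment score.
neuron_labels = [concepts[k] for k in corr.argmax(axis=1)]
neuron_scores = corr.max(axis=1)
print(neuron_labels[:3], neuron_scores[:3])
```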

Wenjun Yang, Chia-Ho Hua, Tina Davis, Jinsoo Uh, Thomas E. Merchant

arXiv preprint · Sep 25, 2025
Purpose: We developed an artificial neural network (ANN) combining radiomics with clinical and dosimetric features to predict the extent of body mass index (BMI) increase after surgery and proton therapy, with the advantages of improved accuracy and integrated key feature selection. Methods and Materials: A uniform treatment protocol consisting of limited surgery and proton radiotherapy was given to 84 pediatric craniopharyngioma patients (aged 1-20 years). Post-treatment obesity was classified into 3 groups (<10%, 10-20%, and >20%) based on the normalized BMI increase during a 5-year follow-up. We developed a densely connected 4-layer ANN that takes as input radiomics calculated from pre-surgery MRI (T1w, T2w, and FLAIR) combined with clinical and dosimetric features. Accuracy, area under the receiver operating characteristic curve (AUC), and confusion matrices were compared with random forest (RF) models in a 5-fold cross-validation. Group lasso regularization enforced a sparse connection to the input neurons to identify key features from the high-dimensional input. Results: Classification accuracy of the ANN reached above 0.9 for T1w, T2w, and FLAIR MRI. Confusion matrices showed high true positive rates above 0.9, while the false positive rates were below 0.2. Approximately 10 key features were selected for T1w, T2w, and FLAIR MRI, respectively. The ANN improved classification accuracy by 10% and 5% compared to RF models without and with radiomic features, respectively. Conclusion: The ANN model improved classification accuracy on post-treatment obesity compared to conventional statistical models. The clinical features selected by Group lasso regularization confirmed our practical observations, while the additional radiomic and dosimetric features could serve as imaging markers and inform mitigation strategies for post-treatment obesity in pediatric craniopharyngioma patients.
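
The Group lasso selection described above can be pictured as penalizing, for each input feature, the L2 norm of its outgoing weight column in the first layer, so that uninformative features are zeroed out as whole groups. A minimal PyTorch sketch follows; the network size, penalty weight, retention threshold, and training loop are illustrative assumptions, not the study's configuration.

```python
import torch
import torch.nn.functional as F

n_features, n_classes, lam = 120, 3, 1e-2
net = torch.nn.Sequential(
    torch.nn.Linear(n_features, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, n_classes),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x, y = torch.randn(256, n_features), torch.randint(0, n_classes, (256,))  # toy feature table

for _ in range(200):
    logits = net(x)
    # Group lasso: sum over input features of the L2 norm of each feature's outgoing weight column.
    group_penalty = net[0].weight.norm(dim=0).sum()
    loss = F.cross_entropy(logits, y) + lam * group_penalty
    opt.zero_grad(); loss.backward(); opt.step()

# Features whose group norm is driven toward zero are effectively dropped; the rest are "key".
group_norms = net[0].weight.norm(dim=0)
key_features = torch.nonzero(group_norms > 0.05).squeeze(1)
print(f"{key_features.numel()} of {n_features} features retained")
```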

Zhijian Yang, Noel DSouza, Istvan Megyeri, Xiaojian Xu, Amin Honarmandi Shandiz, Farzin Haddadpour, Krisztian Koos, Laszlo Rusko, Emanuele Valeriano, Bharadwaj Swaninathan, Lei Wu, Parminder Bhatia, Taha Kass-Hout, Erhan Bas

arXiv preprint · Sep 25, 2025
Magnetic Resonance Imaging (MRI) is a critical medical imaging modality in clinical diagnosis and research, yet its complexity and heterogeneity pose challenges for automated analysis, particularly in scalable and generalizable machine learning applications. While foundation models have revolutionized natural language and vision tasks, their application to MRI remains limited due to data scarcity and narrow anatomical focus. In this work, we present Decipher-MR, a 3D MRI-specific vision-language foundation model trained on a large-scale dataset comprising 200,000 MRI series from over 22,000 studies spanning diverse anatomical regions, sequences, and pathologies. Decipher-MR integrates self-supervised vision learning with report-guided text supervision to build robust, generalizable representations, enabling effective adaptation across broad applications. To support diverse clinical tasks with minimal computational overhead, Decipher-MR adopts a modular design in which lightweight, task-specific decoders are tuned while the pretrained encoder remains frozen. In this setting, we evaluate Decipher-MR across diverse benchmarks including disease classification, demographic prediction, anatomical localization, and cross-modal retrieval, demonstrating consistent performance gains over existing foundation models and task-specific approaches. Our results establish Decipher-MR as a scalable and versatile foundation for MRI-based AI, facilitating efficient development across clinical and research domains.
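
The modular adaptation pattern described above, a frozen pretrained encoder with a lightweight trainable decoder, is straightforward to express in code. The sketch below uses a toy 3D encoder and a linear head purely as placeholders for Decipher-MR's actual components.

```python
import torch

class TinyEncoder(torch.nn.Module):
    """Placeholder for a pretrained 3D MRI encoder."""
    def __init__(self, dim=256):
        super().__init__()
        self.backbone = torch.nn.Sequential(
            torch.nn.Conv3d(1, 8, 3, stride=2, padding=1), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool3d(1), torch.nn.Flatten(),
            torch.nn.Linear(8, dim),
        )

    def forward(self, x):
        return self.backbone(x)

encoder = TinyEncoder()
encoder.requires_grad_(False).eval()            # frozen pretrained encoder
decoder = torch.nn.Linear(256, 5)               # lightweight task-specific decoder (e.g. disease classes)
opt = torch.optim.AdamW(decoder.parameters(), lr=1e-3)

volumes = torch.randn(4, 1, 16, 32, 32)         # toy 3D MRI mini-batch
labels = torch.randint(0, 5, (4,))

with torch.no_grad():
    features = encoder(volumes)                 # no gradients flow into the encoder
loss = torch.nn.functional.cross_entropy(decoder(features), labels)
opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```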

Tristan S. W. Stevens, Oisín Nolan, Jean-Luc Robert, Ruud J. G. van Sloun

arXiv preprint · Sep 25, 2025
Video sequences often contain structured noise and background artifacts that obscure dynamic content, posing challenges for accurate analysis and restoration. Robust principal component analysis (RPCA) methods address this by decomposing data into low-rank and sparse components. Still, the sparsity assumption often fails to capture the rich variability present in real video data. To overcome this limitation, a hybrid framework that integrates low-rank temporal modeling with diffusion posterior sampling is proposed. The proposed method, Nuclear Diffusion, is evaluated on a real-world medical imaging problem, namely cardiac ultrasound dehazing, and demonstrates improved dehazing performance compared to traditional RPCA in terms of contrast enhancement (gCNR) and signal preservation (KS statistic). These results highlight the potential of combining model-based temporal modeling with deep generative priors for high-fidelity video restoration.
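
For reference, the RPCA baseline mentioned above decomposes a matrix of flattened video frames into a low-rank background and a sparse foreground. The sketch below solves a toy version with alternating singular-value and soft thresholding; the thresholds and iteration count are illustrative assumptions, and the paper's contribution, replacing the sparse prior with a diffusion posterior sampler, is not shown here.

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Elementwise soft thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

rng = np.random.default_rng(0)
# Toy "video": 64 pixels x 30 frames = static background (rank 1) + small dynamic noise.
frames = np.outer(rng.standard_normal(64), np.ones(30)) + 0.1 * rng.standard_normal((64, 30))

L = np.zeros_like(frames)   # low-rank background estimate
S = np.zeros_like(frames)   # sparse / dynamic content estimate
for _ in range(50):
    L = svt(frames - S, tau=1.0)
    S = soft(frames - L, tau=0.1)

print("rank of background:", np.linalg.matrix_rank(L, tol=1e-3))
```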

Hwa Hui Tew, Junn Yong Loo, Yee-Fan Tan, Xinyu Tang, Hernando Ombao, Fuad Noman, Raphael C. -W. Phan, Chee-Ming Ting

arXiv preprint · Sep 25, 2025
Functional Magnetic Resonance Imaging (fMRI) is an advanced neuroimaging method that enables in-depth analysis of brain activity by measuring dynamic changes in the blood oxygenation level-dependent (BOLD) signals. However, the resource-intensive nature of fMRI data acquisition limits the availability of high-fidelity samples required for data-driven brain analysis models. While modern generative models can synthesize fMRI data, they often underperform because they overlook the complex non-stationarity and nonlinear BOLD dynamics. To address these challenges, we introduce T2I-Diff, an fMRI generation framework that leverages time-frequency representation of BOLD signals and classifier-free denoising diffusion. Specifically, our framework first converts BOLD signals into windowed spectrograms via a time-dependent Fourier transform, capturing both the underlying temporal dynamics and spectral evolution. Subsequently, a classifier-free diffusion model is trained to generate class-conditioned frequency spectrograms, which are then reverted to BOLD signals via inverse Fourier transforms. Finally, we validate the efficacy of our approach by demonstrating improved accuracy and generalization in downstream fMRI-based brain network classification.
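
The time-frequency step described above maps a BOLD time series to a windowed spectrogram and back. A minimal sketch using a short-time Fourier transform is shown below; the sampling rate, window length, and toy signal are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 0.5                                        # toy sampling rate in Hz (TR = 2 s)
t = np.arange(0, 600, 1 / fs)                   # 10-minute BOLD time series
rng = np.random.default_rng(0)
bold = np.sin(2 * np.pi * 0.05 * t) + 0.3 * rng.standard_normal(t.size)

# Forward: windowed spectrogram capturing both temporal dynamics and spectral evolution.
freqs, times, spec = stft(bold, fs=fs, nperseg=64)

# Inverse: map a (possibly generated) spectrogram back to a BOLD signal.
_, bold_rec = istft(spec, fs=fs, nperseg=64)
print(spec.shape, np.max(np.abs(bold - bold_rec[: bold.size])))
```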

Thanh Binh Le, Hoang Nhat Khang Vo, Tan-Ha Mai, Trong Nhan Phan

arXiv preprint · Sep 25, 2025
Low back pain affects millions worldwide, driving the need for robust diagnostic models that can jointly analyze complex medical images and accompanying text reports. We present LumbarCLIP, a novel multimodal framework that leverages contrastive language-image pretraining to align lumbar spine MRI scans with corresponding radiological descriptions. Built upon a curated dataset containing axial MRI views paired with expert-written reports, LumbarCLIP integrates vision encoders (ResNet-50, Vision Transformer, Swin Transformer) with a BERT-based text encoder to extract dense representations. These are projected into a shared embedding space via learnable projection heads, configurable as linear or non-linear, and normalized to facilitate stable contrastive training using a soft CLIP loss. Our model achieves state-of-the-art performance on downstream classification, reaching up to 95.00% accuracy and 94.75% F1-score on the test set, despite inherent class imbalance. Extensive ablation studies demonstrate that linear projection heads yield more effective cross-modal alignment than non-linear variants. LumbarCLIP offers a promising foundation for automated musculoskeletal diagnosis and clinical decision support.
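
The core training objective above, projecting image and text features into a shared space and aligning them with a symmetric contrastive loss, can be sketched as follows. The label-smoothed targets stand in for the paper's soft CLIP loss, and the random feature tensors stand in for the vision and BERT encoder outputs; both are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

batch, d_img, d_txt, d_proj = 8, 2048, 768, 256
img_proj = torch.nn.Linear(d_img, d_proj)              # linear projection head (reported best in ablations)
txt_proj = torch.nn.Linear(d_txt, d_proj)
logit_scale = torch.nn.Parameter(torch.tensor(2.66))   # learnable log-temperature

img_feat = torch.randn(batch, d_img)                   # stand-in for vision encoder output
txt_feat = torch.randn(batch, d_txt)                   # stand-in for BERT encoder output

z_img = F.normalize(img_proj(img_feat), dim=-1)
z_txt = F.normalize(txt_proj(txt_feat), dim=-1)
logits = logit_scale.exp() * z_img @ z_txt.t()         # (batch, batch) similarity matrix

# "Soft" targets: a label-smoothed identity so non-matching pairs keep a little probability mass.
eps = 0.1
targets = torch.full((batch, batch), eps / (batch - 1))
targets.fill_diagonal_(1.0 - eps)

loss = 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
loss.backward()
print(loss.item())
```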

Wenjun Yang, Chuang Wang, Tina Davis, Jinsoo Uh, Chia-Ho Hua, Thomas E. Merchant

arXiv preprint · Sep 25, 2025
Purpose: Convolutional neural networks (CNNs) are promising in predicting treatment outcome for pediatric craniopharyngioma, but their decision mechanisms are difficult to interpret. We compared the activation maps of a CNN with the hand-crafted radiomic features of a densely connected artificial neural network (ANN) to correlate with clinical decisions. Methods: A cohort of 100 pediatric craniopharyngioma patients was included. Binary tumor progression was classified by an ANN and a CNN with T1w, T2w, and FLAIR MRI as input. Hand-crafted radiomic features were calculated from the MRI using the LifeX software, and key features were selected by Group lasso regularization and compared with the activation maps of the CNN. We evaluated the radiomics models by accuracy, area under the receiver operating characteristic curve (AUC), and confusion matrices. Results: The average accuracy for T1w, T2w, and FLAIR MRI was 0.85, 0.92, and 0.86 (ANOVA, F = 1.96, P = 0.18) with the ANN, and 0.83, 0.81, and 0.70 (ANOVA, F = 10.11, P = 0.003) with the CNN. The average AUC of the ANN was 0.91, 0.97, and 0.90, versus 0.86, 0.88, and 0.75 for the CNN across the three MRI sequences, respectively. The activation maps were correlated with tumor shape, minimum and maximum intensity, and texture features. Conclusions: Prediction of tumor progression for pediatric craniopharyngioma patients achieved promising accuracy with both the ANN and CNN models. The activation maps extracted from different levels were interpreted using the hand-crafted key features of the ANN.
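
Activation maps like those compared above are typically pulled from intermediate CNN layers with forward hooks. The sketch below shows that mechanism on a toy network; the architecture and layer choice are illustrative assumptions, not the study's model.

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 32, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(32, 2),
)

activations = {}
def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

net[2].register_forward_hook(save_activation("conv2"))  # tap the second convolution

mri_slice = torch.randn(1, 1, 128, 128)                  # toy T1w/T2w/FLAIR slice
_ = net(mri_slice)
print(activations["conv2"].shape)                        # (1, 32, 128, 128) activation maps
```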

Hameed M, Raja MAZ, Zameer A, Dar HS, Alluhaidan AS, Aziz R

PubMed · Sep 25, 2025
Acute myeloid leukemia (AML) is recognized as a highly aggressive cancer that affects the bone marrow and blood, making it the most lethal type of leukemia. The detection of AML through medical imaging is challenging due to the complex structural and textural variations inherent in bone marrow images. These challenges are further intensified by the overlapping intensity between leukemia and non-leukemia regions, which reduces the effectiveness of traditional predictive models. This study presents a novel artificial intelligence framework that utilizes residual blocks merging vision transformers and convolutions, together with advanced object detection techniques, to address the complexities of bone marrow images and enhance the accuracy of AML detection. The framework integrates residual learning-based vision transformer (ReLViT) blocks within a bottleneck architecture, harnessing the combined strengths of residual learning and transformer mechanisms to improve feature representation and computational efficiency. Tailored data pre-processing strategies are employed to manage the textural and structural complexities associated with low-quality images and tumor shapes. The framework's performance is further optimized through a strategic weight-sharing technique to minimize computational overhead. Additionally, a generative adversarial network (GAN) is employed to enhance image quality across all AML imaging modalities, and when combined with a You Only Look Once (YOLO) object detector, it accurately localizes tumor formations in bone marrow images. Extensive and comparative evaluations have demonstrated the superiority of the proposed framework over existing deep convolutional neural networks (CNNs) and object detection methods. The model achieves an F1-score of 99.15%, precision of 99.02%, and recall of 99.16%, marking a significant advancement in the field of medical imaging.
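
A residual block that merges a convolutional path with a transformer-style self-attention path, in the spirit of the ReLViT blocks described above, can be sketched as follows; the dimensions and the additive fusion are illustrative assumptions rather than the paper's exact design.

```python
import torch

class ConvAttnResidualBlock(torch.nn.Module):
    """Residual block with parallel convolution and self-attention paths."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(channels, channels, 3, padding=1),
            torch.nn.BatchNorm2d(channels), torch.nn.ReLU(),
        )
        self.attn = torch.nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = torch.nn.LayerNorm(channels)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        conv_out = self.conv(x)
        tokens = self.norm(x.flatten(2).transpose(1, 2))    # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, h, w)
        return x + conv_out + attn_out                      # residual merge of both paths

block = ConvAttnResidualBlock()
patch = torch.randn(2, 64, 32, 32)                          # toy bone-marrow feature maps
print(block(patch).shape)                                   # torch.Size([2, 64, 32, 32])
```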