
Comparative Analysis of CNN Performance in Keras, PyTorch and JAX on PathMNIST

Anida Nezović, Jalal Romano, Nada Marić, Medina Kapo, Amila Akagić

arXiv preprint · Jul 16, 2025
Deep learning has significantly advanced the field of medical image classification, particularly with the adoption of Convolutional Neural Networks (CNNs). Various deep learning frameworks such as Keras, PyTorch and JAX offer unique advantages in model development and deployment. However, their comparative performance in medical imaging tasks remains underexplored. This study presents a comprehensive analysis of CNN implementations across these frameworks, using the PathMNIST dataset as a benchmark. We evaluate training efficiency, classification accuracy and inference speed to assess their suitability for real-world applications. Our findings highlight the trade-offs between computational speed and model accuracy, offering valuable insights for researchers and practitioners in medical image analysis.
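
As a rough illustration of the kind of benchmark the study describes, the sketch below times one training epoch of a small CNN on PathMNIST-shaped data (3x28x28 RGB patches, 9 tissue classes) in PyTorch. The architecture and the synthetic data are placeholders, not the paper's models; analogous Keras and Flax/JAX implementations would be timed the same way to compare frameworks.

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SmallCNN(nn.Module):
    """Placeholder CNN for 3x28x28 PathMNIST-style patches, 9 tissue classes."""
    def __init__(self, num_classes: int = 9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def time_one_epoch(model, loader, device="cpu"):
    """Wall-clock seconds for one full training pass over the loader."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    model.to(device).train()
    start = time.perf_counter()
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x.to(device)), y.to(device)).backward()
        opt.step()
    return time.perf_counter() - start

# Synthetic stand-in so the sketch runs without downloading PathMNIST.
xs, ys = torch.randn(512, 3, 28, 28), torch.randint(0, 9, (512,))
loader = DataLoader(TensorDataset(xs, ys), batch_size=64)
print(f"PyTorch epoch time: {time_one_epoch(SmallCNN(), loader):.2f}s")
```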

Hybrid Ensemble Approaches: Optimal Deep Feature Fusion and Hyperparameter-Tuned Classifier Ensembling for Enhanced Brain Tumor Classification

Zahid Ullah, Dragan Pamucar, Jihie Kim

arXiv preprint · Jul 16, 2025
Magnetic Resonance Imaging (MRI) is widely recognized as the most reliable tool for detecting tumors due to its capability to produce detailed images that reveal their presence. However, the accuracy of diagnosis can be compromised when human specialists evaluate these images. Factors such as fatigue, limited expertise, and insufficient image detail can lead to errors. For example, small tumors might go unnoticed, or overlap with healthy brain regions could result in misidentification. To address these challenges and enhance diagnostic precision, this study proposes a novel double-ensembling framework, consisting of an ensemble of pre-trained deep learning (DL) models for feature extraction and an ensemble of hyperparameter-tuned machine learning (ML) models to efficiently classify brain tumors. Specifically, our method includes extensive preprocessing and augmentation, transfer learning via various pre-trained deep convolutional neural networks and vision transformer networks that extract deep features from brain MRI, and fine-tuning of the ML classifiers' hyperparameters. Our experiments utilized three different publicly available Kaggle MRI brain tumor datasets to evaluate the pre-trained DL feature extractor models, the ML classifiers, and the effectiveness of ensembling deep features together with ensembling ML classifiers for brain tumor classification. Our results indicate that the proposed feature fusion and classifier fusion improve upon the state of the art, with hyperparameter fine-tuning providing a significant enhancement over the ensemble method. Additionally, we present an ablation study to illustrate how each component contributes to accurate brain tumor classification.
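
A minimal sketch of the double-ensembling pipeline, under assumed components: two ImageNet-pretrained torchvision backbones (ResNet-50 and ViT-B/16, not necessarily the paper's choices) supply deep features that are concatenated, and hyperparameter-tuned classical classifiers are combined by soft voting. Running it downloads the pretrained weights; the inputs and labels are stand-ins.

```python
import numpy as np
import torch
from torchvision import models
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_features(backbone, x):
    """Run a headless backbone and return pooled feature vectors."""
    backbone.eval()
    with torch.no_grad():
        return backbone(x).flatten(1).numpy()

# Headless backbones: replace each classification head with identity.
resnet = models.resnet50(weights="IMAGENET1K_V2")
resnet.fc = torch.nn.Identity()
vit = models.vit_b_16(weights="IMAGENET1K_V1")
vit.heads = torch.nn.Identity()

x = torch.randn(32, 3, 224, 224)     # stand-in for preprocessed MRI slices
y = np.tile(np.arange(4), 8)         # stand-in labels for 4 tumor classes
fused = np.concatenate([extract_features(resnet, x),
                        extract_features(vit, x)], axis=1)

# Hyperparameter-tuned classifiers, fused by soft voting.
svm = GridSearchCV(SVC(probability=True), {"C": [0.1, 1, 10]}, cv=3)
rf = GridSearchCV(RandomForestClassifier(), {"n_estimators": [100, 300]}, cv=3)
ensemble = make_pipeline(
    StandardScaler(),
    VotingClassifier([("svm", svm), ("rf", rf)], voting="soft"))
ensemble.fit(fused, y)
```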

Identifying Signatures of Image Phenotypes to Track Treatment Response in Liver Disease

Matthias Perkonigg, Nina Bastati, Ahmed Ba-Ssalamah, Peter Mesenbrink, Alexander Goehler, Miljen Martic, Xiaofei Zhou, Michael Trauner, Georg Langs

arXiv preprint · Jul 16, 2025
Quantifiable image patterns associated with disease progression and treatment response are critical tools for guiding individual treatment, and for developing novel therapies. Here, we show that unsupervised machine learning can identify a pattern vocabulary of liver tissue in magnetic resonance images that quantifies treatment response in diffuse liver disease. Deep clustering networks simultaneously encode and cluster patches of medical images into a low-dimensional latent space to establish a tissue vocabulary. The resulting tissue types capture differential tissue change and its location in the liver associated with treatment response. We demonstrate the utility of the vocabulary on a randomized controlled trial cohort of non-alcoholic steatohepatitis patients. First, we use the vocabulary to compare longitudinal liver change in a placebo and a treatment cohort. Results show that the method identifies specific liver tissue change pathways associated with treatment, and enables a better separation between treatment groups than established non-imaging measures. Moreover, we show that the vocabulary can predict biopsy-derived features from non-invasive imaging data. Finally, we validate the method on a separate replication cohort, demonstrating its applicability.
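
A simplified sketch of the tissue-vocabulary idea: a small patch autoencoder followed by k-means in latent space stands in here for the paper's joint deep clustering network, and a per-exam histogram over the learned tissue types gives the profile whose longitudinal change would be compared between cohorts. All dimensions and data are illustrative.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class PatchAutoencoder(nn.Module):
    def __init__(self, patch_dim: int = 32 * 32, latent_dim: int = 16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(patch_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, patch_dim))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

patches = torch.rand(1000, 32 * 32)   # stand-in for liver MRI patches
model = PatchAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):                   # brief reconstruction pretraining
    recon, _ = model(patches)
    loss = nn.functional.mse_loss(recon, patches)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    _, z = model(patches)
tissue_types = KMeans(n_clusters=8, n_init=10).fit_predict(z.numpy())

# Per-exam histogram over tissue types quantifies composition; comparing
# baseline vs. follow-up histograms tracks treatment-associated change.
labels = torch.tensor(tissue_types, dtype=torch.long)
hist = torch.bincount(labels, minlength=8).float() / len(labels)
print(hist)
```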

Site-Level Fine-Tuning with Progressive Layer Freezing: Towards Robust Prediction of Bronchopulmonary Dysplasia from Day-1 Chest Radiographs in Extremely Preterm Infants

Sybelle Goedicke-Fritz, Michelle Bous, Annika Engel, Matthias Flotho, Pascal Hirsch, Hannah Wittig, Dino Milanovic, Dominik Mohr, Mathias Kaspar, Sogand Nemat, Dorothea Kerner, Arno Bücker, Andreas Keller, Sascha Meyer, Michael Zemlin, Philipp Flotho

arXiv preprint · Jul 16, 2025
Bronchopulmonary dysplasia (BPD) is a chronic lung disease affecting 35% of extremely low birth weight infants. Defined by oxygen dependence at 36 weeks postmenstrual age, it causes lifelong respiratory complications. However, preventive interventions carry severe risks, including neurodevelopmental impairment, ventilator-induced lung injury, and systemic complications. Early BPD prognosis and prediction of BPD outcome are therefore crucial to avoid unnecessary toxicity in low-risk infants. Admission radiographs of extremely preterm infants are routinely acquired within 24h of life and could serve as a non-invasive prognostic tool. In this work, we developed and investigated a deep learning approach using chest X-rays from 163 extremely low-birth-weight infants (≤32 weeks gestation, 401-999 g) obtained within 24 hours of birth. We fine-tuned a ResNet-50 pretrained specifically on adult chest radiographs, employing progressive layer freezing with discriminative learning rates to prevent overfitting, and evaluated CutMix augmentation and linear probing. For moderate/severe BPD outcome prediction, our best-performing model, with progressive freezing, linear probing, and CutMix, achieved an AUROC of 0.78 ± 0.10, a balanced accuracy of 0.69 ± 0.10, and an F1-score of 0.67 ± 0.11. In-domain pre-training significantly outperformed ImageNet initialization (p = 0.031), confirming that domain-specific pretraining is important for BPD outcome prediction. Routine IRDS grades showed limited prognostic value (AUROC 0.57 ± 0.11), underscoring the need for learned markers. Our approach demonstrates that domain-specific pretraining enables accurate BPD prediction from routine day-1 radiographs. Through progressive freezing and linear probing, the method remains computationally feasible for site-level implementation and future federated learning deployments.
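
A hedged sketch of progressive layer freezing with discriminative learning rates, using an ImageNet-initialized torchvision ResNet-50 as a stand-in for the paper's chest-radiograph-pretrained weights. The unfreezing schedule and the learning rates are assumptions, not the published configuration.

```python
import torch
from torchvision import models

model = models.resnet50(weights="IMAGENET1K_V2")
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # moderate/severe BPD vs. not

# Stage-wise parameter groups, shallow -> deep, with discriminative LRs;
# the stem (conv1/bn1) is left out of the optimizer, i.e. permanently frozen.
stages = [model.layer1, model.layer2, model.layer3, model.layer4]
optimizer = torch.optim.AdamW(
    [{"params": s.parameters(), "lr": 1e-5 * (3 ** i)} for i, s in enumerate(stages)]
    + [{"params": model.fc.parameters(), "lr": 1e-3}])

# Start with everything frozen except the new head, then unfreeze one stage
# per scheduled epoch, deepest first (layer4 before layer1).
unfreeze_at = {2: model.layer4, 4: model.layer3, 6: model.layer2, 8: model.layer1}
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True

for epoch in range(10):
    if epoch in unfreeze_at:
        for p in unfreeze_at[epoch].parameters():
            p.requires_grad = True
    # ... one training epoch over day-1 radiographs would run here ...
```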

CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling

Trong-Thang Pham, Akash Awasthi, Saba Khan, Esteban Duran Marti, Tien-Phat Nguyen, Khoa Vo, Minh Tran, Ngoc Son Nguyen, Cuong Tran Van, Yuki Ikebe, Anh Totti Nguyen, Anh Nguyen, Zhigang Deng, Carol C. Wu, Hien Van Nguyen, Ngan Le

arXiv preprint · Jul 16, 2025
Understanding radiologists' eye movement during Computed Tomography (CT) reading is crucial for developing effective interpretable computer-aided diagnosis systems. However, CT research in this area has been limited by the lack of publicly available eye-tracking datasets and the three-dimensional complexity of CT volumes. To address these challenges, we present the first publicly available eye gaze dataset on CT, called CT-ScanGaze. Then, we introduce CT-Searcher, a novel 3D scanpath predictor designed specifically to process CT volumes and generate radiologist-like 3D fixation sequences, overcoming the limitations of current scanpath predictors that only handle 2D inputs. Since deep learning models benefit from a pretraining step, we develop a pipeline that converts existing 2D gaze datasets into 3D gaze data to pretrain CT-Searcher. Through both qualitative and quantitative evaluations on CT-ScanGaze, we demonstrate the effectiveness of our approach and provide a comprehensive assessment framework for 3D scanpath prediction in medical imaging.

AI-Powered Segmentation and Prognosis with Missing MRI in Pediatric Brain Tumors

Chrysochoou, D., Gandhi, D., Adib, S., Familiar, A., Khalili, N., Khalili, N., Ware, J. B., Tu, W., Jain, P., Anderson, H., Haldar, S., Storm, P. B., Franson, A., Prados, M., Kline, C., Mueller, S., Resnick, A., Vossough, A., Davatzikos, C., Nabavizadeh, A., Fathi Kazerooni, A.

medRxiv preprint · Jul 16, 2025
Importance: Brain MRI is the main imaging modality for pediatric brain tumors (PBTs); however, incomplete MRI exams are common in pediatric neuro-oncology settings and pose a barrier to the development and application of deep learning (DL) models for tasks such as tumor segmentation and prognostic risk estimation.
Objective: To evaluate DL-based strategies (image-dropout training and generative image synthesis) and heuristic imputation approaches for handling missing MRI sequences in PBT imaging from clinical acquisition protocols, and to determine their impact on segmentation accuracy and prognostic risk estimation.
Design: This cohort study included 715 patients from the Children's Brain Tumor Network (CBTN) and BraTS-PEDs, and 43 patients with longitudinal MRI (157 timepoints) from the PNOC003/007 clinical trials. We developed a dropout-trained nnU-Net tumor segmentation model that randomly omitted FLAIR and/or T1w (no contrast) sequences during training to simulate missing inputs. We compared this against three imputation approaches: a generative model for image synthesis, copy-substitution heuristics, and zeroed missing inputs. Model-generated tumor volumes from each segmentation method were compared and evaluated against ground truth (expert manual segmentations) and incorporated into time-varying Cox regression models for survival analysis.
Setting: Multi-institutional PBT datasets and longitudinal clinical trial cohorts.
Participants: All patients had multi-parametric MRI and expert manual segmentations. The PNOC cohort had a median of three imaging timepoints and associated clinical data.
Main Outcomes and Measures: Segmentation accuracy (Dice scores), image quality metrics for synthesized scans (SSIM, PSNR, MSE), and survival discrimination (C-index, hazard ratios).
Results: The dropout model achieved robust segmentation under missing MRI, with a Dice drop of ≤0.04 and a stable C-index of 0.65 compared to complete-input performance. DL-based MRI synthesis achieved high image quality (SSIM > 0.90) and removed artifacts, benefiting visual interpretability. Performance was consistent across cohorts and missing-data scenarios.
Conclusion and Relevance: Modality-dropout training yields robust segmentation and risk stratification on incomplete pediatric MRI without the computational and clinical complexity of synthesis approaches. Image synthesis, though less effective for these tasks, provides complementary benefits for artifact removal and qualitative assessment of missing or corrupted MRI scans. Together, these approaches can facilitate broader deployment of AI tools in real-world pediatric neuro-oncology settings.
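
A minimal sketch of the modality-dropout idea: with multi-parametric MRI stacked as input channels, the droppable sequences (FLAIR and non-contrast T1w, per the abstract) are randomly zeroed during training so the network learns to cope with incomplete exams. The channel ordering and nnU-Net integration details are assumptions.

```python
import random
import torch

# Assumed channel layout: [T1w, T1ce, T2w, FLAIR]; only these two may be missing.
DROPPABLE = {"T1w": 0, "FLAIR": 3}

def modality_dropout(volume: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    """Zero each droppable modality channel independently with probability p.

    volume: (C, D, H, W) multi-parametric MRI stack.
    """
    out = volume.clone()
    for name, ch in DROPPABLE.items():
        if random.random() < p:
            out[ch] = 0.0   # a zeroed channel stands in for a missing scan
    return out

# During training the model sees all present/absent combinations; at
# inference, a truly missing sequence is zeroed the same way.
x = torch.randn(4, 64, 128, 128)
x_aug = modality_dropout(x)
```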

SLOTMFound: Foundation-Based Diagnosis of Multiple Sclerosis Using Retinal SLO Imaging and OCT Thickness-maps

Esmailizadeh, R., Aghababaei, A., Mirzaei, S., Arian, R., Kafieh, R.

medRxiv preprint · Jul 15, 2025
Multiple Sclerosis (MS) is a chronic autoimmune disorder of the central nervous system that can lead to significant neurological disability. Retinal imaging, particularly Scanning Laser Ophthalmoscopy (SLO) and Optical Coherence Tomography (OCT), provides valuable biomarkers for early MS diagnosis through non-invasive visualization of neurodegenerative changes. This study proposes a foundation-based bi-modal classification framework that integrates SLO images and OCT-derived retinal thickness maps for MS diagnosis. To facilitate this, we introduce two modality-specific foundation models, SLOFound and TMFound, fine-tuned from the RETFound-Fundus backbone using an independent dataset of 203 healthy eyes, acquired at Noor Ophthalmology Hospital with the Heidelberg Spectralis HRA+OCT system. This dataset, which contains only normal cases, was used exclusively for encoder adaptation and is entirely disjoint from the classification dataset. For the classification stage, we use a separate dataset comprising IR-SLO images from 32 MS patients and 70 healthy controls, collected at the Kashani Comprehensive MS Center in Isfahan, Iran. We first assess OCT-derived maps layer-wise and identify the Ganglion Cell-Inner Plexiform Layer (GCIPL) as the most informative for MS detection. All subsequent analyses utilize GCIPL thickness maps in conjunction with SLO images. Experimental evaluations on the MS classification dataset demonstrate that our foundation-based bi-modal model outperforms unimodal variants and a prior ResNet-based state-of-the-art model, achieving a classification accuracy of 97.37%, with perfect sensitivity (100%). These results highlight the effectiveness of leveraging pre-trained foundation models, even when fine-tuned on limited data, to build robust, efficient, and generalizable diagnostic tools for MS in medical imaging contexts where labeled datasets are often scarce.
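
A hedged sketch of the bi-modal late-fusion design: two modality-specific encoders (placeholders for the fine-tuned SLOFound and TMFound backbones) produce embeddings that are concatenated ahead of a shared MS/healthy head. The toy encoders below exist only so the sketch runs; the real branches would be RETFound-style ViTs.

```python
import torch
import torch.nn as nn

class BiModalMSClassifier(nn.Module):
    def __init__(self, slo_encoder: nn.Module, tm_encoder: nn.Module,
                 emb_dim: int = 768, num_classes: int = 2):
        super().__init__()
        self.slo_encoder = slo_encoder   # SLO image branch
        self.tm_encoder = tm_encoder     # GCIPL thickness-map branch
        self.head = nn.Linear(2 * emb_dim, num_classes)

    def forward(self, slo, tmap):
        # Late fusion: concatenate per-modality embeddings, then classify.
        z = torch.cat([self.slo_encoder(slo), self.tm_encoder(tmap)], dim=1)
        return self.head(z)

def toy_encoder(emb_dim: int = 768) -> nn.Module:
    """Tiny stand-in encoder so the sketch runs end to end."""
    return nn.Sequential(nn.Conv2d(3, 8, 3, stride=4),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(8, emb_dim))

model = BiModalMSClassifier(toy_encoder(), toy_encoder())
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
```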

Exploring the robustness of TractOracle methods in RL-based tractography

Jeremi Levesque, Antoine Théberge, Maxime Descoteaux, Pierre-Marc Jodoin

arXiv preprint · Jul 15, 2025
Tractography algorithms leverage diffusion MRI to reconstruct the fibrous architecture of the brain's white matter. Among machine learning approaches, reinforcement learning (RL) has emerged as a promising framework for tractography, outperforming traditional methods in several key aspects. TractOracle-RL, a recent RL-based approach, reduces false positives by incorporating anatomical priors into the training process via a reward-based mechanism. In this paper, we investigate four extensions of the original TractOracle-RL framework by integrating recent advances in RL, and we evaluate their performance across five diverse diffusion MRI datasets. Results demonstrate that combining an oracle with the RL framework consistently leads to robust and reliable tractography, regardless of the specific method or dataset used. We also introduce a novel RL training scheme called Iterative Reward Training (IRT), inspired by the Reinforcement Learning from Human Feedback (RLHF) paradigm. Instead of relying on human input, IRT leverages bundle filtering methods to iteratively refine the oracle's guidance throughout training. Experimental results show that RL methods trained with oracle feedback significantly outperform widely used tractography techniques in terms of accuracy and anatomical validity.
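
The IRT loop can be sketched at a high level: alternate between optimizing the tracking policy against the current oracle and refitting the oracle on labels produced by a bundle-filtering method applied to the agent's own streamlines. Everything below is a stub standing in for the real tracking environment, oracle network, and filtering pipeline; only the control flow reflects the described scheme.

```python
import random
from typing import List, Tuple

def train_agent(oracle, n_steps: int):
    """Stub: RL policy optimization with oracle-shaped rewards."""
    return {"policy": "trained", "oracle": oracle}

def generate_streamlines(agent, n: int) -> List[list]:
    """Stub: roll out the policy to produce candidate streamlines."""
    return [[random.random() for _ in range(10)] for _ in range(n)]

def bundle_filter(streamlines: List[list]) -> List[Tuple[list, int]]:
    """Stub: anatomical filtering assigns a plausible/implausible label."""
    return [(s, int(sum(s) > 5.0)) for s in streamlines]

def refit_oracle(oracle, labeled: List[Tuple[list, int]]):
    """Stub: supervised update of the reward model on filtered labels."""
    return {"round": oracle.get("round", 0) + 1}

oracle = {"round": 0}
for it in range(4):   # IRT: alternate policy updates and oracle refits
    agent = train_agent(oracle, n_steps=10_000)
    labeled = bundle_filter(generate_streamlines(agent, n=1_000))
    oracle = refit_oracle(oracle, labeled)
```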

Interpretable Prediction of Lymph Node Metastasis in Rectal Cancer MRI Using Variational Autoencoders

Benjamin Keel, Aaron Quyn, David Jayne, Maryam Mohsin, Samuel D. Relton

arXiv preprint · Jul 15, 2025
Effective treatment for rectal cancer relies on accurate lymph node metastasis (LNM) staging. However, radiological criteria based on lymph node (LN) size, shape and texture morphology have limited diagnostic accuracy. In this work, we investigate applying a Variational Autoencoder (VAE) as a feature encoder model to replace the large pre-trained Convolutional Neural Network (CNN) used in existing approaches. The motivation for using a VAE is that the generative model aims to reconstruct the images, so it directly encodes visual features and meaningful patterns across the data. This leads to a disentangled and structured latent space which can be more interpretable than a CNN. Models are deployed on an in-house MRI dataset with 168 patients who did not undergo neo-adjuvant treatment. The post-operative pathological N stage was used as the ground truth to evaluate model predictions. Our proposed model 'VAE-MLP' achieved state-of-the-art performance on the MRI dataset, with cross-validated metrics of AUC 0.86 ± 0.05, sensitivity 0.79 ± 0.06, and specificity 0.85 ± 0.05. Code is available at: https://github.com/benkeel/Lymph_Node_Classification_MIUA.
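
A minimal sketch of the VAE-MLP idea: a small convolutional VAE learns to reconstruct lymph-node crops, and an MLP classifies from the latent mean. Architecture, crop size, and latent dimension are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten())
        self.mu = nn.Linear(32 * 8 * 8, latent_dim)
        self.logvar = nn.Linear(32 * 8 * 8, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return F.binary_cross_entropy(recon, x, reduction="sum") + kl

# After VAE training, an MLP ("VAE-MLP") predicts N stage from latent means.
classifier = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
vae = VAE()
x = torch.rand(8, 1, 32, 32)       # stand-in for LN MRI crops
recon, mu, logvar = vae(x)
loss = vae_loss(recon, x, mu, logvar)
logits = classifier(mu.detach())   # classify on the learned features
```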

LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer

Yaoxian Dong, Yifan Gao, Haoyue Li, Yanfen Cui, Xin Gao

arXiv preprint · Jul 15, 2025
Accurate preoperative assessment of lymph node (LN) metastasis in rectal cancer guides treatment decisions, yet conventional MRI evaluation based on morphological criteria shows limited diagnostic performance. While some artificial intelligence models have been developed, they often operate as black boxes, lacking the interpretability needed for clinical trust. Moreover, these models typically evaluate nodes in isolation, overlooking the patient-level context. To address these limitations, we introduce LRMR, an LLM-Driven Relational Multi-node Ranking framework. This approach reframes the diagnostic task from a direct classification problem into a structured reasoning and ranking process. The LRMR framework operates in two stages. First, a multimodal large language model (LLM) analyzes a composite montage image of all LNs from a patient, generating a structured report that details ten distinct radiological features. Second, a text-based LLM performs pairwise comparisons of these reports between different patients, establishing a relative risk ranking based on the severity and number of adverse features. We evaluated our method on a retrospective cohort of 117 rectal cancer patients. LRMR achieved an area under the curve (AUC) of 0.7917 and an F1-score of 0.7200, outperforming a range of deep learning baselines, including ResNet50 (AUC 0.7708). Ablation studies confirmed the value of our two main contributions: removing the relational ranking stage or the structured prompting stage led to a significant performance drop, with AUCs falling to 0.6875 and 0.6458, respectively. Our work demonstrates that decoupling visual perception from cognitive reasoning through a two-stage LLM framework offers a powerful, interpretable, and effective new paradigm for assessing lymph node metastasis in rectal cancer.
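
A hedged sketch of the second LRMR stage: pairwise judgments between patients' structured reports are aggregated into a relative risk ranking. The LLM judge is stubbed with a trivial heuristic (a real system would prompt a text LLM with both reports), and ranking by pairwise win counts is one simple aggregation, not necessarily the paper's exact scheme.

```python
from itertools import combinations

def llm_compare(report_a: str, report_b: str) -> int:
    """Stub judge: 1 if patient A looks higher-risk, else -1.
    Stand-in for an LLM prompted with both structured reports."""
    return 1 if report_a.count("adverse") > report_b.count("adverse") else -1

reports = {
    "patient_1": "round shape, adverse border, adverse signal",
    "patient_2": "oval shape, smooth border",
    "patient_3": "adverse border",
}

# Tally pairwise wins across all patient pairs.
wins = {p: 0 for p in reports}
for a, b in combinations(reports, 2):
    if llm_compare(reports[a], reports[b]) == 1:
        wins[a] += 1
    else:
        wins[b] += 1

ranking = sorted(wins, key=wins.get, reverse=True)  # highest estimated risk first
print(ranking)  # -> ['patient_1', 'patient_3', 'patient_2']
```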