Page 77 of 160 (1600 results)

pyMEAL: A Multi-Encoder Augmentation-Aware Learning for Robust and Generalizable Medical Image Translation

Abdul-mojeed Olabisi Ilyas, Adeleke Maradesa, Jamal Banzi, Jianpan Huang, Henry K. F. Mak, Kannie W. Y. Chan

arXiv preprint, May 30, 2025
Medical imaging is critical for diagnostics, but clinical adoption of advanced AI-driven imaging faces challenges due to patient variability, image artifacts, and limited model generalization. While deep learning has transformed image analysis, 3D medical imaging still suffers from data scarcity and inconsistencies due to acquisition protocols, scanner differences, and patient motion. Traditional augmentation uses a single pipeline for all transformations, disregarding the unique traits of each augmentation and struggling with large data volumes. To address these challenges, we propose a Multi-encoder Augmentation-Aware Learning (MEAL) framework that leverages four distinct augmentation variants processed through dedicated encoders. Three fusion strategies, concatenation (CC), a fusion layer (FL), and an adaptive controller block (BD), are integrated to build multi-encoder models that combine augmentation-specific features before decoding. MEAL-BD uniquely preserves augmentation-aware representations, enabling robust, protocol-invariant feature learning. As demonstrated in a Computed Tomography (CT)-to-T1-weighted Magnetic Resonance Imaging (MRI) translation study, MEAL-BD consistently achieved the best performance on both unseen and predefined test data. On both geometric transformations (such as rotations and flips) and non-augmented inputs, MEAL-BD outperformed competing methods, achieving higher mean peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) scores. These results establish MEAL as a reliable framework for preserving structural fidelity and generalizing across clinically relevant variability. By reframing augmentation as a source of diverse, generalizable features, MEAL supports robust, protocol-invariant learning, advancing clinically reliable medical imaging solutions.
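The abstract reports mean PSNR as one of its evaluation metrics. A minimal numpy sketch of that metric is below; the arrays are illustrative toy data, not the paper's CT/MRI volumes.

```python
import numpy as np

def psnr(reference, prediction, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between two images/volumes."""
    mse = np.mean((reference.astype(np.float64) - prediction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((data_range ** 2) / mse)

# Illustrative check: a prediction with a uniform error of 0.1 on a unit range
ref = np.zeros((8, 8))
pred = np.full((8, 8), 0.1)
score = psnr(ref, pred, data_range=1.0)  # MSE = 0.01, so PSNR = 20 dB
```

For volumetric translation results, the same function is typically applied per volume and averaged, which is what a "mean PSNR" score reports.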

Pretraining Deformable Image Registration Networks with Random Images

Junyu Chen, Shuwen Wei, Yihao Liu, Aaron Carass, Yong Du

arXiv preprint, May 30, 2025
Recent advances in deep learning-based medical image registration have shown that training deep neural networks (DNNs) does not necessarily require medical images. Previous work showed that DNNs trained on randomly generated images with carefully designed noise and contrast properties can still generalize well to unseen medical data. Building on this insight, we propose using registration between random images as a proxy task for pretraining a foundation model for image registration. Empirical results show that our pretraining strategy improves registration accuracy, reduces the amount of domain-specific data needed to achieve competitive performance, and accelerates convergence during downstream training, thereby enhancing computational efficiency.
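The key ingredient above is random images with controlled noise and contrast properties. A hedged sketch of how such a proxy training pair might be generated is below; the neighbourhood-averaging smoother and all parameter values are illustrative choices, not the paper's method.

```python
import numpy as np

def random_training_image(shape=(64, 64), smooth_iters=4, seed=0):
    """Random image with tunable spatial correlation: white noise is
    repeatedly averaged with its 4-neighbourhood (a crude smoother),
    then contrast-normalised to [0, 1]."""
    rng = np.random.default_rng(seed)
    img = rng.standard_normal(shape)
    for _ in range(smooth_iters):
        img = 0.2 * (img
                     + np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0)
                     + np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1))
    return (img - img.min()) / (img.max() - img.min())

# Two independent random images stand in for one (moving, fixed)
# proxy registration sample; no medical data is involved.
moving = random_training_image(seed=1)
fixed = random_training_image(seed=2)
```

A registration network pretrained on many such pairs must learn generic correspondence-finding rather than anatomy-specific cues, which is the point of the proxy task.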

Beyond the LUMIR challenge: The pathway to foundational registration models

Junyu Chen, Shuwen Wei, Joel Honkamaa, Pekka Marttinen, Hang Zhang, Min Liu, Yichao Zhou, Zuopeng Tan, Zhuoyuan Wang, Yi Wang, Hongchao Zhou, Shunbo Hu, Yi Zhang, Qian Tao, Lukas Förner, Thomas Wendler, Bailiang Jian, Benedikt Wiestler, Tim Hable, Jin Kim, Dan Ruan, Frederic Madesta, Thilo Sentker, Wiebke Heyer, Lianrui Zuo, Yuwei Dai, Jing Wu, Jerry L. Prince, Harrison Bai, Yong Du, Yihao Liu, Alessa Hering, Reuben Dorent, Lasse Hansen, Mattias P. Heinrich, Aaron Carass

arXiv preprint, May 30, 2025
Medical image challenges have played a transformative role in advancing the field, catalyzing algorithmic innovation and establishing new performance standards across diverse clinical applications. Image registration, a foundational task in neuroimaging pipelines, has similarly benefited from the Learn2Reg initiative. Building on this foundation, we introduce the Large-scale Unsupervised Brain MRI Image Registration (LUMIR) challenge, a next-generation benchmark designed to assess and advance unsupervised brain MRI registration. Distinct from prior challenges that leveraged anatomical label maps for supervision, LUMIR removes this dependency by providing over 4,000 preprocessed T1-weighted brain MRIs for training without any label maps, encouraging biologically plausible deformation modeling through self-supervision. In addition to evaluating performance on 590 held-out test subjects, LUMIR introduces a rigorous suite of zero-shot generalization tasks, spanning out-of-domain imaging modalities (e.g., FLAIR, T2-weighted, T2*-weighted), disease populations (e.g., Alzheimer's disease), acquisition protocols (e.g., 9.4T MRI), and species (e.g., macaque brains). A total of 1,158 subjects and over 4,000 image pairs were included for evaluation. Performance was assessed using both segmentation-based metrics (Dice coefficient, 95th percentile Hausdorff distance) and landmark-based registration accuracy (target registration error). Across both in-domain and zero-shot tasks, deep learning-based methods consistently achieved state-of-the-art accuracy while producing anatomically plausible deformation fields. The top-performing deep learning-based models demonstrated diffeomorphic properties and inverse consistency, outperforming several leading optimization-based methods, and showing strong robustness to most domain shifts, the exception being a drop in performance on out-of-domain contrasts.
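Two of the evaluation metrics named above, the Dice coefficient and landmark-based target registration error (TRE), are simple to state in numpy. This is a minimal sketch with toy inputs, not the challenge's evaluation code (which also includes the 95th percentile Hausdorff distance).

```python
import numpy as np

def dice(seg_a, seg_b):
    """Dice coefficient between two binary label masks."""
    a, b = np.asarray(seg_a, bool), np.asarray(seg_b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def target_registration_error(landmarks_fixed, landmarks_warped):
    """Mean Euclidean distance between corresponding landmarks."""
    diff = np.asarray(landmarks_fixed, float) - np.asarray(landmarks_warped, float)
    return float(np.linalg.norm(diff, axis=1).mean())

# Illustrative values, not challenge data
d = dice([1, 1, 0, 0], [1, 0, 0, 0])                                  # 2*1 / (2+1)
tre = target_registration_error([[0, 0], [3, 4]], [[0, 0], [0, 0]])   # (0 + 5) / 2
```

Segmentation overlap and landmark error are complementary: Dice rewards region alignment while TRE penalises point-wise misregistration, so a method must do well on both.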

Sparsity-Driven Parallel Imaging Consistency for Improved Self-Supervised MRI Reconstruction

Yaşar Utku Alçalar, Mehmet Akçakaya

arXiv preprint, May 30, 2025
Physics-driven deep learning (PD-DL) models have proven to be a powerful approach for improved reconstruction of rapid MRI scans. In order to train these models in scenarios where fully-sampled reference data is unavailable, self-supervised learning has gained prominence. However, its application at high acceleration rates frequently introduces artifacts, compromising image fidelity. To mitigate this shortcoming, we propose a novel way to train PD-DL networks via carefully-designed perturbations. In particular, we enhance the k-space masking idea of conventional self-supervised learning with a novel consistency term that assesses the model's ability to accurately predict the added perturbations in a sparse domain, leading to more reliable and artifact-free reconstructions. The results obtained from the fastMRI knee and brain datasets show that the proposed training strategy effectively reduces aliasing artifacts and mitigates noise amplification at high acceleration rates, outperforming state-of-the-art self-supervised methods both visually and quantitatively.
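The "k-space masking idea of conventional self-supervised learning" referenced above partitions the acquired k-space locations into a subset fed to the network and a disjoint subset held out for the loss (as in SSDU-style training). A minimal sketch of that partition is below; the mask shape, fraction, and function names are illustrative assumptions, and the paper's added perturbation/consistency term is not shown.

```python
import numpy as np

def partition_kspace_mask(sampling_mask, loss_fraction=0.3, seed=0):
    """Split an undersampling mask into disjoint train/loss masks:
    the network sees only train-mask samples, and the loss is
    evaluated on the held-out loss-mask samples."""
    rng = np.random.default_rng(seed)
    acquired = np.flatnonzero(sampling_mask)            # sampled k-space indices
    n_loss = int(round(loss_fraction * acquired.size))
    loss_idx = rng.choice(acquired, size=n_loss, replace=False)
    loss_mask = np.zeros_like(sampling_mask)
    loss_mask.flat[loss_idx] = 1
    train_mask = sampling_mask - loss_mask              # disjoint remainder
    return train_mask, loss_mask

# Toy 40%-sampled Cartesian mask (illustrative only)
mask = (np.random.default_rng(1).random((32, 32)) < 0.4).astype(int)
train_m, loss_m = partition_kspace_mask(mask, loss_fraction=0.3)
```

Because the two masks are disjoint, the held-out samples act as a proxy for unseen reference data, which is what makes training possible without fully sampled scans.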

Federated Foundation Model for GI Endoscopy Images

Alina Devkota, Annahita Amireskandari, Joel Palko, Shyam Thakkar, Donald Adjeroh, Xiajun Jiang, Binod Bhattarai, Prashnna K. Gyawali

arXiv preprint, May 30, 2025
Gastrointestinal (GI) endoscopy is essential in identifying GI tract abnormalities in order to detect diseases in their early stages and improve patient outcomes. Although deep learning has shown success in supporting GI diagnostics and decision-making, these models require curated datasets with labels that are expensive to acquire. Foundation models offer a promising solution by learning general-purpose representations, which can be finetuned for specific tasks, overcoming data scarcity. Developing foundation models for medical imaging holds significant potential, but the sensitive and protected nature of medical data presents unique challenges. Foundation model training typically requires extensive datasets, and while hospitals generate large volumes of data, privacy restrictions prevent direct data sharing, making foundation model training infeasible in most scenarios. In this work, we propose a federated learning (FL) framework for training foundation models for gastroendoscopy imaging, enabling data to remain within local hospital environments while contributing to a shared model. We explore several established FL algorithms, assessing their suitability for training foundation models without relying on task-specific labels, conducting experiments in both homogeneous and heterogeneous settings. We evaluate the trained foundation model on three critical downstream tasks--classification, detection, and segmentation--and demonstrate that it achieves improved performance across all tasks, highlighting the effectiveness of our approach in a federated, privacy-preserving setting.
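The baseline aggregation step shared by most of the established FL algorithms the abstract mentions is federated averaging (FedAvg): clients train locally, and the server takes a dataset-size-weighted mean of their parameters. A minimal sketch with hypothetical parameter names follows; real FL stacks add secure aggregation, client sampling, and communication handling.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: dataset-size-weighted mean of per-client
    parameter dicts, computed without moving any raw patient data."""
    total = float(sum(client_sizes))
    return {key: sum(weights[key] * (n / total)
                     for weights, n in zip(client_weights, client_sizes))
            for key in client_weights[0]}

# Two hypothetical hospitals sharing one parameter tensor
w_hospital_a = {"layer.w": np.array([0.0, 2.0])}   # 100 local samples
w_hospital_b = {"layer.w": np.array([4.0, 2.0])}   # 300 local samples
global_w = fedavg([w_hospital_a, w_hospital_b], client_sizes=[100, 300])
```

With sizes 100 and 300, hospital B's update carries weight 0.75, so the aggregate reflects data volume while each hospital's images never leave its premises.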

Using Deep Learning to Predict Cardiovascular Magnetic Resonance Findings from Echocardiography Videos.

Sahashi Y, Vukadinovic M, Duffy G, Li D, Cheng S, Berman DS, Ouyang D, Kwan AC

PubMed paper, May 30, 2025
Echocardiography is the most common modality for assessing cardiac structure and function. While cardiac magnetic resonance (CMR) imaging is less accessible, CMR can provide unique tissue characterization, including late gadolinium enhancement (LGE), T1 and T2 mapping, and extracellular volume (ECV), which are associated with tissue fibrosis, infiltration, and inflammation. Deep learning has been shown to uncover findings not recognized by clinicians; however, it is unknown whether CMR-based tissue characteristics can be derived from echocardiography videos using deep learning. We assessed the performance of a deep learning model applied to echocardiography to detect CMR-specific parameters, including LGE presence and abnormal T1, T2, or ECV. In a retrospective single-center study, adult patients with CMR and echocardiography studies within 30 days were included. A video-based convolutional neural network was trained on echocardiography videos to predict CMR-derived labels, including LGE presence and abnormal T1, T2, or ECV, across echocardiography views. The model was also trained to predict the presence or absence of wall motion abnormality (WMA) as a positive control for model function. Model performance was evaluated in a held-out test dataset not used for training. The study population included 1,453 adult patients (mean age 56±18 years, 42% female) with 2,556 paired echocardiography studies occurring at a median of 2 days after CMR (interquartile range 2 days prior to 6 days after). The model had high predictive capability for the presence of WMA (AUC 0.873 [95% CI 0.816-0.922]), the positive control. However, the model was unable to reliably detect the presence of LGE (AUC 0.699 [0.613-0.780]), abnormal native T1 (AUC 0.614 [0.500-0.715]), abnormal T2 (AUC 0.553 [0.420-0.692]), or abnormal ECV (AUC 0.564 [0.455-0.691]).
Deep learning applied to echocardiography accurately identified CMR-based WMA, but was unable to predict tissue characteristics, suggesting that signal for these tissue characteristics may not be present within ultrasound videos, and that the use of CMR for tissue characterization remains essential within cardiology.
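The AUC values reported above can be computed rank-wise via the Mann-Whitney U statistic, which is the probability that a randomly chosen positive case outscores a randomly chosen negative one. A minimal numpy sketch with toy scores (not the study's data) follows.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    P(score of a random positive > score of a random negative),
    with ties counted as one half."""
    labels = np.asarray(labels, bool)
    scores = np.asarray(scores, float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy example: 3 of the 4 positive/negative pairs are correctly ordered
a = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

By this reading, the reported AUCs near 0.55-0.61 for T1, T2, and ECV sit close to the 0.5 chance level, which is why the authors conclude the echo signal for those tissue characteristics is weak or absent.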

The Impact of Model-based Deep-learning Reconstruction Compared with that of Compressed Sensing-Sensitivity Encoding on the Image Quality and Precision of Cine Cardiac MR in Evaluating Left-ventricular Volume and Strain: A Study on Healthy Volunteers.

Tsuneta S, Aono S, Kimura R, Kwon J, Fujima N, Ishizaka K, Nishioka N, Yoneyama M, Kato F, Minowa K, Kudo K

PubMed paper, May 30, 2025
To evaluate the effect of model-based deep-learning reconstruction (DLR) compared with that of compressed sensing-sensitivity encoding (CS) on cine cardiac magnetic resonance (CMR). Cine CMR images of 10 healthy volunteers were obtained with reduction factors of 2, 4, 6, and 8 and reconstructed using CS and DLR. Visual image quality was scored for sharpness, image noise, and artifacts. Left-ventricular (LV) end-diastolic volume (EDV), end-systolic volume (ESV), stroke volume (SV), and ejection fraction (EF) were measured manually. LV global circumferential strain (GCS) was measured automatically using software. The precision of EDV, ESV, SV, EF, and GCS measurements was compared between CS and DLR using Bland-Altman analysis, with fully sampled data as the gold standard. Compared with CS, DLR significantly improved image quality at reduction factors of 6 and 8. The precision of EDV and ESV measurements at a reduction factor of 8, and of GCS measurements at reduction factors of 6 and 8, improved with DLR compared with CS, whereas the precision of SV and EF measurements did not differ between DLR and CS. The effect of DLR on the image quality of cine CMR and its precision in evaluating quantitative volume and strain was equal or superior to that of CS. DLR may replace CS for cine CMR.
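The Bland-Altman analysis used above summarises agreement between two measurement methods by the bias (mean difference) and the 95% limits of agreement (bias ± 1.96 × SD of the differences). A minimal sketch with hypothetical paired EDV readings follows; the numbers are illustrative, not the study's measurements.

```python
import numpy as np

def bland_altman(measurements_a, measurements_b):
    """Bland-Altman agreement statistics: bias and 95% limits of
    agreement between two paired measurement methods."""
    diffs = np.asarray(measurements_a, float) - np.asarray(measurements_b, float)
    bias = diffs.mean()
    sd = diffs.std(ddof=1)                  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired EDV readings (mL): accelerated reconstruction
# vs. fully sampled gold standard
bias, loa_low, loa_high = bland_altman([150.0, 148.0, 152.0, 150.0],
                                       [149.0, 149.0, 151.0, 151.0])
```

Narrower limits of agreement against the fully sampled reference indicate higher precision, which is the sense in which DLR "improved precision" over CS at high reduction factors.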

Edge Computing for Physics-Driven AI in Computational MRI: A Feasibility Study

Yaşar Utku Alçalar, Yu Cao, Mehmet Akçakaya

arXiv preprint, May 30, 2025
Physics-driven artificial intelligence (PD-AI) reconstruction methods have emerged as the state-of-the-art for accelerating MRI scans, enabling higher spatial and temporal resolutions. However, the high resolution of these scans generates massive data volumes, leading to challenges in transmission, storage, and real-time processing. This is particularly pronounced in functional MRI, where hundreds of volumetric acquisitions further exacerbate these demands. Edge computing with FPGAs presents a promising solution for enabling PD-AI reconstruction near the MRI sensors, reducing data transfer and storage bottlenecks. However, this requires optimization of PD-AI models for hardware efficiency through quantization and bypassing traditional FFT-based approaches, which can be a limitation due to their computational demands. In this work, we propose a novel PD-AI computational MRI approach optimized for FPGA-based edge computing devices, leveraging 8-bit complex data quantization and eliminating redundant FFT/IFFT operations. Our results show that this strategy improves computational efficiency while maintaining reconstruction quality comparable to conventional PD-AI methods, and outperforms standard clinical methods. Our approach presents an opportunity for high-resolution MRI reconstruction on resource-constrained devices, highlighting its potential for real-world deployment.
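A core ingredient above is 8-bit complex data quantization for FPGA efficiency. A hedged sketch of one simple scheme, int8 real/imaginary parts with a shared per-array scale, is below; the paper's actual quantizer may differ (e.g., per-channel scales or learned ranges), so treat this as an illustration only.

```python
import numpy as np

def quantize_complex_int8(x):
    """Quantize a complex array to int8 real/imag parts with one
    shared symmetric scale; returns (q_real, q_imag, scale)."""
    scale = max(np.abs(x.real).max(), np.abs(x.imag).max()) / 127.0
    q_real = np.clip(np.round(x.real / scale), -127, 127).astype(np.int8)
    q_imag = np.clip(np.round(x.imag / scale), -127, 127).astype(np.int8)
    return q_real, q_imag, scale

def dequantize_complex_int8(q_real, q_imag, scale):
    return q_real.astype(np.float32) * scale + 1j * q_imag.astype(np.float32) * scale

# Toy complex "k-space" block (illustrative data only)
x = (np.random.default_rng(0).standard_normal((16, 16))
     + 1j * np.random.default_rng(1).standard_normal((16, 16)))
qr, qi, s = quantize_complex_int8(x)
x_hat = dequantize_complex_int8(qr, qi, s)
```

The round-trip error per component is bounded by half the quantization step, which is the kind of controlled degradation that lets an int8 datapath preserve reconstruction quality while cutting memory and transfer cost by 4x versus float32.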

Combining structural equation modeling analysis with machine learning for early malignancy detection in Bethesda Category III thyroid nodules.

Kasap ZA, Kurt B, Güner A, Özsağır E, Ercin ME

PubMed paper, May 30, 2025
Atypia of Undetermined Significance (AUS), classified as Category III in the Bethesda Thyroid Cytopathology Reporting System, presents significant diagnostic challenges for clinicians. This study aims to develop a clinical decision support system that integrates structural equation modeling (SEM) and machine learning to predict malignancy in AUS thyroid nodules. The model integrates preoperative clinical data, ultrasonography (USG) findings, and cytopathological and morphometric variables. This retrospective cohort study was conducted between 2011 and 2019 at Karadeniz Technical University (KTU) Farabi Hospital. The dataset included 56 variables derived from 204 thyroid nodules diagnosed via ultrasound-guided fine-needle aspiration biopsy (FNAB) in 183 patients aged over 18 years. Logistic regression (LR) and SEM were used to identify risk factors for early thyroid cancer detection. Subsequently, machine learning algorithms, including Support Vector Machines (SVM), Naive Bayes (NB), and Decision Trees (DT), were used to construct decision support models. After feature selection with SEM, the SVM model achieved the highest performance, with an accuracy of 82%, a specificity of 97%, and an AUC of 0.84. Additional models were developed for different scenarios, and their performance metrics were compared. Accurate preoperative prediction of malignancy in thyroid nodules is crucial for avoiding unnecessary surgeries. The proposed model supports more informed clinical decision-making by effectively identifying benign cases, thereby reducing surgical risk and improving patient care.
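One of the classifiers compared above, Gaussian Naive Bayes, fits in a few lines of numpy: per-class feature means and variances, with prediction by maximum posterior log-likelihood. This is a generic sketch on toy two-feature data, not the study's SEM-selected cytopathology features.

```python
import numpy as np

class GaussianNB:
    """Minimal Gaussian Naive Bayes classifier: each feature is modeled
    as an independent Gaussian per class; prediction picks the class
    with the highest log-posterior."""
    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.mu_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        self.logprior_ = np.log([np.mean(y == c) for c in self.classes_])
        return self

    def predict(self, X):
        X = np.asarray(X, float)[:, None, :]                      # (n, 1, d)
        ll = -0.5 * (np.log(2 * np.pi * self.var_)
                     + (X - self.mu_) ** 2 / self.var_)           # (n, classes, d)
        return self.classes_[np.argmax(ll.sum(axis=2) + self.logprior_, axis=1)]

# Toy separable data standing in for benign (0) vs. malignant (1) features
X_train = [[0.0, 0.1], [0.2, 0.0], [2.0, 1.9], [1.8, 2.1]]
y_train = [0, 0, 1, 1]
pred = GaussianNB().fit(X_train, y_train).predict([[0.1, 0.0], [2.0, 2.0]])
```

In the study's pipeline, SEM-based feature selection precedes the classifier; in practice one would swap these toy features for the selected clinical, USG, and morphometric variables.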