
Han S, Oh JK, Cho W, Kim TJ, Hong N, Park SB

pubmed · Sep 29 2025
We aim to develop an interactive segmentation model that can offer accuracy and reliability for segmentation of the irregularly shaped spinal cord and canal in degenerative cervical myelopathy (DCM) through manual clicks and model refinement. A dataset of 1444 frames from 294 magnetic resonance imaging records of DCM patients was used, and we developed two segmentation models for comparison: auto-segmentation and interactive segmentation. The former was based on U-Net and used a pretrained ConvNeXT-tiny as its encoder. For the latter, we employed an interactive segmentation model structured on SimpleClick, a large model with a vision transformer backbone, together with simple fine-tuning. The segmentation performance of the two models was compared in terms of Dice score, mean intersection over union (mIoU), average precision, and Hausdorff distance. The efficiency of the interactive segmentation model was evaluated by the number of clicks required to achieve a target mIoU. With 15 clicks, our model achieved better scores across all four evaluation metrics, showing improvements of +6.4%, +1.8%, +3.7%, and -53.0% for canal segmentation, and +11.7%, +6.0%, +18.2%, and -70.9% for cord segmentation, respectively. The interactive segmentation model required 11.71 clicks on average to reach 90% mIoU for spinal canal with cord cases and 11.99 clicks to reach 80% mIoU for spinal cord cases. We found that the interactive segmentation model significantly outperformed the auto-segmentation model. By incorporating simple manual inputs, the interactive model effectively identified regions of interest, particularly in the complex and irregular shapes of the spinal cord, demonstrating both enhanced accuracy and adaptability.
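As a rough illustration (not the authors' code), the overlap metrics this abstract reports, Dice and IoU, can be computed from binary masks in a few lines of numpy:

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient: 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def iou_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union: |A∩B| / |A∪B|."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

# Toy 4x4 masks: the prediction covers 2 of the 3 ground-truth pixels.
gt = np.zeros((4, 4), int); gt[1, 1:4] = 1      # 3 pixels
pred = np.zeros((4, 4), int); pred[1, 1:3] = 1  # 2 pixels, both inside gt
print(dice_score(pred, gt))  # 2*2/(2+3) = 0.8
print(iou_score(pred, gt))   # 2/3
```

The mIoU reported in the abstract is simply this IoU averaged over classes or cases.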

Kondylakis H, Osuala R, Puig-Bosch X, Lazrak N, Diaz O, Kushibar K, Chouvarda I, Charalambous S, Starmans MP, Colantonio S, Tachos N, Joshi S, Woodruff HC, Salahuddin Z, Tsakou G, Ausso S, Alberich LC, Papanikolaou N, Lambin P, Marias K, Tsiknakis M, Fotiadis DI, Marti-Bonmati L, Lekadir K

pubmed · Sep 29 2025
Recent advancements in artificial intelligence (AI) and the vast data generated by modern clinical systems have driven the development of AI solutions in medical imaging, encompassing image reconstruction, segmentation, diagnosis, and treatment planning. Despite these successes and potential, many stakeholders worry about the risks and ethical implications of imaging AI, viewing it as complex, opaque, and challenging to understand, use, and trust in critical clinical applications. The FUTURE-AI guideline for trustworthy AI in healthcare was established based on six guiding principles: Fairness, Universality, Traceability, Usability, Robustness, and Explainability. Through international consensus, a set of recommendations was defined, covering the entire lifecycle of medical AI tools, from design, development, and validation to regulation, deployment, and monitoring. In this paper, we describe how these specific recommendations can be instantiated in the domain of medical imaging, providing an overview of current best practices along with guidelines and concrete metrics on how those recommendations could be met, offering a valuable resource to the international medical imaging community.

Fu M, Fang M, Liao B, Liang D, Hu Z, Wu FX

pubmed · Sep 29 2025
Deep learning has demonstrated remarkable efficacy in reconstructing low-count PET (Positron Emission Tomography) images, attracting considerable attention in the medical imaging community. However, most existing deep learning approaches have not fully exploited the unique physical characteristics of PET imaging in the design of fidelity and prior regularization terms, resulting in constrained model performance and interpretability. In light of these considerations, we introduce an unrolled deep network based on maximum likelihood estimation for the Poisson distribution and a Generalized domain transformation for Sparsity learning, dubbed GS-Net. To address this complex optimization challenge, we employ the Alternating Direction Method of Multipliers (ADMM) framework, integrating a modified Expectation Maximization (EM) approach for the primary objective and a shrinkage-thresholding approach to optimize the L1 norm term. Additionally, within this unrolled deep network, all hyperparameters are adaptively adjusted through end-to-end learning, eliminating the need for manual parameter tuning. Through extensive experiments on simulated patient brain datasets and real patient whole-body clinical datasets with multiple count levels, our method demonstrated superior performance compared with traditional non-iterative and iterative reconstruction, deep learning-based direct reconstruction, and hybrid unrolled methods, in both qualitative and quantitative evaluations.
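As a minimal sketch (not GS-Net itself), the two classical building blocks the abstract names, the EM update for a Poisson likelihood (MLEM) and the shrinkage-thresholding step for the L1 term, look like this in numpy; `A` is a hypothetical system matrix and all values are illustrative:

```python
import numpy as np

def soft_threshold(x: np.ndarray, lam: float) -> np.ndarray:
    """Proximal operator of lam*||x||_1: sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def mlem_step(x: np.ndarray, A: np.ndarray, y: np.ndarray,
              eps: float = 1e-12) -> np.ndarray:
    """One MLEM update for Poisson data y ~ Poisson(Ax):
    x <- x * A^T(y / Ax) / A^T 1 (the sensitivity image)."""
    Ax = np.maximum(A @ x, eps)
    sens = np.maximum(A.T @ np.ones_like(y), eps)
    return x * (A.T @ (y / Ax)) / sens

print(soft_threshold(np.array([-2.0, -0.3, 0.0, 0.5, 1.5]), 0.5))
# [-1.5  0.   0.   0.   1. ]
print(mlem_step(np.array([1.0, 1.0]), np.eye(2), np.array([3.0, 5.0])))
# with an identity system matrix, one step recovers y: [3. 5.]
```

In an unrolled network such as the one described, each iteration of updates like these becomes a network layer, and quantities such as `lam` become learned parameters.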

Kim M, Jang HJ

pubmed · Sep 29 2025
Video capsule endoscopy (VCE) revolutionized gastrointestinal imaging by providing a noninvasive method for evaluating small bowel diseases. Recent technological innovations, including enhanced imaging systems, artificial intelligence (AI), and improved localization, have significantly improved VCE's diagnostic accuracy, efficiency, and clinical utility. This review aims to summarize and evaluate recent technological advances in VCE, focusing on system comparisons, image enhancement, localization technologies, and AI-assisted lesion detection.

Jayarajah AN, Atinga A, Probyn L, Sivakumaran T, Christakis M, Oikonomou A

pubmed · Sep 29 2025
Osteoporosis is an under-screened musculoskeletal disorder that results in diminished quality of life and significant burden to the healthcare system. We aimed to evaluate the ability of Rho, an artificial intelligence (AI) tool, to prospectively identify patients at risk for low bone mineral density (BMD) from standard x-rays, its adoption rate by radiologists, and its acceptance by primary care providers (PCPs). Patients ≥50 years were recruited when undergoing an x-ray of a Rho-eligible body part for any clinical indication. Questionnaires were completed at baseline and 6-month follow-up, and PCPs of "Rho-Positive" patients (those likely to have low BMD) were asked for feedback. Positive predictive value (PPV) was calculated in patients who returned within 6 months for a DXA. Of 1145 patients consented, 987 had x-rays screened by Rho, and 655 were flagged as Rho-Positive. Radiologists included this finding in 524 (80%) of reports. Of all Rho-Positive patients, 125 had a DXA within 6 months; Rho had a 74% PPV for a DXA T-score < -1. From 51 PCP responses, 78% found Rho beneficial. Of 389 patients with follow-up questionnaire data, a greater proportion of Rho-Positive versus Rho-Negative patients had discussed bone health with their PCP since study start (36% vs 18%, <i>P</i> < .001) or were newly diagnosed with osteoporosis (11% vs 5%; <i>P</i> = .03). By identifying patients at risk of low BMD, with acceptable reporting by radiologists and generally positive feedback from PCPs, Rho has the potential to improve low screening rates for osteoporosis by leveraging existing x-ray data.
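For reference, the PPV reported here is just the fraction of AI-flagged patients whose DXA confirmed low BMD. A minimal sketch, using hypothetical counts chosen to match the reported 74% among the 125 flagged patients who had a DXA (the paper does not give the raw split):

```python
def positive_predictive_value(tp: int, fp: int) -> float:
    """PPV = TP / (TP + FP): fraction of flagged patients truly positive."""
    return tp / (tp + fp)

# Illustrative only: ~92 of 125 flagged patients with T-score < -1
# would reproduce the reported 74% PPV.
print(round(positive_predictive_value(92, 33), 2))  # 0.74
```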

Nguyen THT, Jeon S, Yoon J, Park B

pubmed · Sep 29 2025
Artificial intelligence (AI) has become increasingly integral to various aspects of breast cancer care, including screening, diagnosis, and treatment. This study aimed to critically examine the application of AI throughout the breast cancer care continuum to elucidate key research developments, emerging trends, and prevalent patterns. English articles and reviews published between 1988 and 2024 were retrieved from the Web of Science database, focusing on studies that applied AI in breast cancer research. Collaboration among countries was analyzed using co-authorship networks and co-occurrence mapping. Additionally, clustering analysis using Latent Dirichlet Allocation (LDA) was conducted for topic modeling, whereas linear regression was employed to assess trends in research outputs over time. A total of 8,711 publications were included in the analysis. The United States has led the research in applying AI to the breast cancer care continuum, followed by China and India. Recent publications have increasingly focused on the utilization of deep learning and machine learning (ML) algorithms for automated breast cancer detection in mammography and histopathology. Moreover, the integration of multi-omics data and molecular profiling with AI has emerged as a significant trend. However, research on the applications of robotic and ML technologies in surgical oncology and postoperative care remains limited. Overall, the volume of research addressing AI for early detection, diagnosis, and classification of breast cancer has markedly increased over the past five years. The rapid expansion of AI-related research on breast cancer underscores its potential impact. However, significant challenges remain. Ongoing rigorous investigations are essential to ensure that AI technologies yield evidence-based benefits across diverse patient populations, thereby avoiding the inadvertent exacerbation of existing healthcare disparities.
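The trend analysis the abstract describes, linear regression of research output over time, can be sketched with a numpy least-squares fit. The annual counts below are illustrative values, not the study's data:

```python
import numpy as np

# Hypothetical annual publication counts (illustrative only).
years = np.array([2019, 2020, 2021, 2022, 2023, 2024], dtype=float)
counts = np.array([420, 610, 840, 1100, 1450, 1800], dtype=float)

# Least-squares line counts ≈ slope * year + intercept.
slope, intercept = np.polyfit(years, counts, deg=1)
print(round(slope, 1))  # average growth in papers per year
```

A positive slope with a small residual indicates the kind of sustained growth the study reports over the past five years; the LDA topic modeling step would typically run on the abstracts themselves rather than on counts.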

Junyi Zhang, Jia-Chen Gu, Wenbo Hu, Yu Zhou, Robinson Piramuthu, Nanyun Peng

arXiv preprint · Sep 29 2025
Existing medical reasoning benchmarks for vision-language models primarily focus on analyzing a patient's condition based on an image from a single visit. However, this setting deviates significantly from real-world clinical practice, where doctors typically refer to a patient's historical conditions to provide a comprehensive assessment by tracking their changes over time. In this paper, we introduce TemMed-Bench, the first benchmark designed for analyzing changes in patients' conditions between different clinical visits, which challenges large vision-language models (LVLMs) to reason over temporal medical images. TemMed-Bench consists of a test set comprising three tasks - visual question-answering (VQA), report generation, and image-pair selection - and a supplementary knowledge corpus of over 17,000 instances. With TemMed-Bench, we conduct an evaluation of six proprietary and six open-source LVLMs. Our results show that most LVLMs lack the ability to analyze patients' condition changes over temporal medical images, and a large proportion perform only at a random-guessing level in the closed-book setting. In contrast, GPT o3, o4-mini and Claude 3.5 Sonnet demonstrate comparatively decent performance, though they have yet to reach the desired level. Furthermore, we explore augmenting the input with both retrieved visual and textual modalities in the medical domain, and show that multi-modal retrieval augmentation yields notably higher performance gains than no retrieval or textual retrieval alone across most models on our benchmark, with the VQA task showing an average improvement of 2.59%. Overall, we compose a benchmark grounded in real-world clinical practice; it reveals LVLMs' limitations in temporal medical image reasoning and highlights multi-modal retrieval augmentation as a promising direction for addressing this challenge.
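The retrieval step in retrieval augmentation is typically a nearest-neighbor search over embedding vectors. A minimal sketch (toy 2-D embeddings, not the paper's corpus or retriever) of cosine-similarity top-k retrieval:

```python
import numpy as np

def top_k_retrieval(query: np.ndarray, corpus: np.ndarray, k: int = 2) -> np.ndarray:
    """Indices of the k corpus embeddings most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity to each corpus item
    return np.argsort(-sims)[:k]      # descending-similarity indices

# Toy embeddings: items 0 and 2 point nearly the same way as the query.
corpus = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.2], [-1.0, 0.0]])
query = np.array([1.0, 0.0])
print(top_k_retrieval(query, corpus, k=2))  # [0 2]
```

The retrieved items (here indices, in practice prior images or report snippets) are then appended to the model's input, which is the augmentation the benchmark evaluates.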

Thompson YLE, Fergus J, Chung J, Delfino JG, Chen W, Levine GM, Samuelson FW

pubmed · Sep 29 2025
To quantify the impact of workflow parameters on time savings in report turnaround time due to an AI triage device that prioritized pulmonary embolism (PE) in chest CT pulmonary angiography (CTPA) examinations. This retrospective study analyzed 11,252 adult CTPA examinations conducted for suspected PE at a single tertiary academic medical center. Data were divided into two periods: pre-artificial intelligence (AI) and post-AI. For PE-positive examinations, turnaround time (TAT), defined as the duration from patient scan completion to first preliminary report completion, was compared between the two periods. Time savings were reported separately for work-hour and off-hour cohorts. To characterize radiologist workflow, 527,234 records were retrieved from the PACS, and workflow parameters such as examination interarrival time and radiologist read time were extracted. These parameters were input into a computational model to predict time savings after deployment of an AI triage device and to study the impact of workflow parameters. The pre-AI dataset included 4,694 chest CTPA examinations, 13.3% of which were PE-positive. The post-AI dataset comprised 6,558 examinations, 16.2% PE-positive. The mean TAT pre-AI and post-AI during work hours was 68.9 (95% confidence interval 55.0-82.8) and 46.7 (38.1-55.2) min, respectively; during off-hours it was 44.8 (33.7-55.9) and 42.0 (33.6-50.3) min. Clinically observed time savings during work hours (22.2 [95% confidence interval: 5.85-38.6] min) were significant (P = .004), while off-hour savings (2.82 [-11.1 to 16.7] min) were not (P = .345). Observed time savings aligned with model predictions (29.6 [95% range: 23.2-38.1] min for work hours; 2.10 [1.76, 2.58] min for off-hours). Consideration and quantification of the clinical workflow contributes to the accurate assessment of the expected time savings in report TAT after deployment of an AI triage device.
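The effect being modeled, positives jumping the reading queue, can be illustrated with a toy discrete-event simulation. This is not the paper's computational model: it assumes a single reader, non-preemptive priority, exponential interarrival and read times, and hypothetical parameters:

```python
import heapq
import random

def simulate_tat(prioritize: bool, n: int = 2000, seed: int = 0,
                 mean_interarrival: float = 10.0, mean_read: float = 8.0,
                 positive_rate: float = 0.15) -> float:
    """Mean turnaround time (wait + read) of positive exams for one reader.

    prioritize=True: AI-flagged positives are read before waiting negatives
    (non-preemptive). prioritize=False: strict first-in first-out reading.
    """
    rng = random.Random(seed)
    t, exams = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_interarrival)
        exams.append((t, rng.random() < positive_rate,
                      rng.expovariate(1.0 / mean_read)))

    queue, tats, clock, nxt = [], [], 0.0, 0
    while nxt < n or queue:
        # Admit every exam that has arrived by the current clock.
        while nxt < n and exams[nxt][0] <= clock:
            arr, pos, _ = exams[nxt]
            prio = (0 if pos else 1) if prioritize else 0
            heapq.heappush(queue, (prio, arr, nxt))
            nxt += 1
        if not queue:                 # reader idle: jump to next arrival
            clock = exams[nxt][0]
            continue
        _, arr, idx = heapq.heappop(queue)
        _, pos, read = exams[idx]
        clock = max(clock, arr) + read
        if pos:
            tats.append(clock - arr)
    return sum(tats) / len(tats)

fifo = simulate_tat(prioritize=False)
triage = simulate_tat(prioritize=True)
print(fifo > triage)  # positives finish sooner under AI triage
```

Even this toy model reproduces the study's qualitative finding: prioritization helps most when the reader is busy (work hours) and matters little when the queue is short (off-hours, i.e. lower utilization).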

Lai H, Wu Q, Weng Z, Lyu G, Yang W, Ye F

pubmed · Sep 29 2025
To develop and validate an ultrasonography-based machine learning (ML) model for predicting malignant endometrial and cavitary lesions. This retrospective study was conducted on patients with pathologically confirmed results following transvaginal or transrectal ultrasound from 2021 to 2023. Endometrial ultrasound features were characterized using the International Endometrial Tumor Analysis (IETA) terminology. The dataset was randomly divided (7:3) into training and validation sets. LASSO (least absolute shrinkage and selection operator) regression was applied for feature selection, and an extreme gradient boosting (XGBoost) model was developed. Performance was assessed via receiver operating characteristic (ROC) analysis, calibration, decision curve analysis, sensitivity, specificity, and accuracy. Among 1080 patients, 6 had a non-measurable endometrium. Of the remaining 1074 cases, 641 were premenopausal and 433 postmenopausal. On the test set, the XGBoost model achieved an area under the curve (AUC) of 0.845 (0.781-0.909) in the premenopausal group, with relatively low sensitivity (0.588, 0.442-0.722) and relatively high specificity (0.923, 0.863-0.959); in the postmenopausal group, the AUC was 0.968 (0.944-0.992), with both sensitivity (0.895, 0.778-0.956) and specificity (0.931, 0.839-0.974) relatively high. SHapley Additive exPlanations (SHAP) analysis identified key predictors: endometrial-myometrial junction, endometrial thickness, endometrial echogenicity, color Doppler flow score, and vascular pattern in premenopausal women; endometrial thickness, endometrial-myometrial junction, endometrial echogenicity, and color Doppler flow score in postmenopausal women. The XGBoost-based model exhibited excellent predictive performance, particularly in postmenopausal patients. SHAP analysis further enhances interpretability by identifying key ultrasonographic predictors of malignancy.
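The LASSO feature-selection step the abstract describes keeps only features whose coefficients survive the L1 penalty. A minimal sketch on synthetic data (not the study's ultrasound features; the signal features and penalty strength are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Outcome depends only on features 0 and 3; the other eight are noise.
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(np.abs(lasso.coef_) > 1e-6)
print(selected)  # indices of features with nonzero coefficients
```

The surviving features would then be passed to the downstream classifier (XGBoost in the study), with SHAP applied afterward to rank their contributions.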

Hu W, Du Q, Wei L, Wang D, Zhang G

pubmed · Sep 29 2025
To develop and validate a comprehensive and interpretable framework for multi-class classification of Alzheimer's disease (AD) progression stages based on hippocampal MRI, integrating radiomic, deep, and clinical features. This retrospective multi-center study included 2956 patients across four AD stages (Non-Demented, Very Mild Demented, Mild Demented, Moderate Demented). T1-weighted MRI scans were processed through a standardized pipeline involving hippocampal segmentation using four models (U-Net, nnU-Net, Swin-UNet, MedT). Radiomic features (n = 215) were extracted using the SERA platform, and deep features (n = 256) were learned using an LSTM network with attention applied to hippocampal slices. Fused features were harmonized with ComBat and filtered by ICC (≥ 0.75), followed by LASSO-based feature selection. Classification was performed using five machine learning models, including Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptron (MLP), and eXtreme Gradient Boosting (XGBoost). Model interpretability was addressed using SHAP, and a nomogram and decision curve analysis (DCA) were developed. Additionally, an end-to-end 3D CNN-LSTM model and two transformer-based benchmarks (Vision Transformer, Swin Transformer) were trained for comparative evaluation. MedT achieved the best hippocampal segmentation (Dice = 92.03% external). Fused features yielded the highest classification performance with XGBoost (external accuracy = 92.8%, AUC = 94.2%). SHAP identified MMSE, hippocampal volume, and APOE ε4 as top contributors. The nomogram accurately predicted early-stage AD with clinical utility confirmed by DCA. The end-to-end model performed acceptably (AUC = 84.0%) but lagged behind the fused pipeline. Statistical tests confirmed significant performance advantages for feature fusion and MedT-based segmentation. 
This study demonstrates that integrating radiomics, deep learning, and clinical data from hippocampal MRI enables accurate and interpretable classification of AD stages. The proposed framework is robust, generalizable, and clinically actionable, representing a scalable solution for AD diagnostics.
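The ICC ≥ 0.75 reliability filter used in the pipeline above is commonly the two-way random-effects, single-measure form, ICC(2,1). A minimal numpy sketch (not the authors' implementation; the exact ICC variant is an assumption):

```python
import numpy as np

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is an (n_subjects, k_raters) matrix, e.g. one radiomic
    feature's value under each of several segmentations.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-subject means
    col_means = ratings.mean(axis=0)   # per-rater means
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between raters
    sse = ((ratings - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Two perfectly agreeing "raters" give ICC = 1; features clearing the
# 0.75 cutoff would be retained for modeling.
vals = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
print(icc_2_1(vals))  # 1.0
```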