Latest Papers on Radiology AI.

A Multicentre Comparative Analysis of Radiomics, Deep-learning, and Fusion Models for Predicting Postpartum Hemorrhage.

Zhang W, Zhao X, Meng L, Lu L, Guo J, Cheng M, Tian H, Ren N, Yin J, Zhang X

•papers•Jun 24 2025

This study compared the capabilities of two-dimensional (2D) and three-dimensional (3D) deep learning (DL), radiomics, and fusion models to predict postpartum hemorrhage (PPH), using sagittal T2-weighted MRI images. This retrospective study successively included 581 pregnant women suspected of placenta accreta spectrum (PAS) disorders who underwent placental MRI assessment between May 2018 and June 2024 in two hospitals. Clinical information was collected, and MRI images were analyzed by two experienced radiologists. The study cohort was divided into training (hospital 1, n=470) and validation (hospital 2, n=160) sets. Radiomics features were extracted after image segmentation to develop the radiomics model, 2D and 3D DL models were developed, and two fusion strategies (early and late fusion) were used to construct the fusion models. ROC curves, AUC, sensitivity, specificity, calibration curves, and decision curve analysis were used to evaluate the models' performance. The late-fusion model (DLRad_LF) yielded the highest performance, with AUCs of 0.955 (95% CI: 0.935-0.974) and 0.898 (95% CI: 0.848-0.949) in the training and validation sets, respectively. In the validation set, the AUC of the 3D DL model was significantly larger than those of the radiomics (AUC=0.676, P<0.001) and 2D DL (AUC=0.752, P<0.001) models. Subgroup analysis found that placenta previa and PAS did not impact the models' performance significantly. The DLRad_LF model could predict PPH reasonably accurately based on sagittal T2-weighted MRI images.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab
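
As a rough illustration of the late-fusion idea and the AUC metric mentioned in the abstract above, here is a minimal sketch. The abstract does not say how the two models' outputs were combined, so the weighted averaging of probabilities, the function names, and the toy numbers are all assumptions made for illustration, not the authors' method.

```python
def auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def late_fusion(prob_a, prob_b, weight=0.5):
    """Hypothetical late fusion: weighted average of two models' predicted probabilities."""
    return [weight * a + (1 - weight) * b for a, b in zip(prob_a, prob_b)]


# Invented toy data (1 = PPH, 0 = no PPH); not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
radiomics_prob = [0.9, 0.4, 0.6, 0.5, 0.2, 0.3]
dl_prob = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]

fused_prob = late_fusion(radiomics_prob, dl_prob)
print(auc(labels, radiomics_prob), auc(labels, dl_prob), auc(labels, fused_prob))
```

In this toy case each single model misranks one pair, while the averaged score ranks all pairs correctly. Real data will not necessarily behave this way.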

From Faster Frames to Flawless Focus: Deep Learning HASTE in Postoperative Single Sequence MRI.

Hosse C, Fehrenbach U, Pivetta F, Malinka T, Wagner M, Walter-Rittel T, Gebauer B, Kolck J, Geisel D

•papers•Jun 24 2025

This study evaluates the feasibility of a novel deep learning-accelerated half-fourier single-shot turbo spin-echo sequence (HASTE-DL) compared to the conventional HASTE sequence (HASTES) in postoperative single-sequence MRI for the detection of fluid collections following abdominal surgery. As small fluid collections are difficult to visualize using other techniques, HASTE-DL may offer particular advantages in this clinical context. A retrospective analysis was conducted on 76 patients (mean age 65±11.69 years) who underwent abdominal MRI for suspected septic foci following abdominal surgery. Imaging was performed using 3-T MRI scanners, and both sequences were analyzed in terms of image quality, contrast, sharpness, and artifact presence. Quantitative assessments focused on fluid collection detectability, while qualitative assessments evaluated visualization of critical structures. Inter-reader agreement was measured using Cohen's kappa coefficient, and statistical significance was determined with the Mann-Whitney U test. HASTE-DL achieved a 46% reduction in scan time compared to HASTES, while significantly improving overall image quality (p<0.001), contrast (p<0.001), and sharpness (p<0.001). The inter-reader agreement for HASTE-DL was excellent (κ=0.960), with perfect agreement on overall image quality and fluid collection detection (κ=1.0). Fluid detectability and characterization scores were higher for HASTE-DL, and visualization of critical structures was significantly enhanced (p<0.001). No relevant artifacts were observed in either sequence. HASTE-DL offers superior image quality, improved visualization of critical structures, such as drainages, vessels, bile and pancreatic ducts, and reduced acquisition time, making it an effective alternative to the standard HASTE sequence, and a promising complementary tool in the postoperative imaging workflow.

MRI Reconstruction Abdominal Retrospective Clinical In Silico Academic Lab
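
The inter-reader agreement statistic used in the abstract above, Cohen's kappa, can be sketched in a few lines. The reader scores below are invented for illustration, and the 5-point rating scale is an assumption; the abstract does not state the scale used.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).

    Undefined (division by zero) when both raters use a single identical category.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented image-quality ratings from two readers for ten cases.
reader_1 = [5, 5, 4, 4, 3, 5, 4, 3, 5, 4]
reader_2 = [5, 5, 4, 4, 3, 5, 4, 4, 5, 4]

print(round(cohens_kappa(reader_1, reader_2), 3))
```

Here the readers disagree on one case out of ten, so raw agreement is 0.9, but kappa is lower because part of that agreement would be expected by chance.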

Differentiating adenocarcinoma and squamous cell carcinoma in lung cancer using semi automated segmentation and radiomics.

Vijitha R, Wickramasinghe WMIS, Perera PAS, Jayatissa RMGCSB, Hettiarachchi RT, Alwis HARV

•papers•Jun 24 2025

Adenocarcinoma (AD) and squamous cell carcinoma (SCC) are frequently observed forms of non-small cell lung cancer (NSCLC), playing a significant role in global cancer mortality. This research categorizes NSCLC subtypes by analyzing image details using computer-assisted semi-automatic segmentation and radiomic features in model development. The study included 80 patients (50 AD and 30 SCC); their images were analyzed using 3D Slicer software, and 107 quantitative radiomic features were extracted per patient. After eliminating correlated attributes, a LASSO binary logistic regression model with 10-fold cross-validation was used for feature selection. The Shapiro-Wilk test assessed radiomic score normality, and the Mann-Whitney U test compared score distributions. Random Forest (RF) and Support Vector Machine (SVM) classification models were implemented for subtype classification. Receiver operating characteristic (ROC) curves evaluated the radiomics score, showing a moderate predictive ability with a training set area under the curve (AUC) of 0.679 (95 % CI, 0.541-0.871) and a validation set AUC of 0.560 (95 % CI, 0.342-0.778). Rad-Score distributions were normal for AD and not normal for SCC. The RF and SVM classification models, which are based on the selected features, resulted in an RF accuracy (95 % CI) of 0.73 and an SVM accuracy (95 % CI) of 0.87, with respective AUC values of 0.54 and 0.87. These findings enhance the understanding that the two subtypes of NSCLC can be differentiated. The study demonstrated that radiomic analysis improves diagnostic accuracy and offers a non-invasive alternative. However, the AUCs and ROC curves for the machine learning models must be critically evaluated to ensure clinical acceptability. If robust, these models could reduce the need for biopsies and enhance personalized treatment planning. Further research is needed to validate these findings and integrate radiomics into NSCLC clinical practice.

CT Classification Chest Retrospective Clinical In Silico Academic Lab

Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance

Xuesong Li, Dianye Huang, Yameng Zhang, Nassir Navab, Zhongliang Jiang

•preprint•Jun 24 2025

Understanding medical ultrasound imaging remains a long-standing challenge due to significant visual variability caused by differences in imaging and acquisition parameters. Recent advancements in large language models (LLMs) have been used to automatically generate terminology-rich summaries orientated to clinicians with sufficient physiological knowledge. Nevertheless, the increasing demand for improved ultrasound interpretability and basic scanning guidance among non-expert users, e.g., in point-of-care settings, has not yet been explored. In this study, we first introduce the scene graph (SG) for ultrasound images to explain image content to ordinary users and provide guidance for ultrasound scanning. The ultrasound SG is first computed using a transformer-based one-stage method, eliminating the need for explicit object detection. To generate a graspable image explanation for ordinary users, the user query is then used to further refine the abstract SG representation through LLMs. Additionally, the predicted SG is explored for its potential in guiding ultrasound scanning toward missing anatomies within the current imaging view, assisting ordinary users in achieving more standardized and complete anatomical exploration. The effectiveness of this SG-based image explanation and scanning guidance has been validated on images from the left and right neck regions, including the carotid and thyroid, across five volunteers. The results demonstrate the potential of the method to maximally democratize ultrasound by enhancing its interpretability and usability for ordinary users.

Ultrasound Classification Vascular Methodology In Silico Academic Lab GenAI

Predicting enamel depth distribution of maxillary teeth based on intraoral scanning: A machine learning study.

Chen D, He X, Li Q, Wang Z, Shen J, Shen J

•papers•Jun 24 2025

Measuring enamel depth distribution (EDD) is of great importance for preoperative design of tooth preparations, restorative aesthetic preview, and monitoring enamel wear. However, there are currently no non-invasive methods available to efficiently obtain EDD. This study aimed to develop a machine learning (ML) framework to achieve non-invasive and radiation-free EDD predictions with intraoral scanning (IOS) images. Cone-beam computed tomography (CBCT) and IOS images of right maxillary central incisors, canines, and first premolars from 200 volunteers were included and preprocessed with surface parameterization. During the training stage, the EDD ground truths were obtained from CBCT. Five-dimensional features (incisal-gingival position, mesial-distal position, local surface curvature, incisal-gingival stretch, mesial-distal stretch) were extracted on labial enamel surfaces and served as inputs to the ML models. An eXtreme gradient boosting (XGB) model was trained to establish the mapping of features to the enamel depth values. R2 and mean absolute error (MAE) were utilized to evaluate the training accuracy of the XGB model. In the prediction stage, the predicted EDDs were compared with the ground truths, and the EDD discrepancies were analyzed using a paired t-test and the Frobenius norm. The XGB model achieved superior performance in training with average R2 and MAE values of 0.926 and 0.080, respectively. Independent validation confirmed its robust EDD prediction ability, showing no significant deviation from ground truths in the paired t-test and low prediction errors (Frobenius norm: 12.566-18.312), despite minor noise in IOS-based predictions. This study performed preliminary validation of an IOS-based ML model for high-quality EDD prediction.

CT Registration Retrospective Clinical In Silico Academic Lab
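
The two training metrics named in the abstract above, R2 and MAE, are simple to compute by hand. The sketch below uses invented depth values (units unspecified) purely to show the arithmetic; it says nothing about the study's actual data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between ground truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Invented enamel depth values: ground truth vs. model prediction.
depth_true = [0.5, 1.0, 1.5, 2.0]
depth_pred = [0.6, 0.9, 1.6, 1.9]

print(mean_absolute_error(depth_true, depth_pred), r_squared(depth_true, depth_pred))
```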

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

•preprint•Jun 24 2025

Background: Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics. Methods: We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0 - 1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches. Findings: Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046). Interpretation: The accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening. Funding: National Institutes of Health.
Research in context. Evidence before this study: The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and population and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUCs) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained with adjusting CAD thresholds between individuals based on personal characteristics.
Added value of this study: In this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity.
Implications of all the available evidence: To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.

X-Ray Detection Chest Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA
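
The comparison of universal and stratified CAD thresholds described in the abstract above can be illustrated with a small sketch. The records, stratum names, and threshold values are all invented and were chosen to make the effect visible; the study itself reports a much smaller gain (roughly 1 to 2 percentage points of sensitivity at fixed specificity).

```python
def sens_spec(records, threshold_for):
    """Sensitivity and specificity when each person is flagged if score >= their threshold."""
    tp = fn = tn = fp = 0
    for stratum, score, has_tb in records:
        flagged = score >= threshold_for(stratum)
        if has_tb:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    return tp / (tp + fn), tn / (tn + fp)


# (stratum, CAD score in [0, 1], Xpert-positive?): invented toy records.
records = [
    ("younger", 0.70, True), ("younger", 0.55, True),
    ("younger", 0.20, False), ("younger", 0.30, False),
    ("older", 0.80, True), ("older", 0.75, True),
    ("older", 0.70, False), ("older", 0.40, False),
]

# One threshold for everyone versus hypothetical per-stratum thresholds.
universal = sens_spec(records, lambda stratum: 0.65)
stratified = sens_spec(records, lambda stratum: {"younger": 0.50, "older": 0.72}[stratum])
print(universal, stratified)
```

In this toy data, the older stratum has a higher-scoring false positive and the younger stratum a lower-scoring true case, so a lower threshold for one group and a higher one for the other improves both metrics at once.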

Enhancing Lung Cancer Diagnosis: An Optimization-Driven Deep Learning Approach with CT Imaging.

Lakshminarasimha K, Priyeshkumar AT, Karthikeyan M, Sakthivel R

•papers•Jun 23 2025

Lung cancer (LC) remains a leading cause of mortality worldwide, affecting individuals across all genders and age groups. Early and accurate diagnosis is critical for effective treatment and improved survival rates. Computed Tomography (CT) imaging is widely used for LC detection and classification. However, manual identification can be time-consuming and error-prone due to the visual similarities among various LC types. Deep learning (DL) has shown significant promise in medical image analysis. Although numerous studies have investigated LC detection using deep learning techniques, the effective extraction of highly correlated features remains a significant challenge, thereby limiting diagnostic accuracy. Furthermore, most existing models encounter substantial computational complexity and find it difficult to efficiently handle the high-dimensional nature of CT images. This study introduces an optimized CBAM-EfficientNet model to enhance feature extraction and improve LC classification. EfficientNet is utilized to reduce computational complexity, while the Convolutional Block Attention Module (CBAM) emphasizes essential spatial and channel features. Additionally, optimization algorithms including Gray Wolf Optimization (GWO), Whale Optimization (WO), and the Bat Algorithm (BA) are applied to fine-tune hyperparameters and boost predictive accuracy. The proposed model, integrated with different optimization strategies, is evaluated on two benchmark datasets. The GWO-based CBAM-EfficientNet achieves outstanding classification accuracies of 99.81% and 99.25% on the Lung-PET-CT-Dx and LIDC-IDRI datasets, respectively. Following GWO, the BA-based CBAM-EfficientNet achieves 99.44% and 98.75% accuracy on the same datasets. Comparative analysis highlights the superiority of the proposed model over existing approaches, demonstrating strong potential for reliable and automated LC diagnosis. Its lightweight architecture also supports real-time implementation, offering valuable assistance to radiologists in high-demand clinical environments.

CT Classification Chest Methodology In Silico Academic Lab

MedSeg-R: Medical Image Segmentation with Clinical Reasoning

Hao Shao, Qibin Hou

•preprint•Jun 23 2025

Medical image segmentation is challenging due to overlapping anatomies with ambiguous boundaries and a severe imbalance between the foreground and background classes, which particularly affects the delineation of small lesions. Existing methods, including encoder-decoder networks and prompt-driven variants of the Segment Anything Model (SAM), rely heavily on local cues or user prompts and lack integrated semantic priors, thus failing to generalize well to low-contrast or overlapping targets. To address these issues, we propose MedSeg-R, a lightweight, dual-stage framework inspired by clinical reasoning. Its cognitive stage interprets the medical report into structured semantic priors (location, texture, shape), which are fused via a transformer block. In the perceptual stage, these priors modulate the SAM backbone: spatial attention highlights likely lesion regions, dynamic convolution adapts feature filters to expected textures, and deformable sampling refines spatial support. By embedding this fine-grained guidance early, MedSeg-R disentangles inter-class confusion and amplifies minority-class cues, greatly improving sensitivity to small lesions. In challenging benchmarks, MedSeg-R produces large Dice improvements in overlapping and ambiguous structures, demonstrating plug-and-play compatibility with SAM-based systems.

Segmentation Methodology In Silico Academic Lab Benchmark SOTA
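
The Dice score referred to in the abstract above, and why it is preferred over pixel accuracy when the foreground is small, can be shown with a short sketch. The masks are invented flat 0/1 lists; this is only the metric, not any part of the MedSeg-R model.

```python
def dice(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for binary masks given as flat 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2 * intersection / total


def pixel_accuracy(pred, target):
    """Share of pixels labelled correctly, dominated by background when lesions are small."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


# Invented example: a two-pixel lesion, of which the prediction finds only one pixel.
target = [0, 0, 1, 1, 0, 0, 0, 0]
pred = [0, 0, 1, 0, 0, 0, 0, 0]

print(dice(pred, target), pixel_accuracy(pred, target))
```

Missing half of the lesion still leaves pixel accuracy at 0.875, while Dice drops to about 0.67, which is why class imbalance makes small lesions hard to evaluate with accuracy alone.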

Benchmarking Foundation Models and Parameter-Efficient Fine-Tuning for Prognosis Prediction in Medical Imaging

Filippo Ruffini, Elena Mulero Ayllon, Linlin Shen, Paolo Soda, Valerio Guarrasi

•preprint•Jun 23 2025

Artificial Intelligence (AI) holds significant promise for improving prognosis prediction in medical imaging, yet its effective application remains challenging. In this work, we introduce a structured benchmark explicitly designed to evaluate and compare the transferability of Convolutional Neural Networks and Foundation Models in predicting clinical outcomes in COVID-19 patients, leveraging diverse publicly available Chest X-ray datasets. Our experimental methodology extensively explores a wide set of fine-tuning strategies, encompassing traditional approaches such as Full Fine-Tuning and Linear Probing, as well as advanced Parameter-Efficient Fine-Tuning methods including Low-Rank Adaptation, BitFit, VeRA, and IA3. The evaluations were conducted across multiple learning paradigms, including both extensive full-data scenarios and more clinically realistic Few-Shot Learning settings, which are critical for modeling rare disease outcomes and rapidly emerging health threats. By implementing a large-scale comparative analysis involving a diverse selection of pretrained models, including general-purpose architectures pretrained on large-scale datasets such as CLIP and DINOv2, to biomedical-specific models like MedCLIP, BioMedCLIP, and PubMedCLIP, we rigorously assess each model's capacity to effectively adapt and generalize to prognosis tasks, particularly under conditions of severe data scarcity and pronounced class imbalance. The benchmark was designed to capture critical conditions common in prognosis tasks, including variations in dataset size and class distribution, providing detailed insights into the strengths and limitations of each fine-tuning strategy. This extensive and structured evaluation aims to inform the practical deployment and adoption of robust, efficient, and generalizable AI-driven solutions in real-world clinical prognosis prediction workflows.

X-Ray Classification Chest Methodology In Silico Academic Lab Benchmark SOTA
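
To give a sense of why parameter-efficient methods such as Low-Rank Adaptation are attractive in the settings the abstract above describes, the sketch below counts trainable parameters for one weight matrix. In LoRA the frozen weight W (d_out x d_in) is adapted by adding a product B·A with B of shape d_out x r and A of shape r x d_in. The layer size 768 and rank 8 are illustrative assumptions, not values from the benchmark.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters for the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 768, 768, 8
full = full_finetune_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(full, lora, lora / full)
```

For this assumed layer, the low-rank update trains about 2% of the parameters that full fine-tuning would.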

Adaptive Mask-guided K-space Diffusion for Accelerated MRI Reconstruction

Qinrong Cai, Yu Guan, Zhibo Chen, Dong Liang, Qiuyun Fan, Qiegen Liu

•preprint•Jun 23 2025

As the deep learning revolution marches on, masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training, and has demonstrated exceptional performance in multiple fields. Magnetic Resonance Imaging (MRI) reconstruction is a critical task in medical imaging that seeks to recover high-quality images from under-sampled k-space data. However, previous MRI reconstruction strategies usually optimized the entire image domain or k-space, without considering the importance of different frequency regions in k-space. This work introduces a diffusion model based on adaptive masks (AMDM), which utilizes the adaptive adjustment of frequency distribution based on k-space data to develop a hybrid mask mechanism that adapts to different k-space inputs. This enables the effective separation of high-frequency and low-frequency components, producing diverse frequency-specific representations. Additionally, the k-space frequency distribution informs the generation of adaptive masks, which, in turn, guide a closed-loop diffusion process. Experimental results verified the ability of this method to learn specific frequency information and thereby improved the quality of MRI reconstruction, providing a flexible framework for optimizing k-space data using masks in the future.

MRI Reconstruction Methodology In Silico Academic Lab
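
The idea of separating low- and high-frequency components with k-space masks, mentioned in the abstract above, can be sketched on a one-dimensional signal. This is a deliberately simplified stand-in: it uses a fixed cutoff rather than the adaptive masks of the paper, involves no diffusion model, and the signal values and function names are invented.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (the 1-D analogue of going to k-space)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def low_frequency_mask(n, cutoff):
    """1 where |frequency| <= cutoff; index k and n - k share the same |frequency|."""
    return [1 if min(k, n - k) <= cutoff else 0 for k in range(n)]


# Invented signal; fixed cutoff chosen for illustration only.
signal = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 1.5, 2.5]
kspace = dft(signal)
low_mask = low_frequency_mask(len(signal), cutoff=1)
high_mask = [1 - m for m in low_mask]  # complementary mask

low_part = idft([m * v for m, v in zip(low_mask, kspace)])
high_part = idft([m * v for m, v in zip(high_mask, kspace)])
recombined = [(lo + hi).real for lo, hi in zip(low_part, high_part)]
print([round(v, 6) for v in recombined])
```

Because the two masks are complementary, the low- and high-frequency parts sum back to the original signal, which is the property that lets a model treat the two bands separately without losing information.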
