Peng Y, Hu Z, Wen M, Deng Y, Zhao D, Yu Y, Liang W, Dai X, Wang Y

PubMed · Jul 29 2025
Periventricular-intraventricular haemorrhage (IVH) is the most prevalent type of neonatal intracranial haemorrhage. It is especially threatening to preterm infants, in whom it is associated with significant morbidity and mortality. Cranial ultrasound has become an important means of screening periventricular IVH in infants. The integration of artificial intelligence with neonatal ultrasound is promising for enhancing diagnostic accuracy, reducing physician workload, and consequently improving periventricular IVH outcomes. The study investigated whether deep learning-based analysis of the cranial ultrasound images of infants could detect and grade periventricular IVH. This multicentre observational study included 1,060 cases and healthy controls from two hospitals. The retrospective modelling dataset encompassed 773 participants from January 2020 to July 2023, while the prospective two-centre validation dataset included 287 participants from August 2023 to January 2024. The periventricular IVH net model, a deep learning model incorporating the convolutional block attention module mechanism, was developed. The model's effectiveness was assessed by randomly dividing the retrospective data into training and validation sets, followed by independent validation with the prospective two-centre data. To evaluate the model, we measured its recall, precision, accuracy, F1-score, and area under the curve (AUC). The regions of interest (ROI) that influenced the detection by the deep learning model were visualised in significance maps, and the t-distributed stochastic neighbour embedding (t-SNE) algorithm was used to visualise the clustering of model detection parameters. The final retrospective dataset included 773 participants (mean (standard deviation (SD)) gestational age, 32.7 (4.69) weeks; mean (SD) weight, 1,862.60 (855.49) g). For the retrospective data, the model's AUC was 0.99 (95% confidence interval (CI), 0.98-0.99), precision was 0.92 (0.89-0.95), recall was 0.93 (0.89-0.95), and F1-score was 0.93 (0.90-0.95). For the prospective two-centre validation data, the model's AUC was 0.961 (95% CI, 0.94-0.98) and accuracy was 0.89 (95% CI, 0.86-0.92). The two-centre prospective validation results of the periventricular IVH net model demonstrated its tremendous potential for paediatric clinical applications. Combining artificial intelligence with paediatric ultrasound can enhance the accuracy and efficiency of periventricular IVH diagnosis, especially in primary hospitals or community hospitals.
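The "convolutional block attention module mechanism" mentioned above refers to CBAM-style attention. For reference, a minimal PyTorch sketch of such a block is given below; the reduction ratio, kernel size, and where the block sits inside the detection network are assumptions, not details taken from the paper.

```python
# Minimal CBAM-style block: channel attention followed by spatial attention.
# Illustrative only; not the authors' periventricular IVH net implementation.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))     # global average-pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))      # global max-pooling branch
        return torch.sigmoid(avg + mx).view(b, c, 1, 1) * x

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)      # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)       # channel-wise max map
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1))) * x

class CBAMBlock(nn.Module):
    """Applies channel attention, then spatial attention, to a feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```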

Kotti J, Chalasani V, Rajan C

PubMed · Jul 29 2025
Brain tumour (BT) is characterised by the uncontrolled proliferation of cells within the brain, which can result in cancer. Detecting BT at an early stage significantly increases the patient's chances of survival. Existing BT detection methods often struggle with high computational complexity, limited feature discrimination, and poor generalisation. To mitigate these issues, an effective brain tumour detection and segmentation method based on a hybrid network named MobileNet-Deep Batch-Normalized eLU AlexNet (M-DbneAlexnet) is developed for Magnetic Resonance Imaging (MRI). Image enhancement is performed with a Piecewise Linear Transformation (PLT) function, the BT region is segmented with Transformer Brain Tumour Segmentation (TransBTSV2), and features are then extracted. Finally, BT is detected using the M-DbneAlexnet model, which is devised by combining MobileNet and Deep Batch-Normalized eLU AlexNet (DbneAlexnet). Results: The proposed model achieved an accuracy of 92.68%, sensitivity of 93.02%, and specificity of 92.85%, demonstrating its effectiveness in accurately detecting brain tumours from MRI images. The proposed model enhances training speed and performs well on limited datasets, making it effective for distinguishing between tumour and healthy tissue. Its practical utility lies in enabling early detection and diagnosis of brain tumours, which can significantly reduce mortality rates.
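For readers unfamiliar with piecewise linear transformation (PLT) as an enhancement step, the sketch below shows a generic three-segment intensity mapping in NumPy; the breakpoints are illustrative values, not those used by the authors.

```python
# Generic three-segment piecewise linear intensity transform for an 8-bit image.
# Breakpoints (r1, s1) and (r2, s2) are example values, not the paper's settings.
import numpy as np

def piecewise_linear(img: np.ndarray, r1=70, s1=30, r2=160, s2=220) -> np.ndarray:
    """Map [0, r1] -> [0, s1], (r1, r2] -> (s1, s2], (r2, 255] -> (s2, 255]."""
    img = img.astype(np.float32)
    out = np.empty_like(img)
    low = img <= r1
    mid = (img > r1) & (img <= r2)
    high = img > r2
    out[low] = img[low] * (s1 / r1)
    out[mid] = s1 + (img[mid] - r1) * ((s2 - s1) / (r2 - r1))
    out[high] = s2 + (img[high] - r2) * ((255 - s2) / (255 - r2))
    return np.clip(out, 0, 255).astype(np.uint8)
```

Stretching the middle segment in this way increases contrast in the intensity range where tumour boundaries typically lie, which is the usual motivation for PLT-based enhancement.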

Saadh MJ, Hussain QM, Albadr RJ, Doshi H, Rekha MM, Kundlas M, Pal A, Rizaev J, Taher WM, Alwan M, Jawad MJ, Al-Nuaimi AMA, Farhood B

PubMed · Jul 29 2025
Objective: This study aimed to develop a robust framework for breast cancer diagnosis by integrating advanced segmentation and classification approaches. Transformer-based and U-Net segmentation models were combined with radiomic feature extraction and machine learning classifiers to improve segmentation precision and classification accuracy in mammographic images. Materials and Methods: A multi-center dataset of 8,000 mammograms (4,200 normal, 3,800 abnormal) was used. Segmentation was performed using Transformer-based and U-Net models, evaluated through Dice Coefficient (DSC), Intersection over Union (IoU), Hausdorff Distance (HD95), and pixel-wise accuracy. Radiomic features were extracted from segmented masks, with Recursive Feature Elimination (RFE) and Analysis of Variance (ANOVA) employed to select significant features. Classifiers including Logistic Regression, XGBoost, CatBoost, and a Stacking Ensemble model were applied to classify tumors as benign or malignant. Classification performance was assessed using accuracy, sensitivity, F1 score, and AUC-ROC. SHAP analysis validated feature importance, and Q-value heatmaps evaluated statistical significance. Results: The Transformer-based model achieved superior segmentation results with DSC (0.94 ± 0.01 training, 0.92 ± 0.02 test), IoU (0.91 ± 0.01 training, 0.89 ± 0.02 test), HD95 (3.0 ± 0.3 mm training, 3.3 ± 0.4 mm test), and pixel-wise accuracy (0.96 ± 0.01 training, 0.94 ± 0.02 test), consistently outperforming U-Net across all metrics. For classification, Transformer-segmented features with the Stacking Ensemble achieved the highest test results: 93% accuracy, 92% sensitivity, 93% F1 score, and 95% AUC. U-Net-segmented features achieved lower metrics, with the best test accuracy at 84%. SHAP analysis confirmed the importance of features such as Gray-Level Non-Uniformity and Zone Entropy. Conclusion: This study demonstrates the superiority of Transformer-based segmentation integrated with radiomic feature selection and robust classification models. The framework provides a precise and interpretable solution for breast cancer diagnosis, with potential for scalability to 3D imaging and multimodal datasets.
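The feature-selection and stacking pipeline described above can be sketched with scikit-learn and xgboost as follows. CatBoost is omitted for brevity, and the feature counts, hyperparameters, and the radiomic feature matrix X are assumptions rather than the authors' configuration.

```python
# Sketch of ANOVA filtering + RFE + a stacking ensemble over radiomic features.
# X (radiomic features from segmentation masks) and y (labels) are assumed to exist.
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("xgb", XGBClassifier(n_estimators=300, eval_metric="logloss")),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)

pipeline = Pipeline([
    ("anova", SelectKBest(f_classif, k=50)),                              # ANOVA filter
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=20)),
    ("clf", stack),
])

# pipeline.fit(X_train, y_train)
# y_prob = pipeline.predict_proba(X_test)[:, 1]   # malignancy probability
```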

Peiran Gu, Teng Yao, Mengshen He, Fuhao Duan, Feiyan Liu, RenYuan Peng, Bao Ge

arXiv preprint · Jul 29 2025
In recent years, artificial intelligence has been increasingly applied in the field of medical imaging. Among these applications, fundus image analysis presents special challenges, including small lesion areas in certain fundus diseases and subtle inter-disease differences, which can lead to reduced prediction accuracy and overfitting. To address these challenges, this paper proposes SwinECAT, a Transformer-based model that combines Shifted Window (Swin) attention with Efficient Channel Attention (ECA). SwinECAT leverages the Swin attention mechanism in the Swin Transformer backbone to effectively capture local spatial structures and long-range dependencies within fundus images. The lightweight ECA mechanism is incorporated to guide the model's attention toward critical feature channels, enabling more discriminative feature representations. In contrast to previous studies that typically classify fundus images into 4 to 6 categories, this work expands fundus disease classification to 9 distinct types, thereby enhancing the granularity of diagnosis. We evaluate our method on the Eye Disease Image Dataset (EDID), containing 16,140 fundus images, for 9-category classification. Experimental results demonstrate that SwinECAT achieves 88.29% accuracy, with a weighted F1-score of 0.88 and a macro F1-score of 0.90. The classification results of SwinECAT significantly outperform the baseline Swin Transformer and multiple other baseline models. To our knowledge, this represents the highest reported performance for 9-category classification on this public dataset.
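Efficient Channel Attention is a lightweight, widely used module; a minimal PyTorch version is sketched below. The kernel size and the point at which the block is inserted into the Swin backbone are assumptions, not details from the paper.

```python
# Minimal efficient channel attention (ECA) block: global average pooling followed
# by a 1D convolution across channels and a sigmoid gate. Illustrative only.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):                          # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                     # squeeze to a channel descriptor (B, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)   # local cross-channel interaction
        return x * torch.sigmoid(y).view(x.size(0), -1, 1, 1)
```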

Yutao Hu, Ying Zheng, Shumei Miao, Xiaolei Zhang, Jiahao Xia, Yaolei Qi, Yiyang Zhang, Yuting He, Qian Chen, Jing Ye, Hongyan Qiao, Xiuhua Hu, Lei Xu, Jiayin Zhang, Hui Liu, Minwen Zheng, Yining Wang, Daimin Zhang, Ji Zhang, Wenqi Shao, Yun Liu, Longjiang Zhang, Guanyu Yang

arXiv preprint · Jul 29 2025
Foundation models have demonstrated remarkable potential in the medical domain; however, their application to complex cardiovascular diagnostics remains underexplored. In this paper, we present Cardiac-CLIP, a multi-modal foundation model designed for 3D cardiac CT images. Cardiac-CLIP is developed through a two-stage pre-training strategy. The first stage employs a 3D masked autoencoder (MAE) to perform self-supervised representation learning from large-scale unlabeled volumetric data, enabling the visual encoder to capture rich anatomical and contextual features. In the second stage, contrastive learning is introduced to align visual and textual representations, facilitating cross-modal understanding. To support the pre-training, we collect 16,641 real clinical CT scans, supplemented by 114k publicly available scans. Meanwhile, we standardize free-text radiology reports into unified templates and construct pathology vectors according to diagnostic attributes, from which a soft-label matrix is generated to supervise the contrastive learning process. To comprehensively evaluate the effectiveness of Cardiac-CLIP, we collect 6,722 real clinical cases from 12 independent institutions, along with open-source data, to construct the evaluation dataset. Cardiac-CLIP is evaluated across multiple tasks, including cardiovascular abnormality classification, information retrieval, and clinical analysis. Experimental results demonstrate that Cardiac-CLIP achieves state-of-the-art performance across various downstream tasks on both internal and external data. In particular, Cardiac-CLIP is highly effective in supporting complex clinical tasks such as the prospective prediction of acute coronary syndrome, which is notoriously difficult in real-world scenarios.
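A hedged sketch of contrastive image-text alignment supervised by a soft-label matrix, in the spirit of the second pre-training stage, is given below. The encoders, the construction of the soft labels from pathology vectors, and the temperature are placeholders, and soft-target cross entropy requires PyTorch 1.10 or later.

```python
# CLIP-style contrastive loss with soft targets instead of one-hot identity labels.
# img_emb, txt_emb: (B, D) L2-normalized embeddings from the visual and text encoders.
# soft_labels: (B, B) target similarity matrix (e.g. derived from pathology vectors),
# assumed here to have rows that sum to 1. Illustrative, not the authors' code.
import torch
import torch.nn.functional as F

def soft_label_contrastive_loss(img_emb, txt_emb, soft_labels, temperature=0.07):
    logits = img_emb @ txt_emb.t() / temperature
    loss_i2t = F.cross_entropy(logits, soft_labels)        # image-to-text direction
    loss_t2i = F.cross_entropy(logits.t(), soft_labels.t())  # text-to-image direction
    return 0.5 * (loss_i2t + loss_t2i)
```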

Julia Wolleb, Florentin Bieder, Paul Friedrich, Hemant D. Tagare, Xenophon Papademetris

arXiv preprint · Jul 29 2025
Ultrasound is widely used in clinical care, yet standard deep learning methods often struggle with full video analysis due to non-standardized acquisition and operator bias. We offer a new perspective on ultrasound video analysis through implicit neural representations (INRs). We build on Functa, an INR framework in which each image is represented by a modulation vector that conditions a shared neural network. However, its extension to the temporal domain of medical videos remains unexplored. To address this gap, we propose VidFuncta, a novel framework that leverages Functa to encode variable-length ultrasound videos into compact, time-resolved representations. VidFuncta disentangles each video into a static video-specific vector and a sequence of time-dependent modulation vectors, capturing both temporal dynamics and dataset-level redundancies. Our method outperforms 2D and 3D baselines on video reconstruction and enables downstream tasks to directly operate on the learned 1D modulation vectors. We validate VidFuncta on three public ultrasound video datasets -- cardiac, lung, and breast -- and evaluate its downstream performance on ejection fraction prediction, B-line detection, and breast lesion classification. These results highlight the potential of VidFuncta as a generalizable and efficient representation framework for ultrasound videos. Our code is publicly available under https://github.com/JuliaWolleb/VidFuncta_public.
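The core idea, a shared coordinate network conditioned by a static per-video vector plus time-dependent modulation vectors, can be illustrated roughly as below. This is a loose sketch of a shift-modulated implicit neural representation under assumed dimensions, not the authors' Functa-based implementation.

```python
# Rough sketch: one shared coordinate MLP reconstructs every frame; the frame identity
# enters only through (video_vec + frame_vec) shift modulations of the hidden layers.
import torch
import torch.nn as nn

class ModulatedINR(nn.Module):
    def __init__(self, mod_dim=128, hidden=256, layers=4):
        super().__init__()
        self.inp = nn.Linear(2, hidden)                     # (x, y) pixel coordinates
        self.hidden = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(layers)])
        self.mods = nn.ModuleList([nn.Linear(mod_dim, hidden) for _ in range(layers)])
        self.out = nn.Linear(hidden, 1)                     # grayscale ultrasound intensity

    def forward(self, coords, video_vec, frame_vec):
        # coords: (N, 2); video_vec, frame_vec: (mod_dim,)
        m = video_vec + frame_vec                           # static + time-dependent parts
        h = torch.sin(self.inp(coords))
        for lin, mod in zip(self.hidden, self.mods):
            h = torch.sin(lin(h) + mod(m))                  # shift modulation per layer
        return self.out(h)
```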

Jianfei Zhu, Haiqi Zhu, Shaohui Liu, Feng Jiang, Baichun Wei, Chunzhi Yi

arXiv preprint · Jul 29 2025
Recent deep learning approaches have shown promise in learning individual brain parcellations from functional magnetic resonance imaging (fMRI). However, most existing methods assume consistent data distributions across domains and struggle with the domain shifts inherent to real-world cross-dataset scenarios. To address this challenge, we propose Graph Domain Adaptation for Individual Parcellation (GDAIP), a novel framework that integrates Graph Attention Networks (GAT) with Minimax Entropy (MME)-based domain adaptation. We construct cross-dataset brain graphs at both the group and individual levels. By leveraging semi-supervised training and adversarial optimization of the prediction entropy on unlabeled vertices of the target brain graph, the reference atlas is adapted from the group-level brain graph to the individual brain graph, enabling individual parcellation in cross-dataset settings. We evaluated our method using parcellation visualization, the Dice coefficient, and functional homogeneity. Experimental results demonstrate that GDAIP produces individual parcellations with topologically plausible boundaries, strong cross-session consistency, and the ability to reflect functional organization.
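The minimax-entropy component can be illustrated with a gradient-reversal layer: the entropy of predictions on unlabeled target vertices is maximized with respect to the classifier and minimized with respect to the feature extractor. The sketch below is a generic illustration of MME, not the GDAIP code.

```python
# Generic minimax-entropy loss with gradient reversal. A single backward pass through
# this loss pushes the classifier to increase entropy on unlabeled target features
# while the feature extractor (upstream of GradReverse) is pushed to decrease it.
import torch

def prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    p = torch.softmax(logits, dim=-1)
    return -(p * torch.log(p + 1e-8)).sum(dim=-1).mean()

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # flip the gradient sign, scaled by lam

def adversarial_entropy_loss(target_features, classifier, lam=0.1):
    logits = classifier(GradReverse.apply(target_features, lam))
    return prediction_entropy(logits)
```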

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

arXiv preprint · Jul 29 2025
Medical image-language pre-training aims to align medical images with clinically relevant text to improve model performance on various downstream tasks. However, existing models often struggle with the variability and ambiguity inherent in medical data, limiting their ability to capture nuanced clinical information and uncertainty. This work introduces an uncertainty-aware medical image-text pre-training model that enhances generalization capabilities in medical image analysis. Building on previous methods and focusing on chest X-rays, our approach utilizes structured text reports generated by a large language model (LLM) to augment image data with clinically relevant context. These reports begin with a definition of the disease, followed by an 'appearance' section to highlight critical regions of interest, and finally 'observations' and 'verdicts' that ground model predictions in clinical semantics. By modeling both inter- and intra-modal uncertainty, our framework captures the inherent ambiguity in medical images and text, yielding improved representations and performance on downstream tasks. Our model demonstrates significant advances in medical image-text pre-training, obtaining state-of-the-art performance on multiple downstream tasks.

Arshad M, Wang C, Us Sima MW, Ali Shaikh J, Karamti H, Alharthi R, Selecky J

PubMed · Jul 29 2025
Accurate segmentation of the prostate peripheral zone (PZ) in T2-weighted MRI is critical for the early detection of prostate cancer. Existing segmentation methods are hindered by significant inter-observer variability (37.4 ± 5.6%), poor boundary localization, and the presence of motion artifacts, along with challenges in clinical integration. In this study, we propose BioAug-Net, a novel framework that integrates real-time physiological signal feedback with MRI data, leveraging transformer-based attention mechanisms and a probabilistic clinical decision support system (PCDSS). BioAug-Net features a dual-branch asymmetric attention mechanism: one branch processes spatial MRI features, while the other incorporates temporal sensor signals through a BiGRU-driven adaptive masking module. Additionally, a Markov Decision Process-based PCDSS maps segmentation outputs to clinical PI-RADS scores, with uncertainty quantification. We validated BioAug-Net on a multi-institutional dataset (n=1,542) and demonstrated state-of-the-art performance, achieving a Dice Similarity Coefficient of 89.7% (p < 0.001), sensitivity of 91.2% (p < 0.001), specificity of 88.4% (p < 0.001), and HD95 of 2.14 mm (p < 0.001), outperforming U-Net, Attention U-Net, and TransUNet. Sensor integration improved segmentation accuracy by 12.6% (p < 0.001) and reduced inter-observer variation by 48.3% (p < 0.001). Radiologist evaluations (n=3) confirmed a 15.0% reduction in diagnosis time (p = 0.003) and an increase in inter-reader agreement from K = 0.68 to K = 0.82 (p = 0.001). Our results show that BioAug-Net offers a clinically viable solution for early prostate cancer detection through enhanced physiological coupling and explainable AI diagnostics.
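For reference, the segmentation metrics reported above (Dice, sensitivity, specificity) can be computed for binary masks as in the sketch below. This is generic evaluation code, not the authors' pipeline; HD95 is omitted because it additionally requires a surface-distance computation.

```python
# Generic binary-mask metrics: Dice similarity coefficient, sensitivity, specificity.
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    return (2.0 * np.logical_and(pred, gt).sum() + eps) / (pred.sum() + gt.sum() + eps)

def sensitivity_specificity(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return tp / (tp + fn + eps), tn / (tn + fp + eps)
```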

Zhao Y, Li B, Qian L, Chen X, Wang Y, Cui L, Xin Y, Liu L

PubMed · Jul 29 2025
Vertebral recompression after percutaneous kyphoplasty (PKP) for osteoporotic vertebral compression fractures (OVCFs) may lead to recurrent pain, deformity, and neurological impairment, compromising prognosis and quality of life. To identify independent risk factors for postoperative recompression and develop predictive models for risk assessment. We retrospectively analyzed 284 OVCF patients treated with PKP, grouped by recompression status. Predictors were screened using univariate and correlation analyses. Multicollinearity was assessed using variance inflation factor (VIF). A multivariable logistic regression model was constructed and validated via 10-fold cross-validation and temporal validation. Five independent predictors were identified: incomplete anterior cortex (odds ratio [OR] = 9.38), high paravertebral muscle fat infiltration (OR = 218.68), low vertebral CT value (OR = 0.87), large Cobb change (OR = 1.45), and high vertebral height recovery rate (OR = 22.64). The logistic regression model achieved strong performance: accuracy 97.67%, precision 97.06%, recall 97.06%, F1 score 97.06%, specificity 98.08%, area under the receiver operating characteristic curve (AUC) 0.998. Machine learning models (e.g., random forest) were also evaluated but did not outperform logistic regression in accuracy or interpretability. Five imaging-based predictors of vertebral recompression were identified. The logistic regression model showed excellent predictive accuracy and generalizability, supporting its clinical utility for early risk stratification and personalized decision-making in OVCF patients undergoing PKP.
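The validation scheme described, a multivariable logistic regression over the five predictors assessed with 10-fold cross-validation, can be sketched with scikit-learn as follows; the column names and the data-loading step are hypothetical.

```python
# Minimal sketch: logistic regression over the five imaging predictors, evaluated
# with stratified 10-fold cross-validation on AUC. File and column names are assumed.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

features = [
    "incomplete_anterior_cortex",
    "paravertebral_muscle_fat_infiltration",
    "vertebral_ct_value",
    "cobb_change",
    "vertebral_height_recovery_rate",
]

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# df = pd.read_csv("ovcf_cohort.csv")              # hypothetical cohort table
# X, y = df[features], df["recompression"]
# cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
# auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
# print(auc.mean())
```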