Page 35 of 399 · 3,982 results

Automated ultrasound doppler angle estimation using deep learning

Nilesh Patil, Ajay Anand

arXiv preprint · Aug 6, 2025
Angle estimation is an important step in the Doppler ultrasound clinical workflow to measure blood velocity. It is widely recognized that incorrect angle estimation is a leading cause of error in Doppler-based blood velocity measurements. In this paper, we propose a deep learning-based approach for automated Doppler angle estimation. The approach was developed using 2100 human carotid ultrasound images, including image augmentation. Five pre-trained models were used to extract image features, which were passed to a custom shallow network for Doppler angle estimation. For comparison, measurements were obtained independently by a human observer reviewing the images. The mean absolute error (MAE) between the automated and manual angle estimates ranged from 3.9° to 9.4° for the models evaluated. Furthermore, the MAE for the best-performing model was less than the acceptable clinical Doppler angle error threshold, thus avoiding misclassification of normal velocity values as a stenosis. The results demonstrate the potential of a deep learning-based technique for automated ultrasound Doppler angle estimation, which could be implemented within the imaging software on commercial ultrasound scanners.
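The clinical stakes of angle error follow from the Doppler equation, v = c·Δf/(2·f₀·cos θ), where Δf is the measured Doppler shift, f₀ the transmit frequency, c the speed of sound in tissue (~1540 m/s), and θ the Doppler angle. Because of the cosine term, a few degrees of angle error at steep insonation angles inflates the velocity estimate substantially. A minimal sketch of that propagation (the specific frequencies and angles below are illustrative, not from the paper):

```python
import math

def doppler_velocity(delta_f_hz, f0_hz, angle_deg, c_m_s=1540.0):
    """Blood velocity from the Doppler equation: v = c*Δf / (2*f0*cos(θ))."""
    return c_m_s * delta_f_hz / (2.0 * f0_hz * math.cos(math.radians(angle_deg)))

def relative_velocity_error(true_angle_deg, estimated_angle_deg):
    """Relative velocity error caused by angle-correcting with the wrong angle:
    v_est / v_true - 1 = cos(θ_true) / cos(θ_est) - 1."""
    return (math.cos(math.radians(true_angle_deg))
            / math.cos(math.radians(estimated_angle_deg))) - 1.0

# A 1 kHz shift at 5 MHz and a 60° angle gives ~0.31 m/s ...
v = doppler_velocity(1000.0, 5e6, 60.0)
# ... but overestimating the angle by just 5° inflates velocity by ~18%.
err = relative_velocity_error(60.0, 65.0)
```

This is why the paper's comparison of model MAE against the clinical angle-error threshold matters: errors past that threshold can push a normal velocity into the stenosis range.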

TCSAFormer: Efficient Vision Transformer with Token Compression and Sparse Attention for Medical Image Segmentation

Zunhui Xia, Hongxing Li, Libin Lan

arXiv preprint · Aug 6, 2025
In recent years, transformer-based methods have achieved remarkable progress in medical image segmentation due to their superior ability to capture long-range dependencies. However, these methods typically suffer from two major limitations. First, their computational complexity scales quadratically with input sequence length. Second, the feed-forward network (FFN) modules in vanilla Transformers typically rely on fully connected layers, which limits models' ability to capture local contextual information and multiscale features critical for precise semantic segmentation. To address these issues, we propose an efficient medical image segmentation network, named TCSAFormer. The proposed TCSAFormer adopts two key ideas. First, it incorporates a Compressed Attention (CA) module, which combines token compression and pixel-level sparse attention to dynamically focus on the most relevant key-value pairs for each query. This is achieved by pruning globally irrelevant tokens and merging redundant ones, significantly reducing computational complexity while enhancing the model's ability to capture relationships between tokens. Second, it introduces a Dual-Branch Feed-Forward Network (DBFFN) module as a replacement for the standard FFN to capture local contextual features and multiscale information, thereby strengthening the model's feature representation capability. We conduct extensive experiments on three publicly available medical image segmentation datasets: ISIC-2018, CVC-ClinicDB, and Synapse, to evaluate the segmentation performance of TCSAFormer. Experimental results demonstrate that TCSAFormer achieves superior performance compared to existing state-of-the-art (SOTA) methods, while maintaining lower computational overhead, thus achieving an optimal trade-off between efficiency and accuracy.
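The core idea of attending only to the most relevant key-value pairs per query can be sketched independently of TCSAFormer's actual CA module (which additionally merges redundant tokens). Below is a minimal top-k sparse attention in plain Python on hypothetical toy vectors; it illustrates the pruning step, not the paper's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_attention(queries, keys, values, k):
    """Each query attends only to its top-k keys by scaled dot-product
    score; all other key-value pairs are pruned before the softmax."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in keys]
        top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        weights = softmax([scores[i] for i in top])
        out.append([sum(w * values[i][j] for w, i in zip(weights, top))
                    for j in range(len(values[0]))])
    return out
```

With k much smaller than the token count, the softmax and weighted sum touch only k key-value pairs per query, which is the source of the complexity reduction the abstract describes.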

Small Lesions-aware Bidirectional Multimodal Multiscale Fusion Network for Lung Disease Classification

Jianxun Yu, Ruiquan Ge, Zhipeng Wang, Cheng Yang, Chenyu Lin, Xianjun Fu, Jikui Liu, Ahmed Elazab, Changmiao Wang

arXiv preprint · Aug 6, 2025
The diagnosis of medical diseases faces challenges such as the misdiagnosis of small lesions. Deep learning, particularly multimodal approaches, has shown great potential in the field of medical disease diagnosis. However, the differences in dimensionality between medical imaging and electronic health record data present challenges for effective alignment and fusion. To address these issues, we propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). This model employs a feature pyramid structure combined with an efficient 3D multi-scale convolutional attention module to extract lesion-specific features from 3D medical images. To further enhance multimodal data integration, MMCAF-Net incorporates a multi-scale cross-attention module, which resolves dimensional inconsistencies, enabling more effective feature fusion. We evaluated MMCAF-Net on the Lung-PET-CT-Dx dataset, and the results showed a significant improvement in diagnostic accuracy, surpassing current state-of-the-art methods. The code is available at https://github.com/yjx1234/MMCAF-Net.

Towards Globally Predictable k-Space Interpolation: A White-box Transformer Approach

Chen Luo, Qiyu Jin, Taofeng Xie, Xuemei Wang, Huayu Wang, Congcong Liu, Liming Tang, Guoqing Chen, Zhuo-Xu Cui, Dong Liang

arXiv preprint · Aug 6, 2025
Interpolating missing data in k-space is essential for accelerating imaging. However, existing methods, including convolutional neural network-based deep learning, primarily exploit local predictability while overlooking the inherent global dependencies in k-space. Recently, Transformers have demonstrated remarkable success in natural language processing and image analysis due to their ability to capture long-range dependencies. This inspires the use of Transformers for k-space interpolation to better exploit its global structure. However, their lack of interpretability raises concerns regarding the reliability of interpolated data. To address this limitation, we propose GPI-WT, a white-box Transformer framework based on Globally Predictable Interpolation (GPI) for k-space. Specifically, we formulate GPI from the perspective of annihilation as a novel k-space structured low-rank (SLR) model. The global annihilation filters in the SLR model are treated as learnable parameters, and the subgradients of the SLR model naturally induce a learnable attention mechanism. By unfolding the subgradient-based optimization algorithm of SLR into a cascaded network, we construct the first white-box Transformer specifically designed for accelerated MRI. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art approaches in k-space interpolation accuracy while providing superior interpretability.

UNISELF: A Unified Network with Instance Normalization and Self-Ensembled Lesion Fusion for Multiple Sclerosis Lesion Segmentation

Jinwei Zhang, Lianrui Zuo, Blake E. Dewey, Samuel W. Remedios, Yihao Liu, Savannah P. Hays, Dzung L. Pham, Ellen M. Mowry, Scott D. Newsome, Peter A. Calabresi, Aaron Carass, Jerry L. Prince

arXiv preprint · Aug 6, 2025
Automated segmentation of multiple sclerosis (MS) lesions using multicontrast magnetic resonance (MR) images improves efficiency and reproducibility compared to manual delineation, with deep learning (DL) methods achieving state-of-the-art performance. However, these DL-based methods have yet to simultaneously optimize in-domain accuracy and out-of-domain generalization when trained on a single source with limited data, or their performance has been unsatisfactory. To fill this gap, we propose UNISELF, a method that achieves high accuracy within a single training domain while generalizing strongly across multiple out-of-domain test datasets. UNISELF employs a novel test-time self-ensembled lesion fusion to improve segmentation accuracy, and leverages test-time instance normalization (TTIN) of latent features to address domain shifts and missing input contrasts. Trained on the ISBI 2015 longitudinal MS segmentation challenge training dataset, UNISELF ranks among the best-performing methods on the challenge test dataset. It also outperforms all benchmark methods trained on the same ISBI data across diverse out-of-domain test datasets, including the public MICCAI 2016 and UMCL datasets as well as a private multisite dataset; these datasets exhibit domain shifts and/or missing contrasts caused by variations in acquisition protocols, scanner types, and imaging artifacts. Our code is available at https://github.com/uponacceptance.
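Test-time instance normalization works by computing normalization statistics from the test instance itself rather than from stored training-set statistics, which is what lets it absorb scanner-to-scanner intensity shifts. A minimal single-channel sketch of that idea (not the paper's implementation, which operates on latent features inside the network):

```python
import math

def instance_normalize(feature_map, eps=1e-5):
    """Normalize one 2-D feature channel to zero mean and unit variance
    using statistics computed from this instance alone, so a global
    intensity shift or rescaling of the input cancels out."""
    flat = [v for row in feature_map for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in feature_map]
```

Because the statistics are per-instance, two scans that differ only by an affine intensity change normalize to (nearly) the same tensor, one plausible reading of why TTIN helps under domain shift.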

Predicting language outcome after stroke using machine learning: in search of the big data benefit.

Saranti M, Neville D, White A, Rotshtein P, Hope TMH, Price CJ, Bowman H

PubMed paper · Aug 6, 2025
Accurate prediction of post-stroke language outcomes using machine learning offers the potential to enhance clinical treatment and rehabilitation for aphasic patients. This study of 758 English-speaking stroke patients from the PLORAS project explores the impact of sample size on the performance of logistic regression and a deep learning (ResNet-18) model in predicting language outcomes from neuroimaging and impairment-relevant tabular data. We assessed the performance of both models on two key language tasks from the Comprehensive Aphasia Test: Spoken Picture Description and Naming, using a learning curve approach. Contrary to expectations, the simpler logistic regression model performed comparably to, or better than, the deep learning model (with overlapping confidence intervals), with both models showing an accuracy plateau around 80% for sample sizes larger than 300 patients. Principal Component Analysis revealed that the dimensionality of the neuroimaging data could be reduced to as few as 20 (or even 2) dominant components without significant loss in accuracy, suggesting that classification may be driven by simple patterns such as lesion size. The study highlights both the potential limitations of current dataset size in achieving further accuracy gains and the need for larger datasets to capture more complex patterns, as some of our results indicate that we might not have reached an absolute classification performance ceiling. Overall, these findings provide insights into the practical use of machine learning for predicting aphasia outcomes and the potential benefits of much larger datasets in enhancing model performance.
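The finding that 2 to 20 principal components suffice suggests the predictive signal is low-dimensional (e.g. lesion size). The leading component of such data can be recovered with nothing more than power iteration on the (implicitly applied) covariance matrix; a self-contained sketch on toy 2-D data, not the PLORAS imaging data:

```python
import math
import random

def top_principal_component(samples, iters=200, seed=0):
    """Estimate the first principal component of the data by power
    iteration: repeatedly apply the covariance matrix C = (1/n) XᵀX
    to a random vector and renormalize."""
    n, d = len(samples), len(samples[0])
    means = [sum(x[j] for x in samples) / n for j in range(d)]
    centred = [[x[j] - means[j] for j in range(d)] for x in samples]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iters):
        # Apply C implicitly: first project each sample onto v ...
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centred]
        # ... then accumulate the projections back into feature space.
        w = [sum(p * c[j] for p, c in zip(proj, centred)) / n
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v
```

On data lying along the diagonal, the recovered component is (±1/√2, ±1/√2), i.e. the single direction that explains essentially all the variance, which is the regime the abstract describes for its neuroimaging features.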

AI-derived CT biomarker score for robust COVID-19 mortality prediction across multiple waves and regions using machine learning.

De Smet K, De Smet D, De Jaeger P, Dewitte J, Martens GA, Buls N, De Mey J

PubMed paper · Aug 6, 2025
This study aimed to develop a simple, interpretable model using routinely available data for predicting COVID-19 mortality at admission, addressing limitations of complex models, and to provide a statistically robust framework for controlled clinical use, managing model uncertainty for responsible healthcare application. Data from Belgium's first COVID-19 wave (UZ Brussel, n = 252) were used for model development. External validation utilized data from unvaccinated patients during the late second and early third waves (AZ Delta, n = 175). Various machine learning methods were trained and compared for diagnostic performance after data preprocessing and feature selection. The final model, the M3-score, incorporated three features: age, white blood cell (WBC) count, and AI-derived total lung involvement (TOTAL<sub>AI</sub>) quantified from CT scans using Icolung software. The M3-score demonstrated strong classification performance in the training cohort (AUC 0.903) and clinically useful performance in the external validation dataset (AUC 0.826), indicating generalizability potential. To enhance clinical utility and interpretability, predicted probabilities were categorized into actionable likelihood ratio (LR) intervals: highly unlikely (LR 0.0), unlikely (LR 0.13), gray zone (LR 0.85), more likely (LR 2.14), and likely (LR 8.19) based on the training cohort. External validation suggested temporal and geographical robustness, though some variability in AUC and LR performance was observed, as anticipated in real-world settings. The parsimonious M3-score, integrating AI-based CT quantification with clinical and laboratory data, offers an interpretable tool for predicting in-hospital COVID-19 mortality, showing robust training performance. 
Observed performance variations in external validation underscore the need for careful interpretation and further extensive validation across international cohorts to confirm wider applicability and robustness before widespread clinical adoption.
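The likelihood-ratio bands reported for the M3-score can be combined with a local pre-test probability via Bayes' rule in odds form: posterior odds = prior odds × LR. A small sketch using the paper's reported LR values and an assumed 20% baseline in-hospital mortality (the baseline is illustrative, not from the study):

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio using
    Bayes' rule in odds form: posterior odds = prior odds * LR."""
    prior_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# With an assumed 20% baseline mortality, the paper's 'likely' band
# (reported LR 8.19) raises the estimated risk to roughly 67%,
# while the 'unlikely' band (reported LR 0.13) lowers it to about 3%.
p_likely = post_test_probability(0.20, 8.19)
p_unlikely = post_test_probability(0.20, 0.13)
```

Packaging model output as LR intervals rather than raw probabilities is what makes the score portable across settings with different baseline mortality: each site supplies its own pre-test probability.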

The development of a multimodal prediction model based on CT and MRI for the prognosis of pancreatic cancer.

Dou Z, Lin J, Lu C, Ma X, Zhang R, Zhu J, Qin S, Xu C, Li J

PubMed paper · Aug 6, 2025
To develop and validate a hybrid radiomics model to predict the overall survival in pancreatic cancer patients and identify risk factors that affect patient prognosis. We conducted a retrospective analysis of 272 pancreatic cancer patients diagnosed at the First Affiliated Hospital of Soochow University from January 2013 to December 2023, and divided them into a training set and a test set at a ratio of 7:3. Pre-treatment contrast-enhanced computed tomography (CT), magnetic resonance imaging (MRI) images, and clinical features were collected. Dimensionality reduction was performed on the radiomics features using principal component analysis (PCA), and important features with non-zero coefficients were selected using the least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation. In the training set, we built clinical prediction models using both random survival forests (RSF) and traditional Cox regression analysis. These models included a radiomics model based on contrast-enhanced CT, a radiomics model based on MRI, a clinical model, 3 bimodal models combining two types of features, and a multimodal model combining radiomics features with clinical features. Model performance evaluation in the test set was based on two dimensions: discrimination and calibration. In addition, risk stratification was performed in the test set based on predicted risk scores to evaluate the model's prognostic utility. The RSF-based hybrid model performed best with a C-index of 0.807 and a Brier score of 0.101, outperforming the Cox hybrid model (C-index 0.726, Brier score 0.145) and other unimodal and bimodal models. The SurvSHAP(t) plot highlighted CA125 as the most important variable. In the test set, patients were stratified into high- and low-risk groups based on the predicted risk scores, and Kaplan-Meier analysis demonstrated a significant survival difference between the two groups (p < 0.0001).
A multimodal RSF model combining clinical tabular data with radiomics features from contrast-enhanced CT and MRI showed strong performance in predicting prognosis for pancreatic cancer patients.
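The C-index used to compare the RSF and Cox models above is Harrell's concordance index: among usable patient pairs, the fraction where the patient with the earlier observed event was assigned the higher risk score. A minimal sketch (ignoring tied event times, and counting a pair as usable only when the earlier time is an observed event rather than a censoring time):

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index. A pair (i, j) is usable when subject i has an
    observed event (events[i] == 1) strictly before time j; it is
    concordant when i also carries the higher risk score. Tied risk
    scores contribute half credit."""
    concordant = ties = usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / usable
```

A C-index of 0.5 is chance-level ranking and 1.0 is perfect, so the reported 0.807 (RSF) versus 0.726 (Cox) is a meaningful gap in ranking ability.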

Development and validation of the multidimensional machine learning model for preoperative risk stratification in papillary thyroid carcinoma: a multicenter, retrospective cohort study.

Feng JW, Zhang L, Yang YX, Qin RJ, Liu SQ, Qin AC, Jiang Y

PubMed paper · Aug 6, 2025
This study aims to develop and validate a multi-modal machine learning model for preoperative risk stratification in papillary thyroid carcinoma (PTC), addressing limitations of current systems that rely on postoperative pathological features. We analyzed 974 PTC patients from three medical centers in China using a multi-modal approach integrating: (1) clinical indicators, (2) immunological indices, (3) ultrasound radiomics features, and (4) CT radiomics features. Our methodology employed gradient boosting machine for feature selection and random forest for classification, with model interpretability provided through SHapley Additive exPlanations (SHAP) analysis. The model was validated on internal (n = 225) and two external cohorts (n = 51, n = 174). The final 15-feature model achieved AUCs of 0.91, 0.84, and 0.77 across validation cohorts, improving to 0.96, 0.95, and 0.89 after cohort-specific refitting. SHAP analysis revealed CT texture features, ultrasound morphological features, and immune-inflammatory markers as key predictors, with consistent patterns across validation sites despite center-specific variations. Subgroup analysis showed superior performance in tumors > 1 cm and patients without extrathyroidal extension. Our multi-modal machine learning approach provides accurate preoperative risk stratification for PTC with robust cross-center applicability. This computational framework for integrating heterogeneous imaging and clinical data demonstrates the potential of multi-modal joint learning in healthcare imaging to transform clinical decision-making by enabling personalized treatment planning.

Clinical information prompt-driven retinal fundus image for brain health evaluation.

Tong N, Hui Y, Gou SP, Chen LX, Wang XH, Chen SH, Li J, Li XS, Wu YT, Wu SL, Wang ZC, Sun J, Lv H

PubMed paper · Aug 6, 2025
Brain volume measurement serves as a critical approach for assessing brain health status. Considering the close biological connection between the eyes and brain, this study aims to investigate the feasibility of estimating brain volume through retinal fundus imaging integrated with clinical metadata, and to offer a cost-effective approach for assessing brain health. Based on clinical information, retinal fundus images, and neuroimaging data derived from a multicenter, population-based cohort study, the KaiLuan Study, we proposed a cross-modal correlation representation (CMCR) network to elucidate the intricate co-degenerative relationships between the eyes and brain for 755 subjects. Specifically, individual clinical information, with follow-up of up to 12 years, was encoded as a prompt to enhance the accuracy of brain volume estimation. Independent internal validation and external validation were performed to assess the robustness of the proposed model. Root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) metrics were employed to quantitatively evaluate the quality of synthetic brain images derived from retinal imaging data. The proposed framework yielded average RMSE, PSNR, and SSIM values of 98.23, 35.78 dB, and 0.64, respectively, significantly outperforming 5 other methods: multi-channel Variational Autoencoder (mcVAE), Pixel-to-Pixel (Pixel2pixel), transformer-based U-Net (TransUNet), multi-scale transformer network (MT-Net), and residual vision transformer (ResViT). The two-dimensional (2D) and three-dimensional (3D) visualization results showed that the shape and texture of the synthetic brain images generated by the proposed method most closely resembled those of actual brain images. Thus, the CMCR framework accurately captured the latent structural correlations between the fundus and the brain.
The average difference between predicted and actual brain volumes was 61.36 cm<sup>3</sup>, with a relative error of 4.54%. When all of the clinical information (including age and sex, daily habits, cardiovascular factors, metabolic factors, and inflammatory factors) was encoded, the difference decreased to 53.89 cm<sup>3</sup>, with a relative error of 3.98%. Brain tissue volumes could thus be estimated with high accuracy from the brain MR images synthesized from retinal fundus images. This study provides an innovative, accurate, and cost-effective approach to characterize brain health status through readily accessible retinal fundus images. NCT05453877 ( https://clinicaltrials.gov/ ).
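The RMSE and PSNR figures reported above are directly related: PSNR = 20·log10(MAX/RMSE), where MAX is the peak intensity of the image format. A minimal sketch of both metrics on flattened image vectors, assuming an 8-bit intensity range (the study's actual intensity scale is not stated here):

```python
import math

def rmse(a, b):
    """Root mean square error between two equal-length intensity vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 20*log10(MAX/RMSE).
    Higher is better; identical images give infinite PSNR."""
    e = rmse(a, b)
    return float("inf") if e == 0 else 20.0 * math.log10(max_value / e)
```

Note that a single RMSE/PSNR pair only pins down MAX; the SSIM value the study also reports captures structural agreement that these pixelwise metrics miss, which is why all three are quoted together.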
