X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography

Yifan Liu, Wuyang Li, Weihao Yu, Chenxin Li, Alexandre Alahi, Max Meng, Yixuan Yuan

arXiv preprint · May 21, 2025
Computed Tomography serves as an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited by small-capacity model architectures, inflexible volume representations, and small-scale training data. In this paper, we present X-GRM (X-ray Gaussian Reconstruction Model), a large feedforward model for reconstructing 3D CT from sparse-view 2D X-ray projections. X-GRM employs a scalable transformer-based architecture to encode an arbitrary number of sparse X-ray inputs, where tokens from different views are integrated efficiently. Then, tokens are decoded into a new volume representation, named Voxel-based Gaussian Splatting (VoxGS), which enables efficient CT volume extraction and differentiable X-ray rendering. To support the training of X-GRM, we collect ReconX-15K, a large-scale CT reconstruction dataset containing around 15,000 CT/X-ray pairs across diverse organs, including the chest, abdomen, pelvis, and teeth. This combination of a high-capacity model, flexible volume representation, and large-scale training data empowers our model to produce high-quality reconstructions from various testing inputs, including in-domain and out-of-domain X-ray projections. Project Page: https://github.com/CUHK-AIM-Group/X-GRM.
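
As a rough illustration of the view-token integration described above, the PyTorch sketch below merges patch tokens from an arbitrary number of X-ray views into a single attention context. The module name, shapes, and the omission of positional/view embeddings are assumptions made for brevity, not the authors' implementation.

```python
# Hedged sketch: encode a variable number of X-ray views with one transformer.
import torch
import torch.nn as nn

class MultiViewXrayEncoder(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        self.patchify = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, views):                     # views: (B, V, 1, H, W); V can vary
        b, v = views.shape[:2]
        x = self.patchify(views.flatten(0, 1))    # (B*V, dim, h, w)
        x = x.flatten(2).transpose(1, 2)          # (B*V, h*w, dim) patch tokens
        x = x.reshape(b, v * x.shape[1], -1)      # concatenate tokens across views
        return self.encoder(x)                    # attention spans all views jointly

tokens = MultiViewXrayEncoder()(torch.randn(2, 3, 1, 128, 128))
print(tokens.shape)  # torch.Size([2, 192, 256])
```

Because views are merged along the token axis rather than the batch axis, the same weights accept any number of input projections, matching the "arbitrary number of sparse X-ray inputs" described above.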

SAMA-UNet: Enhancing Medical Image Segmentation with Self-Adaptive Mamba-Like Attention and Causal-Resonance Learning

Saqib Qamar, Mohd Fazil, Parvez Ahmad, Ghulam Muhammad

arXiv preprint · May 21, 2025
Medical image segmentation plays an important role in various clinical applications, but existing models often struggle with the computational inefficiencies and challenges posed by complex medical data. State Space Sequence Models (SSMs) have demonstrated promise in modeling long-range dependencies with linear computational complexity, yet their application in medical image segmentation remains hindered by incompatibilities with image tokens and autoregressive assumptions. Moreover, it is difficult to balance capturing local fine-grained information against global semantic dependencies. To address these challenges, we introduce SAMA-UNet, a novel architecture for medical image segmentation. A key innovation is the Self-Adaptive Mamba-like Aggregated Attention (SAMA) block, which integrates contextual self-attention with dynamic weight modulation to prioritise the most relevant features based on local and global contexts. This approach reduces computational complexity and improves the representation of complex image features across multiple scales. We also propose the Causal-Resonance Multi-Scale Module (CR-MSM), which enhances the flow of information between the encoder and decoder by using causal resonance learning. This mechanism allows the model to automatically adjust feature resolution and causal dependencies across scales, leading to better semantic alignment between the low-level and high-level features in U-shaped architectures. Experiments on MRI, CT, and endoscopy images show that SAMA-UNet achieves higher segmentation accuracy than current CNN-, Transformer-, and Mamba-based methods. The implementation is publicly available on GitHub.
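
The abstract's headline mechanism, self-attention modulated by dynamic weights computed from local and global context, can be sketched generically as follows. This `GatedAttention` module is an illustrative analog under assumed dimensions, not the SAMA block itself.

```python
# Hedged sketch: standard self-attention gated by weights derived from local
# (depthwise conv) and global (mean-pooled) context. Illustrative only.
import torch
import torch.nn as nn

class GatedAttention(nn.Module):  # hypothetical module, not the SAMA block
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Conv1d(dim, dim, 3, padding=1, groups=dim)  # local context
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, x):                                # x: (B, N, dim) tokens
        a, _ = self.attn(x, x, x)                        # contextual self-attention
        loc = self.local(x.transpose(1, 2)).transpose(1, 2)
        glob = x.mean(dim=1, keepdim=True).expand_as(x)  # global context
        w = self.gate(torch.cat([loc, glob], dim=-1))    # dynamic modulation weights
        return x + w * a                                 # reweighted residual update

y = GatedAttention()(torch.randn(2, 196, 64))
print(y.shape)  # torch.Size([2, 196, 64])
```

The sigmoid gate lets each token scale its attention update by how salient its local and global contexts are, which is one simple way to realize "dynamic weight modulation".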

An Exploratory Approach Towards Investigating and Explaining Vision Transformer and Transfer Learning for Brain Disease Detection

Shuvashis Sarker, Shamim Rahim Refat, Faika Fairuj Preotee, Shifat Islam, Tashreef Muhammad, Mohammad Ashraful Hoque

arXiv preprint · May 21, 2025
The brain is a highly complex organ that manages many important tasks, including movement, memory, and thinking. Brain-related conditions, like tumors and degenerative disorders, can be hard to diagnose and treat. Magnetic Resonance Imaging (MRI) serves as a key tool for identifying these conditions, offering high-resolution images of brain structures. Despite this, interpreting MRI scans can be complicated. This study tackles this challenge by conducting a comparative analysis of Vision Transformer (ViT) and Transfer Learning (TL) models such as VGG16, VGG19, ResNet50V2, and MobileNetV2 for classifying brain diseases using MRI data from a Bangladesh-based dataset. ViTs, known for their ability to capture global relationships in images, are particularly effective for medical imaging tasks. Transfer learning helps to mitigate data constraints by fine-tuning pre-trained models. Furthermore, Explainable AI (XAI) methods such as GradCAM, GradCAM++, LayerCAM, ScoreCAM, and Faster-ScoreCAM are employed to interpret model predictions. The results demonstrate that the ViT surpasses the transfer learning models, achieving a classification accuracy of 94.39%. The integration of XAI methods enhances model transparency, offering crucial insights to aid medical professionals in diagnosing brain diseases with greater precision.
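
Of the XAI methods named, GradCAM is the most straightforward to reproduce. The sketch below applies it to a torchvision MobileNetV2 (random weights here for a self-contained run); the chosen layer, input, and preprocessing are assumptions, not the study's configuration.

```python
# Minimal Grad-CAM sketch: gradient-weighted activation heatmap via hooks.
import torch
from torchvision import models

model = models.mobilenet_v2(weights=None).eval()  # use pretrained weights in practice
target_layer = model.features[-1]
feats, grads = {}, {}
target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)          # stand-in for a preprocessed MRI slice
score = model(x)[0].max()                # logit of the top class
score.backward()                         # populates the gradient hook

weights = grads["a"].mean(dim=(2, 3), keepdim=True)  # GAP over spatial gradients
cam = torch.relu((weights * feats["a"]).sum(dim=1))  # weighted sum of feature maps
cam = cam / (cam.max() + 1e-8)                       # normalize to [0, 1]
print(cam.shape)  # (1, 7, 7) heatmap, upsampled onto the input for display
```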

Right Ventricular Strain as a Key Feature in Interpretable Machine Learning for Identification of Takotsubo Syndrome: A Multicenter CMR-based Study.

Du Z, Hu H, Shen C, Mei J, Feng Y, Huang Y, Chen X, Guo X, Hu Z, Jiang L, Su Y, Biekan J, Lyv L, Chong T, Pan C, Liu K, Ji J, Lu C

PubMed paper · May 21, 2025
To develop an interpretable machine learning (ML) model based on cardiac magnetic resonance (CMR) multimodal parameters and clinical data to discriminate Takotsubo syndrome (TTS), acute myocardial infarction (AMI), and acute myocarditis (AM), and to further assess the diagnostic value of right ventricular (RV) strain in TTS. This study analyzed CMR and clinical data of 130 patients from three centers. Key features were selected using least absolute shrinkage and selection operator regression and random forest. Data were split into a training cohort and an internal testing cohort (ITC) in a 7:3 ratio, with overfitting avoided using leave-one-out cross-validation and bootstrap methods. Nine ML models were evaluated using standard performance metrics, with Shapley additive explanations (SHAP) analysis used for model interpretation. A total of 11 key features were identified. The extreme gradient boosting model showed the best performance, with an area under the curve (AUC) of 0.94 (95% CI: 0.85-0.97) in the ITC. Right ventricular basal circumferential strain (RVCS-basal) was the most important feature for identifying TTS. Its absolute value was significantly higher in TTS patients than in AMI and AM patients (-9.93%, -5.21%, and -6.18%, respectively, p < 0.001), with values below -6.55% (i.e., larger in magnitude) contributing to a diagnosis of TTS. This study developed an interpretable ternary classification ML model for identifying TTS and used SHAP analysis to elucidate the significant value of RVCS-basal in TTS diagnosis. An online calculator (https://lsszxyy.shinyapps.io/XGboost/) based on this model was developed to provide immediate decision support for clinical use.
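
A minimal sketch of the pipeline's final stage, an XGBoost multiclass model with SHAP attribution, is shown below on synthetic data; the feature matrix, class coding, and hyperparameters are placeholders, not the study's data or settings.

```python
# Hedged sketch: ternary XGBoost classifier + SHAP feature ranking.
import numpy as np
import shap
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(130, 11))        # 130 patients, 11 selected features (synthetic)
y = rng.integers(0, 3, size=130)      # 0=TTS, 1=AMI, 2=AM (illustrative coding)

clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1,
                    objective="multi:softprob", eval_metric="mlogloss")
clf.fit(X, y)

explainer = shap.TreeExplainer(clf)
sv = explainer.shap_values(X)         # list of per-class arrays in older shap
if not isinstance(sv, list):          # newer shap: (n, features, classes) array
    sv = [sv[..., k] for k in range(sv.shape[-1])]
# Mean |SHAP| per feature for class 0 ranks importance, analogous to how
# RVCS-basal emerged as the top discriminator for TTS.
print(np.abs(sv[0]).mean(axis=0))
```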

Large medical image database impact on generalizability of synthetic CT scan generation.

Boily C, Mazellier JP, Meyer P

PubMed paper · May 21, 2025
This study systematically examines the impact of training database size on the generalizability of deep learning models for synthetic medical image generation. Specifically, we employ a Cycle-Consistency Generative Adversarial Network (CycleGAN) with softly paired data to synthesize kilovoltage computed tomography (kVCT) images from megavoltage computed tomography (MVCT) scans. Unlike previous works, which were constrained by limited data availability, our study uses an extensive database comprising 4,000 patient CT scans, an order of magnitude larger than prior research, allowing for a more rigorous assessment of the role of database size in medical image translation. We quantitatively evaluate the fidelity of the generated synthetic images using established image similarity metrics, including Mean Absolute Error (MAE) and Structural Similarity Index Measure (SSIM). Beyond assessing image quality, we investigate the model's capacity for generalization by analyzing its performance across diverse patient subgroups, considering factors such as sex, age, and anatomical region. This approach enables a more granular understanding of how dataset composition influences model robustness.
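
The two fidelity metrics can be computed with NumPy and scikit-image as below; the arrays are synthetic stand-ins for a reference kVCT slice and a CycleGAN output.

```python
# MAE and SSIM between a synthetic kVCT slice and its reference (stand-in data).
import numpy as np
from skimage.metrics import structural_similarity as ssim

rng = np.random.default_rng(0)
reference = rng.normal(size=(256, 256))                       # ground-truth kVCT slice
synthetic = reference + 0.05 * rng.normal(size=(256, 256))    # generator output stand-in

mae = np.abs(reference - synthetic).mean()
score = ssim(reference, synthetic,
             data_range=reference.max() - reference.min())
print(f"MAE={mae:.4f}  SSIM={score:.4f}")
```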

Deep learning radiopathomics based on pretreatment MRI and whole slide images for predicting overall survival in locally advanced nasopharyngeal carcinoma.

Yi X, Yu X, Li C, Li J, Cao H, Lu Q, Li J, Hou J

PubMed paper · May 21, 2025
To develop an integrative radiopathomic model based on deep learning to predict overall survival (OS) in locally advanced nasopharyngeal carcinoma (LANPC) patients. A cohort of 343 LANPC patients with pretreatment MRI and whole slide images (WSIs) was randomly divided into training (n = 202), validation (n = 91), and external test (n = 50) sets. For WSIs, a self-attention mechanism was employed to assess the significance of different patches for the prognostic task, aggregating them into a WSI-level representation. For MRI, a multilayer perceptron was used to encode the extracted radiomic features, resulting in an MRI-level representation. These were combined in a multimodal fusion model to produce prognostic predictions. Model performance was evaluated using the concordance index (C-index), and Kaplan-Meier curves were employed for risk stratification. To enhance model interpretability, attention-based and Integrated Gradients techniques were applied to explain how WSI and MRI features contribute to prognosis predictions. The radiopathomics model achieved high accuracy in predicting OS, with a C-index of 0.755 (95% CI: 0.673-0.838) and 0.744 (95% CI: 0.623-0.808) in the training and validation sets, respectively, outperforming single-modality models (radiomic signature: 0.636, 95% CI: 0.584-0.688; deep pathomic signature: 0.736, 95% CI: 0.684-0.810). In the external test, similar findings were observed for the predictive performance of the radiopathomics model, radiomic signature, and deep pathomic signature, with C-indices of 0.735, 0.626, and 0.660, respectively. The radiopathomics model effectively stratified patients into high- and low-risk groups (P < 0.001). Additionally, attention heatmaps revealed that high-attention regions corresponded with tumor areas in both risk groups. The radiopathomics model holds promise for predicting clinical outcomes in LANPC patients, offering a potential tool for improving clinical decision-making.
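
A hedged sketch of the fusion design follows: attention-pooled WSI patch features concatenated with MLP-encoded radiomic features into a single prognostic head. The module name, dimensions, and the attention-based multiple-instance pooling used here are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: WSI attention pooling + radiomic MLP -> fused risk score.
import torch
import torch.nn as nn

class RadioPathFusion(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, patch_dim=512, radiomic_dim=100, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(patch_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))      # patch importance score
        self.mri = nn.Sequential(nn.Linear(radiomic_dim, hidden), nn.ReLU())
        self.head = nn.Linear(patch_dim + hidden, 1)         # prognostic risk score

    def forward(self, patches, radiomics):   # (N_patches, 512), (radiomic_dim,)
        w = torch.softmax(self.attn(patches), dim=0)         # attention over patches
        wsi = (w * patches).sum(dim=0)                       # WSI-level representation
        return self.head(torch.cat([wsi, self.mri(radiomics)]))

risk = RadioPathFusion()(torch.randn(300, 512), torch.randn(100))
print(risk.shape)  # torch.Size([1])
```

The learned patch weights double as an interpretability output: rendered over the slide, they yield attention heatmaps of the kind the study used to verify that high-attention regions align with tumor areas.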

Update on the detection of frailty in older adults: a multicenter cohort machine learning-based study protocol.

Fernández-Carnero S, Martínez-Pozas O, Pecos-Martín D, Pardo-Gómez A, Cuenca-Zaldívar JN, Sánchez-Romero EA

PubMed paper · May 21, 2025
This study aims to investigate the relationship between muscle activation variables assessed via ultrasound and the comprehensive assessment of geriatric patients, as well as to analyze ultrasound images to determine their correlation with morbidity and mortality factors in frail patients. The present cohort study will be conducted in 500 older adults diagnosed with frailty. A multicenter study will be conducted across day care centers and nursing homes. This will be achieved through the evaluation of frail older adults via instrumental and functional tests, along with specific ultrasound images to study sarcopenia and nutrition, followed by a detailed analysis of the correlation between all collected variables. The study addresses the limitations of previous research by including a large sample of 500 patients and measuring various muscle parameters beyond thickness. Additionally, it aims to analyze ultrasound images to identify markers associated with a higher risk of complications in frail patients. The study involves frail older adults undergoing functional tests and specific ultrasound examinations. A comprehensive analysis of functional, ultrasound, and nutritional variables will be conducted to understand their correlation with overall health and risk of complications in frail older patients. The study was approved by the Research Ethics Committee of the Hospital Universitario Puerta de Hierro, Madrid, Spain (Act nº 18/2023). In addition, the study was registered at https://clinicaltrials.gov/ (NCT06218121).

Performance of multimodal prediction models for intracerebral hemorrhage outcomes using real-world data.

Matsumoto K, Suzuki M, Ishihara K, Tokunaga K, Matsuda K, Chen J, Yamashiro S, Soejima H, Nakashima N, Kamouchi M

PubMed paper · May 21, 2025
We aimed to develop and validate multimodal models integrating computed tomography (CT) images, text, and tabular clinical data to predict poor functional outcomes and in-hospital mortality in patients with intracerebral hemorrhage (ICH). These models were designed to assist non-specialists in emergency settings with limited access to stroke specialists. A retrospective analysis of 527 patients with ICH admitted to a Japanese tertiary hospital between April 2019 and February 2022 was conducted. Deep learning techniques were used to extract features from three-dimensional CT images and unstructured data, which were then combined with tabular data to develop an L1-regularized logistic regression model to predict poor functional outcomes (modified Rankin scale score 3-6) and in-hospital mortality. The models' performance was evaluated by assessing discrimination metrics, calibration plots, and decision curve analysis (DCA) using temporal validation data. The multimodal model utilizing both imaging and text data, such as medical interviews, exhibited the highest performance in predicting poor functional outcomes. In contrast, the model that combined imaging with tabular data, including physiological and laboratory results, demonstrated the best predictive performance for in-hospital mortality. These models exhibited high discriminative performance, with areas under the receiver operating characteristic curve (AUROCs) of 0.86 (95% CI: 0.79-0.92) and 0.91 (95% CI: 0.84-0.96) for poor functional outcomes and in-hospital mortality, respectively. Calibration was satisfactory for predicting poor functional outcomes but requires refinement for mortality prediction. The models performed similarly to or better than conventional risk scores, and DCA curves supported their clinical utility. Multimodal prediction models have the potential to aid non-specialists in making informed decisions regarding ICH cases in emergency departments as part of clinical decision support systems. Enhancing real-world data infrastructure and improving model calibration are essential for successful implementation in clinical practice.
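
The final predictive step described above can be sketched with scikit-learn: an L1-regularized logistic regression over concatenated image-derived, text-derived, and tabular features. All inputs below are synthetic placeholders, not the study's extracted features.

```python
# Hedged sketch: fuse deep features with tabular data via L1 logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
img_feats = rng.normal(size=(527, 64))   # CNN features from 3D CT (stand-in)
txt_feats = rng.normal(size=(527, 32))   # encoded interview text (stand-in)
tabular = rng.normal(size=(527, 10))     # physiological / laboratory values
X = np.hstack([img_feats, txt_feats, tabular])
y = rng.integers(0, 2, size=527)         # poor outcome (mRS 3-6) yes/no

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(X_tr, y_tr)                      # L1 penalty sparsifies the fused features
print("AUROC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```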

Cardiac Magnetic Resonance Imaging in the German National Cohort: Automated Segmentation of Short-Axis Cine Images and Post-Processing Quality Control

Full, P. M., Schirrmeister, R. T., Hein, M., Russe, M. F., Reisert, M., Ammann, C., Greiser, K. H., Niendorf, T., Pischon, T., Schulz-Menger, J., Maier-Hein, K. H., Bamberg, F., Rospleszcz, S., Schlett, C. L., Schuppert, C.

medRxiv preprint · May 21, 2025
Purpose: To develop a segmentation and quality control pipeline for short-axis cardiac magnetic resonance (CMR) cine images from the prospective, multi-center German National Cohort (NAKO). Materials and Methods: A deep learning model for semantic segmentation, based on the nnU-Net architecture, was applied to full-cycle short-axis cine images from 29,908 baseline participants. The primary objective was to derive measures of structure and function for both ventricles (LV, RV), including end-diastolic volumes (EDV), end-systolic volumes (ESV), and LV myocardial mass. Quality control measures included a visual assessment of outliers in morphofunctional parameters, inter- and intra-ventricular phase differences, and LV time-volume curves (TVC). These were adjudicated using a five-point rating scale, ranging from five (excellent) to one (non-diagnostic), with ratings of three or lower subject to exclusion. The predictive value of outlier criteria for inclusion and exclusion was analyzed using receiver operating characteristics. Results: The segmentation model generated complete data for 29,609 participants (incomplete in 1.0%) and 5,082 cases (17.0%) were visually assessed. Quality assurance yielded a sample of 26,899 participants with excellent or good quality (89.9%; exclusion of 1,875 participants due to image quality issues and 835 cases due to segmentation quality issues). TVC was the strongest single discriminator between included and excluded participants (AUC: 0.684). Of the two-category combinations, the pairing of TVC and phases provided the greatest improvement over TVC alone (AUC difference: 0.044; p<0.001). The best performance was observed when all three categories were combined (AUC: 0.748). Extending the quality-controlled sample to include acceptable quality ratings, a total of 28,413 (95.0%) participants were available. Conclusion: The implemented pipeline facilitated the automated segmentation of an extensive CMR dataset, integrating quality control measures. This methodology ensures that ensuing quantitative analyses are conducted with a diminished risk of bias.
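
The quality-control comparison, single outlier criteria versus their combination, amounts to a ROC analysis along the lines of the sketch below. The scores and effect sizes are synthetic stand-ins; only the evaluation pattern mirrors the study.

```python
# Hedged sketch: AUC of single vs. combined outlier criteria for predicting
# visual exclusion (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5082                                       # visually assessed cases
excluded = rng.integers(0, 2, size=n)          # 1 = rated three or lower
tvc = rng.normal(size=n) + 0.5 * excluded      # time-volume-curve outlier score
phases = rng.normal(size=n) + 0.3 * excluded   # phase-difference outlier score
morpho = rng.normal(size=n) + 0.2 * excluded   # morphofunctional outlier score

print(f"TVC alone: {roc_auc_score(excluded, tvc):.3f}")
X = np.column_stack([tvc, phases, morpho])     # combine all three categories
combined = LogisticRegression().fit(X, excluded).predict_proba(X)[:, 1]
print(f"All three combined: {roc_auc_score(excluded, combined):.3f}")
```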