
X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography

Yifan Liu, Wuyang Li, Weihao Yu, Chenxin Li, Alexandre Alahi, Max Meng, Yixuan Yuan

arXiv preprint, May 21 2025
Computed Tomography serves as an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited to small-capacity model architectures, inflexible volume representations, and small-scale training data. In this paper, we present X-GRM (X-ray Gaussian Reconstruction Model), a large feedforward model for reconstructing 3D CT from sparse-view 2D X-ray projections. X-GRM employs a scalable transformer-based architecture to encode an arbitrary number of sparse X-ray inputs, where tokens from different views are integrated efficiently. Tokens are then decoded into a new volume representation, named Voxel-based Gaussian Splatting (VoxGS), which enables efficient CT volume extraction and differentiable X-ray rendering. To support the training of X-GRM, we collect ReconX-15K, a large-scale CT reconstruction dataset containing around 15,000 CT/X-ray pairs across diverse organs, including the chest, abdomen, pelvis, and teeth. This combination of a high-capacity model, flexible volume representation, and large-scale training data empowers our model to produce high-quality reconstructions from various testing inputs, including in-domain and out-of-domain X-ray projections. Project Page: https://github.com/CUHK-AIM-Group/X-GRM.
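The differentiable X-ray rendering idea can be pictured with a toy parallel-beam projector: integrating attenuation along rays collapses a 3D volume into a 2D radiograph. This is only a minimal numpy sketch for intuition; X-GRM's actual VoxGS renderer uses Gaussian splatting and realistic scanner geometry.

```python
import numpy as np

def project_xray(volume: np.ndarray, axis: int = 0) -> np.ndarray:
    """Parallel-beam forward projection: sum attenuation along one axis.

    A toy stand-in for differentiable X-ray rendering; real systems use
    cone-beam geometry and, in X-GRM's case, Gaussian splatting (VoxGS).
    """
    return volume.sum(axis=axis)

# A 4x4x4 voxel volume with a dense 2x2x2 cube in the middle.
vol = np.zeros((4, 4, 4))
vol[1:3, 1:3, 1:3] = 1.0
proj = project_xray(vol, axis=0)   # a 4x4 synthetic "radiograph"
print(proj.shape)    # (4, 4)
print(proj[1, 1])    # 2.0 -- two occupied voxels along this ray
```

Because the projection is a plain sum, gradients flow back to every voxel it touched, which is what makes rendering-based supervision possible.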

Customized GPT-4V(ision) for radiographic diagnosis: can large language model detect supernumerary teeth?

Aşar EM, İpek İ, Bilge K

PubMed, May 21 2025
With the growing capabilities of language models like ChatGPT to process text and images, this study evaluated their accuracy in detecting supernumerary teeth on periapical radiographs. A customized GPT-4V model (CGPT-4V) was also developed to assess whether domain-specific training could improve diagnostic performance compared to standard GPT-4V and GPT-4o models. One hundred eighty periapical radiographs (90 with and 90 without supernumerary teeth) were evaluated using GPT-4V, GPT-4o, and a fine-tuned CGPT-4V model. Each image was assessed separately with the standardized prompt "Are there any supernumerary teeth in the radiograph above?" to avoid contextual bias. Three dental experts scored the responses using a three-point Likert scale for positive cases and a binary scale for negatives. Chi-square tests and ROC analysis were used to compare model performances (p < 0.05). Among the three models, CGPT-4V exhibited the highest accuracy, detecting supernumerary teeth correctly in 91% of cases, compared to 77% for GPT-4o and 63% for GPT-4V. The CGPT-4V model also demonstrated a significantly lower false positive rate (16%) than GPT-4V (42%). A statistically significant difference was found between CGPT-4V and GPT-4o (p < 0.001), while no significant difference was observed between GPT-4V and CGPT-4V or between GPT-4V and GPT-4o. Additionally, CGPT-4V successfully identified multiple supernumerary teeth in radiographs where present. These findings highlight the diagnostic potential of customized GPT models in dental radiology. Future research should focus on multicenter validation, seamless clinical integration, and cost-effectiveness to support real-world implementation.
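The model comparison above rests on a chi-square test between two proportions. A minimal sketch follows; the counts are reconstructed from the reported 91% and 77% accuracies on 180 images, not taken from the paper's tables.

```python
def chi2_two_proportions(s1, n1, s2, n2):
    """Pearson chi-square statistic (no continuity correction) for a
    2x2 table of successes/failures in two independent groups."""
    table = [[s1, n1 - s1], [s2, n2 - s2]]
    col = [s1 + s2, (n1 - s1) + (n2 - s2)]   # column totals
    total = n1 + n2
    chi2 = 0.0
    for row, n in zip(table, (n1, n2)):
        for obs, c in zip(row, col):
            exp = n * c / total              # expected count under H0
            chi2 += (obs - exp) ** 2 / exp
    return chi2

# Assumed counts: 164/180 correct (~91%) vs 139/180 correct (~77%).
stat = chi2_two_proportions(164, 180, 139, 180)
print(round(stat, 2))   # ~13.03, well above the 3.84 cutoff for p < 0.05 (1 df)
```

With 1 degree of freedom, a statistic this large corresponds to p < 0.001, matching the direction of the reported CGPT-4V vs GPT-4o result.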

Deep learning radiopathomics based on pretreatment MRI and whole slide images for predicting overall survival in locally advanced nasopharyngeal carcinoma.

Yi X, Yu X, Li C, Li J, Cao H, Lu Q, Li J, Hou J

PubMed, May 21 2025
To develop an integrative radiopathomic model based on deep learning to predict overall survival (OS) in locally advanced nasopharyngeal carcinoma (LANPC) patients. A cohort of 343 LANPC patients with pretreatment MRI and whole slide images (WSI) were randomly divided into training (n = 202), validation (n = 91), and external test (n = 50) sets. For WSIs, a self-attention mechanism was employed to assess the significance of different patches for the prognostic task, aggregating them into a WSI-level representation. For MRI, a multilayer perceptron was used to encode the extracted radiomic features, resulting in an MRI-level representation. These were combined in a multimodal fusion model to produce prognostic predictions. Model performances were evaluated using the concordance index (C-index), and Kaplan-Meier curves were employed for risk stratification. To enhance model interpretability, attention-based and Integrated Gradients techniques were applied to explain how WSIs and MRI features contribute to prognosis predictions. The radiopathomics model achieved high predictive accuracy in predicting the OS, with a C-index of 0.755 (95% CI: 0.673-0.838) and 0.744 (95% CI: 0.623-0.808) in the training and validation sets, respectively, outperforming single-modality models (radiomic signature: 0.636, 95% CI: 0.584-0.688; deep pathomic signature: 0.736, 95% CI: 0.684-0.810). In the external test, similar findings were observed for the predictive performance of the radiopathomics, radiomic signature, and deep pathomic signature, with their C-indices being 0.735, 0.626, and 0.660 respectively. The radiopathomics model effectively stratified patients into high- and low-risk groups (P < 0.001). Additionally, attention heatmaps revealed that high-attention regions corresponded with tumor areas in both risk groups. The radiopathomics model holds promise for predicting clinical outcomes in LANPC patients, offering a potential tool for improving clinical decision-making.
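The C-index used throughout can be computed directly from pairwise comparisons of predicted risk and observed survival. A minimal sketch of Harrell's C-index, not the authors' evaluation code:

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: the fraction of comparable pairs ordered correctly.

    A pair (i, j) is comparable when subject i has the shorter time and an
    observed event (events[i] == 1); concordance means the shorter survival
    received the higher predicted risk. Ties in risk count as 1/2.
    """
    num = den = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                den += 1
                if risks[i] > risks[j]:
                    num += 1
                elif risks[i] == risks[j]:
                    num += 0.5
    return num / den

# Perfectly concordant toy data: higher risk, shorter survival.
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.2]))  # 1.0
```

A C-index of 0.5 corresponds to random ordering, so the reported 0.755 indicates a usefully discriminative model.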

SAMA-UNet: Enhancing Medical Image Segmentation with Self-Adaptive Mamba-Like Attention and Causal-Resonance Learning

Saqib Qamar, Mohd Fazil, Parvez Ahmad, Ghulam Muhammad

arXiv preprint, May 21 2025
Medical image segmentation plays an important role in various clinical applications, but existing models often struggle with the computational inefficiencies and challenges posed by complex medical data. State Space Sequence Models (SSMs) have demonstrated promise in modeling long-range dependencies with linear computational complexity, yet their application in medical image segmentation remains hindered by incompatibilities with image tokens and autoregressive assumptions. Moreover, it is difficult to achieve a balance in capturing both local fine-grained information and global semantic dependencies. To address these challenges, we introduce SAMA-UNet, a novel architecture for medical image segmentation. A key innovation is the Self-Adaptive Mamba-like Aggregated Attention (SAMA) block, which integrates contextual self-attention with dynamic weight modulation to prioritise the most relevant features based on local and global contexts. This approach reduces computational complexity and improves the representation of complex image features across multiple scales. We also propose the Causal-Resonance Multi-Scale Module (CR-MSM), which enhances the flow of information between the encoder and decoder by using causal resonance learning. This mechanism allows the model to automatically adjust feature resolution and causal dependencies across scales, leading to better semantic alignment between the low-level and high-level features in U-shaped architectures. Experiments on MRI, CT, and endoscopy images show that SAMA-UNet achieves higher segmentation accuracy than current CNN-, Transformer-, and Mamba-based methods. The implementation is publicly available at GitHub.
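To make "contextual self-attention with dynamic weight modulation" concrete, here is a minimal numpy sketch: plain scaled dot-product self-attention whose per-token output is rescaled by a learned sigmoid gate. This is a loose analogue for illustration only, not the actual SAMA block.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(x, wq, wk, wv, w_gate):
    """Self-attention followed by a per-token sigmoid gate on the output."""
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # (T, T), rows sum to 1
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))       # (T, 1), values in (0, 1)
    return gate * (attn @ v)                          # modulated output, (T, D)

rng = np.random.default_rng(0)
T, D = 5, 8
x = rng.normal(size=(T, D))
out = gated_self_attention(x, *(rng.normal(size=(D, D)) for _ in range(3)),
                           rng.normal(size=(D, 1)))
print(out.shape)   # (5, 8)
```

The gate is what makes the weighting "dynamic": each token can suppress or pass its attended features depending on its own content.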

Enhancing pathological myopia diagnosis: a bimodal artificial intelligence approach integrating fundus and optical coherence tomography imaging for precise atrophy, traction and neovascularisation grading.

Xu Z, Yang Y, Chen H, Han R, Han X, Zhao J, Yu W, Yang Z, Chen Y

PubMed, May 20 2025
Pathological myopia (PM) has emerged as a leading cause of global visual impairment; early detection and precise grading of PM are crucial for timely intervention. The atrophy, traction and neovascularisation (ATN) system is used to define PM progression and stages with precision. This study focuses on constructing a comprehensive PM image dataset comprising both fundus and optical coherence tomography (OCT) images and developing a bimodal artificial intelligence (AI) classification model for ATN grading in PM. This single-centre retrospective cross-sectional study collected 2760 colour fundus photographs and matching OCT images of PM from January 2019 to November 2022 at Peking Union Medical College Hospital. Ophthalmology specialists labelled and inspected all paired images using the ATN grading system. The AI model used a ResNet-50 backbone and a multimodal multi-instance learning module to enhance interaction across instances from both modalities. Performance comparisons among single-modality fundus, OCT and bimodal AI models were conducted for ATN grading in PM. The bimodal model, dual-deep learning (dual-DL), demonstrated superior accuracy in both detailed multiclassification and biclassification of PM, which aligns well with our observation from instance attention-weight activation maps. The area under the curve for severe PM using dual-DL was 0.9635 (95% CI 0.9380 to 0.9890), compared with 0.9359 (95% CI 0.9027 to 0.9691) for the solely OCT model and 0.9268 (95% CI 0.8915 to 0.9621) for the fundus model. Our novel bimodal AI multiclassification model for PM ATN staging proves accurate and beneficial for public health screening and prompt referral of PM patients.
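The area-under-the-curve figures quoted above are equivalently the Mann-Whitney probability that a randomly chosen positive case outranks a randomly chosen negative one. A minimal sketch, not the study's evaluation code:

```python
def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.2]))   # 0.75
```

Reading 0.9635 this way: given one severe-PM eye and one non-severe eye, dual-DL ranks them correctly about 96% of the time.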

MedBLIP: Fine-tuning BLIP for Medical Image Captioning

Manshi Limbu, Diwita Banerjee

arXiv preprint, May 20 2025
Medical image captioning is a challenging task that requires generating clinically accurate and semantically meaningful descriptions of radiology images. While recent vision-language models (VLMs) such as BLIP, BLIP2, Gemini and ViT-GPT2 show strong performance on natural image datasets, they often produce generic or imprecise captions when applied to specialized medical domains. In this project, we explore the effectiveness of fine-tuning the BLIP model on the ROCO dataset for improved radiology captioning. We compare the fine-tuned BLIP against its zero-shot version, BLIP-2 base, BLIP-2 Instruct and a ViT-GPT2 transformer baseline. Our results demonstrate that domain-specific fine-tuning on BLIP significantly improves performance across both quantitative and qualitative evaluation metrics. We also visualize decoder cross-attention maps to assess interpretability and conduct an ablation study to evaluate the contributions of encoder-only and decoder-only fine-tuning. Our findings highlight the importance of targeted adaptation for medical applications and suggest that decoder-only fine-tuning (encoder-frozen) offers a strong performance baseline with 5% lower training time than full fine-tuning, while full model fine-tuning still yields the best results overall.
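Decoder-only fine-tuning amounts to freezing the encoder's parameters and handing only the decoder's to the optimizer. A schematic sketch with hypothetical parameter names (in PyTorch one would set `requires_grad = False` on the encoder modules instead):

```python
# Hypothetical parameter names for illustration; not BLIP's real module names.
params = {
    "encoder.block0.weight": {"trainable": True},
    "encoder.block1.weight": {"trainable": True},
    "decoder.block0.weight": {"trainable": True},
    "decoder.lm_head.weight": {"trainable": True},
}

def freeze_encoder(params):
    """Freeze every encoder parameter; return the names left trainable."""
    for name, p in params.items():
        p["trainable"] = not name.startswith("encoder.")
    return sorted(n for n, p in params.items() if p["trainable"])

print(freeze_encoder(params))
# ['decoder.block0.weight', 'decoder.lm_head.weight']
```

Since frozen parameters need no gradients or optimizer state, this is where the reported training-time savings come from.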

Detection of maxillary sinus pathologies using deep learning algorithms.

Aktuna Belgin C, Kurbanova A, Aksoy S, Akkaya N, Orhan K

PubMed, May 20 2025
Deep learning, a subset of machine learning, is widely utilized in medical applications. Identifying maxillary sinus pathologies before surgical interventions is crucial for ensuring successful treatment outcomes. Cone beam computed tomography (CBCT) is commonly employed for maxillary sinus evaluations due to its high resolution and lower radiation exposure. This study aims to assess the accuracy of artificial intelligence (AI) algorithms in detecting maxillary sinus pathologies from CBCT scans. A dataset comprising 1000 maxillary sinuses (MS) from 500 patients was analyzed using CBCT. Sinuses were categorized based on the presence or absence of pathology, followed by segmentation of the maxillary sinus. Manual segmentation masks were generated using the semiautomatic software ITK-SNAP, which served as a reference for comparison. A convolutional neural network (CNN)-based machine learning model was then implemented to automatically segment maxillary sinus pathologies from CBCT images. To evaluate segmentation accuracy, metrics such as the Dice similarity coefficient (DSC) and intersection over union (IoU) were utilized by comparing AI-generated results with human-generated segmentations. The automated segmentation model achieved a Dice score of 0.923, a recall of 0.979, an IoU of 0.887, an F1 score of 0.970, and a precision of 0.963. This study successfully developed an AI-driven approach for segmenting maxillary sinus pathologies in CBCT images. The findings highlight the potential of this method for rapid and accurate clinical assessment of maxillary sinus conditions using CBCT imaging.
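The Dice and IoU figures reported above come from straightforward mask overlap. A minimal sketch for binary masks, not the study's evaluation code:

```python
import numpy as np

def dice_iou(pred: np.ndarray, ref: np.ndarray):
    """Dice similarity coefficient and intersection-over-union for binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    dice = 2 * inter / (pred.sum() + ref.sum())
    iou = inter / union
    return dice, iou

# Toy masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
ref  = np.array([[1, 1, 0], [0, 0, 1]])
d, i = dice_iou(pred, ref)
print(round(float(d), 3), round(float(i), 3))   # 0.667 0.5
```

Note that Dice is always at least as large as IoU for the same masks (Dice = 2·IoU / (1 + IoU)), consistent with the reported 0.923 vs 0.887.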

CONSIGN: Conformal Segmentation Informed by Spatial Groupings via Decomposition

Bruno Viti, Elias Karabelas, Martin Holler

arXiv preprint, May 20 2025
Most machine learning-based image segmentation models produce pixel-wise confidence scores - typically derived from softmax outputs - that represent the model's predicted probability for each class label at every pixel. While this information can be particularly valuable in high-stakes domains such as medical imaging, these (uncalibrated) scores are heuristic in nature and do not constitute rigorous quantitative uncertainty estimates. Conformal prediction (CP) provides a principled framework for transforming heuristic confidence scores into statistically valid uncertainty estimates. However, applying CP directly to image segmentation ignores the spatial correlations between pixels, a fundamental characteristic of image data. This can result in overly conservative and less interpretable uncertainty estimates. To address this, we propose CONSIGN (Conformal Segmentation Informed by Spatial Groupings via Decomposition), a CP-based method that incorporates spatial correlations to improve uncertainty quantification in image segmentation. Our method generates meaningful prediction sets that come with user-specified, high-probability error guarantees. It is compatible with any pre-trained segmentation model capable of generating multiple sample outputs - such as those using dropout, Bayesian modeling, or ensembles. We evaluate CONSIGN against a standard pixel-wise CP approach across three medical imaging datasets and two COCO dataset subsets, using three different pre-trained segmentation models. Results demonstrate that accounting for spatial structure significantly improves performance across multiple metrics and enhances the quality of uncertainty estimates.
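The pixel-wise CP baseline that CONSIGN is compared against can be sketched in a few lines of split conformal prediction: calibrate a quantile of nonconformity scores (here, 1 minus the softmax probability of the true class) on held-out data, then include every label that clears it. A minimal per-pixel sketch, ignoring the spatial correlations CONSIGN adds:

```python
import math

def conformal_threshold(true_class_probs, alpha=0.1):
    """Split-conformal quantile of nonconformity scores (1 - true-class prob);
    gives ~(1 - alpha) marginal coverage on exchangeable data."""
    n = len(true_class_probs)
    scores = sorted(1 - p for p in true_class_probs)
    k = math.ceil((n + 1) * (1 - alpha)) - 1   # 0-indexed order statistic
    return scores[min(k, n - 1)]

def prediction_set(class_probs, qhat):
    """Every label whose nonconformity falls within the calibrated threshold."""
    return [c for c, p in enumerate(class_probs) if 1 - p <= qhat]

# Calibration: softmax probabilities assigned to the true class per pixel.
cal = [0.9, 0.8, 0.95, 0.6, 0.7, 0.85, 0.75, 0.9, 0.65, 0.8]
qhat = conformal_threshold(cal, alpha=0.1)
print(qhat)                                     # 0.4
print(prediction_set([0.7, 0.2, 0.1], qhat))    # [0]
```

Applied independently per pixel, this ignores that neighbouring pixels are correlated, which is exactly why it tends to be overly conservative and why CONSIGN groups pixels spatially.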

Current trends and emerging themes in utilizing artificial intelligence to enhance anatomical diagnostic accuracy and efficiency in radiotherapy.

Pezzino S, Luca T, Castorina M, Puleo S, Castorina S

PubMed, May 19 2025
Artificial intelligence (AI) incorporation into healthcare has proven revolutionary, especially in radiotherapy, where accuracy is critical. The purpose of this study is to present patterns and emerging topics in the application of AI to improve the precision of anatomical diagnosis, organ delineation, and therapeutic effectiveness in radiotherapy and radiological imaging. We performed a bibliometric analysis of scholarly articles in these fields published since 2014, examining research output from key contributing nations and institutions, analysing notable research subjects, and investigating trends in scientific terminology pertaining to AI in radiology and radiotherapy. Furthermore, we examined AI-based software solutions in these domains, with a specific emphasis on extracting anatomical features and recognizing organs for the purpose of treatment planning. Our investigation found a significant surge in papers pertaining to AI in these fields since 2014. Institutions such as Emory University and Memorial Sloan-Kettering Cancer Center made substantial contributions, establishing the United States and China as the leading research-producing nations. Key study areas encompassed adaptive radiotherapy informed by anatomical alterations, MR-Linac for enhanced visualization of soft tissues, and multi-organ segmentation for accurate planning of radiotherapy. An evident increase in the frequency of phrases such as 'radiomics,' 'radiotherapy segmentation,' and 'dosiomics' was noted. The evaluation of AI-based software revealed a wide range of uses across subdisciplines of radiotherapy and radiology, particularly in improving the identification of anatomical features for treatment planning and identifying organs at risk. The incorporation of AI into anatomical diagnosis in radiological imaging and radiotherapy is progressing rapidly, with substantial capacity to transform diagnostic precision and the effectiveness of treatment planning.

Development and Validation of an Integrated Deep Learning Model to Assist Eosinophilic Chronic Rhinosinusitis Diagnosis: A Multicenter Study.

Li J, Mao N, Aodeng S, Zhang H, Zhu Z, Wang L, Liu Y, Qi H, Qiao H, Lin Y, Qiu Z, Yang T, Zha Y, Wang X, Wang W, Song X, Lv W

PubMed, May 19 2025
The assessment of eosinophilic chronic rhinosinusitis (eCRS) lacks accurate non-invasive preoperative prediction methods, relying primarily on invasive histopathological sections. This study aims to use computed tomography (CT) images and clinical parameters to develop an integrated deep learning model for the preoperative identification of eCRS and further explore the biological basis of its predictions. A total of 1098 patients with sinus CT images were included from two hospitals and were divided into training, internal, and external test sets. The region of interest of sinus lesions was manually outlined by an experienced radiologist. We utilized three deep learning models (3D-ResNet, 3D-Xception, and HR-Net) to extract features from CT images and calculate deep learning scores. The clinical signature and deep learning score were fed into a support vector machine for classification. The receiver operating characteristic curve, sensitivity, specificity, and accuracy were used to evaluate the integrated deep learning model. Additionally, proteomic analysis was performed on 34 patients to explore the biological basis of the model's predictions. The area under the curve of the integrated deep learning model for predicting eCRS was 0.851 (95% confidence interval [CI]: 0.77-0.93) and 0.821 (95% CI: 0.78-0.86) in the internal and external test sets. Proteomic analysis revealed that in patients predicted to be eCRS, 594 genes were dysregulated, some of which were associated with pathways and biological processes such as the chemokine signaling pathway. The proposed integrated deep learning model could effectively predict eCRS patients. This study provided a non-invasive way of identifying eCRS to facilitate personalized therapy, which will pave the way toward precision medicine for CRS.
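The sensitivity, specificity, and accuracy used to evaluate such a classifier reduce to confusion-matrix ratios. A minimal sketch with made-up counts, not the study's data:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity (recall on positives), specificity (recall on negatives),
    and overall accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Illustrative counts for a 100-patient test set.
sens, spec, acc = diagnostic_metrics(tp=45, fp=8, tn=40, fn=7)
print(round(sens, 3), round(spec, 3), round(acc, 3))   # 0.865 0.833 0.85
```

Unlike AUC, these three depend on the chosen decision threshold, which is why papers typically report the ROC curve alongside them.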