Standardizing Heterogeneous MRI Series Description Metadata Using Large Language Models.

Kamel PI, Doo FX, Savani D, Kanhere A, Yi PH, Parekh VS

PubMed · May 29, 2025
MRI metadata, particularly the free-text series descriptions (SDs) used to identify sequences, are highly heterogeneous due to variable inputs by manufacturers and technologists. This variability poses challenges in correctly identifying series for hanging protocols and dataset curation. The purpose of this study was to evaluate the ability of large language models (LLMs) to automatically classify MRI SDs. We analyzed non-contrast brain MRIs performed between 2016 and 2022 at our institution, identifying all unique SDs in the metadata. A practicing neuroradiologist manually classified each SD into one of the following categories: "T1," "T2," "T2/FLAIR," "SWI," "DWI," "ADC," or "Other." Then, various LLMs, including GPT-3.5 Turbo, GPT-4, GPT-4o, Llama 3 8B, and Llama 3 70B, were asked to classify each SD into one of the sequence categories. Model performance was compared against the ground-truth classification using area under the curve (AUC) as the primary metric. Additionally, GPT-4o was tasked with generating regular expression templates to match each category. In 2510 MRI brain examinations, there were 1395 unique SDs, with 727/1395 (52.1%) appearing only once, indicating high variability. GPT-4o demonstrated the highest performance, achieving an average AUC of 0.983 ± 0.020 across all series with detailed prompting. GPT models significantly outperformed Llama models, with smaller differences within the GPT family. Regular expression generation was inconsistent, demonstrating an average AUC of 0.774 ± 0.161 across all sequences. Our findings suggest that LLMs are effective for interpreting and standardizing heterogeneous MRI SDs.
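As a concrete illustration of the classification setup described above, here is a minimal sketch that maps one SD to a sequence category via an LLM. It assumes the openai Python client; the prompt wording and decoding settings are our own illustrative choices, not the paper's.

```python
# Sketch: classify a free-text MRI series description with an LLM.
# Assumes the `openai` Python client (v1) and an OPENAI_API_KEY in the
# environment; the prompt below is an assumption, not the paper's.
from openai import OpenAI

CATEGORIES = ["T1", "T2", "T2/FLAIR", "SWI", "DWI", "ADC", "Other"]
client = OpenAI()

def classify_sd(series_description: str, model: str = "gpt-4o") -> str:
    """Map one series description to a single sequence category."""
    prompt = (
        "Classify this MRI series description into exactly one of: "
        f"{', '.join(CATEGORIES)}.\n"
        f"Series description: {series_description!r}\n"
        "Answer with the category name only."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    answer = resp.choices[0].message.content.strip()
    return answer if answer in CATEGORIES else "Other"  # guard bad outputs

print(classify_sd("AX T2 FLAIR FS"))  # expected: "T2/FLAIR"
```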

Can Large Language Models Challenge CNNs in Medical Image Analysis?

Shibbir Ahmed, Shahnewaz Karim Sakib, Anindya Bijoy Das

arXiv preprint · May 29, 2025
This study presents a multimodal AI framework designed for precisely classifying medical diagnostic images. Utilizing publicly available datasets, the proposed system compares the strengths of convolutional neural networks (CNNs) and different large language models (LLMs). This in-depth comparative analysis highlights key differences in diagnostic performance, execution efficiency, and environmental impacts. Model evaluation was based on accuracy, F1-score, average execution time, average energy consumption, and estimated CO₂ emissions. The findings indicate that although CNN-based models can outperform various multimodal techniques that incorporate both images and contextual information, applying additional filtering on top of LLMs can lead to substantial performance gains. These findings highlight the transformative potential of multimodal AI systems to enhance the reliability, efficiency, and scalability of medical diagnostics in clinical settings.
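A hedged sketch of the evaluation loop such a comparison implies is shown below. The metric set follows the abstract (accuracy, F1-score, average execution time); energy and CO₂ figures would come from an external power meter and are omitted here. The `predict` interface is an assumption, not the paper's code.

```python
# Sketch: per-model evaluation over a shared test set. `models` maps a
# name to any predict(images) -> labels callable (CNN or LLM wrapper).
import time
from sklearn.metrics import accuracy_score, f1_score

def evaluate(models: dict, images, labels) -> dict:
    results = {}
    for name, predict in models.items():
        start = time.perf_counter()
        preds = predict(images)                  # run inference
        elapsed = time.perf_counter() - start
        results[name] = {
            "accuracy": accuracy_score(labels, preds),
            "f1_macro": f1_score(labels, preds, average="macro"),
            "avg_time_s": elapsed / len(images), # mean per-image latency
        }
    return results
```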

Prediction of clinical stages of cervical cancer via machine learning integrated with clinical features and ultrasound-based radiomics.

Zhang M, Zhang Q, Wang X, Peng X, Chen J, Yang H

PubMed · May 29, 2025
To investigate the predictive performance of models constructed by combining machine learning (ML) with clinical features and ultrasound-based radiomics for the clinical staging of cervical cancer. General clinical and ultrasound data of 227 patients with cervical cancer who underwent transvaginal ultrasonography were retrospectively analyzed. Radiomics features were extracted from regions of interest (ROIs) in the original and derived images and screened; the selected features were used to construct the radiomics model and the Radscore formula. Prediction models were developed in Python using several ML algorithms on an integrated dataset of clinical features and ultrasound radiomics. Model performance was evaluated via AUC, and calibration curves and clinical decision curves were used to assess model efficacy. The model developed with a support vector machine (SVM) emerged as the superior model. Integrating clinical characteristics with ultrasound radiomics, it showed notable performance metrics in both the training and validation datasets. Specifically, in the training set, the model obtained an AUC of 0.88 (95% confidence interval (CI): 0.83-0.93), alongside 0.84 accuracy, 0.68 sensitivity, and 0.91 specificity. On validation, the model maintained an AUC of 0.77 (95% CI: 0.63-0.88), with 0.77 accuracy, 0.62 sensitivity, and 0.83 specificity. The calibration curve aligned closely with the perfect calibration line, and the clinical decision curve analysis indicated that the model offers clinical utility over a wide range of thresholds. The clinical- and radiomics-based SVM model provides a noninvasive tool for predicting cervical cancer stage, integrating ultrasound radiomics and key clinical factors (age, abortion history) to improve risk stratification. This approach could guide personalized treatment (surgery vs. chemoradiation) and optimize staging accuracy, particularly in resource-limited settings where advanced imaging is scarce.
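The following sketch shows the general shape of such a pipeline: concatenate clinical and radiomics features, fit an SVM, and score by AUC. All data below are synthetic stand-ins for illustration, not the study's dataset or feature set.

```python
# Sketch: SVM staging model over combined clinical + radiomics features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_clinical = rng.normal(size=(227, 2))    # stand-ins: age, abortion history
X_radiomics = rng.normal(size=(227, 20))  # stand-ins: screened ROI features
y_stage = rng.integers(0, 2, size=227)    # binary early/advanced stage label

X = np.hstack([X_clinical, X_radiomics])
X_train, X_val, y_train, y_val = train_test_split(
    X, y_stage, test_size=0.3, stratify=y_stage, random_state=42
)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
model.fit(X_train, y_train)
probs = model.predict_proba(X_val)[:, 1]
print("validation AUC:", roc_auc_score(y_val, probs))
```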

Deep Learning-Based BMD Estimation from Radiographs with Conformal Uncertainty Quantification

Long Hui, Wai Lok Yeung

arXiv preprint · May 28, 2025
Limited DXA access hinders osteoporosis screening. This proof-of-concept study proposes using widely available knee X-rays for opportunistic Bone Mineral Density (BMD) estimation via deep learning, emphasizing the robust uncertainty quantification essential for clinical use. An EfficientNet model was trained on the OAI dataset to predict BMD from bilateral knee radiographs. Two Test-Time Augmentation (TTA) methods were compared: traditional averaging and a multi-sample approach. Crucially, Split Conformal Prediction was implemented to provide statistically rigorous, patient-specific prediction intervals with guaranteed coverage. Results showed a Pearson correlation of 0.68 with traditional TTA. While traditional TTA yielded better point predictions, the multi-sample approach produced slightly tighter prediction intervals (at the 90%, 95%, and 99% levels) while maintaining coverage. The framework appropriately expressed higher uncertainty for challenging cases. Although anatomical mismatch between knee X-rays and standard DXA limits immediate clinical use, this method establishes a foundation for trustworthy AI-assisted BMD screening using routine radiographs, potentially improving early osteoporosis detection.
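Split conformal prediction itself is compact enough to sketch: hold out a calibration set, use absolute residuals as nonconformity scores, and widen each test prediction by their finite-sample-corrected quantile. The model and data below are toy stand-ins, not the paper's EfficientNet or the OAI dataset.

```python
# Sketch: split conformal prediction intervals for regression.
import numpy as np

def split_conformal(predict, X_calib, y_calib, X_test, alpha=0.10):
    """Return (lo, hi) intervals with >= 1 - alpha marginal coverage."""
    scores = np.abs(y_calib - predict(X_calib))   # nonconformity scores
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    preds = predict(X_test)
    return preds - q, preds + q

# toy usage with a known linear "model" standing in for the trained CNN
rng = np.random.default_rng(0)
w = np.array([0.5, -0.2, 0.1])
X_cal = rng.normal(size=(200, 3))
y_cal = X_cal @ w + rng.normal(scale=0.1, size=200)
X_tst = rng.normal(size=(5, 3))
lo, hi = split_conformal(lambda X: X @ w, X_cal, y_cal, X_tst)
print(np.c_[lo, hi])  # one patient-specific interval per test row
```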

Artificial Intelligence Augmented Cerebral Nuclear Imaging.

Currie GM, Hawk KE

PubMed · May 28, 2025
Artificial intelligence (AI), particularly machine learning (ML) and deep learning (DL), has significant potential to advance the capabilities of nuclear neuroimaging. The current and emerging applications of ML and DL in the processing, analysis, enhancement, and interpretation of SPECT and PET imaging are explored for brain imaging. Key developments include automated image segmentation, disease classification, and radiomic feature extraction, spanning lower-dimensionality first- and second-order radiomics, higher-dimensionality third-order radiomics, and more abstract fourth-order deep radiomics. DL-based reconstruction, attenuation correction using pseudo-CT generation, and denoising of low-count studies have a role in enhancing image quality. AI has a role in sustainability through applications in radioligand design and preclinical imaging, while federated learning addresses data security challenges to improve research and development in nuclear cerebral imaging. There is also potential for generative AI to transform the nuclear cerebral imaging space through solutions to data limitations, image enhancement, patient-centered care, workflow efficiencies, and trainee education. Innovations in ML and DL are re-engineering the nuclear neuroimaging ecosystem and reimagining tomorrow's precision medicine landscape.

Efficient feature extraction using light-weight CNN attention-based deep learning architectures for ultrasound fetal plane classification.

Sivasubramanian A, Sasidharan D, Sowmya V, Ravi V

PubMed · May 28, 2025
Ultrasound fetal imaging is beneficial for supporting prenatal development because it is affordable and non-intrusive. Nevertheless, fetal plane classification (FPC) remains challenging and time-consuming for obstetricians since it depends on nuanced clinical aspects, which increases the difficulty of identifying relevant features of the fetal anatomy. Thus, to assist with accurate feature extraction, a lightweight artificial intelligence architecture leveraging convolutional neural networks and attention mechanisms is proposed to classify the largest benchmark ultrasound dataset. The approach fine-tunes lightweight EfficientNet feature-extraction backbones pre-trained on ImageNet-1k to classify key fetal planes such as the brain, femur, thorax, cervix, and abdomen. Our methodology incorporates an attention mechanism to refine features and a 3-layer perceptron for classification, achieving superior performance with the highest Top-1 accuracy of 96.25%, Top-2 accuracy of 99.80%, and F1-score of 0.9576. Importantly, the model has 40x fewer trainable parameters than existing benchmark ensemble or transformer pipelines, facilitating easy deployment on edge devices to help clinical practitioners with real-time FPC. The findings are also interpreted using Grad-CAM to carry out clinical correlation, aiding doctors with diagnostics and improving treatment plans for expectant mothers.
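A rough sketch of that architecture is shown below, assuming a squeeze-and-excitation style channel attention and illustrative layer sizes; the paper's exact attention block and head dimensions are not given.

```python
# Sketch: EfficientNet-B0 backbone + SE-style attention + 3-layer MLP head.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights

class FetalPlaneClassifier(nn.Module):
    def __init__(self, num_classes: int = 5):  # 5 planes named in the abstract
        super().__init__()
        backbone = efficientnet_b0(weights=EfficientNet_B0_Weights.IMAGENET1K_V1)
        self.features = backbone.features       # pre-trained CNN feature extractor
        c = 1280                                 # EfficientNet-B0 feature channels
        self.attn = nn.Sequential(               # channel attention (SE-style)
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(c, c // 16), nn.ReLU(),
            nn.Linear(c // 16, c), nn.Sigmoid(),
        )
        self.head = nn.Sequential(               # 3-layer perceptron classifier
            nn.Linear(c, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        f = self.features(x)                     # (B, 1280, H', W')
        w = self.attn(f).unsqueeze(-1).unsqueeze(-1)
        f = (f * w).mean(dim=(2, 3))             # attention-refined pooling
        return self.head(f)

logits = FetalPlaneClassifier()(torch.randn(2, 3, 224, 224))  # (2, 5)
```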

High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models

Tristan S. W. Stevens, Oisín Nolan, Oudom Somphone, Jean-Luc Robert, Ruud J. G. van Sloun

arXiv preprint · May 28, 2025
Three-dimensional ultrasound enables real-time volumetric visualization of anatomical structures. Unlike traditional 2D ultrasound, 3D imaging reduces the reliance on precise probe orientation, potentially making ultrasound more accessible to clinicians with varying levels of experience and improving automated measurements and post-exam analysis. However, achieving both high volume rates and high image quality remains a significant challenge. While 3D diverging waves can provide high volume rates, they suffer from limited tissue harmonic generation and increased multipath effects, which degrade image quality. One compromise is to retain the focusing in elevation while leveraging unfocused diverging waves in the lateral direction to reduce the number of transmissions per elevation plane. Reaching the volume rates achieved by full 3D diverging waves, however, requires dramatically undersampling the number of elevation planes. Subsequently, to render the full volume, simple interpolation techniques are applied. This paper introduces a novel approach to 3D ultrasound reconstruction from a reduced set of elevation planes by employing diffusion models (DMs) to achieve increased spatial and temporal resolution. We compare both traditional and supervised deep learning-based interpolation methods on a 3D cardiac ultrasound dataset. Our results show that DM-based reconstruction consistently outperforms the baselines in image quality and downstream task performance. Additionally, we accelerate inference by leveraging the temporal consistency inherent to ultrasound sequences. Finally, we explore the robustness of the proposed method by exploiting the probabilistic nature of diffusion posterior sampling to quantify reconstruction uncertainty and demonstrate improved recall on out-of-distribution data with synthetic anomalies under strong subsampling.
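The uncertainty-quantification step at the end of the abstract can be illustrated compactly: draw several posterior samples from the conditional diffusion model and take their pixelwise spread. The sampler below is a toy stand-in, not the paper's model; only the aggregation logic is meaningful.

```python
# Sketch: reconstruction uncertainty via repeated diffusion posterior
# sampling. `sample_posterior` stands in for a trained conditional DM.
import numpy as np

def uncertainty_map(sample_posterior, observed_planes, n_samples=8):
    """Mean reconstruction plus pixelwise std-dev as an uncertainty map."""
    samples = np.stack([sample_posterior(observed_planes)
                        for _ in range(n_samples)])  # (n, D, H, W)
    return samples.mean(axis=0), samples.std(axis=0)

# toy stand-in: "sampler" = nearest-plane upsampling plus noise
rng = np.random.default_rng(0)
planes = rng.normal(size=(8, 64, 64))                # undersampled elevations
sampler = lambda p: np.repeat(p, 4, axis=0) + rng.normal(
    scale=0.05, size=(p.shape[0] * 4, 64, 64))
recon, unc = uncertainty_map(sampler, planes)        # (32, 64, 64) each
```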

Look & Mark: Leveraging Radiologist Eye Fixations and Bounding Boxes in Multimodal Large Language Models for Chest X-ray Report Generation

Yunsoo Kim, Jinge Wu, Su-Hwan Kim, Pardeep Vasudev, Jiashu Shen, Honghan Wu

arXiv preprint · May 28, 2025
Recent advancements in multimodal Large Language Models (LLMs) have significantly enhanced the automation of medical image analysis, particularly in generating radiology reports from chest X-rays (CXR). However, these models still suffer from hallucinations and clinically significant errors, limiting their reliability in real-world applications. In this study, we propose Look & Mark (L&M), a novel grounding fixation strategy that integrates radiologist eye fixations (Look) and bounding box annotations (Mark) into the LLM prompting framework. Unlike conventional fine-tuning, L&M leverages in-context learning to achieve substantial performance gains without retraining. When evaluated across multiple domain-specific and general-purpose models, L&M demonstrates significant gains, including a 1.2% improvement in overall metrics (A.AVG) for CXR-LLaVA compared to baseline prompting and a remarkable 9.2% boost for LLaVA-Med. General-purpose models also benefit from L&M combined with in-context learning, with LLaVA-OV achieving an 87.3% clinical average performance (C.AVG), the highest among all models, even surpassing those explicitly trained for CXR report generation. Expert evaluations further confirm that L&M reduces clinically significant errors (by 0.43 average errors per report), such as false predictions and omissions, enhancing both accuracy and reliability. These findings highlight L&M's potential as a scalable and efficient solution for AI-assisted radiology, paving the way for improved diagnostic workflows in low-resource clinical settings.
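In spirit, the in-context grounding is simple to sketch: serialize the fixations and boxes into the prompt given alongside the image. The format below is an assumption for illustration; the paper's exact template is not reproduced.

```python
# Sketch: a Look & Mark style prompt builder. The coordinate and field
# formats here are illustrative assumptions, not the paper's template.
def build_lm_prompt(fixations, boxes):
    """fixations: [(x, y, duration_s)]; boxes: [(label, x1, y1, x2, y2)]."""
    look = "; ".join(f"({x},{y}) for {d:.1f}s" for x, y, d in fixations)
    mark = "; ".join(f"{lbl} at [{x1},{y1},{x2},{y2}]"
                     for lbl, x1, y1, x2, y2 in boxes)
    return (
        "You are generating a chest X-ray report.\n"
        f"Radiologist gaze fixations (Look): {look}\n"
        f"Annotated regions (Mark): {mark}\n"
        "Ground each finding in these regions, then write the report."
    )

print(build_lm_prompt([(120, 88, 1.4), (200, 150, 0.9)],
                      [("opacity", 100, 80, 160, 140)]))
```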