Page 24 of 1431427 results

Large Language Models in Medical Diagnostics: Scoping Review With Bibliometric Analysis.

Su H, Sun Y, Li R, Zhang A, Yang Y, Xiao F, Duan Z, Chen J, Hu Q, Yang T, Xu B, Zhang Q, Zhao J, Li Y, Li H

PubMed papers, Jun 9 2025
The integration of large language models (LLMs) into medical diagnostics has garnered substantial attention due to their potential to enhance diagnostic accuracy, streamline clinical workflows, and address health care disparities. However, the rapid evolution of LLM research necessitates a comprehensive synthesis of their applications, challenges, and future directions. This scoping review aimed to provide an overview of the current state of research regarding the use of LLMs in medical diagnostics. The study sought to answer four primary subquestions, as follows: (1) Which LLMs are commonly used? (2) How are LLMs assessed in diagnosis? (3) What is the current performance of LLMs in diagnosing diseases? (4) Which medical domains are investigating the application of LLMs? This scoping review was conducted according to the Joanna Briggs Institute Manual for Evidence Synthesis and adheres to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). Relevant literature was searched from the Web of Science, PubMed, Embase, IEEE Xplore, and ACM Digital Library databases from 2022 to 2025. Articles were screened and selected based on predefined inclusion and exclusion criteria. Bibliometric analysis was performed using VOSviewer to identify major research clusters and trends. Data extraction included details on LLM types, application domains, and performance metrics. The field is rapidly expanding, with a surge in publications after 2023. GPT-4 and its variants dominated research (70/95, 74% of studies), followed by GPT-3.5 (34/95, 36%). Key applications included disease classification (text or image-based), medical question answering, and diagnostic content generation. LLMs demonstrated high accuracy in specialties like radiology, psychiatry, and neurology but exhibited biases in race, gender, and cost predictions. 
Ethical concerns, including privacy risks and model hallucination, alongside regulatory fragmentation, were critical barriers to clinical adoption. LLMs hold transformative potential for medical diagnostics but require rigorous validation, bias mitigation, and multimodal integration to address real-world complexities. Future research should prioritize explainable artificial intelligence frameworks, specialty-specific optimization, and international regulatory harmonization to ensure equitable and safe clinical deployment.

Automated detection of spinal bone marrow oedema in axial spondyloarthritis: training and validation using two large phase 3 trial datasets.

Jamaludin A, Windsor R, Ather S, Kadir T, Zisserman A, Braun J, Gensler LS, Østergaard M, Poddubnyy D, Coroller T, Porter B, Ligozio G, Readie A, Machado PM

PubMed papers, Jun 9 2025
To evaluate the performance of machine learning (ML) models for the automated scoring of spinal MRI bone marrow oedema (BMO) in patients with axial spondyloarthritis (axSpA) and compare them with expert scoring. ML algorithms using SpineNet software were trained and validated on 3483 spinal MRIs from 686 axSpA patients across two clinical trial datasets. The scoring pipeline involved (i) detection and labelling of vertebral bodies and (ii) classification of vertebral units for the presence or absence of BMO. Two models were tested: Model 1, without manual segmentation, and Model 2, incorporating an intermediate manual segmentation step. Model outputs were compared with those of human experts using kappa statistics, balanced accuracy, sensitivity, specificity, and AUC. Both models performed comparably to expert readers, regarding presence vs absence of BMO. Model 1 outperformed Model 2, with an AUC of 0.94 (vs 0.88), accuracy of 75.8% (vs 70.5%), and kappa of 0.50 (vs 0.31), using absolute reader consensus scoring as the external reference; this performance was similar to the expert inter-reader accuracy of 76.8% and kappa of 0.47, in a radiographic axSpA dataset. In a non-radiographic axSpA dataset, Model 1 achieved an AUC of 0.97 (vs 0.91 for Model 2), accuracy of 74.6% (vs 70%), and kappa of 0.52 (vs 0.27), comparable to the expert inter-reader accuracy of 74.2% and kappa of 0.46. ML software shows potential for automated MRI BMO assessment in axSpA, offering benefits such as improved consistency, reduced labour costs, and minimised inter- and intra-reader variability. Clinicaltrials.gov, MEASURE 1 study (NCT01358175); PREVENT study (NCT02696031).
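
The agreement statistics named in this abstract (kappa and balanced accuracy) can be illustrated with a minimal sketch. The 2x2 counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study, and the function names are illustrative.

```python
def cohen_kappa_binary(tp, fp, fn, tn):
    """Cohen's kappa for a model vs. a reference on a 2x2 agreement table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement, computed from each rater's marginal positive/negative rates.
    p_both_positive = ((tp + fp) / n) * ((tp + fn) / n)
    p_both_negative = ((fn + tn) / n) * ((fp + tn) / n)
    expected = p_both_positive + p_both_negative
    return (observed - expected) / (1 - expected)


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts (NOT from the study): model output vs. reader consensus.
tp, fp, fn, tn = 40, 10, 20, 130
kappa = cohen_kappa_binary(tp, fp, fn, tn)
bal_acc = balanced_accuracy(tp, fp, fn, tn)
print(f"kappa={kappa:.3f}, balanced accuracy={bal_acc:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it can be moderate (around 0.5) even when raw accuracy is near 75%, as in the figures reported above.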

Radiomics-based machine learning atherosclerotic carotid artery disease in ultrasound: systematic review with meta-analysis of RQS.

Vacca S, Scicolone R, Pisu F, Cau R, Yang Q, Annoni A, Pontone G, Costa F, Paraskevas KI, Nicolaides A, Suri JS, Saba L

PubMed papers, Jun 9 2025
Stroke, a leading global cause of mortality and neurological disability, is often associated with atherosclerotic carotid artery disease. Distinguishing between symptomatic and asymptomatic carotid artery disease is crucial for appropriate treatment decisions. Radiomics, a quantitative image analysis technique, and machine learning (ML) have emerged as promising tools in ultrasound (US) imaging, potentially providing a helpful tool in the screening of such lesions. PubMed, Web of Science and Scopus databases were searched for relevant studies published from January 2005 to May 2023. The Radiomics Quality Score (RQS) was used to assess methodological quality of studies included in the review. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) assessed the risk of bias. Sensitivity, specificity, and logarithmic diagnostic odds ratio (logDOR) meta-analyses were conducted, alongside an influence analysis. RQS assessed methodological quality, revealing an overall low score and consistent findings with other radiology domains. QUADAS-2 indicated an overall low risk, except for two studies with high bias. The meta-analysis demonstrated that radiomics-based ML models for predicting culprit plaques on US had a satisfactory performance, with a sensitivity of 0.84 and specificity of 0.82. The logDOR analysis confirmed the positive results, yielding a pooled logDOR of 3.54. The summary ROC curve provided an AUC of 0.887. Radiomics combined with ML provides high sensitivity and a low false positive rate for carotid plaque vulnerability assessment on US. However, current evidence is not definitive, given the low overall study quality and high inter-study heterogeneity. High quality, prospective studies are needed to confirm the potential of these promising techniques.
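
For readers unfamiliar with the logDOR, the sketch below shows how a log diagnostic odds ratio follows from a single sensitivity/specificity pair. It assumes the natural logarithm, which the abstract does not state. Plugging in the pooled sensitivity and specificity is only an illustration: a meta-analysis pools per-study logDORs, so this value is not expected to reproduce the reported pooled logDOR of 3.54.

```python
import math


def log_dor(sensitivity, specificity):
    """Natural-log diagnostic odds ratio for one sensitivity/specificity pair."""
    # Odds of a positive test among diseased cases.
    true_positive_odds = sensitivity / (1 - sensitivity)
    # Odds of a positive test among non-diseased cases.
    false_positive_odds = (1 - specificity) / specificity
    return math.log(true_positive_odds / false_positive_odds)


# Illustration only: pooled point estimates quoted in the abstract.
illustrative_log_dor = log_dor(0.84, 0.82)
print(f"logDOR from pooled point estimates: {illustrative_log_dor:.2f}")
```

A logDOR of 0 corresponds to a test with no discriminative value; larger positive values indicate better discrimination.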

Multi-task and multi-scale attention network for lymph node metastasis prediction in esophageal cancer.

Yi Y, Wang J, Li Z, Wang L, Ding X, Zhou Q, Huang Y, Li B

PubMed papers, Jun 9 2025
The accurate diagnosis of lymph node metastasis in esophageal squamous cell carcinoma is crucial in the treatment workflow, and the process is often time-consuming for clinicians. Recent deep learning models that predict whether lymph nodes are affected by cancer in esophageal cancer cases suffer from challenging node delineation and hence achieve poor diagnostic accuracy. This paper proposes an innovative multi-task and multi-scale attention network (M<sup>2</sup>ANet) to predict lymph node metastasis precisely. The network softly expands the regions of the node mask and subsequently utilizes the expanded mask to aggregate image features, thereby amplifying the node contexts. It additionally proposes a two-branch training strategy that compels the model to simultaneously predict metastasis probability and node masks, fostering a more comprehensive learning process. The node metastasis prediction performance has been evaluated on a self-collected dataset with 177 patients. Our model finally achieves a competitive accuracy of 83.7% on the test set comprising 577 nodes. With the adaptability to intricate patterns and ability to handle data variations, M<sup>2</sup>ANet emerges as a promising tool for robust and comprehensive lymph node metastasis prediction in medical image analysis.

Brain tau PET-based identification and characterization of subpopulations in patients with Alzheimer's disease using deep learning-derived saliency maps.

Li Y, Wang X, Ge Q, Graeber MB, Yan S, Li J, Li S, Gu W, Hu S, Benzinger TLS, Lu J, Zhou Y

PubMed papers, Jun 9 2025
Alzheimer's disease (AD) is a heterogeneous neurodegenerative disorder in which tau neurofibrillary tangles are a pathological hallmark closely associated with cognitive dysfunction and neurodegeneration. In this study, we used brain tau data to investigate AD heterogeneity by identifying and characterizing the subpopulations among patients. We included 615 cognitively normal and 159 AD brain <sup>18</sup>F-flortaucipir PET scans, along with T1-weighted MRI from the Alzheimer's Disease Neuroimaging Initiative database. A three-dimensional convolutional neural network model was employed for AD detection using standardized uptake value ratio (SUVR) images. The model-derived saliency maps were generated and employed as informative image features for clustering AD participants. Demographics, neuropsychological measures, and SUVR were compared statistically among the identified subpopulations. Correlations between neuropsychological measures and regional SUVRs were assessed. A generalized linear model was utilized to investigate the sex and APOE ε4 interaction effect on regional SUVRs. Two distinct subpopulations of AD patients were revealed, denoted as S<sub>Hi</sub> and S<sub>Lo</sub>. Compared to the S<sub>Lo</sub> group, the S<sub>Hi</sub> group exhibited a significantly higher global tau burden in the brain, but both groups showed similar cognition distribution levels. In the S<sub>Hi</sub> group, the associations between the neuropsychological measurements and regional tau deposition were weakened. Moreover, a significant interaction effect of sex and APOE ε4 on tau deposition was observed in the S<sub>Lo</sub> group, but no such effect was found in the S<sub>Hi</sub> group. Our results suggest that tau tangles, as shown by SUVR, continue to accumulate even when cognitive function plateaus in AD patients, highlighting the advantages of PET in later disease stages. 
The differing relationships between cognition and tau deposition, and between gender, APOE4, and tau deposition, provide potential for subtype-specific treatments. Targeting gender-specific and genetic factors influencing tau deposition, as well as interventions aimed at tau's impact on cognition, may be effective.

Curriculum check, 2025-equipping radiology residents for AI challenges of tomorrow.

Venugopal VK, Kumar A, Tan MO, Szarf G

PubMed papers, Jun 9 2025
The exponential rise in artificial intelligence (AI) tools for medical imaging is profoundly impacting the practice of radiology. With over 1000 FDA-cleared AI algorithms now approved for clinical use, many of them designed for radiologic tasks, the responsibility lies with training institutions to ensure that radiology residents are equipped not only to use AI systems, but to critically evaluate, monitor, and respond to their output in a safe, ethical manner. This review proposes a comprehensive framework to integrate AI into radiology residency curricula, targeting both essential competencies required of all residents and optional advanced skills for those interested in research or AI development. Core educational strategies include structured didactic instruction, hands-on lab exposure to commercial AI tools, case-based discussions, simulation-based clinical pathways, and teaching residents how to interpret model cards and regulatory documentation. Clinical examples such as stroke triage, urinary tract calculi detection, AI-CAD in mammography, and false-positive detection are used to anchor theory in practice. The article also addresses critical domains of AI governance: model transparency, ethical dilemmas, algorithmic bias, and the role of residents in human-in-the-loop oversight systems. It outlines mentorship and faculty development strategies to build institutional readiness, and proposes a roadmap to future-proof radiology education. This includes exposure to foundation models, vision-language systems, multi-agent workflows, and global best practices in post-deployment AI monitoring. This pragmatic framework aims to serve as a guide for residency programs adapting to the next era of radiology practice.

Comparative accuracy of two commercial AI algorithms for musculoskeletal trauma detection in emergency radiographs.

Huhtanen JT, Nyman M, Blanco Sequeiros R, Koskinen SK, Pudas TK, Kajander S, Niemi P, Aronen HJ, Hirvonen J

PubMed papers, Jun 9 2025
Missed fractures are the primary cause of interpretation errors in emergency radiology, and artificial intelligence has recently shown great promise in radiograph interpretation. This study compared the diagnostic performance of two AI algorithms, BoneView and RBfracture, in detecting traumatic abnormalities (fractures and dislocations) in MSK radiographs. AI algorithms analyzed 998 radiographs (585 normal, 413 abnormal), against the consensus of two MSK specialists. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and interobserver agreement (Cohen's Kappa) were calculated. 95% confidence intervals (CI) assessed robustness, and McNemar's tests compared sensitivity and specificity between the AI algorithms. BoneView demonstrated a sensitivity of 0.893 (95% CI: 0.860-0.920), specificity of 0.885 (95% CI: 0.857-0.909), PPV of 0.846, NPV of 0.922, and accuracy of 0.889. RBfracture demonstrated a sensitivity of 0.872 (95% CI: 0.836-0.901), specificity of 0.892 (95% CI: 0.865-0.915), PPV of 0.851, NPV of 0.908, and accuracy of 0.884. No statistically significant differences were found in sensitivity (p = 0.151) or specificity (p = 0.708). Kappa was 0.81 (95% CI: 0.77-0.84), indicating almost perfect agreement between the two AI algorithms. Performance was similar in adults and children. Both AI algorithms struggled more with subtle abnormalities, which constituted 66% and 70% of false negatives but only 20% and 18% of true positives for the two AI algorithms, respectively (p < 0.001). BoneView and RBfracture exhibited high diagnostic performance and almost perfect agreement, with consistent results across adults and children, highlighting the potential of AI in emergency radiograph interpretation.
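
The reported PPV, NPV, and accuracy can be cross-checked against the reported sensitivity and specificity with a short sketch. The abstract does not give confusion-matrix counts, so the counts below are reconstructed by rounding (an assumption), using the stated 413 abnormal and 585 normal radiographs; the function name is illustrative.

```python
def reconstruct_metrics(sensitivity, specificity, n_abnormal=413, n_normal=585):
    """Rebuild approximate 2x2 counts from rates, then derive PPV, NPV, accuracy."""
    # Reconstructed counts: an assumption, since the abstract reports only rates.
    tp = round(sensitivity * n_abnormal)
    tn = round(specificity * n_normal)
    fn = n_abnormal - tp
    fp = n_normal - tn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "ppv": round(tp / (tp + fp), 3),
        "npv": round(tn / (tn + fn), 3),
        "accuracy": round((tp + tn) / (n_abnormal + n_normal), 3),
    }


boneview = reconstruct_metrics(0.893, 0.885)
rbfracture = reconstruct_metrics(0.872, 0.892)
print("BoneView:", boneview)
print("RBfracture:", rbfracture)
```

Under this rounding assumption, the derived values agree with the abstract to three decimals: PPV 0.846, NPV 0.922, accuracy 0.889 for BoneView, and PPV 0.851, NPV 0.908, accuracy 0.884 for RBfracture.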

Hybrid adaptive attention deep supervision-guided U-Net for breast lesion segmentation in ultrasound computed tomography images.

Liu X, Zhou L, Cai M, Zheng H, Zheng S, Wang X, Wang Y, Ding M

PubMed papers, Jun 9 2025
Breast cancer is the second deadliest cancer among women after lung cancer. Though the breast cancer death rate has continued to decline over the past 20 years, the death rates for stage III and IV breast cancer remain high. Therefore, an automated breast cancer diagnosis system is of great significance for early screening of breast lesions to improve the survival rate of patients. This paper proposes a deep learning-based network, the hybrid adaptive attention deep supervision-guided U-Net (HAA-DSUNet), for breast lesion segmentation of breast ultrasound computed tomography (BUCT) images, which replaces the traditionally sampled convolution module of U-Net with the hybrid adaptive attention module (HAAM), aiming to enlarge the receptive field and probe rich global features while preserving fine details. Moreover, we apply the contrast loss to intermediate outputs as deep supervision to minimize the information loss during upsampling. Finally, the segmentation prediction results are further processed by filtering, segmentation, and morphology to obtain the final results. We conducted the experiment on our two UCT image datasets, HCH and HCH-PHMC, and the highest Dice score is 0.8729 and IoU is 0.8097, which outperform all the other state-of-the-art methods. It is demonstrated that our algorithm is effective in segmenting the lesion from BUCT images.
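
The two segmentation metrics quoted above, Dice and IoU, can be illustrated with a minimal pure-Python sketch on toy binary masks. The masks and helper names are hypothetical and unrelated to the HCH datasets; this is not the authors' evaluation code.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two equally sized 2D binary masks (lists of 0/1 rows)."""
    intersection = pred_area = target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_area += bool(p)
            target_area += bool(t)
            intersection += bool(p and t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks empty: treat as perfect agreement by convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def square_mask(size, start, stop):
    """Toy mask: a filled square covering rows/cols start..stop-1."""
    return [[1 if start <= r < stop and start <= c < stop else 0
             for c in range(size)] for r in range(size)]


# Two overlapping 4x4 squares on an 8x8 grid (hypothetical example).
pred = square_mask(8, 2, 6)
target = square_mask(8, 3, 7)
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.4f}, IoU={iou:.4f}")
```

For a single pair of masks the two metrics are linked by IoU = Dice / (2 - Dice). Scores averaged over many images, or reported from different datasets, need not satisfy this identity.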

Evaluation of AI diagnostic systems for breast ultrasound: comparative analysis with radiologists and the effect of AI assistance.

Tsuyuzaki S, Fujioka T, Yamaga E, Katsuta L, Mori M, Yashima Y, Hara M, Sato A, Onishi I, Tsukada J, Aruga T, Kubota K, Tateishi U

PubMed papers, Jun 9 2025
The purpose of this study is to evaluate the diagnostic accuracy of an artificial intelligence (AI)-based Computer-Aided Diagnosis (CADx) system for breast ultrasound, compare its performance with radiologists, and assess the effect of AI-assisted diagnosis. This study aims to investigate the system's ability to differentiate between benign and malignant breast masses among Japanese patients. This retrospective study included 171 breast mass ultrasound images (92 benign, 79 malignant). The AI system, BU-CAD™, provided Breast Imaging Reporting and Data System (BI-RADS) categorization, which was compared with the performance of three radiologists. Diagnostic accuracy, sensitivity, specificity, and area under the curve (AUC) were analyzed. Radiologists' diagnostic performance with and without AI assistance was also compared, and their reading time was measured using a stopwatch. The AI system demonstrated a sensitivity of 91.1%, specificity of 92.4%, and an AUC of 0.948. It showed comparable diagnostic performance to Radiologist 1, with 10 years of experience in breast imaging (0.948 vs. 0.950; p = 0.893), and superior performance to Radiologist 2 (7 years of experience, 0.948 vs. 0.881; p = 0.015) and Radiologist 3 (3 years of experience, 0.948 vs. 0.832; p = 0.001). When comparing diagnostic performance with and without AI, the use of AI significantly improved the AUC for Radiologists 2 and 3 (p = 0.001 and 0.005, respectively). However, there was no significant difference for Radiologist 1 (p = 0.139). In terms of diagnosis time, the use of AI reduced the reading time for all radiologists. Although there was no significant difference in diagnostic performance between AI and Radiologist 1, the use of AI substantially decreased the diagnosis time for Radiologist 1 as well. The AI system significantly improved diagnostic efficiency and accuracy, particularly for junior radiologists, highlighting its potential clinical utility in breast ultrasound diagnostics.

Integration of artificial intelligence into cardiac ultrasonography practice.

Shaulian SY, Gala D, Makaryus AN

PubMed papers, Jun 9 2025
Over the last several decades, echocardiography has made numerous technological advancements, with one of the most significant being the integration of artificial intelligence (AI). AI algorithms assist novice operators to acquire diagnostic-quality images and automate complex analyses. This review explores the integration of AI into various echocardiographic modalities, including transthoracic, transesophageal, intracardiac, and point-of-care ultrasound. It examines how AI enhances image acquisition, streamlines analysis, and improves diagnostic performance across routine, critical care, and complex cardiac imaging. To conduct this review, PubMed was searched using targeted keywords aligned with each section of the paper, focusing primarily on peer-reviewed articles published from 2020 onward. Earlier studies were included when foundational or frequently cited. The findings were organized thematically to highlight clinical relevance and practical applications. Challenges persist in clinical application, including algorithmic bias, ethical concerns, and the need for clinician training and AI oversight. Despite these, AI's potential to revolutionize cardiovascular care through precision and accessibility remains unparalleled, with benefits likely to far outweigh obstacles if appropriately applied and implemented in cardiac ultrasonography.