Page 142 of 175 (1742 results)

Evaluating the Performance of Reasoning Large Language Models on Japanese Radiology Board Examination Questions.

Nakaura T, Takamure H, Kobayashi N, Shiraishi K, Yoshida N, Nagayama Y, Uetani H, Kidoh M, Funama Y, Hirai T

pubmed logopapers · May 17, 2025
This study evaluates the performance, cost, and processing time of OpenAI's reasoning large language models (LLMs) (o1-preview, o1-mini) and their base models (GPT-4o, GPT-4o-mini) on Japanese radiology board examination questions. A total of 210 questions from the 2022-2023 official board examinations of the Japan Radiological Society were presented to each of the four LLMs. Performance was evaluated by calculating the percentage of correctly answered questions within six predefined radiology subspecialties. The total cost and processing time for each model were also recorded. The McNemar test was used to assess the statistical significance of differences in accuracy between paired model responses. The o1-preview achieved the highest accuracy (85.7%), significantly outperforming GPT-4o (73.3%, P<.001). Similarly, o1-mini (69.5%) performed significantly better than GPT-4o-mini (46.7%, P<.001). Across all radiology subspecialties, o1-preview consistently ranked highest. However, reasoning models incurred substantially higher costs (o1-preview: $17.10, o1-mini: $2.58) compared to their base counterparts (GPT-4o: $0.496, GPT-4o-mini: $0.04), and their processing times were approximately 3.7 and 1.2 times longer, respectively. Reasoning LLMs demonstrated markedly superior performance in answering radiology board exam questions compared to their base models, albeit at a substantially higher cost and increased processing time.
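The paired-accuracy comparison above uses the McNemar test, which looks only at the questions the two models disagree on. A minimal sketch with the chi-square approximation and continuity correction (the discordant counts passed in are illustrative, not the study's data):

```python
import math

def mcnemar(b, c):
    """McNemar test with continuity correction for paired accuracy comparison.
    b: items model A answered correctly and model B missed; c: the reverse.
    Returns (chi-square statistic, p-value). For one degree of freedom, the
    chi-square survival function equals erfc(sqrt(x / 2))."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p
```

With many discordant pairs favoring one model (e.g. `mcnemar(30, 4)`) the p-value is far below .001, matching the kind of significance the study reports; balanced disagreement (e.g. `mcnemar(10, 10)`) is not significant.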

Breast Arterial Calcifications on Mammography: A Review of the Literature.

Rossi J, Cho L, Newell MS, Venta LA, Montgomery GH, Destounis SV, Moy L, Brem RF, Parghi C, Margolies LR

pubmed logopapers · May 17, 2025
Identifying systemic disease with medical imaging studies may improve population health outcomes. Although the pathogenesis of peripheral arterial calcification and coronary artery calcification differ, breast arterial calcification (BAC) on mammography is associated with cardiovascular disease (CVD), a leading cause of death in women. While professional society guidelines on the reporting or management of BAC have not yet been established, and assessment and quantification methods are not yet standardized, the value of reporting BAC is being considered internationally as a possible indicator of subclinical CVD. Furthermore, artificial intelligence (AI) models are being developed to identify and quantify BAC on mammography, as well as to predict the risk of CVD. This review outlines studies evaluating the association of BAC and CVD, introduces the role of preventative cardiology in clinical management, discusses reasons to consider reporting BAC, acknowledges current knowledge gaps and barriers to assessing and reporting calcifications, and provides examples of how AI can be utilized to measure BAC and contribute to cardiovascular risk assessment. Ultimately, reporting BAC on mammography might facilitate earlier mitigation of cardiovascular risk factors in asymptomatic women.

Fully Automated Evaluation of Condylar Remodeling after Orthognathic Surgery in Skeletal Class II Patients Using Deep Learning and Landmarks.

Jia W, Wu H, Mei L, Wu J, Wang M, Cui Z

pubmed logopapers · May 17, 2025
Condylar remodeling is a key prognostic indicator in maxillofacial surgery for skeletal class II patients. This study aimed to develop and validate a fully automated method leveraging landmark-guided segmentation and registration for efficient assessment of condylar remodeling. A V-Net-based deep learning workflow was developed to automatically segment the mandible and localize anatomical landmarks from CT images. Cutting planes were computed based on the landmarks to segment the condylar and ramus volumes from the mandible mask. The stable ramus served as a reference for registering pre- and post-operative condyles using the Iterative Closest Point (ICP) algorithm. Condylar remodeling was subsequently assessed through mesh registration, heatmap visualization, and quantitative metrics of surface distance and volumetric change. Experts also rated the concordance between automated assessments and clinical diagnoses. In the test set, condylar segmentation achieved a Dice coefficient of 0.98, and landmark prediction yielded a mean absolute error of 0.26 mm. The automated evaluation process was completed in 5.22 seconds, approximately 150 times faster than manual assessments. The method accurately quantified condylar volume changes, ranging from 2.74% to 50.67% across patients. Expert ratings for all test cases averaged 9.62. This study introduced a consistent, accurate, and fully automated approach for condylar remodeling evaluation. The well-defined anatomical landmarks guided precise segmentation and registration, while deep learning supported an end-to-end automated workflow. The test results demonstrated its broad clinical applicability across various degrees of condylar remodeling and high concordance with expert assessments. By integrating anatomical landmarks and deep learning, the proposed method improves efficiency by 150 times without compromising accuracy, thereby facilitating an efficient and accurate assessment of orthognathic prognosis. 
The personalized 3D condylar remodeling models aid in visualizing sequelae, such as joint pain or skeletal relapse, and guide individualized management of TMJ disorders.
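The overlap and volume-change measures quantified above can be sketched as follows; this is a minimal illustration on binary masks and scalar volumes, not the authors' pipeline:

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary segmentation masks,
    the metric used to score condylar segmentation (0.98 in the paper)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0

def volume_change_pct(v_pre, v_post):
    """Percent condylar volume change between pre- and post-operative scans
    (reported range: 2.74% to 50.67%)."""
    return abs(v_post - v_pre) / v_pre * 100.0
```

For example, two masks sharing two of three foreground voxels each give a Dice of 2/3, and a condyle shrinking from 100 to 90 units gives a 10% volume change.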

A Robust Automated Segmentation Method for White Matter Hyperintensity of Vascular-origin.

He H, Jiang J, Peng S, He C, Sun T, Fan F, Song H, Sun D, Xu Z, Wu S, Lu D, Zhang J

pubmed logopapers · May 17, 2025
White matter hyperintensity (WMH) is a primary manifestation of small vessel disease (SVD), leading to vascular cognitive impairment and other disorders. Accurate WMH quantification is vital for diagnosis and prognosis, but current automatic segmentation methods often fall short, especially across different datasets. The aims of this study are to develop and validate a robust deep learning segmentation method for WMH of vascular-origin. In this study, we developed a transformer-based method for the automatic segmentation of vascular-origin WMH using both 3D T1 and 3D T2-FLAIR images. Our initial dataset comprised 126 participants with varying WMH burdens due to SVD, each with manually segmented WMH masks used for training and testing. External validation was performed on two independent datasets: the WMH Segmentation Challenge 2017 dataset (170 subjects) and an in-house vascular risk factor dataset (70 subjects), which included scans acquired on eight different MRI systems at field strengths of 1.5T, 3T, and 5T. This approach enabled a comprehensive assessment of the method's generalizability across diverse imaging conditions. We further compared our method against LGA, LPA, BIANCA, UBO-detector and TrUE-Net in optimized settings. Our method consistently outperformed others, achieving a median Dice coefficient of 0.78±0.09 in our primary dataset, 0.72±0.15 in the external dataset 1, and 0.72±0.14 in the external dataset 2. The relative volume errors were 0.15±0.14, 0.50±0.86, and 0.47±1.02, respectively. The true positive rates were 0.81±0.13, 0.92±0.09, and 0.92±0.12, while the false positive rates were 0.20±0.09, 0.40±0.18, and 0.40±0.19. None of the external validation datasets were used for model training; instead, they comprise previously unseen MRI scans acquired from different scanners and protocols. 
This setup closely reflects real-world clinical scenarios and further demonstrates the robustness and generalizability of our model across diverse MRI systems and acquisition settings. As such, the proposed method provides a reliable solution for WMH segmentation in large-scale cohort studies.
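The voxel-wise metrics reported above (Dice, relative volume error, true and false positive rates) can be sketched as below. Note the false-positive-rate convention here, false positives as a fraction of the predicted lesion volume, is an assumption; the paper does not spell out its exact definition:

```python
import numpy as np

def wmh_metrics(pred, ref):
    """Segmentation metrics for binary WMH masks of equal shape.
    Returns (dice, relative volume error, true positive rate,
    false positive rate)."""
    p, r = pred.astype(bool), ref.astype(bool)
    tp = np.logical_and(p, r).sum()   # correctly predicted lesion voxels
    fp = np.logical_and(p, ~r).sum()  # predicted voxels outside the reference
    dice = 2 * tp / (p.sum() + r.sum())
    rve = abs(int(p.sum()) - int(r.sum())) / r.sum()  # relative volume error
    tpr = tp / r.sum()                # sensitivity over reference voxels
    fpr = fp / p.sum()                # assumed convention: FP / predicted
    return dice, rve, tpr, fpr
```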

ML-Driven Alzheimer's disease prediction: A deep ensemble modeling approach.

Jumaili MLF, Sonuç E

pubmed logopapers · May 17, 2025
Alzheimer's disease (AD) is a progressive neurological disorder characterized by cognitive decline due to brain cell death, typically manifesting later in life. Early and accurate detection is critical for effective disease management and treatment. This study proposes an ensemble learning framework that combines five deep learning architectures (VGG16, VGG19, ResNet50, InceptionV3, and EfficientNetB7) to improve the accuracy of AD diagnosis. We use a comprehensive dataset of 3,714 MRI brain scans collected from specialized clinics in Iraq, categorized into three classes: NonDemented (834 images), MildDemented (1,824 images), and VeryDemented (1,056 images). The proposed voting ensemble model achieves a diagnostic accuracy of 99.32% on our dataset. The effectiveness of the model is further validated on two external datasets: OASIS (achieving 86.6% accuracy) and ADNI (achieving 99.5% accuracy), demonstrating competitive performance compared to existing approaches. Moreover, the proposed model exhibits high precision and recall across all stages of dementia, providing a reliable and robust tool for early AD detection. This study highlights the effectiveness of ensemble learning in AD diagnosis and shows promise for clinical applications.
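A hard-voting ensemble like the one described can be sketched as follows. This is a minimal illustration, not the authors' code; the paper does not publish its voting scheme, and soft (probability-averaging) or weighted voting are common variants:

```python
import numpy as np

def majority_vote(per_model_probs):
    """Hard-voting ensemble over class predictions.
    per_model_probs: array of shape (n_models, n_samples, n_classes),
    e.g. softmax outputs from each backbone. Returns the per-sample
    class that most models predicted."""
    votes = per_model_probs.argmax(axis=-1)      # (n_models, n_samples)
    n_classes = per_model_probs.shape[-1]
    # Count votes per class for each sample (columns of `votes`).
    counts = np.apply_along_axis(
        lambda v: np.bincount(v, minlength=n_classes), 0, votes)
    return counts.argmax(axis=0)                 # (n_samples,)
```

For instance, with three models predicting classes [0, 0, 1] for one scan and [1, 2, 1] for another, the ensemble outputs 0 and 1, respectively.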

Evaluation of synthetic images derived from a neural network in pediatric brain magnetic resonance imaging.

Nagaraj UD, Meineke J, Sriwastwa A, Tkach JA, Leach JL, Doneva M

pubmed logopapers · May 17, 2025
Synthetic MRI (SyMRI) is a technique used to estimate tissue properties and generate multiple MR sequence contrasts from a single acquisition. However, image quality can be suboptimal. To evaluate a neural network approach using artificial intelligence-based direct contrast synthesis (AI-DCS) of the multi-contrast weighted images to improve image quality. This prospective, IRB-approved study enrolled 50 pediatric patients undergoing clinical brain MRI. In addition to the standard of care (SOC) clinical protocol, a 2D multi-delay multi-echo (MDME) sequence was obtained. SOC 3D T1-weighted (T1W), 2D T2-weighted (T2W), and 2D T2W fluid-attenuated inversion recovery (FLAIR) images from 35 patients were used to train a neural network generating synthetic T1W, T2W, and FLAIR images. Quantitative analysis of grey matter (GM) and white matter (WM) apparent signal-to-noise (aSNR) and grey-white matter (GWM) apparent contrast-to-noise (aCNR) ratios was performed. Eight patients were evaluated. When compared to SyMRI, T1W AI-DCS had better overall image quality, reduced noise/artifacts, and better subjective SNR in 100% (16/16) of evaluations. When compared to SyMRI, T2W AI-DCS overall image quality and diagnostic confidence were better in 93.8% (15/16) and 87.5% (14/16) of evaluations, respectively. When compared to SyMRI, FLAIR AI-DCS was better in 93.8% (15/16) of evaluations for overall image quality and in 100% (16/16) of evaluations for noise/artifacts and subjective SNR. Quantitative analysis revealed higher WM aSNR compared with SyMRI (p < 0.05) for T1W, T2W, and FLAIR. AI-DCS demonstrates better overall image quality than SyMRI on T1W, T2W, and FLAIR images.
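The apparent SNR and CNR ratios above can be sketched as below. The definitions used here (mean ROI signal, and GM-WM signal difference, each divided by an estimate of the noise standard deviation) are common conventions assumed for illustration; the paper may define them differently:

```python
import numpy as np

def asnr(tissue_roi, noise_std):
    """Apparent signal-to-noise ratio: mean ROI intensity over an
    estimated noise standard deviation (assumed convention)."""
    return float(np.mean(tissue_roi)) / noise_std

def acnr(gm_roi, wm_roi, noise_std):
    """Apparent grey-white matter contrast-to-noise ratio:
    GM-WM signal difference over estimated noise (assumed convention)."""
    return abs(float(np.mean(gm_roi)) - float(np.mean(wm_roi))) / noise_std
```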

Computer-aided assessment for enlarged fetal heart with deep learning model.

Nurmaini S, Sapitri AI, Roseno MT, Rachmatullah MN, Mirani P, Bernolian N, Darmawahyuni A, Tutuko B, Firdaus F, Islami A, Arum AW, Bastian R

pubmed logopapers · May 16, 2025
Enlarged fetal heart conditions may indicate congenital heart diseases or other complications, making early detection through prenatal ultrasound essential. However, manual assessments by sonographers are often subjective, time-consuming, and inconsistent. This paper proposes a deep learning approach using the You Only Look Once (YOLO) architecture to automate fetal heart enlargement assessment. Using a set of ultrasound videos, YOLOv8 with a CBAM module demonstrated superior performance compared to YOLOv11 with self-attention. Incorporating the ResNeXtBlock, a residual block with cardinality, further enhanced accuracy and prediction consistency. The model exhibits strong capability in detecting fetal heart enlargement, offering a reliable computer-aided tool for sonographers during prenatal screenings. Further validation is required to confirm its clinical applicability. By improving early and accurate detection, this approach has the potential to enhance prenatal care, facilitate timely interventions, and contribute to better neonatal health outcomes.
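The abstract gives no implementation detail, but YOLO-family detectors like the ones compared here match and suppress candidate boxes using intersection-over-union (IoU). A minimal sketch of that overlap measure, for illustration only:

```python
def iou(box_a, box_b):
    """Intersection-over-union for two axis-aligned boxes given as
    (x1, y1, x2, y2). Returns 0.0 for non-overlapping boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Two unit-overlap 2x2 boxes, for example, score 1/7; disjoint boxes score 0.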

Automated CT segmentation for lower extremity tissues in lymphedema evaluation using deep learning.

Na S, Choi SJ, Ko Y, Urooj B, Huh J, Cha S, Jung C, Cheon H, Jeon JY, Kim KW

pubmed logopapers · May 16, 2025
Clinical assessment of lymphedema, particularly for lymphedema severity and fluid-fibrotic lesions, remains challenging with traditional methods. We aimed to develop and validate a deep learning segmentation tool for automated tissue component analysis in lower extremity CT scans. For the development datasets, lower extremity CT venography scans were collected in 118 patients with gynecologic cancers for algorithm training. Reference standards were created by segmentation of fat, muscle, and fluid-fibrotic tissue components using 3D Slicer. A deep learning model based on the Unet++ architecture with an EfficientNet-B7 encoder was developed and trained. Segmentation accuracy of the deep learning model was validated in an internal validation set (n = 10) and an external validation set (n = 10) using the Dice similarity coefficient (DSC) and volumetric similarity (VS). A graphical user interface (GUI) tool was developed for the visualization of the segmentation results. Our deep learning algorithm achieved high segmentation accuracy. Mean DSCs for each component and all components ranged from 0.945 to 0.999 in the internal validation set and 0.946 to 0.999 in the external validation set. Similar performance was observed in the VS, with mean VSs for all components ranging from 0.97 to 0.999. In volumetric analysis, mean volumes of the entire leg and each component did not differ significantly between reference standard and deep learning measurements (p > 0.05). Our GUI displays lymphedema mapping, highlighting segmented fat, muscle, and fluid-fibrotic components in the entire leg. Our deep learning algorithm provides an automated segmentation tool enabling accurate segmentation, volume measurement of tissue components, and lymphedema mapping.
Question: Clinical assessment of lymphedema remains challenging, particularly for tissue segmentation and quantitative severity evaluation.
Findings: A deep learning algorithm achieved DSCs > 0.95 and VS > 0.97 for fat, muscle, and fluid-fibrotic components in internal and external validation datasets.
Clinical relevance: The developed deep learning tool accurately segments and quantifies lower extremity tissue components on CT scans, enabling automated lymphedema evaluation and mapping with high segmentation accuracy.
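The volumetric similarity (VS) reported alongside Dice is typically defined from the two segmentation volumes alone; a minimal sketch under that standard definition:

```python
def volumetric_similarity(v_pred, v_ref):
    """Volumetric similarity between a predicted and reference volume:
    1 - |Va - Vb| / (Va + Vb). Equals 1.0 when volumes match exactly,
    regardless of spatial overlap."""
    return 1.0 - abs(v_pred - v_ref) / (v_pred + v_ref)
```

Note that unlike Dice, VS ignores where the voxels sit, so it is reported as a complement to overlap metrics rather than a replacement.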

A deep learning-based approach to automated rib fracture detection and CWIS classification.

Marting V, Borren N, van Diepen MR, van Lieshout EMM, Wijffels MME, van Walsum T

pubmed logopapers · May 16, 2025
Trauma-induced rib fractures are a common injury. The number and characteristics of these fractures influence whether a patient is treated nonoperatively or surgically. Rib fractures are typically diagnosed using CT scans, yet 19.2-26.8% of fractures are still missed during assessment. Another challenge in managing rib fractures is the interobserver variability in their classification. The purpose of this study was to develop and assess an automated method that detects rib fractures in CT scans and classifies them according to the Chest Wall Injury Society (CWIS) classification. 198 CT scans were collected, of which 170 were used for training and internal validation, and 28 for external validation. Fractures and their classifications were manually annotated in each of the scans. A detection and classification network was trained for each of the three components of the CWIS classification. In addition, a rib number labeling network was trained to obtain the rib number of a fracture. Experiments were performed to assess the method's performance. On the internal test set, the method achieved a detection sensitivity of 80%, at a precision of 87% and an F1-score of 83%, with a mean of 1.11 false positives per scan (FPPS). Classification sensitivity varied, with the lowest being 25% for complex fractures and the highest being 97% for posterior fractures. The correct rib number was assigned to 94% of the detected fractures. The custom-trained nnU-Net correctly labeled 95.5% of all ribs and 98.4% of fractured ribs in 30 patients. The detection and classification performance on the external validation dataset was slightly better, with a fracture detection sensitivity of 84%, a precision of 85%, an F1-score of 84%, an FPPS of 0.96, and 95% of fractures assigned the correct rib number. The method developed is able to accurately detect and classify rib fractures in CT scans, although there is room for improvement for the rare, underrepresented classes in the training set.
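The detection metrics quoted above relate through standard formulas; a minimal sketch from raw counts (the counts in the usage example are illustrative, not the study's):

```python
def detection_summary(tp, fp, fn, n_scans):
    """Detection metrics as reported above: sensitivity (recall),
    precision, F1-score, and false positives per scan (FPPS)."""
    sens = tp / (tp + fn)               # fraction of true fractures found
    prec = tp / (tp + fp)               # fraction of detections that are real
    f1 = 2 * prec * sens / (prec + sens)
    fpps = fp / n_scans                 # spurious detections per CT scan
    return sens, prec, f1, fpps
```

For example, 8 true positives, 2 false positives, and 2 missed fractures over 4 scans give sensitivity, precision, and F1 of 0.8 each, at 0.5 FPPS.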