Page 28 of 45448 results

Diagnostic Performance of ChatGPT-4o in Detecting Hip Fractures on Pelvic X-rays.

Erdem TE, Kirilmaz A, Kekec AF

Jun 1 2025
Hip fractures are a major orthopedic problem, especially in the elderly population, and are usually diagnosed by clinical evaluation and imaging, especially X-rays. In recent years, new approaches to fracture detection have emerged with the use of artificial intelligence (AI) and deep learning techniques in medical imaging. In this study, we aimed to evaluate the diagnostic performance of ChatGPT-4o, an AI model, in diagnosing hip fractures. A total of 200 anteroposterior pelvic X-ray images were retrospectively analyzed. Half of the images belonged to patients with surgically confirmed hip fractures, including both displaced and non-displaced types, while the other half represented patients with soft tissue trauma and no fractures. Each image was evaluated by ChatGPT-4o through a standardized prompt, and its predictions (fracture vs. no fracture) were compared against the gold standard diagnoses. Diagnostic performance metrics such as sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), receiver operating characteristic (ROC) curve, Cohen's kappa, and F1 score were calculated. ChatGPT-4o demonstrated an overall accuracy of 82.5% in detecting hip fractures on pelvic radiographs, with a sensitivity of 78.0% and specificity of 87.0%. The PPV and NPV were 85.7% and 79.8%, respectively. The area under the ROC curve (AUC) was 0.825, indicating good discriminative performance. Among 22 false-negative cases, 68.2% were non-displaced fractures, suggesting the model had greater difficulty identifying subtle radiographic findings. Cohen's kappa coefficient was 0.65, showing substantial agreement with actual diagnoses. Chi-square analysis revealed a strong association (χ² = 82.59, <i>P</i> < 0.001), while McNemar's test (<i>P</i> = 0.176) showed no significant asymmetry in error distribution.
ChatGPT-4o shows promising accuracy in identifying hip fractures on pelvic X-rays, especially when fractures are displaced. However, its sensitivity drops significantly for non-displaced fractures, leading to many false negatives. This highlights the need for caution when interpreting negative AI results, particularly when clinical suspicion remains high. While not a replacement for expert assessment, ChatGPT-4o may assist in settings with limited specialist access.
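The reported figures can be reproduced from the confusion matrix implied by the abstract (100 fracture and 100 non-fracture images, 22 false negatives, 13 false positives). A minimal sketch of the metric arithmetic:

```python
# Confusion-matrix counts implied by the abstract:
# 100 fracture images, sensitivity 78.0% -> TP = 78, FN = 22
# 100 non-fracture images, specificity 87.0% -> TN = 87, FP = 13
tp, fn, tn, fp = 78, 22, 87, 13
n = tp + fn + tn + fp

sensitivity = tp / (tp + fn)          # 0.780
specificity = tn / (tn + fp)          # 0.870
ppv = tp / (tp + fp)                  # ~0.857
npv = tn / (tn + fn)                  # ~0.798
accuracy = (tp + tn) / n              # 0.825

# Cohen's kappa: observed agreement vs. chance agreement
p_o = accuracy
p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
kappa = (p_o - p_e) / (1 - p_e)       # 0.65
```

With a balanced 100/100 sample, chance agreement works out to exactly 0.5, which is why the kappa of 0.65 sits well below the raw accuracy of 0.825.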

Coarse for Fine: Bounding Box Supervised Thyroid Ultrasound Image Segmentation Using Spatial Arrangement and Hierarchical Prediction Consistency.

Chi J, Lin G, Li Z, Zhang W, Chen JH, Huang Y

Jun 1 2025
Weakly-supervised learning methods have become increasingly attractive for medical image segmentation, but suffer from a high dependence on quantifying the pixel-wise affinities of low-level features, which are easily corrupted in thyroid ultrasound images, resulting in segmentation over-fitting to weakly annotated regions without precise delineation of target boundaries. We propose a dual-branch weakly-supervised learning framework to optimize the backbone segmentation network by calibrating semantic features into rational spatial distribution under the indirect, coarse guidance of the bounding box mask. Specifically, in the spatial arrangement consistency branch, the maximum activations sampled from the preliminary segmentation prediction and the bounding box mask along the horizontal and vertical dimensions are compared to measure the rationality of the approximate target localization. In the hierarchical prediction consistency branch, the target and background prototypes are encapsulated from the semantic features under the combined guidance of the preliminary segmentation prediction and the bounding box mask. The secondary segmentation prediction induced from the prototypes is compared with the preliminary prediction to quantify the rationality of the elaborated target and background semantic feature perception. Experiments on three thyroid datasets illustrate that our model outperforms existing weakly-supervised methods for thyroid gland and nodule segmentation and is comparable to the performance of fully-supervised methods with reduced annotation time. The proposed method provides a weakly-supervised segmentation strategy that simultaneously considers the target's location and the rationality of the target and background semantic feature distributions. It can improve the applicability of deep learning-based segmentation in clinical practice.
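The spatial arrangement branch can be illustrated with a toy computation: project the soft prediction and the box mask to per-row and per-column maxima, then penalize their disagreement. A minimal numpy sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def axis_max_profiles(mask):
    """Max activation along each row (length H) and each column (length W)."""
    return mask.max(axis=1), mask.max(axis=0)

def arrangement_consistency(pred, box):
    """Mean absolute gap between prediction and box-mask axis profiles."""
    pr, pc = axis_max_profiles(pred)
    br, bc = axis_max_profiles(box)
    return 0.5 * (np.abs(pr - br).mean() + np.abs(pc - bc).mean())

# Toy example: a 4x4 box mask and a soft prediction confined to the box
box = np.zeros((4, 4)); box[1:3, 1:3] = 1.0
pred = np.zeros((4, 4)); pred[1:3, 1:3] = 0.9
loss = arrangement_consistency(pred, box)  # small when the prediction fills the box
```

The projections only constrain where activation peaks may occur along each axis, which is exactly the coarse localization signal a bounding box can legitimately supply.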

Advancing Intracranial Aneurysm Detection: A Comprehensive Systematic Review and Meta-analysis of Deep Learning Models Performance, Clinical Integration, and Future Directions.

Delfan N, Abbasi F, Emamzadeh N, Bahri A, Parvaresh Rizi M, Motamedi A, Moshiri B, Iranmehr A

Jun 1 2025
Cerebral aneurysms pose a significant risk to patient safety, particularly when ruptured, emphasizing the need for early detection and accurate prediction. Traditional diagnostic methods, reliant on clinician-based evaluations, face challenges in sensitivity and consistency, prompting the exploration of deep learning (DL) systems for improved performance. This systematic review and meta-analysis assessed the performance of DL models in detecting and predicting intracranial aneurysms compared to clinician-based evaluations. Imaging modalities included CT angiography (CTA), digital subtraction angiography (DSA), and time-of-flight MR angiography (TOF-MRA). Data on lesion-wise sensitivity, specificity, and the impact of DL assistance on clinician performance were analyzed. Subgroup analyses evaluated DL sensitivity by aneurysm size and location, and interrater agreement was measured using Fleiss' κ. DL systems achieved an overall lesion-wise sensitivity of 90% and specificity of 94%, outperforming human diagnostics. Clinician specificity improved significantly with DL assistance, increasing from 83% to 85% in the patient-wise scenario and from 93% to 95% in the lesion-wise scenario. Similarly, clinician sensitivity also showed notable improvement with DL assistance, rising from 82% to 96% in the patient-wise scenario and from 82% to 88% in the lesion-wise scenario. Subgroup analysis showed DL sensitivity varied with aneurysm size and location, reaching 100% for aneurysms larger than 10 mm. Additionally, DL assistance improved interrater agreement among clinicians, with Fleiss' κ increasing from 0.668 to 0.862. DL models demonstrate transformative potential in managing cerebral aneurysms by enhancing diagnostic accuracy, reducing missed cases, and supporting clinical decision-making. However, further validation in diverse clinical settings and seamless integration into standard workflows are necessary to fully realize the benefits of DL-driven diagnostics.
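Interrater agreement figures like the κ values above come from Fleiss' statistic over per-subject category counts. A minimal pure-Python sketch with illustrative data (not from the review):

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters assigning subject i to category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    # Per-subject agreement: fraction of concordant rater pairs
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from marginal category proportions
    n_categories = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_j = [t / (n_subjects * n_raters) for t in totals]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Three raters, four subjects, two categories (aneurysm / no aneurysm)
ratings = [[3, 0], [3, 0], [0, 3], [1, 2]]
kappa = fleiss_kappa(ratings)
```

Unlike Cohen's kappa, this handles more than two raters, which is why meta-analyses of multi-reader studies report it.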

Accuracy of an Automated Bone Scan Index Measurement System Enhanced by Deep Learning of the Female Skeletal Structure in Patients with Breast Cancer.

Fukai S, Daisaki H, Yamashita K, Kuromori I, Motegi K, Umeda T, Shimada N, Takatsu K, Terauchi T, Koizumi M

Jun 1 2025
VSBONE<sup>®</sup> BSI (VSBONE), an automated bone scan index (BSI) measurement system, was updated from version 2.1 (ver.2) to 3.0 (ver.3). VSBONE ver.3 incorporates deep learning of the skeletal structures of 957 new women, and it can be applied in patients with breast cancer. However, the performance of the updated VSBONE remains unclear. This study aimed to validate the diagnostic accuracy of the VSBONE system in patients with breast cancer. In total, 220 Japanese patients with breast cancer who underwent bone scintigraphy with single-photon emission computed tomography/computed tomography (SPECT/CT) were retrospectively analyzed. The patients were diagnosed with active bone metastases (<i>n</i> = 20) and non-bone metastases (<i>n</i> = 200) according to the physician's radiographic image interpretation. The patients were assessed using VSBONE ver.2 and VSBONE ver.3, and the BSI findings were compared with the interpretation results by the physicians. The occurrence of segmentation errors, the association of BSI between VSBONE ver.2 and VSBONE ver.3, and the diagnostic accuracy of the systems were evaluated. VSBONE ver.2 and VSBONE ver.3 had segmentation errors in four and two patients, respectively. A significant positive linear correlation was confirmed between the BSI values of the two versions (<i>r</i> = 0.92). The diagnostic accuracy was 54.1% for VSBONE ver.2 and 80.5% for VSBONE ver.3 (<i>P</i> < 0.001). The diagnostic accuracy of VSBONE was improved through deep learning of the female skeletal structures. The updated VSBONE ver.3 can be a reliable automated system for measuring BSI in patients with breast cancer.
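As background, a bone scan index aggregates, per skeletal region, the fraction of the region classified as metastatic hotspot, weighted by that region's share of total skeletal mass. A hedged sketch of that arithmetic (the region weights and involvement fractions below are made up for illustration, not from VSBONE):

```python
def bone_scan_index(regions):
    """regions: list of (mass_fraction_of_skeleton, hotspot_fraction_of_region).

    Returns the BSI as a percentage of total skeletal mass involved.
    """
    return 100.0 * sum(weight * involved for weight, involved in regions)

# Illustrative only: three regions of a hypothetical skeletal atlas
regions = [
    (0.30, 0.05),  # e.g. spine: 5% of its mass flagged as hotspot
    (0.20, 0.00),  # pelvis: clear
    (0.10, 0.12),  # ribs: 12% flagged
]
bsi = bone_scan_index(regions)  # 100 * (0.015 + 0.0 + 0.012) = 2.7
```

The deep-learning component in systems like VSBONE sits upstream of this sum, segmenting the skeletal regions and classifying hotspots; the index itself is simple weighted arithmetic.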

Prediction of Lymph Node Metastasis in Lung Cancer Using Deep Learning of Endobronchial Ultrasound Images With Size on CT and PET-CT Findings.

Oh JE, Chung HS, Gwon HR, Park EY, Kim HY, Lee GK, Kim TS, Hwangbo B

Jun 1 2025
Echo features of lymph nodes (LNs) influence target selection during endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA). This study evaluates deep learning's diagnostic capabilities on EBUS images for detecting mediastinal LN metastasis in lung cancer, emphasising the added value of integrating a region of interest (ROI), LN size on CT, and PET-CT findings. We analysed 2901 EBUS images from 2055 mediastinal LN stations in 1454 lung cancer patients. ResNet18-based deep learning models were developed to classify images of true positive malignant and true negative benign LNs diagnosed by EBUS-TBNA using different inputs: original images, ROI images, and CT size and PET-CT data. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC) and other diagnostic metrics. The model using only original EBUS images showed the lowest AUROC (0.870) and accuracy (80.7%) in classifying LN images. Adding ROI information slightly increased the AUROC (0.896) without a significant difference (p = 0.110). Further adding CT size resulted in a minimal change in AUROC (0.897), while adding PET-CT (original + ROI + PET-CT) showed a significant improvement (0.912, p = 0.008 vs. original; p = 0.002 vs. original + ROI + CT size). The model combining original and ROI EBUS images with CT size and PET-CT findings achieved the highest AUROC (0.914, p = 0.005 vs. original; p = 0.018 vs. original + ROI + PET-CT) and accuracy (82.3%). Integrating an ROI, LN size on CT, and PET-CT findings into the deep learning analysis of EBUS images significantly enhances the diagnostic capability of models for detecting mediastinal LN metastasis in lung cancer, with the integration of PET-CT data having a substantial impact.

Implementation costs and cost-effectiveness of ultraportable chest X-ray with artificial intelligence in active case finding for tuberculosis in Nigeria.

Garg T, John S, Abdulkarim S, Ahmed AD, Kirubi B, Rahman MT, Ubochioma E, Creswell J

Jun 1 2025
Availability of ultraportable chest x-ray (CXR) and advancements in artificial intelligence (AI)-enabled CXR interpretation are promising developments in tuberculosis (TB) active case finding (ACF), but costing and cost-effectiveness analyses are limited. We provide implementation cost and cost-effectiveness estimates of different screening algorithms using symptoms, CXR, and AI in Nigeria. People 15 years and older were screened for TB symptoms and offered a CXR with AI-enabled interpretation using qXR v3 (Qure.ai) at lung health camps. Sputum samples were tested on Xpert MTB/RIF for individuals reporting symptoms or with qXR abnormality scores ≥0.30. We conducted a retrospective costing using a combination of top-down and bottom-up approaches while utilizing itemized expense data from a health system perspective. We estimated costs in five screening scenarios: abnormality score ≥0.30 and ≥0.50; cough ≥2 weeks; any symptom; abnormality score ≥0.30 or any symptom. We calculated total implementation costs and cost per bacteriologically-confirmed case detected, and assessed cost-effectiveness using the incremental cost-effectiveness ratio (ICER) as additional cost per additional case. Overall, 3205 people with presumptive TB were identified, 1021 were tested, and 85 people with bacteriologically-confirmed TB were detected. The abnormality score ≥0.30 or any symptom algorithm (US$65,704) had the highest total cost, while cough ≥2 weeks had the lowest (US$40,740). The cost per case was US$1,198 for cough ≥2 weeks, and lowest for any symptom (US$635). Compared to the baseline strategy of cough ≥2 weeks, the ICER for any symptom was US$191 per additional case detected and US$2,096 for the abnormality score ≥0.30 or any symptom algorithm. Using CXR and AI had a lower cost per case detected than the any symptom screening criterion when asymptomatic TB was higher than 30% of all bacteriologically-confirmed TB detected.
Compared to traditional symptom screening, using CXR and AI in combination with symptoms detects more cases at lower cost per case detected and is cost-effective. TB programs should explore adoption of CXR and AI for screening in ACF.
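The ICER comparison follows the standard formula: incremental cost divided by incremental cases detected relative to the baseline strategy. A minimal sketch with illustrative numbers (the case counts below are hypothetical, not taken from the study):

```python
def icer(cost_new, cases_new, cost_base, cases_base):
    """Incremental cost per additional bacteriologically-confirmed case detected."""
    return (cost_new - cost_base) / (cases_new - cases_base)

# Hypothetical comparison: an expanded algorithm vs. a baseline
# (e.g. cough >= 2 weeks) -- figures are illustrative only
baseline_cost, baseline_cases = 40000, 30
expanded_cost, expanded_cases = 60000, 50
extra_cost_per_case = icer(expanded_cost, expanded_cases,
                           baseline_cost, baseline_cases)  # 1000.0
```

Note that an algorithm can have a higher total cost yet a lower cost per case, which is why both the per-case cost and the ICER against the baseline are reported.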

Explicit Abnormality Extraction for Unsupervised Motion Artifact Reduction in Magnetic Resonance Imaging.

Zhou Y, Li H, Liu J, Kong Z, Huang T, Ahn E, Lv Z, Kim J, Feng DD

Jun 1 2025
Motion artifacts compromise the quality of magnetic resonance imaging (MRI) and pose challenges to achieving diagnostic outcomes and image-guided therapies. In recent years, supervised deep learning approaches have emerged as successful solutions for motion artifact reduction (MAR). One disadvantage of these methods is their dependency on acquiring paired sets of motion artifact-corrupted (MA-corrupted) and motion artifact-free (MA-free) MR images for training purposes. Obtaining such image pairs is difficult and therefore limits the application of supervised training. In this paper, we propose a novel UNsupervised Abnormality Extraction Network (UNAEN) to alleviate this problem. Our network is capable of working with unpaired MA-corrupted and MA-free images. It converts the MA-corrupted images to MA-reduced images by extracting abnormalities from the MA-corrupted images using a proposed artifact extractor, which intercepts the residual artifact maps from the MA-corrupted MR images explicitly, and a reconstructor to restore the original input from the MA-reduced images. The performance of UNAEN was assessed by experimenting with various publicly available MRI datasets and comparing the results with state-of-the-art methods. Quantitative evaluation demonstrates the superiority of UNAEN over alternative MAR methods, and visual comparison shows fewer residual artifacts. Our results substantiate the potential of UNAEN as a promising solution applicable in real-world clinical environments, with the capability to enhance diagnostic accuracy and facilitate image-guided therapies.

P2TC: A Lightweight Pyramid Pooling Transformer-CNN Network for Accurate 3D Whole Heart Segmentation.

Cui H, Wang Y, Zheng F, Li Y, Zhang Y, Xia Y

Jun 1 2025
Cardiovascular disease is a leading global cause of death, requiring accurate heart segmentation for diagnosis and surgical planning. Deep learning methods have been demonstrated to achieve superior performance in cardiac structure segmentation. However, there are still limitations in 3D whole heart segmentation, such as inadequate spatial context modeling, difficulty in capturing long-distance dependencies, high computational complexity, and limited representation of local high-level semantic information. To tackle the above problems, we propose a lightweight Pyramid Pooling Transformer-CNN (P2TC) network for accurate 3D whole heart segmentation. The proposed architecture comprises a dual encoder-decoder structure with a 3D pyramid pooling Transformer for multi-scale information fusion and a lightweight large-kernel Convolutional Neural Network (CNN) for local feature extraction. The decoder has two branches for precise segmentation and contextual residual handling. The first branch is used to generate segmentation masks for pixel-level classification based on the features extracted by the encoder, achieving accurate segmentation of cardiac structures. The second branch highlights contextual residuals across slices, enabling the network to better handle variations and boundaries. Extensive experimental results on the Multi-Modality Whole Heart Segmentation (MM-WHS) 2017 challenge dataset demonstrate that P2TC outperforms the most advanced methods, achieving Dice scores of 92.6% and 88.1% in the Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities respectively, surpassing the baseline model by 1.5% and 1.7% and achieving state-of-the-art segmentation results.
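The Dice scores reported above measure overlap between predicted and ground-truth masks, 2|A∩B| / (|A| + |B|). A minimal numpy sketch:

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient for binary masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * np.logical_and(pred, target).sum() / denom

# Toy 2D example; in whole-heart segmentation the masks are 3D volumes,
# scored per cardiac structure and then averaged
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_score(pred, target)  # 2 * 2 / (3 + 3) = 0.666...
```

Because Dice normalizes by the sizes of both masks, it is less dominated by the background class than plain pixel accuracy, which matters for small cardiac structures.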

MedKAFormer: When Kolmogorov-Arnold Theorem Meets Vision Transformer for Medical Image Representation.

Wang G, Zhu Q, Song C, Wei B, Li S

Jun 1 2025
Vision Transformers (ViTs) suffer from high parameter complexity because they rely on Multi-layer Perceptrons (MLPs) for nonlinear representation. This issue is particularly challenging in medical image analysis, where labeled data is limited, leading to inadequate feature representation. Existing methods have attempted to optimize either the patch embedding stage or the non-embedding stage of ViTs, but have struggled to balance effective modeling, parameter complexity, and data availability. Recently, the Kolmogorov-Arnold Network (KAN) was introduced as an alternative to MLPs, offering a potential solution to the large parameter issue in ViTs. However, KAN cannot be directly integrated into ViTs due to challenges such as handling 2D structured data and the curse of dimensionality. To solve this problem, we propose MedKAFormer, the first ViT model to incorporate the Kolmogorov-Arnold (KA) theorem for medical image representation. It includes a Dynamic Kolmogorov-Arnold Convolution (DKAC) layer for flexible nonlinear modeling in the patch embedding stage. Additionally, it introduces a Nonlinear Sparse Token Mixer (NSTM) and a Nonlinear Dynamic Filter (NDF) in the non-embedding stage. These components provide comprehensive nonlinear representation while reducing model overfitting. MedKAFormer reduces parameter complexity by 85.61% compared to ViT-Base and achieves competitive results on 14 medical datasets across various imaging modalities and structures.

Comparing efficiency of an attention-based deep learning network with contemporary radiological workflow for pulmonary embolism detection on CTPA: A retrospective study.

Singh G, Singh A, Kainth T, Suman S, Sakla N, Partyka L, Phatak T, Prasanna P

Jun 1 2025
Pulmonary embolism (PE) is the third most fatal cardiovascular disease in the United States. Currently, Computed Tomography Pulmonary Angiography (CTPA) serves as the diagnostic gold standard for detecting PE. However, its efficacy is limited by factors such as contrast bolus timing, physician-dependent diagnostic accuracy, and time taken for scan interpretation. To address these limitations, we propose an AI-based PE triaging model (AID-PE) designed to predict the presence and key characteristics of PE on CTPA. This model aims to enhance diagnostic accuracy, efficiency, and the speed of PE identification. We trained AID-PE on the RSNA-STR PE CT (RSPECT) dataset (N = 7279) and subsequently tested it on an in-house dataset (n = 106). We evaluated efficiency in a separate dataset (D<sub>4</sub>, n = 200) by comparing the time from scan to report in the standard PE detection workflow versus AID-PE. A comparative analysis showed that AID-PE had an AUC/accuracy of 0.95/0.88. In contrast, a Convolutional Neural Network (CNN) classifier and a CNN-Long Short-Term Memory (LSTM) network without an attention module had AUC/accuracy of 0.5/0.74 and 0.88/0.65, respectively. Our model achieved AUCs of 0.82 and 0.95 for detecting PE on the validation dataset and the independent test set, respectively. On D<sub>4</sub>, AID-PE took an average of 1.32 s to screen for PE across 148 CTPA studies, compared to an average of 40 min in the contemporary workflow. AID-PE outperformed a baseline CNN classifier and a single-stage CNN-LSTM network without an attention module. Additionally, it substantially reduces the time from scan to PE screening result compared to the current radiological workflow.
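The AUC values above can be computed without tracing an explicit ROC curve via the Mann-Whitney formulation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. A minimal sketch with toy scores:

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """AUC = P(score_pos > score_neg) + 0.5 * P(tie)."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

# Toy model scores for PE-positive and PE-negative studies
auc = auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5, like the baseline CNN's, means the scores rank positives no better than chance, even though thresholded accuracy can still look moderate on an imbalanced set.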
