Page 17 of 25248 results

Machine learning-based approaches for distinguishing viral and bacterial pneumonia in paediatrics: A scoping review.

Rickard D, Kabir MA, Homaira N

PubMed | papers | May 8 2025
Pneumonia is the leading cause of hospitalisation and mortality among children under five, particularly in low-resource settings. Accurate differentiation between viral and bacterial pneumonia is essential for guiding appropriate treatment, yet it remains challenging due to overlapping clinical and radiographic features. Advances in machine learning (ML), particularly deep learning (DL), have shown promise in classifying pneumonia using chest X-ray (CXR) images. This scoping review summarises the evidence on ML techniques for classifying viral and bacterial pneumonia using CXR images in paediatric patients. This scoping review was conducted following the Joanna Briggs Institute methodology and the PRISMA-ScR guidelines. A comprehensive search was performed in PubMed, Embase, and Scopus to identify studies involving children (0-18 years) with pneumonia diagnosed through CXR, using ML models for binary or multiclass classification. Data extraction included ML models, dataset characteristics, and performance metrics. A total of 35 studies, published between 2018 and 2025, were included in this review. Of these, 31 studies used the publicly available Kermany dataset, raising concerns about overfitting and limited generalisability to broader, real-world clinical populations. Most studies (n=33) used convolutional neural networks (CNNs) for pneumonia classification. While many models demonstrated promising performance, significant variability was observed due to differences in methodologies, dataset sizes, and validation strategies, complicating direct comparisons. For binary classification (viral vs bacterial pneumonia), a median accuracy of 92.3% (range: 80.8% to 97.9%) was reported. For multiclass classification (healthy, viral pneumonia, and bacterial pneumonia), the median accuracy was 91.8% (range: 76.8% to 99.7%). 
Current evidence is constrained by a predominant reliance on a single dataset and variability in methodologies, which limit the generalisability and clinical applicability of findings. To address these limitations, future research should focus on developing diverse and representative datasets while adhering to standardised reporting guidelines. Such efforts are essential to improve the reliability, reproducibility, and translational potential of machine learning models in clinical settings.

Ultrasound-based deep learning radiomics for enhanced axillary lymph node metastasis assessment: a multicenter study.

Zhang D, Zhou W, Lu WW, Qin XC, Zhang XY, Luo YH, Wu J, Wang JL, Zhao JJ, Zhang CX

PubMed | papers | May 8 2025
Accurate preoperative assessment of axillary lymph node metastasis (ALNM) in breast cancer is crucial for guiding treatment decisions. This study aimed to develop a deep-learning radiomics model for assessing ALNM and to evaluate its impact on radiologists' diagnostic accuracy. This multicenter study included 866 breast cancer patients from 6 hospitals. The data were categorized into training, internal test, external test, and prospective test sets. Deep learning and handcrafted radiomics features were extracted from ultrasound images of primary tumors and lymph nodes. The tumor score and LN score were calculated following feature selection, and a clinical-radiomics model was constructed based on these scores along with clinical-ultrasonic risk factors. The model's performance was validated across the 3 test sets. Additionally, the diagnostic performance of radiologists, with and without model assistance, was evaluated. The clinical-radiomics model demonstrated robust discrimination with AUCs of 0.94, 0.92, 0.91, and 0.95 in the training, internal test, external test, and prospective test sets, respectively. It surpassed the clinical model and single score in all sets (P < .05). Decision curve analysis and clinical impact curves validated the clinical utility of the clinical-radiomics model. Moreover, the model significantly improved radiologists' diagnostic accuracy, with AUCs increasing from 0.71 to 0.82 for the junior radiologist and from 0.75 to 0.85 for the senior radiologist. The clinical-radiomics model effectively predicts ALNM in breast cancer patients using noninvasive ultrasound features. Additionally, it enhances radiologists' diagnostic accuracy, potentially optimizing resource allocation in breast cancer management.
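
The AUC values quoted in the abstract above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values (1 = positive).
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 positive/negative pairs are ordered correctly -> 0.75
```

The quadratic loop is fine for small examples; for large test sets a rank-based implementation from a statistics library would normally be used instead.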

Artificial intelligence applied to ultrasound diagnosis of pelvic gynecological tumors: a systematic review and meta-analysis.

Geysels A, Garofalo G, Timmerman S, Barreñada L, De Moor B, Timmerman D, Froyman W, Van Calster B

PubMed | papers | May 8 2025
To perform a systematic review on artificial intelligence (AI) studies focused on identifying and differentiating pelvic gynecological tumors on ultrasound scans. Studies developing or validating AI models for diagnosing gynecological pelvic tumors on ultrasound scans were eligible for inclusion. We systematically searched PubMed, Embase, Web of Science, and Cochrane Central from their database inception until April 30th, 2024. To assess the quality of the included studies, we adapted the QUADAS-2 risk of bias tool to address the unique challenges of AI in medical imaging. Using multi-level random effects models, we performed a meta-analysis to generate summary estimates of the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. To provide a reference point of current diagnostic support tools for ultrasound examiners, we descriptively compared the pooled performance to that of the well-recognized ADNEX model on external validation. Subgroup analyses were performed to explore sources of heterogeneity. From 9151 records retrieved, 44 studies were eligible: 40 on ovarian, three on endometrial, and one on myometrial pathology. Overall, 95% were at high risk of bias - primarily due to inappropriate study inclusion criteria, the absence of a patient-level split of training and testing image sets, and no calibration assessment. For ovarian tumors, the summary AUC for AI models distinguishing benign from malignant tumors was 0.89 (95% CI: 0.85-0.92). In lower-risk studies (at least three low-risk domains), the summary AUC dropped to 0.87 (0.83-0.90), with deep learning models outperforming radiomics-based machine learning approaches in this subset. Only five studies included an external validation, and six evaluated calibration performance. In a recent systematic review of external validation studies, the ADNEX model had a pooled AUC of 0.93 (0.91-0.94) in studies at low risk of bias. 
Studies on endometrial and myometrial pathologies were reported individually. Although AI models show promising discriminative performances for diagnosing gynecological tumors on ultrasound, most studies have methodological shortcomings that result in a high risk of bias. In addition, the ADNEX model appears to outperform most AI approaches for ovarian tumors. Future research should emphasize robust study designs - ideally large, multicenter, and prospective cohorts that mirror real-world populations - along with external validation, proper calibration, and standardized reporting. This study was pre-registered with Open Science Framework (OSF): https://doi.org/10.17605/osf.io/bhkst.
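
One source of bias named in the abstract above is the absence of a patient-level split between training and testing image sets. The sketch below shows one way such a split could be done, so that all images from a given patient land on the same side. The function name `patient_level_split`, the image and patient identifiers, and the 40% test fraction are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def patient_level_split(image_to_patient, test_fraction=0.2, seed=0):
    """Split image IDs into train/test lists with no patient in both sets.

    image_to_patient: dict mapping image ID -> patient ID.
    The split is made over patients, not over images, so images from one
    patient cannot leak from the training set into the test set.
    """
    by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    patients = sorted(by_patient)            # sort first for a reproducible shuffle
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [img for p in patients if p not in test_patients for img in by_patient[p]]
    test = [img for p in patients if p in test_patients for img in by_patient[p]]
    return train, test


# Hypothetical toy data: seven images from five patients.
images = {
    "img1": "A", "img2": "A", "img3": "B", "img4": "C",
    "img5": "C", "img6": "D", "img7": "E",
}
train_ids, test_ids = patient_level_split(images, test_fraction=0.4)
print(train_ids, test_ids)
```

A random split over individual images would, by contrast, routinely place two images of the same tumor on opposite sides of the split and inflate test performance.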

Advancement of an automatic segmentation pipeline for metallic artifact removal in post-surgical ACL MRI.

Barnes DA, Murray CJ, Molino J, Beveridge JE, Kiapour AM, Murray MM, Fleming BC

PubMed | papers | May 8 2025
Magnetic resonance imaging (MRI) has the potential to identify post-operative risk factors for re-tearing an anterior cruciate ligament (ACL) using a combination of imaging signal intensity (SI) and cross-sectional area measurements of the healing ACL. During surgery, micro-debris can result from drilling the osseous tunnels for graft and/or suture insertion. The debris presents a limitation when using post-surgical MRI to assess reinjury risk as it causes rapid magnetic field variations during acquisition, leading to signal loss within a voxel. The present study demonstrates how K-means clustering can refine an automatic segmentation algorithm to remove the lost signal intensity values induced by the artifacts in the image. MRI data were obtained from 82 patients enrolled in three prospective clinical trials of ACL surgery. Constructive Interference in Steady State MRIs were collected at 6 months post-operation. Manual segmentation of the ACL with metallic artifacts removed served as the gold standard. The accuracy of the automatic ACL segmentations was compared using Dice coefficient, sensitivity, and precision. The performance of the automatic segmentation was comparable to manual segmentation (Dice coefficient = .81, precision = .81, sensitivity = .82). The normalized average signal intensity was calculated as 1.06 (±0.25) for the automatic and 1.04 (±0.23) for the manual segmentation, yielding a difference of 2%. These metrics emphasize the automatic segmentation model's ability to precisely capture ACL signal intensity while excluding artifact regions. The automatic artifact segmentation model described here could enhance the clinical utility of quantitative MRI (qMRI) by allowing for more accurate and time-efficient segmentations of the ACL.
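
The Dice coefficient, precision, and sensitivity reported above are all derived from voxel-wise counts of true positives, false positives, and false negatives. As a rough illustration, the sketch below computes them for two flattened binary masks; the function name `segmentation_metrics` and the five-voxel toy masks are assumptions made for this example and have no connection to the study data.

```python
def segmentation_metrics(pred, truth):
    """Dice, precision, and sensitivity for two flattened binary masks.

    pred, truth: equal-length sequences of 0/1 (or False/True) voxel labels,
    where `truth` is the reference (for example, manual) segmentation.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # labeled in both
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # predicted only
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # reference only

    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0      # two empty masks agree perfectly
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return {"dice": dice, "precision": precision, "sensitivity": sensitivity}


# Hypothetical five-voxel masks (not study data).
automatic = [1, 1, 0, 0, 1]
manual = [1, 0, 0, 1, 1]
print(segmentation_metrics(automatic, manual))
```

In this toy case two voxels overlap, one is predicted but absent from the reference, and one is missed, so all three metrics come out to 2/3.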

Robust Computation of Subcortical Functional Connectivity Guided by Quantitative Susceptibility Mapping: An Application in Parkinson's Disease Diagnosis.

Qin J, Wu H, Wu C, Guo T, Zhou C, Duanmu X, Tan S, Wen J, Zheng Q, Yuan W, Zhu Z, Chen J, Wu J, He C, Ma Y, Liu C, Xu X, Guan X, Zhang M

PubMed | papers | May 8 2025
Previous resting state functional MRI (rs-fMRI) analyses of the basal ganglia in Parkinson's disease heavily relied on T1-weighted imaging (T1WI) atlases. However, subcortical structures are characterized by subtle contrast differences, making their accurate delineation challenging on T1WI. In this study, we aimed to introduce and validate a method that incorporates quantitative susceptibility mapping (QSM) into the rs-fMRI analytical pipeline to achieve precise subcortical nuclei segmentation and improve the stability of resting state functional connectivity (RSFC) measurements in Parkinson's disease. A total of 321 participants (148 patients with Parkinson's disease and 173 normal controls) were enrolled. We performed cross-modal registration at the individual level for rs-fMRI to QSM (FUNC2QSM) and T1WI (FUNC2T1), respectively. The consistency and accuracy of RSFC measurements from the two registration approaches were assessed by intraclass correlation coefficient and mutual information. Bootstrap analysis was performed to validate the stability of the RSFC differences between Parkinson's disease and normal controls. RSFC-based machine learning models were constructed for Parkinson's disease classification, using optimized hyperparameters (RandomizedSearchCV with 5-fold cross-validation). The consistency of RSFC measurements between the two registration methods was poor, whereas the QSM-guided approach showed better mutual information values, suggesting higher registration accuracy. The disruptions of RSFC identified with the QSM-guided approach were more stable and reliable, as confirmed by bootstrap analysis. In classification models, the QSM-guided method consistently outperformed the T1WI-guided method, achieving higher test-set ROC-AUC values (FUNC2QSM: 0.87-0.90, FUNC2T1: 0.67-0.70). 
The QSM-guided approach effectively enhanced the accuracy of subcortical segmentation and the stability of RSFC measurement, thus facilitating future biomarker development in Parkinson's disease.

Impact of tracer uptake rate on quantification accuracy of myocardial blood flow in PET: A simulation study.

Hong X, Sanaat A, Salimi Y, Nkoulou R, Arabi H, Lu L, Zaidi H

PubMed | papers | May 8 2025
Cardiac perfusion PET is commonly used to assess ischemia and cardiovascular risk, which enables quantitative measurements of myocardial blood flow (MBF) through kinetic modeling. However, the estimation of kinetic parameters is challenging due to the noisy nature of short dynamic frames and limited sample data points. This work aimed to investigate the errors in MBF estimation in PET through a simulation study and to evaluate different parameter estimation approaches, including a deep learning (DL) method. Simulated studies were generated using digital phantoms based on cardiac segmentations from 55 clinical CT images. We employed the irreversible 2-tissue compartmental model and simulated dynamic ¹³N-ammonia PET scans under both rest and stress conditions (220 cases each). The simulations covered a rest K₁ range of 0.6 to 1.2 and a stress K₁ range of 1.2 to 3.6 (unit: mL/min/g) in the myocardium. A transformer-based DL model was trained on the simulated dataset to predict parametric images (PIMs) from noisy PET image frames and was validated using 5-fold cross-validation. We compared the DL method with the voxel-wise nonlinear least squares (NLS) fitting applied to the dynamic images, using either Gaussian filter (GF) smoothing (GF-NLS) or a dynamic nonlocal means (DNLM) algorithm for denoising (DNLM-NLS). Two patients with coronary CT angiography (CTA) and fractional flow reserve (FFR) were enrolled to test the feasibility of applying DL models on clinical PET data. The DL method showed clearer image structures with reduced noise compared to the traditional NLS-based methods. In terms of mean absolute relative error (MARE), as the rest K₁ values increased from 0.6 to 1.2 mL/min/g, the overall bias in myocardium K₁ estimates decreased from approximately 58% to 45% for the NLS-based methods while the DL method showed a reduction in MARE from 42% to 18%. 
For stress data, as the stress K₁ decreased from 3.6 to 1.2 mL/min/g, the MARE increased from 30% to 70% for the GF-NLS method. In contrast, both the DNLM-NLS (average: 42%) and the DL methods (average: 20%) demonstrated significantly smaller MARE changes as stress K₁ varied. Regarding the regional mean bias (±standard deviation), the GF-NLS method had a bias of 6.30% (±8.35%) of rest K₁, compared to 1.10% (±8.21%) for DNLM-NLS and 6.28% (±14.05%) for the DL method. For the stress K₁, the GF-NLS showed a mean bias of 10.72% (±9.34%) compared to 1.69% (±8.82%) for DNLM-NLS and -10.55% (±9.81%) for the DL method. This study showed that an increase in the tracer uptake rate (K₁) corresponded to improved accuracy and precision in MBF quantification, whereas lower tracer uptake resulted in higher noise in dynamic PET and poorer parameter estimates. Utilizing denoising techniques or DL approaches can mitigate noise-induced bias in PET parametric imaging.
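
The abstract above reports both a mean absolute relative error (MARE) and a signed mean bias without giving formulas. The sketch below assumes the usual definitions, with relative error taken as (estimate - truth) / truth and expressed in percent; the function names and the K1 values are hypothetical and are not taken from the study.

```python
def relative_errors(estimated, true):
    """Signed relative error (estimate - truth) / truth for each paired value."""
    if len(estimated) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in true):
        raise ValueError("true values must be non-zero")
    return [(e - t) / t for e, t in zip(estimated, true)]


def mare_percent(estimated, true):
    """Mean absolute relative error in percent (assumed definition)."""
    errs = relative_errors(estimated, true)
    return 100.0 * sum(abs(r) for r in errs) / len(errs)


def mean_bias_percent(estimated, true):
    """Signed mean relative bias in percent (assumed definition)."""
    errs = relative_errors(estimated, true)
    return 100.0 * sum(errs) / len(errs)


# Hypothetical K1 values in mL/min/g (not study data).
true_k1 = [1.0, 2.0]
estimated_k1 = [1.1, 1.8]
print(mare_percent(estimated_k1, true_k1))       # about 10: each estimate is off by 10%
print(mean_bias_percent(estimated_k1, true_k1))  # about 0: the +10% and -10% errors cancel
```

The toy case shows why the two numbers are reported separately: a method can have a near-zero mean bias while individual estimates are still far from the truth.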

A hybrid AI method for lung cancer classification using explainable AI techniques.

Shivwanshi RR, Nirala NS

PubMed | papers | May 8 2025
The use of Artificial Intelligence (AI) methods for the analysis of CT (computed tomography) images has greatly contributed to the development of an effective computer-assisted diagnosis (CAD) system for lung cancer (LC). However, complex structures, multiple radiographic interrelations, and the dynamic locations of abnormalities within lung CT images make extracting relevant information to process and implement LC CAD systems difficult. These prominent problems are addressed in this paper by presenting a hybrid method of LC malignancy classification, which may help researchers and experts properly engineer the model's performance by observing how the model makes decisions. The proposed methodology is named IncCat-LCC: Explainer (Inception Net Cat Boost LC Classification: Explainer), which consists of feature extraction (FE) using the handcrafted radiomic Feature (HcRdF) extraction technique, InceptionNet CNN Feature (INCF) extraction, Vision Transformer Feature (ViTF) extraction, XGBOOST (XGB)-based feature selection, and the GPU-based CATBOOST (CB) classification technique. For lung nodule multiclass malignancy classification, the proposed framework achieves its highest scores for accuracy, precision, recall, F1 score, specificity, and area under the ROC curve of 96.74%, 93.68%, 96.74%, 95.19%, 98.47%, and 99.76%, respectively, for classifying highly normal class. Observing the explainable artificial intelligence (XAI) explanations will help readers understand the model performance and the statistical outcomes of the evaluation parameter. The work presented in this article may improve the existing LC CAD system and help assess the important parameters using XAI to recognize the factors contributing to enhanced performance and reliability.

Machine learning model for diagnosing salivary gland adenoid cystic carcinoma based on clinical and ultrasound features.

Su HZ, Li ZY, Hong LC, Wu YH, Zhang F, Zhang ZB, Zhang XD

PubMed | papers | May 8 2025
To develop and validate machine learning (ML) models for diagnosing adenoid cystic carcinoma (ACC) of the salivary glands based on clinical and ultrasound (US) features. A total of 365 patients with ACC or non-ACC of the salivary glands treated at two centers were enrolled in training, internal validation, and external validation cohorts. Synthetic minority oversampling technique was used to address the class imbalance. The least absolute shrinkage and selection operator (LASSO) regression identified optimal features, which were subsequently utilized to construct predictive models employing five ML algorithms. The performance of the models was evaluated across a comprehensive array of learning metrics, prominently the area under the receiver operating characteristic curve (AUC). Through LASSO regression analysis, six key features (sex, pain symptoms, number, cystic areas, rat tail sign, and polar vessel) were identified and subsequently utilized to develop five ML models. Among these models, the support vector machine (SVM) model demonstrated superior performance, achieving the highest AUCs of 0.899 and 0.913, accuracy of 90.54% and 91.53%, and F1 scores of 0.774 and 0.783 in the internal and external validation cohorts, respectively. Decision curve analysis further revealed that the SVM model offered enhanced clinical utility compared to the other models. The ML model based on clinical and US features provides an accurate and noninvasive method for distinguishing ACC from non-ACC. This machine learning model, constructed based on clinical and ultrasound characteristics, serves as a valuable tool for the identification of salivary gland adenoid cystic carcinoma. Rat tail sign and polar vessel on US predict adenoid cystic carcinoma (ACC). Machine learning models based on clinical and US features can identify ACC. The support vector machine model performed robustly and accurately.

Cross-Institutional Evaluation of Large Language Models for Radiology Diagnosis Extraction: A Prompt-Engineering Perspective.

Moassefi M, Houshmand S, Faghani S, Chang PD, Sun SH, Khosravi B, Triphati AG, Rasool G, Bhatia NK, Folio L, Andriole KP, Gichoya JW, Erickson BJ

PubMed | papers | May 8 2025
The rapid evolution of large language models (LLMs) offers promising opportunities for radiology report annotation, aiding in determining the presence of specific findings. This study evaluates the effectiveness of a human-optimized prompt in labeling radiology reports across multiple institutions using LLMs. Six distinct institutions collected 500 radiology reports: 100 in each of 5 categories. A standardized Python script was distributed to participating sites, allowing the use of one common locally executed LLM with a standard human-optimized prompt. The script executed the LLM's analysis for each report and compared predictions to reference labels provided by local investigators. Model performance was measured by accuracy, and results were aggregated centrally. The human-optimized prompt demonstrated high consistency across sites and pathologies. Preliminary analysis indicates significant agreement between the LLM's outputs and investigator-provided reference labels across multiple institutions. At one site, eight LLMs were systematically compared, with Llama 3.1 70b achieving the highest performance in accurately identifying the specified findings. Comparable performance with Llama 3.1 70b was observed at two additional centers, demonstrating the model's robust adaptability to variations in report structures and institutional practices. Our findings illustrate the potential of optimized prompt engineering in leveraging LLMs for cross-institutional radiology report labeling. This approach is straightforward while maintaining high accuracy and adaptability. Future work will explore model robustness to diverse report structures and further refine prompts to improve generalizability.

Predicting treatment response to systemic therapy in advanced gallbladder cancer using multiphase enhanced CT images.

Wu J, Zheng Z, Li J, Shen X, Huang B

PubMed | papers | May 8 2025
Accurate estimation of treatment response can help clinicians identify patients who would potentially benefit from systemic therapy. This study aimed to develop and externally validate a model for predicting treatment response to systemic therapy in advanced gallbladder cancer (GBC). We recruited 399 eligible GBC patients across four institutions. Multivariable logistic regression analysis was performed to identify independent clinical factors related to therapeutic efficacy. A deep learning (DL) radiomics signature was developed for predicting treatment response using multiphase enhanced CT images. Then, the DL radiomic-clinical (DLRSC) model was built by combining the DL signature and significant clinical factors, and its predictive performance was evaluated using area under the curve (AUC). Gradient-weighted class activation mapping analysis was performed to help clinicians better understand the predictive results. Furthermore, patients were stratified into low- and high-score groups by the DLRSC model. Progression-free survival (PFS) and overall survival (OS) were compared between the two groups. Multivariable analysis revealed that tumor size was a significant predictor of efficacy. The DLRSC model showed great predictive performance, with AUCs of 0.86 (95% CI, 0.82-0.89) and 0.84 (95% CI, 0.80-0.87) in the internal and external test datasets, respectively. This model showed great discrimination, calibration, and clinical utility. Moreover, Kaplan-Meier survival analysis revealed that patients in the low-score group, who were predicted by the DLRSC model to be insensitive to systemic therapy, had worse PFS and OS. The DLRSC model allows for predicting treatment response in advanced GBC patients receiving systemic therapy. The survival benefit provided by the DLRSC model was also assessed.
Question: No effective tools exist for identifying patients who would potentially benefit from systemic therapy in clinical practice. Findings: Our combined model allows for predicting treatment response to systemic therapy in advanced gallbladder cancer. Clinical relevance: With the help of this model, clinicians could inform patients of the risk of potential ineffective treatment. Such a strategy can reduce unnecessary adverse events and effectively help reallocate societal healthcare resources.