Page 141 of 1781774 results

Diagnosis of trigeminal neuralgia based on plain skull radiography using convolutional neural network.

Han JH, Ji SY, Kim M, Kwon JE, Park JB, Kang H, Hwang K, Kim CY, Kim T, Jeong HG, Ahn YH, Chung HT

pubmed logopapers · May 29, 2025
This study aimed to determine whether trigeminal neuralgia (TN) can be diagnosed using convolutional neural networks (CNNs) based on plain X-ray skull images. A labeled dataset of 166 skull images from patients aged over 16 years with TN was compiled, alongside a control dataset of 498 images from patients with unruptured intracranial aneurysms. The images were randomly partitioned into training, validation, and test datasets in a 6:2:2 ratio. Classifier performance was assessed using accuracy and the area under the receiver operating characteristic curve (AUROC). Gradient-weighted class activation mapping was applied to identify regions of interest. External validation was conducted using a dataset obtained from another institution. The CNN achieved an overall accuracy of 87.2%, with sensitivity and specificity of 0.72 and 0.91, respectively, and an AUROC of 0.90 on the test dataset. In most cases, the sphenoid body and clivus were identified as key areas for predicting TN. Validation on the external dataset yielded an accuracy of 71.0%, highlighting the potential of deep learning-based models in distinguishing skull X-ray images of patients with TN from those of control individuals. Our preliminary results suggest that plain X-ray could be used as an adjunct to conventional MRI, ideally with CISS sequences, to aid in the clinical diagnosis of TN. Further refinement could establish this approach as a valuable screening tool.
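The evaluation protocol described above (a stratified 6:2:2 split plus accuracy, sensitivity, specificity, and AUROC) can be sketched as follows; the function and variable names are illustrative, not from the study:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, confusion_matrix

def split_6_2_2(X, y, seed=0):
    """Partition a labeled dataset into 60/20/20 train/val/test splits,
    stratified so both classes appear in every split."""
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

def binary_report(y_true, y_score, threshold=0.5):
    """AUROC plus sensitivity/specificity at a fixed decision threshold."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "auroc": roc_auc_score(y_true, y_score),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

The stratification matters here because the cohort is imbalanced (166 TN vs. 498 control images).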

Standardizing Heterogeneous MRI Series Description Metadata Using Large Language Models.

Kamel PI, Doo FX, Savani D, Kanhere A, Yi PH, Parekh VS

pubmed logopapers · May 29, 2025
MRI metadata, particularly free-text series descriptions (SDs) used to identify sequences, are highly heterogeneous due to variable inputs by manufacturers and technologists. This variability poses challenges in correctly identifying series for hanging protocols and dataset curation. The purpose of this study was to evaluate the ability of large language models (LLMs) to automatically classify MRI SDs. We analyzed non-contrast brain MRIs performed between 2016 and 2022 at our institution, identifying all unique SDs in the metadata. A practicing neuroradiologist manually classified the SD text into: "T1," "T2," "T2/FLAIR," "SWI," "DWI," "ADC," or "Other." Then, various LLMs, including GPT-3.5 Turbo, GPT-4, GPT-4o, Llama 3 8B, and Llama 3 70B, were asked to classify each SD into one of the sequence categories. Model performances were compared to ground truth classification using area under the curve (AUC) as the primary metric. Additionally, GPT-4o was tasked with generating regular expression templates to match each category. In 2510 MRI brain examinations, there were 1395 unique SDs, with 727/1395 (52.1%) appearing only once, indicating high variability. GPT-4o demonstrated the highest performance, achieving an average AUC of 0.983 ± 0.020 for all series with detailed prompting. GPT models significantly outperformed Llama models, with smaller differences within the GPT family. Regular expression generation was inconsistent, demonstrating an average AUC of 0.774 ± 0.161 for all sequences. Our findings suggest that LLMs are effective for interpreting and standardizing heterogeneous MRI SDs.
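As a point of comparison for the regex-template experiment, a hand-written regex baseline for SD classification might look like the sketch below; the patterns are illustrative assumptions, not the GPT-4o-generated templates from the study:

```python
import re

# Hypothetical hand-written patterns; the study had GPT-4o generate its own.
SEQUENCE_PATTERNS = {
    "T2/FLAIR": re.compile(r"flair", re.I),
    "SWI":      re.compile(r"\bswi\b|susceptibility", re.I),
    "DWI":      re.compile(r"\bdwi\b|diffusion", re.I),
    "ADC":      re.compile(r"\badc\b", re.I),
    "T1":       re.compile(r"\bt1\b|mprage|bravo", re.I),
    "T2":       re.compile(r"\bt2\b", re.I),
}

def classify_sd(series_description: str) -> str:
    """Match a free-text series description against ordered regex templates.
    FLAIR/SWI/DWI/ADC are checked before T1/T2 so that e.g. 'AX T2 FLAIR'
    resolves to 'T2/FLAIR' rather than 'T2'."""
    for label, pattern in SEQUENCE_PATTERNS.items():
        if pattern.search(series_description):
            return label
    return "Other"
```

The ordering trick above hints at why the generated templates were inconsistent: regex-based matching is brittle whenever one sequence name is a substring of another.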

ROC Analysis of Biomarker Combinations in Fragile X Syndrome-Specific Clinical Trials: Evaluating Treatment Efficacy via Exploratory Biomarkers

Norris, J. E., Berry-Kravis, E. M., Harnett, M. D., Reines, S. A., Reese, M., Auger, E. K., Outterson, A., Furman, J., Gurney, M. E., Ethridge, L. E.

medrxiv logopreprint · May 29, 2025
Fragile X Syndrome (FXS) is a rare neurodevelopmental disorder caused by a trinucleotide repeat expansion in the 5' untranslated region of the FMR1 gene. FXS is characterized by intellectual disability, anxiety, sensory hypersensitivity, and difficulties with executive function. A recent phase 2 placebo-controlled clinical trial assessing BPN14770, a first-in-class phosphodiesterase 4D allosteric inhibitor, in 30 adult males (age 18-41 years) with FXS demonstrated cognitive improvements on the NIH Toolbox Cognitive Battery in domains related to language, and caregiver reports of improvement in both daily functioning and language. However, individual physiological measures from electroencephalography (EEG) demonstrated only marginal significance for trial efficacy. A secondary analysis of resting state EEG data collected as part of the phase 2 clinical trial evaluating BPN14770 was conducted using a machine learning classification algorithm to classify trial conditions (i.e., baseline, drug, placebo) via linear EEG variable combinations. The algorithm identified a composite of peak alpha frequencies (PAF) across multiple brain regions as a potential biomarker demonstrating BPN14770 efficacy. Increased PAF from baseline was associated with drug but not placebo. Given the relationship between PAF and cognitive function among typically developing adults and those with intellectual disability, as well as previously reported reductions in alpha frequency and power in FXS, PAF represents a potential physiological measure of BPN14770 efficacy.
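PAF, the building block of the composite biomarker above, is simply the frequency of maximal spectral power within the alpha band. A minimal sketch, assuming a Welch power spectral density and an 8-13 Hz band (not the trial's exact pipeline):

```python
import numpy as np
from scipy.signal import welch

def peak_alpha_frequency(eeg, fs, band=(8.0, 13.0)):
    """Return the frequency (Hz) of maximal power within the alpha band,
    estimated from the Welch power spectral density of a single channel."""
    freqs, psd = welch(eeg, fs=fs, nperseg=min(len(eeg), 2 * int(fs)))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return freqs[mask][np.argmax(psd[mask])]
```

With a 2-second Welch segment, the frequency resolution is 0.5 Hz, which is typically adequate for detecting a drug-related PAF shift.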

Comparative assessment of fairness definitions and bias mitigation strategies in machine learning-based diagnosis of Alzheimer's disease from MR images

Maria Eleftheria Vlontzou, Maria Athanasiou, Christos Davatzikos, Konstantina S. Nikita

arxiv logopreprint · May 29, 2025
The present study performs a comprehensive fairness analysis of machine learning (ML) models for the diagnosis of Mild Cognitive Impairment (MCI) and Alzheimer's disease (AD) from MRI-derived neuroimaging features. Biases associated with age, race, and gender in a multi-cohort dataset, as well as the influence of proxy features encoding these sensitive attributes, are investigated. The reliability of various fairness definitions and metrics in the identification of such biases is also assessed. Based on the most appropriate fairness measures, a comparative analysis of widely used pre-processing, in-processing, and post-processing bias mitigation strategies is performed. Moreover, a novel composite measure is introduced to quantify the trade-off between fairness and performance by considering the F1-score and the equalized odds ratio, making it appropriate for medical diagnostic applications. The obtained results reveal the existence of biases related to age and race, while no significant gender bias is observed. The deployed mitigation strategies yield varying improvements in terms of fairness across the different sensitive attributes and studied subproblems. For race and gender, Reject Option Classification improves equalized odds by 46% and 57%, respectively, and achieves harmonic mean scores of 0.75 and 0.80 in the MCI versus AD subproblem, whereas for age, in the same subproblem, adversarial debiasing yields the highest equalized odds improvement of 40% with a harmonic mean score of 0.69. Insights are provided into how variations in AD neuropathology and risk factors, associated with demographic characteristics, influence model fairness.
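The two quantities at the core of the analysis above can be sketched as follows. The equalized odds ratio compares group-wise true- and false-positive rates; the trade-off measure is shown as a harmonic mean of F1 and that ratio, which is an illustrative reading of the paper's composite, not its exact formula:

```python
import numpy as np

def equalized_odds_ratio(y_true, y_pred, groups):
    """Min/max ratio of group-wise TPR and FPR; 1.0 means perfect parity
    under equalized odds, smaller values mean more disparity. Assumes every
    group contains positives and negatives and the max rates are nonzero."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    tprs, fprs = [], []
    for g in np.unique(groups):
        m = groups == g
        tprs.append(y_pred[m & (y_true == 1)].mean())  # group TPR
        fprs.append(y_pred[m & (y_true == 0)].mean())  # group FPR
    return min(min(tprs) / max(tprs), min(fprs) / max(fprs))

def fairness_performance_tradeoff(f1, eo_ratio):
    """Harmonic mean of F1 and the equalized-odds ratio: high only when the
    model is simultaneously accurate and fair."""
    return 2 * f1 * eo_ratio / (f1 + eo_ratio)
```

The harmonic mean penalizes imbalance, so a model cannot score well by trading all of its fairness for accuracy or vice versa.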

Prediction of clinical stages of cervical cancer via machine learning integrated with clinical features and ultrasound-based radiomics.

Zhang M, Zhang Q, Wang X, Peng X, Chen J, Yang H

pubmed logopapers · May 29, 2025
To investigate the predictive value of models constructed by combining machine learning (ML) with clinical features and ultrasound-based radiomics for the clinical staging of cervical cancer. General clinical and ultrasound data of 227 patients with cervical cancer who underwent transvaginal ultrasonography were retrospectively analyzed. Radiomics features were extracted from regions of interest (ROIs) in the original and derived images, and feature screening was performed. The selected features were used to construct the radiomics model and the Radscore formula. Prediction models were developed in Python using several ML algorithms on an integrated dataset of clinical features and ultrasound radiomics. Model performance was evaluated via AUC, and calibration curves and clinical decision curves were plotted to assess model efficacy. The model developed with a support vector machine (SVM) emerged as the superior model. Integrating clinical characteristics with ultrasound radiomics, it showed notable performance in both the training and validation datasets. Specifically, in the training set, the model obtained an AUC of 0.88 (95% confidence interval (CI): 0.83-0.93), alongside 0.84 accuracy, 0.68 sensitivity, and 0.91 specificity. When validated, the model maintained an AUC of 0.77 (95% CI: 0.63-0.88), with 0.77 accuracy, 0.62 sensitivity, and 0.83 specificity. The calibration curve aligned closely with the perfect calibration line, and clinical decision curve analysis showed that the model offers clinical utility across a wide range of thresholds. The clinical- and radiomics-based SVM model provides a noninvasive tool for predicting cervical cancer stage, integrating ultrasound radiomics and key clinical factors (age, abortion history) to improve risk stratification. This approach could guide personalized treatment (surgery vs. chemoradiation) and optimize staging accuracy, particularly in resource-limited settings where advanced imaging is scarce.
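A minimal scikit-learn sketch of the clinical-plus-radiomics SVM described above; the feature layout and preprocessing are assumptions, not the authors' exact pipeline:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def fit_clinical_radiomics_svm(radiomics, clinical, y):
    """Concatenate screened radiomics features with clinical features
    (e.g. age, abortion history) and fit an RBF-kernel SVM. Standardization
    matters because SVMs are sensitive to feature scale, and probability
    outputs are enabled so an ROC curve can be drawn afterwards."""
    X = np.hstack([radiomics, clinical])
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="rbf", probability=True, random_state=0),
    )
    model.fit(X, y)
    return model
```

After fitting, `model.predict_proba(X)[:, 1]` gives scores for AUC, calibration, and decision-curve analysis.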

Automated classification of midpalatal suture maturation stages from CBCTs using an end-to-end deep learning framework.

Milani OH, Mills L, Nikho A, Tliba M, Allareddy V, Ansari R, Cetin AE, Elnagar MH

pubmed logopapers · May 29, 2025
Accurate classification of midpalatal suture maturation stages is critical for orthodontic diagnosis, treatment planning, and the assessment of maxillary growth. Cone Beam Computed Tomography (CBCT) imaging offers detailed insights into this craniofacial structure but poses unique challenges for deep learning image recognition model design due to its high dimensionality, noise artifacts, and variability in image quality. To address these challenges, we propose a novel technique that highlights key image features through a simple filtering process to improve image clarity prior to analysis, thereby enhancing the learning process and better aligning with the distribution of the input data domain. Our preprocessing steps include region-of-interest extraction, followed by high-pass and Sobel filtering to emphasize low-level features. The feature extraction integrates Convolutional Neural Network (CNN) architectures, such as EfficientNet and ResNet18, alongside our novel Multi-Filter Convolutional Residual Attention Network (MFCRAN) enhanced with Discrete Cosine Transform (DCT) layers. Moreover, to better capture the inherent order within the data classes, we augment the supervised training process with a ranking loss that attends to the relationship within the label domain. Furthermore, to adhere to diagnostic constraints while training the model, we introduce a tailored data augmentation strategy to improve classification accuracy and robustness. To validate our method, we employed a k-fold cross-validation protocol on a private dataset comprising 618 CBCT images, annotated into five stages (A, B, C, D, and E) by expert evaluators. The experimental results demonstrate the effectiveness of our proposed approach, achieving the highest classification accuracy of 79.02%, significantly outperforming competing architectures, which achieved accuracies ranging from 71.87% to 78.05%. This work introduces a novel and fully automated framework for midpalatal suture maturation classification, marking a substantial advancement in orthodontic diagnostics and treatment planning.
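The preprocessing stage (high-pass filtering followed by Sobel edge emphasis on the extracted ROI) could be sketched as below, assuming SciPy and a Gaussian-subtraction high-pass; the kernel parameters are illustrative, not the paper's:

```python
import numpy as np
from scipy import ndimage

def emphasize_edges(roi, sigma=3.0):
    """Preprocessing sketch: high-pass filter the ROI (image minus its
    Gaussian blur), then take the Sobel gradient magnitude so low-level
    edge features dominate the input to the CNN."""
    roi = roi.astype(np.float64)
    high_pass = roi - ndimage.gaussian_filter(roi, sigma)
    gx = ndimage.sobel(high_pass, axis=0)  # gradient along rows
    gy = ndimage.sobel(high_pass, axis=1)  # gradient along columns
    return np.hypot(gx, gy)
```

On a noisy CBCT slice, the Gaussian subtraction suppresses slowly varying intensity (shading, beam artifacts) before the Sobel step sharpens suture boundaries.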

Ultrasound image-based contrastive fusion non-invasive liver fibrosis staging algorithm.

Dong X, Tan Q, Xu S, Zhang J, Zhou M

pubmed logopapers · May 29, 2025
The diagnosis of liver fibrosis is usually based on histopathological examination of liver puncture specimens. Although liver puncture is accurate, it carries invasive risks and high economic costs, which are difficult for some patients to accept. Therefore, this study uses deep learning to build a liver fibrosis diagnosis model that achieves non-invasive staging of liver fibrosis, avoids complications, and reduces costs. Pure liver parenchyma image sections were obtained by ultrasound examination and, with patient consent, the degree of liver fibrosis indicated by the ultrasound data was graded against the results of percutaneous liver biopsy. Our method creatively introduces the concept of a Fibrosis Contrast Layer (FCL), which helps the model capture the significant differences among the characteristics of the various grades of liver fibrosis. Finally, through label fusion (LF), the characteristics of liver specimens at the same fibrosis stage are abstracted and fused to improve the accuracy and stability of the diagnostic model. The resulting FCLLF model achieved an accuracy of 85.6%, outperforming baseline models such as ResNet (81.9%), InceptionNet (80.9%), and VGG (80.8%). Even under a small-sample condition (30% of the data), it maintained an accuracy of 84.8% and remained more stable than the traditional deep-learning models, whose performance declined sharply. Our proposed FCLLF model thus effectively improves the accuracy and stability of liver fibrosis staging using non-invasive ultrasound imaging.
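The label-fusion (LF) step can be read as pooling the features of same-stage specimens into one prototype per stage; a minimal interpretation of that idea (the paper's LF is more elaborate than this sketch):

```python
import numpy as np

def label_fusion(features, labels):
    """Minimal label-fusion sketch: average the feature vectors of specimens
    that share a fibrosis stage, yielding one abstracted prototype per stage
    against which new cases can be compared."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    return {s: features[labels == s].mean(axis=0) for s in np.unique(labels)}
```

Pooling within a stage suppresses specimen-level noise, which is one plausible reason the fused model stays stable in the small-sample setting.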

Deep Modeling and Optimization of Medical Image Classification

Yihang Wu, Muhammad Owais, Reem Kateb, Ahmad Chaddad

arxiv logopreprint · May 29, 2025
Deep models, such as convolutional neural networks (CNNs) and vision transformers (ViTs), demonstrate remarkable performance in image classification. However, these deep models require large amounts of data to fine-tune, which is impractical in the medical domain due to data privacy issues. Furthermore, despite the feasible performance of contrastive language-image pre-training (CLIP) in the natural domain, the potential of CLIP has not been fully investigated in the medical field. To face these challenges, we considered three scenarios: 1) we introduce a novel CLIP variant using four CNNs and eight ViTs as image encoders for the classification of brain cancer and skin cancer, 2) we combine 12 deep models with two federated learning techniques to protect data privacy, and 3) we involve traditional machine learning (ML) methods to improve the generalization ability of those deep models on unseen domain data. The experimental results indicate that maxvit shows the highest averaged (AVG) test metrics (AVG = 87.03%) on the HAM10000 dataset with multimodal learning, while convnext_l demonstrates a remarkable test F1-score of 83.98% compared to 81.33% for swin_b in the federated learning setting. Furthermore, the use of a support vector machine (SVM) can improve the overall test metrics by an AVG of ~2% for the swin transformer series on ISIC2018. Our codes are available at https://github.com/AIPMLab/SkinCancerSimulation.
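The federated-learning scenario above rests on aggregating model updates rather than sharing images; the canonical aggregation rule is federated averaging (FedAvg). A minimal sketch, assuming each client's model is a list of weight arrays (the abstract does not specify which two FL techniques were used):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: each client's parameters are weighted by its
    local dataset size and averaged layer by layer, so raw patient images
    never leave the client."""
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum((n / total) * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]
```

In practice this aggregation runs once per communication round, with clients fine-tuning locally in between.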

Can Large Language Models Challenge CNNs in Medical Image Analysis?

Shibbir Ahmed, Shahnewaz Karim Sakib, Anindya Bijoy Das

arxiv logopreprint · May 29, 2025
This study presents a multimodal AI framework designed for precisely classifying medical diagnostic images. Utilizing publicly available datasets, the proposed system compares the strengths of convolutional neural networks (CNNs) and different large language models (LLMs). This in-depth comparative analysis highlights key differences in diagnostic performance, execution efficiency, and environmental impact. Model evaluation was based on accuracy, F1-score, average execution time, average energy consumption, and estimated CO2 emissions. The findings indicate that although CNN-based models can outperform various multimodal techniques that incorporate both images and contextual information, applying additional filtering on top of LLMs can lead to substantial performance gains. These findings highlight the transformative potential of multimodal AI systems to enhance the reliability, efficiency, and scalability of medical diagnostics in clinical settings.

CT-Based Radiomics for Predicting PD-L1 Expression in Non-small Cell Lung Cancer: A Systematic Review and Meta-analysis.

Salimi M, Vadipour P, Khosravi A, Salimi B, Mabani M, Rostami P, Seifi S

pubmed logopapers · May 29, 2025
The efficacy of immunotherapy in non-small cell lung cancer (NSCLC) is intricately associated with baseline PD-L1 expression rates. The standard method for measuring PD-L1 is immunohistochemistry, which is invasive and may not capture tumor heterogeneity. The primary aim of the current study is to assess whether CT-based radiomics models can accurately predict PD-L1 expression status in NSCLC and to evaluate their quality and potential gaps in their design. Scopus, PubMed, Web of Science, Embase, and IEEE databases were systematically searched up to February 14, 2025, to retrieve relevant studies. Data from validation cohorts of models that classified patients by tumor proportion score (TPS) of 1% (TPS1) and 50% (TPS50) were extracted and analyzed separately. Quality assessment was performed with the METRICS and QUADAS-2 tools. Diagnostic test accuracy meta-analysis was conducted using a bivariate random-effects approach to pool performance metrics. The qualitative synthesis included 22 studies, and the meta-analysis covered 11 studies with 997 individual subjects. The pooled AUC, sensitivity, and specificity of TPS1 models were 0.85, 0.76, and 0.79, respectively; for TPS50 models they were 0.88, 0.72, and 0.86, respectively. The QUADAS-2 tool identified a substantial risk of bias in the flow-and-timing and index-test domains. Certain methodological limitations were highlighted by the METRICS score, which averaged 58.1% and ranged from 24% to 83.4%. CT-based radiomics demonstrates strong potential as a non-invasive method for predicting PD-L1 expression in NSCLC. While promising, significant methodological gaps must be addressed to achieve the generalizability and reliability required for clinical application.
