A novel framework for esophageal cancer grading: combining CT imaging, radiomics, reproducibility, and deep learning insights.

Alsallal M, Ahmed HH, Kareem RA, Yadav A, Ganesan S, Shankhyan A, Gupta S, Joshi KK, Sameer HN, Yaseen A, Athab ZH, Adil M, Farhood B

PubMed · May 10, 2025
This study aims to create a reliable framework for grading esophageal cancer that combines feature extraction, deep learning with attention mechanisms, and radiomics to ensure accuracy, interpretability, and practical use in tumor analysis. This retrospective study used data from 2,560 esophageal cancer patients across multiple clinical centers, collected from 2018 to 2023. The dataset included CT scan images and clinical information, representing a variety of cancer grades and types. Standardized CT imaging protocols were followed, and experienced radiologists manually segmented the tumor regions; only high-quality data were used. A total of 215 radiomic features were extracted using the SERA platform. Two deep learning models, DenseNet121 and EfficientNet-B0, each enhanced with attention mechanisms, were used to improve accuracy. A combined classification approach used both radiomic and deep learning features, and machine learning models such as Random Forest, XGBoost, and CatBoost were applied. These models were validated with strict training and testing procedures to ensure effective cancer grading. Radiomic features were classified into four reliability levels based on their intraclass correlation coefficient (ICC) values; most had excellent (ICC > 0.90) or good (0.75 < ICC ≤ 0.90) reliability. Deep learning features extracted from DenseNet121 and EfficientNet-B0 were also categorized, and some showed poor reliability. Among the machine learning models, XGBoost with recursive feature elimination (RFE) gave the best results for radiomic features, with an area under the curve (AUC) of 91.36%. For deep learning features, XGBoost with principal component analysis (PCA) gave the best results using DenseNet121, while CatBoost with RFE performed best with EfficientNet-B0, achieving an AUC of 94.20%. Combining radiomic and deep features led to significant improvements, with XGBoost achieving the highest AUC of 96.70%, accuracy of 96.71%, and sensitivity of 95.44%. Ensembling DenseNet121 and EfficientNet-B0 achieved the best overall performance, with an AUC of 95.14% and accuracy of 94.88%. This study improves esophageal cancer grading by combining radiomics and deep learning; it enhances diagnostic accuracy, reproducibility, and interpretability, and supports personalized treatment planning through better tumor characterization. Trial registration: not applicable.
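For readers who want the mechanics of the feature-screening step described here, below is a minimal sketch of ICC-based reliability tiering followed by XGBoost with RFE. All data, thresholds for feature retention, and the binary label setup are illustrative assumptions, not the study's pipeline.

```python
# Sketch: tier radiomic features by ICC, keep reliable ones, grade with
# XGBoost + RFE. `X`, `y`, and `icc_values` are synthetic placeholders.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

def icc_tier(icc: float) -> str:
    """Map an ICC value to the reliability tiers used in the abstract."""
    if icc > 0.90:
        return "excellent"
    if icc > 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 215))          # 215 radiomic features, synthetic
y = rng.integers(0, 2, size=500)         # binary grade labels, synthetic
icc_values = rng.uniform(0.4, 1.0, 215)  # stand-in per-feature ICCs

keep = np.array([icc_tier(v) in ("excellent", "good") for v in icc_values])
X_rel = X[:, keep]                       # retain only reliable features

X_tr, X_te, y_tr, y_te = train_test_split(X_rel, y, stratify=y, random_state=0)
selector = RFE(XGBClassifier(eval_metric="logloss"), n_features_to_select=30)
selector.fit(X_tr, y_tr)
clf = XGBClassifier(eval_metric="logloss").fit(X_tr[:, selector.support_], y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te[:, selector.support_])[:, 1])
print(f"held-out AUC on synthetic data: {auc:.3f}")
```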

Performance of fully automated deep-learning-based coronary artery calcium scoring in ECG-gated calcium CT and non-gated low-dose chest CT.

Kim S, Park EA, Ahn C, Jeong B, Lee YS, Lee W, Kim JH

PubMed · May 10, 2025
This study aimed to validate the agreement and diagnostic performance of a deep-learning-based coronary artery calcium scoring (DL-CACS) system for ECG-gated and non-gated low-dose chest CT (LDCT) across multivendor datasets. In this retrospective study, datasets from Seoul National University Hospital (SNUH, 652 paired ECG-gated and non-gated CT scans) and the Stanford public dataset (425 ECG-gated and 199 non-gated CT scans) were analyzed. Agreement metrics included intraclass correlation coefficient (ICC), coefficient of determination (R²), and categorical agreement (κ). Diagnostic performance was assessed using categorical accuracy and the area under the receiver operating characteristic curve (AUROC). DL-CACS demonstrated excellent performance for ECG-gated CT in both datasets (SNUH: R² = 0.995, ICC = 0.997, κ = 0.97, AUROC = 0.99; Stanford: R² = 0.989, ICC = 0.990, κ = 0.97, AUROC = 0.99). For non-gated CT using manual LDCT CAC scores as a reference, performance was similarly high (R² = 0.988, ICC = 0.994, κ = 0.96, AUROC = 0.98-0.99). When using ECG-gated CT scores as the reference, performance for non-gated CT was slightly lower but remained robust (SNUH: R² = 0.948, ICC = 0.968, κ = 0.88, AUROC = 0.98-0.99; Stanford: R² = 0.949, ICC = 0.948, κ = 0.71, AUROC = 0.89-0.98). DL-CACS provides a reliable and automated solution for CACS, potentially reducing workload while maintaining robust performance in both ECG-gated and non-gated CT settings.

Question: How accurate and reliable is deep-learning-based coronary artery calcium scoring (DL-CACS) in ECG-gated CT and non-gated low-dose chest CT (LDCT) across multivendor datasets?
Findings: DL-CACS showed near-perfect performance for ECG-gated CT. For non-gated LDCT, performance was excellent using manual scores as the reference and lower but reliable when using ECG-gated CT scores.
Clinical relevance: DL-CACS provides a reliable and automated solution for CACS, potentially reducing workload and improving diagnostic workflow. It supports cardiovascular risk stratification and broader clinical adoption, especially in settings where ECG-gated CT is unavailable.
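As a sketch of the agreement metrics reported here, the snippet below computes R² and a weighted κ over categorized Agatston scores. The category cut-offs (0, 1-100, 101-400, >400) are one common CAC risk grouping assumed for illustration, and the scores themselves are simulated.

```python
# Sketch: agreement metrics for CAC scoring on synthetic Agatston scores.
import numpy as np
from sklearn.metrics import cohen_kappa_score, r2_score

def cac_category(score: float) -> int:
    """Bin an Agatston score into 4 conventional risk categories."""
    if score == 0:
        return 0
    if score <= 100:
        return 1
    if score <= 400:
        return 2
    return 3

rng = np.random.default_rng(1)
manual = rng.gamma(shape=0.5, scale=200, size=300)   # reference scores
auto = manual * rng.normal(1.0, 0.05, size=300)      # simulated DL scores

r2 = r2_score(manual, auto)
kappa = cohen_kappa_score([cac_category(s) for s in manual],
                          [cac_category(s) for s in auto],
                          weights="linear")          # weighted kappa, assumed
print(f"R^2 = {r2:.3f}, weighted kappa = {kappa:.2f}")
```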

Intra- and Peritumoral Radiomics Based on Ultrasound Images for Preoperative Differentiation of Follicular Thyroid Adenoma, Carcinoma, and Follicular Tumor With Uncertain Malignant Potential.

Fu Y, Mei F, Shi L, Ma Y, Liang H, Huang L, Fu R, Cui L

PubMed · May 10, 2025
Differentiating between follicular thyroid adenoma (FTA), follicular thyroid carcinoma (FTC), and follicular tumor with uncertain malignant potential (FT-UMP) remains challenging due to their overlapping ultrasound characteristics. This retrospective study aimed to enhance preoperative diagnostic accuracy by utilizing intra- and peritumoral radiomics based on ultrasound images. We collected preoperative ultrasound images from 774 patients with thyroidectomy-confirmed FTA (n = 429), FTC (n = 158), or FT-UMP (n = 187) between January 2018 and December 2023. Six peritumoral regions were generated by expanding the tumor boundary in 5% increments from 5% to 30%, with the Segment Anything Model using prompt learning to detect the field of view and constrain the expanded boundaries. A stepwise classification strategy addressed three tasks: distinguishing FTA from the other two types (task 1), differentiating FTC from FT-UMP (task 2), and classifying all three tumors (task 3). Diagnostic models were developed by combining radiomic features from tumor and peritumoral regions with clinical characteristics. Clinical characteristics combined with intratumoral and 5% peritumoral radiomic features performed best across all tasks (test set: areas under the curve, 0.93 for task 1 and 0.90 for task 2; diagnostic accuracy, 79.9%). The DeLong test indicated that adding peritumoral radiomics significantly improved on intratumoral radiomics and clinical characteristics alone (p < 0.04). The 5% peritumoral regions showed the best performance, though not all comparisons were significant (p = 0.01-0.91). Ultrasound-based intratumoral and peritumoral radiomics can significantly enhance preoperative diagnostic accuracy for FTA, FTC, and FT-UMP, supporting improved treatment strategies and patient outcomes. Furthermore, the 5% peritumoral area may indicate regions of potential tumor invasion that warrant further investigation.
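One plausible reading of the percentage-based expansions described here is dilation proportional to the tumor's equivalent radius; the sketch below builds peritumoral rings that way from a binary mask. The SAM-based boundary constraint mentioned in the abstract is omitted, and the toy mask is an assumption.

```python
# Sketch: percentage-expanded peritumoral rings from a binary tumor mask.
import numpy as np
from scipy import ndimage

def peritumoral_ring(mask: np.ndarray, fraction: float) -> np.ndarray:
    """Dilate `mask` by `fraction` of its equivalent radius and subtract
    the tumor itself, leaving only the surrounding ring."""
    area = mask.sum()
    radius = np.sqrt(area / np.pi)           # equivalent circular radius
    grow_px = max(1, int(round(fraction * radius)))
    dilated = ndimage.binary_dilation(mask, iterations=grow_px)
    return dilated & ~mask

mask = np.zeros((256, 256), dtype=bool)
mask[100:140, 110:160] = True                 # toy rectangular "tumor"
for f in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
    ring = peritumoral_ring(mask, f)
    print(f"{int(f * 100)}% ring pixels: {int(ring.sum())}")
```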

Machine learning approaches for classifying major depressive disorder using biological and neuropsychological markers: A meta-analysis.

Zhang L, Jian L, Long Y, Ren Z, Calhoun VD, Passos IC, Tian X, Xiang Y

PubMed · May 10, 2025
Traditional diagnostic methods for major depressive disorder (MDD), which rely on subjective assessments, may compromise diagnostic accuracy. In contrast, machine learning models have the potential to classify and diagnose MDD more effectively, reducing the risk of misdiagnosis associated with conventional methods. The aim of this meta-analysis is to evaluate the overall classification accuracy of machine learning models in MDD and to examine the effects of machine learning algorithms, biomarkers, diagnostic comparison groups, validation procedures, and participant age on classification performance. As of September 2024, 176 studies were included in the meta-analysis, encompassing a total of 60,926 participants. A random-effects model applied to the extracted data yielded an overall classification accuracy of 0.825 (95% CI [0.810; 0.839]). Convolutional neural networks significantly outperformed support vector machines (SVM) when using electroencephalography and magnetoencephalography data, while SVM performed significantly better with functional magnetic resonance imaging data than graph neural networks and Gaussian process classification. Sample size was negatively correlated with classification accuracy, and evidence of publication bias was detected. Therefore, while machine learning models show high accuracy in distinguishing MDD from healthy controls and other psychiatric disorders, further research is required before these findings can be generalized to large-scale clinical practice.
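To make the pooling step concrete, here is a sketch of DerSimonian-Laird random-effects pooling of per-study accuracies on the logit scale, the standard machinery behind an estimate like 0.825 above. The study accuracies and sample sizes below are invented placeholders.

```python
# Sketch: DerSimonian-Laird random-effects pooling of proportions.
import numpy as np

def pool_accuracy(acc: np.ndarray, n: np.ndarray):
    """Pool per-study accuracies on the logit scale with a DL model."""
    y = np.log(acc / (1 - acc))                   # logit transform
    v = 1.0 / (n * acc * (1 - acc))               # approx. logit variance
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)            # Cochran's Q
    df = len(acc) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    inv = lambda x: 1.0 / (1.0 + np.exp(-x))      # back to proportion scale
    return inv(y_re), (inv(y_re - 1.96 * se), inv(y_re + 1.96 * se))

acc = np.array([0.80, 0.85, 0.78, 0.90, 0.82])    # synthetic study accuracies
n = np.array([120, 300, 90, 150, 200])            # synthetic sample sizes
est, ci = pool_accuracy(acc, n)
print(f"pooled accuracy {est:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```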

Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

Daniel Strick, Carlos Garcia, Anthony Huang

arXiv preprint · May 10, 2025
Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become standard practice in modern medicine. On the publicly available NIH ChestX-ray14 dataset, in which each X-ray image is labeled for the presence or absence of 14 diseases, we reproduced the CheXNet algorithm and explored other models that outperform its baseline metrics. Model performance was primarily evaluated using the F1 score and AUC-ROC, both critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC of 0.85 and an average F1 score of 0.39 across all 14 disease classes in the dataset.
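A quick sketch of the evaluation described here: per-class AUC-ROC and F1 over a 14-label multi-hot matrix, averaged across classes. The predictions below are synthetic and the 0.5 decision threshold is an assumption.

```python
# Sketch: per-class metrics for 14-label chest X-ray classification.
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

rng = np.random.default_rng(2)
n, k = 1000, 14
y_true = rng.integers(0, 2, size=(n, k))                  # multi-hot labels
y_prob = np.clip(y_true * 0.6 + rng.uniform(0, 0.5, (n, k)), 0, 1)
y_pred = (y_prob >= 0.5).astype(int)                       # fixed threshold

auc_per_class = [roc_auc_score(y_true[:, j], y_prob[:, j]) for j in range(k)]
f1_per_class = [f1_score(y_true[:, j], y_pred[:, j]) for j in range(k)]
print(f"mean AUC-ROC {np.mean(auc_per_class):.3f}, "
      f"mean F1 {np.mean(f1_per_class):.3f}")
```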

Deeply Explainable Artificial Neural Network

David Zucker

arXiv preprint · May 10, 2025
While deep learning models have demonstrated remarkable success in numerous domains, their black-box nature remains a significant limitation, especially in critical fields such as medical image analysis and inference. Existing explainability methods, such as SHAP, LIME, and Grad-CAM, are typically applied post hoc, adding computational overhead and sometimes producing inconsistent or ambiguous results. In this paper, we present the Deeply Explainable Artificial Neural Network (DxANN), a novel deep learning architecture that embeds explainability ante hoc, directly into the training process. Unlike conventional models that require external interpretation methods, DxANN is designed to produce per-sample, per-feature explanations as part of the forward pass. Built on a flow-based framework, it enables both accurate predictions and transparent decision-making, and is particularly well-suited for image-based tasks. While our focus is on medical imaging, the DxANN architecture is readily adaptable to other data modalities, including tabular and sequential data. DxANN marks a step forward toward intrinsically interpretable deep learning, offering a practical solution for applications where trust and accountability are essential.
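To illustrate the ante-hoc idea only, the toy below is a network whose forward pass returns both logits and per-sample, per-feature attribution scores via a learned mask head. This is emphatically not the DxANN architecture (which the abstract describes as flow-based); it is a minimal stand-in for "explanation as part of the forward pass".

```python
# Sketch of ante-hoc explainability: the forward pass emits the explanation.
# Toy attention-mask head, NOT the flow-based DxANN architecture itself.
import torch
import torch.nn as nn

class ExplainableNet(nn.Module):
    def __init__(self, in_features: int, n_classes: int):
        super().__init__()
        self.mask_head = nn.Sequential(            # produces attributions
            nn.Linear(in_features, in_features), nn.Sigmoid())
        self.classifier = nn.Linear(in_features, n_classes)

    def forward(self, x):
        attribution = self.mask_head(x)            # per-feature relevance
        logits = self.classifier(x * attribution)  # prediction uses masked x
        return logits, attribution                 # explanation is built in

model = ExplainableNet(in_features=32, n_classes=2)
logits, attribution = model(torch.randn(4, 32))
print(logits.shape, attribution.shape)             # (4, 2), (4, 32)
```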

Preoperative radiomics models using CT and MRI for microsatellite instability in colorectal cancer: a systematic review and meta-analysis.

Capello Ingold G, Martins da Fonseca J, Kolenda Zloić S, Verdan Moreira S, Kago Marole K, Finnegan E, Yoshikawa MH, Daugėlaitė S, Souza E Silva TX, Soato Ratti MA

PubMed · May 10, 2025
Microsatellite instability (MSI) is a novel predictive biomarker for chemotherapy and immunotherapy response, as well as a prognostic indicator, in colorectal cancer (CRC). The current standard for MSI identification is polymerase chain reaction (PCR) testing or immunohistochemical analysis of tumor biopsy samples; however, tumor heterogeneity and procedural complications pose challenges to these techniques. CT- and MRI-based radiomics models offer a promising non-invasive alternative. A systematic search of PubMed, Embase, the Cochrane Library, and Scopus was conducted to identify studies evaluating the diagnostic performance of CT- and MRI-based radiomics models for detecting MSI status in CRC. Pooled area under the curve (AUC), sensitivity, and specificity were calculated in RStudio using a random-effects model, and forest plots and a summary ROC curve were generated. Heterogeneity was assessed using the I² statistic and explored through sensitivity analyses, threshold effect assessment, subgroup analyses, and meta-regression. Seventeen studies with a total of 6,045 subjects were included in the analysis. All studies extracted radiomic features from CT or MRI images of CRC patients with confirmed MSI status to train machine learning models. The pooled AUC was 0.815 (95% CI: 0.784-0.840) for CT-based studies and 0.900 (95% CI: 0.819-0.943) for MRI-based studies. Significant heterogeneity was identified and addressed through extensive analysis. Radiomics models represent a novel and promising tool for predicting MSI status in CRC patients; these findings may serve as a foundation for future studies aimed at developing and validating improved models, ultimately enhancing the diagnosis, treatment, and prognosis of colorectal cancer.
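Complementing the pooling sketch earlier in this listing, here is the I² heterogeneity statistic referenced in this abstract, computed from per-study effect estimates and variances. The numbers are synthetic stand-ins, not values from the review.

```python
# Sketch: Cochran's Q -> I^2, the share of variability beyond sampling error.
import numpy as np

def i_squared(effects: np.ndarray, variances: np.ndarray) -> float:
    """Return I^2 (in %) from per-study effects and their variances."""
    w = 1.0 / variances
    pooled = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - pooled) ** 2)        # Cochran's Q
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

logit_auc = np.array([1.4, 1.6, 1.1, 2.1, 1.8])    # synthetic study effects
var = np.array([0.04, 0.02, 0.05, 0.03, 0.06])     # synthetic variances
print(f"I^2 = {i_squared(logit_auc, var):.1f}%")
```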

Multiparameter MRI-based model integrating radiomics and deep learning for preoperative staging of laryngeal squamous cell carcinoma.

Xie K, Jiang H, Chen X, Ning Y, Yu Q, Lv F, Liu R, Zhou Y, Xu L, Yue Q, Peng J

PubMed · May 9, 2025
Accurate preoperative staging of laryngeal squamous cell carcinoma (LSCC) provides valuable guidance for clinical decision-making. The objective of this study was to establish a multiparametric MRI model using radiomics and deep learning (DL) to preoperatively distinguish between Stage I-II and Stage III-IV LSCC. Data from 401 histologically confirmed LSCC patients were collected from two centers (training set: 213; internal test set: 91; external test set: 97). Radiomics features were extracted from the MRI images, and seven radiomics models based on single and combined sequences were developed via random forest (RF). A DL model was constructed via ResNet-18, with DL features extracted from its final fully connected layer. These features were fused with key radiomics features to create a combined model. Performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) and compared with radiologists' performance. The predictive capability of the combined model for progression-free survival (PFS) was evaluated via Kaplan-Meier survival analysis and Harrell's concordance index (C-index). In the external test set, the combined model had an AUC of 0.877 (95% CI 0.807-0.946), outperforming the DL model (AUC: 0.811) and the optimal radiomics model (AUC: 0.835). The combined model significantly outperformed both the DL model (p = 0.017) and the optimal radiomics model (p = 0.039), as well as the radiologists (both p < 0.050). Moreover, the combined model demonstrated prognostic value in patients with LSCC, achieving a C-index of 0.624 for PFS. This combined model enhances preoperative LSCC staging, aiding more informed clinical decisions.
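As a sketch of the feature-fusion step described here, the snippet below pulls deep features from a ResNet-18 backbone and concatenates them with radiomic features. The layer choice (penultimate features exposed by replacing the final layer with an identity) and fusion by concatenation are assumptions, as are the synthetic inputs.

```python
# Sketch: ResNet-18 deep features fused with radiomic features.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=None)
backbone.fc = nn.Identity()            # expose the 512-d penultimate features
backbone.eval()

images = torch.randn(8, 3, 224, 224)   # stand-in batch of preprocessed slices
with torch.no_grad():
    deep_feats = backbone(images)      # shape (8, 512)

radiomic_feats = torch.randn(8, 40)    # e.g., 40 selected radiomic features
fused = torch.cat([deep_feats, radiomic_feats], dim=1)   # (8, 552)
print(fused.shape)                     # fused vector fed to a classifier
```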

Resting-state functional MRI metrics to detect freezing of gait in Parkinson's disease: a machine learning approach.

Vicidomini C, Fontanella F, D'Alessandro T, Roviello GN, De Stefano C, Stocchi F, Quarantelli M, De Pandis MF

PubMed · May 9, 2025
Among the symptoms that can occur in Parkinson's disease (PD), freezing of gait (FOG) is a disabling phenomenon that affects a large proportion of patients and remains incompletely understood. Accurate classification of FOG in PD is crucial for tailoring effective interventions and for a better understanding of its underlying mechanisms. In the present work, we applied four machine learning (ML) classifiers (decision tree, DT; random forest, RF; multilayer perceptron, MLP; logistic regression, LOG) to four metrics derived from resting-state functional magnetic resonance imaging (rs-fMRI) data to assess their accuracy in automatically classifying PD patients based on the presence or absence of FOG. To validate our approach, we applied the same methodologies to distinguish PD patients from a group of healthy subjects (HS). The performance of the four ML algorithms was validated by repeated k-fold cross-validation on randomly selected independent training and validation subsets. When discriminating PD from HS, the best performance was achieved using RF applied to fractional amplitude of low-frequency fluctuations (fALFF) data (AUC 96.8 ± 2%). Similarly, when discriminating PD-FOG from PD-nFOG, the RF algorithm was again the best performer on all four metrics, with AUCs above 90%. Finally, to probe how these black-box decisions were made, we extracted feature importance scores for the best-performing methods and discussed them in light of results obtained to date in rs-fMRI studies on FOG in PD and, more generally, in PD. In summary, the regions most frequently selected when differentiating both PD from HS and PD-FOG from PD-nFOG were mainly relevant to the extrapyramidal system, as well as the visual and default mode networks. In addition, the salience network and the supplementary motor area played an additional major role in differentiating PD-FOG from PD-nFOG patients.
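A minimal sketch of the validation scheme described here: repeated stratified k-fold cross-validation of a random forest on an fALFF-like feature matrix. The fold counts, feature matrix, and labels are all assumptions for illustration.

```python
# Sketch: repeated stratified k-fold validation of RF on synthetic features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 200))        # stand-in fALFF features per region
y = rng.integers(0, 2, size=120)       # FOG vs non-FOG labels, synthetic

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0),
                         X, y, cv=cv, scoring="roc_auc")
print(f"AUC {scores.mean():.3f} +/- {scores.std():.3f}")
```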

Comparison between multimodal foundation models and radiologists for the diagnosis of challenging neuroradiology cases with text and images.

Le Guellec B, Bruge C, Chalhoub N, Chaton V, De Sousa E, Gaillandre Y, Hanafi R, Masy M, Vannod-Michel Q, Hamroun A, Kuchcinski G

PubMed · May 9, 2025
The purpose of this study was to compare the ability of two multimodal models (GPT-4o and Gemini 1.5 Pro) with that of radiologists to generate differential diagnoses from textual context alone, key images alone, or a combination of both, using complex neuroradiology cases. This retrospective study included neuroradiology cases from the "Diagnosis Please" series published in the journal Radiology between January 2008 and September 2024. The two multimodal models were asked to provide three differential diagnoses from textual context alone, key images alone, or the complete case. Six board-certified neuroradiologists solved the cases in the same setting, randomly assigned to two groups: context alone first or images alone first. Three radiologists solved the cases without, and then with, the assistance of Gemini 1.5 Pro. An independent radiologist evaluated the quality of the image descriptions provided by GPT-4o and Gemini for each case. Differences in correct answers between multimodal models and radiologists were analyzed using the McNemar test. GPT-4o and Gemini 1.5 Pro outperformed radiologists using clinical context alone (mean accuracy, 34.0 % [18/53] and 44.7 % [23.7/53] vs. 16.4 % [8.7/53]; both P < 0.01). Radiologists outperformed GPT-4o and Gemini 1.5 Pro using images alone (mean accuracy, 42.0 % [22.3/53] vs. 3.8 % [2/53] and 7.5 % [4/53]; both P < 0.01) and on the complete cases (48.0 % [25.6/53] vs. 34.0 % [18/53] and 38.7 % [20.3/53]; both P < 0.001). While radiologists improved their accuracy when combining multimodal information (from 42.1 % [22.3/53] for images alone to 50.3 % [26.7/53] for complete cases; P < 0.01), GPT-4o and Gemini 1.5 Pro did not benefit from the multimodal context (GPT-4o: from 34.0 % [18/53] for text alone to 35.2 % [18.7/53] for complete cases, P = 0.48; Gemini 1.5 Pro: from 44.7 % [23.7/53] to 42.8 % [22.7/53], P = 0.54). Radiologists benefited significantly from the suggestions of Gemini 1.5 Pro, increasing their accuracy from 47.2 % [25/53] to 56.0 % [27/53] (P < 0.01). Both GPT-4o and Gemini 1.5 Pro correctly identified the imaging modality in 53/53 (100 %) and 51/53 (96.2 %) cases, respectively, but frequently failed to identify key imaging findings (incorrect identification in 43/53 cases [81.1 %] for GPT-4o and 50/53 [94.3 %] for Gemini 1.5 Pro). Radiologists show a specific ability to benefit from the integration of textual and visual information, whereas multimodal models mostly rely on the clinical context to suggest diagnoses.
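For the statistical comparison used throughout this abstract, here is a sketch of the McNemar test on paired correct/incorrect answers over the same cases. The 2x2 counts below are invented for illustration, not taken from the study.

```python
# Sketch: McNemar test for paired model-vs-radiologist accuracy.
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 table of paired outcomes over the same cases (invented counts):
#                 radiologist correct | radiologist wrong
# model correct          20                    5
# model wrong            15                   13
table = [[20, 5],
         [15, 13]]
result = mcnemar(table, exact=True)   # exact binomial test on discordant pairs
print(f"statistic={result.statistic}, p={result.pvalue:.4f}")
```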