Page 161 of 170 (1699 results)

Evaluating an information theoretic approach for selecting multimodal data fusion methods.

Zhang T, Ding R, Luong KD, Hsu W

PubMed | papers | May 10 2025
Interest has grown in combining radiology, pathology, genomic, and clinical data to improve the accuracy of diagnostic and prognostic predictions toward precision health. However, most existing works choose their datasets and modeling approaches empirically and in an ad hoc manner. A prior study proposed four partial information decomposition (PID)-based metrics to provide a theoretical understanding of multimodal data interactions: redundancy, uniqueness of each modality, and synergy. However, these metrics have only been evaluated in a limited collection of biomedical data, and the existing work does not elucidate the effect of parameter selection when calculating the PID metrics. In this work, we evaluate PID metrics on a wider range of biomedical data, including clinical, radiology, pathology, and genomic data, and propose potential improvements to the PID metrics. We apply the PID metrics to seven different modality pairs across four distinct cohorts (datasets). We compare and interpret trends in the resulting PID metrics and downstream model performance in these multimodal cohorts. The downstream tasks being evaluated include predicting the prognosis (either overall survival or recurrence) of patients with non-small cell lung cancer, prostate cancer, and glioblastoma. We found that, while PID metrics are informative, solely relying on these metrics to decide on a fusion approach does not always yield a machine learning model with optimal performance. Of the seven different modality pairs, three had poor (0%), three had moderate (66%-89%), and only one had perfect (100%) consistency between the PID values and model performance. We propose two improvements to the PID metrics (determining the optimal parameters and uncertainty estimation) and identified areas where PID metrics could be further improved. The current PID metrics are not accurate enough for estimating the multimodal data interactions and need to be improved before they can serve as a reliable tool. 
We propose improvements and provide suggestions for future work. Code: https://github.com/zhtyolivia/pid-multimodal.
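For orientation, the four PID quantities named in the abstract (redundancy, the uniqueness of each modality, and synergy) are usually tied to ordinary mutual information as sketched below. This is the standard two-source decomposition, not a formula quoted from the paper; the symbols $X_1$, $X_2$ (the two modalities), $Y$ (the prediction target), $R$, $U_1$, $U_2$, and $S$ are notation assumed here.

```latex
\begin{aligned}
I(X_1, X_2; Y) &= R + U_1 + U_2 + S, \\
I(X_1; Y)      &= R + U_1, \\
I(X_2; Y)      &= R + U_2.
\end{aligned}
```

These are three equations in four unknowns, so one of the terms has to be pinned down by an additional definition before the others follow. That extra choice is one reason the estimated values can depend on how the metrics are computed.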

Deeply Explainable Artificial Neural Network

David Zucker

arXiv | preprint | May 10 2025
While deep learning models have demonstrated remarkable success in numerous domains, their black-box nature remains a significant limitation, especially in critical fields such as medical image analysis and inference. Existing explainability methods, such as SHAP, LIME, and Grad-CAM, are typically applied post hoc, adding computational overhead and sometimes producing inconsistent or ambiguous results. In this paper, we present the Deeply Explainable Artificial Neural Network (DxANN), a novel deep learning architecture that embeds explainability ante hoc, directly into the training process. Unlike conventional models that require external interpretation methods, DxANN is designed to produce per-sample, per-feature explanations as part of the forward pass. Built on a flow-based framework, it enables both accurate predictions and transparent decision-making, and is particularly well-suited for image-based tasks. While our focus is on medical imaging, the DxANN architecture is readily adaptable to other data modalities, including tabular and sequential data. DxANN marks a step forward toward intrinsically interpretable deep learning, offering a practical solution for applications where trust and accountability are essential.

Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

Daniel Strick, Carlos Garcia, Anthony Huang

arXiv | preprint | May 10 2025
Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become a standard practice in modern medicine. On the publicly available NIH ChestX-ray14 dataset, containing X-ray images that are classified by the presence or absence of 14 different diseases, we reproduced an algorithm known as CheXNet, as well as explored other algorithms that outperform CheXNet's baseline metrics. Model performance was primarily evaluated using the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC score of 0.85 and an average F1 score of 0.39 across all 14 disease classifications present in the dataset.
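The gap between an average AUC-ROC of 0.85 and an average F1 score of 0.39 is easier to read with the two metrics side by side. The following is a minimal sketch, not the authors' evaluation code: it assumes per-label (macro) averaging and a fixed 0.5 decision threshold for F1, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 = 2TP / (2TP + FP + FN); returns 0.0 when there is nothing to count."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_scores(
    Y_true: List[Sequence[int]],
    Y_score: List[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Tuple[float, float]:
    """Average AUC and F1 over labels; rows are images, columns are disease labels."""
    n_labels = len(Y_true[0])
    aucs, f1s = [], []
    for j in range(n_labels):
        t = [row[j] for row in Y_true]
        s = [row[j] for row in Y_score]
        aucs.append(auc_roc(t, s))
        f1s.append(f1_score(t, [1 if v >= threshold else 0 for v in s]))
    return sum(aucs) / n_labels, sum(f1s) / n_labels
```

AUC depends only on how the scores rank positives against negatives, while F1 depends on the chosen threshold, so a model can rank well and still score a low F1 on rare labels.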

Machine learning approaches for classifying major depressive disorder using biological and neuropsychological markers: A meta-analysis.

Zhang L, Jian L, Long Y, Ren Z, Calhoun VD, Passos IC, Tian X, Xiang Y

PubMed | papers | May 10 2025
Traditional diagnostic methods for major depressive disorder (MDD), which rely on subjective assessments, may compromise diagnostic accuracy. In contrast, machine learning models have the potential to classify and diagnose MDD more effectively, reducing the risk of misdiagnosis associated with conventional methods. The aim of this meta-analysis is to evaluate the overall classification accuracy of machine learning models in MDD and examine the effects of machine learning algorithms, biomarkers, diagnostic comparison groups, validation procedures, and participant age on classification performance. As of September 2024, a total of 176 studies were ultimately included in the meta-analysis, encompassing a total of 60,926 participants. A random-effects model was applied to analyze the extracted data, resulting in an overall classification accuracy of 0.825 (95% CI [0.810; 0.839]). Convolutional neural networks significantly outperformed support vector machines (SVM) when using electroencephalography and magnetoencephalography data. Additionally, SVM demonstrated significantly better performance with functional magnetic resonance imaging data compared to graph neural networks and gaussian process classification. The sample size was negatively correlated to classification accuracy. Furthermore, evidence of publication bias was also detected. Therefore, while this study indicates that machine learning models show high accuracy in distinguishing MDD from healthy controls and other psychiatric disorders, further research is required before these findings can be generalized to large-scale clinical practice.
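As a rough illustration of what a random-effects pooled estimate involves, the sketch below uses the DerSimonian-Laird estimator. That choice is an assumption: the abstract does not say which estimator was used, nor whether accuracies were transformed before pooling. The function name and inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def random_effects_pool(
    effects: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (pooled estimate, standard error, tau^2) via DerSimonian-Laird.

    effects:   per-study estimates (e.g. classification accuracy)
    variances: per-study sampling variances of those estimates
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, used to measure heterogeneity.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sws
    se = math.sqrt(1.0 / sws)
    return pooled, se, tau2
```

A 95% confidence interval then follows as the pooled estimate plus or minus 1.96 times the standard error, under a normal approximation.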

Intra- and Peritumoral Radiomics Based on Ultrasound Images for Preoperative Differentiation of Follicular Thyroid Adenoma, Carcinoma, and Follicular Tumor With Uncertain Malignant Potential.

Fu Y, Mei F, Shi L, Ma Y, Liang H, Huang L, Fu R, Cui L

PubMed | papers | May 10 2025
Differentiating between follicular thyroid adenoma (FTA), carcinoma (FTC), and follicular tumor with uncertain malignant potential (FT-UMP) remains challenging due to their overlapping ultrasound characteristics. This retrospective study aimed to enhance preoperative diagnostic accuracy by utilizing intra- and peritumoral radiomics based on ultrasound images. We collected post-thyroidectomy ultrasound images from 774 patients diagnosed with FTA (n = 429), FTC (n = 158), or FT-UMP (n = 187) between January 2018 and December 2023. Six peritumoral regions were expanded by 5%-30% in 5% increments, with the segment-anything model utilizing prompt learning to detect the field of view and constrain the expanded boundaries. A stepwise classification strategy addressing three tasks was implemented: distinguishing FTA from the other types (task 1), differentiating FTC from FT-UMP (task 2), and classifying all three tumors (task 3). Diagnostic models were developed by combining radiomic features from tumor and peritumoral regions with clinical characteristics. Clinical characteristics combined with intratumoral and 5% peritumoral radiomic features performed best across all tasks (test set: area under the curve, 0.93 for task 1 and 0.90 for task 2; diagnostic accuracy, 79.9%). The DeLong test indicated that adding peritumoral radiomics significantly improved performance over intratumoral radiomics and clinical characteristics (p < 0.04). The 5% peritumoral regions showed the best performance, though not all results were significant (p = 0.01-0.91). Ultrasound-based intratumoral and peritumoral radiomics can significantly enhance preoperative diagnostic accuracy for FTA, FTC, and FT-UMP, leading to improved treatment strategies and patient outcomes. Furthermore, the 5% peritumoral area may indicate regions of potential tumor invasion requiring further investigation.

Radiomics prediction of surgery in ulcerative colitis refractory to medical treatment.

Sakamoto K, Okabayashi K, Seishima R, Shigeta K, Kiyohara H, Mikami Y, Kanai T, Kitagawa Y

PubMed | papers | May 10 2025
The decision to operate in drug-resistant ulcerative colitis (UC) depends on complex factors. This study evaluated how well radiomics analysis predicts whether hospitalized patients with UC end up in the surgical or the medical treatment group by the time of discharge. This single-center retrospective cohort study used CT at admission of patients with UC admitted from 2015 to 2022. The target of prediction was whether the patient would undergo surgery by the time of discharge. Radiomics features were extracted using the rectal wall at the level of the tailbone tip on CT as the region of interest. CT data were randomly divided into a training cohort and a validation cohort, and LASSO regression was performed using the training cohort to create a formula for calculating the radiomics score. A total of 147 patients were selected, and data from 184 CT scans were collected. Data from 157 CT scans matched the selection criteria and were included. Five features were used for the radiomics score. Univariate logistic regression analysis of clinical information detected a significant influence of severity (p < 0.001), number of drugs used until surgery (p < 0.001), Lichtiger score (p = 0.024), and hemoglobin (p = 0.010). A nomogram combining these items discriminated between the surgery and medical treatment groups with an AUC of 0.822 (95% confidence interval (CI) 0.841-0.951) for the training cohort and an AUC of 0.868 (95% CI 0.729-1.000) for the validation cohort, indicating a good ability to discriminate the outcomes. Radiomics analysis of CT images of patients with UC at the time of admission, combined with clinical data, showed high predictive ability regarding a treatment strategy of surgery or medical treatment.

APD-FFNet: A Novel Explainable Deep Feature Fusion Network for Automated Periodontitis Diagnosis on Dental Panoramic Radiography.

Resul ES, Senirkentli GB, Bostanci E, Oduncuoglu BF

PubMed | papers | May 9 2025
This study introduces APD-FFNet, a novel, explainable deep learning architecture for automated periodontitis diagnosis using panoramic radiographs. A total of 337 panoramic radiographs, annotated by a periodontist, served as the dataset. APD-FFNet combines custom convolutional and transformer-based layers within a deep feature fusion framework that captures both local and global contextual features. Performance was evaluated using accuracy, the F1 score, the area under the receiver operating characteristic curve, the Jaccard similarity coefficient, and the Matthews correlation coefficient. McNemar's test confirmed statistical significance, and SHapley Additive exPlanations provided interpretability insights. APD-FFNet achieved 94% accuracy, a 93.88% F1 score, 93.47% area under the receiver operating characteristic curve, 88.47% Jaccard similarity coefficient, and 88.46% Matthews correlation coefficient, surpassing comparable approaches. McNemar's test validated these findings (p < 0.05). Explanations generated by SHapley Additive exPlanations highlighted important regions in each radiograph, supporting clinical applicability. By merging convolutional and transformer-based layers, APD-FFNet establishes a new benchmark in automated, interpretable periodontitis diagnosis, with low hyperparameter sensitivity facilitating its integration into regular dental practice. Its adaptable design suggests broader relevance to other medical imaging domains. This is the first feature fusion method specifically devised for periodontitis diagnosis, supported by an expert-curated dataset and advanced explainable artificial intelligence. Its robust accuracy, low hyperparameter sensitivity, and transparent outputs set a new standard for automated periodontal analysis.

Neural Network-based Automated Classification of 18F-FDG PET/CT Lesions and Prognosis Prediction in Nasopharyngeal Carcinoma Without Distant Metastasis.

Lv Y, Zheng D, Wang R, Zhou Z, Gao Z, Lan X, Qin C

PubMed | papers | May 9 2025
To evaluate the diagnostic performance of the PET Assisted Reporting System (PARS) in nasopharyngeal carcinoma (NPC) patients without distant metastasis, and to investigate the prognostic significance of the metabolic parameters. Eighty-three NPC patients who underwent pretreatment 18F-FDG PET/CT were retrospectively collected. First, the sensitivity, specificity, and accuracy of PARS for diagnosing malignant lesions were calculated, using histopathology as the gold standard. Next, metabolic parameters of the primary tumor were derived using both PARS and manual segmentation. The differences and consistency between the 2 methods were analyzed. Finally, the prognostic value of PET metabolic parameters was evaluated. Prognostic analysis of progression-free survival (PFS) and overall survival (OS) was conducted. PARS demonstrated high patient-based accuracy (97.2%), sensitivity (88.9%), and specificity (97.4%); the corresponding lesion-based values were 96.7%, 84.0%, and 96.9%. Manual segmentation yielded higher metabolic tumor volume (MTV) and total lesion glycolysis (TLG) than PARS. Metabolic parameters from both methods were highly correlated and consistent. ROC analysis showed that the metabolic parameters differed in prognostic prediction but generally performed well in predicting 3-year PFS and OS. MTV and age were independent prognostic factors; Cox proportional-hazards models incorporating them showed significant predictive improvements when combined. Kaplan-Meier analysis confirmed better prognosis in the low-risk group based on combined indicators (χ² = 42.25, P < 0.001; χ² = 20.44, P < 0.001). Preliminary validation of PARS in NPC patients without distant metastasis shows high diagnostic sensitivity and accuracy for lesion identification and classification, and its metabolic parameters correlate well with those from manual segmentation. MTV reflects prognosis, and its combination with age enhances prognostic prediction and risk stratification.

Predicting Knee Osteoarthritis Severity from Radiographic Predictors: Data from the Osteoarthritis Initiative.

Nurmirinta TAT, Turunen MJ, Tohka J, Mononen ME, Liukkonen MK

PubMed | papers | May 9 2025
In knee osteoarthritis (KOA) treatment, preventive measures to reduce its onset risk are a key factor. Among individuals with radiographically healthy knees, however, future knee joint integrity and condition cannot be predicted by clinically applicable methods. We investigated whether knee joint morphology derived from widely accessible and cost-effective radiographs could be helpful in predicting future knee joint integrity and condition. We combined knee joint morphology with known risk predictors such as age, height, and weight. Baseline data were utilized as predictors, and the maximal severity of KOA after 8 years served as a target variable. The three KOA categories in this study were based on Kellgren-Lawrence grading: healthy, moderate, and severe. We employed a two-stage machine learning model that utilized two random forest algorithms. We trained three models: the subject demographics (SD) model utilized only SD; the image model utilized only knee joint morphology from radiographs; the merged model utilized combined predictors. The training data comprised an 8-year follow-up of 1222 knees from 683 individuals. The SD model obtained a weighted F1 score (WF1) of 77.2% and a balanced accuracy (BA) of 65.6%. The image model's performance metrics were the lowest, with a WF1 of 76.5% and a BA of 63.8%. The top-performing merged model achieved a WF1 of 78.3% and a BA of 68.2%. Our two-stage prediction model provided improved results based on performance metrics, suggesting potential for application in clinical settings.
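The two reported metrics, weighted F1 (WF1) and balanced accuracy (BA), can be sketched as follows. This is an illustrative implementation using the common definitions (per-class F1 weighted by true-class support, and the mean of per-class recall); the abstract does not confirm these exact conventions, and the function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 averaged with weights proportional to each class's true count."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    support = Counter(y_true)
    recalls = [
        sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label) / n
        for label, n in support.items()
    ]
    return sum(recalls) / len(recalls)
```

With imbalanced severity classes, WF1 is dominated by the largest class while BA is not, which is consistent with BA coming out lower than WF1 for all three models.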

Comparison between multimodal foundation models and radiologists for the diagnosis of challenging neuroradiology cases with text and images.

Le Guellec B, Bruge C, Chalhoub N, Chaton V, De Sousa E, Gaillandre Y, Hanafi R, Masy M, Vannod-Michel Q, Hamroun A, Kuchcinski G

PubMed | papers | May 9 2025
The purpose of this study was to compare the ability of two multimodal models (GPT-4o and Gemini 1.5 Pro) with that of radiologists to generate differential diagnoses from textual context alone, key images alone, or a combination of both using complex neuroradiology cases. This retrospective study included neuroradiology cases from the "Diagnosis Please" series published in the Radiology journal between January 2008 and September 2024. The two multimodal models were asked to provide three differential diagnoses from textual context alone, key images alone, or the complete case. Six board-certified neuroradiologists solved the cases in the same setting, randomly assigned to two groups: context alone first and images alone first. Three radiologists solved the cases without, and then with the assistance of Gemini 1.5 Pro. An independent radiologist evaluated the quality of the image descriptions provided by GPT-4o and Gemini for each case. Differences in correct answers between multimodal models and radiologists were analyzed using the McNemar test. GPT-4o and Gemini 1.5 Pro outperformed radiologists using clinical context alone (mean accuracy, 34.0 % [18/53] and 44.7 % [23.7/53] vs. 16.4 % [8.7/53]; both P < 0.01). Radiologists outperformed GPT-4o and Gemini 1.5 Pro using images alone (mean accuracy, 42.0 % [22.3/53] vs. 3.8 % [2/53], and 7.5 % [4/53]; both P < 0.01) and the complete cases (48.0 % [25.6/53] vs. 34.0 % [18/53], and 38.7 % [20.3/53]; both P < 0.001). While radiologists improved their accuracy when combining multimodal information (from 42.1 % [22.3/53] for images alone to 50.3 % [26.7/53] for complete cases; P < 0.01), GPT-4o and Gemini 1.5 Pro did not benefit from the multimodal context (from 34.0 % [18/53] for text alone to 35.2 % [18.7/53] for complete cases for GPT-4o; P = 0.48, and from 44.7 % [23.7/53] to 42.8 % [22.7/53] for Gemini 1.5 Pro; P = 0.54).
Radiologists benefited significantly from the suggestions of Gemini 1.5 Pro, increasing their accuracy from 47.2 % [25/53] to 56.0 % [27/53] (P < 0.01). GPT-4o and Gemini 1.5 Pro correctly identified the imaging modality in 53/53 (100 %) and 51/53 (96.2 %) cases, respectively, but frequently failed to identify key imaging findings (43/53 cases [81.1 %] with incorrect identification of key imaging findings for GPT-4o and 50/53 [94.3 %] for Gemini 1.5 Pro). Radiologists show a specific ability to benefit from the integration of textual and visual information, whereas multimodal models mostly rely on the clinical context to suggest diagnoses.
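Because every reader and model answered the same 53 cases, the comparison is paired, which is why a McNemar test applies: only the cases where the two raters disagree carry information. The sketch below shows the exact (binomial) form of the test. Whether the study used the exact or the chi-square form is not stated, so that choice is an assumption, and the function names are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(
    correct_a: Sequence[bool], correct_b: Sequence[bool]
) -> Tuple[int, int]:
    """Count cases only rater A got right (b) and cases only rater B got right (c)."""
    b = sum(1 for a, m in zip(correct_a, correct_b) if a and not m)
    c = sum(1 for a, m in zip(correct_a, correct_b) if not a and m)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    Under the null hypothesis, each discordant case is equally likely to favor
    either rater, so min(b, c) follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Cases that both raters got right, or both got wrong, drop out of the statistic entirely.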