Latest Papers on Radiology AI. Category: papers, Order: Best Match, Limit: 10.

Children Are Not Small Adults: Addressing Limited Generalizability of an Adult Deep Learning CT Organ Segmentation Model to the Pediatric Population.

Chatterjee D, Kanhere A, Doo FX, Zhao J, Chan A, Welsh A, Kulkarni P, Trang A, Parekh VS, Yi PH

•papers•Jun 1 2025

Deep learning (DL) tools developed on adult data sets may not generalize well to pediatric patients, posing potential safety risks. We evaluated the performance of TotalSegmentator, a state-of-the-art adult-trained CT organ segmentation model, on a subset of organs in a pediatric CT dataset and explored optimization strategies to improve pediatric segmentation performance. TotalSegmentator was retrospectively evaluated on abdominal CT scans from an external adult dataset (n = 300) and an external pediatric data set (n = 359). Generalizability was quantified by comparing Dice scores between adult and pediatric external data sets using Mann-Whitney U tests. Two DL optimization approaches were then evaluated: (1) 3D nnU-Net model trained on only pediatric data, and (2) an adult nnU-Net model fine-tuned on the pediatric cases. Our results show TotalSegmentator had significantly lower overall mean Dice scores on pediatric vs. adult CT scans (0.73 vs. 0.81, P < .001) demonstrating limited generalizability to pediatric CT scans. Stratified by organ, there was lower mean pediatric Dice score for four organs (P < .001, all): right and left adrenal glands (right adrenal, 0.41 [0.39-0.43] vs. 0.69 [0.66-0.71]; left adrenal, 0.35 [0.32-0.37] vs. 0.68 [0.65-0.71]); duodenum (0.47 [0.45-0.49] vs. 0.67 [0.64-0.69]); and pancreas (0.73 [0.72-0.74] vs. 0.79 [0.77-0.81]). Performance on pediatric CT scans improved by developing pediatric-specific models and fine-tuning an adult-trained model on pediatric images where both methods significantly improved segmentation accuracy over TotalSegmentator for all organs, especially for smaller anatomical structures (e.g., > 0.2 higher mean Dice for adrenal glands; P < .001).

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Leveraging Ensemble Models and Follow-up Data for Accurate Prediction of mRS Scores from Radiomic Features of DSC-PWI Images.

Yassin MM, Zaman A, Lu J, Yang H, Cao A, Hassan H, Han T, Miao X, Shi Y, Guo Y, Luo Y, Kang Y

•papers•Jun 1 2025

Predicting long-term clinical outcomes based on the early DSC PWI MRI scan is valuable for prognostication, resource management, clinical trials, and patient expectations. Current methods require subjective decisions about which imaging features to assess and may require time-consuming postprocessing. This study's goal was to predict multilabel 90-day modified Rankin Scale (mRS) score in acute ischemic stroke patients by combining ensemble models and different configurations of radiomic features generated from Dynamic susceptibility contrast perfusion-weighted imaging. In Follow-up studies, a total of 70 acute ischemic stroke (AIS) patients underwent magnetic resonance imaging within 24 hours poststroke and had a follow-up scan. In the single study, 150 DSC PWI Image scans for AIS patients. The DRF are extracted from DSC-PWI Scans. Then Lasso algorithm is applied for feature selection, then new features are generated from initial and follow-up scans. Then we applied different ensemble models to classify between three classes normal outcome (0, 1 mRS score), moderate outcome (2,3,4 mRS score), and severe outcome (5,6 mRS score). ANOVA and post-hoc Tukey HSD tests confirmed significant differences in model style performance across various studies and classification techniques. Stacking models consistently on average outperformed others, achieving an Accuracy of 0.68 ± 0.15, Precision of 0.68 ± 0.17, Recall of 0.65 ± 0.14, and F1 score of 0.63 ± 0.15 in the follow-up time study. Techniques like Bo_Smote showed significantly higher recall and F1 scores, highlighting their robustness and effectiveness in handling imbalanced data. Ensemble models, particularly Bagging and Stacking, demonstrated superior performance, achieving nearly 0.93 in Accuracy, 0.95 in Precision, 0.94 in Recall, and 0.94 in F1 metrics in follow-up conditions, significantly outperforming single models. Ensemble models based on radiomics generated from combining Initial and follow-up scans can be used to predict multilabel 90-day stroke outcomes with reduced subjectivity and user burden.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Automated Neural Architecture Search for Cardiac Amyloidosis Classification from [18F]-Florbetaben PET Images.

Bargagna F, Zigrino D, De Santi LA, Genovesi D, Scipioni M, Favilli B, Vergaro G, Emdin M, Giorgetti A, Positano V, Santarelli MF

•papers•Jun 1 2025

Medical image classification using convolutional neural networks (CNNs) is promising but often requires extensive manual tuning for optimal model definition. Neural architecture search (NAS) automates this process, reducing human intervention significantly. This study applies NAS to [18F]-Florbetaben PET cardiac images for classifying cardiac amyloidosis (CA) sub-types (amyloid light chain (AL) and transthyretin amyloid (ATTR)) and controls. Following data preprocessing and augmentation, an evolutionary cell-based NAS approach with a fixed network macro-structure is employed, automatically deriving cells' micro-structure. The algorithm is executed five times, evaluating 100 mutating architectures per run on an augmented dataset of 4048 images (originally 597), totaling 5000 architectures evaluated. The best network (NAS-Net) achieves 76.95% overall accuracy. K-fold analysis yields mean ± SD percentages of sensitivity, specificity, and accuracy on the test dataset: AL subjects (98.7 ± 2.9, 99.3 ± 1.1, 99.7 ± 0.7), ATTR-CA subjects (93.3 ± 7.8, 78.0 ± 2.9, 70.9 ± 3.7), and controls (35.8 ± 14.6, 77.1 ± 2.0, 96.7 ± 4.4). NAS-derived network performance rivals manually determined networks in the literature while using fewer parameters, validating its automatic approach's efficacy.

PET Classification Cardiac Methodology In Silico Academic Lab Benchmark SOTA

Deep Learning Approaches for Brain Tumor Detection and Classification Using MRI Images (2020 to 2024): A Systematic Review.

Bouhafra S, El Bahi H

•papers•Jun 1 2025

Brain tumor is a type of disease caused by uncontrolled cell proliferation in the brain leading to serious health issues such as memory loss and motor impairment. Therefore, early diagnosis of brain tumors plays a crucial role to extend the survival of patients. However, given the busy nature of the work of radiologists and aiming to reduce the likelihood of false diagnoses, advancing technologies including computer-aided diagnosis and artificial intelligence have shown an important role in assisting radiologists. In recent years, a number of deep learning-based methods have been applied for brain tumor detection and classification using MRI images and achieved promising results. The main objective of this paper is to present a detailed review of the previous researches in this field. In addition, This work summarizes the existing limitations and significant highlights. The study systematically reviews 60 articles researches published between 2020 and January 2024, extensively covering methods such as transfer learning, autoencoders, transformers, and attention mechanisms. The key findings formulated in this paper provide an analytic comparison and future directions. The review aims to provide a comprehensive understanding of automatic techniques that may be useful for professionals and academic communities working on brain tumor classification and detection.

MRI Classification Neurological Review Post Market Academic Lab Benchmark SOTA

MR Image Fusion-Based Parotid Gland Tumor Detection.

Sunnetci KM, Kaba E, Celiker FB, Alkan A

•papers•Jun 1 2025

The differentiation of benign and malignant parotid gland tumors is of major significance as it directly affects the treatment process. In addition, it is also a vital task in terms of early and accurate diagnosis of parotid gland tumors and the determination of treatment planning accordingly. As in other diseases, the differentiation of tumor types involves several challenging, time-consuming, and laborious processes. In the study, Magnetic Resonance (MR) images of 114 patients with parotid gland tumors are used for training and testing purposes by Image Fusion (IF). After the Apparent Diffusion Coefficient (ADC), Contrast-enhanced T1-w (T1C-w), and T2-w sequences are cropped, IF (ADC, T1C-w), IF (ADC, T2-w), IF (T1C-w, T2-w), and IF (ADC, T1C-w, T2-w) datasets are obtained for different combinations of these sequences using a two-dimensional Discrete Wavelet Transform (DWT)-based fusion technique. For each of these four datasets, ResNet18, GoogLeNet, and DenseNet-201 architectures are trained separately, and thus, 12 models are obtained in the study. A Graphical User Interface (GUI) application that contains the most successful of these trained architectures for each data is also designed to support the users. The designed GUI application not only allows the fusing of different sequence images but also predicts whether the label of the fused image is benign or malignant. The results show that the DenseNet-201 models for IF (ADC, T1C-w), IF (ADC, T2-w), and IF (ADC, T1C-w, T2-w) are better than the others, with accuracies of 95.45%, 95.96%, and 92.93%, respectively. It is also noted in the study that the most successful model for IF (T1C-w, T2-w) is ResNet18, and its accuracy is equal to 94.95%.

MRI Classification Retrospective Clinical In Silico Academic Lab

A Large Language Model to Detect Negated Expressions in Radiology Reports.

Su Y, Babore YB, Kahn CE

•papers•Jun 1 2025

Natural language processing (NLP) is crucial to extract information accurately from unstructured text to provide insights for clinical decision-making, quality improvement, and medical research. This study compared the performance of a rule-based NLP system and a medical-domain transformer-based model to detect negated concepts in radiology reports. Using a corpus of 984 de-identified radiology reports from a large U.S.-based academic health system (1000 consecutive reports, excluding 16 duplicates), the investigators compared the rule-based medspaCy system and the Clinical Assertion and Negation Classification Bidirectional Encoder Representations from Transformers (CAN-BERT) system to detect negated expressions of terms from RadLex, the Unified Medical Language System Metathesaurus, and the Radiology Gamuts Ontology. Power analysis determined a sample size of 382 terms to achieve α = 0.05 and β = 0.8 for McNemar's test; based on an estimate of 15% negated terms, 2800 randomly selected terms were annotated manually as negated or not negated. Precision, recall, and F1 of the two models were compared using McNemar's test. Of the 2800 terms, 387 (13.8%) were negated. For negation detection, medspaCy attained a recall of 0.795, precision of 0.356, and F1 of 0.492. CAN-BERT achieved a recall of 0.785, precision of 0.768, and F1 of 0.777. Although recall was not significantly different, CAN-BERT had significantly better precision (χ2 = 304.64; p < 0.001). The transformer-based CAN-BERT model detected negated terms in radiology reports with high precision and recall; its precision significantly exceeded that of the rule-based medspaCy system. Use of this system will improve data extraction from textual reports to support information retrieval, AI model training, and discovery of causal relationships.

Mixed Modality Classification Retrospective Clinical In Silico Academic Lab GenAI

Deep Learning Classification of Ischemic Stroke Territory on Diffusion-Weighted MRI: Added Value of Augmenting the Input with Image Transformations.

Koska IO, Selver A, Gelal F, Uluc ME, Çetinoğlu YK, Yurttutan N, Serindere M, Dicle O

•papers•Jun 1 2025

Our primary aim with this study was to build a patient-level classifier for stroke territory in DWI using AI to facilitate fast triage of stroke to a dedicated stroke center. A retrospective collection of DWI images of 271 and 122 consecutive acute ischemic stroke patients from two centers was carried out. Pretrained MobileNetV2 and EfficientNetB0 architectures were used to classify territorial subtypes as middle cerebral artery, posterior circulation, or watershed infarcts along with normal slices. Various input combinations using edge maps, thresholding, and hard attention versions were explored. The effect of augmenting the three-channel inputs of pre-trained models on classification performance was analyzed. ROC analyses and confusion matrix-derived performance metrics of the models were reported. Of the 271 patients included in this study, 151 (55.7%) were male and 120 (44.3%) were female. One hundred twenty-nine patients had MCA (47.6%), 65 patients had posterior circulation (24%), and 77 patients had watershed (28.0%) infarcts for center 1. Of the 122 patients from center 2, 78 (64%) were male and 44 (34%) were female. Fifty-two patients (43%) had MCA, 51 patients had posterior circulation (42%), and 19 (15%) patients had watershed infarcts. The Mobile-Crop model had the best performance with 0.95 accuracy and a 0.91 mean f1 score for slice-wise classification and 0.88 accuracy on external test sets, along with a 0.92 mean AUC. In conclusion, modified pre-trained models may be augmented with the transformation of images to provide a more accurate classification of affected territory by stroke in DWI.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Identification of Bipolar Disorder and Schizophrenia Based on Brain CT and Deep Learning Methods.

Li M, Hou X, Yan W, Wang D, Yu R, Li X, Li F, Chen J, Wei L, Liu J, Wang H, Zeng Q

•papers•Jun 1 2025

With the increasing prevalence of mental illness, accurate clinical diagnosis of mental illness is crucial. Compared with MRI, CT has the advantages of wide application, low price, short scanning time, and high patient cooperation. This study aims to construct a deep learning (DL) model based on CT images to make identification of bipolar disorder (BD) and schizophrenia (SZ). A total of 506 patients (BD = 227, SZ = 279) and 179 healthy controls (HC) was collected from January 2022 to May 2023 at two hospitals, and divided into an internal training set and an internal validation set according to a ratio of 4:1. An additional 65 patients (BD = 35, SZ = 30) and 40 HC were recruited from different hospitals, and served as an external test set. All subjects accepted the conventional brain CT examination. The DenseMD model for identify BD and SZ using multiple instance learning was developed and compared with other classical DL models. The results showed that DenseMD performed excellently with an accuracy of 0.745 in the internal validation set, whereas the accuracy of the ResNet-18, ResNeXt-50, and DenseNet-121model was 0.672, 0.664, and 0.679, respectively. For the external test set, DenseMD again outperformed other models with an accuracy of 0.724; however, the accuracy of the ResNet-18, ResNeXt-50, and DenseNet-121model was 0.657, 0.638, and 0.676, respectively. Therefore, the potential of DL models for identification of BD and SZ based on brain CT images was established, and identification ability of the DenseMD model was better than other classical DL models.

CT Classification Neurological Retrospective Clinical In Silico Academic Lab

Automatic Segmentation of Ultrasound-Guided Quadratus Lumborum Blocks Based on Artificial Intelligence.

Wang Q, He B, Yu J, Zhang B, Yang J, Liu J, Ma X, Wei S, Li S, Zheng H, Tang Z

•papers•Jun 1 2025

Ultrasound-guided quadratus lumborum block (QLB) technology has become a widely used perioperative analgesia method during abdominal and pelvic surgeries. Due to the anatomical complexity and individual variability of the quadratus lumborum muscle (QLM) on ultrasound images, nerve blocks heavily rely on anesthesiologist experience. Therefore, using artificial intelligence (AI) to identify different tissue regions in ultrasound images is crucial. In our study, we retrospectively collected 112 patients (3162 images) and developed a deep learning model named Q-VUM, which is a U-shaped network based on the Visual Geometry Group 16 (VGG16) network. Q-VUM precisely segments various tissues, including the QLM, the external oblique muscle, the internal oblique muscle, the transversus abdominis muscle (collectively referred to as the EIT), and the bones. Furthermore, we evaluated Q-VUM. Our model demonstrated robust performance, achieving mean intersection over union (mIoU), mean pixel accuracy, dice coefficient, and accuracy values of 0.734, 0.829, 0.841, and 0.944, respectively. The IoU, recall, precision, and dice coefficient achieved for the QLM were 0.711, 0.813, 0.850, and 0.831, respectively. Additionally, the Q-VUM predictions showed that 85% of the pixels in the blocked area fell within the actual blocked area. Finally, our model exhibited stronger segmentation performance than did the common deep learning segmentation networks (0.734 vs. 0.720 and 0.720, respectively). In summary, we proposed a model named Q-VUM that can accurately identify the anatomical structure of the quadratus lumborum in real time. This model aids anesthesiologists in precisely locating the nerve block site, thereby reducing potential complications and enhancing the effectiveness of nerve block procedures.

Ultrasound Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma Based on CT Images.

Yao N, Hu H, Chen K, Huang H, Zhao C, Guo Y, Li B, Nan J, Li Y, Han C, Zhu F, Zhou W, Tian L

•papers•Jun 1 2025

This study developed and validated a deep learning-based diagnostic model with uncertainty estimation to aid radiologists in the preoperative differentiation of pathological subtypes of renal cell carcinoma (RCC) based on computed tomography (CT) images. Data from 668 consecutive patients with pathologically confirmed RCC were retrospectively collected from Center 1, and the model was trained using fivefold cross-validation to classify RCC subtypes into clear cell RCC (ccRCC), papillary RCC (pRCC), and chromophobe RCC (chRCC). An external validation with 78 patients from Center 2 was conducted to evaluate the performance of the model. In the fivefold cross-validation, the area under the receiver operating characteristic curve (AUC) for the classification of ccRCC, pRCC, and chRCC was 0.868 (95% CI, 0.826-0.923), 0.846 (95% CI, 0.812-0.886), and 0.839 (95% CI, 0.802-0.88), respectively. In the external validation set, the AUCs were 0.856 (95% CI, 0.838-0.882), 0.787 (95% CI, 0.757-0.818), and 0.793 (95% CI, 0.758-0.831) for ccRCC, pRCC, and chRCC, respectively. The model demonstrated robust performance in predicting the pathological subtypes of RCC, while the incorporated uncertainty emphasized the importance of understanding model confidence. The proposed approach, integrated with uncertainty estimation, offers clinicians a dual advantage: accurate RCC subtype predictions complemented by diagnostic confidence metrics, thereby promoting informed decision-making for patients with RCC.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab

Children Are Not Small Adults: Addressing Limited Generalizability of an Adult Deep Learning CT Organ Segmentation Model to the Pediatric Population.

Leveraging Ensemble Models and Follow-up Data for Accurate Prediction of mRS Scores from Radiomic Features of DSC-PWI Images.

Automated Neural Architecture Search for Cardiac Amyloidosis Classification from [18F]-Florbetaben PET Images.

Deep Learning Approaches for Brain Tumor Detection and Classification Using MRI Images (2020 to 2024): A Systematic Review.

MR Image Fusion-Based Parotid Gland Tumor Detection.

A Large Language Model to Detect Negated Expressions in Radiology Reports.

Deep Learning Classification of Ischemic Stroke Territory on Diffusion-Weighted MRI: Added Value of Augmenting the Input with Image Transformations.

Identification of Bipolar Disorder and Schizophrenia Based on Brain CT and Deep Learning Methods.

Automatic Segmentation of Ultrasound-Guided Quadratus Lumborum Blocks Based on Artificial Intelligence.

A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma Based on CT Images.

Ready to Sharpen Your Edge?