Page 469 of 7497488 results

Liu Y, Shi Z, Xiao C, Wang B

PubMed · Jul 10 2025
Gliomas are the most common malignant primary brain tumors, and grading their severity, particularly diagnosing low-grade gliomas, remains challenging for clinicians and radiologists. With advances in deep learning and medical image processing, Clinical Decision Support Systems (CDSS) for glioma grading offer significant benefits for clinical treatment. This study proposes a CDSS for glioma grading that integrates a novel feature extraction framework combining ensemble learning and knowledge distillation: teacher models are constructed through ensemble learning, and uncertainty-weighted ensemble averaging is applied during student model training to refine knowledge transfer. This approach narrows the teacher-student performance gap, enhancing grading accuracy, reliability, and clinical applicability while allowing lightweight deployment. Experimental results show 85.96% accuracy (a 5.2% improvement over the baseline), with precision (83.90%), recall (87.40%), and F1-score (83.90%) increasing by 7.5%, 5.1%, and 5.1%, respectively. The teacher-student performance gap is reduced to 3.2%, confirming the method's effectiveness. Furthermore, the CDSS not only delivers rapid and accurate glioma grading but also surfaces the critical features influencing the grading result and integrates a methodology for generating comprehensive diagnostic reports. The glioma grading CDSS thus represents a practical clinical decision support tool capable of delivering accurate and efficient auxiliary diagnoses for physicians and patients.
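The uncertainty-weighted ensemble averaging described above could be sketched as follows; this is an illustrative guess rather than the authors' implementation, and the use of predictive entropy as the uncertainty measure is an assumption (the abstract does not specify one):

```python
import numpy as np

def uncertainty_weighted_average(teacher_probs, eps=1e-12):
    """Combine teacher softmax outputs, down-weighting uncertain teachers.

    teacher_probs: array of shape (n_teachers, n_classes) for one sample.
    Uncertainty is taken here to be predictive entropy (an assumption;
    the abstract does not state the exact uncertainty measure).
    """
    probs = np.asarray(teacher_probs, dtype=float)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)  # per-teacher uncertainty
    weights = 1.0 / (entropy + eps)                         # confident teachers weigh more
    weights = weights / weights.sum()
    return weights @ probs                                  # weighted soft target for the student

# Example: two confident teachers and one uncertain one
teachers = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]
soft_target = uncertainty_weighted_average(teachers)
```

The resulting soft target replaces a plain ensemble mean in the distillation loss, so uncertain teachers contribute less to what the student learns.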

Huang W, Zheng S, Zhang X, Qi L, Li M, Zhang Q, Zhen Z, Yang X, Kong C, Li D, Hua G

PubMed · Jul 10 2025
Current radiomics work focuses on intratumoral regions and fixed peritumoral regions, and lacks an optimal peritumoral margin for predicting Ki-67 expression. The aim of this study was to develop a machine learning model that analyzes ultrasound radiomics features extracted from different peritumoral margins to determine the optimal peritumoral region for predicting Ki-67 expression. A total of 453 breast cancer patients were included and randomly assigned to training and validation sets in a 7:3 ratio. In the training cohort, machine learning models were constructed for the intratumoral region and for different peritumoral margins (2 mm, 4 mm, 6 mm, 8 mm, 10 mm), identifying the Ki-67-relevant features for each ROI and comparing the models to determine the best one. These models were then validated on a test cohort to find the peritumoral region that predicts Ki-67 most accurately. The area under the receiver operating characteristic curve (AUC) was used to evaluate performance in predicting Ki-67 expression, and the DeLong test was used to assess differences between AUCs. SHAP (SHapley Additive exPlanations) analysis was performed on the optimal model to quantify the contribution of the major radiomics features. In the validation cohort, the SVM model combining the intratumoral and 6 mm peritumoral regions showed the best predictive performance, with an AUC of 0.9342. This model differed significantly (P < 0.05) from all the other models. SHAP analysis showed that the 6 mm peritumoral features were more important than the intratumoral features. SVM models using the intratumoral and 6 mm peritumoral regions thus gave the best results for predicting Ki-67 expression.
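The margin-by-margin comparison can be mimicked with a small sketch: one SVM per peritumoral margin, compared by validation AUC. The features here are synthetic stand-ins for radiomics tables (the 6 mm features are made artificially most informative purely for illustration), and all effect sizes are hypothetical:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-ins for radiomics feature tables: one per peritumoral margin.
# In the study these would come from 2/4/6/8/10 mm ring ROIs around the tumor.
margins_mm = [2, 4, 6, 8, 10]
n, p = 453, 20
y = rng.integers(0, 2, size=n)          # Ki-67 high (1) vs low (0), toy labels

aucs = {}
for m in margins_mm:
    # make the 6 mm features artificially the most informative, for illustration only
    signal = 1.5 if m == 6 else 0.5
    X = rng.normal(size=(n, p)) + signal * y[:, None] * rng.normal(0.3, 0.05, size=p)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)
    clf = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
    clf.fit(X_tr, y_tr)
    aucs[m] = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

best_margin = max(aucs, key=aucs.get)   # margin with the highest validation AUC
```

In practice the per-margin AUCs would also be compared pairwise with the DeLong test before declaring a winner.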

Paliwoda ED, Newman-Plotnick H, Buzzetta AJ, Post NK, LaClair JR, Trandafirescu M, Gildener-Leapman N, Kpodzo DS, Edwards K, Tafen M, Schalet BJ

PubMed · Jul 10 2025
Nasal bone fractures represent the most common facial skeletal injury, challenging both function and aesthetics. This Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)-based review analyzed 23 studies published within the past 5 years, selected from 998 records retrieved from PubMed, Embase, and Web of Science. Data from 1780 participants were extracted, focusing on diagnostic methods, surgical techniques, anesthesia protocols, and long-term outcomes. Ultrasound and artificial intelligence-based algorithms improved diagnostic accuracy, while telephone triage streamlined necessary encounters. Navigation-assisted reduction, ballooning, and septal reduction with polydioxanone plates improved outcomes. Anesthetic approaches ranged from local nerve blocks to general anesthesia with intraoperative lidocaine, alongside techniques to manage pain from postoperative nasal pack removal. Long-term follow-up demonstrated improved quality of life, breathing function, and aesthetic satisfaction with timely, individualized treatment. This review highlights the trend toward personalized, technology-assisted approaches in nasal fracture management and identifies areas for future research.

Wang P, Li D, Zhang Y, Chen G, Wang Y, Ma J, He J

PubMed · Jul 10 2025
Although supervised deep learning methods have made significant advances in low-dose computed tomography (LDCT) image denoising, these approaches typically require pairs of low-dose and normal-dose CT images for training, which are often unavailable in clinical settings. Self-supervised deep learning (SSDL) has great potential to remove the dependence on paired training datasets. However, existing SSDL methods are limited by the assumption of neighboring-noise independence, making them ineffective for handling the spatially correlated noise in LDCT images. To address this issue, this paper introduces a novel SSDL approach, named Noise-Aware Blind Spot Network (NA-BSN), for high-quality LDCT imaging that mitigates the dependence on the neighboring-noise-independence assumption. NA-BSN achieves high-quality image reconstruction without referencing clean data through an explicit noise-aware constraint mechanism applied during self-supervised learning. Specifically, it is experimentally observed and theoretically proven that the l1-norm value of CT images in a downsampled space follows a descending trend as the radiation dose increases; this relationship is then used to construct the explicit noise-aware constraint in the BSN architecture for self-supervised LDCT image denoising. Various clinical datasets were adopted to validate the performance of the presented NA-BSN method. Experimental results reveal that NA-BSN significantly reduces the spatially correlated CT noise and retains crucial image details in various complex scenarios, such as different types of scanning machines, scanning positions, dose-level settings, and reconstruction kernels.
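The reported relationship between radiation dose and the l1 norm in a downsampled space can be illustrated with a toy experiment: average-pool a noisy image and measure the mean absolute value as the noise level (which grows as dose falls) changes. This is only a sketch of the monotone trend, not the paper's actual constraint, and the zero-mean toy image standing in for a CT residual is an assumption:

```python
import numpy as np

def downsampled_l1(img, factor=4):
    """Mean absolute value after simple average-pooling downsampling."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    pooled = img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.abs(pooled).mean()

rng = np.random.default_rng(0)
clean = np.zeros((128, 128))        # toy zero-mean stand-in for a CT residual image
noise_sigmas = [0.5, 0.2, 0.05]     # lower dose -> stronger noise -> larger sigma
l1_values = [downsampled_l1(clean + rng.normal(0, s, clean.shape)) for s in noise_sigmas]
# l1_values decreases along this list, i.e. the downsampled l1 norm
# descends as the (simulated) dose increases, matching the stated trend.
```

A constraint built on this quantity can then penalize reconstructions whose downsampled l1 norm looks "low-dose-like".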

Wu Y, Zheng Y, Zhu J, Chen X, Dong F, He L, Zhu J, Cheng G, Wang P, Zhou S

PubMed · Jul 10 2025
To construct a deep learning-based SCNet model that automatically measures X-ray imaging parameters related to atlantoaxial subluxation (AAS) on cervical open-mouth view radiographs, and to evaluate the model's accuracy and reliability. A total of 1973 cervical open-mouth view radiographs were collected from the picture archiving and communication systems (PACS) of two hospitals (Hospitals A and B). Among them, 365 images from Hospital A were randomly selected as the internal test dataset for evaluating the model's performance, and the remaining 1364 images from Hospital A were used as the training and validation datasets for constructing the model and tuning its hyperparameters, respectively. The 244 images from Hospital B were used as an external test dataset to evaluate the robustness and generalizability of the model. The model identified and marked landmarks in the images for four parameters: lateral atlanto-dental space (LADS), atlas lateral mass inclination (ALI), lateral mass width (LW), and axis spinous process deviation distance (ASDD). Landmark measurements on the internal and external test datasets were compared against the mean of manual measurements by three radiologists as the reference standard. Percentage of correct key-points (PCK), intra-class correlation coefficient (ICC), mean absolute error (MAE), Pearson correlation coefficient (r), mean square error (MSE), root mean square error (RMSE), and Bland-Altman plots were used to evaluate the performance of the SCNet model. (1) Within the 2 mm distance threshold, the PCK of the SCNet-predicted landmarks was 98.6-99.7% on the internal test dataset and 98-100% on the external test dataset. (2) In the internal test dataset, the SCNet predictions and manual measurements of LADS, ALI, LW, and ASDD showed strong correlation and consistency (ICC = 0.80-0.96, r = 0.86-0.96, MAE = 0.47-2.39 mm/°, MSE = 0.38-8.55 mm<sup>2</sup>/°<sup>2</sup>, RMSE = 0.62-2.92 mm/°). (3) The same four parameters also showed strong correlation and consistency in the external test dataset (ICC = 0.81-0.91, r = 0.82-0.91, MAE = 0.46-2.29 mm/°, MSE = 0.29-8.23 mm<sup>2</sup>/°<sup>2</sup>, RMSE = 0.54-2.87 mm/°). The SCNet model constructed in this study can accurately identify atlantoaxial vertebral landmarks on cervical open-mouth view radiographs and automatically measure AAS-related imaging parameters. The independent external test set further demonstrates that the model is robust and generalizes to radiographs that meet acquisition standards.
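The PCK metric used above has a simple definition: the fraction of predicted landmarks that fall within a distance threshold (here 2 mm) of the reference landmarks. A minimal sketch, assuming isotropic pixel spacing (the coordinates below are made-up values):

```python
import numpy as np

def percentage_correct_keypoints(pred, gt, threshold_mm=2.0, pixel_spacing=1.0):
    """PCK: fraction of predicted landmarks within `threshold_mm` of ground truth.

    pred, gt: arrays of shape (n_landmarks, 2) in pixel coordinates.
    pixel_spacing converts pixel distances to millimetres (assumed isotropic).
    """
    dists = np.linalg.norm((np.asarray(pred) - np.asarray(gt)) * pixel_spacing, axis=1)
    return float(np.mean(dists <= threshold_mm))

pred = [[10.0, 10.0], [50.0, 52.5], [80.0, 90.0]]
gt   = [[10.5, 10.0], [50.0, 50.0], [80.0, 85.0]]
pck = percentage_correct_keypoints(pred, gt)  # distances 0.5, 2.5, 5.0 -> only 1 of 3 within 2 mm
```

The same distances feed the MAE/MSE/RMSE figures once converted to the parameter values (mm or degrees) derived from the landmarks.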

Li Y, Li X, Yan L, Xiao J, Yang Z, Zhang M, Luo Y

PubMed · Jul 10 2025
To evaluate the diagnostic performance of multiparametric ultrasound (mpUS) and AI-assisted B-mode ultrasound (AI-US) for solid thyroid nodules, and their potential to reduce unnecessary biopsies relative to B-mode ultrasound. This prospective study enrolled 226 solid thyroid nodules (145 malignant and 81 benign on pathology) from 189 patients (35 men and 154 women; age range, 19-73 years; mean age, 45 years). Each nodule was examined using B-mode, microvascular flow imaging (MVFI), elastography with the elasticity contrast index (ECI), and an AI system, and image data were recorded for each modality. Ten readers with different experience levels independently evaluated the B-mode images of each nodule and rendered a "benign" or "malignant" diagnosis, both blinded and unblinded to the AI reports. The most accurate ECI value and MVFI mode were selected and combined with the dichotomous predictions of all readers. Descriptive statistics and AUCs were used to evaluate the diagnostic performance of mpUS and AI-US. Triple mpUS combining B-mode, MVFI, and ECI exhibited the highest diagnostic performance (average AUC = 0.811 vs. 0.677 for B-mode, p = 0.001), followed by AI-US (average AUC = 0.718, p = 0.315). Triple mpUS significantly reduced the unnecessary biopsy rate, by up to 12% (p = 0.007). AUC and specificity were significantly higher for triple mpUS than for AI-US (both p < 0.05). Compared to AI-US, triple mpUS (B-mode, MVFI, and ECI) exhibited better diagnostic performance for thyroid cancer and significantly reduced the unnecessary biopsy rate. AI systems are expected to exploit such multimodal information to facilitate diagnosis.

Umezu M, Kondo Y, Ichikawa S, Sasaki Y, Kaneko K, Ozaki T, Koizumi N, Seki H

PubMed · Jul 10 2025
Predicting the risk of breast cancer recurrence is crucial for guiding therapeutic strategies, including enhanced surveillance and the consideration of additional treatment after surgery. In this study, we developed a deep convolutional neural network (DCNN) model to predict recurrence within six years after surgery using preoperative contrast-enhanced computed tomography (CECT) images, which are widely available and effective for detecting distant metastases. This retrospective study included preoperative CECT images from 133 patients with invasive ductal carcinoma. The images were classified into recurrence and no-recurrence groups using ResNet-101 and DenseNet-201. Classification performance was evaluated using the area under the receiver operating characteristic curve (AUC) with leave-one-patient-out cross-validation. At the optimal threshold, the classification accuracies of ResNet-101 and DenseNet-201 were 0.73 and 0.72, respectively. The median (interquartile range) AUC of DenseNet-201 (0.70 [0.69-0.72]) was significantly higher than that of ResNet-101 (0.68 [0.66-0.68]) (p < 0.05). These results suggest the potential of preoperative CECT-based DCNN models to predict breast cancer recurrence without additional invasive procedures.
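Leave-one-patient-out cross-validation, as used above, holds out every image of one patient per fold so that no patient contributes to both training and testing. A minimal sketch with synthetic features and a logistic-regression stand-in for the DCNNs (all data and effect sizes here are toy values):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Toy stand-ins: a few CECT slices per patient, one recurrence label per patient.
n_patients, slices_per_patient, n_feat = 30, 3, 8
groups = np.repeat(np.arange(n_patients), slices_per_patient)
y_patient = rng.integers(0, 2, n_patients)
y_patient[:2] = [0, 1]              # ensure both classes exist in this toy cohort
y = y_patient[groups]
X = rng.normal(size=(len(y), n_feat)) + 0.8 * y[:, None]

scores = np.zeros(len(y))
logo = LeaveOneGroupOut()           # one fold per patient = leave-one-patient-out
for train_idx, test_idx in logo.split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores[test_idx] = clf.predict_proba(X[test_idx])[:, 1]

auc = roc_auc_score(y, scores)      # AUC over the pooled held-out predictions
```

Grouping by patient matters because slices from one patient are correlated; splitting them across folds would inflate the apparent AUC.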

Choubey AP, Eguia E, Hollingsworth A, Chatterjee S, D'Angelica MI, Jarnagin WR, Wei AC, Schattner MA, Do RKG, Soares KC

PubMed · Jul 10 2025
Manual curation of radiographic features in pancreatic cyst registries for data abstraction and longitudinal evaluation is time-consuming and limits widespread implementation. We examined the feasibility and accuracy of using large language models (LLMs) to extract clinical variables from radiology reports. This single-center retrospective study included patients under surveillance for pancreatic cysts. Nine radiographic elements used to monitor cyst progression were included: cyst size and main pancreatic duct (MPD) size (continuous); number of lesions (multi-class); and MPD dilation ≥5 mm, branch duct dilation, presence of a solid component, calcific lesion, pancreatic atrophy, and pancreatitis (categorical). An LLM (OpenAI's GPT-4) was employed to extract the elements of interest in a zero-shot approach, using prompting alone without any training data. A manually annotated institutional cyst database served as the ground truth (GT) for comparison. Overall, 3198 longitudinal scans from 991 patients were included. GPT-4 extracted the selected radiographic elements with high accuracy. Among categorical variables, accuracy ranged from 97% for solid component to 99% for calcific lesions. For the continuous variables, accuracy varied from 92% for cyst size to 97% for MPD size; however, Cohen's kappa was higher for cyst size (0.92) than for MPD size (0.82). The lowest accuracy (81%) was seen in the multi-class variable for number of cysts. LLMs can accurately extract and curate data from radiology reports for pancreatic cyst surveillance and can reliably be used to assemble longitudinal databases. Future applications of this work may enable the development of artificial intelligence-based surveillance models.
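A zero-shot setup like the one above can be sketched as a prompt stating the nine-element schema plus a strict-JSON parser for the reply. The element names, prompt wording, and example report values below are hypothetical (the study's actual prompts are not published in this abstract), and the commented OpenAI call only marks where the model would be invoked:

```python
import json

# Hypothetical element names for illustration; the study's exact schema is not given.
ELEMENTS = [
    "cyst_size_mm", "mpd_size_mm", "number_of_lesions", "mpd_dilation_ge_5mm",
    "branch_duct_dilation", "solid_component", "calcific_lesion",
    "pancreatic_atrophy", "pancreatitis",
]

def build_prompt(report_text: str) -> str:
    """Zero-shot prompt: no worked examples, just the schema and the report."""
    return (
        "Extract the following fields from the radiology report below. "
        "Return strict JSON with exactly these keys, using null when a field "
        f"is not mentioned: {', '.join(ELEMENTS)}.\n\nReport:\n{report_text}"
    )

def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply, keeping only the expected keys."""
    data = json.loads(raw)
    return {k: data.get(k) for k in ELEMENTS}

# The call itself would go through the OpenAI chat API, e.g.
# client.chat.completions.create(model="gpt-4",
#     messages=[{"role": "user", "content": build_prompt(report)}])
# Here we parse a hypothetical reply instead of calling the API:
reply = ('{"cyst_size_mm": 21, "mpd_size_mm": 3, "number_of_lesions": 1, '
         '"mpd_dilation_ge_5mm": false, "branch_duct_dilation": false, '
         '"solid_component": false, "calcific_lesion": false, '
         '"pancreatic_atrophy": true, "pancreatitis": false}')
parsed = parse_response(reply)
```

Comparing `parsed` against the manually annotated registry row then yields the per-element accuracy and kappa figures the study reports.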

Ethan Dack, Chengliang Dai

arXiv preprint · Jul 10 2025
Recent work has revisited the well-known "name that dataset" task and shown that non-medical datasets carry an underlying bias, achieving high accuracies on the dataset-origin task. In this work, we revisit the same task applied to popular open-source chest X-ray datasets. Medical images are naturally more difficult to release as open source due to their sensitive nature, which has made certain open-source datasets extremely popular for research purposes. By performing the same task, we explore whether dataset bias also exists in these datasets. We deliberately increase the difficulty of the task by applying simple transformations to the datasets in an attempt to isolate bias. Given the importance of AI applications in medical imaging, it is vital to establish whether modern methods are taking shortcuts or focusing on the relevant pathology. We implement a range of network architectures on the NIH, CheXpert, MIMIC-CXR, and PadChest datasets. We hope this work will encourage more explainable research in medical imaging and the creation of more open-source datasets in the medical domain. The corresponding code will be released upon acceptance.

Lee M, Hwang EJ, Lee JH, Nam JG, Lim WH, Park H, Park CM, Choi H, Park J, Goo JM

PubMed · Jul 10 2025
<b>BACKGROUND</b>. Artificial intelligence (AI) tools for evaluating low-dose CT (LDCT) lung cancer screening examinations are used predominantly for assisting radiologists' interpretations. Alternate utilization scenarios (e.g., use of AI as a prescreener or backup) warrant consideration. <b>OBJECTIVE</b>. The purpose of this study was to evaluate the impact of different AI utilization scenarios on diagnostic outcomes and interpretation times for LDCT lung cancer screening. <b>METHODS</b>. This retrospective study included 366 individuals (358 men, 8 women; mean age, 64 years) who underwent LDCT from May 2017 to December 2017 as part of an earlier prospective lung cancer screening trial. Examinations were interpreted by one of five readers, who reviewed their assigned cases in two sessions (with and without a commercial AI computer-aided detection tool). These interpretations were used to reconstruct simulated AI utilization scenarios: as an assistant (i.e., radiologists interpret all examinations with AI assistance), as a prescreener (i.e., radiologists only interpret examinations with a positive AI result), or as backup (i.e., radiologists reinterpret examinations when AI suggests a missed finding). A group of thoracic radiologists determined the reference standard. Diagnostic outcomes and mean interpretation times were assessed. Decision-curve analysis was performed. <b>RESULTS</b>. 
Compared with interpretation without AI (recall rate, 22.1%; per-nodule sensitivity, 64.2%; per-examination specificity, 88.8%; mean interpretation time, 164 seconds), AI as an assistant showed higher recall rate (30.3%; <i>p</i> < .001), lower per-examination specificity (81.1%), and no significant change in per-nodule sensitivity (64.8%; <i>p</i> = .86) or mean interpretation time (161 seconds; <i>p</i> = .48); AI as a prescreener showed lower recall rate (20.8%; <i>p</i> = .02) and mean interpretation time (143 seconds; <i>p</i> = .001), higher per-examination specificity (90.3%; <i>p</i> = .04), and no significant difference in per-nodule sensitivity (62.9%; <i>p</i> = .16); and AI as a backup showed increased recall rate (33.6%; <i>p</i> < .001), per-examination sensitivity (66.4%; <i>p</i> < .001), and mean interpretation time (225 seconds; <i>p</i> = .001), with lower per-examination specificity (79.9%; <i>p</i> < .001). Among scenarios, only AI as a prescreener demonstrated higher net benefit than interpretation without AI; AI as an assistant had the least net benefit. <b>CONCLUSION</b>. Different AI implementation approaches yield varying outcomes. The findings support use of AI as a prescreener as the preferred scenario. <b>CLINICAL IMPACT</b>. An approach whereby radiologists only interpret LDCT examinations with a positive AI result can reduce radiologists' workload while preserving sensitivity.
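The decision-curve analysis above compares scenarios by net benefit, defined at a probability threshold p_t as TP/N - FP/N * p_t/(1 - p_t). A minimal sketch with toy recall decisions (the probabilities and threshold are illustrative, not the study's data):

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at a probability threshold.

    net benefit = TP/N - FP/N * pt / (1 - pt)
    """
    y_true = np.asarray(y_true)
    called = np.asarray(y_prob) >= threshold   # examinations that would be recalled
    n = len(y_true)
    tp = np.sum(called & (y_true == 1))
    fp = np.sum(called & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy example: hypothetical recall probabilities from two strategies
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
p_prescreener = [0.9, 0.8, 0.2, 0.1, 0.1, 0.1, 0.2, 0.1]  # recalls only the true findings
p_assistant   = [0.9, 0.8, 0.6, 0.6, 0.1, 0.1, 0.2, 0.1]  # adds two false recalls
nb_pre = net_benefit(y_true, p_prescreener, 0.5)
nb_ast = net_benefit(y_true, p_assistant, 0.5)
```

Sweeping the threshold and plotting net benefit for each scenario (plus "recall all" and "recall none" baselines) reproduces the decision curves used to rank the utilization scenarios.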
