Page 425 of 449 · 4481 results

Explainability Through Human-Centric Design for XAI in Lung Cancer Detection

Amy Rafferty, Rishi Ramaesh, Ajitha Rajan

arxiv logopreprintMay 14 2025
Deep learning models have shown promise in lung pathology detection from chest X-rays, but widespread clinical adoption remains limited due to opaque model decision-making. In prior work, we introduced ClinicXAI, a human-centric, expert-guided concept bottleneck model (CBM) designed for interpretable lung cancer diagnosis. We now extend that approach and present XpertXAI, a generalizable expert-driven model that preserves human-interpretable clinical concepts while scaling to detect multiple lung pathologies. Using a high-performing InceptionV3-based classifier and a public dataset of chest X-rays with radiology reports, we compare XpertXAI against leading post-hoc explainability methods and an unsupervised CBM, XCBs. We assess explanations through comparison with expert radiologist annotations and medical ground truth. Although XpertXAI is trained for multiple pathologies, our expert validation focuses on lung cancer. We find that existing techniques frequently fail to produce clinically meaningful explanations, omitting key diagnostic features and disagreeing with radiologist judgments. XpertXAI not only outperforms these baselines in predictive accuracy but also delivers concept-level explanations that better align with expert reasoning. While our focus remains on explainability in lung cancer detection, this work illustrates how human-centric model design can be effectively extended to broader diagnostic contexts - offering a scalable path toward clinically meaningful explainable AI in medical diagnostics.
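A minimal numpy sketch of the concept-bottleneck idea behind ClinicXAI/XpertXAI: the image model first predicts human-interpretable clinical concepts, and the final diagnosis is a function of those concepts only. The concept names and weights below are illustrative placeholders, not the study's actual concepts or architecture.

```python
import numpy as np

CONCEPTS = ["mass", "effusion", "consolidation", "nodule"]  # hypothetical concept set

def concept_bottleneck(features, W_concept, w_label):
    """features -> concept scores -> label; only concepts feed the label."""
    concepts = 1.0 / (1.0 + np.exp(-features @ W_concept))   # concept probabilities
    label_logit = concepts @ w_label                         # diagnosis from concepts only
    return concepts, 1.0 / (1.0 + np.exp(-label_logit))

rng = np.random.default_rng(0)
feat = rng.normal(size=16)                 # stand-in for backbone CNN features
W = rng.normal(size=(16, len(CONCEPTS)))
w = np.array([1.5, 0.5, 0.8, 1.2])         # illustrative concept-to-diagnosis weights
concepts, p_cancer = concept_bottleneck(feat, W, w)
for name, c in zip(CONCEPTS, concepts):
    print(f"{name}: {c:.2f}")
print(f"P(cancer) = {p_cancer:.2f}")
```

Because the diagnosis depends only on the concept scores, each prediction can be read off as a weighted combination of named clinical findings, which is what makes the explanations concept-level rather than pixel-level.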

Development and Validation of Ultrasound Hemodynamic-based Prediction Models for Acute Kidney Injury After Renal Transplantation.

Ni ZH, Xing TY, Hou WH, Zhao XY, Tao YL, Zhou FB, Xing YQ

pubmed logopapersMay 14 2025
Acute kidney injury (AKI) after renal transplantation often has a poor prognosis. This study aimed to identify patients at elevated risk of AKI after kidney transplantation. A retrospective analysis was conducted on 422 patients who underwent kidney transplantation from January 2020 to April 2023. Participants from 2020 to 2022 were randomized to a training group (n=261) and validation group 1 (n=113); those from 2023 formed validation group 2 (n=48). Risk factors were identified using logistic regression analysis together with the least absolute shrinkage and selection operator (LASSO), drawing on ultrasound hemodynamic, clinical, and laboratory information. Prediction models were developed using logistic regression analysis and six machine-learning techniques. The logistic regression model was evaluated for discrimination, calibration, and clinical applicability, and a nomogram was created to illustrate it. SHapley Additive exPlanations (SHAP) were used to explain and visualize the best of the six machine-learning models. LASSO combined with logistic regression identified five risk factors, which were incorporated into the predictive model. The logistic regression model (AUC=0.927 in validation set 1; AUC=0.968 in validation set 2) and the random forest model (AUC=0.946 in validation set 1; AUC=0.996 in validation set 2) performed well after validation, with no significant difference in predictive accuracy. These findings can assist clinicians in the early identification of patients at high risk of AKI, allowing timely interventions and potentially improving prognosis after kidney transplantation.
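A hedged scikit-learn sketch of the two-stage pipeline the abstract describes (LASSO feature selection followed by logistic regression); the features and outcomes below are synthetic placeholders, not the study's ultrasound or laboratory variables.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(422, 20))        # 422 patients, 20 candidate predictors (synthetic)
y = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.5, size=422) > 0).astype(int)

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

# Stage 1: LASSO shrinks uninformative coefficients to exactly zero.
lasso = LassoCV(cv=5, random_state=0).fit(X_tr, y_tr)
selected = np.flatnonzero(lasso.coef_)

# Stage 2: logistic regression on the retained predictors only.
clf = LogisticRegression(max_iter=1000).fit(X_tr[:, selected], y_tr)
auc = roc_auc_score(y_va, clf.predict_proba(X_va[:, selected])[:, 1])
print(f"selected {selected.size} features, validation AUC = {auc:.3f}")
```

The coefficients of the final logistic model are what a nomogram such as the one in the study visualizes, one axis per retained predictor.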

Automated whole-breast ultrasound tumor diagnosis using attention-inception network.

Zhang J, Huang YS, Wang YW, Xiang H, Lin X, Chang RF

pubmed logopapersMay 14 2025
Automated Whole-Breast Ultrasound (ABUS) has been widely used as an important tool in breast cancer diagnosis because it provides complete three-dimensional (3D) images of the breast. To reduce the risk of misdiagnosis, computer-aided diagnosis (CADx) systems have been proposed to assist radiologists. Convolutional neural networks (CNNs), renowned for their automatic feature extraction capabilities, have developed rapidly in medical image analysis, and this study proposes a CADx system based on a 3D CNN for ABUS. This study used a private dataset collected at Sun Yat-Sen University Cancer Center (SYSUCC) from 396 breast tumor patients. First, the tumor volume of interest (VOI) was extracted and resized, and the tumor was enhanced by histogram equalization. Second, a 3D U-Net++ was employed to segment the tumor mask. Finally, the VOI, the enhanced VOI, and the corresponding tumor mask were fed into a 3D Attention-Inception network to classify the tumor as benign or malignant. The experimental results indicate an accuracy of 89.4%, a sensitivity of 91.2%, a specificity of 87.6%, and an area under the receiver operating characteristic curve (AUC) of 0.9262, suggesting that the proposed CADx system for ABUS images rivals the performance of experienced radiologists in tumor diagnosis tasks. This study proposes a CADx system consisting of a 3D U-Net++ tumor segmentation model and a 3D Attention-Inception tumor classification model for diagnosis in ABUS images. The results indicate that the proposed CADx system is effective and efficient in tumor diagnosis tasks.
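A minimal numpy sketch of the histogram-equalization step this pipeline applies to the tumor VOI before classification; the 3D U-Net++ and Attention-Inception stages are omitted, and the synthetic volume below merely stands in for a low-contrast ultrasound VOI.

```python
import numpy as np

def equalize_volume(vol: np.ndarray, levels: int = 256) -> np.ndarray:
    """Map voxel intensities through the volume's own normalized CDF."""
    hist, bin_edges = np.histogram(vol.ravel(), bins=levels)
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalize CDF to [0, 1]
    idx = np.clip(np.digitize(vol, bin_edges[1:-1]), 0, levels - 1)
    return cdf[idx]                                     # equalized intensities

rng = np.random.default_rng(1)
voi = rng.gamma(shape=2.0, scale=30.0, size=(32, 32, 32))  # synthetic low-contrast VOI
eq = equalize_volume(voi)
print(eq.min(), eq.max())
```

Equalization spreads the intensity distribution so that subtle tumor texture occupies more of the dynamic range before the volume is passed to the classifier.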

Single View Echocardiographic Analysis for Left Ventricular Outflow Tract Obstruction Prediction in Hypertrophic Cardiomyopathy: A Deep Learning Approach

Kim, J., Park, J., Jeon, J., Yoon, Y. E., Jang, Y., Jeong, H., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Cho, G.-Y., Chang, H.-J.

medrxiv logopreprintMay 14 2025
Background: Accurate left ventricular outflow tract obstruction (LVOTO) assessment is crucial for hypertrophic cardiomyopathy (HCM) management and prognosis. Traditional methods, requiring multiple views, Doppler, and provocation, are often infeasible, especially where resources are limited. This study aimed to develop and validate a deep learning (DL) model capable of predicting severe LVOTO in HCM patients using only the parasternal long-axis (PLAX) view from transthoracic echocardiography (TTE). Methods: A DL model was trained on PLAX videos extracted from TTE examinations (developmental dataset, n=1,007) to capture both morphological and dynamic motion features, generating a DL index for LVOTO (DLi-LVOTO, range 0-100). Performance was evaluated in an internal test dataset (ITDS, n=87) and externally validated in a distinct hospital dataset (DHDS, n=1,334) and an LVOTO reduction treatment dataset (n=156). Results: The model achieved high accuracy in detecting severe LVOTO (pressure gradient ≥ 50 mmHg), with an area under the receiver operating characteristic curve (AUROC) of 0.97 (95% confidence interval: 0.92-1.00) in ITDS and 0.93 (0.92-0.95) in DHDS. At a DLi-LVOTO threshold of 70, the model demonstrated a specificity of 97.3% and a negative predictive value (NPV) of 96.1% in ITDS. In DHDS, a cutoff of 60 yielded a specificity of 94.6% and an NPV of 95.5%. DLi-LVOTO also decreased significantly after surgical myectomy or mavacamten treatment, correlating with reductions in peak pressure gradient (p<0.001 for all). Conclusions: Our DL-based approach predicts severe LVOTO using only the PLAX view from TTE, serving as a complementary tool, particularly in resource-limited settings or when Doppler is unavailable, and for monitoring treatment response.
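A sketch of how a DLi-LVOTO-style index (0-100) is thresholded into a screening decision and how specificity and NPV are then computed; the scores and labels below are synthetic, not study data.

```python
import numpy as np

def spec_npv(scores, labels, cutoff):
    """Specificity and NPV for a 'flag if score >= cutoff' rule."""
    pred = scores >= cutoff                    # flag as severe LVOTO
    tn = np.sum(~pred & (labels == 0))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    return specificity, npv

rng = np.random.default_rng(2)
labels = (rng.random(500) < 0.2).astype(int)   # ~20% severe cases (synthetic prevalence)
scores = np.where(labels == 1,
                  rng.normal(80, 10, 500),     # severe cases score high
                  rng.normal(30, 15, 500))     # non-severe cases score low
spec, npv = spec_npv(scores, labels, cutoff=70)
print(f"specificity={spec:.3f}, NPV={npv:.3f}")
```

A high NPV at a high cutoff is what makes such an index useful as a rule-out tool when Doppler confirmation is unavailable.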

Assessing artificial intelligence in breast screening with stratified results on 306 839 mammograms across geographic regions, age, breast density and ethnicity: A Retrospective Investigation Evaluating Screening (ARIES) study.

Oberije CJG, Currie R, Leaver A, Redman A, Teh W, Sharma N, Fox G, Glocker B, Khara G, Nash J, Ng AY, Kecskemethy PD

pubmed logopapersMay 14 2025
Evaluate an Artificial Intelligence (AI) system in breast screening through stratified results across age, breast density, ethnicity and screening centres from different UK regions. A large-scale retrospective study evaluating two variations of using AI as an independent second reader in double reading was executed. Stratifications were conducted for clinical and operational metrics. Data from 306 839 mammography cases screened between 2017 and 2021 were used, covering three different UK regions. The impact on safety and effectiveness was assessed using clinical metrics: cancer detection rate and positive predictive value, stratified according to age, breast density and ethnicity. Operational impact was assessed through reading workload and recall rate, measured overall and per centre. Non-inferiority was tested for AI workflows compared with human double reading and, when passed, superiority was tested. The AI interval cancer (IC) flag rate was assessed to estimate the additional cancer detection opportunity with AI that cannot be assessed retrospectively. The AI workflows passed non-inferiority or superiority tests for every metric across all subgroups, with workload savings between 38.3% and 43.7%. The AI standalone flagged 41.2% of ICs overall, ranging between 33.3% and 46.8% across subgroups, with the highest detection rate for dense breasts. Human double reading and the AI workflows showed the same performance disparities across subgroups. The AI integrations maintained or improved performance on all metrics for all subgroups while achieving significant workload reduction. Moreover, complementing these integrations with AI as an additional reader can improve cancer detection. The granularity of assessment showed that screening with the AI-system integrations was as safe as standard double reading across heterogeneous populations.

Predicting response to anti-VEGF therapy in neovascular age-related macular degeneration using random forest and SHAP algorithms.

Zhang P, Duan J, Wang C, Li X, Su J, Shang Q

pubmed logopapersMay 14 2025
This study aimed to establish and validate a prediction model based on machine learning methods and the SHAP algorithm to predict response to anti-vascular endothelial growth factor (VEGF) therapy in neovascular age-related macular degeneration (AMD). In this retrospective study, we extracted data including demographic characteristics, laboratory test results, and imaging features from optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA). Eight machine learning methods (Logistic Regression, Gradient Boosting Decision Tree, Random Forest, CatBoost, Support Vector Machine, XGBoost, LightGBM, and K-Nearest Neighbors) were employed to develop the predictive model. The machine learning method with optimal performance was selected for further interpretation. Finally, the SHAP algorithm was applied to explain the model's predictions. The study included 145 patients with neovascular AMD. Among the eight models developed, the Random Forest model demonstrated the best overall performance, achieving a high accuracy of 75.86% and the highest area under the receiver operating characteristic curve (AUC) value of 0.91. In this model, the features identified as significant contributors to the response to anti-VEGF therapy in neovascular AMD patients included fractal dimension, total number of end points, total number of junctions, total vessel length, vessel area, average lacunarity, choroidal neovascularization (CNV) type, age, duration, and logMAR BCVA. SHAP analysis and visualization provided interpretation at both the factor level and the individual level. The Random Forest model for predicting response to anti-VEGF therapy in neovascular AMD using the SHAP algorithm proved to be feasible and effective. OCTA imaging features, such as fractal dimension and total number of end points, were the most effective predictive factors.
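A hedged sketch of the Random Forest stage; synthetic features stand in for the OCTA metrics (fractal dimension, junction counts, etc.). SHAP attribution itself requires the external `shap` package, so the built-in impurity-based feature importances are shown here as a simpler stand-in for ranking predictors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.normal(size=(145, 10))            # 145 patients, 10 imaging features (synthetic)
y = (X[:, 0] - X[:, 3] + rng.normal(scale=0.8, size=145) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)
rf = RandomForestClassifier(n_estimators=300, random_state=3).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])

# Rank features by impurity-based importance (SHAP would give per-patient values).
ranking = np.argsort(rf.feature_importances_)[::-1]
print(f"AUC={auc:.3f}, top feature index={ranking[0]}")
```

SHAP goes further than this global ranking by attributing each individual prediction to each feature, which is what enables the factor-level and individual-level interpretation the study reports.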

Multi-Task Deep Learning for Predicting Metabolic Syndrome from Retinal Fundus Images in a Japanese Health Checkup Dataset

Itoh, T., Nishitsuka, K., Fukuma, Y., Wada, S.

medrxiv logopreprintMay 14 2025
Background: Retinal fundus images provide a noninvasive window into systemic health, offering opportunities for early detection of metabolic disorders such as metabolic syndrome (METS). Objective: This study aimed to develop a deep learning model to predict METS from fundus images obtained during routine health checkups, leveraging a multi-task learning approach. Methods: We retrospectively analyzed 5,000 fundus images from Japanese health checkup participants. Convolutional neural network (CNN) models were trained to classify METS status, incorporating fundus-specific data augmentation strategies and auxiliary regression tasks targeting clinical parameters such as abdominal circumference (AC). Model performance was evaluated using validation accuracy, test accuracy, and the area under the receiver operating characteristic curve (AUC). Results: Models employing fundus-specific augmentation demonstrated more stable convergence and superior validation accuracy compared to general-purpose augmentation. Incorporating AC as an auxiliary task further enhanced performance across architectures. The final ensemble model with test-time augmentation achieved a test accuracy of 0.696 and an AUC of 0.73178. Conclusion: Combining multi-task learning, fundus-specific data augmentation, and ensemble prediction substantially improves deep learning-based METS classification from fundus images. This approach may offer a practical, noninvasive screening tool for metabolic syndrome in general health checkup settings.
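A numpy sketch of the multi-task objective described above: a classification loss for METS status plus an auxiliary regression loss on abdominal circumference (AC). The `aux_weight` balancing term is an assumption for illustration; the study does not report its loss weighting.

```python
import numpy as np

def multitask_loss(cls_logit, cls_label, ac_pred, ac_true, aux_weight=0.3):
    """Binary cross-entropy for METS plus weighted MSE for the AC auxiliary task."""
    p = 1.0 / (1.0 + np.exp(-cls_logit))                 # sigmoid over logits
    bce = -(cls_label * np.log(p) + (1 - cls_label) * np.log(1 - p)).mean()
    mse = ((ac_pred - ac_true) ** 2).mean()              # auxiliary regression loss
    return bce + aux_weight * mse

rng = np.random.default_rng(4)
loss = multitask_loss(rng.normal(size=8),                # synthetic batch of 8
                      rng.integers(0, 2, 8).astype(float),
                      rng.normal(85, 5, 8),              # predicted AC, cm
                      rng.normal(85, 5, 8))              # true AC, cm
print(loss)
```

The auxiliary term forces the shared backbone to encode AC-relevant retinal features, which is the mechanism by which the auxiliary task can improve the main METS classifier.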

Segmentation of renal vessels on non-enhanced CT images using deep learning models.

Zhong H, Zhao Y, Zhang Y

pubmed logopapersMay 13 2025
To evaluate the feasibility of renal vessel reconstruction on non-enhanced CT images using deep learning models. CT scans from 177 patients, covering the non-enhanced, arterial, and venous phases, were selected. These data were randomly divided into a training set (n = 120), a validation set (n = 20), and a test set (n = 37). In the training and validation sets, a radiologist marked out the right renal arteries and veins on non-enhanced phase CT images using the contrast phases as references. The trained deep learning models were then tested and evaluated on the test set. A radiologist also performed renal vessel reconstruction on the test set without the contrast phase reference, and the results were used for comparison. Reconstruction using the arterial and venous phases served as the gold standard. Without the contrast phase reference, both the radiologist and the model could accurately identify the main trunks of the artery and vein. Accuracy was 91.9% vs. 97.3% (model vs. radiologist) for the artery and 91.9% vs. 100% for the vein; the differences were not significant. The model had difficulty identifying accessory arteries, with accuracy significantly lower than the radiologist's (44.4% vs. 77.8%, p = 0.044). The model also had lower accuracy for accessory veins, but the difference was not significant (64.3% vs. 85.7%, p = 0.094). Deep learning models could accurately recognize the main trunks of the right renal artery and vein, with accuracy comparable to that of radiologists. Although the current model still has difficulty recognizing small accessory vessels, further training and model optimization may address these problems.

Evaluation of an artificial intelligence noise reduction tool for conventional X-ray imaging - a visual grading study of pediatric chest examinations at different radiation dose levels using anthropomorphic phantoms.

Hultenmo M, Pernbro J, Ahlin J, Bonnier M, Båth M

pubmed logopapersMay 13 2025
Noise reduction tools developed with artificial intelligence (AI) may be implemented to improve image quality and reduce radiation dose, which is of special interest in the more radiosensitive pediatric population. The aim of the present study was to examine the effect of the AI-based intelligent noise reduction (INR) on image quality at different dose levels in pediatric chest radiography. Anteroposterior and lateral images of two anthropomorphic phantoms were acquired with both standard noise reduction and INR at different dose levels. In total, 300 anteroposterior and 420 lateral images were included. Image quality was evaluated by three experienced pediatric radiologists. Gradings were analyzed with visual grading characteristics (VGC) resulting in area under the VGC curve (AUC_VGC) values and associated confidence intervals (CI). Image quality of different anatomical structures and overall clinical image quality were statistically significantly better in the anteroposterior INR images than in the corresponding standard noise reduced images at each dose level. Compared with reference anteroposterior images at a dose level of 100% with standard noise reduction, the image quality of the anteroposterior INR images was graded as significantly better at dose levels of ≥ 80%. Statistical significance was also achieved at lower dose levels for some structures. The assessments of the lateral images showed similar trends but with fewer significant results. The results of the present study indicate that the AI-based INR may potentially be used to improve image quality at a specific dose level or to reduce dose and maintain the image quality in pediatric chest radiography.
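A sketch of an AUC_VGC-style statistic: visual grading characteristics analysis reduces, for paired ordinal gradings, to a Mann-Whitney-type AUC between the grades given to the two conditions (here standard noise reduction vs. INR). The grades below are synthetic illustrations, not study data.

```python
import numpy as np

def vgc_auc(grades_a, grades_b):
    """P(grade_b > grade_a) + 0.5 * P(tie), over all cross-condition pairs."""
    a = np.asarray(grades_a)[:, None]
    b = np.asarray(grades_b)[None, :]
    return (b > a).mean() + 0.5 * (b == a).mean()

std = np.array([2, 3, 3, 2, 4, 3, 2, 3])   # standard noise reduction grades (1-5 scale)
inr = np.array([3, 4, 4, 3, 4, 4, 3, 4])   # INR grades for the same images
auc_vgc = vgc_auc(std, inr)
print(auc_vgc)
```

A value of 0.5 means the two conditions are graded equally; values above 0.5 favor the second condition, which is how "significantly better" INR images show up in this framework.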

Development of a deep learning method for phase retrieval image enhancement in phase contrast microcomputed tomography.

Ding XF, Duan X, Li N, Khoz Z, Wu FX, Chen X, Zhu N

pubmed logopapersMay 13 2025
Propagation-based imaging (one method of X-ray phase contrast imaging) with microcomputed tomography (PBI-µCT) offers the potential to visualise low-density materials, such as soft tissues and hydrogel constructs, which are difficult to identify with conventional absorption-based contrast µCT. Conventional µCT reconstruction produces edge-enhanced contrast (EEC) images, which preserve sharp boundaries but are susceptible to noise and do not provide consistent grey-value representation for the same material. Meanwhile, phase retrieval (PR) algorithms can convert edge-enhanced contrast to area contrast to improve the signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR), but usually result in over-smoothing, creating inaccuracies in quantitative analysis. To alleviate these problems, this study developed a deep learning-based method called edge view enhanced phase retrieval (EVEPR), which strategically integrates the complementary spatial features of denoised EEC and PR images, and further applied this method to segment hydrogel constructs in vivo and ex vivo. EVEPR used paired denoised EEC and PR images to train a deep convolutional neural network (CNN) on a dataset-to-dataset basis. The CNN was trained on important high-frequency details, for example edges and boundaries from the EEC images, and on area contrast from the PR images. The CNN-predicted result showed enhanced area contrast beyond conventional PR algorithms while improving SNR and CNR. The enhanced CNR in particular allowed the images to be segmented with greater efficiency. EVEPR was applied to in vitro and ex vivo PBI-µCT images of low-density hydrogel constructs. The enhanced visibility and consistency of the hydrogel constructs was essential for segmenting such materials, which usually exhibit extremely poor contrast. The EVEPR images allowed for more accurate segmentation with reduced manual adjustments. The efficiency in segmentation allowed for the generation of a sizeable database of segmented hydrogel scaffolds, which was used in conventional data-driven segmentation applications. EVEPR was demonstrated to be a robust post-image-processing method capable of significantly enhancing image quality by training a CNN on paired denoised EEC and PR images. This method not only addresses the common issues of over-smoothing and noise susceptibility in conventional PBI-µCT image processing but also enables efficient and accurate in vitro and ex vivo image processing of low-density materials.
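An illustrative numpy sketch of the complementarity EVEPR exploits: the EEC image carries high-frequency edge detail while the PR image carries smooth area contrast. A naive fixed linear fusion is shown purely to make the idea concrete; the actual method learns this mapping with a CNN rather than hand-tuned filters.

```python
import numpy as np

def box_blur(img, k=3):
    """Simple k x k mean filter with edge padding (stand-in for denoising/PR smoothing)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

rng = np.random.default_rng(5)
pr = box_blur(rng.random((64, 64)), k=7)           # smooth area contrast (PR-like)
eec = pr + rng.normal(scale=0.05, size=pr.shape)   # noisy, edge-carrying image (EEC-like)

edges = eec - box_blur(eec)                        # high-pass: edge detail from EEC
fused = pr + 0.5 * edges                           # area contrast plus restored edges
print(fused.shape)
```

The learned CNN replaces both the fixed filters and the 0.5 blending weight, adapting the edge/area trade-off locally, which is why it avoids the global over-smoothing of plain PR.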
