Page 1 of 214 results

Interpretable Deep Learning Approaches for Reliable GI Image Classification: A Study with the HyperKvasir Dataset

Wahid, S. B., Rothy, Z. T., News, R. K., Rieyan, S. A.

medRxiv preprint, Jul 23 2025
Deep learning has emerged as a promising tool for automating gastrointestinal (GI) disease diagnosis. However, multi-class GI disease classification remains underexplored. This study addresses this gap by presenting a framework that uses advanced models like InceptionNetV3 and ResNet50, combined with boosting algorithms (XGB, LGBM), to classify lower GI abnormalities. InceptionNetV3 with XGB achieved the best recall of 0.81 and an F1 score of 0.90. To assist clinicians in understanding model decisions, the Grad-CAM technique, a form of explainable AI, was employed to highlight the critical regions influencing predictions, fostering trust in these systems. This approach significantly improves both the accuracy and reliability of GI disease diagnosis.

Large Language Model-Based Entity Extraction Reliably Classifies Pancreatic Cysts and Reveals Predictors of Malignancy: A Cross-Sectional and Retrospective Cohort Study

Papale, A. J., Flattau, R., Vithlani, N., Mahajan, D., Ziemba, Y., Zavadsky, T., Carvino, A., King, D., Nadella, S.

medRxiv preprint, Jul 17 2025
Pancreatic cystic lesions (PCLs) are often discovered incidentally on imaging and may progress to pancreatic ductal adenocarcinoma (PDAC). PCLs have a high incidence in the general population, and adherence to screening guidelines can be variable. With the advent of technologies that enable automated text classification, we sought to evaluate various natural language processing (NLP) tools including large language models (LLMs) for identifying and classifying PCLs from radiology reports. We correlated our classification of PCLs to clinical features to identify risk factors for a positive PDAC biopsy. We contrasted a previously described NLP classifier to LLMs for prospective identification of PCLs in radiology. We evaluated various LLMs for PCL classification into low-risk or high-risk categories based on published guidelines. We compared prompt-based PCL classification to specific entity-guided PCL classification. To this end, we developed tools to deidentify radiology and track patients longitudinally based on their radiology reports. Additionally, we used our newly developed tools to evaluate a retrospective database of patients who underwent pancreas biopsy to determine associated factors including those in their radiology reports and clinical features using multivariable logistic regression modelling. Of 14,574 prospective radiology reports, 665 (4.6%) described a pancreatic cyst, including 175 (1.2%) high-risk lesions. Our Entity-Extraction Large Language Model tool achieved recall 0.992 (95% confidence interval [CI], 0.985-0.998), precision 0.988 (0.979-0.996), and F1-score 0.990 (0.985-0.995) for detecting cysts; F1-scores were 0.993 (0.987-0.998) for low-risk and 0.977 (0.952-0.995) for high-risk classification. Among 4,285 biopsy patients, 330 had pancreatic cysts documented ≥6 months before biopsy.
In the final multivariable model (AUC = 0.877), independent predictors of adenocarcinoma were change in duct caliber with upstream atrophy (adjusted odds ratio [AOR], 4.94; 95% CI, 1.30-18.79), mural nodules (AOR, 11.02; 1.81-67.26), older age (AOR, 1.10; 1.05-1.16), lower body mass index (AOR, 0.86; 0.76-0.96), and total bilirubin (AOR, 1.81; 1.18-2.77). Automated NLP-based analysis of radiology reports using LLM-driven entity extraction can accurately identify and risk-stratify PCLs and, when retrospectively applied, reveal factors predicting malignant progression. Widespread implementation may improve surveillance and enable earlier intervention.
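
As a quick consistency check on the cyst-detection metrics above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported point estimates; the function name `f1_score` is illustrative and is not taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Point estimates reported for cyst detection in the abstract.
print(round(f1_score(precision=0.988, recall=0.992), 3))
```

The result is approximately 0.990, in line with the F1-score reported for cyst detection.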

A Multi-Modal Deep Learning Framework for Predicting PSA Progression-Free Survival in Metastatic Prostate Cancer Using PSMA PET/CT Imaging

Ghaderi, H., Shen, C., Issa, W., Pomper, M. G., Oz, O. K., Zhang, T., Wang, J., Yang, D. X.

medRxiv preprint, Jul 14 2025
PSMA PET/CT imaging has been increasingly utilized in the management of patients with metastatic prostate cancer (mPCa). Imaging biomarkers derived from PSMA PET may provide improved prognostication and prediction of treatment response for mPCa patients. This study investigates a novel deep learning-derived imaging biomarker framework for outcome prediction using multi-modal PSMA PET/CT and clinical features. A single-institution cohort of 99 mPCa patients with 396 lesions was evaluated. Imaging features were extracted from cropped lesion areas and combined with clinical variables including body mass index, ECOG performance status, prostate-specific antigen (PSA) level, Gleason score, and treatments received. The PSA progression-free survival (PFS) model was trained using a ResNet architecture with a Cox proportional hazards loss function using five-fold cross-validation. Performance was assessed using concordance index (C-index) and Kaplan-Meier survival analysis. Among evaluated model architectures, the ResNet-18 backbone offered the best performance. The multi-modal deep learning framework achieved a 5-fold cross-validation C-index ranging from 0.75 to 0.94, outperforming models incorporating imaging only (0.70-0.89) and clinical features only (0.53-0.65). Kaplan-Meier survival analysis performed on the deep learning-derived predictions demonstrated clear risk stratification, with a median PSA PFS of 19.7 months in the high-risk group and 26 months in the low-risk group (P < 0.001). A deep learning-derived imaging biomarker based on PSMA PET/CT can effectively predict PSA PFS for mPCa patients. Further clinical validation in prospective cohorts is warranted.
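
The concordance index used above measures how often a model ranks the patient who progresses earlier as higher risk. The following is a minimal pure-Python sketch of Harrell's C-index, assuming right-censored data and skipping pairs with tied event times; it is an illustration of the metric, not the authors' evaluation code, and the function and argument names are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if progression was observed, 0 if censored
    risks  : predicted risk score (higher means earlier progression expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's observed or censored time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: risk scores perfectly ordered against progression times.
print(concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported ranges.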

Explainable AI for Precision Oncology: A Task-Specific Approach Using Imaging, Multi-omics, and Clinical Data

Park, Y., Park, S., Bae, E.

medRxiv preprint, Jul 14 2025
Despite continued advances in oncology, cancer remains a leading cause of global mortality, highlighting the need for diagnostic and prognostic tools that are both accurate and interpretable. Unimodal approaches often fail to capture the biological and clinical complexity of tumors. In this study, we present a suite of task-specific AI models that leverage CT imaging, multi-omics profiles, and structured clinical data to address distinct challenges in segmentation, classification, and prognosis. We developed three independent models across large public datasets. Task 1 applied a 3D U-Net to segment pancreatic tumors from CT scans, achieving a Dice Similarity Coefficient (DSC) of 0.7062. Task 2 employed a hierarchical ensemble of omics-based classifiers to distinguish tumor from normal tissue and classify six major cancer types with 98.67% accuracy. Task 3 benchmarked classical machine learning models on clinical data for prognosis prediction across three cancers (LIHC, KIRC, STAD), achieving strong performance (e.g., C-index of 0.820 in KIRC, AUC of 0.978 in LIHC). Across all tasks, explainable AI methods such as SHAP and attention-based visualization enabled transparent interpretation of model outputs. These results demonstrate the value of tailored, modality-aware models and underscore the clinical potential of applying such tailored AI systems for precision oncology.

Technical Foundations:
- Segmentation (Task 1): A custom 3D U-Net was trained using the Task07_Pancreas dataset from the Medical Segmentation Decathlon (MSD). CT images were preprocessed with MONAI-based pipelines, resampled to (64, 96, 96) voxels, and intensity-windowed to HU ranges of -100 to 240.
- Classification (Task 2): Multi-omics data from TCGA--including gene expression, methylation, miRNA, CNV, and mutation profiles--were log-transformed and normalized. Five modality-specific LightGBM classifiers generated meta-features for a late-fusion ensemble. Stratified 5-fold cross-validation was used for evaluation.
- Prognosis (Task 3): Clinical variables from TCGA were curated and imputed (median/mode), with high-missing-rate columns removed. Survival models (e.g., Cox-PH, Random Forest, XGBoost) were trained with early stopping. No omics or imaging data were used in this task.
- Interpretability: SHAP values were computed for all tree-based models, and attention-based overlays were used in imaging tasks to visualize salient regions.
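
The intensity windowing mentioned for Task 1 can be illustrated with a small pure-Python sketch. It clips Hounsfield units to the stated range of -100 to 240; the rescaling to [0, 1] and the function name `window_hu` are assumptions made for illustration, since the abstract does not say how the windowed values were normalized, and the study itself used MONAI-based pipelines rather than this code.

```python
def window_hu(values, lo=-100.0, hi=240.0):
    """Clip HU values to [lo, hi] and rescale linearly to [0, 1].

    The [0, 1] output range is an assumption for this sketch.
    """
    span = hi - lo
    return [(min(max(v, lo), hi) - lo) / span for v in values]


# Air, window floor, mid-window soft tissue, window ceiling, dense bone.
print(window_hu([-1000, -100, 70, 240, 1000]))
```

Values outside the window saturate at 0 or 1, so contrast is concentrated on the soft-tissue range where the pancreas lies.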

Development and International Validation of a Deep Learning Model for Predicting Acute Pancreatitis Severity from CT Scans

Xu, Y., Teutsch, B., Zeng, W., Hu, Y., Rastogi, S., Hu, E. Y., DeGregorio, I. M., Fung, C. W., Richter, B. I., Cummings, R., Goldberg, J. E., Mathieu, E., Appiah Asare, B., Hegedus, P., Gurza, K.-B., Szabo, I. V., Tarjan, H., Szentesi, A., Borbely, R., Molnar, D., Faluhelyi, N., Vincze, A., Marta, K., Hegyi, P., Lei, Q., Gonda, T., Huang, C., Shen, Y.

medRxiv preprint, Jul 7 2025
Background and aims: Acute pancreatitis (AP) is a common gastrointestinal disease with rising global incidence. While most cases are mild, severe AP (SAP) carries high mortality. Early and accurate severity prediction is crucial for optimal management. However, existing severity prediction models, such as BISAP and mCTSI, have modest accuracy and often rely on data unavailable at admission. This study proposes a deep learning (DL) model to predict AP severity using abdominal contrast-enhanced CT (CECT) scans acquired within 24 hours of admission. Methods: We collected 10,130 studies from 8,335 patients across a multi-site U.S. health system. The model was trained in two stages: (1) self-supervised pretraining on large-scale unlabeled CT studies and (2) fine-tuning on 550 labeled studies. Performance was evaluated against mCTSI and BISAP on a hold-out internal test set (n=100 patients) and externally validated on a Hungarian AP registry (n=518 patients). Results: On the internal test set, the model achieved AUROCs of 0.888 (95% CI: 0.800-0.960) for SAP and 0.888 (95% CI: 0.819-0.946) for mild AP (MAP), outperforming mCTSI (p = 0.002). External validation showed robust AUROCs of 0.887 (95% CI: 0.825-0.941) for SAP and 0.858 (95% CI: 0.826-0.888) for MAP, surpassing mCTSI (p = 0.024) and BISAP (p = 0.002). Retrospective simulation suggested the model's potential to support admission triage and serve as a second reader during CECT interpretation. Conclusions: The proposed DL model outperformed standard scoring systems for AP severity prediction, generalized well to external data, and shows promise for providing early clinical decision support and improving resource allocation.

Urethra contours on MRI: multidisciplinary consensus educational atlas and reference standard for artificial intelligence benchmarking

Song, Y., Nguyen, L., Dornisch, A., Baxter, M. T., Barrett, T., Dale, A., Dess, R. T., Harisinghani, M., Kamran, S. C., Liss, M. A., Margolis, D. J., Weinberg, E. P., Woolen, S. A., Seibert, T. M.

medRxiv preprint, Jul 2 2025
Introduction: The urethra is a recommended avoidance structure for prostate cancer treatment. However, even subspecialist physicians often struggle to accurately identify the urethra on available imaging. Automated segmentation tools show promise, but a lack of reliable ground truth or appropriate evaluation standards has hindered validation and clinical adoption. This study aims to establish a reference-standard dataset with expert consensus contours, define clinically meaningful evaluation metrics, and assess the performance and generalizability of a deep-learning-based segmentation model. Materials and Methods: A multidisciplinary panel of four experienced subspecialists in prostate MRI generated consensus contours of the male urethra for 71 patients across six imaging centers. Four of those cases were previously used in an international study (PURE-MRI), wherein 62 physicians attempted to contour the prostate and urethra on the patient images. Separately, we developed a deep-learning AI model for urethra segmentation using another 151 cases from one center and evaluated it against the consensus reference standard and compared to human performance using Dice Score, percent urethra Coverage, and Maximum 2D (axial, in-plane) Hausdorff Distance (HD) from the reference standard. Results: In the PURE-MRI dataset, the AI model outperformed most physicians, achieving a median Dice of 0.41 (vs. 0.33 for physicians), Coverage of 81% (vs. 36%), and Max 2D HD of 1.8 mm (vs. 1.6 mm). In the larger dataset, performance remained consistent, with a Dice of 0.40, Coverage of 89%, and Max 2D HD of 2.0 mm, indicating strong generalizability across a broader patient population and more varied imaging conditions. Conclusion: We established a multidisciplinary consensus benchmark for segmentation of the urethra. The deep-learning model performs comparably to specialist physicians and demonstrates consistent results across multiple institutions.
It shows promise as a clinical decision-support tool for accurate and reliable urethra segmentation in prostate cancer radiotherapy planning and studies of dose-toxicity associations.

AI-based Hepatic Steatosis Detection and Integrated Hepatic Assessment from Cardiac CT Attenuation Scans Enhances All-cause Mortality Risk Stratification: A Multi-center Study

Yi, J., Patel, K., Miller, R. J., Marcinkiewicz, A. M., Shanbhag, A., Hijazi, W., Dharmavaram, N., Lemley, M., Zhou, J., Zhang, W., Liang, J. X., Ramirez, G., Builoff, V., Slipczuk, L., Travin, M., Alexanderson, E., Carvajal-Juarez, I., Packard, R. R., Al-Mallah, M., Ruddy, T. D., Einstein, A. J., Feher, A., Miller, E. J., Acampa, W., Knight, S., Le, V., Mason, S., Calsavara, V. F., Chareonthaitawee, P., Wopperer, S., Kwan, A. C., Wang, L., Berman, D. S., Dey, D., Di Carli, M. F., Slomka, P.

medRxiv preprint, Jun 11 2025
Background: Hepatic steatosis (HS) is a common cardiometabolic risk factor frequently present but underdiagnosed in patients with suspected or known coronary artery disease. We used artificial intelligence (AI) to automatically quantify hepatic tissue measures for identifying HS from CT attenuation correction (CTAC) scans during myocardial perfusion imaging (MPI) and evaluate their added prognostic value for all-cause mortality prediction. Methods: This study included 27039 consecutive patients [57% male] with MPI scans from nine sites. We used an AI model to segment liver and spleen on low dose CTAC scans and quantify the liver measures, and the difference of liver minus spleen (LmS) measures. HS was defined as mean liver attenuation < 40 Hounsfield units (HU) or LmS attenuation < -10 HU. Additionally, we used seven sites to develop an AI liver risk index (LIRI) for comprehensive hepatic assessment by integrating the hepatic measures and two external sites to validate its improved prognostic value and generalizability for all-cause mortality prediction over HS. Findings: Median (interquartile range [IQR]) age was 67 [58, 75] years and body mass index (BMI) was 29.5 [25.5, 34.7] kg/m², with diabetes in 8950 (33%) patients. The algorithm identified HS in 6579 (24%) patients. During median [IQR] follow-up of 3.58 [1.86, 5.15] years, 4836 (18%) patients died. HS was associated with increased mortality risk overall (adjusted hazard ratio (HR): 1.14 [1.05, 1.24], p=0.0016) and in subpopulations. LIRI provided higher prognostic value than HS after adjustments overall (adjusted HR 1.5 [1.32, 1.69], p<0.0001 vs HR 1.16 [1.02, 1.31], p=0.0204) and in subpopulations. Interpretations: AI-based hepatic measures automatically identify HS from CTAC scans in patients undergoing MPI without additional radiation dose or physician interaction. Integrated liver assessment combining multiple hepatic imaging measures improved risk stratification for all-cause mortality.
Funding: National Heart, Lung, and Blood Institute/National Institutes of Health. Research in context. Evidence before this study: Existing studies show that fully automated hepatic quantification analysis from chest computed tomography (CT) scans is feasible. While hepatic measures show significant potential for improving risk stratification and patient management, CT attenuation correction (CTAC) scans from patients undergoing myocardial perfusion imaging (MPI) have rarely been utilized for concurrent and automated volumetric hepatic analysis beyond its current utilization for attenuation correction and coronary artery calcium burden assessment. We conducted a literature review on PubMed and Google Scholar on April 1st, 2025, using the following keywords: ("liver" OR "hepatic") AND ("quantification" OR "measure") AND ("risk stratification" OR "survival analysis" OR "prognosis" OR "prognostic prediction") AND ("CT" OR "computed tomography"). Previous studies have established approaches for the identification of hepatic steatosis (HS) and its prognostic value in various small-scale cohorts using either invasive biopsy or non-invasive imaging approaches. However, for CT-based non-invasive imaging, existing research predominantly focuses on manual region-of-interest (ROI)-based hepatic quantification from selected CT slices or on identifying hepatic steatosis without comprehensive prognostic assessment in large-scale and multi-site cohorts, which hinders the association evaluation of hepatic steatosis for risk stratification in clinical routine with less precise estimates, weak statistical reliability, and limited subgroup analysis to assess bias effects. No existing studies investigated the prognostic value of hepatic steatosis measured in consecutive patients undergoing MPI. These patients usually present with multiple cardiovascular risk factors such as hypertension, dyslipidemia, diabetes and family history of coronary disease.
Whether hepatic measures could provide added prognostic value over existing cardiometabolic factors is unknown. Furthermore, despite the diverse hepatic measures on CT in existing literature, integrated AI-based assessment has not been investigated before though it may improve the risk stratification further over HS. Lastly, previous research relied on dedicated CT scans performed for screening purposes. CTAC scans obtained routinely with MPI had never been utilized for automated HS detection and prognostic evaluation, despite being readily available at no additional cost or radiation exposure. Added value of this study: In this multi-center (nine sites) international (three countries) study of 27039 consecutive patients undergoing myocardial perfusion imaging (MPI) with PET or SPECT, we used an innovative artificial intelligence (AI)-based approach for automatically segmenting the entire liver and spleen volumes from low-dose ungated CT attenuation correction (CTAC) scans acquired during MPI, followed by the identification of hepatic steatosis. We evaluated the added prognostic value of several key hepatic metrics--liver measures (mean attenuation, coefficient of variation (CoV), entropy, and standard deviation), and similar measures for the difference of liver minus spleen (LmS)--derived from volumetric quantification of CTAC scans with adjustment for existing clinical and MPI variables. An HS imaging criterion (HSIC: a patient has moderate or severe hepatic steatosis if the mean liver attenuation is < 40 Hounsfield units (HU) or the difference of liver mean attenuation and spleen mean attenuation is < -10 HU) was used to detect HS. These hepatic metrics were assessed for their ability to predict all-cause mortality in a large-scale and multi-center patient cohort.
Additionally, we developed and validated an eXtreme Gradient Boosting decision tree model for integrated liver assessment and risk stratification by combining the hepatic metrics with the demographic variables to derive a liver risk index (LIRI). Our results demonstrated strong associations between the hepatic metrics and all-cause mortality, even after adjustment for clinical variables, myocardial perfusion, and atherosclerosis biomarkers. Our results revealed significant differences in the association of HS with mortality in different sex, age, and race subpopulations. Similar differences were also observed in various chronic disease subpopulations such as obese and diabetic subpopulations. These results highlighted the modifying effects of various patient characteristics, partially accounting for the inconsistent association observed in existing studies. Compared with individual hepatic measures, LIRI significantly improved mortality prediction over HSIC-based HS in external testing. Together, these findings demonstrate the feasibility of HS detection and integrated liver assessment from cardiac low-dose CT scans acquired during MPI. The approach is also expected to apply to generic chest CT scans that cover the liver and spleen, whereas prior studies used dedicated abdominal CT scans for such purposes. Implications of all the available evidence: Routine point-of-care analysis of hepatic quantification can be seamlessly integrated into all MPI using CTAC scans to noninvasively identify HS at no additional cost or radiation exposure. The automatically derived hepatic metrics enhance risk stratification by providing additional prognostic value beyond existing clinical and imaging factors, and the LIRI enables comprehensive assessment of liver and further improves risk stratification and patient management.
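
The HS imaging criterion stated above reduces to a two-threshold rule on mean attenuation values. The sketch below encodes that rule as written in the abstract (mean liver attenuation < 40 HU, or liver minus spleen < -10 HU); the function and parameter names are hypothetical, and the inputs are assumed to be mean attenuations already measured from the segmented volumes.

```python
def has_hepatic_steatosis(liver_mean_hu: float, spleen_mean_hu: float,
                          liver_threshold: float = 40.0,
                          lms_threshold: float = -10.0) -> bool:
    """Apply the HS imaging criterion from the abstract.

    HS is flagged when mean liver attenuation is below 40 HU, or when
    liver-minus-spleen (LmS) mean attenuation is below -10 HU.
    """
    lms = liver_mean_hu - spleen_mean_hu
    return liver_mean_hu < liver_threshold or lms < lms_threshold


# Liver at 45 HU passes the first threshold, but LmS = -15 HU flags HS.
print(has_hepatic_steatosis(liver_mean_hu=45.0, spleen_mean_hu=60.0))
```

Both comparisons are strict, so a patient exactly at 40 HU with an LmS of exactly -10 HU is not flagged under this reading of the criterion.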

Slide-free surface histology enables rapid colonic polyp interpretation across specialties and foundation AI

Yong, A., Husna, N., Tan, K. H., Manek, G., Sim, R., Loi, R., Lee, O., Tang, S., Soon, G., Chan, D., Liang, K.

medRxiv preprint, Jun 11 2025
Colonoscopy is a mainstay of colorectal cancer screening and has helped to lower cancer incidence and mortality. The resection of polyps during colonoscopy is critical for tissue diagnosis and prevention of colorectal cancer, albeit resulting in increased resource requirements and expense. Discarding resected benign polyps without sending for histopathological processing and confirmatory diagnosis, known as the resect and discard strategy, could enhance efficiency but is not commonly practiced due to endoscopists' predominant preference for pathological confirmation. The inaccessibility of histopathology from unprocessed resected tissue hampers endoscopic decisions. We show that intraprocedural fibre-optic microscopy with ultraviolet-C surface excitation (FUSE) of polyps post-resection enables rapid diagnosis, potentially complementing endoscopic interpretation and incorporating pathologist oversight. In a clinical study of 28 patients, slide-free FUSE microscopy of freshly resected polyps yielded mucosal views that greatly magnified the surface patterns observed on endoscopy and revealed previously unavailable histopathological signatures. We term this new cross-specialty readout surface histology. In blinded interpretations of 42 polyps (19 training, 23 reading) by endoscopists and pathologists of varying experience, surface histology differentiated normal/benign, low-grade dysplasia, and high-grade dysplasia and cancer, with 100% performance in classifying high/low risk. This FUSE dataset was also successfully interpreted by foundation AI models pretrained on histopathology slides, illustrating a new potential for these models to not only expedite conventional pathology tasks but also autonomously provide instant expert feedback during procedures that typically lack pathologists. Surface histology readouts during colonoscopy promise to empower endoscopist decisions and broadly enhance confidence and participation in resect and discard.
One Sentence Summary: Rapid microscopy of resected polyps during colonoscopy yielded accurate diagnoses, promising to enhance colorectal screening.

Rad-Path Correlation of Deep Learning Models for Prostate Cancer Detection on MRI

Verde, A. S. C., de Almeida, J. G., Mendes, F., Pereira, M., Lopes, R., Brito, M. J., Urbano, M., Correia, P. S., Gaivao, A. M., Firpo-Betancourt, A., Fonseca, J., Matos, C., Regge, D., Marias, K., Tsiknakis, M., ProCAncer-I Consortium, Conceicao, R. C., Papanikolaou, N.

medRxiv preprint, Jun 4 2025
While Deep Learning (DL) models trained on Magnetic Resonance Imaging (MRI) have shown promise for prostate cancer detection, their lack of direct biological validation often undermines radiologists' trust and hinders clinical adoption. Radiologic-histopathologic (rad-path) correlation has the potential to validate MRI-based lesion detection using digital histopathology. This study uses automated and manually annotated digital histopathology slides as a standard of reference to evaluate the spatial extent of lesion annotations derived from both radiologist interpretations and DL models previously trained on prostate bi-parametric MRI (bp-MRI). 117 histopathology slides were used as reference. Prospective patients with clinically significant prostate cancer underwent a bp-MRI examination before undergoing a robotic radical prostatectomy, and each prostate specimen was sliced using a 3D-printed patient-specific mold to ensure a direct comparison between pre-operative imaging and histopathology slides. The histopathology slides and their corresponding T2-weighted MRI images were co-registered. We trained DL models for cancer detection on large retrospective datasets of T2-w MRI only, bp-MRI and histopathology images and ran inference on a prospective patient cohort. We evaluated the spatial overlap between detected lesions, and between detected lesions and the histopathological and radiological ground truth, using the Dice similarity coefficient (DSC). The DL models trained on digital histopathology tiles and MRI images demonstrated promising capabilities in lesion detection. A low overlap was observed between the lesion detection masks generated by the histopathology and bp-MRI models, with a DSC = 0.10. However, the overlap was equivalent (DSC = 0.08) between radiologist annotations and histopathology ground truth. A rad-path correlation pipeline was established in a prospective patient cohort with prostate cancer undergoing surgery.
The correlation between rad-path DL models was low but comparable to the overlap between annotations. While DL models show promise in prostate cancer detection, challenges remain in integrating MRI-based predictions with histopathological findings.
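
For readers less familiar with the Dice similarity coefficient used above, it is twice the overlap of two masks divided by the sum of their sizes. The sketch below computes it on flattened binary masks in pure Python; the function name and the convention of returning 1.0 for two empty masks are assumptions for illustration and do not reflect the study's pipeline.

```python
def dice_coefficient(mask_a, mask_b) -> float:
    """Dice similarity coefficient for two flattened binary masks.

    DSC = 2 * |A intersect B| / (|A| + |B|)
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Two 4-pixel masks that each mark two pixels and share one of them.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```

Because the denominator is the combined mask size, small or thin lesions can produce low DSC values even when the masks are only slightly misaligned, which is worth keeping in mind when reading values such as 0.10 or 0.08.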

Interpretable Machine Learning based Detection of Coeliac Disease

Jaeckle, F., Bryant, R., Denholm, J., Romero Diaz, J., Schreiber, B., Shenoy, V., Ekundayomi, D., Evans, S., Arends, M., Soilleux, E.

medRxiv preprint, Jun 4 2025
Background: Coeliac disease, an autoimmune disorder affecting approximately 1% of the global population, is typically diagnosed on a duodenal biopsy. However, inter-pathologist agreement on coeliac disease diagnosis is only around 80%. Existing machine learning solutions designed to improve coeliac disease diagnosis often lack interpretability, which is essential for building trust and enabling widespread clinical adoption. Objective: To develop an interpretable AI model capable of segmenting key histological structures in duodenal biopsies, generating explainable segmentation masks, estimating intraepithelial lymphocyte (IEL)-to-enterocyte and villus-to-crypt ratios, and diagnosing coeliac disease. Design: Semantic segmentation models were trained to identify villi, crypts, IELs, and enterocytes using 49 annotated 2048x2048 patches at 40x magnification. IEL-to-enterocyte and villus-to-crypt ratios were calculated from segmentation masks, and a logistic regression model was trained on 172 images to diagnose coeliac disease based on these ratios. Evaluation was performed on an independent test set of 613 duodenal biopsy scans from a separate NHS Trust. Results: The villus-crypt segmentation model achieved a mean PR AUC of 80.5%, while the IEL-enterocyte model reached a PR AUC of 82%. The diagnostic model classified whole-slide images (WSIs) with 96% accuracy, 86% positive predictive value, and 98% negative predictive value on the independent test set. Conclusions: Our interpretable AI models accurately segmented key histological structures and diagnosed coeliac disease in unseen WSIs, demonstrating strong generalization performance. These models provide pathologists with reliable IEL-to-enterocyte and villus-to-crypt ratio estimates, enhancing diagnostic accuracy. Interpretable AI solutions like ours are essential for fostering trust among healthcare professionals and patients, complementing existing black-box methodologies.
What is already known on this topic: Pathologist concordance in diagnosing coeliac disease from duodenal biopsies is consistently reported to be below 80%, highlighting diagnostic variability and the need for improved methods. Several recent studies have leveraged artificial intelligence (AI) to enhance coeliac disease diagnosis. However, most of these models operate as "black boxes," offering limited interpretability and transparency. The lack of explainability in AI-driven diagnostic tools prevents widespread adoption by healthcare professionals and reduces patient trust. What this study adds: This study presents an interpretable semantic segmentation algorithm capable of detecting the four key histological structures essential for diagnosing coeliac disease: crypts, villi, intraepithelial lymphocytes (IELs), and enterocytes. The model accurately estimates the IEL-to-enterocyte ratio and the villus-to-crypt ratio, the latter being an indicator of villous atrophy and crypt hyperplasia, thereby providing objective, reproducible metrics for diagnosis. The segmentation outputs allow for transparent, explainable decision-making, supporting pathologists in coeliac disease diagnosis with improved accuracy and confidence. This study presents an AI model that automates the estimation of the IEL-to-enterocyte ratio--a labour-intensive task currently performed manually by pathologists in limited biopsy regions. By minimising diagnostic variability and alleviating time constraints for pathologists, the model provides an efficient and practical solution to streamline the diagnostic workflow. Tested on an independent dataset from a previously unseen source, the model demonstrates explainability and generalizability, enhancing trust and encouraging adoption in routine clinical practice.
Furthermore, this approach could set a new standard for AI-assisted duodenal biopsy evaluation, paving the way for the development of interpretable AI tools in pathology to address the critical challenges of limited pathologist availability and diagnostic inconsistencies.