From Explainable to Explained AI: Ideas for Falsifying and Quantifying Explanations

Yoni Schirris, Eric Marcus, Jonas Teuwen, Hugo Horlings, Efstratios Gavves

arXiv preprint · Aug 9, 2025
Explaining deep learning models is essential for the clinical integration of medical image analysis systems. A good explanation highlights whether a model depends on spurious features that undermine generalization and harm a subset of patients or, conversely, whether it surfaces novel biological insights. Although techniques like GradCAM can identify influential features, they are measurement tools that do not themselves form an explanation. We propose a human-machine-VLM interaction system tailored to explaining classifiers in computational pathology, including multi-instance learning for whole-slide images. Our proof of concept comprises (1) an AI-integrated slide viewer that runs sliding-window experiments to test the claims of an explanation, and (2) quantification of an explanation's predictiveness using general-purpose vision-language models. The results demonstrate that this approach lets us qualitatively test the claims of explanations and quantitatively distinguish competing explanations. This offers a practical path from explainable AI to explained AI in digital pathology and beyond. Code and prompts are available at https://github.com/nki-ai/x2x.
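As a concrete illustration of the sliding-window experiments mentioned above, the sketch below occludes one region of an image at a time and records how the classifier's output changes; the model, window size, and fill value are hypothetical stand-ins, not the authors' implementation.

```python
import torch

def occlusion_map(model, image, window=32, stride=32, fill=0.0):
    """Slide a masking window over `image` (C, H, W) and record how much the
    positive-class probability drops when each region is hidden."""
    model.eval()
    with torch.no_grad():
        base = torch.softmax(model(image.unsqueeze(0)), dim=1)[0, 1].item()
        _, h, w = image.shape
        drops = torch.zeros((h - window) // stride + 1, (w - window) // stride + 1)
        for i, y in enumerate(range(0, h - window + 1, stride)):
            for j, x in enumerate(range(0, w - window + 1, stride)):
                occluded = image.clone()
                occluded[:, y:y + window, x:x + window] = fill
                p = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, 1].item()
                # a large drop supports the claim that this region drives the prediction
                drops[i, j] = base - p
    return drops
```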

OctreeNCA: Single-Pass 184 MP Segmentation on Consumer Hardware

Nick Lemke, John Kalkhof, Niklas Babendererde, Anirban Mukhopadhyay

arXiv preprint · Aug 9, 2025
Medical applications demand segmentation of large inputs, such as prostate MRIs, pathology slices, or videos of surgery. These inputs should ideally be inferred at once to provide the model with proper spatial or temporal context. When segmenting large inputs, the VRAM consumption of the GPU becomes the bottleneck. Architectures like UNets or Vision Transformers scale poorly in VRAM consumption, forcing patch- or frame-wise approaches that compromise global consistency and inference speed. The lightweight Neural Cellular Automaton (NCA) is a bio-inspired model that is size-invariant by construction. However, due to its local-only communication rules, it lacks global knowledge. We propose OctreeNCA, which generalizes the neighborhood definition using an octree data structure. Our generalized neighborhood definition enables the efficient traversal of global knowledge. Since deep learning frameworks are mainly developed for large multi-layer networks, their implementations do not fully leverage the advantages of NCAs. We implement an NCA inference function in CUDA that further reduces VRAM demands and increases inference speed. Our OctreeNCA segments high-resolution images and videos quickly while occupying 90% less VRAM than a UNet during evaluation. This allows us to segment 184-megapixel pathology slices or 1-minute surgical videos at once.
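To make the local-only update rule concrete, here is a minimal single-step NCA sketch in PyTorch; it assumes a plain 2-D grid and does not reproduce the octree neighborhood or the CUDA inference kernel described above.

```python
import torch
import torch.nn as nn

class NCAStep(nn.Module):
    def __init__(self, channels=16, hidden=64):
        super().__init__()
        # depthwise 3x3 convolution: each cell perceives only its immediate neighbors
        self.perceive = nn.Conv2d(channels, channels * 3, kernel_size=3,
                                  padding=1, groups=channels, bias=False)
        # per-cell update rule shared across the whole grid
        self.update = nn.Sequential(
            nn.Conv2d(channels * 3, hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel_size=1))

    def forward(self, state):
        # residual update; the rule is size-invariant, so any H x W grid works
        return state + self.update(self.perceive(state))

step = NCAStep()
state = torch.randn(1, 16, 256, 256)   # runs unchanged on much larger grids
for _ in range(8):                      # iterating the local rule spreads context
    state = step(state)
```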

DiffUS: Differentiable Ultrasound Rendering from Volumetric Imaging

Noe Bertramo, Gabriel Duguey, Vivek Gopalakrishnan

arXiv preprint · Aug 9, 2025
Intraoperative ultrasound imaging provides real-time guidance during numerous surgical procedures, but its interpretation is complicated by noise, artifacts, and poor alignment with high-resolution preoperative MRI/CT scans. To bridge the gap between preoperative planning and intraoperative guidance, we present DiffUS, a physics-based, differentiable ultrasound renderer that synthesizes realistic B-mode images from volumetric imaging. DiffUS first converts 3D MRI scans into acoustic impedance volumes using a machine learning approach. Next, we simulate ultrasound beam propagation using ray tracing with coupled reflection-transmission equations. DiffUS formulates wave propagation as a sparse linear system that captures multiple internal reflections. Finally, we reconstruct B-mode images via depth-resolved echo extraction across a fan-shaped acquisition geometry, incorporating realistic artifacts including speckle noise and depth-dependent degradation. DiffUS is implemented entirely as differentiable tensor operations in PyTorch, enabling gradient-based optimization for downstream applications such as slice-to-volume registration and volumetric reconstruction. Evaluation on the ReMIND dataset demonstrates DiffUS's ability to generate anatomically accurate ultrasound images from brain MRI data.
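For intuition, the sketch below tracks reflection and transmission along a single ray from an impedance profile using the standard interface reflection coefficient ((Z2 - Z1)/(Z2 + Z1))^2; the function name, attenuation model, and single-pass simplification are assumptions, not the DiffUS formulation, which solves a sparse linear system to capture multiple reflections.

```python
import torch

def echo_along_ray(impedance, attenuation=0.02):
    """impedance: (N,) acoustic impedance samples along one ray.
    Returns per-interface echo amplitudes, differentiable w.r.t. impedance."""
    z1, z2 = impedance[:-1], impedance[1:]
    refl = ((z2 - z1) / (z2 + z1 + 1e-8)) ** 2        # interface reflection coefficient
    trans = 1.0 - refl                                 # fraction transmitted onward
    # energy still travelling forward when the beam reaches each interface
    forward = torch.cumprod(torch.cat([torch.ones(1), trans[:-1]]), dim=0)
    depth = torch.arange(refl.numel(), dtype=impedance.dtype)
    return forward * refl * torch.exp(-attenuation * depth)  # depth-dependent decay

imp = torch.tensor([1.5, 1.5, 1.7, 1.6, 7.8, 1.6], requires_grad=True)
echo = echo_along_ray(imp)            # gradients flow back to the impedance volume
```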

Supporting intraoperative margin assessment using deep learning for automatic tumour segmentation in breast lumpectomy micro-PET-CT.

Maris L, Göker M, De Man K, Van den Broeck B, Van Hoecke S, Van de Vijver K, Vanhove C, Keereman V

PubMed paper · Aug 9, 2025
Complete tumour removal is vital in curative breast cancer (BCa) surgery to prevent recurrence. Recently, [¹⁸F]FDG micro-PET-CT of lumpectomy specimens has shown promise for intraoperative margin assessment (IMA). To aid interpretation, we trained a 2D Residual U-Net to delineate invasive carcinoma of no special type in micro-PET-CT lumpectomy images. We collected 53 BCa lamella images from 19 patients with true histopathology-defined tumour segmentations. Grouped five-fold cross-validation yielded a Dice similarity coefficient of 0.71 ± 0.20 for segmentation. Afterwards, an ensemble model was generated to segment tumours and predict margin status. Comparing predicted and true histopathological margin status in a separate set of 31 micro-PET-CT lumpectomy images from 31 patients achieved an F1 score of 84%, closely matching the mean performance of seven physicians who manually interpreted the same images. This model represents an important step towards a decision-support system that enhances micro-PET-CT-based IMA in BCa, facilitating its clinical adoption.
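For reference, the Dice similarity coefficient reported above measures the overlap between predicted and ground-truth masks; the snippet below is a generic NumPy implementation, not the authors' code.

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```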

Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities

Anindya Bijoy Das, Shahnewaz Karim Sakib, Shibbir Ahmed

arXiv preprint · Aug 9, 2025
Large Language Models (LLMs) are increasingly applied to medical imaging tasks, including image interpretation and synthetic image generation. However, these models often produce hallucinations: confident but incorrect outputs that can mislead clinical decisions. This study examines hallucinations in two directions: image-to-text, where LLMs generate reports from X-ray, CT, or MRI scans, and text-to-image, where models create medical images from clinical prompts. We analyze errors such as factual inconsistencies and anatomical inaccuracies, evaluating outputs using expert-informed criteria across imaging modalities. Our findings reveal common patterns of hallucination in both interpretive and generative tasks, with implications for clinical reliability. We also discuss factors contributing to these failures, including model architecture and training data. By systematically studying both image understanding and generation, this work provides insights into improving the safety and trustworthiness of LLM-driven medical imaging systems.

SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation

Guido Manni, Clemente Lauretti, Loredana Zollo, Paolo Soda

arXiv preprint · Aug 8, 2025
Deep learning has revolutionized medical imaging, but its effectiveness is severely limited by insufficient labeled training data. This paper introduces a novel GAN-based semi-supervised learning framework specifically designed for low labeled-data regimes, evaluated across settings with 5 to 50 labeled samples per class. Our approach integrates three specialized neural networks -- a generator for class-conditioned image translation, a discriminator for authenticity assessment and classification, and a dedicated classifier -- within a three-phase training framework. The method alternates between supervised training on limited labeled data and unsupervised learning that leverages abundant unlabeled images through image-to-image translation rather than generation from noise. We employ ensemble-based pseudo-labeling that combines confidence-weighted predictions from the discriminator and classifier with temporal consistency through exponential moving averaging, enabling reliable label estimation for unlabeled data. Comprehensive evaluation across eleven MedMNIST datasets demonstrates that our approach achieves statistically significant improvements over six state-of-the-art GAN-based semi-supervised methods, with particularly strong performance in the extreme 5-shot setting where the scarcity of labeled data is most challenging. The framework maintains its superiority across all evaluated settings (5, 10, 20, and 50 shots per class). Our approach offers a practical solution for medical imaging applications where annotation costs are prohibitive, enabling robust classification performance even with minimal labeled data. Code is available at https://github.com/GuidoManni/SPARSE.
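As a rough sketch of the ensemble pseudo-labeling described above, the function below averages confidence-weighted predictions from the discriminator and classifier, smooths them with an exponential moving average, and keeps only high-confidence labels; the weighting scheme, decay, and threshold are illustrative assumptions, not the authors' exact formulation.

```python
import torch

ema_probs = None  # running estimate maintained across training steps

def pseudo_labels(disc_logits, clf_logits, threshold=0.9, decay=0.99):
    """Combine discriminator and classifier predictions on a fixed set of
    unlabeled images, smooth them over time, and keep confident labels."""
    global ema_probs
    p_d = torch.softmax(disc_logits, dim=1)
    p_c = torch.softmax(clf_logits, dim=1)
    # weight each head by its own confidence (its maximum class probability)
    w_d = p_d.max(dim=1, keepdim=True).values
    w_c = p_c.max(dim=1, keepdim=True).values
    probs = ((w_d * p_d + w_c * p_c) / (w_d + w_c)).detach()
    # exponential moving average over steps provides temporal consistency
    ema_probs = probs if ema_probs is None else decay * ema_probs + (1 - decay) * probs
    conf, labels = ema_probs.max(dim=1)
    keep = conf > threshold            # only trust high-confidence pseudo-labels
    return labels[keep], keep
```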

Multivariate Fields of Experts

Stanislas Ducotterd, Michael Unser

arXiv preprint · Aug 8, 2025
We introduce the multivariate fields of experts, a new framework for the learning of image priors. Our model generalizes existing fields of experts methods by incorporating multivariate potential functions constructed via Moreau envelopes of the $\ell_\infty$-norm. We demonstrate the effectiveness of our proposal across a range of inverse problems that include image denoising, deblurring, compressed-sensing magnetic-resonance imaging, and computed tomography. The proposed approach outperforms comparable univariate models and achieves performance close to that of deep-learning-based regularizers while being significantly faster, requiring fewer parameters, and being trained on substantially fewer data. In addition, our model retains a relatively high level of interpretability due to its structured design.
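For readers unfamiliar with the construction, the Moreau envelope referenced above is the standard smoothing of a convex function; applied to the $\ell_\infty$-norm it yields a smooth multivariate potential. The smoothing parameter $\mu$ below is generic notation, not necessarily the paper's.

```latex
\[
  \operatorname{env}_{\mu f}(\mathbf{x})
  = \min_{\mathbf{y}\in\mathbb{R}^d}\; f(\mathbf{y})
    + \frac{1}{2\mu}\,\lVert \mathbf{x}-\mathbf{y}\rVert_2^2,
  \qquad f(\mathbf{y}) = \lVert \mathbf{y}\rVert_\infty .
\]
```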

Development and validation of a transformer-based deep learning model for predicting distant metastasis in non-small cell lung cancer using ¹⁸F-FDG PET/CT images.

Hu N, Luo Y, Tang M, Yan G, Yuan S, Li F, Lei P

PubMed paper · Aug 8, 2025
This study aimed to develop and validate a hybrid deep learning (DL) model that integrates convolutional neural network (CNN) and vision transformer (ViT) architectures to predict distant metastasis (DM) in patients with non-small cell lung cancer (NSCLC) using ¹⁸F-FDG PET/CT images. A retrospective analysis was conducted on a cohort of consecutively registered patients who were newly diagnosed with and untreated for NSCLC. A total of 167 patients with available PET/CT images were included in the analysis. DL features were extracted using a combination of CNN and ViT architectures, followed by feature selection, model construction, and evaluation of model performance using receiver operating characteristic (ROC) analysis and the area under the curve (AUC). The ViT-based DL model exhibited strong predictive capability in both the training and validation cohorts, achieving AUCs of 0.824 and 0.830 for CT features, and 0.602 and 0.694 for PET features, respectively. Notably, the model that integrated both PET and CT features achieved an AUC of 0.882 in the validation cohort, outperforming models that used either PET or CT features alone. Furthermore, this model outperformed the CNN model (ResNet-50), which achieved an AUC of 0.752 [95% CI 0.613, 0.890], p < 0.05. Decision curve analysis further supported the efficacy of the ViT-based DL model. The ViT-based DL model developed in this study demonstrates considerable potential for predicting DM in patients with NSCLC, potentially informing the creation of personalized treatment strategies. Future validation through prospective studies with larger cohorts is necessary.
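The fusion of PET- and CT-derived deep features and the AUC evaluation can be sketched as below; the synthetic arrays, feature dimensions, split, and logistic-regression head are placeholders, and nothing beyond the concatenate-then-classify idea reflects the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
ct_feats = rng.normal(size=(167, 128))    # placeholder CT deep features
pet_feats = rng.normal(size=(167, 64))    # placeholder PET deep features
y = rng.integers(0, 2, size=167)          # distant metastasis: yes / no

X = np.concatenate([ct_feats, pet_feats], axis=1)   # PET + CT feature fusion
clf = LogisticRegression(max_iter=1000).fit(X[:120], y[:120])
auc = roc_auc_score(y[120:], clf.predict_proba(X[120:])[:, 1])
```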

Non-invasive prediction of the secondary enucleation risk in uveal melanoma based on pretreatment CT and MRI prior to stereotactic radiotherapy.

Yedekci Y, Arimura H, Jin Y, Yilmaz MT, Kodama T, Ozyigit G, Yazici G

PubMed paper · Aug 8, 2025
The aim of this study was to develop a radiomic model to non-invasively predict the risk of secondary enucleation (SE) in patients with uveal melanoma (UM) prior to stereotactic radiotherapy using pretreatment computed tomography (CT) and magnetic resonance (MR) images. This retrospective study encompasses a cohort of 308 patients diagnosed with UM who underwent stereotactic radiosurgery (SRS) or fractionated stereotactic radiotherapy (FSRT) using the CyberKnife system (Accuray, Sunnyvale, CA, USA) between 2007 and 2018. Each patient received comprehensive ophthalmologic evaluations, including assessment of visual acuity, anterior segment examination, fundus examination, and ultrasonography. All patients were followed up for a minimum of 5 years. The cohort was composed of 65 patients who underwent SE (SE+) and 243 who did not (SE-). Radiomic features were extracted from pretreatment CT and MR images. To develop a robust predictive model, four different machine learning algorithms were evaluated using these features. The stacking model utilizing CT + MR radiomic features achieved the highest predictive performance, with an area under the curve (AUC) of 0.90, accuracy of 0.86, sensitivity of 0.81, and specificity of 0.90. The feature of robust mean absolute deviation derived from the Laplacian-of-Gaussian-filtered MR images was identified as the most significant predictor, demonstrating a statistically significant difference between SE+ and SE- cases (p = 0.005). Radiomic analysis of pretreatment CT and MR images can non-invasively predict the risk of SE in UM patients undergoing SRS/FSRT. The combined CT + MR radiomic model may inform more personalized therapeutic decisions, thereby reducing unnecessary radiation exposure and potentially improving patient outcomes.
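The stacking model above can be sketched with scikit-learn as follows; the choice of base learners, the five-fold meta-training, and the variable names are assumptions for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# stacked ensemble over combined CT + MR radiomic feature vectors
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("svm", SVC(probability=True))],
    final_estimator=LogisticRegression(),
    cv=5)                      # out-of-fold predictions feed the meta-learner

# hypothetical arrays of radiomic features and secondary-enucleation labels:
# stack.fit(ct_mr_radiomic_features, secondary_enucleation_labels)
# risk = stack.predict_proba(ct_mr_radiomic_features)[:, 1]
```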

Explainable Cryobiopsy AI Model, CRAI, to Predict Disease Progression for Transbronchial Lung Cryobiopsies with Interstitial Pneumonia

Uegami, W., Okoshi, E. N., Lami, K., Nei, Y., Ozasa, M., Kataoka, K., Kitamura, Y., Kohashi, Y., Cooper, L. A. D., Sakanashi, H., Saito, Y., Kondoh, Y., the study group on CRYOSOLUTION, Fukuoka, J.

medRxiv preprint · Aug 8, 2025
Background: Interstitial lung disease (ILD) encompasses diverse pulmonary disorders with varied prognoses. Current pathological diagnoses suffer from inter-observer variability, necessitating more standardized approaches. We developed CRAI, an ensemble artificial intelligence model for cryobiopsy that analyzes transbronchial lung cryobiopsy (TBLC) specimens and predicts patient outcomes. Methods: We developed an explainable AI model, CRAI, to analyze TBLC. CRAI comprises seven modules for detecting histological features, generating 19 pathologically significant findings. A downstream XGBoost classifier was developed to predict disease progression from these findings. The model's performance was evaluated using respiratory function changes and survival analysis in cross-validation and external test cohorts. Findings: In the internal cross-validation (135 cases), the model predicted 105 cases without disease progression and 30 with disease progression. The annual Δ%FVC was -1.293 in the non-progressive group versus -5.198 in the progressive group, outperforming most pathologists' diagnoses. In the external test cohort (48 cases), the model predicted 38 non-progressive and 10 progressive cases. Survival analysis demonstrated significantly shorter survival times in the progressive group (p = 0.034). Interpretation: CRAI provides a comprehensive, interpretable approach to analyzing TBLC specimens, offering potential for standardizing ILD diagnosis and predicting disease progression. The model could facilitate early identification of progressive cases and guide personalized therapeutic interventions. Funding: New Energy and Industrial Technology Development Organization (NEDO) and the Japanese Ministry of Health, Labour and Welfare.
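The downstream progression classifier over the 19 findings can be sketched as follows; the synthetic arrays and hyperparameters are placeholders, not those used for CRAI.

```python
import numpy as np
from xgboost import XGBClassifier

findings = np.random.rand(135, 19)          # per-case scores for the 19 findings
progressed = np.random.randint(0, 2, 135)   # disease-progression label

clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
clf.fit(findings, progressed)
risk = clf.predict_proba(findings)[:, 1]    # predicted probability of progression
```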