Sort by:
Page 1 of 27262 results
Next

Integrating Background Knowledge in Medical Semantic Segmentation with Logic Tensor Networks

Luca Bergamin, Giovanna Maria Dimitri, Fabio Aiolli

arxiv logopreprintSep 26 2025
Semantic segmentation is a fundamental task in medical image analysis, aiding medical decision-making by helping radiologists distinguish objects in an image. Research in this field has been driven by deep learning applications, which have the potential to scale these systems even in the presence of noise and artifacts. However, these systems are not yet perfected. We argue that performance can be improved by incorporating common medical knowledge into the segmentation model's loss function. To this end, we introduce Logic Tensor Networks (LTNs) to encode medical background knowledge using first-order logic (FOL) rules. The encoded rules span from constraints on the shape of the produced segmentation, to relationships between different segmented areas. We apply LTNs in an end-to-end framework with a SwinUNETR for semantic segmentation. We evaluate our method on the task of segmenting the hippocampus in brain MRI scans. Our experiments show that LTNs improve the baseline segmentation performance, especially when training data is scarce. Despite being in its preliminary stages, we argue that neurosymbolic methods are general enough to be adapted and applied to other medical semantic segmentation tasks.

Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

Miao Jing, Mengting Jia, Junling Lin, Zhongxia Shen, Lijun Wang, Yuanyuan Peng, Huan Gao, Mingkun Xu, Shangyang Li

arxiv logopreprintSep 26 2025
Recent advances in vision-language models (VLMs) have achieved remarkable performance on standard medical benchmarks, yet their true clinical reasoning ability remains unclear. Existing datasets predominantly emphasize classification accuracy, creating an evaluation illusion in which models appear proficient while still failing at high-stakes diagnostic reasoning. We introduce Neural-MedBench, a compact yet reasoning-intensive benchmark specifically designed to probe the limits of multimodal clinical reasoning in neurology. Neural-MedBench integrates multi-sequence MRI scans, structured electronic health records, and clinical notes, and encompasses three core task families: differential diagnosis, lesion recognition, and rationale generation. To ensure reliable evaluation, we develop a hybrid scoring pipeline that combines LLM-based graders, clinician validation, and semantic similarity metrics. Through systematic evaluation of state-of-the-art VLMs, including GPT-4o, Claude-4, and MedGemma, we observe a sharp performance drop compared to conventional datasets. Error analysis shows that reasoning failures, rather than perceptual errors, dominate model shortcomings. Our findings highlight the necessity of a Two-Axis Evaluation Framework: breadth-oriented large datasets for statistical generalization, and depth-oriented, compact benchmarks such as Neural-MedBench for reasoning fidelity. We release Neural-MedBench at https://neuromedbench.github.io/ as an open and extensible diagnostic testbed, which guides the expansion of future benchmarks and enables rigorous yet cost-effective assessment of clinically trustworthy AI.

Uncovering Alzheimer's Disease Progression via SDE-based Spatio-Temporal Graph Deep Learning on Longitudinal Brain Networks

Houliang Zhou, Rong Zhou, Yangying Liu, Kanhao Zhao, Li Shen, Brian Y. Chen, Yu Zhang, Lifang He, Alzheimer's Disease Neuroimaging Initiative

arxiv logopreprintSep 26 2025
Identifying objective neuroimaging biomarkers to forecast Alzheimer's disease (AD) progression is crucial for timely intervention. However, this task remains challenging due to the complex dysfunctions in the spatio-temporal characteristics of underlying brain networks, which are often overlooked by existing methods. To address these limitations, we develop an interpretable spatio-temporal graph neural network framework to predict future AD progression, leveraging dual Stochastic Differential Equations (SDEs) to model the irregularly-sampled longitudinal functional magnetic resonance imaging (fMRI) data. We validate our approach on two independent cohorts, including the Open Access Series of Imaging Studies (OASIS-3) and the Alzheimer's Disease Neuroimaging Initiative (ADNI). Our framework effectively learns sparse regional and connective importance probabilities, enabling the identification of key brain circuit abnormalities associated with disease progression. Notably, we detect the parahippocampal cortex, prefrontal cortex, and parietal lobule as salient regions, with significant disruptions in the ventral attention, dorsal attention, and default mode networks. These abnormalities correlate strongly with longitudinal AD-related clinical symptoms. Moreover, our interpretability strategy reveals both established and novel neural systems-level and sex-specific biomarkers, offering new insights into the neurobiological mechanisms underlying AD progression. Our findings highlight the potential of spatio-temporal graph-based learning for early, individualized prediction of AD progression, even in the context of irregularly-sampled longitudinal imaging data.

Deep-learning-based Radiomics on Mitigating Post-treatment Obesity for Pediatric Craniopharyngioma Patients after Surgery and Proton Therapy

Wenjun Yang, Chia-Ho Hua, Tina Davis, Jinsoo Uh, Thomas E. Merchant

arxiv logopreprintSep 25 2025
Purpose: We developed an artificial neural network (ANN) combining radiomics with clinical and dosimetric features to predict the extent of body mass index (BMI) increase after surgery and proton therapy, with advantage of improved accuracy and integrated key feature selection. Methods and Materials: Uniform treatment protocol composing of limited surgery and proton radiotherapy was given to 84 pediatric craniopharyngioma patients (aged 1-20 years). Post-treatment obesity was classified into 3 groups (<10%, 10-20%, and >20%) based on the normalized BMI increase during a 5-year follow-up. We developed a densely connected 4-layer ANN with radiomics calculated from pre-surgery MRI (T1w, T2w, and FLAIR), combining clinical and dosimetric features as input. Accuracy, area under operative curve (AUC), and confusion matrices were compared with random forest (RF) models in a 5-fold cross-validation. The Group lasso regularization optimized a sparse connection to input neurons to identify key features from high-dimensional input. Results: Classification accuracy of the ANN reached above 0.9 for T1w, T2w, and FLAIR MRI. Confusion matrices showed high true positive rates of above 0.9 while the false positive rates were below 0.2. Approximately 10 key features selected for T1w, T2w, and FLAIR MRI, respectively. The ANN improved classification accuracy by 10% or 5% when compared to RF models without or with radiomic features. Conclusion: The ANN model improved classification accuracy on post-treatment obesity compared to conventional statistics models. The clinical features selected by Group lasso regularization confirmed our practical observation, while the additional radiomic and dosimetric features could serve as imaging markers and mitigation methods on post-treatment obesity for pediatric craniopharyngioma patients.

T2I-Diff: fMRI Signal Generation via Time-Frequency Image Transform and Classifier-Free Denoising Diffusion Models

Hwa Hui Tew, Junn Yong Loo, Yee-Fan Tan, Xinyu Tang, Hernando Ombao, Fuad Noman, Raphael C. -W. Phan, Chee-Ming Ting

arxiv logopreprintSep 25 2025
Functional Magnetic Resonance Imaging (fMRI) is an advanced neuroimaging method that enables in-depth analysis of brain activity by measuring dynamic changes in the blood oxygenation level-dependent (BOLD) signals. However, the resource-intensive nature of fMRI data acquisition limits the availability of high-fidelity samples required for data-driven brain analysis models. While modern generative models can synthesize fMRI data, they often underperform because they overlook the complex non-stationarity and nonlinear BOLD dynamics. To address these challenges, we introduce T2I-Diff, an fMRI generation framework that leverages time-frequency representation of BOLD signals and classifier-free denoising diffusion. Specifically, our framework first converts BOLD signals into windowed spectrograms via a time-dependent Fourier transform, capturing both the underlying temporal dynamics and spectral evolution. Subsequently, a classifier-free diffusion model is trained to generate class-conditioned frequency spectrograms, which are then reverted to BOLD signals via inverse Fourier transforms. Finally, we validate the efficacy of our approach by demonstrating improved accuracy and generalization in downstream fMRI-based brain network classification.

Patch-Based Diffusion for Data-Efficient, Radiologist-Preferred MRI Reconstruction

Rohan Sanda, Asad Aali, Andrew Johnston, Eduardo Reis, Jonathan Singh, Gordon Wetzstein, Sara Fridovich-Keil

arxiv logopreprintSep 25 2025
Magnetic resonance imaging (MRI) requires long acquisition times, raising costs, reducing accessibility, and making scans more susceptible to motion artifacts. Diffusion probabilistic models that learn data-driven priors can potentially assist in reducing acquisition time. However, they typically require large training datasets that can be prohibitively expensive to collect. Patch-based diffusion models have shown promise in learning effective data-driven priors over small real-valued datasets, but have not yet demonstrated clinical value in MRI. We extend the Patch-based Diffusion Inverse Solver (PaDIS) to complex-valued, multi-coil MRI reconstruction, and compare it against a state-of-the-art whole-image diffusion baseline (FastMRI-EDM) for 7x undersampled MRI reconstruction on the FastMRI brain dataset. We show that PaDIS-MRI models trained on small datasets of as few as 25 k-space images outperform FastMRI-EDM on image quality metrics (PSNR, SSIM, NRMSE), pixel-level uncertainty, cross-contrast generalization, and robustness to severe k-space undersampling. In a blinded study with three radiologists, PaDIS-MRI reconstructions were chosen as diagnostically superior in 91.7% of cases, compared to baselines (i) FastMRI-EDM and (ii) classical convex reconstruction with wavelet sparsity. These findings highlight the potential of patch-based diffusion priors for high-fidelity MRI reconstruction in data-scarce clinical settings where diagnostic confidence matters.

Interpreting Convolutional Neural Network Activation Maps with Hand-crafted Radiomics Features on Progression of Pediatric Craniopharyngioma after Irradiation Therapy

Wenjun Yang, Chuang Wang, Tina Davis, Jinsoo Uh, Chia-Ho Hua, Thomas E. Merchant

arxiv logopreprintSep 25 2025
Purpose: Convolutional neural networks (CNNs) are promising in predicting treatment outcome for pediatric craniopharyngioma while the decision mechanisms are difficult to interpret. We compared the activation maps of CNN with hand crafted radiomics features of a densely connected artificial neural network (ANN) to correlate with clinical decisions. Methods: A cohort of 100 pediatric craniopharyngioma patients were included. Binary tumor progression was classified by an ANN and CNN with input of T1w, T2w, and FLAIR MRI. Hand-crafted radiomic features were calculated from the MRI using the LifeX software and key features were selected by Group lasso regularization, comparing to the activation maps of CNN. We evaluated the radiomics models by accuracy, area under receiver operational curve (AUC), and confusion matrices. Results: The average accuracy of T1w, T2w, and FLAIR MRI was 0.85, 0.92, and 0.86 (ANOVA, F = 1.96, P = 0.18) with ANN; 0.83, 0.81, and 0.70 (ANOVA, F = 10.11, P = 0.003) with CNN. The average AUC of ANN was 0.91, 0.97, and 0.90; 0.86, 0.88, and 0.75 of CNN for the 3 MRI, respectively. The activation maps were correlated with tumor shape, min and max intensity, and texture features. Conclusions: The tumor progression for pediatric patients with craniopharyngioma achieved promising accuracy with ANN and CNN model. The activation maps extracted from different levels were interpreted with hand-crafted key features of ANN.

Graph-Radiomic Learning (GrRAiL) Descriptor to Characterize Imaging Heterogeneity in Confounding Tumor Pathologies

Dheerendranath Battalapalli, Apoorva Safai, Maria Jaramillo, Hyemin Um, Gustavo Adalfo Pineda Ortiz, Ulas Bagci, Manmeet Singh Ahluwalia, Marwa Ismail, Pallavi Tiwari

arxiv logopreprintSep 23 2025
A significant challenge in solid tumors is reliably distinguishing confounding pathologies from malignant neoplasms on routine imaging. While radiomics methods seek surrogate markers of lesion heterogeneity on CT/MRI, many aggregate features across the region of interest (ROI) and miss complex spatial relationships among varying intensity compositions. We present a new Graph-Radiomic Learning (GrRAiL) descriptor for characterizing intralesional heterogeneity (ILH) on clinical MRI scans. GrRAiL (1) identifies clusters of sub-regions using per-voxel radiomic measurements, then (2) computes graph-theoretic metrics to quantify spatial associations among clusters. The resulting weighted graphs encode higher-order spatial relationships within the ROI, aiming to reliably capture ILH and disambiguate confounding pathologies from malignancy. To assess efficacy and clinical feasibility, GrRAiL was evaluated in n=947 subjects spanning three use cases: differentiating tumor recurrence from radiation effects in glioblastoma (GBM; n=106) and brain metastasis (n=233), and stratifying pancreatic intraductal papillary mucinous neoplasms (IPMNs) into no+low vs high risk (n=608). In a multi-institutional setting, GrRAiL consistently outperformed state-of-the-art baselines - Graph Neural Networks (GNNs), textural radiomics, and intensity-graph analysis. In GBM, cross-validation (CV) and test accuracies for recurrence vs pseudo-progression were 89% and 78% with >10% test-accuracy gains over comparators. In brain metastasis, CV and test accuracies for recurrence vs radiation necrosis were 84% and 74% (>13% improvement). For IPMN risk stratification, CV and test accuracies were 84% and 75%, showing >10% improvement.

Learning neuroimaging models from health system-scale data

Yiwei Lyu, Samir Harake, Asadur Chowdury, Soumyanil Banerjee, Rachel Gologorsky, Shixuan Liu, Anna-Katharina Meissner, Akshay Rao, Chenhui Zhao, Akhil Kondepudi, Cheng Jiang, Xinhai Hou, Rushikesh S. Joshi, Volker Neuschmelting, Ashok Srinivasan, Dawn Kleindorfer, Brian Athey, Vikas Gulani, Aditya Pandey, Honglak Lee, Todd Hollon

arxiv logopreprintSep 23 2025
Neuroimaging is a ubiquitous tool for evaluating patients with neurological diseases. The global demand for magnetic resonance imaging (MRI) studies has risen steadily, placing significant strain on health systems, prolonging turnaround times, and intensifying physician burnout \cite{Chen2017-bt, Rula2024-qp-1}. These challenges disproportionately impact patients in low-resource and rural settings. Here, we utilized a large academic health system as a data engine to develop Prima, the first vision language model (VLM) serving as an AI foundation for neuroimaging that supports real-world, clinical MRI studies as input. Trained on over 220,000 MRI studies, Prima uses a hierarchical vision architecture that provides general and transferable MRI features. Prima was tested in a 1-year health system-wide study that included 30K MRI studies. Across 52 radiologic diagnoses from the major neurologic disorders, including neoplastic, inflammatory, infectious, and developmental lesions, Prima achieved a mean diagnostic area under the ROC curve of 92.0, outperforming other state-of-the-art general and medical AI models. Prima offers explainable differential diagnoses, worklist priority for radiologists, and clinical referral recommendations across diverse patient demographics and MRI systems. Prima demonstrates algorithmic fairness across sensitive groups and can help mitigate health system biases, such as prolonged turnaround times for low-resource populations. These findings highlight the transformative potential of health system-scale VLMs and Prima's role in advancing AI-driven healthcare.

Neural Network-Driven Direct CBCT-Based Dose Calculation for Head-and-Neck Proton Treatment Planning

Muheng Li, Evangelia Choulilitsa, Lisa Fankhauser, Francesca Albertini, Antony Lomax, Ye Zhang

arxiv logopreprintSep 22 2025
Accurate dose calculation on cone beam computed tomography (CBCT) images is essential for modern proton treatment planning workflows, particularly when accounting for inter-fractional anatomical changes in adaptive treatment scenarios. Traditional CBCT-based dose calculation suffers from image quality limitations, requiring complex correction workflows. This study develops and validates a deep learning approach for direct proton dose calculation from CBCT images using extended Long Short-Term Memory (xLSTM) neural networks. A retrospective dataset of 40 head-and-neck cancer patients with paired planning CT and treatment CBCT images was used to train an xLSTM-based neural network (CBCT-NN). The architecture incorporates energy token encoding and beam's-eye-view sequence modelling to capture spatial dependencies in proton dose deposition patterns. Training utilized 82,500 paired beam configurations with Monte Carlo-generated ground truth doses. Validation was performed on 5 independent patients using gamma analysis, mean percentage dose error assessment, and dose-volume histogram comparison. The CBCT-NN achieved gamma pass rates of 95.1 $\pm$ 2.7% using 2mm/2% criteria. Mean percentage dose errors were 2.6 $\pm$ 1.4% in high-dose regions ($>$90% of max dose) and 5.9 $\pm$ 1.9% globally. Dose-volume histogram analysis showed excellent preservation of target coverage metrics (Clinical Target Volume V95% difference: -0.6 $\pm$ 1.1%) and organ-at-risk constraints (parotid mean dose difference: -0.5 $\pm$ 1.5%). Computation time is under 3 minutes without sacrificing Monte Carlo-level accuracy. This study demonstrates the proof-of-principle of direct CBCT-based proton dose calculation using xLSTM neural networks. The approach eliminates traditional correction workflows while achieving comparable accuracy and computational efficiency suitable for adaptive protocols.
Page 1 of 27262 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.