Sort by:
Page 2 of 1401393 results

Dynamic Prompt Generation for Interactive 3D Medical Image Segmentation Training

Tidiane Camaret Ndir, Alexander Pfefferle, Robin Tibor Schirrmeister

arxiv logopreprintOct 3 2025
Interactive 3D biomedical image segmentation requires efficient models that can iteratively refine predictions based on user prompts. Current foundation models either lack volumetric awareness or suffer from limited interactive capabilities. We propose a training strategy that combines dynamic volumetric prompt generation with content-aware adaptive cropping to optimize the use of the image encoder. Our method simulates realistic user interaction patterns during training while addressing the computational challenges of learning from sequential refinement feedback on a single GPU. For efficient training, we initialize our network using the publicly available weights from the nnInteractive segmentation model. Evaluation on the \textbf{Foundation Models for Interactive 3D Biomedical Image Segmentation} competition demonstrates strong performance with an average final Dice score of 0.6385, normalized surface distance of 0.6614, and area-under-the-curve metrics of 2.4799 (Dice) and 2.5671 (NSD).

Multimodal Carotid Risk Stratification with Large Vision-Language Models: Benchmarking, Fine-Tuning, and Clinical Insights

Daphne Tsolissou, Theofanis Ganitidis, Konstantinos Mitsis, Stergios CHristodoulidis, Maria Vakalopoulou, Konstantina Nikita

arxiv logopreprintOct 3 2025
Reliable risk assessment for carotid atheromatous disease remains a major clinical challenge, as it requires integrating diverse clinical and imaging information in a manner that is transparent and interpretable to clinicians. This study investigates the potential of state-of-the-art and recent large vision-language models (LVLMs) for multimodal carotid plaque assessment by integrating ultrasound imaging (USI) with structured clinical, demographic, laboratory, and protein biomarker data. A framework that simulates realistic diagnostic scenarios through interview-style question sequences is proposed, comparing a range of open-source LVLMs, including both general-purpose and medically tuned models. Zero-shot experiments reveal that even if they are very powerful, not all LVLMs can accurately identify imaging modality and anatomy, while all of them perform poorly in accurate risk classification. To address this limitation, LLaVa-NeXT-Vicuna is adapted to the ultrasound domain using low-rank adaptation (LoRA), resulting in substantial improvements in stroke risk stratification. The integration of multimodal tabular data in the form of text further enhances specificity and balanced accuracy, yielding competitive performance compared to prior convolutional neural network (CNN) baselines trained on the same dataset. Our findings highlight both the promise and limitations of LVLMs in ultrasound-based cardiovascular risk prediction, underscoring the importance of multimodal integration, model calibration, and domain adaptation for clinical translation.

Real-time nonlinear inversion of magnetic resonance elastography with operator learning

Juampablo E. Heras Rivera, Caitlin M. Neher, Mehmet Kurt

arxiv logopreprintOct 3 2025
$\textbf{Purpose:}$ To develop and evaluate an operator learning framework for nonlinear inversion (NLI) of brain magnetic resonance elastography (MRE) data, which enables real-time inversion of elastograms with comparable spatial accuracy to NLI. $\textbf{Materials and Methods:}$ In this retrospective study, 3D MRE data from 61 individuals (mean age, 37.4 years; 34 female) were used for development of the framework. A predictive deep operator learning framework (oNLI) was trained using 10-fold cross-validation, with the complex curl of the measured displacement field as inputs and NLI-derived reference elastograms as outputs. A structural prior mechanism, analogous to Soft Prior Regularization in the MRE literature, was incorporated to improve spatial accuracy. Subject-level evaluation metrics included Pearson's correlation coefficient, absolute relative error, and structural similarity index measure between predicted and reference elastograms across brain regions of different sizes to understand accuracy. Statistical analyses included paired t-tests comparing the proposed oNLI variants to the convolutional neural network baselines. $\textbf{Results:}$ Whole brain absolute percent error was 8.4 $\pm$ 0.5 ($\mu'$) and 10.0 $\pm$ 0.7 ($\mu''$) for oNLI and 15.8 $\pm$ 0.8 ($\mu'$) and 26.1 $\pm$ 1.1 ($\mu''$) for CNNs. Additionally, oNLI outperformed convolutional architectures as per Pearson's correlation coefficient, $r$, in the whole brain and across all subregions for both the storage modulus and loss modulus (p < 0.05). $\textbf{Conclusion:}$ The oNLI framework enables real-time MRE inversion (30,000x speedup), outperforming CNN-based approaches and maintaining the fine-grained spatial accuracy achievable with NLI in the brain.

DuPLUS: Dual-Prompt Vision-Language Framework for Universal Medical Image Segmentation and Prognosis

Numan Saeed, Tausifa Jan Saleem, Fadillah Maani, Muhammad Ridzuan, Hu Wang, Mohammad Yaqub

arxiv logopreprintOct 3 2025
Deep learning for medical imaging is hampered by task-specific models that lack generalizability and prognostic capabilities, while existing 'universal' approaches suffer from simplistic conditioning and poor medical semantic understanding. To address these limitations, we introduce DuPLUS, a deep learning framework for efficient multi-modal medical image analysis. DuPLUS introduces a novel vision-language framework that leverages hierarchical semantic prompts for fine-grained control over the analysis task, a capability absent in prior universal models. To enable extensibility to other medical tasks, it includes a hierarchical, text-controlled architecture driven by a unique dual-prompt mechanism. For segmentation, DuPLUS is able to generalize across three imaging modalities, ten different anatomically various medical datasets, encompassing more than 30 organs and tumor types. It outperforms the state-of-the-art task specific and universal models on 8 out of 10 datasets. We demonstrate extensibility of its text-controlled architecture by seamless integration of electronic health record (EHR) data for prognosis prediction, and on a head and neck cancer dataset, DuPLUS achieved a Concordance Index (CI) of 0.69. Parameter-efficient fine-tuning enables rapid adaptation to new tasks and modalities from varying centers, establishing DuPLUS as a versatile and clinically relevant solution for medical image analysis. The code for this work is made available at: https://anonymous.4open.science/r/DuPLUS-6C52

SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus

Ming Zhao, Wenhui Dong, Yang Zhang, Xiang Zheng, Zhonghao Zhang, Zian Zhou, Yunzhi Guan, Liukun Xu, Wei Peng, Zhaoyang Gong, Zhicheng Zhang, Dachuan Li, Xiaosheng Ma, Yuli Ma, Jianing Ni, Changjiang Jiang, Lixia Tian, Qixin Chen, Kaishun Xia, Pingping Liu, Tongshun Zhang, Zhiqiang Liu, Zhongan Bi, Chenyang Si, Tiansheng Sun, Caifeng Shan

arxiv logopreprintOct 3 2025
Spine disorders affect 619 million people globally and are a leading cause of disability, yet AI-assisted diagnosis remains limited by the lack of level-aware, multimodal datasets. Clinical decision-making for spine disorders requires sophisticated reasoning across X-ray, CT, and MRI at specific vertebral levels. However, progress has been constrained by the absence of traceable, clinically-grounded instruction data and standardized, spine-specific benchmarks. To address this, we introduce SpineMed, an ecosystem co-designed with practicing spine surgeons. It features SpineMed-450k, the first large-scale dataset explicitly designed for vertebral-level reasoning across imaging modalities with over 450,000 instruction instances, and SpineBench, a clinically-grounded evaluation framework. SpineMed-450k is curated from diverse sources, including textbooks, guidelines, open datasets, and ~1,000 de-identified hospital cases, using a clinician-in-the-loop pipeline with a two-stage LLM generation method (draft and revision) to ensure high-quality, traceable data for question-answering, multi-turn consultations, and report generation. SpineBench evaluates models on clinically salient axes, including level identification, pathology assessment, and surgical planning. Our comprehensive evaluation of several recently advanced large vision-language models (LVLMs) on SpineBench reveals systematic weaknesses in fine-grained, level-specific reasoning. In contrast, our model fine-tuned on SpineMed-450k demonstrates consistent and significant improvements across all tasks. Clinician assessments confirm the diagnostic clarity and practical utility of our model's outputs.

BrainIB++: Leveraging Graph Neural Networks and Information Bottleneck for Functional Brain Biomarkers in Schizophrenia

Tianzheng Hu, Qiang Li, Shu Liu, Vince D. Calhoun, Guido van Wingen, Shujian Yu

arxiv logopreprintOct 3 2025
The development of diagnostic models is gaining traction in the field of psychiatric disorders. Recently, machine learning classifiers based on resting-state functional magnetic resonance imaging (rs-fMRI) have been developed to identify brain biomarkers that differentiate psychiatric disorders from healthy controls. However, conventional machine learning-based diagnostic models often depend on extensive feature engineering, which introduces bias through manual intervention. While deep learning models are expected to operate without manual involvement, their lack of interpretability poses significant challenges in obtaining explainable and reliable brain biomarkers to support diagnostic decisions, ultimately limiting their clinical applicability. In this study, we introduce an end-to-end innovative graph neural network framework named BrainIB++, which applies the information bottleneck (IB) principle to identify the most informative data-driven brain regions as subgraphs during model training for interpretation. We evaluate the performance of our model against nine established brain network classification methods across three multi-cohort schizophrenia datasets. It consistently demonstrates superior diagnostic accuracy and exhibits generalizability to unseen data. Furthermore, the subgraphs identified by our model also correspond with established clinical biomarkers in schizophrenia, particularly emphasizing abnormalities in the visual, sensorimotor, and higher cognition brain functional network. This alignment enhances the model's interpretability and underscores its relevance for real-world diagnostic applications.

Multilevel Correlation-aware and Modal-aware Graph Convolutional Network for Diagnosing Neurodevelopmental Disorders.

Zuo S, Li Y, Qi Y, Liu A

pubmed logopapersOct 2 2025
Graph-based methods using resting-state functional magnetic resonance imaging demonstrate strong capabilities in modeling brain networks. However, existing graph-based methods often overlook inter-graph relationships, limiting their ability to capture the intrinsic features shared across individuals. Additionally, their simplistic integration strategies may fail to take full advantage of multimodal information. To address these challenges, this paper proposes a Multilevel Correlation-aware and Modal-aware Graph Convolutional Network (MCM-GCN) for the reliable diagnosis of neurodevelopmental disorders. At the individual level, we design a correlation-driven feature generation module that incorporates a pooling layer with external graph attention to perceive inter-graph correlations, generating discriminative brain embeddings and identifying disease-related regions. At the population level, to deeply integrate multimodal and multi-atlas information, a multimodal-decoupled feature enhancement module learns unique and shared embeddings from brain graphs and phenotypic data and then fuses them adaptively with graph channel attention for reliable disease classification. Extensive experiments on two public datasets for Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD) demonstrate that MCM-GCN outperforms other competing methods, with an accuracy of 92.88% for ASD and 76.55% for ADHD. The MCM-GCN framework integrates individual-level and population-level analyses, offering a comprehensive perspective for neurodevelopmental disorder diagnosis, significantly improving diagnostic accuracy while identifying key indicators. These findings highlight the potential of the MCM-GCN for imaging-assisted diagnosis of neurodevelopmental diseases, advancing interpretable deep learning in medical imaging analysis.

Disrupted network integrity and therapeutic plasticity in drug-naive panic disorders: Insights from network homogeneity.

Han Y, Yan H, Li H, Liu F, Li P, Yuan Y, Guo W

pubmed logopapersOct 2 2025
This study intended to examine network homogeneity (NH) alterations in drug-naive patients with panic disorder (PD) before and after treatment and whether NH could serve as a potential biomarker. Fifty-eight patients and 85 healthy controls (HCs) underwent resting-state functional magnetic resonance imaging. Patients were rescanned following a 4-week course of paroxetine monotherapy. NH was computed to evaluate intra-network functional integration across the Yeo 7-Network. Machine learning (ML) was employed to assess the diagnostic and prognostic potential of NH metrics. Transcriptome-neuroimaging association analyses were conducted to explore the molecular correlates of NH alterations. Compared with HCs, patients showed disrupted intra-network integration in the frontoparietal, default mode, sensorimotor, limbic, and ventral attention networks, with prominent NH alterations in the superior frontal gyrus (SFG), middle temporal gyrus (MTG), superior temporal gyrus (STG), somatosensory cortex, insular, and anterior cingulate cortex. Importantly, the SFG, MTG, and STG demonstrated cross-network abnormalities. After treatment, clinical improvement correlated with normalized NH in the SFG and additional changes in the inferior occipital gyrus and calcarine sulcus within the visual network. ML demonstrated the utility of NH for PD classification and treatment outcome prediction. Transcriptome-neuroimaging analysis identified specific gene profiles related to NH alterations. NH reflects both pathological features and treatment-related changes in PD, providing a measure of network dysfunction and therapeutic response. Cross-network NH disruptions in hub regions and visual processing may reflect core neuropharmacological mechanisms underlying PD. ML findings support the potential of NH as a neuroimaging biomarker for diagnosis and treatment monitoring in PD.

NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems

Roman Jacome, Romario Gualdrón-Hurtado, Leon Suarez, Henry Arguello

arxiv logopreprintOct 2 2025
Imaging inverse problems aims to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinite solutions in the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, these priors typically ignore the task-specific structure of that null-space. In this work, we propose \textit{Non-Linear Projections of the Null-Space} (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, promotes solutions that lie in a low-dimensional projection of the sensing matrix's null-space with a neural network. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture information orthogonal to the signal components that are fundamentally blind to the sensing process. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in various imaging inverse problems, such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.

VGDM: Vision-Guided Diffusion Model for Brain Tumor Detection and Segmentation

Arman Behnam

arxiv logopreprintOct 2 2025
Accurate detection and segmentation of brain tumors from magnetic resonance imaging (MRI) are essential for diagnosis, treatment planning, and clinical monitoring. While convolutional architectures such as U-Net have long been the backbone of medical image segmentation, their limited capacity to capture long-range dependencies constrains performance on complex tumor structures. Recent advances in diffusion models have demonstrated strong potential for generating high-fidelity medical images and refining segmentation boundaries. In this work, we propose VGDM: Vision-Guided Diffusion Model for Brain Tumor Detection and Segmentation framework, a transformer-driven diffusion framework for brain tumor detection and segmentation. By embedding a vision transformer at the core of the diffusion process, the model leverages global contextual reasoning together with iterative denoising to enhance both volumetric accuracy and boundary precision. The transformer backbone enables more effective modeling of spatial relationships across entire MRI volumes, while diffusion refinement mitigates voxel-level errors and recovers fine-grained tumor details. This hybrid design provides a pathway toward improved robustness and scalability in neuro-oncology, moving beyond conventional U-Net baselines. Experimental validation on MRI brain tumor datasets demonstrates consistent gains in Dice similarity and Hausdorff distance, underscoring the potential of transformer-guided diffusion models to advance the state of the art in tumor segmentation.
Page 2 of 1401393 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.