DuPLUS: Dual-Prompt Vision-Language Framework for Universal Medical Image Segmentation and Prognosis

Numan Saeed, Tausifa Jan Saleem, Fadillah Maani, Muhammad Ridzuan, Hu Wang, Mohammad Yaqub

arXiv preprint · Oct 3, 2025
Deep learning for medical imaging is hampered by task-specific models that lack generalizability and prognostic capabilities, while existing 'universal' approaches suffer from simplistic conditioning and poor medical semantic understanding. To address these limitations, we introduce DuPLUS, a deep learning framework for efficient multi-modal medical image analysis. DuPLUS introduces a novel vision-language framework that leverages hierarchical semantic prompts for fine-grained control over the analysis task, a capability absent in prior universal models. To enable extensibility to other medical tasks, it includes a hierarchical, text-controlled architecture driven by a unique dual-prompt mechanism. For segmentation, DuPLUS generalizes across three imaging modalities and ten anatomically diverse medical datasets encompassing more than 30 organs and tumor types, outperforming state-of-the-art task-specific and universal models on 8 out of 10 datasets. We demonstrate the extensibility of its text-controlled architecture through the seamless integration of electronic health record (EHR) data for prognosis prediction: on a head and neck cancer dataset, DuPLUS achieved a Concordance Index (CI) of 0.69. Parameter-efficient fine-tuning enables rapid adaptation to new tasks and modalities from varying centers, establishing DuPLUS as a versatile and clinically relevant solution for medical image analysis. The code for this work is made available at: https://anonymous.4open.science/r/DuPLUS-6C52
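
For context on the prognosis result, the Concordance Index measures how often a model ranks patients' risks in the same order as their outcomes (1.0 is perfect, 0.5 is chance). Below is a minimal, generic sketch of Harrell's C-index for right-censored survival data; it is not DuPLUS code, and the toy inputs are invented.

```python
import itertools

def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up times
    events:      1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: model outputs; higher score means higher predicted risk
    """
    concordant, comparable = 0.0, 0
    for i, j in itertools.combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject censored, so the ordering is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0   # model ranks the earlier event as higher risk
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5   # tied scores count as half-concordant
    return concordant / comparable

# toy usage: 1.0 means perfect risk ranking, 0.5 is chance level
print(concordance_index([5, 8, 12, 20], [1, 1, 0, 1], [0.9, 0.6, 0.4, 0.2]))
```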

SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus

Ming Zhao, Wenhui Dong, Yang Zhang, Xiang Zheng, Zhonghao Zhang, Zian Zhou, Yunzhi Guan, Liukun Xu, Wei Peng, Zhaoyang Gong, Zhicheng Zhang, Dachuan Li, Xiaosheng Ma, Yuli Ma, Jianing Ni, Changjiang Jiang, Lixia Tian, Qixin Chen, Kaishun Xia, Pingping Liu, Tongshun Zhang, Zhiqiang Liu, Zhongan Bi, Chenyang Si, Tiansheng Sun, Caifeng Shan

arXiv preprint · Oct 3, 2025
Spine disorders affect 619 million people globally and are a leading cause of disability, yet AI-assisted diagnosis remains limited by the lack of level-aware, multimodal datasets. Clinical decision-making for spine disorders requires sophisticated reasoning across X-ray, CT, and MRI at specific vertebral levels. However, progress has been constrained by the absence of traceable, clinically grounded instruction data and standardized, spine-specific benchmarks. To address this, we introduce SpineMed, an ecosystem co-designed with practicing spine surgeons. It features SpineMed-450k, the first large-scale dataset explicitly designed for vertebral-level reasoning across imaging modalities, comprising over 450,000 instruction instances, and SpineBench, a clinically grounded evaluation framework. SpineMed-450k is curated from diverse sources, including textbooks, guidelines, open datasets, and ~1,000 de-identified hospital cases, using a clinician-in-the-loop pipeline with a two-stage LLM generation method (draft and revision) to ensure high-quality, traceable data for question-answering, multi-turn consultations, and report generation. SpineBench evaluates models on clinically salient axes, including level identification, pathology assessment, and surgical planning. Our comprehensive evaluation of several recent large vision-language models (LVLMs) on SpineBench reveals systematic weaknesses in fine-grained, level-specific reasoning. In contrast, our model fine-tuned on SpineMed-450k demonstrates consistent and significant improvements across all tasks. Clinician assessments confirm the diagnostic clarity and practical utility of our model's outputs.
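
The clinician-in-the-loop, two-stage (draft and revision) generation can be pictured schematically. This is a hedged sketch, not the authors' pipeline: `generate` stands in for any LLM call and `reviewer_ok` for the clinician check; both names and prompts are hypothetical.

```python
def two_stage_generation(case_text, generate, reviewer_ok):
    """Draft-and-revise instruction generation with a clinician gate.

    generate(prompt) -> str is any LLM call; reviewer_ok(text) -> bool stands
    in for the clinician-in-the-loop review. Both are hypothetical stand-ins.
    """
    draft = generate(
        "Write a vertebral-level question-answer pair grounded only in this "
        f"case:\n{case_text}"
    )
    revised = generate(
        "Revise the draft: fix any factual errors against the source case and "
        f"keep vertebral levels explicit.\nCase:\n{case_text}\nDraft:\n{draft}"
    )
    # only clinician-approved items enter the corpus
    return revised if reviewer_ok(revised) else None

# toy usage with trivial stand-ins for the LLM and the reviewer
print(two_stage_generation("L4-L5 disc herniation on MRI.",
                           generate=lambda p: p.splitlines()[-1],
                           reviewer_ok=lambda qa: bool(qa)))
```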

Multimodal investigation of the neurocognitive deficits underlying dyslexia in adulthood.

Cara C, Zantonello G, Ghio M, Tettamanti M

PubMed paper · Oct 2, 2025
Dyslexia is a neurobiological disorder characterized by reading difficulties, yet its causes remain unclear. Neuroimaging and behavioral studies have found anomalous responses in tasks requiring phonological processing, motion perception, and implicit learning, and have shown gray and white matter abnormalities in dyslexics compared to controls, indicating that dyslexia is highly heterogeneous and motivating a multifactorial approach. To evaluate whether combining behavioral and multimodal MRI measures improves sensitivity in identifying the neurocognitive traits of dyslexia compared to monocomponential approaches, 19 dyslexic and 19 control subjects underwent cognitive assessments, multiple (phonological, visual motion, rhythmic) mismatch-response functional MRI tasks, and structural diffusion-weighted imaging (DWI) and T1-weighted imaging. Between-group differences in the neurocognitive measures were tested with univariate and multivariate approaches. Results showed that dyslexics performed worse than controls in phonological tasks and presented reduced cerebellar responses to mismatching rhythmic stimuli, as well as structural disorganization in white matter tracts and cortical regions. Most importantly, a machine learning model trained with features from all three MRI modalities discriminated between dyslexics and controls with greater accuracy than single-modality models. The individual classification scores of the multimodal machine learning model correlated with behavioral reading accuracy. These results characterize dyslexia as a composite condition with multiple distinctive cognitive and brain traits.
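
The multimodal classifier in the study combines features from three MRI modalities; in the simplest "early fusion" form this amounts to concatenating per-subject feature vectors before classification. The sketch below uses random stand-in features (the study's actual features and model are not specified in the abstract), so the printed accuracy only demonstrates the pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 38  # 19 dyslexic + 19 control subjects, as in the study

# random stand-ins for per-subject feature vectors from each MRI modality
fmri_feats = rng.normal(size=(n, 20))   # mismatch-response activations
dwi_feats  = rng.normal(size=(n, 15))   # white-matter tract metrics
t1_feats   = rng.normal(size=(n, 10))   # cortical morphometry
labels     = np.array([1] * 19 + [0] * 19)

# early fusion: concatenate modalities into one feature vector per subject
X = np.hstack([fmri_feats, dwi_feats, t1_feats])

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(clf, X, labels, cv=5, scoring="accuracy")
print(f"multimodal CV accuracy: {acc.mean():.2f}")
```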

Multimodal imaging fusion and machine learning model development: differential diagnosis of spinal inflammatory lesions using combined CT Hounsfield units and MRI features.

Wang Y, Bai X, Li T, Yuan S, Zong S, Chen Y, Wang H, Song Z, Wang H, Hao Y, Qu Y, Liu J, Zhang Q, Liu G

PubMed paper · Oct 2, 2025
The objective is to develop a differential diagnosis model for tuberculous spondylitis (TS) and pyogenic spondylitis (PS) by integrating MRI morphological features and computed tomography (CT) density parameters (Hounsfield Units, HU). This study aims to leverage multimodal data complementarity to achieve fusion of qualitative and quantitative information, thereby providing clinicians with a rapid and objective decision support tool for spinal inflammatory lesion characterization. Imaging data were extracted from MRI and CT scans of patients with TS and PS, then compared and summarized. Receiver operating characteristic (ROC) curves were used to determine optimal HU value thresholds. The least absolute shrinkage and selection operator (Lasso) regression was applied to identify the most predictive features for model construction. A logistic regression-based predictive model was developed and visualized as a nomogram. Model validation was performed using bootstrap resampling, ROC analysis, and decision curve analysis (DCA). A total of 171 patients with TS (n = 91) or PS (n = 80) were included. Statistically significant differences in MRI features were observed between the two groups (P < 0.05). Additionally, significant HU value differences were found in diseased vertebral endplates, small cavitary abscesses, large cavitary abscesses, and intravertebral abscesses between TS and PS patients (P < 0.05). The predictive model incorporated seven independent predictors. Calibration curves, ROC analysis, and DCA all demonstrated excellent model performance. Combined MRI and CT HU value analysis effectively differentiates TS from PS. The predictive model integrating imaging features and quantitative parameters demonstrates high accuracy and clinical utility, offering a novel approach to optimize diagnostic and treatment strategies for spinal infectious diseases.
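
Since the study derives optimal HU cut-offs from ROC curves, the thresholding step can be illustrated with Youden's J statistic (sensitivity + specificity - 1). The HU distributions below are synthetic stand-ins (the paper's actual values are not given in the abstract), so the printed cut-off is illustrative only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# synthetic stand-ins for vertebral HU measurements; the group means and
# spreads are assumptions, not values from the study
rng = np.random.default_rng(1)
hu_ts = rng.normal(240, 60, 91)  # tuberculous spondylitis (TS), n = 91
hu_ps = rng.normal(310, 60, 80)  # pyogenic spondylitis (PS), n = 80

y = np.concatenate([np.ones(91), np.zeros(80)])  # 1 = TS, 0 = PS
hu = np.concatenate([hu_ts, hu_ps])

# score = -HU so that lower attenuation votes for TS
fpr, tpr, thresholds = roc_curve(y, -hu)
best = np.argmax(tpr - fpr)                      # Youden's J = TPR - FPR
print(f"AUC = {roc_auc_score(y, -hu):.2f}, "
      f"optimal HU cut-off ~= {-thresholds[best]:.0f}")
```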

Multimodal Foundation Models for Early Disease Detection

Md Talha Mohsin, Ismail Abdulrashid

arXiv preprint · Oct 2, 2025
Healthcare generates diverse streams of data, including electronic health records (EHR), medical imaging, genetics, and ongoing monitoring from wearable devices. Traditional diagnostic models frequently analyze these sources in isolation, which constrains their capacity to identify the cross-modal correlations essential for early disease diagnosis. Our research presents a multimodal foundation model that consolidates diverse patient data through an attention-based transformer framework. Dedicated encoders first map each modality into a shared latent space; the resulting representations are then fused using multi-head attention with residual normalization. The architecture is designed for multi-task pretraining, allowing adaptation to new diseases and datasets with little additional effort. We propose an experimental strategy that uses benchmark datasets in oncology, cardiology, and neurology to test early detection tasks. Beyond technical performance, the framework includes data governance and model management tools to improve transparency, reliability, and clinical interpretability. The proposed method works toward a single foundation model for precision diagnostics, which could improve predictive accuracy and support clinical decision-making.
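
The fusion step described above (dedicated per-modality encoders into a shared latent space, then multi-head attention with a residual connection and normalization) can be sketched in a few lines of PyTorch. All dimensions, the encoder design, and the pooling choice are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Minimal sketch of attention-based multimodal fusion: per-modality
    encoders project into a shared space, then multi-head attention with a
    residual connection and layer norm mixes the modality tokens."""

    def __init__(self, input_dims, d_model=128, n_heads=4):
        super().__init__()
        # one dedicated encoder per modality (e.g. EHR, imaging, genetics, wearables)
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, d_model), nn.GELU()) for d in input_dims
        )
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, inputs):                 # list of (batch, dim_i) tensors
        tokens = torch.stack(
            [enc(x) for enc, x in zip(self.encoders, inputs)], dim=1
        )                                      # (batch, n_modalities, d_model)
        fused, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + fused)     # residual + normalization
        return tokens.mean(dim=1)              # pooled patient representation

model = AttentionFusion(input_dims=[64, 256, 128, 32])
out = model([torch.randn(2, d) for d in [64, 256, 128, 32]])
print(out.shape)  # torch.Size([2, 128])
```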

NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems

Roman Jacome, Romario Gualdrón-Hurtado, Leon Suarez, Henry Arguello

arXiv preprint · Oct 2, 2025
Imaging inverse problems aim to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinitely many solutions that differ by elements of the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, these priors typically ignore the task-specific structure of that null-space. In this work, we propose Non-Linear Projections of the Null-Space (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, promotes solutions that lie in a low-dimensional projection of the sensing matrix's null-space learned by a neural network. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture information about the signal components to which the sensing process is fundamentally blind. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in imaging inverse problems such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.
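
The linear object NPN builds on is the orthogonal projector onto the sensing matrix's null-space, P = I - A⁺A: any component Px produces no measurement. The sketch below checks this numerically; the learned non-linear projection is the paper's contribution and is only indicated in a comment.

```python
import numpy as np

# toy undersampled sensing operator: 30 measurements of a 100-dim signal
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 100))

# orthogonal projector onto the null-space of A:  P = I - A^+ A
P_null = np.eye(100) - np.linalg.pinv(A) @ A

x = rng.normal(size=100)
print(np.linalg.norm(A @ (P_null @ x)))           # ~0: null-space part is invisible to A
print(np.allclose(A @ (x - P_null @ x), A @ x))   # True: A sees only the row-space part

# NPN's idea, schematically: instead of an image-domain prior, penalize a
# learned non-linear function of the null-space component,
#   rho(x) = || g_theta(P_null @ x) ||^2,
# where g_theta is a neural network (not reproduced here).
```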

MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs

Jiyao Liu, Jinjie Wei, Wanying Qu, Chenglong Ma, Junzhi Ning, Yunheng Li, Ying Chen, Xinzhe Luo, Pengcheng Chen, Xin Gao, Ming Hu, Huihui Xu, Xin Wang, Shujian Gao, Dingkang Yang, Zhongying Deng, Jin Ye, Lihao Liu, Junjun He, Ningsheng Xu

arXiv preprint · Oct 2, 2025
Medical Image Quality Assessment (IQA) serves as the first-mile safety gate for clinical AI, yet existing approaches remain constrained by scalar, score-based metrics and fail to reflect the descriptive, human-like reasoning process central to expert evaluation. To address this gap, we introduce MedQ-Bench, a comprehensive benchmark that establishes a perception-reasoning paradigm for language-based evaluation of medical image quality with Multi-modal Large Language Models (MLLMs). MedQ-Bench defines two complementary tasks: (1) MedQ-Perception, which probes low-level perceptual capability via human-curated questions on fundamental visual attributes; and (2) MedQ-Reasoning, encompassing both no-reference and comparison reasoning tasks, aligning model evaluation with human-like reasoning on image quality. The benchmark spans five imaging modalities and over forty quality attributes, totaling 2,600 perceptual queries and 708 reasoning assessments, covering diverse image sources including authentic clinical acquisitions, images with simulated degradations via physics-based reconstructions, and AI-generated images. To evaluate reasoning ability, we propose a multi-dimensional judging protocol that assesses model outputs along four complementary axes. We further conduct rigorous human-AI alignment validation by comparing LLM-based judgements with those of radiologists. Our evaluation of 14 state-of-the-art MLLMs demonstrates that models exhibit preliminary but unstable perceptual and reasoning skills, with insufficient accuracy for reliable clinical use. These findings highlight the need for targeted optimization of MLLMs in medical IQA. We hope that MedQ-Bench will catalyze further exploration and unlock the untapped potential of MLLMs for medical image quality evaluation.

New insights into pathogenesis, diagnosis and management of cardiac allograft vasculopathy.

David P, Roquero P, Coutance G

PubMed paper · Oct 1, 2025
Despite major advances in short-term outcomes after heart transplantation, long-term survival remains limited by chronic allograft dysfunction, with cardiac allograft vasculopathy (CAV) being the leading cause of late graft failure and an important cause of all-cause mortality. CAV is a unique and multifactorial form of transplant coronary vasculopathy, driven by a complex interplay of alloimmune responses, innate immune activation, and traditional cardiovascular risk factors. Recent insights from deep profiling of human allograft tissue have revealed the key roles of locally sustained T- and B-cell-mediated inflammation, macrophage-natural killer cell interactions, and chronic immune activation within the graft. These discoveries challenge prior models of systemic immune monitoring and highlight the importance of spatially organized, intragraft immune processes. In parallel, the diagnostic landscape of CAV is rapidly evolving. High-resolution imaging techniques such as optical coherence tomography, and advanced non-invasive tools including coronary computed tomography angiography and positron emission tomography, not only enable earlier and more precise detection of disease but also redefine the usual landscape of CAV diagnosis. New methods for individualized risk stratification, including trajectory modeling and machine learning-enhanced biopsy analysis, are paving the way for more personalized surveillance strategies. While current management remains focused on prevention, novel therapeutic targets are emerging, informed by a deeper understanding of CAV immunopathogenesis. This review provides an up-to-date synthesis of recent advances in CAV, with a focus on pathophysiology, individualized risk assessment, diagnostic innovation, and therapeutic perspectives, underscoring a paradigm shift toward more precise and proactive care in heart transplant recipients.

Application of artificial intelligence in assisting treatment of gynecologic tumors: a systematic review.

Guo L, Zhang S, Chen H, Li Y, Liu Y, Liu W, Wang Q, Tang Z, Jiang P, Wang J

PubMed paper · Oct 1, 2025
In recent years, the application of artificial intelligence (AI) in medical image analysis has drawn increasing attention in clinical studies of gynecologic tumors. This study presents the development and prospects of AI applications to assist in the treatment of gynecological oncology. The Web of Science database was screened for articles published until August 2023, using the keywords "artificial intelligence," "deep learning," "machine learning," "radiomics," "radiotherapy," "chemoradiotherapy," "neoadjuvant therapy," "immunotherapy," "gynecological malignancy," "cervical carcinoma," "cervical cancer," "ovarian cancer," "endometrial cancer," "vulvar cancer," and "vaginal cancer." Research articles related to AI-assisted treatment of gynecological cancers were included. A total of 317 articles were retrieved based on the search strategy, and 133 were selected by applying the inclusion and exclusion criteria, including 114 on cervical cancer, 10 on endometrial cancer, and 9 on ovarian cancer. Among the included studies, 44 (33%) focused on prognosis prediction, 24 (18%) on treatment response prediction, 13 (10%) on adverse event prediction, five (4%) on dose distribution prediction, and 47 (35%) on target volume delineation. Target volume delineation and dose prediction were performed using deep learning methods. For the prediction of treatment response, prognosis, and adverse events, 57 studies (70%) used conventional radiomics methods, 13 (16%) used deep learning methods, 8 (10%) used spatial-related unconventional radiomics methods, and 3 (4%) used temporal-related unconventional radiomics methods. In cervical and endometrial cancers, prediction targets mostly included treatment response, overall survival, recurrence, radiotherapy toxicity, lymph node metastasis, and dose distribution. For ovarian cancer, prediction targets included platinum sensitivity and postoperative complications. The majority of the studies were single-center, retrospective, and small-scale: 101 studies (76%) used single-center data, 125 studies (94%) were retrospective, and 127 studies (95%) included fewer than 500 cases. The application of AI in assisting treatment in gynecological oncology remains limited. Although AI has shown promising results in predicting treatment response, prognosis, adverse events, and dose distribution in gynecological oncology, these tasks still lack validation on substantial multi-center data.

Validation of novel low-dose CT methods for quantifying bone marrow in the appendicular skeleton of patients with multiple myeloma: initial results from the [¹⁸F]FDG PET/CT sub-study of the Phase 3 GMMG-HD7 Trial.

Sachpekidis C, Hajiyianni M, Grözinger M, Piller M, Kopp-Schneider A, Mai EK, John L, Sauer S, Weinhold N, Menis E, Enqvist O, Raab MS, Jauch A, Edenbrandt L, Hundemer M, Brobeil A, Jende J, Schlemmer HP, Delorme S, Goldschmidt H, Dimitrakopoulou-Strauss A

PubMed paper · Oct 1, 2025
The clinical significance of medullary abnormalities in the appendicular skeleton detected by computed tomography (CT) in patients with multiple myeloma (MM) remains incompletely elucidated. This study aims to validate novel low-dose CT-based methods for quantifying myeloma bone marrow (BM) volume in the appendicular skeleton of MM patients undergoing [¹⁸F]FDG PET/CT. Seventy-two newly diagnosed, transplantation-eligible MM patients enrolled in the randomised phase 3 GMMG-HD7 trial underwent whole-body [¹⁸F]FDG PET/CT prior to treatment and after induction therapy with either isatuximab plus lenalidomide, bortezomib, and dexamethasone or lenalidomide, bortezomib, and dexamethasone alone. Two CT-based methods using the Medical Imaging Toolkit (MITK 2.4.0.0, Heidelberg, Germany) were used to quantify BM infiltration in the appendicular skeleton: (1) a manual approach, based on calculation of the highest mean CT value (CTv) within bony canals; and (2) a semi-automated approach, based on summation of CT values across the appendicular skeleton to compute cumulative CT values (cCTv). PET/CT data were analyzed visually and via standardized uptake value (SUV) metrics, applying the Italian Myeloma criteria for PET Use (IMPeTUs). Additionally, an AI-based method was used to automatically derive whole-body metabolic tumor volume (MTV) and total lesion glycolysis (TLG) from PET scans. Post-induction, all patients were evaluated for minimal residual disease (MRD) using BM multiparametric flow cytometry. Correlation analyses were performed between imaging data and clinical, histopathological, and cytogenetic parameters, as well as treatment response. Statistical significance was defined as p < 0.05. At baseline, the median CTv (manual) was 26.1 Hounsfield units (HU) and the median cCTv (semi-automated) was 5.5 HU. Both CT-based methods showed weak but significant correlations with disease burden indicators: CTv correlated with BM plasma cell infiltration (r = 0.29; p = 0.02) and β2-microglobulin levels (r = 0.28; p = 0.02), while cCTv correlated with BM plasma cell infiltration (r = 0.25; p = 0.04). Appendicular CT values further demonstrated significant associations with PET-derived parameters. Notably, SUVmax values from the BM of long bones correlated strongly with CTv (r = 0.61; p < 0.001) and moderately with cCTv (r = 0.45; p < 0.001). Patients classified as having increased [¹⁸F]FDG uptake in the BM (Deauville Score ≥ 4), according to the IMPeTUs criteria, exhibited significantly higher CTv and cCTv values compared to those with Deauville Score < 4 (p = 0.002 for both). AI-based analysis of PET data revealed additional weak-to-moderate significant associations, with MTV correlating with CTv (r = 0.32; p = 0.008) and cCTv (r = 0.45; p < 0.001), and TLG showing correlations with CTv (r = 0.36; p = 0.002) and cCTv (r = 0.46; p < 0.001). Following induction therapy, CT values decreased significantly from baseline (median CTv = -13.8 HU, median cCTv = 5.2 HU; p < 0.001 for both), and CTv significantly correlated with SUVmax values from the BM of long bones (r = 0.59; p < 0.001). In parallel, the incidence of follow-up pathological PET/CT scans, SUV values, Deauville Scores, and AI-derived MTV and TLG values showed a significant reduction after therapy (all p < 0.001). No significant differences in CTv, cCTv, or PET-derived metrics were observed between MRD-positive and MRD-negative patients.
Novel CT-based quantification approaches for assessing BM involvement in the appendicular skeleton correlate with key clinical and PET parameters in MM. As low-dose, standardized techniques, they show promise for inclusion in MM imaging protocols, potentially enhancing assessment of disease extent and treatment response.
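
The two quantification ideas, a manual CTv (highest mean HU among candidate bony-canal ROIs) and a semi-automated cumulative cCTv aggregated over the appendicular skeleton, can be approximated as below. This is a rough analogue under stated assumptions, not the MITK-based implementation used in the study.

```python
import numpy as np

def cumulative_ct_value(ct_volume, marrow_mask):
    """Rough analogue of the semi-automated cCTv: aggregate HU over the
    segmented appendicular marrow (normalized per voxel here; the study's
    exact normalization is not given in the abstract)."""
    values = ct_volume[marrow_mask]
    return values.sum() / marrow_mask.sum()

def manual_ctv(ct_volume, canal_masks):
    """Rough analogue of the manual CTv: highest mean HU across candidate
    bony-canal regions of interest."""
    return max(ct_volume[m].mean() for m in canal_masks)

# toy volume: 10 HU background with a 'marrow' cube at 40 HU
vol = np.full((50, 50, 50), 10.0)
vol[20:30, 20:30, 20:30] = 40.0
mask = vol > 25
print(cumulative_ct_value(vol, mask))      # 40.0
print(manual_ctv(vol, [mask, vol < 25]))   # 40.0 (the marrow ROI wins)
```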