Latest Papers on Radiology AI. Tags: Mixed Modality

Your other Left! Vision-Language Models Fail to Identify Relative Positions in Medical Images

Daniel Wolf, Heiko Hillenhagen, Billurvan Taskin, Alex Bäuerle, Meinrad Beer, Michael Götz, Timo Ropinski

•preprint•Aug 1 2025

Clinical decision-making relies heavily on understanding relative positions of anatomical structures and anomalies. Therefore, for Vision-Language Models (VLMs) to be applicable in clinical practice, the ability to accurately determine relative positions on medical images is a fundamental prerequisite. Despite its importance, this capability remains highly underexplored. To address this gap, we evaluate the ability of state-of-the-art VLMs (GPT-4o, Llama3.2, Pixtral, and JanusPro) and find that all models fail at this fundamental task. Inspired by successful approaches in computer vision, we investigate whether visual prompts, such as alphanumeric or colored markers placed on anatomical structures, can enhance performance. While these markers provide moderate improvements, results remain significantly lower on medical images compared to observations made on natural images. Our evaluations suggest that, in medical imaging, VLMs rely more on prior anatomical knowledge than on actual image content for answering relative position questions, often leading to incorrect conclusions. To facilitate further research in this area, we introduce the MIRP (Medical Imaging Relative Positioning) benchmark dataset, designed to systematically evaluate the capability to identify relative positions in medical images.

Mixed Modality Classification Dataset Release In Silico Academic Lab Open Dataset Benchmark SOTA

Multimodal data curation via interoperability: use cases with the Medical Imaging and Data Resource Center.

Chen W, Whitney HM, Kahaki S, Meyer C, Li H, Sá RC, Lauderdale D, Napel S, Gersing K, Grossman RL, Giger ML

•papers•Aug 1 2025

Interoperability (the ability of data or tools from non-cooperating resources to integrate or work together with minimal effort) is particularly important for curation of multimodal datasets from multiple data sources. The Medical Imaging and Data Resource Center (MIDRC), a multi-institutional collaborative initiative to collect, curate, and share medical imaging datasets, has made interoperability with other data commons one of its top priorities. The purpose of this study was to demonstrate the interoperability between MIDRC and two other data repositories, BioData Catalyst (BDC) and National Clinical Cohort Collaborative (N3C). Using interoperability capabilities of the data repositories, we built two cohorts for example use cases, with each containing clinical and imaging data on matched patients. The representativeness of the cohorts is characterized by comparing with CDC population statistics using the Jensen-Shannon distance. The process and methods of interoperability demonstrated in this work can be utilized by MIDRC, BDC, and N3C users to create multimodal datasets for development of artificial intelligence/machine learning models.

Mixed Modality Methodology Concept Consortium Open Dataset
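The MIDRC entry above characterizes cohort representativeness with the Jensen-Shannon distance. As a rough illustration only, the sketch below computes that distance between two discrete distributions in plain Python. The function name and the age-bin proportions are hypothetical and are not taken from the paper, and the base-2 logarithm (which bounds the distance to [0, 1]) is an assumption about the convention used.

```python
import math


def jensen_shannon_distance(p, q, base=2.0):
    """Jensen-Shannon distance: square root of the JS divergence.

    p and q are non-negative weights over the same categories; they are
    normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


# Hypothetical age-bin proportions (made up for illustration)
cohort_age_bins = [0.10, 0.25, 0.40, 0.25]
reference_age_bins = [0.20, 0.30, 0.30, 0.20]
distance = jensen_shannon_distance(cohort_age_bins, reference_age_bins)
```

A distance of 0 means the cohort matches the reference distribution exactly, and larger values indicate a less representative cohort.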

Natural language processing and LLMs in liver imaging: a practical review of clinical applications.

López-Úbeda P, Martín-Noguerol T, Luna A

•papers•Aug 1 2025

Liver diseases pose a significant global health challenge due to their silent progression and high mortality. Proper interpretation of radiology reports is essential for the evaluation and management of these conditions but is limited by variability in reporting styles and the complexity of unstructured medical language. In this context, Natural Language Processing (NLP) techniques and Large Language Models (LLMs) have emerged as promising tools to extract relevant clinical information from unstructured liver radiology reports. This work reviews, from a practical point of view, the current state of NLP and LLM applications for liver disease classification, clinical feature extraction, diagnostic support, and staging from reports. It also discusses existing limitations, such as the need for high-quality annotated data, lack of explainability, and challenges in clinical integration. With responsible and validated implementation, these technologies have the potential to transform liver clinical management by enabling faster and more accurate diagnoses and optimizing radiology workflows, ultimately improving patient care in liver diseases.

Mixed Modality LLM Radiology Report Abdominal Review GenAI

Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation

Fenghe Tang, Bingkun Nian, Jianrui Ding, Wenxin Ma, Quan Quan, Chengqi Dong, Jie Yang, Wei Liu, S. Kevin Zhou

•preprint•Aug 1 2025

In clinical practice, medical image analysis often requires efficient execution on resource-constrained mobile devices. However, existing mobile models, primarily optimized for natural images, tend to perform poorly on medical tasks due to the significant information density gap between natural and medical domains. Combining computational efficiency with medical imaging-specific architectural advantages remains a challenge when developing lightweight, universal, and high-performing networks. To address this, we propose a mobile model called Mobile U-shaped Vision Transformer (Mobile U-ViT) tailored for medical image segmentation. Specifically, we employ the newly proposed ConvUtr as a hierarchical patch embedding, featuring a parameter-efficient large-kernel CNN with inverted bottleneck fusion. This design exhibits transformer-like representation learning capacity while being lighter and faster. To enable efficient local-global information exchange, we introduce a novel Large-kernel Local-Global-Local (LGL) block that effectively balances the low information density and high-level semantic discrepancy of medical images. Finally, we incorporate a shallow and lightweight transformer bottleneck for long-range modeling and employ a cascaded decoder with downsample skip connections for dense prediction. Despite its reduced computational demands, our medical-optimized architecture achieves state-of-the-art performance across eight public 2D and 3D datasets covering diverse imaging modalities, including zero-shot testing on four unseen datasets. These results establish it as an efficient yet powerful and generalizable solution for mobile medical image analysis. Code is available at https://github.com/FengheTan9/Mobile-U-ViT.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA Open Code

Emerging Applications of Feature Selection in Osteoporosis Research: From Biomarker Discovery to Clinical Decision Support.

Wang J, Wang Y, Ren J, Li Z, Guo L, Lv J

•papers•Aug 1 2025

Osteoporosis (OP), a systemic skeletal disease characterized by compromised bone strength and elevated fracture susceptibility, represents a growing global health challenge that necessitates early detection and accurate risk stratification. With the exponential growth of multidimensional biomedical data in OP research, feature selection has become an indispensable machine learning paradigm that improves model generalizability. At the same time, it preserves clinical interpretability and enhances predictive accuracy. This perspective article systematically reviews the transformative role of feature selection methodologies across three critical domains of OP investigation: 1) multi-omics biomarker identification, 2) diagnostic pattern recognition, and 3) fracture risk prognostication. In biomarker discovery, advanced feature selection algorithms systematically refine high-dimensional multi-omics datasets (genomic, proteomic, metabolomic) to isolate key molecular signatures correlated with bone mineral density (BMD) trajectories and microarchitectural deterioration. For clinical diagnostics, these techniques enable efficient extraction of discriminative patterns from multimodal imaging data, including dual-energy X-ray absorptiometry (DXA), quantitative computed tomography (CT), and emerging dental radiographic biomarkers. In prognostic modeling, strategic variable selection optimizes prognostic accuracy by integrating demographic, biochemical, and biomechanical predictors while mitigating overfitting in heterogeneous patient cohorts. Current challenges include heterogeneity in dataset quality and dimensionality, translational gaps between algorithmic outputs and clinical decision parameters, and limited reproducibility across diverse populations.
Future directions should prioritize the development of adaptive feature selection frameworks capable of dynamic multi-omics data integration, coupled with hybrid intelligence systems that synergize machine-derived biomarkers with clinician expertise. Addressing these challenges requires coordinated interdisciplinary efforts to establish standardized validation protocols and create clinician-friendly decision support interfaces, ultimately bridging the gap between computational OP research and personalized patient care.

Mixed Modality Classification Musculoskeletal Review Concept Ethics GenAI

DiSC-Med: Diffusion-based Semantic Communications for Robust Medical Image Transmission

Fupei Guo, Hao Zheng, Xiang Zhang, Li Chen, Yue Wang, Songyang Zhang

•preprint•Jul 31 2025

The rapid development of artificial intelligence has driven smart health with next-generation wireless communication technologies, stimulating exciting applications in remote diagnosis and intervention. To enable a timely and effective response for remote healthcare, efficient transmission of medical data through noisy channels with limited bandwidth emerges as a critical challenge. In this work, we propose a novel diffusion-based semantic communication framework, namely DiSC-Med, for medical image transmission, where medical-enhanced compression and denoising blocks are developed for bandwidth efficiency and robustness, respectively. Unlike conventional pixel-wise communication frameworks, our proposed DiSC-Med is able to capture the key semantic information and achieve superior reconstruction performance with ultra-high bandwidth efficiency against noisy channels. Extensive experiments on real-world medical datasets validate the effectiveness of our framework, demonstrating its potential for robust and efficient telehealth applications.

Mixed Modality Reconstruction Methodology In Silico Academic Lab

Precision Medicine in Substance Use Disorders: Integrating Behavioral, Environmental, and Biological Insights.

Guerrin CGJ, Tesselaar DRM, Booij J, Schellekens AFA, Homberg JR

•papers•Jul 31 2025

Substance use disorders (SUD) are chronic, relapsing conditions marked by high variability in treatment response and frequent relapse. This variability arises from complex interactions among behavioral, environmental, and biological factors unique to each individual. Precision medicine, which tailors treatment to patient-specific characteristics, offers a promising avenue to address these challenges. This review explores key factors influencing SUD, including severity, comorbidities, drug use motives, polysubstance use, cognitive impairments, and biological and environmental influences. Advanced neuroimaging, such as MRI and PET, enables patient subtyping by identifying altered brain mechanisms, including reward, relief, and cognitive pathways, and striatal dopamine D2/3 receptor binding. Pharmacogenetic and epigenetic studies uncover how variations in dopaminergic, serotoninergic, and opioidergic systems shape treatment outcomes. Emerging biomarkers, such as neurofilament light chain, offer non-invasive relapse monitoring. Multifactorial models integrating behavioral and neural markers outperform single-factor approaches in predicting treatment success. Machine learning refines these models, while longitudinal and preclinical studies support individualized care. Despite translational hurdles, precision medicine offers transformative potential for improving SUD treatment outcomes.

Mixed Modality Classification Neurological Review Concept GenAI

Topology Optimization in Medical Image Segmentation with Fast Euler Characteristic

Liu Li, Qiang Ma, Cheng Ouyang, Johannes C. Paetzold, Daniel Rueckert, Bernhard Kainz

•preprint•Jul 31 2025

Deep learning-based medical image segmentation techniques have shown promising results when evaluated based on conventional metrics such as the Dice score or Intersection-over-Union. However, these fully automatic methods often fail to meet clinically acceptable accuracy, especially when topological constraints should be observed, e.g., continuous boundaries or closed surfaces. In medical image segmentation, the correctness of a segmentation in terms of the required topological genus sometimes is even more important than the pixel-wise accuracy. Existing topology-aware approaches commonly estimate and constrain the topological structure via the concept of persistent homology (PH). However, these methods are difficult to implement for high dimensional data due to their polynomial computational complexity. To overcome this problem, we propose a novel and fast approach for topology-aware segmentation based on the Euler Characteristic ($\chi$). First, we propose a fast formulation for $\chi$ computation in both 2D and 3D. The scalar $\chi$ error between the prediction and ground-truth serves as the topological evaluation metric. Then we estimate the spatial topology correctness of any segmentation network via a so-called topological violation map, i.e., a detailed map that highlights regions with $\chi$ errors. Finally, the segmentation results from the arbitrary network are refined based on the topological violation maps by a topology-aware correction network. Our experiments are conducted on both 2D and 3D datasets and show that our method can significantly improve topological correctness while preserving pixel-wise segmentation accuracy.

Mixed Modality Segmentation Methodology In Silico Breakthrough
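The topology entry above uses the Euler characteristic ($\chi$) of a segmentation as its topological measure. The sketch below is not the authors' fast formulation. It is a textbook computation for a 2D binary mask that treats every foreground pixel as a closed unit square and counts vertices, edges, and faces, so that $\chi = V - E + F$ (equivalently, 8-connected components minus holes). The function names are hypothetical, and taking the $\chi$ error as an absolute difference is an assumption.

```python
def euler_characteristic_2d(mask):
    """Euler characteristic of a 2D binary mask (list of rows of 0/1).

    Each foreground pixel is a closed unit square; shared corners and
    sides are counted once, and chi = V - E + F.
    """
    vertices, edges = set(), set()
    faces = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if not value:
                continue
            faces += 1
            # Four corners of the pixel at grid position (r, c)
            vertices.update([(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)])
            # Four sides, each stored with its smaller endpoint first so
            # that neighbouring pixels produce identical keys
            edges.update([
                ((r, c), (r, c + 1)),          # top
                ((r + 1, c), (r + 1, c + 1)),  # bottom
                ((r, c), (r + 1, c)),          # left
                ((r, c + 1), (r + 1, c + 1)),  # right
            ])
    return len(vertices) - len(edges) + faces


def euler_error(prediction, ground_truth):
    """Scalar chi error, taken here as an absolute difference (assumption)."""
    return abs(euler_characteristic_2d(prediction)
               - euler_characteristic_2d(ground_truth))


# A ring (one component, one hole) versus a filled square (no hole)
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
filled = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]
```

Here the ring has $\chi = 0$ and the filled square has $\chi = 1$, so a prediction that fills in the hole incurs a $\chi$ error of 1 even though only one pixel differs.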

Impact of large language models and vision deep learning models in predicting neoadjuvant rectal score for rectal cancer treated with neoadjuvant chemoradiation.

Kim HB, Tan HQ, Nei WL, Tan YCRS, Cai Y, Wang F

•papers•Jul 31 2025

This study aims to explore Deep Learning methods, namely Large Language Models (LLMs) and Computer Vision models, to accurately predict the neoadjuvant rectal (NAR) score for locally advanced rectal cancer (LARC) treated with neoadjuvant chemoradiation (NACRT). The NAR score is a validated surrogate endpoint for LARC. 160 CT scans of patients were used in this study, along with 4 different types of radiology reports, 2 generated from CT scans and the other 2 from MRI scans, both before and after NACRT. For CT scans, two different convolutional neural network approaches were used, one processing the 3D scan as a whole and the other processing it slice by slice. For radiology reports, an encoder architecture LLM was used. The performance of the approaches was quantified by the Area under the Receiver Operating Characteristic curve (AUC). The two different approaches for CT scans yielded [Formula: see text] and [Formula: see text], while the LLM trained on post-NACRT MRI reports showed the most predictive potential at [Formula: see text] and a statistically significant improvement (p = 0.03) over the baseline clinical approach (from [Formula: see text] to [Formula: see text]). This study showcases the potential of Large Language Models and the inadequacies of CT scans in predicting NAR values. Clinical trial number: Not applicable.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI
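The rectal cancer entry above reports performance as AUC. As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one (the Mann-Whitney formulation). It assumes a binary target, which the abstract does not spell out. The function name and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0/1; ties between a positive and a negative score count
    as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical model outputs for four patients
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients. A sort-based rank computation would be the usual choice for larger datasets.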

Effectiveness of Radiomics-Based Machine Learning Models in Differentiating Pancreatitis and Pancreatic Ductal Adenocarcinoma: Systematic Review and Meta-Analysis.

Zhang L, Li D, Su T, Xiao T, Zhao S

•papers•Jul 31 2025

Pancreatic ductal adenocarcinoma (PDAC) and mass-forming pancreatitis (MFP) share similar clinical, laboratory, and imaging features, making accurate diagnosis challenging. Nevertheless, PDAC is highly malignant with a poor prognosis, whereas MFP is an inflammatory condition typically responding well to medical or interventional therapies. Some investigators have explored radiomics-based machine learning (ML) models for distinguishing PDAC from MFP. However, systematic evidence supporting the feasibility of these models is insufficient, presenting a notable challenge for clinical application. This study intended to review the diagnostic performance of radiomics-based ML models in differentiating PDAC from MFP, summarize the methodological quality of the included studies, and provide evidence-based guidance for optimizing radiomics-based ML models and advancing their clinical use. PubMed, Embase, Cochrane, and Web of Science were searched for relevant studies up to June 29, 2024. Eligible studies comprised English-language cohort, case-control, or cross-sectional designs that applied fully developed radiomics-based ML models, including traditional and deep radiomics, to differentiate PDAC from MFP, while also reporting their diagnostic performance. Studies without full text, limited to image segmentation, or with insufficient outcome metrics were excluded. Methodological quality was appraised by means of the radiomics quality score. Given the limited applicability of QUADAS-2 in radiomics-based ML studies, the risk of bias was not formally assessed. Pooled sensitivity, specificity, area under the curve of summary receiver operating characteristics (SROC), likelihood ratios, and diagnostic odds ratio were estimated through a bivariate mixed-effects model. Results were presented with forest plots, SROC curves, and Fagan's nomogram.
Subgroup analysis was performed to appraise the diagnostic performance of radiomics-based ML models across various imaging modalities, including computed tomography (CT), magnetic resonance imaging, positron emission tomography-CT, and endoscopic ultrasound. This meta-analysis included 24 studies with 14,406 cases, including 7,635 PDAC cases. All studies adopted a case-control design, with 5 conducted across multiple centers. Most studies used CT as the primary imaging modality. The radiomics quality scores ranged from 5 points (14%) to 17 points (47%), with an average score of 9 (25%). The radiomics-based ML models demonstrated high diagnostic performance. Based on the independent validation sets, the pooled sensitivity, specificity, area under the curve of SROC, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio were 0.92 (95% CI 0.91-0.94), 0.90 (95% CI 0.85-0.94), 0.94 (95% CI 0.74-0.99), 9.3 (95% CI 6.0-14.2), 0.08 (95% CI 0.07-0.11), and 110 (95% CI 62-194), respectively. Radiomics-based ML models demonstrate high diagnostic accuracy in differentiating PDAC from MFP, underscoring their potential as noninvasive tools for clinical decision-making. Nonetheless, the overall methodological quality was moderate due to limitations in external validation, standardized protocols, and reproducibility. These findings support the promise of radiomics in clinical diagnostics while highlighting the need for more rigorous, multicenter research to enhance model generalizability and clinical applicability.

Mixed Modality Classification Abdominal Meta Analysis In Silico Academic Lab Benchmark SOTA
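The meta-analysis entry above reports likelihood ratios and a diagnostic odds ratio alongside sensitivity and specificity, and mentions Fagan's nomogram. The sketch below shows the standard relationships among these quantities. It is simple point-estimate arithmetic, not the bivariate mixed-effects pooling used in the study, so plugging in the pooled sensitivity (0.92) and specificity (0.90) gives values close to, but not identical with, the reported 9.3, 0.08, and 110. The function names and the 0.5 pretest probability are hypothetical.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    diagnostic_odds_ratio = lr_positive / lr_negative
    return lr_positive, lr_negative, diagnostic_odds_ratio


def post_test_probability(pretest_probability, likelihood_ratio):
    """Bayes update in odds form, the calculation behind Fagan's nomogram."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Point estimates taken from the abstract's pooled sensitivity and specificity
lr_pos, lr_neg, dor = diagnostic_ratios(0.92, 0.90)
# roughly 9.2, 0.089, and 103.5

# Hypothetical pretest probability of PDAC, for illustration only
after_positive = post_test_probability(0.5, lr_pos)
after_negative = post_test_probability(0.5, lr_neg)
```

With an even pretest probability, a positive model output raises the probability of PDAC to about 0.90 and a negative one lowers it to about 0.08, which is the kind of shift a Fagan's nomogram visualizes.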
