
Ontology-Based Concept Distillation for Radiology Report Retrieval and Labeling

Felix Nützel, Mischa Dombrowski, Bernhard Kainz

arXiv preprint · Aug 27, 2025
Retrieval-augmented learning based on radiology reports has emerged as a promising direction to improve performance on long-tail medical imaging tasks, such as rare disease detection in chest X-rays. Most existing methods rely on comparing high-dimensional text embeddings from models like CLIP or CXR-BERT, which are often difficult to interpret, computationally expensive, and not well-aligned with the structured nature of medical knowledge. We propose a novel, ontology-driven alternative for comparing radiology report texts based on clinically grounded concepts from the Unified Medical Language System (UMLS). Our method extracts standardised medical entities from free-text reports using an enhanced pipeline built on RadGraph-XL and SapBERT. These entities are linked to UMLS concepts (CUIs), enabling a transparent, interpretable set-based representation of each report. We then define a task-adaptive similarity measure based on a modified and weighted version of the Tversky Index that accounts for synonymy, negation, and hierarchical relationships between medical entities. This allows efficient and semantically meaningful similarity comparisons between reports. We demonstrate that our approach outperforms state-of-the-art embedding-based retrieval methods in a radiograph classification task on MIMIC-CXR, particularly in long-tail settings. Additionally, we use our pipeline to generate ontology-backed disease labels for MIMIC-CXR, offering a valuable new resource for downstream learning tasks. Our work provides more explainable, reliable, and task-specific retrieval strategies in clinical AI systems, especially when interpretability and domain knowledge integration are essential. Our code is available at https://github.com/Felix-012/ontology-concept-distillation
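
To make the set-based similarity concrete, here is a minimal sketch of a weighted Tversky index over two reports reduced to sets of UMLS CUIs. The per-concept weights, alpha/beta asymmetry, and example CUIs are illustrative assumptions, not the paper's exact task-adaptive formulation (which additionally handles synonymy, negation, and hierarchical relations).

```python
# Illustrative weighted Tversky similarity between two reports represented
# as sets of UMLS concept identifiers (CUIs). Weights and alpha/beta are
# hypothetical stand-ins for the paper's task-adaptive formulation.

def weighted_tversky(query_cuis: set, candidate_cuis: set,
                     weights: dict, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index: shared mass over shared mass plus weighted differences."""
    def mass(cuis):
        return sum(weights.get(c, 1.0) for c in cuis)

    common = mass(query_cuis & candidate_cuis)
    only_query = mass(query_cuis - candidate_cuis)
    only_candidate = mass(candidate_cuis - query_cuis)
    denom = common + alpha * only_query + beta * only_candidate
    return common / denom if denom > 0 else 0.0

# Toy usage with illustrative CUIs (e.g. pneumonia, pleural effusion)
report_a = {"C0032285", "C0032227"}
report_b = {"C0032285"}
print(weighted_tversky(report_a, report_b, weights={"C0032285": 2.0}))
```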

HONeYBEE: Enabling Scalable Multimodal AI in Oncology Through Foundation Model-Driven Embeddings

Tripathi, A. G., Waqas, A., Schabath, M. B., Yilmaz, Y., Rasool, G.

medRxiv preprint · Aug 27, 2025
HONeYBEE (Harmonized ONcologY Biomedical Embedding Encoder) is an open-source framework that integrates multimodal biomedical data for oncology applications. It processes clinical data (structured and unstructured), whole-slide images, radiology scans, and molecular profiles to generate unified patient-level embeddings using domain-specific foundation models and fusion strategies. These embeddings enable survival prediction, cancer-type classification, patient similarity retrieval, and cohort clustering. Evaluated on 11,400+ patients across 33 cancer types from The Cancer Genome Atlas (TCGA), clinical embeddings showed the strongest single-modality performance with 98.5% classification accuracy and 96.4% precision@10 in patient retrieval. They also achieved the highest survival prediction concordance indices across most cancer types. Multimodal fusion provided complementary benefits for specific cancers, improving overall survival prediction beyond clinical features alone. Comparative evaluation of four large language models revealed that general-purpose models like Qwen3 outperformed specialized medical models for clinical text representation, though task-specific fine-tuning improved performance on heterogeneous data such as pathology reports.
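
As a rough illustration of patient-level fusion and retrieval evaluation, the sketch below concatenates L2-normalized per-modality embeddings and scores precision@10 by cosine similarity. Dimensions, modality names, and the concatenation rule are assumptions; the framework itself supports more elaborate fusion strategies.

```python
import numpy as np

# Hypothetical per-modality embeddings for N patients (dims illustrative).
N = 100
clinical = np.random.randn(N, 256)
imaging = np.random.randn(N, 512)

def l2norm(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Simple late fusion: normalise each modality, then concatenate.
fused = np.concatenate([l2norm(clinical), l2norm(imaging)], axis=1)

# Cosine-similarity retrieval and precision@10 against cancer-type labels.
labels = np.random.randint(0, 33, size=N)   # 33 TCGA cancer types
sims = l2norm(fused) @ l2norm(fused).T
np.fill_diagonal(sims, -np.inf)             # exclude self-matches
top10 = np.argsort(-sims, axis=1)[:, :10]
prec_at_10 = (labels[top10] == labels[:, None]).mean()
print(f"precision@10 = {prec_at_10:.3f}")
```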

Is the medical image segmentation problem solved? A survey of current developments and future directions

Guoping Xu, Jayaram K. Udupa, Jax Luo, Songlin Zhao, Yajun Yu, Scott B. Raymond, Hao Peng, Lipeng Ning, Yogesh Rathi, Wei Liu, You Zhang

arXiv preprint · Aug 27, 2025
Medical image segmentation has advanced rapidly over the past two decades, largely driven by deep learning, which has enabled accurate and efficient delineation of cells, tissues, organs, and pathologies across diverse imaging modalities. This progress raises a fundamental question: to what extent have current models overcome persistent challenges, and what gaps remain? In this work, we provide an in-depth review of medical image segmentation, tracing its progress and key developments over the past decade. We examine core principles, including multiscale analysis, attention mechanisms, and the integration of prior knowledge, across the encoder, bottleneck, skip connections, and decoder components of segmentation networks. Our discussion is organized around seven key dimensions: (1) the shift from supervised to semi-/unsupervised learning, (2) the transition from organ segmentation to lesion-focused tasks, (3) advances in multi-modality integration and domain adaptation, (4) the role of foundation models and transfer learning, (5) the move from deterministic to probabilistic segmentation, (6) the progression from 2D to 3D and 4D segmentation, and (7) the trend from model invocation to segmentation agents. Together, these perspectives provide a holistic overview of the trajectory of deep learning-based medical image segmentation and aim to inspire future innovation. To support ongoing research, we maintain a continually updated repository of relevant literature and open-source resources at https://github.com/apple1986/medicalSegReview

AT-CXR: Uncertainty-Aware Agentic Triage for Chest X-rays

Xueyang Li, Mingze Jiang, Gelei Xu, Jun Xia, Mengzhao Jia, Danny Chen, Yiyu Shi

arXiv preprint · Aug 26, 2025
Agentic AI is advancing rapidly, yet truly autonomous medical-imaging triage, where a system decides when to stop, escalate, or defer under real constraints, remains relatively underexplored. To address this gap, we introduce AT-CXR, an uncertainty-aware agent for chest X-rays. The system estimates per-case confidence and distributional fit, then follows a stepwise policy to issue an automated decision or abstain with a suggested label for human intervention. We evaluate two router designs that share the same inputs and actions: a deterministic rule-based router and an LLM-decided router. Across five-fold evaluation on a balanced subset of the NIH ChestX-ray14 dataset, both variants outperform strong zero-shot vision-language models and state-of-the-art supervised classifiers, achieving higher full-coverage accuracy and superior selective-prediction performance, evidenced by a lower area under the risk-coverage curve (AURC) and a lower error rate at high coverage, while operating with lower latency that meets practical clinical constraints. The two routers provide complementary operating points, enabling deployments to prioritize either maximal throughput or maximal accuracy. Our code is available at https://github.com/XLIAaron/uncertainty-aware-cxr-agent.
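
The selective-prediction metric cited above, area under the risk-coverage curve, can be computed from per-case confidences and correctness alone. A minimal sketch, assuming confidence scores are already produced by the agent:

```python
import numpy as np

# AURC sketch: sort cases by descending confidence, then average the
# cumulative error rate (risk) as coverage grows. Lower is better.
def aurc(confidence: np.ndarray, correct: np.ndarray) -> float:
    order = np.argsort(-confidence)                 # most confident first
    errors = 1.0 - correct[order].astype(float)
    risk = np.cumsum(errors) / np.arange(1, len(errors) + 1)
    return risk.mean()

# Illustrative values, not results from the paper.
conf = np.array([0.95, 0.80, 0.70, 0.55])
correct = np.array([1, 1, 0, 1])
print(aurc(conf, correct))
```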

Displacement-Guided Anisotropic 3D-MRI Super-Resolution with Warp Mechanism.

Wang L, Liu S, Yu Z, Du J, Li Y

PubMed paper · Aug 25, 2025
Enhancing the resolution of Magnetic Resonance Imaging (MRI) through super-resolution (SR) reconstruction is crucial for boosting diagnostic precision. However, current SR methods rely primarily on single low-resolution (LR) images or multi-contrast features, limiting detail restoration. Inspired by video frame interpolation, this work exploits the spatiotemporal correlations between adjacent slices to reformulate the SR task for anisotropic 3D-MRI images as the generation of new high-resolution (HR) slices between adjacent 2D slices. The generated SR slices are then combined with the adjacent HR slices to form a new HR 3D-MRI image. We propose an innovative network architecture termed DGWMSR, comprising a backbone network and a feature supplement module (FSM). The backbone's core innovations are the displacement former block (DFB) module, which independently extracts structural and displacement features, and the mask-displacement vector network (MDVNet), which combines with a warp mechanism to refine edge-pixel detail. The DFB integrates the inter-slice attention (ISA) mechanism into the Transformer, effectively minimizing mutual interference between the two feature types and mitigating volume effects during reconstruction. Additionally, the FSM module combines self-attention with a feed-forward neural network to emphasize critical details derived from the backbone architecture. Experimental results demonstrate that the DGWMSR network outperforms current MRI SR methods on the Kirby21, ANVIL-adult, and MSSEG datasets. Our code is publicly available at https://github.com/Dohbby/DGWMSR.
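
As an illustration of the warp step such a pipeline relies on, the sketch below bilinearly resamples a 2D slice with a displacement field via torch.nn.functional.grid_sample. Shapes and the zero-flow sanity check are illustrative; in the paper the displacement field is predicted by the network rather than supplied as input.

```python
import torch
import torch.nn.functional as F

# Minimal warp sketch: resample an adjacent slice with a displacement field,
# as in video-frame-interpolation-style SR. Flow is in pixel units.
def warp(slice_2d: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """slice_2d: (N,1,H,W); flow: (N,2,H,W) with channel order (dx, dy)."""
    n, _, h, w = slice_2d.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).float()            # (H,W,2), xy order
    coords = base.unsqueeze(0) + flow.permute(0, 2, 3, 1)   # displaced coords
    # Normalise to [-1, 1] as grid_sample expects
    coords[..., 0] = 2 * coords[..., 0] / (w - 1) - 1
    coords[..., 1] = 2 * coords[..., 1] / (h - 1) - 1
    return F.grid_sample(slice_2d, coords, mode="bilinear", align_corners=True)

slice_a = torch.rand(1, 1, 64, 64)
flow = torch.zeros(1, 2, 64, 64)   # zero displacement should return the input
print(torch.allclose(warp(slice_a, flow), slice_a, atol=1e-5))
```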

A Deep Learning Pipeline for Mapping in situ Network-level Neurovascular Coupling in Multi-photon Fluorescence Microscopy

Rozak, M. W., Mester, J. R., Attarpour, A., Dorr, A., Patel, S., Koletar, M., Hill, M. E., McLaurin, J., Goubran, M., Stefanovic, B.

bioRxiv preprint · Aug 25, 2025
Functional hyperaemia is a well-established hallmark of healthy brain function, whereby local brain blood flow adjusts in response to a change in the activity of the surrounding neurons. Although functional hyperaemia has been extensively studied at the level of both tissue and individual vessels, vascular network-level coordination remains largely unknown. To bridge this gap, we developed a deep learning-based computational pipeline that uses two-photon fluorescence microscopy images of cerebral microcirculation to enable automated reconstruction and quantification of the geometric changes across the microvascular network, comprising hundreds of interconnected blood vessels, pre- and post-activation of the neighbouring neurons. The pipeline's utility was demonstrated in the Thy1-ChR2 optogenetic mouse model, where we observed network-wide vessel radius changes to depend on the photostimulation intensity, with both dilations and constrictions occurring across the cortical depth, at an average of 16.1 ± 14.3 µm (mean ± SD) away from the most proximal neuron for dilations, and at 21.9 ± 14.6 µm away for constrictions. We observed significant heterogeneity of the vascular radius changes within vessels, with radius adjustment varying by an average of 24 ± 28% of the resting diameter, likely reflecting the heterogeneous distribution of contractile cells on the vessel walls. A graph theory-based network analysis revealed that the assortativity of adjacent blood vessel responses rose by 152 ± 65% at 4.3 mW/mm² of blue photostimulation vs. the control, with a 4% median increase in the efficiency of the capillary networks at this stimulation level relative to baseline. Interrogating individual vessels is thus not sufficient to predict how blood flow is modulated across the network. Our computational pipeline, to be made openly available, enables tracking of the microvascular network geometry over time, relating caliber adjustments to the state of vessel wall-associated cells, and mapping network-level flow distribution impairments in experimental models of disease.
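
The network-level readouts mentioned here, response assortativity and capillary network efficiency, are standard graph measures. A toy sketch on a hypothetical vessel graph follows; the graph and radius-change values are illustrative, not the study's data.

```python
import numpy as np
import networkx as nx

# Vessels as nodes carrying a fractional radius-change value "dr".
G = nx.Graph([(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)])
dr = {0: 0.12, 1: 0.08, 2: -0.05, 3: 0.10}

# Assortativity of adjacent responses: Pearson correlation of the dr values
# at the two endpoints of each edge, symmetrised over edge orientation.
pairs = np.array([(dr[u], dr[v]) for u, v in G.edges()])
x = np.concatenate([pairs[:, 0], pairs[:, 1]])
y = np.concatenate([pairs[:, 1], pairs[:, 0]])
assortativity = np.corrcoef(x, y)[0, 1]

# Global efficiency: mean inverse shortest-path length over node pairs.
efficiency = nx.global_efficiency(G)
print(assortativity, efficiency)
```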

DYNAFormer: Enhancing transformer segmentation with dynamic anchor mask for medical imaging.

Nguyen TC, Phung KA, Dao TTP, Nguyen-Mau TH, Nguyen-Quang T, Pham CN, Le TN, Shen J, Nguyen TV, Tran MT

PubMed paper · Aug 25, 2025
Polyp shape is critical for diagnosing colorectal polyps and assessing cancer risk, yet there is limited data on segmenting pedunculated and sessile polyps. This paper introduces PolypDB_INS, a dataset of 4403 images containing 4918 annotated polyps, specifically for sessile and pedunculated polyps. In addition, we propose DYNAFormer, a novel transformer-based model utilizing an anchor mask-guided mechanism that incorporates cross-attention, dynamic query updates, and query denoising for improved object segmentation. Treating each positional query as an anchor mask dynamically updated through the decoder layers enhances perceptual information about the object's position, allowing more precise segmentation of complex structures such as polyps. Extensive experiments on the PolypDB_INS dataset using standard evaluation metrics for both instance and semantic segmentation show that DYNAFormer significantly outperforms state-of-the-art methods. Ablation studies confirm the effectiveness of the proposed techniques, highlighting the model's robustness for diagnosing colorectal cancer. The source code and dataset are available at https://github.com/ntcongvn/DYNAFormer.
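
A minimal sketch of anchor-mask-guided cross-attention in the style this abstract builds on: each query attends only to positions where its current anchor mask is active, then the mask is re-predicted from the updated query. Shapes, the threshold, and the mask re-prediction rule are assumptions, not DYNAFormer's exact design.

```python
import torch

# Hypothetical shapes: Q queries, C channels, HW flattened pixels.
def masked_cross_attention(queries, pixel_feats, anchor_masks, tau=0.5):
    """queries: (Q,C); pixel_feats: (HW,C); anchor_masks: (Q,HW) in [0,1]."""
    scores = queries @ pixel_feats.T / pixel_feats.shape[1] ** 0.5  # (Q,HW)
    scores = scores.masked_fill(anchor_masks < tau, float("-inf"))  # restrict
    # Fallback: if a query's mask is empty, attend uniformly instead of NaN-ing.
    empty = (anchor_masks >= tau).sum(dim=-1, keepdim=True) == 0    # (Q,1)
    scores = torch.where(empty, torch.zeros_like(scores), scores)
    attn = scores.softmax(dim=-1)
    updated = queries + attn @ pixel_feats                # residual query update
    new_masks = (updated @ pixel_feats.T).sigmoid()       # re-predict anchor mask
    return updated, new_masks

q = torch.randn(8, 32)         # 8 queries, 32 channels
feats = torch.randn(256, 32)   # 16x16 feature map, flattened
masks = torch.rand(8, 256)     # initial anchor masks
q, masks = masked_cross_attention(q, feats, masks)
print(q.shape, masks.shape)
```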

UniSino: Physics-Driven Foundational Model for Universal CT Sinogram Standardization

Xingyu Ai, Shaoyu Wang, Zhiyuan Jia, Ao Xu, Hongming Shan, Jianhua Ma, Qiegen Liu

arXiv preprint · Aug 25, 2025
During raw-data acquisition in CT imaging, diverse factors can degrade the collected sinograms, with undersampling and noise leading to severe artifacts and noise in reconstructed images, compromising diagnostic accuracy. Conventional correction methods rely on manually designed algorithms or fixed empirical parameters, but these approaches often lack generalizability across heterogeneous artifact types. To address these limitations, we propose UniSino, a foundation model for universal CT sinogram standardization. Unlike existing foundation models that operate in the image domain, UniSino standardizes data directly in the projection domain, which enables stronger generalization across diverse undersampling scenarios. Its training framework incorporates the physical characteristics of sinograms, enhancing generalization and enabling robust performance across multiple subtasks spanning four benchmark datasets. Experimental results demonstrate that UniSino achieves superior reconstruction quality in both single and mixed undersampling cases, demonstrating exceptional robustness and generalization in sinogram enhancement for CT imaging. The code is available at: https://github.com/yqx7150/UniSino.
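
To ground the projection-domain setting, the sketch below simulates sparse-view undersampling of a sinogram and a naive interpolation baseline along the angle axis. Array sizes and the degradation model are illustrative stand-ins for the benchmark subtasks, not the paper's pipeline.

```python
import numpy as np

# Toy projection-domain degradation: keep every k-th projection angle
# (sparse-view undersampling), then fill gaps by linear interpolation
# along the angle axis as a crude pre-learning baseline.
angles, detectors = 360, 128
sinogram = np.random.rand(angles, detectors)   # stand-in for real data
k = 4
kept = np.arange(0, angles, k)
undersampled = sinogram[kept]

filled = np.empty_like(sinogram)
for d in range(detectors):
    filled[:, d] = np.interp(np.arange(angles), kept, undersampled[:, d])
print(filled.shape)
```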

Improving Interpretability in Alzheimer's Prediction via Joint Learning of ADAS-Cog Scores

Nur Amirah Abd Hamid, Mohd Shahrizal Rusli, Muhammad Thaqif Iman Mohd Taufek, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

arXiv preprint · Aug 25, 2025
Accurate prediction of clinical scores is critical for early detection and prognosis of Alzheimer's disease (AD). While existing approaches primarily focus on forecasting the ADAS-Cog global score, they often overlook the predictive value of its 13 sub-scores, which capture domain-specific cognitive decline. In this study, we propose a multi-task learning (MTL) framework that jointly predicts the global ADAS-Cog score and its 13 sub-scores at Month 24 using baseline MRI and longitudinal clinical scores from baseline and Month 6. The main goal is to examine how each sub-score, particularly those associated with MRI features, contributes to the prediction of the global score, an aspect largely neglected in prior MTL studies. We employ Vision Transformer (ViT) and Swin Transformer architectures to extract imaging features, which are fused with longitudinal clinical inputs to model cognitive progression. Our results show that incorporating sub-score learning improves global score prediction. Sub-score-level analysis reveals that a small subset, especially Q1 (Word Recall), Q4 (Delayed Recall), and Q8 (Word Recognition), consistently dominates the predicted global score. However, some of these influential sub-scores exhibit high prediction errors, pointing to model instability. Further analysis suggests that this is caused by clinical feature dominance, where the model prioritizes easily predictable clinical scores over more complex MRI-derived features. These findings emphasize the need for improved multimodal fusion and adaptive loss weighting to achieve more balanced learning. Our study demonstrates the value of sub-score-informed modeling and provides insights into building more interpretable and clinically robust AD prediction frameworks. (Github repo provided)
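
A minimal sketch of the joint global-plus-sub-score setup: a shared feature vector feeds one regression head for the global score and one for the 13 sub-scores, trained with a combined loss. The feature dimension and the unweighted loss sum are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Joint regression heads over shared (e.g. fused ViT + clinical) features.
class JointADASHead(nn.Module):
    def __init__(self, feat_dim=768, n_subscores=13):
        super().__init__()
        self.global_head = nn.Linear(feat_dim, 1)         # ADAS-Cog global score
        self.sub_head = nn.Linear(feat_dim, n_subscores)  # 13 sub-scores

    def forward(self, feats):
        return self.global_head(feats).squeeze(-1), self.sub_head(feats)

model = JointADASHead()
feats = torch.randn(4, 768)                   # illustrative fused features
y_global, y_sub = torch.rand(4), torch.rand(4, 13)
pred_g, pred_s = model(feats)
loss = F.mse_loss(pred_g, y_global) + F.mse_loss(pred_s, y_sub)
loss.backward()
print(float(loss))
```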

A Weighted Vision Transformer-Based Multi-Task Learning Framework for Predicting ADAS-Cog Scores

Nur Amirah Abd Hamid, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

arXiv preprint · Aug 25, 2025
Prognostic modeling is essential for forecasting future clinical scores and enabling early detection of Alzheimer's disease (AD). While most existing methods focus on predicting the ADAS-Cog global score, they often overlook the predictive value of its 13 sub-scores, which reflect distinct cognitive domains. Some sub-scores may exert greater influence on the global score, and assigning higher loss weights to these clinically meaningful sub-scores can guide the model to focus on the more relevant cognitive domains, enhancing both predictive accuracy and interpretability. In this study, we propose a weighted Vision Transformer (ViT)-based multi-task learning (MTL) framework that jointly predicts the ADAS-Cog global score and its 13 sub-scores at Month 24 from baseline MRI scans. Our framework integrates ViT as a feature extractor and systematically investigates the impact of sub-score-specific loss weighting on model performance. Results show that our proposed weighting strategies are group-dependent: strong weighting improves performance for MCI subjects with more heterogeneous MRI patterns, while moderate weighting is more effective for CN subjects with lower variability. Our findings suggest that uniform weighting underutilizes key sub-scores and limits generalization. The proposed framework offers a flexible, interpretable approach to AD prognosis using end-to-end MRI-based learning. (Github repo link will be provided after review)
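
The loss-weighting idea can be sketched as a per-sub-score weight vector scaling each item's squared error before averaging. The weight values below (upweighting items standing in for Q1, Q4, and Q8) are illustrative assumptions, not the paper's tuned strategy.

```python
import torch

# Weighted sub-score regression loss: clinically influential items can be
# given larger weights so they dominate training.
def weighted_subscore_loss(pred, target, weights):
    """pred, target: (B,13); weights: (13,)."""
    per_item = (pred - target) ** 2        # (B,13) squared errors
    return (per_item * weights).mean()

weights = torch.ones(13)
weights[[0, 3, 7]] = 3.0                   # e.g. upweight Q1, Q4, Q8
pred, target = torch.rand(2, 13), torch.rand(2, 13)
print(float(weighted_subscore_loss(pred, target, weights)))
```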