
In-hoc Concept Representations to Regularise Deep Learning in Medical Imaging

Valentina Corbetta, Floris Six Dijkstra, Regina Beets-Tan, Hoel Kervadec, Kristoffer Wickstrøm, Wilson Silva

arXiv preprint · Aug 19, 2025
Deep learning models in medical imaging often achieve strong in-distribution performance but struggle to generalise under distribution shifts, frequently relying on spurious correlations instead of clinically meaningful features. We introduce LCRReg, a novel regularisation approach that leverages Latent Concept Representations (LCRs) (e.g., Concept Activation Vectors (CAVs)) to guide models toward semantically grounded representations. LCRReg requires no concept labels in the main training set and instead uses a small auxiliary dataset to synthesise high-quality, disentangled concept examples. We extract LCRs for predefined relevant features, and incorporate a regularisation term that guides a Convolutional Neural Network (CNN) to activate within latent subspaces associated with those concepts. We evaluate LCRReg across synthetic and real-world medical tasks. On a controlled toy dataset, it significantly improves robustness to injected spurious correlations and remains effective even in multi-concept and multiclass settings. On the diabetic retinopathy binary classification task, LCRReg enhances performance under both synthetic spurious perturbations and out-of-distribution (OOD) generalisation. Compared to baselines, including multitask learning, linear probing, and post-hoc concept-based models, LCRReg offers a lightweight, architecture-agnostic strategy for improving model robustness without requiring dense concept supervision. Code is available at the following link: https://github.com/Trustworthy-AI-UU-NKI/lcr_regularization
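As a rough illustration of the regularisation idea described above, the sketch below adds a concept-alignment penalty to a standard task loss using CAV-like directions estimated from an auxiliary concept set. The function names, the mean-difference CAV estimate, and the weighting `lambda_reg` are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a CAV-style latent-concept regulariser in PyTorch.
import torch
import torch.nn.functional as F

def extract_cav(concept_acts, random_acts):
    """Estimate a concept direction as the normalised difference of mean
    activations between concept and random auxiliary examples."""
    cav = concept_acts.mean(dim=0) - random_acts.mean(dim=0)
    return F.normalize(cav, dim=0)

def lcr_penalty(latent, cavs, concept_masks):
    """Penalise low alignment between each sample's latent vector and the
    CAVs of its relevant concepts (cosine similarity pushed toward 1)."""
    penalty = 0.0
    for cav, mask in zip(cavs, concept_masks):  # mask: which samples should express this concept
        sims = F.cosine_similarity(latent, cav.unsqueeze(0), dim=1)
        penalty = penalty + ((1.0 - sims) * mask).mean()
    return penalty

def training_step(model, x, y, cavs, concept_masks, lambda_reg=0.1):
    # Assumes the model returns both latent features and class logits.
    latent, logits = model(x)
    return F.cross_entropy(logits, y) + lambda_reg * lcr_penalty(latent, cavs, concept_masks)
```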

Improving Deep Learning for Accelerated MRI With Data Filtering

Kang Lin, Anselm Krainovic, Kun Wang, Reinhard Heckel

arXiv preprint · Aug 19, 2025
Deep neural networks achieve state-of-the-art results for accelerated MRI reconstruction. Most research on deep-learning-based imaging focuses on improving neural network architectures that are trained and evaluated on fixed, homogeneous data. In this work, we investigate data curation strategies for improving MRI reconstruction. We assemble a large dataset of raw k-space data from 18 public sources consisting of 1.1M images and construct a diverse evaluation set comprising 48 test sets, capturing variations in anatomy, contrast, number of coils, and other key factors. We propose and study different data filtering strategies to enhance the performance of current state-of-the-art neural networks for accelerated MRI reconstruction. Our experiments show that filtering the training data leads to consistent, albeit modest, performance gains. These gains are robust across different training set sizes and accelerations, and we find that filtering is particularly beneficial when the proportion of in-distribution data in the unfiltered training set is low.
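The paper studies several filtering strategies; as one hedged example of what data filtering can look like in practice, the snippet below keeps the training examples whose embeddings fall closest to the centroid of a small in-distribution reference set. The embedding-distance criterion and the 80% keep fraction are assumptions for illustration only.

```python
# Illustrative data-filtering sketch: retain the training examples whose
# feature embeddings are nearest to an in-distribution reference centroid.
import numpy as np

def filter_by_embedding_distance(train_embeddings, reference_embeddings, keep_fraction=0.8):
    """train_embeddings: (N, D) array, one embedding per training example.
    reference_embeddings: (M, D) array from the target evaluation domain."""
    centroid = reference_embeddings.mean(axis=0)
    dists = np.linalg.norm(train_embeddings - centroid, axis=1)
    n_keep = int(keep_fraction * len(dists))
    keep_idx = np.argsort(dists)[:n_keep]  # closest examples are retained
    return keep_idx
```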

LGFFM: A Localized and Globalized Frequency Fusion Model for Ultrasound Image Segmentation.

Luo X, Wang Y, Ou-Yang L

PubMed paper · Aug 19, 2025
Accurate segmentation of ultrasound images plays a critical role in disease screening and diagnosis. Recently, neural network-based methods have garnered significant attention for their potential in improving ultrasound image segmentation. However, these methods still face significant challenges, primarily due to inherent issues in ultrasound images, such as low resolution, speckle noise, and artifacts. Additionally, ultrasound image segmentation encompasses a wide range of scenarios, including organ segmentation (e.g., cardiac and fetal head) and lesion segmentation (e.g., breast cancer and thyroid nodules), making the task highly diverse and complex. Existing methods are often designed for specific segmentation scenarios, which limits their flexibility and ability to meet the diverse needs across various scenarios. To address these challenges, we propose a novel Localized and Globalized Frequency Fusion Model (LGFFM) for ultrasound image segmentation. Specifically, we first design a Parallel Bi-Encoder (PBE) architecture that integrates Local Feature Blocks (LFB) and Global Feature Blocks (GLB) to enhance feature extraction. Additionally, we introduce a Frequency Domain Mapping Module (FDMM) to capture texture information, particularly high-frequency details such as edges. Finally, a Multi-Domain Fusion (MDF) method is developed to effectively integrate features across different domains. We conduct extensive experiments on eight representative public ultrasound datasets across four different types. The results demonstrate that LGFFM outperforms current state-of-the-art methods in both segmentation accuracy and generalization performance.
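To make the frequency-domain idea concrete, the following sketch isolates high-frequency content (edges and fine texture) from a feature map with a simple radial mask in the 2-D Fourier domain. The masking rule and cutoff are assumptions; the actual FDMM design is described in the paper.

```python
# Sketch of a frequency-domain mapping step, assuming a radial high-pass mask.
import torch

def high_frequency_features(x, cutoff=0.25):
    """x: (B, C, H, W) feature map. Returns the spatial-domain reconstruction
    of frequencies beyond `cutoff` (fraction of the Nyquist radius)."""
    B, C, H, W = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, H, device=x.device),
        torch.linspace(-1, 1, W, device=x.device),
        indexing="ij",
    )
    radius = torch.sqrt(xx ** 2 + yy ** 2)
    mask = (radius > cutoff).to(freq.dtype)  # keep only high frequencies
    high = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1)))
    return high.real
```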

Breaking Reward Collapse: Adaptive Reinforcement for Open-ended Medical Reasoning with Enhanced Semantic Discrimination

Yizhou Liu, Jingwei Wei, Zizhi Chen, Minghao Han, Xukun Zhang, Keliang Liu, Lihua Zhang

arXiv preprint · Aug 18, 2025
Reinforcement learning (RL) with rule-based rewards has demonstrated strong potential in enhancing the reasoning and generalization capabilities of vision-language models (VLMs) and large language models (LLMs), while reducing computational overhead. However, its application in medical imaging remains underexplored. Existing reinforcement fine-tuning (RFT) approaches in this domain primarily target closed-ended visual question answering (VQA), limiting their applicability to real-world clinical reasoning. In contrast, open-ended medical VQA better reflects clinical practice but has received limited attention. While some efforts have sought to unify both formats via semantically guided RL, we observe that model-based semantic rewards often suffer from reward collapse, where responses with significant semantic differences receive similar scores. To address this, we propose ARMed (Adaptive Reinforcement for Medical Reasoning), a novel RL framework for open-ended medical VQA. ARMed first incorporates domain knowledge through supervised fine-tuning (SFT) on chain-of-thought data, then applies reinforcement learning with textual correctness and adaptive semantic rewards to enhance reasoning quality. We evaluate ARMed on six challenging medical VQA benchmarks. Results show that ARMed consistently boosts both accuracy and generalization, achieving a 32.64% improvement on in-domain tasks and an 11.65% gain on out-of-domain benchmarks. These results highlight the critical role of reward discriminability in medical RL and the promise of semantically guided rewards for enabling robust and clinically meaningful multimodal reasoning.
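As a hedged sketch of how an adaptive semantic reward might counteract reward collapse, the snippet below rescales raw semantic-similarity scores within each group of sampled responses so that small semantic differences are spread over a wider reward range, then combines them with a textual-correctness term. The scaling rule and the equal weighting are assumptions, not ARMed's exact formulation.

```python
# Hypothetical adaptive semantic reward: per-group rescaling of similarity scores.
import torch

def adaptive_semantic_reward(sem_scores, correctness, eps=1e-6, alpha=0.5):
    """sem_scores: (G,) semantic similarity of each sampled response to the
    reference answer; correctness: (G,) binary textual-correctness reward."""
    spread = sem_scores.max() - sem_scores.min()
    scaled = (sem_scores - sem_scores.min()) / (spread + eps)  # stretch scores to [0, 1] per group
    return alpha * correctness + (1 - alpha) * scaled
```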

Multimodal large language models for medical image diagnosis: Challenges and opportunities.

Zhang A, Zhao E, Wang R, Zhang X, Wang J, Chen E

PubMed paper · Aug 18, 2025
The integration of artificial intelligence (AI) into radiology has significantly improved diagnostic accuracy and workflow efficiency. Multimodal large language models (MLLMs), which combine natural language processing (NLP) and computer vision techniques, hold the potential to further revolutionize medical image analysis. Despite these advances, the widespread clinical adoption of MLLMs remains limited by challenges such as data quality, interpretability, ethical and regulatory compliance (including adherence to frameworks such as the General Data Protection Regulation (GDPR)), computational demands, and generalizability across diverse patient populations. Addressing these interconnected challenges presents opportunities to enhance MLLM performance and reliability. Priorities for future research include improving model transparency, safeguarding data privacy through federated learning, optimizing multimodal fusion strategies, and establishing standardized evaluation frameworks. By overcoming these barriers, MLLMs can become essential tools in radiology, supporting clinical decision-making and improving patient outcomes.

Multiphysics modelling enhanced by imaging and artificial intelligence for personalised cancer nanomedicine: Foundations for clinical digital twins.

Kashkooli FM, Bhandari A, Gu B, Kolios MC, Kohandel M, Zhan W

PubMed paper · Aug 18, 2025
Nano-sized drug delivery systems have emerged as a more effective, versatile means for improving cancer treatment. However, drug delivery to cancer is complex, involving intricate interactions between physiological and physicochemical processes across various temporal and spatial scales. Relying solely on experimental methods for developing and clinically translating nano-sized drug delivery systems is economically unfeasible. Multiphysics models, acting as open systems, offer a viable approach by allowing control over the individual and combined effects of various influencing factors on drug delivery outcomes. This provides an effective pathway for developing, optimising, and applying nano-sized drug delivery systems. These models are specifically designed to uncover the underlying mechanisms of drug delivery and to optimise effective delivery strategies. This review outlines the diverse applications of multiphysics simulations in advancing nano-sized drug delivery systems for cancer treatment. The methods to develop these models and the integration of emerging technologies (i.e., medical imaging and artificial intelligence) are also addressed as steps towards digital twins for the personalised clinical translation of cancer nanomedicine. Multiphysics modelling tools are expected to become a powerful technology, expanding the scope of nano-sized drug delivery systems, thereby greatly enhancing cancer treatment outcomes and offering promising prospects for more effective patient care.

Interactive AI annotation of medical images in a virtual reality environment.

Orsmaa L, Saukkoriipi M, Kangas J, Rasouli N, Järnstedt J, Mehtonen H, Sahlsten J, Jaskari J, Kaski K, Raisamo R

PubMed paper · Aug 18, 2025
Artificial intelligence (AI) achieves high-quality annotations of radiological images, yet often lacks the robustness required in clinical practice. Interactive annotation starts with an AI-generated delineation, allowing radiologists to refine it with feedback, potentially improving precision and reliability. These techniques have been explored in two-dimensional desktop environments, but have not been validated by radiologists or integrated with immersive visualization technologies. We used a Virtual Reality (VR) system to determine (1) whether the annotation quality improves when radiologists can edit the AI annotation and (2) whether the extra work done by editing is worthwhile. We evaluated the clinical feasibility of an interactive VR approach to annotate mandibular and mental foramina on segmented 3D mandibular models. Three experienced dentomaxillofacial radiologists reviewed AI-generated annotations and, when needed, refined them at the voxel level in 3D space through click-based interactions until clinical standards were met. Our results indicate that integrating expert feedback within an immersive VR environment enhances annotation accuracy, improves clinical usability, and offers valuable insights for developing medical image analysis systems that incorporate radiologist input. This study is the first to compare the quality of original and interactive AI annotations and to use radiologists' opinions as the measure. More research is needed to assess generalization.

Multi-Phase Automated Segmentation of Dental Structures in CBCT Using a Lightweight Auto3DSeg and SegResNet Implementation

Dominic LaBella, Keshav Jha, Jared Robbins, Esther Yu

arXiv preprint · Aug 18, 2025
Cone-beam computed tomography (CBCT) has become an invaluable imaging modality in dentistry, enabling 3D visualization of teeth and surrounding structures for diagnosis and treatment planning. Automated segmentation of dental structures in CBCT can efficiently assist in identifying pathology (e.g., pulpal or periapical lesions) and facilitate radiation therapy planning in head and neck cancer patients. We describe the DLaBella29 team's approach for the MICCAI 2025 ToothFairy3 Challenge, which involves a deep learning pipeline for multi-class tooth segmentation. We utilized the MONAI Auto3DSeg framework with a 3D SegResNet architecture, trained on a subset of the ToothFairy3 dataset (63 CBCT scans) with 5-fold cross-validation. Key preprocessing steps included image resampling to 0.6 mm isotropic resolution and intensity clipping. We applied an ensemble fusion using Multi-Label STAPLE on the 5-fold predictions to infer a Phase 1 segmentation and then conducted tight cropping around the easily segmented Phase 1 mandible to perform Phase 2 segmentation on the smaller nerve structures. Our method achieved an average Dice of 0.87 on the ToothFairy3 challenge out-of-sample validation set. This paper details the clinical context, data preparation, model development, results of our approach, and discusses the relevance of automated dental segmentation for improving patient care in radiation oncology.
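The stated preprocessing (0.6 mm isotropic resampling and intensity clipping) can be expressed with standard MONAI dictionary transforms, as in the hedged sketch below; the clipping window values are placeholders rather than the team's actual settings.

```python
# Sketch of the described preprocessing using MONAI dictionary transforms.
from monai.transforms import Compose, LoadImaged, EnsureChannelFirstd, Spacingd, ScaleIntensityRanged

preprocess = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
    Spacingd(keys=["image", "label"], pixdim=(0.6, 0.6, 0.6),
             mode=("bilinear", "nearest")),                     # 0.6 mm isotropic resampling
    ScaleIntensityRanged(keys=["image"], a_min=-1000, a_max=3000,
                         b_min=0.0, b_max=1.0, clip=True),      # intensity clipping (window assumed)
])
```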

Modeling the MRI gradient system with a temporal convolutional network: Improved reconstruction by prediction of readout gradient errors.

Martin JB, Alderson HE, Gore JC, Does MD, Harkins KD

PubMed paper · Aug 18, 2025
Our objective is to develop a general, nonlinear gradient system model that can accurately predict gradient distortions using convolutional networks. A set of training gradient waveforms was measured on a small animal imaging system and used to train a temporal convolutional network to predict the gradient waveforms produced by the imaging system. The trained network was able to accurately predict nonlinear distortions produced by the gradient system. Network prediction of gradient waveforms was incorporated into the image reconstruction pipeline and provided improvements in image quality and diffusion parameter mapping compared to both the nominal gradient waveform and the gradient impulse response function. Temporal convolutional networks can more accurately model gradient system behavior than existing linear methods and may be used to retrospectively correct gradient errors.
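A minimal sketch of the core idea, a causal dilated 1-D convolutional network that maps a nominal gradient waveform to a predicted distorted waveform, is shown below. Layer widths, depth, and kernel size are illustrative choices, not the configuration reported in the paper.

```python
# Minimal temporal convolutional network sketch for gradient waveform prediction.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad so output depends only on past samples
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        return self.conv(nn.functional.pad(x, (self.pad, 0)))

class GradientTCN(nn.Module):
    def __init__(self, channels=32, layers=6, kernel_size=3):
        super().__init__()
        blocks, in_ch = [], 1
        for i in range(layers):
            blocks += [CausalConv1d(in_ch, channels, kernel_size, dilation=2 ** i), nn.ReLU()]
            in_ch = channels
        self.net = nn.Sequential(*blocks, nn.Conv1d(channels, 1, kernel_size=1))

    def forward(self, nominal_waveform):     # (B, 1, T) nominal gradient samples
        return self.net(nominal_waveform)    # (B, 1, T) predicted actual waveform
```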

Weighted loss for imbalanced glaucoma detection: Insights from visual explanations.

Nugraha DJ, Yudistira N, Widodo AW

PubMed paper · Aug 17, 2025
Glaucoma is a leading cause of irreversible vision loss in ophthalmology, primarily resulting from damage to the optic nerve. Early detection is crucial but remains challenging due to the inherent class imbalance in glaucoma fundus image datasets. This study addresses this limitation by applying a weighted loss function to Convolutional Neural Networks (CNNs), evaluated on the standardized SMDG-19 dataset, which integrates data from 19 publicly available sources. Key performance metrics, including recall, F1-score, precision, accuracy, and AUC, were analyzed, and interpretability was assessed using Grad-CAM. The results demonstrate that recall increased from 60.3% to 87.3%, representing a relative improvement of 44.75%, while F1-score improved from 66.5% to 71.4% (+7.25%). Minor trade-offs were observed in precision, which declined from 74.5% to 69.6% (-6.53%), and in accuracy, which dropped from 84.2% to 80.7% (-4.10%). In contrast, AUC rose from 84.2% to 87.4%, reflecting a relative gain of 3.21%. Grad-CAM visualizations showed consistent focus on clinically relevant regions of the optic nerve head, underscoring the effectiveness of the weighted loss strategy in improving both the performance and interpretability of CNN-based glaucoma detection systems.
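The weighted-loss strategy itself is standard: class weights (for example, inverse class frequencies) are passed to the cross-entropy criterion so the minority glaucoma class contributes more to the gradient. A minimal sketch with placeholder counts:

```python
# Class-weighted cross-entropy for an imbalanced binary classification task.
import torch
import torch.nn as nn

class_counts = torch.tensor([9000.0, 1000.0])              # e.g. [normal, glaucoma]; illustrative counts
weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=weights)            # minority class receives the larger weight

# Usage inside a training loop (cnn and labels are placeholders):
# logits = cnn(images)                                     # (B, 2) outputs of the CNN
# loss = criterion(logits, labels)
```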