Quest for a clinically relevant medical image segmentation metric: the definition and implementation of Medical Similarity Index

Szuzina Fazekas, Bettina Katalin Budai, Viktor Bérczi, Pál Maurovich-Horvat, Zsolt Vizi

arXiv preprint, Aug 13, 2025
Background: In the field of radiology and radiotherapy, accurate delineation of tissues and organs plays a crucial role in both diagnostics and therapeutics. While the gold standard remains expert-driven manual segmentation, many automatic segmentation methods are emerging. The evaluation of these methods primarily relies on traditional metrics that only incorporate geometrical properties and fail to adapt to various applications. Aims: This study aims to develop and implement a clinically relevant segmentation metric that can be adapted for use in various medical imaging applications. Methods: Bidirectional local distance was defined, and the points of the test contour were paired with points of the reference contour. After correcting for the distance between the test and reference centers of mass, the Euclidean distance was calculated between the paired points, and a score was given to each test point. The overall medical similarity index was calculated as the average score across all test points. For demonstration, we used myoma and prostate datasets; nnUNet neural networks were trained for segmentation. Results: An easy-to-use, sustainable image processing pipeline was created using Python. The code is available in a public GitHub repository along with Google Colaboratory notebooks. The algorithm can handle multislice images with multiple masks per slice. A mask-splitting algorithm is also provided that can separate concave masks. We demonstrate the adaptability with a prostate segmentation evaluation. Conclusions: A novel segmentation evaluation metric was implemented, and an open-access image processing pipeline was also provided, which can be easily used for automatic measurement of the clinical relevance of medical image segmentation.
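For intuition, the scoring idea described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' released pipeline: the nearest-point pairing stands in for the bidirectional local distance, and the tolerance parameter and scoring function are assumptions.

```python
import numpy as np

def msi_score(test_pts, ref_pts, tolerance=5.0):
    """Score a test contour against a reference contour; points have shape (N, 2)."""
    # Correct for the offset between the two centres of mass.
    test_c = test_pts - test_pts.mean(axis=0)
    ref_c = ref_pts - ref_pts.mean(axis=0)
    # Pair every test point with its nearest reference point (a simple stand-in
    # for the bidirectional local distance pairing used in the paper).
    dists = np.linalg.norm(test_c[:, None, :] - ref_c[None, :, :], axis=-1)
    nearest = dists.min(axis=1)
    # Map each paired distance to a [0, 1] score; the average is the overall index.
    scores = np.clip(1.0 - nearest / tolerance, 0.0, 1.0)
    return float(scores.mean())
```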

Multi-Contrast Fusion Module: An attention mechanism integrating multi-contrast features for fetal torso plane classification

Shengjun Zhu, Siyu Liu, Runqing Xiong, Liping Zheng, Duo Ma, Rongshang Chen, Jiaxin Cai

arXiv preprint, Aug 13, 2025
Purpose: Prenatal ultrasound is a key tool in evaluating fetal structural development and detecting abnormalities, contributing to reduced perinatal complications and improved neonatal survival. Accurate identification of standard fetal torso planes is essential for reliable assessment and personalized prenatal care. However, limitations such as low contrast and unclear texture details in ultrasound imaging pose significant challenges for fine-grained anatomical recognition. Methods: We propose a novel Multi-Contrast Fusion Module (MCFM) to enhance the model's ability to extract detailed information from ultrasound images. MCFM operates exclusively on the lower layers of the neural network, directly processing raw ultrasound data. By assigning attention weights to image representations under different contrast conditions, the module enhances feature modeling while explicitly maintaining minimal parameter overhead. Results: The proposed MCFM was evaluated on a curated dataset of fetal torso plane ultrasound images. Experimental results demonstrate that MCFM substantially improves recognition performance, with a minimal increase in model complexity. The integration of multi-contrast attention enables the model to better capture subtle anatomical structures, contributing to higher classification accuracy and clinical reliability. Conclusions: Our method provides an effective solution for improving fetal torso plane recognition in ultrasound imaging. By enhancing feature representation through multi-contrast fusion, the proposed approach supports clinicians in achieving more accurate and consistent diagnoses, demonstrating strong potential for clinical adoption in prenatal screening. The codes are available at https://github.com/sysll/MCFM.
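The abstract does not detail the MCFM architecture; the sketch below only illustrates the general idea of weighting features computed from several contrast-adjusted views of the same image with learned attention. The layer sizes, gamma values, and class name are assumptions, not the released code (see the repository linked above for the actual implementation).

```python
import torch
import torch.nn as nn

class MultiContrastFusion(nn.Module):
    def __init__(self, channels=16, gammas=(0.5, 1.0, 2.0)):
        super().__init__()
        self.gammas = gammas                       # example contrast (gamma) settings
        self.stem = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        # One scalar attention logit per contrast branch, from pooled features.
        self.attn = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(channels, 1))

    def forward(self, x):                          # x: (B, 1, H, W), values in [0, 1]
        feats, logits = [], []
        for g in self.gammas:
            f = self.stem(x.clamp(min=1e-6) ** g)  # features from one contrast variant
            feats.append(f)
            logits.append(self.attn(f))
        w = torch.softmax(torch.cat(logits, dim=1), dim=1)    # (B, n_contrasts)
        feats = torch.stack(feats, dim=1)                     # (B, n_contrasts, C, H, W)
        return (w[..., None, None, None] * feats).sum(dim=1)  # attention-weighted fusion
```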

KonfAI: A Modular and Fully Configurable Framework for Deep Learning in Medical Imaging

Valentin Boussot, Jean-Louis Dillenseger

arXiv preprint, Aug 13, 2025
KonfAI is a modular, extensible, and fully configurable deep learning framework specifically designed for medical imaging tasks. It enables users to define complete training, inference, and evaluation workflows through structured YAML configuration files, without modifying the underlying code. This declarative approach enhances reproducibility, transparency, and experimental traceability while reducing development time. Beyond the capabilities of standard pipelines, KonfAI provides native abstractions for advanced strategies including patch-based learning, test-time augmentation, model ensembling, and direct access to intermediate feature representations for deep supervision. It also supports complex multi-model training setups such as generative adversarial architectures. Thanks to its modular and extensible architecture, KonfAI can easily accommodate custom models, loss functions, and data processing components. The framework has been successfully applied to segmentation, registration, and image synthesis tasks, and has contributed to top-ranking results in several international medical imaging challenges. KonfAI is open source and available at https://github.com/vboussot/KonfAI.

NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

Devvrat Joshi, Islem Rekik

arXiv preprint, Aug 13, 2025
The rapid growth of multimodal medical imaging data presents significant storage and transmission challenges, particularly in resource-constrained clinical settings. We propose NEURAL, a novel framework that addresses this by using semantics-guided data compression. Our approach repurposes cross-attention scores between the image and its radiological report from a fine-tuned generative vision-language model to structurally prune chest X-rays, preserving only diagnostically critical regions. This process transforms the image into a highly compressed graph representation. This unified graph-based representation fuses the pruned visual graph with a knowledge graph derived from the clinical report, creating a universal data structure that simplifies downstream modeling. Validated on the MIMIC-CXR and CheXpert Plus datasets for pneumonia detection, NEURAL achieves a 93.4-97.7% reduction in image data size while maintaining a high diagnostic performance of 0.88-0.95 AUC, outperforming other baseline models that use uncompressed data. By creating a persistent, task-agnostic data asset, NEURAL resolves the trade-off between data size and clinical utility, enabling efficient workflows and teleradiology without sacrificing performance. Our NEURAL code is available at https://github.com/basiralab/NEURAL.
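As a rough illustration of semantics-guided pruning (not the NEURAL implementation), one can keep only the patches with the highest cross-attention scores and store them as graph-ready nodes. The patch size, keep ratio, and node features below are assumptions; the attention map is assumed to come from a vision-language model.

```python
import numpy as np

def prune_by_attention(image, attn, patch=16, keep_ratio=0.05):
    """image: (H, W) array; attn: (H//patch, W//patch) cross-attention scores."""
    gh, gw = attn.shape
    flat = attn.ravel()
    k = max(1, int(keep_ratio * flat.size))
    keep = np.argsort(flat)[-k:]                     # indices of the top-k patches
    nodes = []
    for idx in keep:
        r, c = divmod(int(idx), gw)
        tile = image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
        nodes.append({"pos": (r, c), "feat": float(tile.mean()),
                      "score": float(flat[idx])})
    return nodes  # compact, graph-ready list of retained regions
```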

MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography

Daniel Barco, Marc Stadelmann, Martin Oswald, Ivo Herzig, Lukas Lichtensteiger, Pascal Paysan, Igor Peterlik, Michal Walczak, Bjoern Menze, Frank-Peter Schilling

arXiv preprint, Aug 13, 2025
We present MInDI-3D (Medical Inversion by Direct Iteration in 3D), the first 3D conditional diffusion-based model for real-world sparse-view Cone Beam Computed Tomography (CBCT) artefact removal, aiming to reduce imaging radiation exposure. A key contribution is extending the "InDI" concept from 2D to a full 3D volumetric approach for medical images, implementing an iterative denoising process that refines the CBCT volume directly from sparse-view input. A further contribution is the generation of a large pseudo-CBCT dataset (16,182) from chest CT volumes of the CT-RATE public dataset to robustly train MInDI-3D. We performed a comprehensive evaluation, including quantitative metrics, scalability analysis, generalisation tests, and a clinical assessment by 11 clinicians. Our results show MInDI-3D's effectiveness, achieving a 12.96 (6.10) dB PSNR gain over uncorrected scans with only 50 projections on the CT-RATE pseudo-CBCT (independent real-world) test set and enabling an 8x reduction in imaging radiation exposure. We demonstrate its scalability by showing that performance improves with more training data. Importantly, MInDI-3D matches the performance of a 3D U-Net on real-world scans from 16 cancer patients across distortion and task-based metrics. It also generalises to new CBCT scanner geometries. Clinicians rated our model as sufficient for patient positioning across all anatomical sites and found it preserved lung tumour boundaries well.
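A minimal sketch of a direct-iteration refinement loop in the spirit of InDI, applied to a 3D volume, is shown below. The model call signature and the step rule are assumptions; this is not the MInDI-3D code.

```python
import torch

@torch.no_grad()
def iterative_refine(model, noisy_volume, n_steps=10):
    """noisy_volume: (B, 1, D, H, W) sparse-view CBCT reconstruction."""
    x = noisy_volume.clone()
    for t in reversed(range(1, n_steps + 1)):
        t_frac = torch.full((x.shape[0],), t / n_steps, device=x.device)
        x_clean = model(x, t_frac)   # network estimate of the artefact-free volume
        x = x + (x_clean - x) / t    # move part of the way; the final step lands on the estimate
    return x
```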

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation

Haibo Jin, Haoxuan Che, Sunan He, Hao Chen

arXiv preprint, Aug 13, 2025
Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) performance in clinical efficacy is unsatisfactory, especially for the description of lesion attributes; 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address these challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities, but also provides the basis for its predictions. To this end, we propose a framework named chain of diagnosis (CoD), which maintains a chain of diagnostic steps for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses and generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) an evaluation tool for assessing the accuracy of reports in describing lesion location and severity; 3) extensive experiments to demonstrate the effectiveness of CoD, where it outperforms both specialist and generalist models consistently on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.
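Purely as an illustration of the "prompt a large language model with QA diagnoses" step, a report prompt might be assembled as below. The question set, wording, and format are hypothetical and are not the CoD implementation.

```python
def build_report_prompt(qa_pairs):
    """qa_pairs: list of (question, answer) strings extracted from a diagnostic dialogue."""
    findings = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_pairs)
    return (
        "You are a radiology assistant. Using the diagnostic findings below, "
        "write a concise chest X-ray report. Describe lesion location and severity "
        "only when supported by the findings.\n\n"
        f"{findings}\n\nReport:"
    )

prompt = build_report_prompt([
    ("Is there any consolidation?", "Yes, in the right lower lobe."),
    ("Is the cardiac silhouette enlarged?", "No."),
])
```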

GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs

Moinak Bhattacharya, Gagandeep Singh, Shubham Jain, Prateek Prasanna

arXiv preprint, Aug 13, 2025
In this work, we present GazeLT, a human visual attention integration-disintegration approach for long-tailed disease classification. A radiologist's eye gaze has distinct patterns that capture both fine-grained and coarser-level disease-related information. While interpreting an image, a radiologist's attention varies over the course of the reading; it is critical to incorporate this into a deep learning framework to improve automated image interpretation. Another important aspect of visual attention is that, apart from looking at major/obvious disease patterns, experts also look at minor/incidental findings (a few of which constitute long-tailed classes) during the course of image interpretation. GazeLT harnesses the temporal aspect of the visual search process, via an integration and disintegration mechanism, to improve long-tailed disease classification. We show the efficacy of GazeLT on two publicly available datasets for long-tailed disease classification, namely the NIH-CXR-LT (n=89237) and the MIMIC-CXR-LT (n=111898) datasets. GazeLT outperforms the best long-tailed loss by 4.1% and the visual attention-based baseline by 21.7% in average accuracy metrics for these datasets. Our code is available at https://github.com/lordmoinak1/gazelt.
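The integration-disintegration mechanism itself is not specified in the abstract; as a generic illustration of exploiting the temporal aspect of gaze, one could bin fixations by time and build one attention map per temporal slice, as sketched below (the bin count and kernel width are assumptions, not the GazeLT mechanism).

```python
import numpy as np

def temporal_gaze_maps(fixations, shape, n_bins=4, sigma=12.0):
    """fixations: list of (x, y, t) with t in [0, 1]; returns (n_bins, H, W) heatmaps."""
    H, W = shape
    maps = np.zeros((n_bins, H, W), dtype=np.float32)
    yy, xx = np.mgrid[0:H, 0:W]
    for x, y, t in fixations:
        b = min(int(t * n_bins), n_bins - 1)   # temporal slice this fixation falls into
        maps[b] += np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    # Normalise each slice so it can serve as an attention prior.
    maps /= maps.reshape(n_bins, -1).max(axis=1).reshape(-1, 1, 1) + 1e-8
    return maps
```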

SKOOTS: Skeleton oriented object segmentation for mitochondria

Buswinka, C. J., Osgood, R. T., Nitta, H., Indzhykulian, A. A.

bioRxiv preprint, Aug 13, 2025
Segmenting individual instances of mitochondria from imaging datasets can provide rich quantitative information, but is prohibitively time-consuming when done manually, prompting interest in the development of automated algorithms using deep neural networks. Existing solutions for various segmentation tasks are optimized for either: high-resolution three-dimensional imaging, relying on well-defined object boundaries (e.g., whole neuron segmentation in volumetric electron microscopy datasets); or low-resolution two-dimensional imaging, boundary-invariant but poorly suited to large 3D objects (e.g., whole-cell segmentation of light microscopy images). Mitochondria in whole-cell 3D electron microscopy datasets often lie in the middle ground - large, yet with ambiguous borders, challenging current segmentation tools. To address this, we developed skeleton-oriented object segmentation (SKOOTS) - a novel approach that efficiently segments large, densely packed mitochondria. SKOOTS accurately and efficiently segments mitochondria in previously difficult contexts and can also be applied to segment other objects in 3D light microscopy datasets. This approach bridges a critical gap between existing segmentation approaches, improving the utility of automated analysis of three-dimensional biomedical imaging data. We demonstrate the utility of SKOOTS by applying it to segment over 15,000 cochlear hair cell mitochondria across experimental conditions in under 2 hours on a consumer-grade PC, enabling downstream morphological analysis that revealed subtle structural changes following aminoglycoside exposure - differences not detectable using analysis approaches currently used in the field.
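As a generic 2D illustration of skeleton-oriented instance assignment (not the SKOOTS algorithm), one can skeletonize a binary mask, label the connected skeletons, and assign each foreground pixel to its nearest skeleton; the same idea extends to 3D. Function and variable names here are illustrative.

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import skeletonize

def skeleton_instances(binary_mask):
    """binary_mask: 2D boolean array; returns an integer instance label map."""
    skel = skeletonize(binary_mask)              # medial skeletons of all objects
    skel_labels, _ = ndimage.label(skel)         # one label per connected skeleton
    # Assign every pixel the label of its nearest skeleton pixel.
    _, (ind_y, ind_x) = ndimage.distance_transform_edt(skel_labels == 0,
                                                       return_indices=True)
    instances = skel_labels[ind_y, ind_x]
    return instances * binary_mask               # keep labels only inside the mask
```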

BSA-Net: Boundary-prioritized spatial adaptive network for efficient left atrial segmentation.

Xu F, Tu W, Feng F, Yang J, Gunawardhana M, Gu Y, Huang J, Zhao J

PubMed paper, Aug 13, 2025
Atrial fibrillation, a common cardiac arrhythmia with rapid and irregular atrial electrical activity, requires accurate left atrial segmentation for effective treatment planning. Recently, deep learning methods have gained encouraging success in left atrial segmentation. However, current methodologies critically depend on the assumption of a consistently complete, centered left atrium as input, which neglects the structural incompleteness and boundary discontinuities arising from random-crop operations during inference. In this paper, we propose BSA-Net, which exploits an adaptive adjustment strategy in both feature position and loss optimization to establish long-range feature relationships and strengthen robust intermediate feature representations in boundary regions. Specifically, we propose a Spatial-adaptive Convolution (SConv) that employs a shuffle operation combined with lightweight convolution to directly establish cross-positional relationships within regions of potential relevance. Moreover, we develop the dual Boundary Prioritized loss, which enhances boundary precision by differentially weighting foreground and background boundaries, thus optimizing complex boundary regions. With the above technologies, the proposed method enjoys a better speed-accuracy trade-off compared to current methods. BSA-Net attains Dice scores of 92.55%, 91.42%, and 84.67% on the LA, Utah, and Waikato datasets, respectively, with a mere 2.16 M parameters, approximately 80% fewer than other contemporary state-of-the-art models. Extensive experimental results on three benchmark datasets have demonstrated that BSA-Net consistently and significantly outperforms existing state-of-the-art methods.
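One plausible reading of the shuffle-plus-lightweight-convolution idea is sketched below for 3D inputs; the group count, channel width, and operator order are assumptions and this is not the BSA-Net code.

```python
import torch
import torch.nn as nn

class SpatialAdaptiveConv(nn.Module):
    def __init__(self, channels=32, groups=4):
        super().__init__()
        self.groups = groups
        # A depthwise + grouped pointwise pair keeps the parameter count small.
        self.depthwise = nn.Conv3d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv3d(channels, channels, 1, groups=groups)

    def channel_shuffle(self, x):
        # Interleave channels across groups so the grouped convolution mixes
        # information from different channel positions.
        b, c, d, h, w = x.shape
        x = x.view(b, self.groups, c // self.groups, d, h, w)
        return x.transpose(1, 2).reshape(b, c, d, h, w)

    def forward(self, x):                        # x: (B, C, D, H, W)
        return self.pointwise(self.channel_shuffle(self.depthwise(x)))
```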

Automatic detection of arterial input function for brain DCE-MRI in multi-site cohorts.

Saca L, Gaggar R, Pappas I, Benzinger T, Reiman EM, Shiroishi MS, Joe EB, Ringman JM, Yassine HN, Schneider LS, Chui HC, Nation DA, Zlokovic BV, Toga AW, Chakhoyan A, Barnes S

PubMed paper, Aug 13, 2025
Arterial input function (AIF) extraction is a crucial step in quantitative pharmacokinetic modeling of DCE-MRI. This work proposes a robust deep learning model that can precisely extract an AIF from DCE-MRI images. A diverse dataset of human brain DCE-MRI images from 289 participants, totaling 384 scans, from five different institutions with extracted gadolinium-based contrast agent curves from large penetrating arteries, and with most data collected for blood-brain barrier (BBB) permeability measurement, was retrospectively analyzed. A 3D UNet model was implemented and trained on manually drawn AIF regions. The testing cohort was compared using the proposed AIF quality metric AIFitness and Ktrans values from a standard DCE pipeline. This UNet was then applied to a separate dataset of 326 participants with a total of 421 DCE-MRI images with analyzed AIF quality and Ktrans values. The resulting 3D UNet model achieved an average AIFitness score of 93.9 compared to 99.7 for manually selected AIFs, and white matter Ktrans values were 0.45 × 10^-3 /min and 0.45 × 10^-3 /min, respectively. The intraclass correlation between automated and manual Ktrans values was 0.89. The separate replication dataset yielded an AIFitness score of 97.0 and white matter Ktrans of 0.44 × 10^-3 /min. Findings suggest a 3D UNet model with additional convolutional neural network kernels and a modified Huber loss function achieves superior performance for identifying AIF curves from DCE-MRI in a diverse multi-center cohort. AIFitness scores and DCE-MRI-derived metrics, such as Ktrans maps, showed no significant differences in gray and white matter between manually drawn and automated AIFs.
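The modification to the Huber loss is not specified in the abstract; for reference, the standard Huber form it presumably builds on can be written as below (the delta value is an assumption).

```python
import torch

def huber_loss(pred, target, delta=1.0):
    err = (pred - target).abs()
    quadratic = torch.clamp(err, max=delta)   # |e| <= delta: quadratic region
    linear = err - quadratic                  # |e| >  delta: linear region
    return (0.5 * quadratic ** 2 + delta * linear).mean()
```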