Development of a multimodal vision transformer model for predicting traumatic versus degenerative rotator cuff tears on magnetic resonance imaging: A single-centre retrospective study.

Oettl FC, Malayeri AB, Furrer PR, Wieser K, Fürnstahl P, Bouaicha S

PubMed · Aug 13, 2025
The differentiation between traumatic and degenerative rotator cuff tears (RCTs) remains a diagnostic challenge with significant implications for treatment planning. While magnetic resonance imaging (MRI) is standard practice, traditional radiological interpretation has shown limited reliability in distinguishing these etiologies. This study evaluates the potential of artificial intelligence (AI) models, specifically a multimodal vision transformer (ViT), to differentiate between traumatic and degenerative RCTs. In this retrospective, single-centre study, 99 shoulder MRIs were analysed from patients who underwent surgery at a specialised university shoulder unit between 2016 and 2019. The cohort was divided into training (n = 79) and validation (n = 20) sets. The traumatic group required a documented relevant trauma (excluding simple lifting injuries), a previously asymptomatic shoulder and MRI within 3 months post-trauma. The degenerative group was matched for age and injured tendon, with patients presenting with at least 1 year of constant shoulder pain prior to imaging and no history of trauma. The ViT was subsequently combined with demographic data to form a multimodal ViT, and saliency maps were used as an explainability tool. The multimodal ViT model achieved an accuracy of 0.75 ± 0.08, a recall of 0.8 ± 0.08, a specificity of 0.71 ± 0.11 and an F1 score of 0.76 ± 0.1. The model maintained consistent performance across different patient subsets, demonstrating robust generalisation. The saliency maps, however, did not show a consistent focus on the rotator cuff. AI shows potential in supporting the challenging differentiation between traumatic and degenerative RCTs on MRI. The achieved accuracy of 75% is particularly notable given the closely matched groups, which presented a challenging diagnostic scenario. The saliency maps' lack of consistent focus on the rotator cuff tendons hints at underappreciated image features in this differentiation.
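
How the demographic data is fused with the ViT's image embedding is not detailed in the abstract; the late-fusion sketch below (PyTorch) is one plausible reading, with the layer sizes, demographic feature count, and concatenation strategy all being assumptions.

```python
import torch
import torch.nn as nn

class MultimodalViT(nn.Module):
    """Hypothetical late-fusion head: image embedding + demographics."""
    def __init__(self, vit_backbone, img_dim=768, demo_dim=4, n_classes=2):
        super().__init__()
        self.vit = vit_backbone                         # any ViT returning a [B, img_dim] embedding
        self.demo_mlp = nn.Sequential(nn.Linear(demo_dim, 32), nn.ReLU())
        self.head = nn.Linear(img_dim + 32, n_classes)

    def forward(self, image, demographics):
        img_feat = self.vit(image)                      # [B, img_dim]
        demo_feat = self.demo_mlp(demographics)         # [B, 32]
        fused = torch.cat([img_feat, demo_feat], dim=1) # simple concatenation fusion
        return self.head(fused)                         # traumatic vs. degenerative logits
```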

AST-n: A Fast Sampling Approach for Low-Dose CT Reconstruction using Diffusion Models

Tomás de la Sotta, José M. Saavedra, Héctor Henríquez, Violeta Chang, Aline Xavier

arXiv preprint · Aug 13, 2025
Low-dose CT (LDCT) protocols reduce radiation exposure but increase image noise, compromising diagnostic confidence. Diffusion-based generative models have shown promise for LDCT denoising by learning image priors and performing iterative refinement. In this work, we introduce AST-n, an accelerated inference framework that initiates reverse diffusion from intermediate noise levels, and integrate high-order ODE solvers within conditioned models to further reduce sampling steps. We evaluate two acceleration paradigms, AST-n sampling and standard scheduling with high-order solvers, on the Low Dose CT Grand Challenge dataset, covering head, abdominal, and chest scans at 10-25% of the standard dose. Conditioned models using only 25 steps (AST-25) achieve a peak signal-to-noise ratio (PSNR) above 38 dB and a structural similarity index (SSIM) above 0.95, closely matching standard baselines while cutting inference time from roughly 16 s to under 1 s per slice. Unconditional sampling suffers substantial quality loss, underscoring the necessity of conditioning. We also assess DDIM inversion, which yields marginal PSNR gains at the cost of doubling inference time, limiting its clinical practicality. Our results demonstrate that AST-n with high-order samplers enables rapid LDCT reconstruction without significant loss of image fidelity, advancing the feasibility of diffusion-based methods in clinical workflows.
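
A minimal sketch of the AST-n idea: seed the reverse process at an intermediate noise level by noising the LDCT input, rather than starting from pure noise at step T. The DDIM-style update, the `cond=` keyword on the denoiser, and the schedule below are illustrative assumptions, not the authors' implementation.

```python
import torch

@torch.no_grad()
def ast_n_sample(model, ldct, alphas_cumprod, n_steps=25, t_start=250):
    # Seed the chain at an intermediate level: noise the LDCT input to t_start.
    a = alphas_cumprod[t_start]
    x = a.sqrt() * ldct + (1 - a).sqrt() * torch.randn_like(ldct)
    timesteps = torch.linspace(t_start, 0, n_steps + 1).long()
    for i in range(n_steps):
        t, t_prev = timesteps[i], timesteps[i + 1]
        a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        eps = model(x, t, cond=ldct)                        # hypothetical conditioned denoiser
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()      # predicted clean slice
        x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps  # deterministic DDIM step
    return x
```

With only 25 such steps (AST-25), most of the trajectory from pure noise is skipped, which is what drives the reported ~16x speed-up per slice.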

Automated Segmentation of Coronal Brain Tissue Slabs for 3D Neuropathology

Jonathan Williams Ramirez, Dina Zemlyanker, Lucas Deden-Binder, Rogeny Herisse, Erendira Garcia Pallares, Karthik Gopinath, Harshvardhan Gazula, Christopher Mount, Liana N. Kozanno, Michael S. Marshall, Theresa R. Connors, Matthew P. Frosch, Mark Montine, Derek H. Oakley, Christine L. Mac Donald, C. Dirk Keene, Bradley T. Hyman, Juan Eugenio Iglesias

arXiv preprint · Aug 13, 2025
Advances in image registration and machine learning have recently enabled volumetric analysis of postmortem brain tissue from conventional photographs of coronal slabs, which are routinely collected in brain banks and neuropathology laboratories worldwide. One caveat of this methodology is the requirement of segmentation of the tissue from photographs, which currently requires costly manual intervention. In this article, we present a deep learning model to automate this process. The automatic segmentation tool relies on a U-Net architecture that was trained with a combination of (i) 1,414 manually segmented images of both fixed and fresh tissue, from specimens with varying diagnoses, photographed at two different sites; and (ii) 2,000 synthetic images with randomized contrast and corresponding masks generated from MRI scans for improved generalizability to unseen photographic setups. Automated model predictions on a subset of photographs not seen in training were analyzed to estimate performance compared to manual labels, including both inter- and intra-rater variability. Our model achieved a median Dice score over 0.98, a mean surface distance under 0.4 mm, and a 95% Hausdorff distance under 1.60 mm, which approaches inter-/intra-rater levels. Our tool is publicly available at surfer.nmr.mgh.harvard.edu/fswiki/PhotoTools.
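
For readers reproducing the reported numbers, a hedged sketch of how Dice and a symmetric mean surface distance can be computed for a 2D photograph mask; the authors' exact evaluation code may differ.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(pred, ref):
    """Dice overlap between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    return 2 * np.logical_and(pred, ref).sum() / (pred.sum() + ref.sum())

def mean_surface_distance(pred, ref, spacing=(1.0, 1.0)):
    """Symmetric mean surface distance in physical units (spacing in mm/pixel)."""
    surface = lambda m: m & ~binary_erosion(m)                # boundary pixels of a mask
    sp, sr = surface(pred.astype(bool)), surface(ref.astype(bool))
    d_pr = distance_transform_edt(~sr, sampling=spacing)[sp]  # pred surface -> ref surface
    d_rp = distance_transform_edt(~sp, sampling=spacing)[sr]  # ref surface -> pred surface
    return (d_pr.sum() + d_rp.sum()) / (len(d_pr) + len(d_rp))
```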

Quest for a clinically relevant medical image segmentation metric: the definition and implementation of Medical Similarity Index

Szuzina Fazekas, Bettina Katalin Budai, Viktor Bérczi, Pál Maurovich-Horvat, Zsolt Vizi

arXiv preprint · Aug 13, 2025
Background: In the field of radiology and radiotherapy, accurate delineation of tissues and organs plays a crucial role in both diagnostics and therapeutics. While the gold standard remains expert-driven manual segmentation, many automatic segmentation methods are emerging. The evaluation of these methods primarily relies on traditional metrics that only incorporate geometrical properties and fail to adapt to various applications. Aims: This study aims to develop and implement a clinically relevant segmentation metric that can be adapted for use in various medical imaging applications. Methods: A bidirectional local distance was defined, and the points of the test contour were paired with points of the reference contour. After correcting for the distance between the test and reference centers of mass, the Euclidean distance was calculated between the paired points, and a score was assigned to each test point. The overall Medical Similarity Index was calculated as the average score across all test points. For demonstration, we used myoma and prostate datasets; nnUNet neural networks were trained for segmentation. Results: An easy-to-use, sustainable image processing pipeline was created using Python. The code is available in a public GitHub repository along with Google Colaboratory notebooks. The algorithm can handle multislice images with multiple masks per slice. A mask-splitting algorithm that can separate concave masks is also provided. We demonstrate the metric's adaptability with an evaluation of prostate segmentation. Conclusions: A novel segmentation evaluation metric was implemented, and an open-access image processing pipeline was also provided, which can be easily used for automatic measurement of the clinical relevance of medical image segmentations.
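
A compact sketch of the Medical Similarity Index pipeline as described above: center-of-mass correction, nearest-neighbour pairing of test points to reference points, per-point scoring, and averaging. The exponential scoring function with tolerance `tau` is an assumption; the actual definition lives in the authors' repository.

```python
import numpy as np
from scipy.spatial import cKDTree

def medical_similarity_index(test_pts, ref_pts, tau=2.0):
    """test_pts, ref_pts: [N, 2] contour point coordinates (e.g., in mm)."""
    test = test_pts - test_pts.mean(axis=0)   # correct for the center-of-mass offset
    ref = ref_pts - ref_pts.mean(axis=0)
    dists, _ = cKDTree(ref).query(test)       # pair each test point with a reference point
    scores = np.exp(-dists / tau)             # assumed per-point scoring with tolerance tau
    return scores.mean()                      # average score across all test points
```

Adapting `tau` (or the scoring function itself) to the clinical task is what lets the same metric serve different applications.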

Multi-Contrast Fusion Module: An attention mechanism integrating multi-contrast features for fetal torso plane classification

Shengjun Zhu, Siyu Liu, Runqing Xiong, Liping Zheng, Duo Ma, Rongshang Chen, Jiaxin Cai

arXiv preprint · Aug 13, 2025
Purpose: Prenatal ultrasound is a key tool in evaluating fetal structural development and detecting abnormalities, contributing to reduced perinatal complications and improved neonatal survival. Accurate identification of standard fetal torso planes is essential for reliable assessment and personalized prenatal care. However, limitations such as low contrast and unclear texture details in ultrasound imaging pose significant challenges for fine-grained anatomical recognition. Methods: We propose a novel Multi-Contrast Fusion Module (MCFM) to enhance the model's ability to extract detailed information from ultrasound images. MCFM operates exclusively on the lower layers of the neural network, directly processing raw ultrasound data. By assigning attention weights to image representations under different contrast conditions, the module enhances feature modeling while keeping parameter overhead minimal. Results: The proposed MCFM was evaluated on a curated dataset of fetal torso plane ultrasound images. Experimental results demonstrate that MCFM substantially improves recognition performance with a minimal increase in model complexity. The integration of multi-contrast attention enables the model to better capture subtle anatomical structures, contributing to higher classification accuracy and clinical reliability. Conclusions: Our method provides an effective solution for improving fetal torso plane recognition in ultrasound imaging. By enhancing feature representation through multi-contrast fusion, the proposed approach supports clinicians in achieving more accurate and consistent diagnoses, demonstrating strong potential for clinical adoption in prenatal screening. The code is available at https://github.com/sysll/MCFM.
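
One plausible reading of such a module in PyTorch: embed several contrast renditions of the raw image with a shared low-level convolution and combine them with learned attention weights. The contrast transforms, layer shapes, and weighting scheme are illustrative assumptions, not the authors' exact MCFM.

```python
import torch
import torch.nn as nn

class MultiContrastFusion(nn.Module):
    """Hypothetical fusion of N contrast renditions with learned attention weights."""
    def __init__(self, n_contrasts=3, channels=16):
        super().__init__()
        self.embed = nn.Conv2d(1, channels, 3, padding=1)   # shared low-level conv
        self.attn = nn.Parameter(torch.zeros(n_contrasts))  # one weight per contrast

    def forward(self, contrasts):  # list of [B, 1, H, W] contrast-adjusted versions
        feats = torch.stack([self.embed(c) for c in contrasts])  # [N, B, C, H, W]
        w = torch.softmax(self.attn, dim=0).view(-1, 1, 1, 1, 1)
        return (w * feats).sum(dim=0)   # attention-weighted fused feature map [B, C, H, W]
```

Because the module only adds a shared 3x3 convolution and N scalar weights, the parameter overhead stays small, consistent with the claim above.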

NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

Devvrat Joshi, Islem Rekik

arXiv preprint · Aug 13, 2025
The rapid growth of multimodal medical imaging data presents significant storage and transmission challenges, particularly in resource-constrained clinical settings. We propose NEURAL, a novel framework that addresses this by using semantics-guided data compression. Our approach repurposes cross-attention scores between the image and its radiological report from a fine-tuned generative vision-language model to structurally prune chest X-rays, preserving only diagnostically critical regions. This process transforms the image into a highly compressed graph representation. This unified graph-based representation fuses the pruned visual graph with a knowledge graph derived from the clinical report, creating a universal data structure that simplifies downstream modeling. Validated on the MIMIC-CXR and CheXpert Plus datasets for pneumonia detection, NEURAL achieves a 93.4-97.7% reduction in image data size while maintaining high diagnostic performance of 0.88-0.95 AUC, outperforming baseline models that use uncompressed data. By creating a persistent, task-agnostic data asset, NEURAL resolves the trade-off between data size and clinical utility, enabling efficient workflows and teleradiology without sacrificing performance. Our NEURAL code is available at https://github.com/basiralab/NEURAL.
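
A hedged sketch of the attention-guided pruning step: average the report-to-patch cross-attention into a per-patch importance score and keep only the top fraction of patches as nodes of the visual graph. The top-k fraction and the mean pooling over report tokens are assumed simplifications.

```python
import torch

def prune_patches(patch_feats, cross_attn, keep_frac=0.05):
    """patch_feats: [P, D] patch embeddings; cross_attn: [T, P] token-to-patch attention."""
    scores = cross_attn.mean(dim=0)                   # importance score per image patch
    k = max(1, int(keep_frac * patch_feats.shape[0]))
    keep = scores.topk(k).indices                     # diagnostically critical patches
    return patch_feats[keep], keep                    # nodes (and indices) of the pruned visual graph
```

Keeping only a few percent of the patches is what yields the >93% reduction in image data size, while the retained node indices preserve spatial identity for graph construction.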

MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography

Daniel Barco, Marc Stadelmann, Martin Oswald, Ivo Herzig, Lukas Lichtensteiger, Pascal Paysan, Igor Peterlik, Michal Walczak, Bjoern Menze, Frank-Peter Schilling

arXiv preprint · Aug 13, 2025
We present MInDI-3D (Medical Inversion by Direct Iteration in 3D), the first 3D conditional diffusion-based model for real-world sparse-view Cone Beam Computed Tomography (CBCT) artefact removal, aiming to reduce imaging radiation exposure. A key contribution is extending the "InDI" concept from 2D to a full 3D volumetric approach for medical images, implementing an iterative denoising process that refines the CBCT volume directly from sparse-view input. A further contribution is the generation of a large pseudo-CBCT dataset (16,182 volumes) from chest CT volumes of the public CT-RATE dataset to robustly train MInDI-3D. We performed a comprehensive evaluation, including quantitative metrics, scalability analysis, generalisation tests, and a clinical assessment by 11 clinicians. Our results show MInDI-3D's effectiveness, achieving a 12.96 (6.10) dB PSNR gain over uncorrected scans with only 50 projections on the CT-RATE pseudo-CBCT (independent real-world) test set and enabling an 8x reduction in imaging radiation exposure. We demonstrate its scalability by showing that performance improves with more training data. Importantly, MInDI-3D matches the performance of a 3D U-Net on real-world scans from 16 cancer patients across distortion and task-based metrics. It also generalises to new CBCT scanner geometries. Clinicians rated our model as sufficient for patient positioning across all anatomical sites and found that it preserved lung tumour boundaries well.
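
The "InDI" reference suggests a direct-iteration update that repeatedly blends the current volume with the network's clean prediction. The sketch below follows the published InDI rule; the step count, uniform schedule, and `model(x, t)` signature are assumptions.

```python
import torch

@torch.no_grad()
def mindi3d_restore(model, cbct_volume, n_steps=10):
    """Refine an artefact-laden sparse-view CBCT volume by direct iteration."""
    x = cbct_volume            # start from the degraded reconstruction, not pure noise
    dt = 1.0 / n_steps
    for i in range(n_steps, 0, -1):
        t = i / n_steps                             # t runs from 1 (degraded) to dt
        x0_hat = model(x, t)                        # predicted artefact-free volume
        x = (dt / t) * x0_hat + (1 - dt / t) * x    # InDI update: blend toward prediction
    return x                                        # final step (t = dt) returns x0_hat exactly
```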

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation

Haibo Jin, Haoxuan Che, Sunan He, Hao Chen

arXiv preprint · Aug 13, 2025
Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) performance in clinical efficacy is unsatisfactory, especially for descriptions of lesion attributes; and 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address these challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities but also provides the basis for its predictions. To this end, we propose a framework named Chain of Diagnosis (CoD), which maintains a chain of diagnostic reasoning for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with the QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses and generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) an evaluation tool for assessing the accuracy of reports in describing lesion location and severity; and 3) extensive experiments demonstrating the effectiveness of CoD, which consistently outperforms both specialist and generalist models on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.
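
A minimal sketch of one way the diagnosis grounding module could work: link each generated report sentence to its most similar QA diagnosis by embedding cosine similarity, leaving low-similarity sentences ungrounded. The similarity measure and threshold are assumptions, not the paper's module.

```python
import torch
import torch.nn.functional as F

def ground_sentences(sent_embs, qa_embs, threshold=0.5):
    """sent_embs: [S, D] generated-sentence embeddings; qa_embs: [Q, D] QA diagnoses."""
    sim = F.cosine_similarity(sent_embs.unsqueeze(1), qa_embs.unsqueeze(0), dim=-1)  # [S, Q]
    best_sim, best_qa = sim.max(dim=1)          # most similar QA diagnosis per sentence
    # Sentences below the threshold stay ungrounded (flagged as -1) for radiologist review.
    return torch.where(best_sim >= threshold, best_qa, torch.full_like(best_qa, -1))
```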

GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs

Moinak Bhattacharya, Gagandeep Singh, Shubham Jain, Prateek Prasanna

arXiv preprint · Aug 13, 2025
In this work, we present GazeLT, a human visual attention integration-disintegration approach for long-tailed disease classification. A radiologist's eye gaze has distinct patterns that capture both fine-grained and coarser-level disease-related information. While interpreting an image, a radiologist's attention varies over the duration of the reading; it is critical to incorporate this into a deep learning framework to improve automated image interpretation. Another important aspect of visual attention is that, apart from looking at major/obvious disease patterns, experts also look at minor/incidental findings (a few of which constitute long-tailed classes) during the course of image interpretation. GazeLT harnesses the temporal aspect of the visual search process, via an integration and disintegration mechanism, to improve long-tailed disease classification. We show the efficacy of GazeLT on two publicly available datasets for long-tailed disease classification, namely the NIH-CXR-LT (n=89,237) and MIMIC-CXR-LT (n=111,898) datasets. GazeLT outperforms the best long-tailed loss by 4.1% and the visual attention-based baseline by 21.7% in average accuracy on these datasets. Our code is available at https://github.com/lordmoinak1/gazelt.
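
An illustrative reading of how temporal gaze data might enter such a framework: bin fixations by time to obtain disintegrated per-interval attention maps and sum them for an integrated whole-duration map. The binning, grid size, and coordinate convention are assumptions based only on the abstract.

```python
import torch

def gaze_maps(fixations, hw=(14, 14), n_bins=4):
    """fixations: [N, 3] float tensor of (t, y, x), all normalized to [0, 1)."""
    maps = torch.zeros(n_bins, *hw)
    for t, y, x in fixations:
        b = min(int(t * n_bins), n_bins - 1)        # temporal bin (disintegrated view)
        maps[b, int(y * hw[0]), int(x * hw[1])] += 1
    integrated = maps.sum(dim=0, keepdim=True)      # whole-duration map (integrated view)
    return maps, integrated                         # both can modulate image features
```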

SKOOTS: Skeleton oriented object segmentation for mitochondria

Buswinka, C. J., Osgood, R. T., Nitta, H., Indzhykulian, A. A.

bioRxiv preprint · Aug 13, 2025
Segmenting individual instances of mitochondria from imaging datasets can provide rich quantitative information, but is prohibitively time-consuming when done manually, prompting interest in automated algorithms based on deep neural networks. Existing solutions for various segmentation tasks are optimized either for high-resolution three-dimensional imaging, relying on well-defined object boundaries (e.g., whole-neuron segmentation in volumetric electron microscopy datasets), or for low-resolution two-dimensional imaging, boundary-invariant but poorly suited to large 3D objects (e.g., whole-cell segmentation of light microscopy images). Mitochondria in whole-cell 3D electron microscopy datasets often lie in the middle ground: large, yet with ambiguous borders, challenging current segmentation tools. To address this, we developed skeleton-oriented object segmentation (SKOOTS), a novel approach that efficiently segments large, densely packed mitochondria. SKOOTS accurately and efficiently segments mitochondria in previously difficult contexts and can also be applied to segment other objects in 3D light microscopy datasets. This approach bridges a critical gap between existing segmentation approaches, improving the utility of automated analysis of three-dimensional biomedical imaging data. We demonstrate the utility of SKOOTS by applying it to segment over 15,000 cochlear hair cell mitochondria across experimental conditions in under 2 hours on a consumer-grade PC, enabling downstream morphological analysis that revealed subtle structural changes following aminoglycoside exposure, differences not detectable with analysis approaches currently used in the field.
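
A conceptual sketch of skeleton-oriented instance assignment: foreground voxels carry predicted offset vectors toward an object skeleton, and each voxel is labeled with the instance of the nearest skeleton point. This is a simplified reading of the SKOOTS idea, not the published implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def assign_instances(mask, offsets, skeleton_pts, skeleton_ids):
    """mask: [Z, Y, X] bool; offsets: [3, Z, Y, X] voxel offsets;
    skeleton_pts: [K, 3] skeleton coordinates; skeleton_ids: [K] instance IDs."""
    vox = np.argwhere(mask)                              # foreground voxel coordinates
    projected = vox + offsets[:, mask].T                 # each voxel projected toward its skeleton
    _, nearest = cKDTree(skeleton_pts).query(projected)  # closest skeleton point per voxel
    labels = np.zeros(mask.shape, dtype=np.int32)
    labels[tuple(vox.T)] = skeleton_ids[nearest]         # instance ID written back per voxel
    return labels
```

Assigning voxels via skeletons rather than boundaries is what lets the method tolerate the ambiguous mitochondrial borders described above.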