Page 193 of 3623611 results

LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models

Zhihao Chen, Tao Chen, Chenhui Wang, Qi Gao, Huidong Xie, Chuang Niu, Ge Wang, Hongming Shan

arXiv preprint · Jul 8, 2025
Low-dose computed tomography (LDCT) reduces radiation exposure but often degrades image quality, potentially compromising diagnostic accuracy. Existing deep learning-based denoising methods focus primarily on pixel-level mappings, overlooking the potential benefits of high-level semantic guidance. Recent advances in vision-language models (VLMs) suggest that language can serve as a powerful tool for capturing structured semantic information, offering new opportunities to improve LDCT reconstruction. In this paper, we introduce LangMamba, a Language-driven Mamba framework for LDCT denoising that leverages VLM-derived representations to enhance supervision from normal-dose CT (NDCT). LangMamba follows a two-stage learning strategy. First, we pre-train a Language-guided AutoEncoder (LangAE) that leverages frozen VLMs to map NDCT images into a semantic space enriched with anatomical information. Second, we synergize LangAE with two key components to guide LDCT denoising: a Semantic-Enhanced Efficient Denoiser (SEED), which enhances NDCT-relevant local semantics while capturing global features with an efficient Mamba mechanism, and a Language-engaged Dual-space Alignment (LangDA) loss, which ensures that denoised images align with NDCT in both perceptual and semantic spaces. Extensive experiments on two public datasets demonstrate that LangMamba outperforms conventional state-of-the-art methods, significantly improving detail preservation and visual fidelity. Remarkably, LangAE exhibits strong generalizability to unseen datasets, thereby reducing training costs. Furthermore, the LangDA loss improves explainability by integrating language-guided insights into image reconstruction and can be used in a plug-and-play fashion. Our findings shed new light on the potential of language as a supervisory signal to advance LDCT denoising. The code is publicly available at https://github.com/hao1635/LangMamba.
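The abstract describes the LangDA loss only at the level of aligning denoised and NDCT images in two embedding spaces. A minimal numeric sketch, assuming mean-squared distances in hypothetical perceptual and semantic feature spaces with an assumed weighting term `w_sem` (the paper's actual loss and encoders are not specified here):

```python
import numpy as np

def langda_loss(denoised_feat, ndct_feat, denoised_sem, ndct_sem, w_sem=1.0):
    """Hypothetical dual-space alignment: mean-squared distance in a
    perceptual feature space plus a semantic-embedding space."""
    perceptual = np.mean((denoised_feat - ndct_feat) ** 2)
    semantic = np.mean((denoised_sem - ndct_sem) ** 2)
    return perceptual + w_sem * semantic

# identical embeddings give zero loss; a unit offset in the
# perceptual space contributes exactly its mean-squared distance
f = np.ones((4, 8))
s = np.zeros((4, 16))
zero = langda_loss(f, f, s, s)
one = langda_loss(f, f + 1, s, s)
```

In practice both feature maps would come from the frozen VLM-guided encoder rather than raw arrays; this only illustrates how the two distances combine into one supervisory signal.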

A novel framework for fully-automated co-registration of intravascular ultrasound and optical coherence tomography imaging data

Xingwei He, Kit Mills Bransby, Ahmet Emir Ulutas, Thamil Kumaran, Nathan Angelo Lecaros Yap, Gonul Zeren, Hesong Zeng, Yaojun Zhang, Andreas Baumbach, James Moon, Anthony Mathur, Jouke Dijkstra, Qianni Zhang, Lorenz Raber, Christos V Bourantas

arXiv preprint · Jul 8, 2025
Aims: To develop a deep-learning (DL) framework that allows fully automated longitudinal and circumferential co-registration of intravascular ultrasound (IVUS) and optical coherence tomography (OCT) images. Methods and results: Data from 230 patients (714 vessels) with acute coronary syndrome who underwent near-infrared spectroscopy (NIRS)-IVUS and OCT imaging in their non-culprit vessels were included in the present analysis. The lumen borders annotated by expert analysts in 61,655 NIRS-IVUS and 62,334 OCT frames, and the side branches and calcific tissue identified in 10,000 NIRS-IVUS frames and 10,000 OCT frames, were used to train DL solutions for the automated extraction of these features. The trained DL solutions were used to process NIRS-IVUS and OCT images, and their output was used by a dynamic time warping algorithm to longitudinally co-register the NIRS-IVUS and OCT images, while the circumferential registration of the IVUS and OCT was optimized through dynamic programming. On a test set of 77 vessels from 22 patients, the DL method showed high concordance with the expert analysts for the longitudinal and circumferential co-registration of the two imaging sets (concordance correlation coefficient >0.99 for the longitudinal and >0.90 for the circumferential co-registration). The Williams Index was 0.96 for longitudinal and 0.97 for circumferential co-registration, indicating performance comparable to the analysts. The time needed for the DL pipeline to process imaging data from a vessel was <90 s. Conclusion: The fully automated, DL-based framework introduced in this study for the co-registration of IVUS and OCT is fast and provides estimations that compare favorably to the expert analysts. These features render it useful for the analysis of large-scale data collected in studies that incorporate multimodality imaging to characterize plaque composition.
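The longitudinal step relies on classic dynamic time warping. A self-contained sketch of DTW on 1-D sequences; the paper's actual per-frame inputs are the DL-extracted lumen/branch descriptors, for which plain scalars stand in here:

```python
import numpy as np

def dtw_path(a, b):
    """Dynamic time warping of two 1-D sequences: fill the accumulated
    cost matrix, then backtrack to recover the optimal frame alignment."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: cost[s])
    return cost[n, m], path[::-1]

# one pullback acquired at a finer frame spacing than the other:
# frame 1 of the short run is matched to two frames of the long run
total, path = dtw_path([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
```

The returned path pairs IVUS frame indices with OCT frame indices; in the paper a separate dynamic-programming pass then resolves the circumferential rotation per matched pair.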

An autonomous agent for auditing and improving the reliability of clinical AI models

Lukas Kuhn, Florian Buettner

arXiv preprint · Jul 8, 2025
The deployment of AI models in clinical practice faces a critical challenge: models achieving expert-level performance on benchmarks can fail catastrophically when confronted with real-world variations in medical imaging. Minor shifts in scanner hardware, lighting, or demographics can erode accuracy, yet reliability auditing to identify such catastrophic failure cases before deployment is currently a bespoke and time-consuming process. Practitioners lack accessible and interpretable tools to expose and repair hidden failure modes. Here we introduce ModelAuditor, a self-reflective agent that converses with users, selects task-specific metrics, and simulates context-dependent, clinically relevant distribution shifts. ModelAuditor then generates interpretable reports explaining how much performance is likely to degrade during deployment, discussing specific likely failure modes and identifying root causes and mitigation strategies. Our comprehensive evaluation across three real-world clinical scenarios - inter-institutional variation in histopathology, demographic shifts in dermatology, and equipment heterogeneity in chest radiography - demonstrates that ModelAuditor can correctly identify context-specific failure modes of state-of-the-art models such as the established SIIM-ISIC melanoma classifier. Its targeted recommendations recover 15-25% of performance lost under real-world distribution shift, substantially outperforming both baseline models and state-of-the-art augmentation methods. These improvements are achieved through a multi-agent architecture; audits execute on consumer hardware in under 10 minutes and cost less than US$0.50 per audit.

Just Say Better or Worse: A Human-AI Collaborative Framework for Medical Image Segmentation Without Manual Annotations

Yizhe Zhang

arXiv preprint · Jul 8, 2025
Manual annotation of medical images is a labor-intensive and time-consuming process, posing a significant bottleneck in the development and deployment of robust medical imaging AI systems. This paper introduces a novel Human-AI collaborative framework for medical image segmentation that substantially reduces the annotation burden by eliminating the need for explicit manual pixel-level labeling. The core innovation lies in a preference learning paradigm, where human experts provide minimal, intuitive feedback -- simply indicating whether an AI-generated segmentation is better or worse than a previous version. The framework comprises four key components: (1) an adaptable foundation model (FM) for feature extraction, (2) label propagation based on feature similarity, (3) a clicking agent that learns from human better-or-worse feedback to decide where to click and with which label, and (4) a multi-round segmentation learning procedure that trains a state-of-the-art segmentation network using pseudo-labels generated by the clicking agent and FM-based label propagation. Experiments on three public datasets demonstrate that the proposed approach achieves competitive segmentation performance using only binary preference feedback, without requiring experts to manually annotate the images directly.
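The better-or-worse signal can be illustrated with a toy search loop. Here the expert is simulated by a hypothetical oracle that compares Dice against a hidden ground truth; the real framework has no such ground truth and instead trains a clicking agent from the binary feedback:

```python
import numpy as np

def dice(a, b):
    """Dice overlap of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def preference_search(candidates, is_better):
    """Keep a running best segmentation using only better/worse feedback."""
    best = candidates[0]
    for cand in candidates[1:]:
        if is_better(cand, best):
            best = cand
    return best

gt = np.zeros((8, 8), bool)
gt[2:6, 2:6] = True                       # hidden ground truth (oracle only)
cands = [np.zeros((8, 8), bool) for _ in range(3)]
cands[0][2:5, 2:5] = True                 # undersized
cands[1][2:6, 2:6] = True                 # exact
cands[2][0:3, 0:3] = True                 # misplaced
oracle = lambda a, b: dice(a, gt) > dice(b, gt)   # expert stand-in
best = preference_search(cands, oracle)
```

Each comparison costs the expert one binary judgment rather than a pixel-level annotation, which is the labor saving the paper targets.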

Modeling and Reversing Brain Lesions Using Diffusion Models

Omar Zamzam, Haleh Akrami, Anand Joshi, Richard Leahy

arXiv preprint · Jul 8, 2025
Brain lesions are abnormalities or injuries in brain tissue that are often detectable using magnetic resonance imaging (MRI), which reveals structural changes in the affected areas. This broad definition of brain lesions includes areas of the brain that are irreversibly damaged, as well as areas of brain tissue that are deformed as a result of lesion growth or swelling. Despite the importance of differentiating between damaged and deformed tissue, existing lesion segmentation methods overlook this distinction, labeling both as a single anomaly. In this work, we introduce a diffusion model-based framework for analyzing and reversing the brain lesion process. Our pipeline first segments abnormal regions in the brain, then estimates and reverses tissue deformations by restoring displaced tissue to its original position, isolating the core lesion area representing the initial damage. Finally, we inpaint the core lesion area to arrive at an estimate of the pre-lesion healthy brain. The proposed framework reverses a forward lesion-growth model that is well established in biomechanical studies of brain lesions. Our results demonstrate improved accuracy in lesion segmentation, characterization, and brain labeling compared to traditional methods, offering a robust tool for clinical and research applications in brain lesion analysis. Since pre-lesion healthy versions of abnormal brains are not available in any public dataset for validation of the reverse process, we simulate a forward model to synthesize multiple lesioned brain images.

AI-enhanced patient-specific dosimetry in I-131 planar imaging with a single oblique view.

Jalilifar M, Sadeghi M, Emami-Ardekani A, Bitarafan-Rajabi A, Geravand K, Geramifar P

PubMed · Jul 8, 2025
This study aims to enhance dosimetry accuracy in ¹³¹I planar imaging by utilizing a single oblique view and Monte Carlo (MC)-validated dose point kernels (DPKs), alongside the integration of artificial intelligence (AI) for accurate dose prediction within planar imaging. Forty patients with thyroid cancer post-thyroidectomy and 30 with neuroendocrine tumors underwent planar and SPECT/CT imaging. Organ thicknesses were estimated using whole-body (WB) planar images with an additional oblique view. DPKs and organ-specific S-values were used to estimate the absorbed doses. Six AI algorithms were used for dose estimation: multilayer perceptron (MLP), linear regression, support vector regression, decision tree, convolutional neural network, and U-Net. Planar image counts, body thickness, patient BMI, age, S-values, and tissue attenuation coefficients were used as inputs to the AI algorithms. To provide the ground truth, CT-based segmentation generated binary masks for each organ, and the corresponding SPECT images were used for GATE MC dosimetry. The MLP-predicted dose values showed superior performance across all organs, with the lowest mean absolute error in the liver and higher errors in the spleen and salivary glands. Notably, MLP-based dose estimations closely matched the ground truth data, with <15% differences in most tissues. The MLP-estimated dose values represent a robust patient-specific dosimetry approach capable of swiftly predicting absorbed doses in different organs using WB planar images and a single oblique view. This approach facilitates the implementation of 2D planar imaging as a pre-therapeutic technique for more accurate assessment of the administered activity.
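The DPK/S-value step follows the standard MIRD formalism: the dose to a target organ is the sum over source organs of time-integrated (cumulated) activity times the corresponding S-value. A sketch with purely illustrative numbers; real S-values are organ-, geometry-, and radionuclide-specific, and the organ names here are hypothetical:

```python
def absorbed_dose(time_integrated_activity, s_values):
    """MIRD-style estimate: D(target) = sum over source organs of
    cumulated activity [MBq*s] times S(target <- source) [Gy/(MBq*s)]."""
    return sum(time_integrated_activity[src] * s for src, s in s_values.items())

# illustrative inputs only (not measured values from the study)
a_tilde = {"thyroid_remnant": 1.2e6, "whole_body": 5.0e6}       # MBq*s
s_liver = {"thyroid_remnant": 1.0e-8, "whole_body": 2.0e-9}     # Gy/(MBq*s)
dose_liver = absorbed_dose(a_tilde, s_liver)                    # Gy
```

The study's contribution sits upstream of this sum: the oblique view improves the organ-thickness estimates that feed the attenuation correction, and the AI models learn to predict the resulting doses directly from planar inputs.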

Automated instance segmentation and registration of spinal vertebrae from CT-Scans with an improved 3D U-net neural network and corner point registration.

Hill J, Khokher MR, Nguyen C, Adcock M, Li R, Anderson S, Morrell T, Diprose T, Salvado O, Wang D, Tay GK

PubMed · Jul 8, 2025
This paper presents a rapid and robust approach for 3D volumetric segmentation, labelling, and registration of human spinal vertebrae from CT scans using an optimised and improved 3D U-Net neural network architecture. The network is designed by incorporating residual and dense interconnections, followed by an extensive evaluation of different network setups that optimises components such as activation functions, optimisers, and pooling operations. In addition, the architecture is optimised for varying numbers of convolution layers per block and U-Net levels with fixed and cascading numbers of filters. For 3D virtual reality visualisation, the segmentation output of the improved 3D U-Net network is registered with the original scans through a corner point registration process. The registration takes into account the spatial coordinates of each segmented vertebra as a 3D volume and eight virtual fiducial markers to ensure alignment in all rotational planes. Trained on the VerSe'20 dataset, the proposed pipeline achieves a Dice similarity coefficient of 92.38% for vertebrae instance segmentation and a Hausdorff distance of 5.26 mm for vertebrae localisation on the VerSe'20 public test dataset, outperforming many existing methods that participated in the VerSe'20 challenge. Integrated with Singular Health's MedVR software for virtual reality visualisation, the proposed solution has been deployed on standard edge-computing hardware in medical institutions. Depending on the scan size, the deployed solution takes between 90 and 210 s to label and segment vertebrae, including the cervical vertebrae. It is hoped that accelerating the segmentation and registration process will facilitate easier preparation of future training datasets and benefit pre-surgical visualisation and planning.
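The corner point registration can be illustrated with a least-squares rigid fit over eight fiducials. This Kabsch-style sketch is a generic stand-in, not the paper's exact procedure; the cube corners play the role of the eight virtual markers:

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform (Kabsch algorithm) mapping src points
    onto dst points: returns rotation r and translation t."""
    sc, dc = src.mean(0), dst.mean(0)
    h = (src - sc).T @ (dst - dc)          # cross-covariance of centred points
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T)) # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dc - r @ sc
    return r, t

# eight corner points of a unit cube as illustrative fiducial markers
corners = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])          # known 90-degree rotation about z
moved = corners @ rz.T + np.array([1.0, 2.0, 3.0])
r, t = rigid_fit(corners, moved)           # recovers rz and the translation
```

With exact correspondences the fit is unique; in the deployed pipeline the fiducials are derived from each segmented vertebra's volume, so the fit also absorbs small segmentation noise.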

Deep supervised transformer-based noise-aware network for low-dose PET denoising across varying count levels.

Azimi MS, Felfelian V, Zeraatkar N, Dadgar H, Arabi H, Zaidi H

PubMed · Jul 8, 2025
Reducing radiation dose from PET imaging is essential to minimize cancer risks; however, it often leads to increased noise and degraded image quality, compromising diagnostic reliability. Recent advances in deep learning have shown promising results in addressing these limitations through effective denoising. However, existing networks trained on specific noise levels often fail to generalize across diverse acquisition conditions, and training multiple models for different noise levels is impractical due to data and computational constraints. This study aimed to develop a supervised Swin Transformer-based unified noise-aware (ST-UNN) network that handles diverse noise levels and reconstructs high-quality images in low-dose PET imaging. ST-UNN incorporates multiple sub-networks, each designed to address a specific noise level ranging from 1% to 10%, and an adaptive weighting mechanism dynamically integrates the outputs of these sub-networks to achieve effective denoising. The model was trained and evaluated using a PET/CT dataset encompassing the entire head and malignant lesions in the head and neck region. Performance was assessed using a combination of structural and statistical metrics, including the structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), standardized uptake value (SUV) mean bias, SUVmax bias, and root mean square error (RMSE). This comprehensive evaluation ensured reliable results for both global and localized regions within PET images. ST-UNN consistently outperformed conventional networks, particularly in ultra-low-dose scenarios. At the 1% count level, it achieved a PSNR of 34.77, an RMSE of 0.05, and an SSIM of 0.97, notably surpassing the baseline networks; it also achieved the lowest SUVmean bias (0.08) and lesion RMSE (0.12) at this level. Across all count levels, ST-UNN maintained high performance and low error, demonstrating strong generalization and diagnostic integrity. ST-UNN offers a scalable, transformer-based solution for low-dose PET imaging. By dynamically integrating sub-networks, it effectively addresses noise variability and provides superior image quality, thereby advancing the capabilities of low-dose and dynamic PET imaging.
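The abstract does not detail the adaptive weighting mechanism. One plausible sketch is a softmax over the distance between each sub-network's trained count level and the estimated count level of the input scan; the temperature `tau` and the distance-based weighting are assumptions, not the paper's method:

```python
import numpy as np

def blend_subnetworks(outputs, levels, est_level, tau=1.0):
    """Softmax-weight sub-network outputs by how close each trained
    count level is to the estimated count level of the input."""
    levels = np.asarray(levels, float)
    logits = -np.abs(levels - est_level) / tau
    w = np.exp(logits - logits.max())      # stable softmax
    w /= w.sum()
    return sum(wi * out for wi, out in zip(w, outputs)), w

# two toy sub-network outputs; an input at the 1% level should be
# dominated by the sub-network trained at 1%
outs = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
blended, w = blend_subnetworks(outs, levels=[1.0, 10.0], est_level=1.0, tau=0.5)
```

A learned gating network could replace the fixed distance rule; the key idea the sketch shows is that the final image is a convex combination of specialist outputs rather than a hard switch.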

Enhancing stroke risk prediction through class balancing and data augmentation with CBDA-ResNet50.

Saleem MA, Javeed A, Akarathanawat W, Chutinet A, Suwanwela NC, Kaewplung P, Chaitusaney S, Benjapolakul W

PubMed · Jul 8, 2025
Accurate prediction of stroke risk at an early stage is essential for timely intervention and prevention, especially given the serious health consequences and economic burden that strokes can cause. In this study, we proposed a class-balanced and data-augmented deep learning model (CBDA-ResNet50) to improve the stroke risk prediction accuracy of the well-known ResNet50 architecture. Our approach uses class balancing and data augmentation to address common challenges in medical imaging datasets, such as class imbalance and limited training examples, which typically lead to biased or less reliable predictions. The proposed model ensures that predictions remain accurate even when some stroke risk factors are absent from the data. The performance of CBDA-ResNet50 is further improved by using the Adam optimizer with the ReduceLROnPlateau scheduler to adjust the learning rate, and weighted cross-entropy mitigates the class imbalance and significantly improves the results. The model achieves an accuracy of 97.87% and a balanced accuracy of 98.27%, outperforming many previous state-of-the-art models. This shows that more reliable predictions can be made by combining modern deep learning models with advanced data processing techniques. CBDA-ResNet50 has the potential to serve as a model for early stroke prevention, improving patient outcomes and reducing healthcare costs.
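Class balancing via weighted cross-entropy typically uses inverse-frequency class weights. A minimal numpy sketch under that assumption (the exact weighting scheme used by CBDA-ResNet50 is not specified in the abstract):

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency weights so rare classes count more in the loss."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * counts)

def weighted_ce(probs, labels, w):
    """Mean cross-entropy with each sample scaled by its class weight."""
    p = probs[np.arange(len(labels)), labels]
    return float(np.mean(w[labels] * -np.log(p)))

y = np.array([0, 0, 0, 1])                       # imbalanced: 3 vs 1
w = class_weights(y, 2)                          # minority class weighted up
probs = np.array([[0.9, 0.1]] * 3 + [[0.2, 0.8]])
loss = weighted_ce(probs, y, w)
```

In a PyTorch-style training loop the same weights would simply be passed to the cross-entropy criterion, so errors on the minority (stroke) class dominate the gradient instead of being swamped by the majority class.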

Assessment of T2-weighted MRI-derived synthetic CT for the detection of suspected lumbar facet arthritis: a comparative analysis with conventional CT.

Cao G, Wang H, Xie S, Cai D, Guo J, Zhu J, Ye K, Wang Y, Xia J

PubMed · Jul 8, 2025
We evaluated synthetic CT (sCT) generated from T2-weighted imaging (T2WI) using deep learning techniques to detect structural lesions in lumbar facet arthritis, with conventional CT as the reference standard. This single-center retrospective study included 40 patients who underwent lumbar MRI and CT within 1 week (September 2020 to August 2021). A Pix2Pix-GAN framework generated CT images from MRI data, and image quality was assessed using the structural similarity index (SSIM), mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and Dice similarity coefficient (DSC). Two senior radiologists evaluated 15 anatomical landmarks. Sensitivity, specificity, and accuracy for detecting bone erosion, osteosclerosis, and joint space alterations were analyzed for sCT, T2-weighted MRI, and conventional CT. Forty participants (21 men, 19 women) were enrolled, with a mean age of 39 ± 16.9 years. sCT showed strong agreement with conventional CT, with SSIM values of 0.888 for axial and 0.889 for sagittal views. PSNR and MAE values were 24.56 dB and 0.031 for axial views and 23.75 dB and 0.038 for sagittal views, respectively. DSC values were 0.935 for axial and 0.876 for sagittal views. sCT showed excellent intra- and inter-reader reliability, with intraclass correlation coefficients of 0.953-0.995 and 0.839-0.983, respectively. sCT had higher sensitivity (57.9% vs. 5.3%), specificity (98.8% vs. 84.6%), and accuracy (93.0% vs. 73.3%) for bone erosion than T2-weighted MRI and also outperformed it for osteosclerosis and joint space changes. sCT outperformed conventional T2-weighted MRI in detecting structural lesions indicative of lumbar facet arthritis, with conventional CT as the reference standard.
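The reported PSNR and MAE follow their standard definitions. A small sketch of both on a synthetic image pair (SSIM and DSC, which need windowed statistics and binary masks respectively, are omitted for brevity; the arrays here are random stand-ins, not study data):

```python
import numpy as np

def psnr(ref, img, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images on [0, data_range]."""
    mse = np.mean((ref - img) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def mae(ref, img):
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(ref - img)))

rng = np.random.default_rng(0)
ct = rng.random((32, 32))       # stand-in for the conventional CT slice
sct = ct + 0.01                 # stand-in sCT with a small uniform offset
p = psnr(ct, sct)
m = mae(ct, sct)
```

A uniform offset of 0.01 on a unit range gives MAE 0.01 and PSNR 40 dB, which puts the paper's ~24 dB figures in context: sCT errors are larger and spatially structured, hence the complementary SSIM and DSC metrics.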