
Scale-Aware Super-Resolution Network With Dual Affinity Learning for Lesion Segmentation From Medical Images.

Luo L, Li Y, Chai Z, Lin H, Heng PA, Chen H

PubMed · Jun 1, 2025
Convolutional neural networks (CNNs) have shown remarkable progress in medical image segmentation. However, lesion segmentation remains a challenge for state-of-the-art CNN-based algorithms due to variation in lesion scale and shape. On the one hand, tiny lesions are hard to delineate precisely from medical images, which are often of low resolution. On the other hand, segmenting large lesions requires large receptive fields, which exacerbates the first challenge. In this article, we present a scale-aware super-resolution (SR) network to adaptively segment lesions of various sizes from low-resolution (LR) medical images. Our proposed network contains dual branches that simultaneously conduct lesion mask SR (LMSR) and lesion image SR (LISR). Meanwhile, we introduce scale-aware dilated convolution (SDC) blocks into the multitask decoders to adaptively adjust the receptive fields of the convolutional kernels according to lesion size. To guide the segmentation branch to learn from richer high-resolution (HR) features, we propose a feature affinity (FA) module and a scale affinity (SA) module to enhance the multitask learning of the dual branches. On multiple challenging lesion segmentation datasets, our proposed network achieved consistent improvements over other state-of-the-art methods. Code will be available at: https://github.com/poiuohke/SASR_Net.
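The lever the SDC blocks adjust is the dilation rate, which widens a convolution's receptive field without adding parameters. As a rough illustration (not the authors' code), the receptive field of a stack of stride-1 dilated convolutions can be computed as:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer adds (k - 1) * d input positions to the field seen
    by a single output unit.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers: raising the dilation widens the field without
# adding parameters -- useful for adapting to lesion scale.
small = receptive_field([3, 3, 3], [1, 1, 1])   # 7
large = receptive_field([3, 3, 3], [1, 2, 4])   # 15
```

With dilations (1, 2, 4) the same three layers see more than twice the input extent, which is why adapting the rate to lesion size is attractive for large lesions.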

GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification.

Golhar MV, Bobrow TL, Ngamruengphong S, Durr NJ

PubMed · Jun 1, 2025
A major challenge in applying deep learning to medical imaging is the paucity of annotated data. This study explores the use of synthetic images for data augmentation to address the challenge of limited annotated data in colonoscopy lesion classification. We demonstrate that synthetic colonoscopy images generated by Generative Adversarial Network (GAN) inversion can be used as training data to improve polyp classification performance by deep learning models. We invert pairs of images with the same label to a semantically rich and disentangled latent space and manipulate latent representations to produce new synthetic images. These synthetic images maintain the same label as the input pairs. We perform image modality translation (style transfer) between white light and narrow-band imaging (NBI). We also generate realistic synthetic lesion images by interpolating between original training images to increase the variety of lesion shapes in the training dataset. Our experiments show that GAN inversion can produce multiple colonoscopy data augmentations that improve the downstream polyp classification performance by 2.7% in F1-score and 4.9% in sensitivity over other methods, including state-of-the-art data augmentation. Testing on unseen out-of-domain data also showed an improvement of 2.9% in F1-score and 2.7% in sensitivity. This approach outperforms other colonoscopy data augmentation techniques and does not require re-training multiple generative models. It also effectively uses information from diverse public datasets, even those not specifically designed for the targeted downstream task, resulting in strong domain generalizability. Project code and model: https://github.com/DurrLab/GAN-Inversion.
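The interpolation step can be sketched with plain vector math; the latent dimensionality, the latent codes, and the mixing weight below are illustrative stand-ins, not values from the paper's implementation:

```python
import numpy as np

def interpolate_latents(w1, w2, alpha=0.5):
    """Linearly blend two inverted latent codes that share a class label.

    The blend inherits that label, yielding a new synthetic training
    sample whose lesion shape lies between the two originals.
    """
    return (1.0 - alpha) * w1 + alpha * w2

rng = np.random.default_rng(0)
w_a = rng.normal(size=512)   # latent of polyp image A (hypothetical)
w_b = rng.normal(size=512)   # latent of polyp image B, same label
w_mix = interpolate_latents(w_a, w_b, alpha=0.3)
```

The new latent would then be decoded by the generator into a synthetic image; because both parents carry the same label, the mixture can be added to the training set without new annotation.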

TDSF-Net: Tensor Decomposition-Based Subspace Fusion Network for Multimodal Medical Image Classification.

Zhang Y, Xu G, Zhao M, Wang H, Shi F, Chen S

PubMed · Jun 1, 2025
Data from multiple modalities bring complementary information to deep learning-based medical image classification models. However, fusion methods that simply concatenate features or images barely consider the correlations or complementarities among different modalities and easily suffer from exponential growth in dimensionality and computational complexity as the number of modalities increases. Consequently, this article proposes a subspace fusion network with tensor decomposition (TD) to improve multimodal medical image classification. We first introduce a Tucker low-rank TD module to map the high-dimensional tensor to a low-rank subspace, reducing the redundancy caused by multimodal data and high-dimensional features. Then, a cross-tensor attention mechanism is utilized to fuse features from the subspace into a high-dimensional tensor, enhancing the representation ability of the extracted features and constructing the interaction information among components in the subspace. Extensive comparison experiments with state-of-the-art (SOTA) methods are conducted on one self-established and three public multimodal medical image datasets, verifying the effectiveness and generalization ability of the proposed method. The code is available at https://github.com/1zhang-yi/TDSFNet.
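The Tucker mapping to a low-rank subspace can be illustrated with a truncated higher-order SVD in plain numpy; this is a generic sketch of Tucker decomposition, not the paper's learned module, and the tensor shape and ranks are invented:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` first, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker_hosvd(tensor, ranks):
    """Truncated higher-order SVD, one classical Tucker decomposition.

    Returns the factor matrices U_n and the low-rank core obtained by
    projecting each mode onto its leading left singular vectors.
    """
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = tensor
    for mode, U in enumerate(factors):
        # Contract mode `mode` with U (i.e. multiply by U^T along that mode).
        core = np.moveaxis(
            np.tensordot(core, U, axes=([mode], [0])), -1, mode)
    return core, factors

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 8, 8))           # stand-in multimodal feature tensor
core, factors = tucker_hosvd(X, (4, 4, 4))
```

The 8x8x8 tensor is compressed to a 4x4x4 core, an eighth of the entries, which is the dimensionality-reduction effect the TD module exploits before fusion.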

Deep learning-based MRI reconstruction with Artificial Fourier Transform Network (AFTNet).

Yang Y, Zhang Y, Li Z, Tian JS, Dagommer M, Guo J

PubMed · Jun 1, 2025
Deep complex-valued neural networks (CVNNs) provide a powerful way to leverage complex number operations and representations and have succeeded in several phase-based applications. However, previous networks have not fully explored the impact of complex-valued networks in the frequency domain. Here, we introduce a unified complex-valued deep learning framework - Artificial Fourier Transform Network (AFTNet) - which combines domain-manifold learning and CVNNs. AFTNet can be readily used to solve image inverse problems in domain transformation, especially for accelerated magnetic resonance imaging (MRI) reconstruction and other applications. While conventional methods typically utilize magnitude images or treat the real and imaginary components of k-space data as separate channels, our approach directly processes raw k-space data in the frequency domain, utilizing complex-valued operations. This allows for a mapping between the frequency (k-space) and image domain to be determined through cross-domain learning. We show that AFTNet achieves superior accelerated MRI reconstruction compared to existing approaches. Furthermore, our approach can be applied to various tasks, such as denoised magnetic resonance spectroscopy (MRS) reconstruction and datasets with various contrasts. The AFTNet presented here is a valuable preprocessing component for different preclinical studies and provides an innovative alternative for solving inverse problems in imaging and spectroscopy. The code is available at: https://github.com/yanting-yang/AFT-Net.
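The domain transform at the heart of this setup is the Fourier pair between complex k-space and image space. A minimal numpy illustration of that pair, with no learned components and an invented slice size, is:

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.normal(size=(64, 64))        # stand-in for one MRI slice

# Forward model: image -> complex k-space (fully sampled, no noise).
kspace = np.fft.fft2(image)

# Classical reconstruction is just the inverse transform; a network
# like AFTNet instead operates on the raw complex k-space directly.
recon = np.fft.ifft2(kspace).real

# The conventional alternative the abstract contrasts with: splitting
# the complex data into two real channels.
real_part, imag_part = kspace.real, kspace.imag
```

Keeping `kspace` complex preserves the phase relationship between the two channels, which is what complex-valued operations are meant to exploit.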

Automated Ensemble Multimodal Machine Learning for Healthcare.

Imrie F, Denner S, Brunschwig LS, Maier-Hein K, van der Schaar M

PubMed · Jun 1, 2025
The application of machine learning in medicine and healthcare has led to the creation of numerous diagnostic and prognostic models. However, despite their success, current approaches generally issue predictions using data from a single modality. This stands in stark contrast with clinician decision-making which employs diverse information from multiple sources. While several multimodal machine learning approaches exist, significant challenges in developing multimodal systems remain that are hindering clinical adoption. In this paper, we introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning. AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies. In an illustrative application using a multimodal skin lesion dataset, we highlight the importance of multimodal machine learning and the power of combining multiple fusion strategies using ensemble learning. We have open-sourced our framework as a tool for the community and hope it will accelerate the uptake of multimodal machine learning in healthcare and spur further innovation.
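One of the simplest fusion strategies such a framework can ensemble is late fusion, averaging the per-modality predicted probabilities. The sketch below is generic, and the two model outputs and weights are hypothetical, not taken from AutoPrognosis-M:

```python
import numpy as np

def late_fusion(prob_list, weights=None):
    """Weighted average of per-modality class-probability vectors."""
    probs = np.stack(prob_list)
    if weights is None:
        weights = np.full(len(prob_list), 1.0 / len(prob_list))
    fused = np.tensordot(weights, probs, axes=1)
    return fused / fused.sum(axis=-1, keepdims=True)  # renormalize

p_image = np.array([0.7, 0.3])   # imaging model output (hypothetical)
p_tab = np.array([0.4, 0.6])     # tabular model output (hypothetical)
fused = late_fusion([p_image, p_tab], weights=np.array([0.6, 0.4]))
```

Early fusion (concatenating features before a joint head) and intermediate fusion sit at the other ends of the design space; ensembling across strategies hedges against any single choice being wrong for a given dataset.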

Cross-site Validation of AI Segmentation and Harmonization in Breast MRI.

Huang Y, Leotta NJ, Hirsch L, Gullo RL, Hughes M, Reiner J, Saphier NB, Myers KS, Panigrahi B, Ambinder E, Di Carlo P, Grimm LJ, Lowell D, Yoon S, Ghate SV, Parra LC, Sutton EJ

PubMed · Jun 1, 2025
This work aims to perform a cross-site validation of automated segmentation for breast cancers in MRI and to compare the performance to radiologists. A three-dimensional (3D) U-Net was trained to segment cancers in dynamic contrast-enhanced axial MRIs using a large dataset from Site 1 (n = 15,266; 449 malignant and 14,817 benign). Performance was validated on site-specific test data from this and two additional sites, and common publicly available testing data. Four radiologists from each of the three clinical sites provided two-dimensional (2D) segmentations as ground truth. Segmentation performance did not differ between the network and radiologists on the test data from Sites 1 and 2 or the common public data (median Dice score Site 1, network 0.86 vs. radiologist 0.85, n = 114; Site 2, 0.91 vs. 0.91, n = 50; common: 0.93 vs. 0.90). For Site 3, an affine input layer was fine-tuned using segmentation labels, resulting in comparable performance between the network and radiologist (0.88 vs. 0.89, n = 42). Radiologist performance differed on the common test data, and the network numerically outperformed 11 of the 12 radiologists (median Dice: 0.85-0.94, n = 20). In conclusion, a deep network with a novel supervised harmonization technique matches radiologists' performance in MRI tumor segmentation across clinical sites. We make code and weights publicly available to promote reproducible AI in radiology.
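The Dice score used to compare network and radiologist segmentations has a one-line definition; a minimal sketch with tiny invented masks is:

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

truth = np.zeros((4, 4), dtype=int); truth[1:3, 1:3] = 1   # 4-pixel "tumor"
pred = np.zeros((4, 4), dtype=int);  pred[1:3, 1:4] = 1    # 6-pixel prediction
score = dice(pred, truth)   # 2*4 / (6 + 4) = 0.8
```

A score of 0.8 for a 2-pixel over-segmentation of a 4-pixel target shows how sensitive Dice is for small structures, which is context for the 0.85-0.93 medians reported above.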

PEDRA-EFB0: colorectal cancer prognostication using deep learning with patch embeddings and dual residual attention.

Zhao Z, Wang H, Wu D, Zhu Q, Tan X, Hu S, Ge Y

PubMed · Jun 1, 2025
In computer-aided diagnosis systems, precise feature extraction from CT scans of colorectal cancer using deep learning is essential for effective prognosis. However, existing convolutional neural networks struggle to capture long-range dependencies and contextual information, resulting in incomplete CT feature extraction. To address this, the PEDRA-EFB0 architecture integrates patch embeddings and a dual residual attention mechanism for enhanced feature extraction and survival prediction in colorectal cancer CT scans. A patch embedding method processes CT scans into patches, creating positional features for global representation and guiding spatial attention computation. Additionally, a dual residual attention mechanism during the upsampling stage selectively combines local and global features, enhancing CT data utilization. Furthermore, this paper proposes a feature selection algorithm that combines autoencoders with entropy-based measures: high-dimensional data are encoded and compressed to reduce redundant information, and entropy is used to assess the importance of each feature, thereby achieving precise feature selection. Experimental results indicate the PEDRA-EFB0 model outperforms traditional methods on colorectal cancer CT metrics, notably in C-index, BS, MCC, and AUC, enhancing survival prediction accuracy. Our code is freely available at https://github.com/smile0208z/PEDRA .
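The entropy half of the feature-selection idea can be sketched with a histogram-based Shannon entropy per feature column; this is a generic illustration under invented data, not the paper's algorithm (which also involves an autoencoder stage):

```python
import numpy as np

def entropy_scores(features, bins=10):
    """Shannon entropy of each feature column under a simple
    histogram estimate; higher entropy = more informative spread."""
    scores = []
    for col in features.T:
        hist, _ = np.histogram(col, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        scores.append(-(p * np.log2(p)).sum())
    return np.array(scores)

def select_top_k(features, k):
    """Keep the k columns with the highest entropy."""
    idx = np.sort(np.argsort(entropy_scores(features))[::-1][:k])
    return features[:, idx], idx

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
X[:, 0] = 1.0                      # constant column: zero entropy
X_sel, kept = select_top_k(X, 3)
```

The constant column carries no information and is scored zero, so it is never among the kept features; redundant, near-constant encodings would be pruned the same way.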

Robust whole-body PET image denoising using 3D diffusion models: evaluation across various scanners, tracers, and dose levels.

Yu B, Ozdemir S, Dong Y, Shao W, Pan T, Shi K, Gong K

PubMed · Jun 1, 2025
Whole-body PET imaging plays an essential role in cancer diagnosis and treatment but suffers from low image quality. Traditional deep learning-based denoising methods work well for a specific acquisition but are less effective in handling diverse PET protocols. In this study, we proposed and validated a 3D Denoising Diffusion Probabilistic Model (3D DDPM) as a robust and universal solution for whole-body PET image denoising. The proposed 3D DDPM gradually injected noise into the images during the forward diffusion phase, allowing the model to learn to reconstruct the clean data during the reverse diffusion process. A 3D convolutional network was trained using high-quality data from the Biograph Vision Quadra PET/CT scanner to generate the score function, enabling the model to capture accurate PET distribution information extracted from the total-body datasets. The trained 3D DDPM was evaluated on datasets from four scanners, four tracer types, and six dose levels representing a broad spectrum of clinical scenarios. The proposed 3D DDPM consistently outperformed 2D DDPM, 3D UNet, and 3D GAN, demonstrating its superior denoising performance across all tested conditions. Additionally, the model's uncertainty maps exhibited lower variance, reflecting its higher confidence in its outputs. The proposed 3D DDPM can effectively handle various clinical settings, including variations in dose levels, scanners, and tracers, establishing it as a promising foundational model for PET image denoising. The trained 3D DDPM model of this work can be utilized off the shelf by researchers as a whole-body PET image denoising solution. The code and model are available at https://github.com/Miche11eU/PET-Image-Denoising-Using-3D-Diffusion-Model .
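The forward diffusion phase described above has a closed form: a noisy sample at any step t can be drawn directly from the clean data. A minimal numpy sketch, with an invented patch size and a standard linear beta schedule (not necessarily the paper's), is:

```python
import numpy as np

def forward_diffuse(x0, alpha_bar_t, rng):
    """Closed-form forward diffusion q(x_t | x_0):

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    """
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return xt, eps

rng = np.random.default_rng(4)
x0 = rng.normal(size=(8, 8, 8))    # stand-in 3D PET patch

# Cumulative product of (1 - beta_t) over a linear beta schedule.
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
xt, eps = forward_diffuse(x0, alpha_bars[500], rng)
```

The reverse process trains the 3D network to predict `eps` from `xt`; at inference, a noisy PET volume is treated as a partially diffused sample and denoised step by step.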

UniBrain: Universal Brain MRI diagnosis with hierarchical knowledge-enhanced pre-training.

Lei J, Dai L, Jiang H, Wu C, Zhang X, Zhang Y, Yao J, Xie W, Zhang Y, Li Y, Zhang Y, Wang Y

PubMed · Jun 1, 2025
Magnetic Resonance Imaging (MRI) has become a pivotal tool in diagnosing brain diseases, with a wide array of computer-aided artificial intelligence methods being proposed to enhance diagnostic accuracy. However, early studies were often limited by small-scale datasets and a narrow range of disease types, which posed challenges in model generalization. This study presents UniBrain, a hierarchical knowledge-enhanced pre-training framework designed for universal brain MRI diagnosis. UniBrain leverages a large-scale dataset comprising 24,770 imaging-report pairs from routine diagnostics for pre-training. Unlike previous approaches that either focused solely on visual representation learning or used brute-force alignment between vision and language, the framework introduces a hierarchical alignment mechanism. This mechanism extracts structured knowledge from free-text clinical reports at multiple granularities, enabling vision-language alignment at both the sequence and case levels, thereby significantly improving feature learning efficiency. A coupled vision-language perception module is further employed for text-guided multi-label classification, which facilitates zero-shot evaluation and fine-tuning of downstream tasks without modifying the model architecture. UniBrain is validated on both in-domain and out-of-domain datasets, consistently surpassing existing state-of-the-art diagnostic models and demonstrating performance on par with radiologists in specific disease categories. It shows strong generalization capabilities across diverse tasks, highlighting its potential for broad clinical application. The code is available at https://github.com/ljy19970415/UniBrain.
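Zero-shot evaluation in a vision-language model of this kind typically reduces to comparing an image embedding against text embeddings of candidate labels. The sketch below shows that scoring step only, with random stand-in embeddings; the embedding dimension and label count are invented, not UniBrain's:

```python
import numpy as np

def zero_shot_scores(img_emb, text_embs):
    """Cosine similarity between one image embedding and each disease
    label's text embedding; the argmax is the zero-shot prediction."""
    img = img_emb / np.linalg.norm(img_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return txt @ img

rng = np.random.default_rng(5)
image_embedding = rng.normal(size=128)         # hypothetical encoder output
label_embeddings = rng.normal(size=(4, 128))   # e.g. 4 disease descriptions
scores = zero_shot_scores(image_embedding, label_embeddings)
pred = int(np.argmax(scores))
```

Because labels enter only as text, new disease categories can be scored without retraining, which is what makes the zero-shot setting possible.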

GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models.

Zotova D, Pinon N, Trombetta R, Bouet R, Jung J, Lartizien C

PubMed · Jun 1, 2025
Research in the cross-modal medical image translation domain has been very productive over the past few years in tackling the scarce availability of large curated multi-modality datasets, with promising performance from GAN-based architectures. However, only a few of these studies assessed the task-related performance of these synthetic data, especially for the training of deep models. We design and compare different GAN-based frameworks for generating synthetic brain [18F]fluorodeoxyglucose (FDG) PET images from T1-weighted MRI data. We first perform standard qualitative and quantitative visual quality evaluation. Then, we further explore the impact of using these fake PET data in the training of a deep unsupervised anomaly detection (UAD) model designed to detect subtle epilepsy lesions in T1 MRI and FDG PET images. We introduce novel diagnostic task-oriented quality metrics of the synthetic FDG PET data tailored to our unsupervised detection task, then use these fake data to train a use-case UAD model combining deep representation learning based on siamese autoencoders with an OC-SVM density support estimation model. This model is trained on normal subjects only and allows the detection of any deviation from the pattern of the normal population. We compare the detection performance of models trained on 35 real T1 MR images of normal subjects paired with either 35 true PET images or 35 synthetic PET images generated from the best-performing generative models. Performance analysis is conducted on 17 exams of epilepsy patients undergoing surgery. The best-performing GAN-based models generate realistic fake PET images of control subjects with SSIM and PSNR values around 0.9 and 23.8, respectively, and in distribution (ID) with regard to the true control dataset. The best UAD model trained on these synthetic normative PET data reaches 74% sensitivity.
Our results confirm that GAN-based models are best suited for MR T1 to FDG PET translation, outperforming transformer and diffusion models. We also demonstrate the diagnostic value of these synthetic data for the training of UAD models and evaluation on clinical exams of epilepsy patients. Our code and the normative image dataset are available.
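The PSNR figure quoted above has a standard definition; a minimal numpy sketch with invented images and noise level (illustrative only, not the study's evaluation pipeline) is:

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 log10(range^2 / MSE)."""
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(6)
ref = rng.uniform(size=(32, 32))                    # stand-in reference PET
noisy = np.clip(ref + rng.normal(scale=0.05, size=ref.shape), 0, 1)
value = psnr(ref, noisy)   # roughly mid-20s dB at sigma = 0.05
```

A PSNR near 23.8 dB, as reported for the best fake PET images, thus corresponds to a pixelwise error on the order of a few percent of the intensity range.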