Page 49 of 59587 results

Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models.

Lian C, Zhou HY, Liang D, Qin J, Wang L

PubMed · papers · Jun 2 2025
Medical vision-language alignment through cross-modal contrastive learning shows promising performance in image-text matching tasks, such as retrieval and zero-shot classification. However, conventional cross-modal contrastive learning (CLIP-based) methods suffer from suboptimal visual representation capabilities, which also limits their effectiveness in vision-language alignment. In contrast, although the models pretrained via multimodal masked modeling struggle with direct cross-modal matching, they excel in visual representation. To address this contradiction, we propose ALTA (ALign Through Adapting), an efficient medical vision-language alignment method that utilizes only about 8% of the trainable parameters and less than 1/5 of the computational consumption required for masked record modeling. ALTA achieves superior performance in vision-language matching tasks like retrieval and zero-shot classification by adapting the pretrained vision model from masked record modeling. Additionally, we integrate temporal-multiview radiograph inputs to enhance the information consistency between radiographs and their corresponding descriptions in reports, further improving the vision-language alignment. Experimental evaluations show that ALTA outperforms the best-performing counterpart by over 4% absolute points in text-to-image accuracy and approximately 6% absolute points in image-to-text retrieval accuracy. The adaptation of vision-language models during efficient alignment also promotes better vision and language understanding. Code is publicly available at https://github.com/DopamineLcy/ALTA.
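
The cross-modal contrastive objective that the abstract refers to as CLIP-based can be sketched as a symmetric cross-entropy over image-text similarity scores. This is a minimal illustration, not ALTA's implementation: the function names and the temperature of 0.07 are assumptions made for the example, and embeddings are plain Python lists.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(x - top) for x in row))
    return [x - log_sum for x in row]


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair k of images and texts is the positive.

    The temperature value is an assumption for this sketch.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text: each row should put its mass on the diagonal entry.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image: same idea over the columns.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (i2t + t2i)
```

With correctly paired embeddings the loss approaches zero, and it grows when images and reports are mismatched, which is the signal retrieval and zero-shot classification rely on.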

Disease-Grading Networks with Asymmetric Gaussian Distribution for Medical Imaging.

Tang W, Yang Z

PubMed · papers · Jun 2 2025
Deep learning-based disease grading technologies facilitate timely medical intervention due to their high efficiency and accuracy. Recent advancements have enhanced grading performance by incorporating the ordinal relationships of disease labels. However, existing methods often assume the same probability distribution for disease labels across instances within the same category, overlooking variations in label distributions. Additionally, the hyperparameters of these distributions are typically determined empirically, which may not accurately reflect the true distribution. To address these limitations, we propose a disease grading network utilizing a sample-aware asymmetric Gaussian label distribution, termed DGN-AGLD. This approach includes a variance predictor designed to learn and predict parameters that control the asymmetry of the Gaussian distribution, enabling distinct label distributions within the same category. This module can be seamlessly integrated into standard deep learning networks. Experimental results on four disease datasets validate the effectiveness and superiority of the proposed method, particularly on the IDRiD dataset, where it achieves a diabetic retinopathy accuracy of 77.67%. Furthermore, our method extends to joint disease grading tasks, yielding superior results and demonstrating significant generalization capabilities. Visual analysis indicates that our method more accurately captures the trend of disease progression by leveraging the asymmetry in label distribution. Our code is publicly available at https://github.com/ahtwq/AGNet.
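
One way to picture an asymmetric Gaussian label distribution is as a soft label over ordinal grades whose spread differs on each side of the true grade. The sketch below is an assumption-laden illustration rather than the DGN-AGLD code: the function name and the two-sided parametrization (`sigma_left`, `sigma_right`) are hypothetical, and in the paper such parameters would come from the learned variance predictor rather than being passed in by hand.

```python
import math


def asymmetric_gaussian_labels(true_grade, num_grades, sigma_left, sigma_right):
    """Soft label over grades 0..num_grades-1, peaked at true_grade.

    Grades below the true grade use sigma_left and grades at or above it
    use sigma_right (a hypothetical parametrization for illustration).
    """
    weights = []
    for grade in range(num_grades):
        sigma = sigma_left if grade < true_grade else sigma_right
        distance = grade - true_grade
        weights.append(math.exp(-(distance ** 2) / (2.0 * sigma ** 2)))
    total = sum(weights)
    # Normalize so the weights form a probability distribution.
    return [w / total for w in weights]
```

Setting `sigma_right` larger than `sigma_left` places more probability on the more severe neighboring grades, so two samples with the same label can carry different soft targets.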

Scale-Aware Super-Resolution Network With Dual Affinity Learning for Lesion Segmentation From Medical Images.

Luo L, Li Y, Chai Z, Lin H, Heng PA, Chen H

PubMed · papers · Jun 1 2025
Convolutional neural networks (CNNs) have shown remarkable progress in medical image segmentation. However, lesion segmentation remains a challenge for state-of-the-art CNN-based algorithms due to the variance in scales and shapes. On the one hand, tiny lesions are hard to delineate precisely from medical images, which are often of low resolution. On the other hand, segmenting large-size lesions requires large receptive fields, which exacerbates the first challenge. In this article, we present a scale-aware super-resolution (SR) network to adaptively segment lesions of various sizes from low-resolution (LR) medical images. Our proposed network contains dual branches to simultaneously conduct lesion mask SR (LMSR) and lesion image SR (LISR). Meanwhile, we introduce scale-aware dilated convolution (SDC) blocks into the multitask decoders to adaptively adjust the receptive fields of the convolutional kernels according to the lesion sizes. To guide the segmentation branch to learn from richer high-resolution (HR) features, we propose a feature affinity (FA) module and a scale affinity (SA) module to enhance the multitask learning of the dual branches. On multiple challenging lesion segmentation datasets, our proposed network achieved consistent improvements compared with other state-of-the-art methods. Code will be available at: https://github.com/poiuohke/SASR_Net.

Efficient slice anomaly detection network for 3D brain MRI Volume.

Zhang Z, Mohsenzadeh Y

PubMed · papers · Jun 1 2025
Current anomaly detection methods excel with benchmark industrial data but struggle with natural images and medical data due to varying definitions of 'normal' and 'abnormal.' This makes accurate identification of deviations in these fields particularly challenging. Especially for 3D brain MRI data, all the state-of-the-art models are reconstruction-based with 3D convolutional neural networks, which are memory-intensive and time-consuming and produce noisy outputs that require further post-processing. We propose a framework called Simple Slice-based Network (SimpleSliceNet), which utilizes a model pre-trained on ImageNet and fine-tuned on a separate MRI dataset as a 2D slice feature extractor to reduce computational cost. We aggregate the extracted features to perform anomaly detection tasks on 3D brain MRI volumes. Our model integrates a conditional normalizing flow to calculate the log-likelihood of features and employs a contrastive loss to enhance anomaly detection accuracy. The results indicate improved performance, showcasing our model's remarkable adaptability and effectiveness when addressing the challenges that exist in brain MRI data. In addition, for the large-scale 3D brain volumes, our model SimpleSliceNet outperforms the state-of-the-art 2D and 3D models in terms of accuracy, memory usage and time consumption. Code is available at: https://github.com/Jarvisarmy/SimpleSliceNet.

Multi-level feature fusion network for kidney disease detection.

Rehman Khan SU

PubMed · papers · Jun 1 2025
Kidney irregularities pose a significant public health challenge, often leading to severe complications, yet the limited availability of nephrologists makes early detection costly and time-consuming. To address this issue, we propose a deep learning framework for automated kidney disease detection, leveraging feature fusion and sequential modeling techniques to enhance diagnostic accuracy. Our study thoroughly evaluates six pretrained models under identical experimental conditions, identifying ResNet50 and VGG19 as highly efficient models for feature extraction due to their deep residual learning and hierarchical representations. Our proposed methodology integrates feature fusion with an inception block to extract diverse feature representations while maintaining imbalanced dataset overhead. To enhance sequential learning and capture long-term dependencies in disease progression, ConvLSTM is incorporated after feature fusion. Additionally, an Inception block is employed after ConvLSTM to refine hierarchical feature extraction, further strengthening the proposed model's ability to leverage both spatial and temporal patterns. To validate our approach, we introduce a new dataset named Multiple Hospital Collected (MHC-CT), consisting of 1860 tumor and 1024 normal kidney CT scans, meticulously annotated by medical experts. Our model achieves 99.60% accuracy on this dataset, demonstrating its robustness in binary classification. Furthermore, to assess its generalization capability, we evaluate the model on a publicly available benchmark multiclass CT scan dataset, achieving 91.31% accuracy. The superior performance is attributed to the effective feature fusion using inception blocks and the sequential learning capabilities of ConvLSTM, which together enhance spatial and temporal feature representations. These results highlight the efficacy of the proposed framework in automating kidney disease detection, providing a reliable and efficient solution for clinical decision-making. Code is available at https://github.com/VS-EYE/KidneyDiseaseDetection.git.

PEDRA-EFB0: colorectal cancer prognostication using deep learning with patch embeddings and dual residual attention.

Zhao Z, Wang H, Wu D, Zhu Q, Tan X, Hu S, Ge Y

PubMed · papers · Jun 1 2025
In computer-aided diagnosis systems, precise feature extraction from CT scans of colorectal cancer using deep learning is essential for effective prognosis. However, existing convolutional neural networks struggle to capture long-range dependencies and contextual information, resulting in incomplete CT feature extraction. To address this, the PEDRA-EFB0 architecture integrates patch embeddings and a dual residual attention mechanism for enhanced feature extraction and survival prediction in colorectal cancer CT scans. A patch embedding method processes CT scans into patches, creating positional features for global representation and guiding spatial attention computation. Additionally, a dual residual attention mechanism during the upsampling stage selectively combines local and global features, enhancing CT data utilization. Furthermore, this paper proposes a feature selection algorithm that combines autoencoders and entropy technology, encoding and compressing high-dimensional data to reduce redundant information and using entropy to assess the importance of features, thereby achieving precise feature selection. Experimental results indicate the PEDRA-EFB0 model outperforms traditional methods on colorectal cancer CT metrics, notably in C-index, BS, MCC, and AUC, enhancing survival prediction accuracy. Our code is freely available at https://github.com/smile0208z/PEDRA .
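
Of the survival metrics the abstract lists, the C-index is the least self-explanatory, so a small sketch of Harrell's concordance index may help. It is a generic illustration, not code from the PEDRA repository: the function name and argument layout are assumptions, and the abstract does not say which C-index variant the authors used.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event got the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to uninformative predictions, and 0.0 to a fully inverted ranking.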

Cross-site Validation of AI Segmentation and Harmonization in Breast MRI.

Huang Y, Leotta NJ, Hirsch L, Gullo RL, Hughes M, Reiner J, Saphier NB, Myers KS, Panigrahi B, Ambinder E, Di Carlo P, Grimm LJ, Lowell D, Yoon S, Ghate SV, Parra LC, Sutton EJ

PubMed · papers · Jun 1 2025
This work aims to perform a cross-site validation of automated segmentation for breast cancers in MRI and to compare the performance to radiologists. A three-dimensional (3D) U-Net was trained to segment cancers in dynamic contrast-enhanced axial MRIs using a large dataset from Site 1 (n = 15,266; 449 malignant and 14,817 benign). Performance was validated on site-specific test data from this and two additional sites, and common publicly available testing data. Four radiologists from each of the three clinical sites provided two-dimensional (2D) segmentations as ground truth. Segmentation performance did not differ between the network and radiologists on the test data from Sites 1 and 2 or the common public data (median Dice score Site 1, network 0.86 vs. radiologist 0.85, n = 114; Site 2, 0.91 vs. 0.91, n = 50; common: 0.93 vs. 0.90). For Site 3, an affine input layer was fine-tuned using segmentation labels, resulting in comparable performance between the network and radiologist (0.88 vs. 0.89, n = 42). Radiologist performance differed on the common test data, and the network numerically outperformed 11 of the 12 radiologists (median Dice: 0.85-0.94, n = 20). In conclusion, a deep network with a novel supervised harmonization technique matches radiologists' performance in MRI tumor segmentation across clinical sites. We make code and weights publicly available to promote reproducible AI in radiology.
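
The Dice scores reported above measure overlap between a predicted mask and a reference mask. The following sketch shows the standard definition on flattened binary masks, plus a median over cases. The function names are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
from statistics import median


def dice_score(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for flattened binary masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def median_dice(mask_pairs):
    """Median Dice over an iterable of (prediction, reference) pairs."""
    return median(dice_score(pred, ref) for pred, ref in mask_pairs)
```

For example, a prediction covering two voxels that overlaps a one-voxel reference in a single voxel scores 2 * 1 / (2 + 1), or about 0.67.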

Robust whole-body PET image denoising using 3D diffusion models: evaluation across various scanners, tracers, and dose levels.

Yu B, Ozdemir S, Dong Y, Shao W, Pan T, Shi K, Gong K

PubMed · papers · Jun 1 2025
Whole-body PET imaging plays an essential role in cancer diagnosis and treatment but suffers from low image quality. Traditional deep learning-based denoising methods work well for a specific acquisition but are less effective in handling diverse PET protocols. In this study, we proposed and validated a 3D Denoising Diffusion Probabilistic Model (3D DDPM) as a robust and universal solution for whole-body PET image denoising. The proposed 3D DDPM gradually injected noise into the images during the forward diffusion phase, allowing the model to learn to reconstruct the clean data during the reverse diffusion process. A 3D convolutional network was trained using high-quality data from the Biograph Vision Quadra PET/CT scanner to generate the score function, enabling the model to capture accurate PET distribution information extracted from the total-body datasets. The trained 3D DDPM was evaluated on datasets from four scanners, four tracer types, and six dose levels representing a broad spectrum of clinical scenarios. The proposed 3D DDPM consistently outperformed 2D DDPM, 3D UNet, and 3D GAN, demonstrating its superior denoising performance across all tested conditions. Additionally, the model's uncertainty maps exhibited lower variance, reflecting its higher confidence in its outputs. The proposed 3D DDPM can effectively handle various clinical settings, including variations in dose levels, scanners, and tracers, establishing it as a promising foundational model for PET image denoising. The trained 3D DDPM model of this work can be utilized off the shelf by researchers as a whole-body PET image denoising solution. The code and model are available at https://github.com/Miche11eU/PET-Image-Denoising-Using-3D-Diffusion-Model .
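
The forward diffusion phase mentioned above has a standard closed form in DDPMs: a noisy sample at step t is a weighted mix of the clean image and Gaussian noise. The sketch below illustrates only that step on a flat list of voxel values. The linear schedule and its endpoint values are assumptions borrowed from common DDPM practice, not details confirmed for this work, and the function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances per step (assumed linear schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta[s]) for s <= t."""
    alpha_bars = []
    product = 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(alpha_bar) * value + math.sqrt(1.0 - alpha_bar) * eps
        for value, eps in zip(x0, noise)
    ]
    return x_t, noise
```

During training, a denoising network is asked to predict the injected noise (or an equivalent score) from `x_t` and `t`; the reverse process then removes noise step by step.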

UniBrain: Universal Brain MRI diagnosis with hierarchical knowledge-enhanced pre-training.

Lei J, Dai L, Jiang H, Wu C, Zhang X, Zhang Y, Yao J, Xie W, Zhang Y, Li Y, Zhang Y, Wang Y

PubMed · papers · Jun 1 2025
Magnetic Resonance Imaging (MRI) has become a pivotal tool in diagnosing brain diseases, with a wide array of computer-aided artificial intelligence methods being proposed to enhance diagnostic accuracy. However, early studies were often limited by small-scale datasets and a narrow range of disease types, which posed challenges in model generalization. This study presents UniBrain, a hierarchical knowledge-enhanced pre-training framework designed for universal brain MRI diagnosis. UniBrain leverages a large-scale dataset comprising 24,770 imaging-report pairs from routine diagnostics for pre-training. Unlike previous approaches that either focused solely on visual representation learning or used brute-force alignment between vision and language, the framework introduces a hierarchical alignment mechanism. This mechanism extracts structured knowledge from free-text clinical reports at multiple granularities, enabling vision-language alignment at both the sequence and case levels, thereby significantly improving feature learning efficiency. A coupled vision-language perception module is further employed for text-guided multi-label classification, which facilitates zero-shot evaluation and fine-tuning of downstream tasks without modifying the model architecture. UniBrain is validated on both in-domain and out-of-domain datasets, consistently surpassing existing state-of-the-art diagnostic models and demonstrating performance on par with radiologists in specific disease categories. It shows strong generalization capabilities across diverse tasks, highlighting its potential for broad clinical application. The code is available at https://github.com/ljy19970415/UniBrain.

GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models.

Zotova D, Pinon N, Trombetta R, Bouet R, Jung J, Lartizien C

PubMed · papers · Jun 1 2025
Research in the cross-modal medical image translation domain has been very productive over the past few years in tackling the scarce availability of large curated multi-modality datasets with the promising performance of GAN-based architectures. However, only a few of these studies assessed the task-based performance of these synthetic data, especially for the training of deep models. We design and compare different GAN-based frameworks for generating synthetic brain [18F]fluorodeoxyglucose (FDG) PET images from T1-weighted MRI data. We first perform standard qualitative and quantitative visual quality evaluation. Then, we further explore the impact of using these fake PET data in the training of a deep unsupervised anomaly detection (UAD) model designed to detect subtle epilepsy lesions in T1 MRI and FDG PET images. We introduce novel diagnostic task-oriented quality metrics of the synthetic FDG PET data tailored to our unsupervised detection task, then use these fake data to train a use case UAD model combining deep representation learning based on siamese autoencoders with an OC-SVM density support estimation model. This model is trained on normal subjects only and allows the detection of any variation from the pattern of the normal population. We compare the detection performance of models trained on 35 real T1 MR images of normal subjects paired with either 35 true PET images or 35 synthetic PET images generated by the best-performing generative models. Performance analysis is conducted on 17 exams of epilepsy patients undergoing surgery. The best-performing GAN-based models allow generating realistic fake PET images of control subjects, with SSIM and PSNR values around 0.9 and 23.8, respectively, and in distribution (ID) with regard to the true control dataset. The best UAD model trained on these synthetic normative PET data allows reaching 74% sensitivity. Our results confirm that GAN-based models are best suited for MR T1 to FDG PET translation, outperforming transformer or diffusion models. We also demonstrate the diagnostic value of these synthetic data for the training of UAD models and evaluation on clinical exams of epilepsy patients. Our code and the normative image dataset are available.
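
For readers less familiar with the image-quality figures quoted above, PSNR can be computed directly from the mean squared error between a reference image and a synthetic one. This is a generic sketch on flattened intensity lists; the function name and the `data_range` argument are assumptions for the example, and the abstract does not state the intensity range the authors used. PSNR is conventionally reported in decibels.

```python
import math


def psnr(reference, synthetic, data_range):
    """Peak signal-to-noise ratio: 10 * log10(data_range^2 / MSE)."""
    if len(reference) != len(synthetic):
        raise ValueError("images must have the same number of voxels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthetic)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)
```

Higher values mean the synthetic image is closer to the reference, and each 10-unit increase corresponds to a tenfold drop in mean squared error.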