
Multimodal medical image-to-image translation via variational autoencoder latent space mapping.

Liang Z, Cheng M, Ma J, Hu Y, Li S, Tian X

PubMed · May 29, 2025
Medical image translation has become an essential tool in modern radiotherapy, providing complementary information for target delineation and dose calculation. However, current approaches are constrained by their modality-specific nature, requiring separate model training for each pair of imaging modalities. This limitation hinders the efficient deployment of comprehensive multimodal solutions in clinical practice. The aim of this work was to develop a unified image translation method using variational autoencoder (VAE) latent space mapping, which enables flexible conversion between different medical imaging modalities to meet clinical demands. We propose a three-stage approach to construct a unified image translation model. First, a VAE is trained to learn a shared latent space for various medical images. A stacked bidirectional transformer is then used to learn the mapping between different modalities within the latent space, guided by the image modality. Finally, the VAE decoder is fine-tuned to improve image quality. Our internal dataset comprised paired imaging data from 87 head and neck cases, with each case containing cone beam computed tomography (CBCT), computed tomography (CT), MR T1c, and MR T2w images. The effectiveness of this strategy is quantitatively evaluated on our internal dataset and a public dataset using the mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). Additionally, the dosimetric characteristics of the synthetic CT images are evaluated, and subjective quality assessments of the synthetic MR images are conducted to determine their clinical value. The VAE with the Kullback-Leibler (KL)-16 image tokenizer demonstrates superior image reconstruction ability, achieving a Fréchet inception distance (FID) of 4.84, a PSNR of 32.80 dB, and an SSIM of 92.33%. In synthetic CT tasks, the model shows greater accuracy in intramodality translations than in cross-modality translations, as evidenced by an MAE of 21.60 ± 8.80 Hounsfield units (HU) in the CBCT-to-CT task versus 45.23 ± 13.21 HU and 47.55 ± 13.88 HU in the MR T1c-to-CT and T2w-to-CT tasks, respectively. For the cross-contrast MR translation tasks, the results are very close, with mean PSNR and SSIM values of 26.33 ± 1.36 dB and 85.21% ± 2.21% for the T1c-to-T2w translation and 26.03 ± 1.67 dB and 85.73% ± 2.66% for the T2w-to-T1c translation. Dosimetric results indicate that all gamma pass rates for synthetic CTs are higher than 99% for photon intensity-modulated radiation therapy (IMRT) planning. However, the subjective quality assessment scores for synthetic MR images are lower than those for real MR images. The proposed three-stage approach successfully yields a unified image translation model that can effectively handle a wide range of medical image translation tasks. This flexibility and effectiveness make it a valuable tool for clinical applications.
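A minimal PyTorch sketch of the three-stage idea described above: encode images from any modality into a shared VAE latent space, map the latents with a transformer conditioned on the target modality, and decode. All module sizes, class names, and the toy data are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of VAE latent-space translation (sizes and names are assumptions).
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Shared encoder/decoder: images from any modality map into one latent space."""
    def __init__(self, channels=1, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2 * latent_dim, 4, stride=2, padding=1),  # mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1),
        )

    def encode(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=1)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

class LatentMapper(nn.Module):
    """Transformer that maps source-modality latents toward target-modality latents,
    conditioned on a learned embedding of the target modality."""
    def __init__(self, latent_dim=16, n_modalities=4):
        super().__init__()
        self.modality_emb = nn.Embedding(n_modalities, latent_dim)
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, z, target_modality):
        b, c, h, w = z.shape
        tokens = z.flatten(2).transpose(1, 2)                   # (B, H*W, C) latent tokens
        cond = self.modality_emb(target_modality)[:, None, :]   # prepended modality token
        tokens = self.transformer(torch.cat([cond, tokens], dim=1))[:, 1:]
        return tokens.transpose(1, 2).reshape(b, c, h, w)

vae, mapper = TinyVAE(), LatentMapper()
cbct = torch.randn(2, 1, 64, 64)                    # toy CBCT batch
z_src = vae.encode(cbct)                            # stage 1: shared latent space
z_tgt = mapper(z_src, torch.tensor([1, 1]))         # stage 2: map toward "CT" latents
synthetic_ct = vae.decoder(z_tgt)                   # stage 3: decode (decoder fine-tuned)
print(synthetic_ct.shape)
```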

Enhanced Pelvic CT Segmentation via Deep Learning: A Study on Loss Function Effects.

Ghaedi E, Asadi A, Hosseini SA, Arabi H

PubMed · May 29, 2025
Effective radiotherapy planning requires precise delineation of organs at risk (OARs), but the traditional manual method is laborious and subject to variability. This study explores using convolutional neural networks (CNNs) for automating OAR segmentation in pelvic CT images, focusing on the bladder, prostate, rectum, and femoral heads (FHs) as an efficient alternative to manual segmentation. Utilizing the Medical Open Network for AI (MONAI) framework, we implemented and compared U-Net, ResU-Net, SegResNet, and Attention U-Net models and explored different loss functions to enhance segmentation accuracy. Our study involved 240 patients for prostate segmentation and 220 patients for the other organs. The models' performance was evaluated using metrics such as the Dice similarity coefficient (DSC), Jaccard index (JI), and the 95th percentile Hausdorff distance (95thHD), benchmarking the results against expert segmentation masks. SegResNet outperformed all models, achieving DSC values of 0.951 for the bladder, 0.829 for the prostate, 0.860 for the rectum, 0.979 for the left FH, and 0.985 for the right FH (p < 0.05 vs. U-Net and ResU-Net). Attention U-Net also excelled, particularly for bladder and rectum segmentation. Experiments with loss functions on SegResNet showed that Dice loss consistently delivered optimal or equivalent performance across OARs, while DiceCE slightly enhanced prostate segmentation (DSC = 0.845, p = 0.0138). These results indicate that advanced CNNs, especially SegResNet, paired with optimized loss functions, provide a reliable, efficient alternative to manual methods, promising improved precision in radiotherapy planning.
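As a concrete illustration of the loss-function comparison described above, the sketch below swaps MONAI's DiceLoss and DiceCELoss on a SegResNet; the channel count, patch size, and toy tensors are placeholders rather than the study's actual configuration.

```python
# Hedged sketch: comparing Dice and DiceCE losses on a MONAI SegResNet.
import torch
from monai.networks.nets import SegResNet
from monai.losses import DiceLoss, DiceCELoss

# Background + bladder, prostate, rectum, left/right femoral heads (6 classes assumed)
model = SegResNet(spatial_dims=3, in_channels=1, out_channels=6)

losses = {
    "Dice":   DiceLoss(to_onehot_y=True, softmax=True),
    "DiceCE": DiceCELoss(to_onehot_y=True, softmax=True),
}

image = torch.randn(1, 1, 96, 96, 96)              # toy pelvic CT patch
label = torch.randint(0, 6, (1, 1, 96, 96, 96))    # toy multi-organ mask

logits = model(image)
for name, loss_fn in losses.items():
    print(f"{name} loss: {loss_fn(logits, label).item():.4f}")
```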

A combined attention mechanism for brain tumor segmentation of lower-grade glioma in magnetic resonance images.

Hedibi H, Beladgham M, Bouida A

PubMed · May 29, 2025
Low-grade gliomas (LGGs) are among the most difficult brain tumors to segment reliably in FLAIR MRI, and effective delineation of these lesions is critical for clinical diagnosis, treatment planning, and patient monitoring. Nevertheless, conventional U-Net-based approaches usually suffer from the loss of critical structural details owing to repetitive down-sampling, while the encoder features often retain irrelevant information that is not properly utilized by the decoder. To address these challenges, this paper proposes a dual-attention U-shaped architecture, named ECASE-Unet, which seamlessly integrates Efficient Channel Attention (ECA) and Squeeze-and-Excitation (SE) blocks in both the encoder and decoder stages. By selectively recalibrating channel-wise information, the model emphasizes diagnostically significant regions of interest and suppresses noise. Furthermore, dilated convolutions are introduced at the bottleneck layer to capture multi-scale contextual cues without inflating computational complexity, and dropout regularization is systematically applied to prevent overfitting on heterogeneous data. Experimental results on the Kaggle Low-Grade-Glioma dataset show that ECASE-Unet greatly outperforms previous segmentation algorithms, reaching a Dice coefficient of 0.9197 and an Intersection over Union (IoU) of 0.8521. Comprehensive ablation studies further reveal that integrating ECA and SE modules delivers complementary benefits, supporting the model's robust efficacy in precisely identifying LGG boundaries. These findings underline the potential of ECASE-Unet to streamline clinical workflows and improve patient outcomes. Future work will focus on improving the model's applicability to new MRI modalities and studying the integration of clinical characteristics for a more comprehensive characterization of brain tumors.
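For readers unfamiliar with the two attention blocks combined here, the sketch below gives generic, textbook-style PyTorch forms of SE and ECA channel recalibration; these are illustrative, not the authors' exact ECASE-Unet modules.

```python
# Generic channel-attention blocks (illustrative forms, not the paper's code).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: squeeze to per-channel stats, excite with a small MLP."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # (B, C) channel weights
        return x * w[:, :, None, None]           # recalibrate feature maps

class ECABlock(nn.Module):
    """Efficient Channel Attention: a 1D conv over the channel descriptor,
    avoiding the dimensionality reduction used in SE."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        y = x.mean(dim=(2, 3)).unsqueeze(1)      # (B, 1, C)
        w = torch.sigmoid(self.conv(y)).squeeze(1)
        return x * w[:, :, None, None]

feat = torch.randn(2, 64, 32, 32)                # toy encoder feature map
print(ECABlock()(SEBlock(64)(feat)).shape)       # stacked channel recalibration
```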

Exploring best-performing radiomic features with combined multilevel discrete wavelet decompositions for multiclass COVID-19 classification using chest X-ray images.

Özcan H

PubMed · May 29, 2025
Discrete wavelet transforms have been applied in many machine learning models for the analysis of COVID-19; however, little is known about the impact of combined multilevel wavelet decompositions on disease identification. This study proposes a computer-aided diagnosis system that addresses the combined multilevel effects of multiscale radiomic features on multiclass COVID-19 classification using chest X-ray images. A two-level discrete wavelet transform was applied to an optimal region of interest to obtain multiscale decompositions. Both approximation and detail coefficients were extensively investigated in varying frequency bands through 1240 experimental models. High dimensionality in the feature space was managed using a proposed filter- and wrapper-based feature selection approach. A comprehensive comparison was conducted across the bands and features to identify the best-performing ensemble algorithm models. The results indicated that incorporating multilevel decompositions could lead to improved model performance. An inclusive region of interest, encompassing both lungs and the mediastinal regions, was identified to enhance feature representation. The light gradient-boosting machine, applied on combined bands with basic, gray-level, Gabor, histogram-of-oriented-gradients, and local-binary-pattern features, achieved the highest weighted precision, sensitivity, specificity, and accuracy of 97.50%, 97.50%, 98.75%, and 97.50%, respectively. The COVID-19-versus-the-rest receiver operating characteristic area under the curve was 0.9979. These results underscore the potential of combining decomposition levels with the original signals and employing an inclusive region of interest for effective COVID-19 detection, while the feature selection and training processes remain efficient within a practical computational time.
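The subband extraction step can be illustrated with PyWavelets: a two-level 2D decomposition of a region of interest followed by simple first-order statistics per band. The wavelet family, ROI, and feature choice below are assumptions for illustration; the study's full radiomic pipeline (gray-level, Gabor, HOG, and LBP features plus filter- and wrapper-based selection) is much richer.

```python
# Two-level 2D wavelet decomposition with toy per-band features (illustrative only).
import numpy as np
import pywt

roi = np.random.rand(256, 256).astype(np.float32)   # stand-in for a lung-field ROI

coeffs = pywt.wavedec2(roi, wavelet="db2", level=2)  # [cA2, (cH2,cV2,cD2), (cH1,cV1,cD1)]
subbands = {"A2": coeffs[0]}
for lvl, detail in enumerate(coeffs[1:], start=1):
    for name, band in zip(("H", "V", "D"), detail):
        subbands[f"{name}{len(coeffs) - lvl}"] = band

# Simple first-order statistics per band; the study combines such descriptors with
# gray-level, Gabor, HOG and LBP features before feature selection.
features = {k: (float(b.mean()), float(b.std())) for k, b in subbands.items()}
for band, (mean, std) in features.items():
    print(f"{band}: mean={mean:.4f}, std={std:.4f}")
```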

Gaussian random fields as an abstract representation of patient metadata for multimodal medical image segmentation.

Cassidy B, McBride C, Kendrick C, Reeves ND, Pappachan JM, Raad S, Yap MH

PubMed · May 29, 2025
The growing rate of chronic wound occurrence, especially in patients with diabetes, is a concerning trend. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact on patients and clinicians. We present a novel multimodal segmentation method that introduces patient metadata into the training workflow by expressing the patient data as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on different metadata categories. Using the Diabetic Foot Ulcer Challenge 2022 test set, compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908), we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient, respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf.
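A rough sketch of the core idea, under stated assumptions: a categorical patient attribute seeds a correlated (Gaussian-smoothed) noise field that is appended to the wound image as an extra channel. The seeding scheme, field parameters, and array shapes are illustrative, not the released implementation (see the repository linked above for the authors' code).

```python
# Illustrative metadata-to-GRF encoding (smoothed seeded noise as a stand-in).
import numpy as np
from scipy.ndimage import gaussian_filter

def metadata_grf(category_id: int, shape=(256, 256), sigma=16.0) -> np.ndarray:
    """Deterministic correlated random field per metadata category."""
    rng = np.random.default_rng(seed=category_id)
    field = gaussian_filter(rng.standard_normal(shape), sigma=sigma)
    return (field - field.mean()) / (field.std() + 1e-8)   # zero-mean, unit-variance

wound_image = np.random.rand(256, 256, 3)                  # toy RGB wound photograph
grf = metadata_grf(category_id=2)                           # e.g. a hypothetical age-band category
multimodal_input = np.concatenate([wound_image, grf[..., None]], axis=-1)
print(multimodal_input.shape)                               # (256, 256, 4)
```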

Automated classification of midpalatal suture maturation stages from CBCTs using an end-to-end deep learning framework.

Milani OH, Mills L, Nikho A, Tliba M, Allareddy V, Ansari R, Cetin AE, Elnagar MH

PubMed · May 29, 2025
Accurate classification of midpalatal suture maturation stages is critical for orthodontic diagnosis, treatment planning, and the assessment of maxillary growth. Cone Beam Computed Tomography (CBCT) imaging offers detailed insights into this craniofacial structure but poses unique challenges for deep learning image recognition model design due to its high dimensionality, noise artifacts, and variability in image quality. To address these challenges, we propose a novel technique that highlights key image features through a simple filtering process to improve image clarity prior to analysis, thereby enhancing the learning process and better aligning the inputs with the distribution of the data domain. Our preprocessing steps include region-of-interest extraction, followed by high-pass and Sobel filtering to emphasize low-level features. The feature extraction integrates Convolutional Neural Network (CNN) architectures, such as EfficientNet and ResNet18, alongside our novel Multi-Filter Convolutional Residual Attention Network (MFCRAN) enhanced with Discrete Cosine Transform (DCT) layers. Moreover, to better capture the inherent order within the data classes, we augment the supervised training process with a ranking loss that attends to the ordinal relationships within the label domain. Furthermore, to adhere to diagnostic constraints while training the model, we introduce a tailored data augmentation strategy to improve classification accuracy and robustness. To validate our method, we employed a k-fold cross-validation protocol on a private dataset comprising 618 CBCT images, annotated into five stages (A, B, C, D, and E) by expert evaluators. The experimental results demonstrate the effectiveness of our proposed approach, achieving the highest classification accuracy of 79.02%, significantly outperforming competing architectures, which achieved accuracies ranging from 71.87% to 78.05%. This work introduces a novel and fully automated framework for midpalatal suture maturation classification, marking a substantial advancement in orthodontic diagnostics and treatment planning.
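The preprocessing described above (region-of-interest extraction, then high-pass and Sobel filtering) might look roughly like the SciPy sketch below; the crop bounds, filter sigma, and normalization are placeholders, not the paper's settings.

```python
# Hedged sketch of ROI crop + high-pass + Sobel preprocessing on a CBCT slice.
import numpy as np
from scipy import ndimage

def preprocess_slice(slice_2d: np.ndarray) -> np.ndarray:
    roi = slice_2d[96:160, 96:160]                          # crop around the midpalatal region (illustrative bounds)
    highpass = roi - ndimage.gaussian_filter(roi, sigma=3)  # emphasize fine structure
    grad = np.hypot(ndimage.sobel(highpass, axis=0), ndimage.sobel(highpass, axis=1))
    return (grad - grad.min()) / (grad.max() - grad.min() + 1e-8)  # normalized edge map for the CNN

cbct_slice = np.random.rand(256, 256).astype(np.float32)   # toy slice
print(preprocess_slice(cbct_slice).shape)                   # (64, 64)
```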

CT-denoimer: efficient contextual transformer network for low-dose CT denoising.

Zhang Y, Xu F, Zhang R, Guo Y, Wang H, Wei B, Ma F, Meng J, Liu J, Lu H, Chen Y

PubMed · May 29, 2025
Low-dose computed tomography (LDCT) effectively reduces radiation exposure to patients, but introduces severe noise artifacts that affect diagnostic accuracy. Recently, Transformer-based network architectures have been widely applied to LDCT image denoising, generally achieving superior results compared to traditional convolutional methods. However, these methods are often hindered by high computational costs and struggle to capture complex local contextual features, both of which negatively impact denoising performance. In this work, we propose CT-Denoimer, an efficient CT Denoising Transformer network that captures both global correlations and intricate, spatially varying local contextual details in CT images, enabling the generation of high-quality images. The core of our framework is a Transformer module that consists of two key components: the Multi-Dconv head Transposed Attention (MDTA) and the Mixed Contextual Feed-forward Network (MCFN). The MDTA block captures global correlations in the image with linear computational complexity, while the MCFN block manages multi-scale local contextual information, both static and dynamic, through a series of Enhanced Contextual Transformer (eCoT) modules. In addition, we incorporate Operation-Wise Attention Layers (OWALs) to enable collaborative refinement in the proposed CT-Denoimer, enhancing its ability to handle complex and varying noise patterns in LDCT images more effectively. Extensive experimental validation on both the AAPM-Mayo public dataset and a real-world clinical dataset demonstrated the state-of-the-art performance of the proposed CT-Denoimer. It achieved a peak signal-to-noise ratio (PSNR) of 33.681 dB, a structural similarity index measure (SSIM) of 0.921, an information fidelity criterion (IFC) of 2.857, and a visual information fidelity (VIF) of 0.349. Subjective assessment by radiologists gave an average score of 4.39, confirming its clinical applicability and clear advantages over existing methods. This study presents an innovative CT denoising Transformer network that sets a new benchmark in LDCT image denoising, excelling in both noise reduction and fine structure preservation.
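The linear-complexity claim for MDTA comes from computing attention across channels rather than pixels ("transposed" attention). The simplified sketch below shows that core idea in PyTorch; the multi-head and depthwise-convolution details of the actual MDTA block are omitted, and all sizes are assumptions.

```python
# Simplified channel-wise ("transposed") self-attention: cost grows linearly with
# the number of pixels because the attention matrix is C x C, not N x N.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransposedAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.temperature = nn.Parameter(torch.ones(1))
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)
        q = F.normalize(q.flatten(2), dim=-1)            # (B, C, H*W)
        k = F.normalize(k.flatten(2), dim=-1)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.temperature, dim=-1)  # (B, C, C)
        out = (attn @ v.flatten(2)).reshape(b, c, h, w)
        return self.proj(out) + x                        # residual connection

noisy = torch.randn(1, 32, 64, 64)                       # toy LDCT feature map
print(TransposedAttention(32)(noisy).shape)
```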

ADC-MambaNet: A Lightweight U-Shaped Architecture with Mamba and Multi-Dimensional Priority Attention for Medical Image Segmentation.

Nguyen TN, Ho QH, Nguyen VQ, Pham VT, Tran TT

PubMed · May 29, 2025
Medical image segmentation is an increasingly crucial step in assisting disease detection and diagnosis. However, medical images often exhibit complex structures and textures, resulting in the need for highly complex methods. In particular, when deep learning methods are utilized, they often require large-scale pretraining, leading to significant memory demands and increased computational costs. Convolutional Neural Networks (CNNs) have become the backbone of medical image segmentation tasks thanks to their effective feature extraction abilities. However, they often struggle to capture global context due to the limited sizes of their kernels. To address this, various Transformer-based models have been introduced to learn long-range dependencies through self-attention mechanisms, but these architectures typically incur relatively high computational complexity. Methods: To address the aforementioned challenges, we propose a lightweight and computationally efficient model named ADC-MambaNet, which combines conventional depthwise convolutional layers with the Mamba mechanism to address the computational complexity of Transformers. The proposed model introduces a new feature extractor, the Harmonious Mamba-Convolution (HMC) block, together with a Multi-Dimensional Priority Attention (MDPA) block. These blocks enhance the feature extraction process, thereby improving the overall performance of the model. In particular, the mechanisms enable the model to effectively capture local and global patterns from the feature maps while keeping the computational costs low. A novel loss function called the Balanced Normalized Cross Entropy is also introduced, delivering promising performance compared with other losses. Evaluations on five public medical image datasets (ISIC 2018 Lesion Segmentation, PH2, Data Science Bowl 2018, GlaS, and Lung X-ray) demonstrate that ADC-MambaNet achieves higher evaluation scores while maintaining compact parameters and low computational complexity. Conclusion: ADC-MambaNet offers a promising solution for accurate and efficient medical image segmentation, especially in resource-limited or edge-computing environments. Implementation code will be publicly accessible at: https://github.com/nqnguyen812/mambaseg-model.
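The depthwise-separable convolution half of the hybrid block can be sketched as below; the Mamba state-space branch and the paper's exact HMC/MDPA internals are omitted, so this is only an illustration of why such blocks keep parameter counts low.

```python
# Depthwise-separable convolution block (illustrative; the Mamba branch is omitted).
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv captures per-channel local patterns cheaply; the pointwise
    conv then mixes channels, keeping parameter counts low for edge deployment."""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.norm_act = nn.Sequential(nn.BatchNorm2d(out_ch), nn.GELU())

    def forward(self, x):
        return self.norm_act(self.pointwise(self.depthwise(x)))

x = torch.randn(1, 16, 128, 128)                 # toy skin-lesion feature map
print(DepthwiseSeparableConv(16, 32)(x).shape)   # (1, 32, 128, 128)
```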

RNN-AHF Framework: Enhancing Multi-focal Nature of Hypoxic Ischemic Encephalopathy Lesion Region in MRI Image Using Optimized Rough Neural Network Weight and Anti-Homomorphic Filter.

Thangeswari M, Muthucumaraswamy R, Anitha K, Shanker NR

PubMed · May 29, 2025
Image enhancement of the Hypoxic-Ischemic Encephalopathy (HIE) lesion region in neonatal brain MR images is a challenging task due to the diffuse (i.e., multi-focal) nature, small size, and low contrast of the lesions. Classifying the stages of HIE is also difficult because of the unclear boundaries and edges of the lesions, which are dispersed throughout the brain. These unclear boundaries and edges arise from chemical shifts, partial volume artifacts, and motion artifacts; furthermore, voxels may reflect signals from adjacent tissues. Existing algorithms perform poorly in HIE lesion enhancement because of these artifacts, voxel effects, and the diffuse nature of the lesions. In this paper, we propose a Rough Neural Network and Anti-Homomorphic Filter (RNN-AHF) framework for the enhancement of the HIE lesion region. The RNN-AHF framework reduces the pixel dimensionality of the feature space, eliminates unnecessary pixels, and preserves essential pixels for lesion enhancement. The RNN efficiently learns and identifies pixel patterns and facilitates adaptive enhancement based on different weights in the neural network. The proposed RNN-AHF framework operates using optimized neural weights and an optimized training function. The hybridization of optimized weights and the training function enhances the lesion region with high contrast while preserving the boundaries and edges. The proposed RNN-AHF framework achieves a lesion image enhancement and classification accuracy of approximately 93.5%, which is higher than that of traditional algorithms.

RadCLIP: Enhancing Radiologic Image Analysis Through Contrastive Language-Image Pretraining.

Lu Z, Li H, Parikh NA, Dillman JR, He L

PubMed · May 28, 2025
The integration of artificial intelligence (AI) with radiology signifies a transformative era in medicine. Vision foundation models have been adopted to enhance radiologic imaging analysis. However, the inherent complexities of 2D and 3D radiologic data present unique challenges that existing models, which are typically pretrained on general nonmedical images, do not adequately address. To bridge this gap and harness the diagnostic precision required in radiologic imaging, we introduce radiologic contrastive language-image pretraining (RadCLIP): a cross-modal vision-language foundational model that utilizes a vision-language pretraining (VLP) framework to improve radiologic image analysis. Building on the contrastive language-image pretraining (CLIP) approach, RadCLIP incorporates a slice pooling mechanism designed for volumetric image analysis and is pretrained using a large, diverse dataset of radiologic image-text pairs. This pretraining effectively aligns radiologic images with their corresponding text annotations, resulting in a robust vision backbone for radiologic imaging. Extensive experiments demonstrate RadCLIP's superior performance in both unimodal radiologic image classification and cross-modal image-text matching, underscoring its significant promise for enhancing diagnostic accuracy and efficiency in clinical settings. Our key contributions include curating a large dataset featuring diverse radiologic 2D/3D image-text pairs, pretraining RadCLIP as a vision-language foundation model on this dataset, developing a slice pooling adapter with an attention mechanism for integrating 2D images, and conducting comprehensive evaluations of RadCLIP on various radiologic downstream tasks.
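A hedged sketch of what an attention-based slice pooling adapter could look like: per-slice 2D embeddings from a CLIP-style vision backbone are pooled into one volume-level embedding with a learned query. The embedding size, head count, and pooling form are assumptions, not RadCLIP's released code.

```python
# Illustrative attention-based slice pooling for volumetric (3D) inputs.
import torch
import torch.nn as nn

class SlicePoolingAdapter(nn.Module):
    def __init__(self, embed_dim=512):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, embed_dim))   # learned pooling query
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)

    def forward(self, slice_embeddings):                           # (B, n_slices, D)
        q = self.query.expand(slice_embeddings.size(0), -1, -1)
        pooled, weights = self.attn(q, slice_embeddings, slice_embeddings)
        return pooled.squeeze(1), weights                          # (B, D) volume embedding

slices = torch.randn(2, 40, 512)         # 40 axial slices, each already encoded to 512-d
volume_emb, attn_weights = SlicePoolingAdapter()(slices)
print(volume_emb.shape, attn_weights.shape)                        # (2, 512) (2, 1, 40)
```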