A Brain Tumor Segmentation Method Based on CLIP and 3D U-Net with Cross-Modal Semantic Guidance and Multi-Level Feature Fusion

Mingda Zhang

arXiv preprint, Jul 14 2025
Precise segmentation of brain tumors from magnetic resonance imaging (MRI) is essential for neuro-oncology diagnosis and treatment planning. Despite advances in deep learning methods, automatic segmentation remains challenging due to tumor morphological heterogeneity and complex three-dimensional spatial relationships. Current techniques primarily rely on visual features extracted from MRI sequences while underutilizing semantic knowledge embedded in medical reports. This research presents a multi-level fusion architecture that integrates pixel-level, feature-level, and semantic-level information, facilitating comprehensive processing from low-level data to high-level concepts. The semantic-level fusion pathway combines the semantic understanding capabilities of Contrastive Language-Image Pre-training (CLIP) models with the spatial feature extraction advantages of 3D U-Net through three mechanisms: 3D-2D semantic bridging, cross-modal semantic guidance, and semantic-based attention mechanisms. Experimental validation on the BraTS 2020 dataset demonstrates that the proposed model achieves an overall Dice coefficient of 0.8567, representing a 4.8% improvement compared to traditional 3D U-Net, with a 7.3% Dice coefficient increase in the clinically important enhancing tumor (ET) region.
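The Dice coefficient reported above measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition for binary masks, not the authors' evaluation code; the function name `dice_coefficient`, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for flattened binary masks.

    `pred` and `target` are equal-length sequences of 0/1 voxel labels
    (a hypothetical representation; real pipelines typically use arrays).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
pred_mask = [1, 1, 0, 0, 1, 0]
ref_mask = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred_mask, ref_mask)  # 2 * 2 / (3 + 3)
```

For multi-region tasks such as BraTS, the same function would be applied once per region (for example, one binary mask for the enhancing tumor) and the per-region scores reported separately or averaged.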

Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS) in Edge Iterative MRI Lesion Localization System (EdgeIMLocSys)

Guohao Huo, Ruiting Dai, Hao Tang

arXiv preprint, Jul 14 2025
Brain tumor segmentation plays a critical role in clinical diagnosis and treatment planning, yet the variability in imaging quality across different MRI scanners presents significant challenges to model generalization. To address this, we propose the Edge Iterative MRI Lesion Localization System (EdgeIMLocSys), which integrates Continuous Learning from Human Feedback to adaptively fine-tune segmentation models based on clinician feedback, thereby enhancing robustness to scanner-specific imaging characteristics. Central to this system is the Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS), which employs a Modality-Aware Adaptive Encoder (M2AE) to extract multi-scale semantic features efficiently, and a Graph-based Multi-Modal Collaborative Interaction Module (G2MCIM) to model complementary cross-modal relationships via graph structures. Additionally, we introduce a novel Voxel Refinement UpSampling Module (VRUM) that synergistically combines linear interpolation and multi-scale transposed convolutions to suppress artifacts while preserving high-frequency details, improving segmentation boundary accuracy. Our proposed GMLN-BTS model achieves a Dice score of 85.1% on the BraTS2017 dataset with only 4.58 million parameters, representing a 98% reduction compared to mainstream 3D Transformer models, and significantly outperforms existing lightweight approaches. This work demonstrates a synergistic breakthrough in achieving high-accuracy, resource-efficient brain tumor segmentation suitable for deployment in resource-constrained clinical environments.

Automated multiclass segmentation of liver vessel structures in CT images using deep learning approaches: a liver surgery pre-planning tool.

Sarkar S, Rahmani M, Farnia P, Ahmadian A, Mozayani N

PubMed paper, Jul 14 2025
Accurate liver vessel segmentation is essential for effective liver surgery pre-planning and for reducing surgical risks, since it enables the precise localization and extensive assessment of complex vessel structures. Manual liver vessel segmentation is a time-intensive process reliant on operator expertise and skill. The complex, tree-like architecture of hepatic and portal veins, which are interwoven and anatomically variable, further complicates this challenge. This study addresses these challenges by proposing the UNETR (U-Net Transformers) architecture for the multi-class segmentation of portal and hepatic veins in liver CT images. UNETR leverages a transformer-based encoder to effectively capture long-range dependencies, overcoming the limitations of convolutional neural networks (CNNs) in handling complex anatomical structures. The proposed method was evaluated on contrast-enhanced CT images from the IRCAD dataset as well as a local dataset collected at a hospital. On the local dataset, the UNETR model achieved Dice coefficients of 49.71% for portal veins, 69.39% for hepatic veins, and 76.74% for overall vessel segmentation, while reaching a Dice coefficient of 62.54% for vessel segmentation on the IRCAD dataset. These results highlight the method's effectiveness in identifying complex vessel structures across diverse datasets. These findings underscore the critical role of advanced architectures and precise annotations in improving segmentation accuracy. This work provides a foundation for future advancements in automated liver surgery pre-planning, with the potential to enhance clinical outcomes significantly. The implementation code is available on GitHub at https://github.com/saharsarkar/Multiclass-Vessel-Segmentation.

Associations of Computerized Tomography-Based Body Composition and Food Insecurity in Bariatric Surgery Patients.

Sizemore JA, Magudia K, He H, Landa K, Bartholomew AJ, Howell TC, Michaels AD, Fong P, Greenberg JA, Wilson L, Palakshappa D, Seymour KA

PubMed paper, Jul 14 2025
Food insecurity (FI) is associated with increased adiposity and obesity-related medical conditions, and body composition can affect metabolic risk. Bariatric surgery effectively treats obesity and metabolic diseases. The association of FI with baseline computerized tomography (CT)-based body composition and bariatric surgery outcomes was investigated in this exploratory study. Fifty-four retrospectively identified adults had undergone bariatric surgery, had a preoperative CT scan from 2017 to 2019, had completed a six-item food security survey, and had body composition measured by bioelectrical impedance analysis (BIA). Skeletal muscle, visceral fat, and subcutaneous fat areas were determined from abdominal CT and normalized to published age, sex, and race reference values. Anthropometric data, related medical conditions, and medications were collected preoperatively and at 6 and 12 months postoperatively. Patients were stratified into food security (FS) or FI based on survey responses. Fourteen (26%) patients were categorized as FI. Patients with FI had lower skeletal muscle area and higher subcutaneous fat area than patients with FS on baseline CT exam (p < 0.05). There was no difference in baseline BIA between patients with FS and FI. The two groups had similar weight loss, reduction in obesity-related medications, and healthcare utilization following bariatric surgery at 6 and 12 months postoperatively. Patients with FI had higher subcutaneous fat and lower skeletal muscle than patients with FS by baseline CT exam, findings which were not detected by BIA. CT analysis enabled by an artificial intelligence workflow offers more precise and detailed body composition data.

Self-supervised Upsampling for Reconstructions with Generalized Enhancement in Photoacoustic Computed Tomography.

Deng K, Luo Y, Zuo H, Chen Y, Gu L, Liu MY, Lan H, Luo J, Ma C

PubMed paper, Jul 14 2025
Photoacoustic computed tomography (PACT) is an emerging hybrid imaging modality with potential applications in biomedicine. A major roadblock to the widespread adoption of PACT is the limited number of detectors, which gives rise to spatial aliasing and manifests as streak artifacts in the reconstructed image. A brute-force solution to the problem is to increase the number of detectors, which, however, is often undesirable due to escalated costs. In this study, we present a novel self-supervised learning approach to overcome this long-standing challenge. We found that small blocks of PACT channel data show similarity at various downsampling rates. Based on this observation, a neural network trained on downsampled data can reliably perform accurate interpolation without requiring densely sampled ground truth data, which is typically unavailable in practice. Our method has undergone validation through numerical simulations, controlled phantom experiments, as well as ex vivo and in vivo animal tests, across multiple PACT systems. We have demonstrated that our technique provides an effective and cost-efficient solution to address the under-sampling issue in PACT, thereby enhancing the capabilities of this imaging technology.

Leveraging Swin Transformer for enhanced diagnosis of Alzheimer's disease using multi-shell diffusion MRI

Quentin Dessain, Nicolas Delinte, Bernard Hanseeuw, Laurence Dricot, Benoît Macq

arXiv preprint, Jul 14 2025
Objective: This study aims to support early diagnosis of Alzheimer's disease and detection of amyloid accumulation by leveraging the microstructural information available in multi-shell diffusion MRI (dMRI) data, using a vision transformer-based deep learning framework. Methods: We present a classification pipeline that employs the Swin Transformer, a hierarchical vision transformer model, on multi-shell dMRI data for the classification of Alzheimer's disease and amyloid presence. Key metrics from DTI and NODDI were extracted and projected onto 2D planes to enable transfer learning with ImageNet-pretrained models. To efficiently adapt the transformer to limited labeled neuroimaging data, we integrated Low-Rank Adaptation. We assessed the framework on diagnostic group prediction (cognitively normal, mild cognitive impairment, Alzheimer's disease dementia) and amyloid status classification. Results: The framework achieved competitive classification results within the scope of multi-shell dMRI-based features, with the best balanced accuracy of 95.2% for distinguishing cognitively normal individuals from those with Alzheimer's disease dementia using NODDI metrics. For amyloid detection, it reached 77.2% balanced accuracy in distinguishing amyloid-positive mild cognitive impairment/Alzheimer's disease dementia subjects from amyloid-negative cognitively normal subjects, and 67.9% for identifying amyloid-positive individuals among cognitively normal subjects. Grad-CAM-based explainability analysis identified clinically relevant brain regions, including the parahippocampal gyrus and hippocampus, as key contributors to model predictions. Conclusion: This study demonstrates the promise of diffusion MRI and transformer-based architectures for early detection of Alzheimer's disease and amyloid pathology, supporting biomarker-driven diagnostics in data-limited biomedical settings.
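The results above are reported as balanced accuracy, which averages the per-class recall so that a small class counts as much as a large one. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `balanced_accuracy` and the toy labels ("CN", "AD") are assumptions made for illustration.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in `y_true`."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = defaultdict(int)  # correctly predicted samples per true class
    count = defaultdict(int)    # total samples per true class
    for truth, guess in zip(y_true, y_pred):
        count[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / count[label] for label in count]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced toy set: 8 cognitively normal, 2 with dementia.
truth = ["CN"] * 8 + ["AD"] * 2
guess = ["CN"] * 8 + ["CN", "AD"]  # one of the two AD cases is missed
plain = sum(t == g for t, g in zip(truth, guess)) / len(truth)
balanced = balanced_accuracy(truth, guess)  # mean of recalls 1.0 and 0.5
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, which is why the latter is often preferred when diagnostic groups differ in size.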

Region Uncertainty Estimation for Medical Image Segmentation with Noisy Labels.

Han K, Wang S, Chen J, Qian C, Lyu C, Ma S, Qiu C, Sheng VS, Huang Q, Liu Z

PubMed paper, Jul 14 2025
The success of deep learning in 3D medical image segmentation hinges on training with a large dataset of fully annotated 3D volumes, which are difficult and time-consuming to acquire. Although recent foundation models (e.g., segment anything model, SAM) can utilize sparse annotations to reduce annotation costs, segmentation tasks involving organs and tissues with blurred boundaries remain challenging. To address this issue, we propose a region uncertainty estimation framework for Computed Tomography (CT) image segmentation using noisy labels. Specifically, we propose a sample-stratified training strategy that stratifies samples according to their varying quality labels, prioritizing confident and fine-grained information at each training stage. This sample-to-voxel level processing enables more reliable supervision information to propagate to noisy label data, thus effectively mitigating the impact of noisy annotations. Moreover, we further design a boundary-guided regional uncertainty estimation module that adapts sample hierarchical training to assist in evaluating sample confidence. Experiments conducted across multiple CT datasets demonstrate the superiority of our proposed method over several competitive approaches under various noise conditions. Our proposed reliable label propagation strategy not only significantly reduces the cost of medical image annotation and robust model training but also improves the segmentation performance in scenarios with imperfect annotations, thus paving the way towards the application of medical segmentation foundation models under low-resource and remote scenarios. Code will be available at https://github.com/KHan-UJS/NoisyLabel.

Advanced U-Net Architectures with CNN Backbones for Automated Lung Cancer Detection and Segmentation in Chest CT Images

Alireza Golkarieh, Kiana Kiashemshaki, Sajjad Rezvani Boroujeni, Nasibeh Asadi Isakan

arXiv preprint, Jul 14 2025
This study investigates the effectiveness of U-Net architectures integrated with various convolutional neural network (CNN) backbones for automated lung cancer detection and segmentation in chest CT images, addressing the critical need for accurate diagnostic tools in clinical settings. A balanced dataset of 832 chest CT images (416 cancerous and 416 non-cancerous) was preprocessed using Contrast Limited Adaptive Histogram Equalization (CLAHE) and resized to 128x128 pixels. U-Net models were developed with three CNN backbones: ResNet50, VGG16, and Xception, to segment lung regions. After segmentation, CNN-based classifiers and hybrid models combining CNN feature extraction with traditional machine learning classifiers (Support Vector Machine, Random Forest, and Gradient Boosting) were evaluated using 5-fold cross-validation. Metrics included accuracy, precision, recall, F1-score, Dice coefficient, and ROC-AUC. U-Net with ResNet50 achieved the best performance for cancerous lungs (Dice: 0.9495, Accuracy: 0.9735), while U-Net with VGG16 performed best for non-cancerous segmentation (Dice: 0.9532, Accuracy: 0.9513). For classification, the CNN model using U-Net with Xception achieved 99.1 percent accuracy, 99.74 percent recall, and 99.42 percent F1-score. The hybrid CNN-SVM-Xception model achieved 96.7 percent accuracy and 97.88 percent F1-score. Compared to prior methods, our framework consistently outperformed existing models. In conclusion, combining U-Net with advanced CNN backbones provides a powerful method for both segmentation and classification of lung cancer in CT scans, supporting early diagnosis and clinical decision-making.
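The 5-fold cross-validation mentioned above can be sketched as follows for a balanced set of 832 labels. This is an illustration only: the abstract does not say whether the folds were stratified by class, so the stratification, the function name `stratified_k_fold`, and the fixed random seed are all assumptions.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs with classes spread evenly.

    Stratification is an assumption here; the source only states that
    5-fold cross-validation was used.
    """
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    # Deal each class's shuffled indices round-robin into k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    # Each fold serves once as the held-out test set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for n, fold in enumerate(folds) if n != i for j in fold)
        yield train_idx, test_idx


# 416 cancerous (1) and 416 non-cancerous (0) images, as in the abstract.
labels = [1] * 416 + [0] * 416
splits = list(stratified_k_fold(labels, k=5))
```

Each of the five splits holds out roughly one fifth of each class, and every image appears in exactly one test fold.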

Multimodal Deep Learning Model Based on Ultrasound and Cytological Images Predicts Risk Stratification of cN0 Papillary Thyroid Carcinoma.

He F, Chen S, Liu X, Yang X, Qin X

PubMed paper, Jul 14 2025
Accurately assessing the risk stratification of cN0 papillary thyroid carcinoma (PTC) preoperatively aids in making treatment decisions. We integrated preoperative ultrasound and cytological images of patients to develop and validate a multimodal deep learning (DL) model for non-invasive assessment of N0 PTC risk stratification before surgery. In this retrospective multicenter group study, we developed a comprehensive DL model based on ultrasound and cytological images. The model was trained and validated on 890 PTC patients undergoing thyroidectomy and lymph node dissection across five medical centers. The testing group included 107 patients from one medical center. We analyzed the model's performance, including the area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity. The combined DL model demonstrated strong performance, with an area under the curve (AUC) of 0.922 (0.866-0.979) in the internal validation group and an AUC of 0.845 (0.794-0.895) in the testing group. The diagnostic performance of the combined DL model surpassed that of clinical models. Image region heatmaps assisted in interpreting the diagnosis of risk stratification. The multimodal DL model based on ultrasound and cytological images can accurately determine the risk stratification of N0 PTC and guide treatment decisions.