A Brain Tumor Segmentation Method Based on CLIP and 3D U-Net with Cross-Modal Semantic Guidance and Multi-Level Feature Fusion

Mingda Zhang

arXiv preprint, Jul 14 2025
Precise segmentation of brain tumors from magnetic resonance imaging (MRI) is essential for neuro-oncology diagnosis and treatment planning. Despite advances in deep learning methods, automatic segmentation remains challenging due to tumor morphological heterogeneity and complex three-dimensional spatial relationships. Current techniques primarily rely on visual features extracted from MRI sequences while underutilizing semantic knowledge embedded in medical reports. This research presents a multi-level fusion architecture that integrates pixel-level, feature-level, and semantic-level information, facilitating comprehensive processing from low-level data to high-level concepts. The semantic-level fusion pathway combines the semantic understanding capabilities of Contrastive Language-Image Pre-training (CLIP) models with the spatial feature extraction advantages of 3D U-Net through three mechanisms: 3D-2D semantic bridging, cross-modal semantic guidance, and semantic-based attention mechanisms. Experimental validation on the BraTS 2020 dataset demonstrates that the proposed model achieves an overall Dice coefficient of 0.8567, representing a 4.8% improvement compared to traditional 3D U-Net, with a 7.3% Dice coefficient increase in the clinically important enhancing tumor (ET) region.
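For readers unfamiliar with the metric reported above, the Dice coefficient measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch only, assuming binary masks flattened to 0/1 sequences; the function name and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example (hypothetical data): 2 shared voxels, 3 foreground voxels in each mask.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
print(dice_coefficient(pred, target))
```

In practice the score is usually computed per tumor sub-region (for example, enhancing tumor) and then averaged; the exact averaging used in the paper is not stated in the abstract.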

A Lightweight and Robust Framework for Real-Time Colorectal Polyp Detection Using LOF-Based Preprocessing and YOLO-v11n

Saadat Behzadi, Danial Sharifrazi, Bita Mesbahzadeh, Javad Hassannataj Joloudarid, Roohallah Alizadehsani

arXiv preprint, Jul 14 2025
Objectives: Timely and accurate detection of colorectal polyps plays a crucial role in diagnosing and preventing colorectal cancer, a major cause of mortality worldwide. This study introduces a new, lightweight, and efficient framework for polyp detection that combines the Local Outlier Factor (LOF) algorithm for filtering noisy data with the YOLO-v11n deep learning model. Study design: An experimental study leveraging deep learning and outlier removal techniques across multiple public datasets. Methods: The proposed approach was tested on five diverse and publicly available datasets: CVC-ColonDB, CVC-ClinicDB, Kvasir-SEG, ETIS, and EndoScene. Since these datasets originally lacked bounding box annotations, we converted their segmentation masks into suitable detection labels. To enhance the robustness and generalizability of our model, we apply 5-fold cross-validation and remove anomalous samples using the LOF method configured with 30 neighbors and a contamination ratio of 5%. Cleaned data are then fed into YOLO-v11n, a fast and resource-efficient object detection architecture optimized for real-time applications. We train the model using a combination of modern augmentation strategies to improve detection accuracy under diverse conditions. Results: Our approach significantly improves polyp localization performance, achieving a precision of 95.83%, recall of 91.85%, F1-score of 93.48%, mAP@0.5 of 96.48%, and mAP@0.5:0.95 of 77.75%. Compared to previous YOLO-based methods, our model demonstrates enhanced accuracy and efficiency. Conclusions: These results suggest that the proposed method is well-suited for real-time colonoscopy support in clinical settings. Overall, the study underscores how crucial data preprocessing and model efficiency are when designing effective AI systems for medical imaging.
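The mask-to-label conversion mentioned in the Methods can be illustrated as follows. This is a rough sketch under stated assumptions, not the authors' pipeline: it assumes a binary mask stored as a list of rows, treats all foreground pixels as a single object, and emits the normalized (x_center, y_center, width, height) tuple commonly used for YOLO-style labels. The function name is hypothetical.

```python
def mask_to_yolo_box(mask):
    """Return (x_center, y_center, width, height) normalized to [0, 1], or None for an empty mask."""
    rows = len(mask)
    cols = len(mask[0])
    # Row and column indices that contain at least one foreground pixel.
    ys = [r for r in range(rows) if any(mask[r])]
    xs = [c for c in range(cols) if any(mask[r][c] for r in range(rows))]
    if not ys:
        return None
    # Pixel edges of the tight box; the max edge is exclusive.
    x_min, x_max = xs[0], xs[-1] + 1
    y_min, y_max = ys[0], ys[-1] + 1
    return (
        (x_min + x_max) / 2 / cols,
        (y_min + y_max) / 2 / rows,
        (x_max - x_min) / cols,
        (y_max - y_min) / rows,
    )


# Toy 4x4 mask (hypothetical data) with a 2x2 foreground square in the centre.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(mask_to_yolo_box(toy_mask))
```

A frame containing several polyps would need one box per connected component, which this sketch does not attempt.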

3D Wavelet Latent Diffusion Model for Whole-Body MR-to-CT Modality Translation

Jiaxu Zheng, Meiman He, Xuhui Tang, Xiong Wang, Tuoyu Cao, Tianyi Zeng, Lichi Zhang, Chenyu You

arXiv preprint, Jul 14 2025
Magnetic Resonance (MR) imaging plays an essential role in contemporary clinical diagnostics. It is increasingly integrated into advanced therapeutic workflows, such as hybrid Positron Emission Tomography/Magnetic Resonance (PET/MR) imaging and MR-only radiation therapy. These integrated approaches are critically dependent on accurate estimation of radiation attenuation, which is typically facilitated by synthesizing Computed Tomography (CT) images from MR scans to generate attenuation maps. However, existing MR-to-CT synthesis methods for whole-body imaging often suffer from poor spatial alignment between the generated CT and input MR images, and insufficient image quality for reliable use in downstream clinical tasks. In this paper, we present a novel 3D Wavelet Latent Diffusion Model (3D-WLDM) that addresses these limitations by performing modality translation in a learned latent space. By incorporating a Wavelet Residual Module into the encoder-decoder architecture, we enhance the capture and reconstruction of fine-scale features across image and latent spaces. To preserve anatomical integrity during the diffusion process, we disentangle structural and modality-specific characteristics and anchor the structural component to prevent warping. We also introduce a Dual Skip Connection Attention mechanism within the diffusion model, enabling the generation of high-resolution CT images with improved representation of bony structures and soft-tissue contrast.

Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS) in Edge Iterative MRI Lesion Localization System (EdgeIMLocSys)

Guohao Huo, Ruiting Dai, Hao Tang

arXiv preprint, Jul 14 2025
Brain tumor segmentation plays a critical role in clinical diagnosis and treatment planning, yet the variability in imaging quality across different MRI scanners presents significant challenges to model generalization. To address this, we propose the Edge Iterative MRI Lesion Localization System (EdgeIMLocSys), which integrates Continuous Learning from Human Feedback to adaptively fine-tune segmentation models based on clinician feedback, thereby enhancing robustness to scanner-specific imaging characteristics. Central to this system is the Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS), which employs a Modality-Aware Adaptive Encoder (M2AE) to extract multi-scale semantic features efficiently, and a Graph-based Multi-Modal Collaborative Interaction Module (G2MCIM) to model complementary cross-modal relationships via graph structures. Additionally, we introduce a novel Voxel Refinement UpSampling Module (VRUM) that synergistically combines linear interpolation and multi-scale transposed convolutions to suppress artifacts while preserving high-frequency details, improving segmentation boundary accuracy. Our proposed GMLN-BTS model achieves a Dice score of 85.1% on the BraTS2017 dataset with only 4.58 million parameters, representing a 98% reduction compared to mainstream 3D Transformer models, and significantly outperforms existing lightweight approaches. This work demonstrates a synergistic breakthrough in achieving high-accuracy, resource-efficient brain tumor segmentation suitable for deployment in resource-constrained clinical environments.

A Clinically-Informed Framework for Evaluating Vision-Language Models in Radiology Report Generation: Taxonomy of Errors and Risk-Aware Metric

Guan, H., Hou, P. C., Hong, P., Wang, L., Zhang, W., Du, X., Zhou, Z., Zhou, L.

medRxiv preprint, Jul 14 2025
Recent advances in vision-language models (VLMs) have enabled automatic radiology report generation, yet current evaluation methods remain limited to general-purpose NLP metrics or coarse classification-based clinical scores. In this study, we propose a clinically informed evaluation framework for VLM-generated radiology reports that goes beyond traditional performance measures. We define a taxonomy of 12 radiology-specific error types, each annotated with clinical risk levels (low, medium, high) in collaboration with physicians. Using this framework, we conduct a comprehensive error analysis of three representative VLMs, i.e., DeepSeek VL2, CXR-LLaVA, and CheXagent, on 685 gold-standard, expert-annotated MIMIC-CXR cases. We further introduce a risk-aware evaluation metric, the Clinical Risk-weighted Error Score for Text-generation (CREST), to quantify safety impact. Our findings reveal critical model vulnerabilities, common error patterns, and condition-specific risk profiles, offering actionable insights for model development and deployment. This work establishes a safety-centric foundation for evaluating and improving medical report generation models. The source code of our evaluation framework, including CREST computation and error taxonomy analysis, is available at https://github.com/guanharry/VLM-CREST.

The Potential of ChatGPT as an Aiding Tool for the Neuroradiologist

Nikola, S., Paz, D.

medRxiv preprint, Jul 14 2025
Purpose: This study aims to explore whether ChatGPT can serve as an assistive tool for neuroradiologists in establishing a reasonable differential diagnosis in central nervous system tumors based on MRI image characteristics. Methods: This retrospective study included 50 patients aged 18-90 who underwent imaging and surgery at the Western Galilee Medical Center. ChatGPT was provided with demographic and radiological information of the patients to generate differential diagnoses. We compared ChatGPT's performance to that of an experienced neuroradiologist, using pathological reports as the gold standard. Quantitative data were described using means and standard deviations, median and range. Qualitative data were described using frequencies and percentages. The level of agreement between examiners (neuroradiologist versus ChatGPT) was assessed using the Fleiss kappa coefficient. A significance value below 5% was considered statistically significant. Statistical analysis was performed using IBM SPSS Statistics, version 27. Results: The results showed that while ChatGPT demonstrated good performance, particularly in identifying common tumors such as glioblastoma and meningioma, its overall accuracy (48%) was lower than that of the neuroradiologist (70%). The AI tool showed moderate agreement with the neuroradiologist (kappa = 0.445) and with pathology results (kappa = 0.419). ChatGPT's performance varied across tumor types, performing better with common tumors but struggling with rarer ones. Conclusion: This study suggests that ChatGPT has the potential to serve as an assistive tool in neuroradiology for establishing a reasonable differential diagnosis in central nervous system tumors based on MRI image characteristics. However, its limitations and potential risks must be considered, and it should therefore be used with caution.

Three-dimensional high-content imaging of unstained soft tissue with subcellular resolution using a laboratory-based multi-modal X-ray microscope

Esposito, M., Astolfo, A., Zhou, Y., Buchanan, I., Teplov, A., Endrizzi, M., Egido Vinogradova, A., Makarova, O., Divan, R., Tang, C.-M., Yagi, Y., Lee, P. D., Walsh, C. L., Ferrara, J. D., Olivo, A.

medRxiv preprint, Jul 14 2025
With increasing interest in studying biological systems across spatial scales--from centimetres down to nanometres--histology continues to be the gold standard for tissue imaging at cellular resolution, providing an essential bridge between macroscopic and nanoscopic analysis. However, its inherently destructive and two-dimensional nature limits its ability to capture the full three-dimensional complexity of tissue architecture. Here we show that phase-contrast X-ray microscopy can enable three-dimensional virtual histology with subcellular resolution. This technique provides direct quantification of electron density without restrictive assumptions, allowing for direct characterisation of cellular nuclei in a standard laboratory setting. By combining high spatial resolution and soft tissue contrast with automated segmentation of cell nuclei, we demonstrate virtual H&E staining using machine learning-based style transfer, yielding volumetric datasets compatible with existing histopathological analysis tools. Furthermore, by integrating electron density and the sensitivity to nanometric features of the dark field contrast channel, we achieve stain-free, high-content imaging capable of distinguishing nuclei and extracellular matrix.

Explainable AI for Precision Oncology: A Task-Specific Approach Using Imaging, Multi-omics, and Clinical Data

Park, Y., Park, S., Bae, E.

medRxiv preprint, Jul 14 2025
Despite continued advances in oncology, cancer remains a leading cause of global mortality, highlighting the need for diagnostic and prognostic tools that are both accurate and interpretable. Unimodal approaches often fail to capture the biological and clinical complexity of tumors. In this study, we present a suite of task-specific AI models that leverage CT imaging, multi-omics profiles, and structured clinical data to address distinct challenges in segmentation, classification, and prognosis. We developed three independent models across large public datasets. Task 1 applied a 3D U-Net to segment pancreatic tumors from CT scans, achieving a Dice Similarity Coefficient (DSC) of 0.7062. Task 2 employed a hierarchical ensemble of omics-based classifiers to distinguish tumor from normal tissue and classify six major cancer types with 98.67% accuracy. Task 3 benchmarked classical machine learning models on clinical data for prognosis prediction across three cancers (LIHC, KIRC, STAD), achieving strong performance (e.g., C-index of 0.820 in KIRC, AUC of 0.978 in LIHC). Across all tasks, explainable AI methods such as SHAP and attention-based visualization enabled transparent interpretation of model outputs. These results demonstrate the value of tailored, modality-aware models and underscore the clinical potential of applying such tailored AI systems for precision oncology.
Technical Foundations:
Segmentation (Task 1): A custom 3D U-Net was trained using the Task07_Pancreas dataset from the Medical Segmentation Decathlon (MSD). CT images were preprocessed with MONAI-based pipelines, resampled to (64, 96, 96) voxels, and intensity-windowed to HU ranges of -100 to 240.
Classification (Task 2): Multi-omics data from TCGA--including gene expression, methylation, miRNA, CNV, and mutation profiles--were log-transformed and normalized. Five modality-specific LightGBM classifiers generated meta-features for a late-fusion ensemble. Stratified 5-fold cross-validation was used for evaluation.
Prognosis (Task 3): Clinical variables from TCGA were curated and imputed (median/mode), with high-missing-rate columns removed. Survival models (e.g., Cox-PH, Random Forest, XGBoost) were trained with early stopping. No omics or imaging data were used in this task.
Interpretability: SHAP values were computed for all tree-based models, and attention-based overlays were used in imaging tasks to visualize salient regions.
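The intensity windowing described for Task 1 can be sketched in a few lines. This is an illustration only, not the authors' MONAI pipeline: it clips Hounsfield units to the stated range of -100 to 240, and the rescaling to [0, 1] afterwards is an assumption, since the abstract does not say how the windowed values were normalized. The function name is hypothetical.

```python
def window_hu(values, lo=-100.0, hi=240.0):
    """Clip Hounsfield units to [lo, hi], then rescale linearly to [0, 1] (rescaling is assumed)."""
    return [(min(max(v, lo), hi) - lo) / (hi - lo) for v in values]


# Toy intensities (hypothetical data): air, window floor, soft tissue, window ceiling, dense bone.
print(window_hu([-1000, -100, 70, 240, 1500]))
```

Values below the window collapse to 0 and values above it to 1, which removes contrast for air and dense bone while preserving the soft-tissue range.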

A Multi-Modal Deep Learning Framework for Predicting PSA Progression-Free Survival in Metastatic Prostate Cancer Using PSMA PET/CT Imaging

Ghaderi, H., Shen, C., Issa, W., Pomper, M. G., Oz, O. K., Zhang, T., Wang, J., Yang, D. X.

medRxiv preprint, Jul 14 2025
PSMA PET/CT imaging has been increasingly utilized in the management of patients with metastatic prostate cancer (mPCa). Imaging biomarkers derived from PSMA PET may provide improved prognostication and prediction of treatment response for mPCa patients. This study investigates a novel deep learning-derived imaging biomarker framework for outcome prediction using multi-modal PSMA PET/CT and clinical features. A single-institution cohort of 99 mPCa patients with 396 lesions was evaluated. Imaging features were extracted from cropped lesion areas and combined with clinical variables including body mass index, ECOG performance status, prostate-specific antigen (PSA) level, Gleason score, and treatments received. The PSA progression-free survival (PFS) model was trained using a ResNet architecture with a Cox proportional hazards loss function using five-fold cross-validation. Performance was assessed using concordance index (C-index) and Kaplan-Meier survival analysis. Among evaluated model architectures, the ResNet-18 backbone offered the best performance. The multi-modal deep learning framework achieved a 5-fold cross-validation C-index ranging from 0.75 to 0.94, outperforming models incorporating imaging only (0.70-0.89) and clinical features only (0.53-0.65). Kaplan-Meier survival analysis performed on the deep learning-derived predictions demonstrated clear risk stratification, with a median PSA PFS of 19.7 months in the high-risk group and 26 months in the low-risk group (P < 0.001). A deep learning-derived imaging biomarker based on PSMA PET/CT can effectively predict PSA PFS for mPCa patients. Further clinical validation in prospective cohorts is warranted.
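As background for the C-index values reported above, the sketch below computes a concordance index over right-censored survival data. It assumes Harrell's pairwise formulation, which the abstract does not confirm as the estimator used; the function name and the toy cohort are hypothetical.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs in which the higher predicted risk progressed first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical data): months to progression, event flag (0 = censored), predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.2]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported ranges of 0.75 to 0.94 (multi-modal) versus 0.53 to 0.65 (clinical features only).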