HiPerformer: A High-Performance Global-Local Segmentation Model with Modular Hierarchical Fusion Strategy

Dayu Tan, Zhenpeng Xu, Yansen Su, Xin Peng, Chunhou Zheng, Weimin Zhong

arxiv logopreprintSep 24 2025
Both local details and global context are crucial in medical image segmentation, and effectively integrating them is essential for achieving high accuracy. However, existing mainstream methods based on CNN-Transformer hybrid architectures typically employ simple feature fusion techniques such as serial stacking, endpoint concatenation, or pointwise addition, which struggle to resolve inconsistencies between features and are prone to information conflict and loss. To address these challenges, we propose HiPerformer. Its encoder employs a novel modular hierarchical architecture that dynamically fuses multi-source features in parallel, enabling layer-wise deep integration of heterogeneous information. The modular hierarchical design not only retains the independent modeling capability of each branch in the encoder but also ensures sufficient information transfer between layers, avoiding the feature degradation and information loss that accompany traditional stacking methods. Furthermore, we design a Local-Global Feature Fusion (LGFF) module to achieve precise and efficient integration of local details and global semantic information, alleviating the feature-inconsistency problem and yielding a more comprehensive feature representation. To further enhance multi-scale feature representation and suppress noise interference, we also propose a Progressive Pyramid Aggregation (PPA) module to replace traditional skip connections. Experiments on eleven public datasets show that the proposed method outperforms existing segmentation techniques in both accuracy and robustness. The code is available at https://github.com/xzphappy/HiPerformer.
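
The abstract does not detail the internals of the LGFF module; purely as an illustration of the general idea it names (combining a local CNN branch with a global attention branch), here is a minimal PyTorch sketch. The module name, gating design, and layer choices are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LocalGlobalFusion(nn.Module):
    """Hypothetical local-global fusion block: a learned sigmoid gate
    decides, per channel and position, how much of the local (CNN)
    branch versus the global (Transformer) branch to keep."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, local_feat, global_feat):
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))
        fused = g * local_feat + (1 - g) * global_feat  # convex combination
        return self.proj(fused)

# Toy usage: fuse two 64-channel feature maps of matching size.
fusion = LocalGlobalFusion(64)
out = fusion(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```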

TCF-Net: A Hierarchical Transformer Convolution Fusion Network for Prostate Cancer Segmentation in Transrectal Ultrasound Images.

Lu X, Zhou Q, Xiao Z, Guo Y, Peng Q, Zhao S, Liu S, Huang J, Yang C, Yuan Y

pubmed logopapersSep 24 2025
Accurate prostate segmentation from transrectal ultrasound (TRUS) images is key to computer-aided diagnosis of prostate cancer. However, this task faces serious challenges, including various interferences, variable prostate shapes, and limited datasets. To address these challenges, a region-adaptive transformer convolution fusion net (TCF-Net) for accurate and robust segmentation of TRUS images is proposed. As a high-performance segmentation network, TCF-Net contains a hierarchical encoder-decoder structure with two main modules: (1) a region-adaptive transformer-based encoder to identify and localize prostate regions, which learns the relationship between objects and pixels and helps the model overcome various interferences and prostate shape variations; and (2) a convolution-based decoder to improve applicability to small datasets. In addition, a patch-based fusion module is proposed to introduce an inductive bias for fine prostate segmentation. TCF-Net is trained and evaluated on a challenging clinical TRUS image dataset collected from the First Affiliated Hospital of Jinan University in China, containing 1000 TRUS images from 135 patients. Experimental results show that the mIoU of TCF-Net is 94.4%, exceeding other state-of-the-art (SOTA) models by more than 1%.
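
The headline 94.4% mIoU can be reproduced from predicted and reference masks in a few lines. The sketch below assumes binary masks and averages IoU over background and prostate classes, which is one common convention; the paper does not state its exact protocol.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 2) -> float:
    """Mean intersection-over-union over classes (here: background, prostate)."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy masks: a predicted square shifted a few pixels from the reference.
pred = np.zeros((256, 256), dtype=int); pred[64:192, 64:192] = 1
gt = np.zeros((256, 256), dtype=int);   gt[70:198, 64:192] = 1
print(f"mIoU = {mean_iou(pred, gt):.3f}")
```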

Artificial Intelligence Chest CT Imaging for the Diagnosis of Tuberculosis-Destroyed Lung with PH.

Yu W, Liu M, Qin W, Liu J, Chen S, Chen Y, Hu B, Chen Y, Liu E, Jin X, Liu S, Li C, Zhu Z

pubmed logopapersSep 24 2025
To explore the clinical characteristics of tuberculosis-destroyed lung (TDL) with pulmonary hypertension (PH) and to apply artificial intelligence (AI) chest CT imaging to the diagnosis of PH in TDL patients, 51 TDL patients were studied. Based on right heart catheterization results, the patients were divided into two groups: TDL with PH (n=31) and TDL without PH (n=20). The original chest CT data were reconstructed, segmented, and rendered using AI, and lung-volume-related data were calculated. Differences in clinical data, hemodynamic data, and lung-volume-related data between the two groups were compared. The proportion of PH was significantly higher in TDL patients than in those without TDL (61.82% vs. 22.64%, P<0.01). There were significant differences between the two groups in pulmonary function, PCWP/PVR, PASP/TRV, and total volume of destroyed lung tissue (V_TDLT) (P<0.05), and V_TDLT was positively correlated with mean pulmonary arterial pressure (mPAP). For the combined diagnosis (V_TDLT + PASP), the area under the ROC curve (AUC) was 0.917 (95% CI: 0.802-1), with a predicted-probability cutoff of 0.51 and a Youden index of 0.789; sensitivity was 90% and specificity 88.9%. TDL accompanied by pulmonary hypertension is associated with restrictive ventilatory disorders. V_TDLT is positively correlated with mPAP; calculating V_TDLT and combining it with echocardiography-estimated PASP assists in the diagnosis of PH in these patients.
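
For readers wanting to reproduce the combined-marker analysis, a hedged scikit-learn sketch follows: a logistic model combines V_TDLT and PASP, AUC comes from the ROC curve, and the Youden index (sensitivity + specificity − 1) picks the operating cutoff. The cohort data here are synthetic placeholders that only mirror the group sizes, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
# Placeholder cohort mirroring the group sizes: 31 PH, 20 non-PH.
y = np.array([1] * 31 + [0] * 20)
X = np.column_stack([
    rng.normal(300 + 150 * y, 80),  # hypothetical V_TDLT (mL), higher with PH
    rng.normal(30 + 20 * y, 8),     # hypothetical PASP (mmHg), higher with PH
])

clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]           # combined-marker probability
fpr, tpr, thresholds = roc_curve(y, scores)
youden = tpr - fpr                            # sensitivity + specificity - 1
best = youden.argmax()
print(f"AUC = {roc_auc_score(y, scores):.3f}, "
      f"cutoff = {thresholds[best]:.2f}, Youden J = {youden[best]:.3f}")
```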

Exploiting Cross-modal Collaboration and Discrepancy for Semi-supervised Ischemic Stroke Lesion Segmentation from Multi-sequence MRI Images.

Cao Y, Qin T, Liu Y

pubmed logopapersSep 23 2025
Accurate ischemic stroke lesion segmentation helps define the optimal reperfusion treatment and unveil the stroke etiology. Although diffusion-weighted MRI (DWI) is central to stroke diagnosis, learning from multi-sequence MRI images such as apparent diffusion coefficient (ADC) maps can capitalize on the complementary nature of information from different modalities and shows strong potential to improve segmentation performance. However, existing deep learning-based methods require large amounts of well-annotated data from multiple modalities for training, and acquiring such datasets is often impractical. We explore semi-supervised stroke lesion segmentation from multi-sequence MRI images, utilizing unlabeled data to improve performance under limited annotation, and propose a novel framework that exploits cross-modal collaboration and discrepancy. Specifically, we adopt a cross-modal bidirectional copy-paste strategy to enable information collaboration between different modalities and a cross-modal discrepancy-informed correction strategy to learn efficiently from limited labeled multi-sequence MRI data and abundant unlabeled data. Extensive experiments on the ischemic stroke lesion segmentation (ISLES 22) dataset demonstrate that our method efficiently utilizes unlabeled data, achieving a 12.32% DSC improvement over a supervised baseline using 10% annotations and outperforming existing semi-supervised segmentation methods.
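
The paper's copy-paste strategy is not specified beyond the abstract; below is a minimal sketch of bidirectional copy-paste between two MRI sequences in the spirit of BCP-style semi-supervised methods. The cube-swapping scheme and all names are assumptions, not the authors' code.

```python
import torch

def bidirectional_copy_paste(vol_a: torch.Tensor, vol_b: torch.Tensor,
                             cube: int = 32):
    """Swap a random 3D cube between two volumes (shape: C, D, H, W).
    Returns two mixed volumes: a-with-b-patch and b-with-a-patch."""
    _, D, H, W = vol_a.shape
    z = torch.randint(0, D - cube + 1, (1,)).item()
    y = torch.randint(0, H - cube + 1, (1,)).item()
    x = torch.randint(0, W - cube + 1, (1,)).item()
    mixed_a, mixed_b = vol_a.clone(), vol_b.clone()
    mixed_a[:, z:z+cube, y:y+cube, x:x+cube] = vol_b[:, z:z+cube, y:y+cube, x:x+cube]
    mixed_b[:, z:z+cube, y:y+cube, x:x+cube] = vol_a[:, z:z+cube, y:y+cube, x:x+cube]
    return mixed_a, mixed_b

# Toy usage with placeholder DWI and ADC volumes.
dwi, adc = torch.randn(1, 64, 64, 64), torch.randn(1, 64, 64, 64)
mixed_dwi, mixed_adc = bidirectional_copy_paste(dwi, adc)
```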

3D CoAt U SegNet-enhanced deep learning framework for accurate segmentation of acute ischemic stroke lesions from non-contrast CT scans.

Nag MK, Sadhu AK, Das S, Kumar C, Choudhary S

pubmed logopapersSep 23 2025
Segmenting ischemic stroke lesions from Non-Contrast CT (NCCT) scans is a complex task due to the hypo-intense nature of these lesions compared to surrounding healthy brain tissue and their iso-intensity with lateral ventricles in many cases. Identifying early acute ischemic stroke lesions in NCCT remains particularly challenging. Computer-assisted detection and segmentation can serve as valuable tools to support clinicians in stroke diagnosis. This paper introduces CoAt U SegNet, a novel deep learning model designed to detect and segment acute ischemic stroke lesions from NCCT scans. Unlike conventional 3D segmentation models, this study presents an advanced 3D deep learning approach to enhance delineation accuracy. Traditional machine learning models have struggled to achieve satisfactory segmentation performance, highlighting the need for more sophisticated techniques. For model training, 50 NCCT scans were used, with 10 scans for validation and 500 scans for testing. The encoder convolution blocks incorporated dilation rates of 1, 3, and 5 to capture multi-scale features effectively. Performance evaluation on 500 unseen NCCT scans yielded a Dice similarity score of 75% and a Jaccard index of 70%, demonstrating notable improvement in segmentation accuracy. An enhanced similarity index was employed to refine lesion segmentation, which can further aid in distinguishing the penumbra from the core infarct area, contributing to improved clinical decision-making.
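
The abstract states that the encoder convolution blocks use dilation rates of 1, 3, and 5 to capture multi-scale features; a plausible PyTorch rendering of such a block is sketched below. The parallel-branch layout and normalization choices are assumptions, since the paper's exact block structure is not given here.

```python
import torch
import torch.nn as nn

class MultiDilation3DBlock(nn.Module):
    """Parallel 3D convolutions at dilation rates 1, 3, 5, concatenated
    and projected back to the target channel count."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d)
            for d in (1, 3, 5)  # padding == dilation keeps spatial size fixed
        ])
        self.fuse = nn.Sequential(
            nn.Conv3d(3 * out_ch, out_ch, kernel_size=1),
            nn.InstanceNorm3d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

block = MultiDilation3DBlock(1, 16)
print(block(torch.randn(1, 1, 32, 64, 64)).shape)  # torch.Size([1, 16, 32, 64, 64])
```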

Refining the Classroom: The Self-Supervised Professor Model for Improved Segmentation of Locally Advanced Pancreatic Ductal Adenocarcinoma.

Bereska JI, Palic S, Bereska LF, Gavves E, Nio CY, Kop MPM, Struik F, Daams F, van Dam MA, Dijkhuis T, Besselink MG, Marquering HA, Stoker J, Verpalen IM

pubmed logopapersSep 23 2025
Pancreatic ductal adenocarcinoma (PDAC) is a leading cause of cancer-related deaths, with accurate staging being critical for treatment planning. Automated 3D segmentation models can aid in staging, but segmenting PDAC, especially in cases of locally advanced pancreatic cancer (LAPC), is challenging due to the tumor's heterogeneous appearance, irregular shapes, and extensive infiltration. This study developed and evaluated a tripartite self-supervised learning architecture for improved 3D segmentation of LAPC, addressing the challenges of heterogeneous appearance, irregular shapes, and extensive infiltration in PDAC. We implemented a tripartite architecture consisting of a teacher model, a professor model, and a student model. The teacher model, trained on manually segmented CT scans, generated initial pseudo-segmentations. The professor model refined these segmentations, which were then used to train the student model. We utilized 1115 CT scans from 903 patients for training. Three expert abdominal radiologists manually segmented 30 CT scans from 27 patients with LAPC, serving as reference standards. We evaluated the performance using Dice, Hausdorff distance (HD95), and mean surface distance (MSD). The teacher, professor, and student models achieved average Dice scores of 0.60, 0.73, and 0.75, respectively, with significant boundary accuracy improvements (teacher HD95/MSD, 25.71/5.96 mm; professor, 9.68/1.96 mm; student, 4.79/1.34 mm). Our findings demonstrate that the professor model significantly enhances segmentation accuracy for LAPC (p < 0.01). Both the professor and student models offer substantial improvements over previous work. The introduced tripartite self-supervised learning architecture shows promise for improving automated 3D segmentation of LAPC, potentially aiding in more accurate staging and treatment planning.
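
The teacher-to-professor-to-student flow described above can be summarized schematically; the sketch below is abstract-level only, with all model and loss objects as placeholders (including the assumption that the professor takes the teacher's pseudo-masks as a second input), not the authors' code.

```python
import torch

@torch.no_grad()
def refine_pseudo_labels(teacher, professor, unlabeled_scans):
    """Teacher produces initial pseudo-masks; professor refines them.
    Both models are placeholders for the paper's networks."""
    pseudo = teacher(unlabeled_scans).argmax(dim=1)
    return professor(unlabeled_scans, pseudo).argmax(dim=1)

def student_step(student, optimizer, seg_loss, scans, refined_masks):
    """One student update on professor-refined masks (the supervised
    loss on the manually labeled set is omitted for brevity)."""
    optimizer.zero_grad()
    loss = seg_loss(student(scans), refined_masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```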

MOIS-SAM2: Exemplar-based Segment Anything Model 2 for multilesion interactive segmentation of neurofibromas in whole-body MRI

Georgii Kolokolnikov, Marie-Lena Schmalhofer, Sophie Goetz, Lennart Well, Said Farschtschi, Victor-Felix Mautner, Inka Ristow, Rene Werner

arxiv logopreprintSep 23 2025
Background and Objectives: Neurofibromatosis type 1 is a genetic disorder characterized by the development of numerous neurofibromas (NFs) throughout the body. Whole-body MRI (WB-MRI) is the clinical standard for detection and longitudinal surveillance of NF tumor growth. Existing interactive segmentation methods fail to combine high lesion-wise precision with scalability to hundreds of lesions. This study proposes a novel interactive segmentation model tailored to this challenge. Methods: We introduce MOIS-SAM2, a multi-object interactive segmentation model that extends the state-of-the-art, transformer-based, promptable Segment Anything Model 2 (SAM2) with exemplar-based semantic propagation. MOIS-SAM2 was trained and evaluated on 119 WB-MRI scans from 84 NF1 patients acquired using T2-weighted fat-suppressed sequences. The dataset was split at the patient level into a training set and four test sets (one in-domain and three reflecting different domain shift scenarios, e.g., MRI field strength variation, low tumor burden, differences in clinical site and scanner vendor). Results: On the in-domain test set, MOIS-SAM2 achieved a scan-wise DSC of 0.60 against expert manual annotations, outperforming baseline 3D nnU-Net (DSC: 0.54) and SAM2 (DSC: 0.35). Performance of the proposed model was maintained under MRI field strength shift (DSC: 0.53) and scanner vendor variation (DSC: 0.50), and improved in low tumor burden cases (DSC: 0.61). Lesion detection F1 scores ranged from 0.62 to 0.78 across test sets. Preliminary inter-reader variability analysis showed model-to-expert agreement (DSC: 0.62-0.68), comparable to inter-expert agreement (DSC: 0.57-0.69). Conclusions: The proposed MOIS-SAM2 enables efficient and scalable interactive segmentation of NFs in WB-MRI with minimal user input and strong generalization, supporting integration into clinical workflows.
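
The lesion detection F1 scores reported above depend on how predicted lesions are matched to reference lesions; the paper's criterion is not stated in the abstract. A hedged sketch of one common convention, greedy one-to-one matching of connected components by IoU, follows.

```python
import numpy as np
from scipy import ndimage

def lesion_f1(pred: np.ndarray, target: np.ndarray, min_iou: float = 0.1) -> float:
    """Lesion-wise detection F1: connected components in the predicted and
    reference masks are greedily matched when their IoU exceeds a threshold.
    The matching rule and threshold here are assumptions."""
    pred_lab, n_pred = ndimage.label(pred)
    tgt_lab, n_tgt = ndimage.label(target)
    matched_pred, tp = set(), 0
    for i in range(1, n_tgt + 1):
        t = tgt_lab == i
        for j in range(1, n_pred + 1):
            if j in matched_pred:
                continue
            p = pred_lab == j
            iou = np.logical_and(p, t).sum() / np.logical_or(p, t).sum()
            if iou >= min_iou:
                tp += 1
                matched_pred.add(j)
                break
    fp, fn = n_pred - len(matched_pred), n_tgt - tp
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

# Toy masks: reference has two lesions, prediction finds one of them.
a = np.zeros((64, 64), bool); a[5:15, 5:15] = True; a[40:50, 40:50] = True
b = np.zeros((64, 64), bool); b[6:16, 5:15] = True
print(f"F1 = {lesion_f1(b, a):.2f}")  # 2*1 / (2*1 + 0 + 1) = 0.67
```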

Enhancing AI-based decision support system with automatic brain tumor segmentation for EGFR mutation classification.

Gökmen N, Kocadağlı O, Cevik S, Aktan C, Eghbali R, Liu C

pubmed logopapersSep 23 2025
Glioblastoma (GBM) carries a poor prognosis; epidermal growth factor receptor (EGFR) mutations further shorten survival. We propose a fully automated MRI-based decision-support system (DSS) that segments GBM and classifies EGFR status, reducing reliance on invasive biopsy. The segmentation module (UNet SI) fuses multiresolution, entropy-ranked shearlet features with CNN features, preserving fine detail through identity long-skip connections, yielding a lightweight 1.9M-parameter network. Tumour masks are fed to an Inception-ResNet-v2 classifier via a 512-D bottleneck. The pipeline was five-fold cross-validated on 98 contrast-enhanced T1-weighted scans (Memorial Hospital; Ethics 24.12.2021/008) and externally validated on BraTS 2019. On the Memorial cohort, UNet SI achieved Dice 0.873, Jaccard 0.853, SSIM 0.992, and HD95 24.19 mm. EGFR classification reached accuracy 0.960, precision 1.000, recall 0.871, and AUC 0.94, surpassing published state-of-the-art results. Inference time is ≤ 0.18 s per slice on a 4 GB GPU. By combining shearlet-enhanced segmentation with streamlined classification, the DSS delivers superior EGFR prediction and is suitable for integration into routine clinical workflows.
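
The phrase "entropy-ranked shearlet features" suggests selecting the most informative sub-band responses by Shannon entropy. A hedged sketch of entropy ranking over feature maps follows; the shearlet transform itself is replaced here by placeholder feature maps, since the paper's exact transform settings are not given in the abstract.

```python
import numpy as np

def entropy_rank(feature_maps: np.ndarray, k: int, bins: int = 64) -> np.ndarray:
    """Rank feature maps (N, H, W) by the Shannon entropy of their
    intensity histograms; return indices of the k most informative maps."""
    entropies = []
    for fmap in feature_maps:
        hist, _ = np.histogram(fmap, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        entropies.append(-(p * np.log2(p)).sum())
    return np.argsort(entropies)[::-1][:k]  # descending entropy, top k

# Placeholder stand-ins for shearlet sub-band responses of one slice.
subbands = np.random.rand(16, 128, 128)
print("Selected sub-bands:", entropy_rank(subbands, k=4))
```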

Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning

Guoxin Wang, Jun Zhao, Xinyi Liu, Yanbo Liu, Xuyang Cao, Chao Li, Zhuoyun Liu, Qintian Sun, Fangru Zhou, Haoqiang Xing, Zhenhong Yang

arxiv logopreprintSep 23 2025
Medical imaging provides critical evidence for clinical diagnosis, treatment planning, and surgical decisions, yet most existing imaging models are narrowly focused and require multiple specialized networks, limiting their generalization. Although large-scale language and multimodal models exhibit strong reasoning and multi-task capabilities, real-world clinical applications demand precise visual grounding, multimodal integration, and chain-of-thought reasoning. We introduce Citrus-V, a multimodal medical foundation model that combines image analysis with textual reasoning. The model integrates detection, segmentation, and multimodal chain-of-thought reasoning, enabling pixel-level lesion localization, structured report generation, and physician-like diagnostic inference in a single framework. We propose a novel multimodal training approach and release a curated open-source data suite covering reasoning, detection, segmentation, and document understanding tasks. Evaluations demonstrate that Citrus-V outperforms existing open-source medical models and expert-level imaging systems across multiple benchmarks, delivering a unified pipeline from visual grounding to clinical reasoning and supporting precise lesion quantification, automated reporting, and reliable second opinions.