Page 47 of 61604 results

FreqU-FNet: Frequency-Aware U-Net for Imbalanced Medical Image Segmentation

Ruiqi Xing

arXiv preprint, May 23, 2025
Medical image segmentation faces persistent challenges due to severe class imbalance and the frequency-specific distribution of anatomical structures. Most conventional CNN-based methods operate in the spatial domain and struggle to capture minority class signals, often affected by frequency aliasing and limited spectral selectivity. Transformer-based models, while powerful in modeling global dependencies, tend to overlook critical local details necessary for fine-grained segmentation. To overcome these limitations, we propose FreqU-FNet, a novel U-shaped segmentation architecture operating in the frequency domain. Our framework incorporates a Frequency Encoder that leverages Low-Pass Frequency Convolution and Daubechies wavelet-based downsampling to extract multi-scale spectral features. To reconstruct fine spatial details, we introduce a Spatial Learnable Decoder (SLD) equipped with an adaptive multi-branch upsampling strategy. Furthermore, we design a frequency-aware loss (FAL) function to enhance minority class learning. Extensive experiments on multiple medical segmentation benchmarks demonstrate that FreqU-FNet consistently outperforms both CNN and Transformer baselines, particularly in handling under-represented classes, by effectively exploiting discriminative frequency bands.
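As a rough illustration of what wavelet-based downsampling does, the sketch below applies one level of a 2D Haar transform. Haar is the simplest member of the Daubechies family (db1); the abstract does not say which wavelet order FreqU-FNet uses, and the function name `haar_downsample` is hypothetical. This is not the authors' Frequency Encoder, which also includes learned convolutions.

```python
def haar_downsample(image):
    """One level of a 2D Haar transform on a list of equal-length rows.

    Returns four half-resolution sub-bands: the low-pass approximation,
    the row-difference detail, the column-difference detail, and the
    diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    bands = ([], [], [], [])
    for i in range(0, rows, 2):
        out = ([], [], [], [])
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]          # top row of the 2x2 block
            c, d = image[i + 1][j], image[i + 1][j + 1]  # bottom row of the block
            out[0].append((a + b + c + d) / 2)  # low-pass in both directions
            out[1].append((a + b - c - d) / 2)  # top row minus bottom row
            out[2].append((a - b + c - d) / 2)  # left column minus right column
            out[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip(bands, out):
            band.append(row)
    return bands


approx, detail_rows, detail_cols, detail_diag = haar_downsample(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
)
```

Unlike strided pooling, the transform is orthogonal: the four sub-bands together keep all of the input's information, so a network can halve resolution without discarding high-frequency detail.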

CENet: Context Enhancement Network for Medical Image Segmentation

Afshin Bozorgpour, Sina Ghorbani Kolahi, Reza Azad, Ilker Hacihaliloglu, Dorit Merhof

arXiv preprint, May 23, 2025
Medical image segmentation, particularly in multi-domain scenarios, requires precise preservation of anatomical structures across diverse representations. While deep learning has advanced this field, existing models often struggle with accurate boundary representation, variability in organ morphology, and information loss during downsampling, limiting their accuracy and robustness. To address these challenges, we propose the Context Enhancement Network (CENet), a novel segmentation framework featuring two key innovations. First, the Dual Selective Enhancement Block (DSEB) integrated into skip connections enhances boundary details and improves the detection of smaller organs in a context-aware manner. Second, the Context Feature Attention Module (CFAM) in the decoder employs a multi-scale design to maintain spatial integrity, reduce feature redundancy, and mitigate overly enhanced representations. Extensive evaluations on both radiology and dermoscopic datasets demonstrate that CENet outperforms state-of-the-art (SOTA) methods in multi-organ segmentation and boundary detail preservation, offering a robust and accurate solution for complex medical image analysis tasks. The code is publicly available at https://github.com/xmindflow/cenet.

PDS-UKAN: Subdivision hopping connected to the U-KAN network for medical image segmentation.

Deng L, Wang W, Chen S, Yang X, Huang S, Wang J

PubMed paper, May 23, 2025
Accurate and efficient segmentation of medical images plays a vital role in clinical tasks, such as diagnostic procedures and planning treatments. Traditional U-shaped encoder-decoder architectures, built on convolutional and transformer-based networks, have shown strong performance in medical image processing. However, the simple skip connections commonly used in these networks face limitations, such as insufficient nonlinear modeling capacity, weak global multiscale context modeling, and limited interpretability. To address these challenges, this study proposes the PDS-UKAN network, an innovative subdivision-based U-KAN architecture, designed to improve segmentation accuracy. The PDS-UKAN incorporates a PKAN module, comprising partial convolutions and Kolmogorov-Arnold network layers, into the encoder bottleneck, enhancing the network's nonlinear modeling and interpretability. Additionally, the proposed Dual-Branch Convolutional Boundary Enhancement Module (DBE) focuses on pixel-level boundary refinement, improving edge detail preservation in shallow skip connections. Meanwhile, the Skip Connection Channel Spatial Attention Module (SCCSA) mechanism is applied in the deeper skip connections to strengthen cross-dimensional interactions between channels and spatial features, mitigating the loss of spatial information due to downsampling. Extensive experiments across multiple medical imaging datasets demonstrate that PDS-UKAN consistently achieves superior performance compared to state-of-the-art (SOTA) methods.

How We Won the ISLES'24 Challenge by Preprocessing

Tianyi Ren, Juampablo E. Heras Rivera, Hitender Oswal, Yutong Pan, William Henry, Sophie Walters, Mehmet Kurt

arXiv preprint, May 23, 2025
Stroke is among the top three causes of death worldwide, and accurate identification of stroke lesion boundaries is critical for diagnosis and treatment. Supervised deep learning methods have emerged as the leading solution for stroke lesion segmentation but require large, diverse, and annotated datasets. The ISLES'24 challenge addresses this need by providing longitudinal stroke imaging data, including CT scans taken on arrival to the hospital and follow-up MRI taken 2-9 days from initial arrival, with annotations derived from follow-up MRI. Importantly, models submitted to the ISLES'24 challenge are evaluated using only CT inputs, requiring prediction of lesion progression that may not be visible in CT scans for segmentation. Our winning solution shows that a carefully designed preprocessing pipeline including deep-learning-based skull stripping and custom intensity windowing is beneficial for accurate segmentation. Combined with a standard large residual nnU-Net architecture for segmentation, this approach achieves a mean test Dice of 28.5 with a standard deviation of 21.27.
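Intensity windowing of CT data is commonly implemented as a clip-and-rescale on Hounsfield units. The sketch below shows that generic operation only. The window used in the example (center 40 HU, width 80 HU, a widely used brain window) is an assumption for illustration and is not taken from the authors' pipeline, and `window_ct` is a hypothetical helper name.

```python
def window_ct(hu_values, center, width):
    """Clip Hounsfield units to [center - width/2, center + width/2]
    and rescale the result linearly to the range [0, 1]."""
    if width <= 0:
        raise ValueError("width must be positive")
    low = center - width / 2
    high = center + width / 2
    return [(min(max(v, low), high) - low) / (high - low) for v in hu_values]


# Air (-1000 HU) and bone (1000 HU) saturate at the ends of the window,
# leaving the full output range for soft-tissue contrast.
scaled = window_ct([-1000, 0, 40, 80, 1000], center=40, width=80)
```

Narrowing the window trades dynamic range for contrast inside the tissue of interest, which is one plausible reason such a step can help when lesions are barely visible on CT.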

AMVLM: Alignment-Multiplicity Aware Vision-Language Model for Semi-Supervised Medical Image Segmentation.

Pan Q, Li Z, Qiao W, Lou J, Yang Q, Yang G, Ji B

PubMed paper, May 23, 2025
Low-quality pseudo labels pose a significant obstacle in semi-supervised medical image segmentation (SSMIS), impeding consistency learning on unlabeled data. Leveraging vision-language model (VLM) holds promise in ameliorating pseudo label quality by employing textual prompts to delineate segmentation regions, but it faces the challenge of cross-modal alignment uncertainty due to multiple correspondences (multiple images/texts tend to correspond to one text/image). Existing VLMs address this challenge by modeling semantics as distributions but such distributions lead to semantic degradation. To address these problems, we propose Alignment-Multiplicity Aware Vision-Language Model (AMVLM), a new VLM pre-training paradigm with two novel similarity metric strategies. (i) Cross-modal Similarity Supervision (CSS) proposes a probability distribution transformer to supervise similarity scores across fine-granularity semantics through measuring cross-modal distribution disparities, thus learning cross-modal multiple alignments. (ii) Intra-modal Contrastive Learning (ICL) takes into account the similarity metric of coarse-fine granularity information within each modality to encourage cross-modal semantic consistency. Furthermore, using the pretrained AMVLM, we propose a pioneering text-guided SSMIS network to compensate for the quality deficiencies of pseudo-labels. This network incorporates a text mask generator to produce multimodal supervision information, enhancing pseudo label quality and the model's consistency learning. Extensive experimentation validates the efficacy of our AMVLM-driven SSMIS, showcasing superior performance across four publicly available datasets. The code will be available at: https://github.com/QingtaoPan/AMVLM.

ESR Essentials: a step-by-step guide of segmentation for radiologists-practice recommendations by the European Society of Medical Imaging Informatics.

Chupetlovska K, Akinci D'Antonoli T, Bodalal Z, Abdelatty MA, Erenstein H, Santinha J, Huisman M, Visser JJ, Trebeschi S, Groot Lipman KBW

PubMed paper, May 22, 2025
High-quality segmentation is important for AI-driven radiological research and clinical practice, with the potential to play an even more prominent role in the future. As medical imaging advances, accurately segmenting anatomical and pathological structures is increasingly used to obtain quantitative data and valuable insights. Segmentation and volumetric analysis could enable more precise diagnosis, treatment planning, and patient monitoring. These guidelines aim to improve segmentation accuracy and consistency, allowing for better decision-making in both research and clinical environments. Practical advice on planning and organization is provided, focusing on quality, precision, and communication among clinical teams. Additionally, tips and strategies for improving segmentation practices in radiology and radiation oncology are discussed, as are potential pitfalls to avoid. KEY POINTS: As AI continues to advance, volumetry will become more integrated into clinical practice, making it essential for radiologists to stay informed about its applications in diagnosis and treatment planning. There is a significant lack of practical guidelines and resources tailored specifically for radiologists on technical topics like segmentation and volumetric analysis. Establishing clear rules and best practices for segmentation can streamline volumetric assessment in clinical settings, making it easier to manage and leading to more accurate decision-making for patient care.

Generative adversarial DacFormer network for MRI brain tumor segmentation.

Zhang M, Sun Q, Han Y, Zhang M, Wang W, Zhang J

PubMed paper, May 22, 2025
Current brain tumor segmentation methods often utilize a U-Net architecture based on efficient convolutional neural networks. While effective, these architectures primarily model local dependencies and lack the ability of a pure Transformer to capture global interactions. However, using a pure Transformer directly causes the network to lose local feature information. To address this limitation, we propose the Generative Adversarial Dilated Attention Convolutional Transformer (GDacFormer). GDacFormer enhances interactions between tumor regions while balancing global and local information through the integration of adversarial learning with an improved transformer module. Specifically, GDacFormer leverages a generative adversarial segmentation network to learn richer and more detailed features. It integrates a novel Transformer module, DacFormer, featuring multi-scale dilated attention and a next convolution block. This module, embedded within the generator, aggregates semantic multi-scale information, efficiently reduces the redundancy in the self-attention mechanism, and enhances local feature representations, thus refining the brain tumor segmentation results. GDacFormer achieves Dice values for whole tumor, core tumor, and enhancing tumor segmentation of 90.9%/90.8%/93.7%, 84.6%/85.7%/93.5%, and 77.9%/79.3%/86.3% on BraTS2019-2021 datasets. Extensive evaluations demonstrate the effectiveness and competitiveness of GDacFormer. The code for GDacFormer will be made publicly available at https://github.com/MuqinZ/GDacFormer.

FLAMeS: A Robust Deep Learning Model for Automated Multiple Sclerosis Lesion Segmentation

Dereskewicz, E., La Rosa, F., dos Santos Silva, J., Sizer, E., Kohli, A., Wynen, M., Mullins, W. A., Maggi, P., Levy, S., Onyemeh, K., Ayci, B., Solomon, A. J., Assländer, J., Al-Louzi, O., Reich, D. S., Sumowski, J. F., Beck, E. S.

medRxiv preprint, May 22, 2025
Background and Purpose: Assessment of brain lesions on MRI is crucial for research in multiple sclerosis (MS). Manual segmentation is time consuming and inconsistent. We aimed to develop an automated MS lesion segmentation algorithm for T2-weighted fluid-attenuated inversion recovery (FLAIR) MRI. Methods: We developed FLAIR Lesion Analysis in Multiple Sclerosis (FLAMeS), a deep learning-based MS lesion segmentation algorithm based on the nnU-Net 3D full-resolution U-Net and trained on 668 FLAIR 1.5 and 3 tesla scans from persons with MS. FLAMeS was evaluated on three external datasets: MSSEG-2 (n=14), MSLesSeg (n=51), and a clinical cohort (n=10), and compared to SAMSEG, LST-LPA, and LST-AI. Performance was assessed qualitatively by two blinded experts and quantitatively by comparing automated and ground truth lesion masks using standard segmentation metrics. Results: In a blinded qualitative review of 20 scans, both raters selected FLAMeS as the most accurate segmentation in 15 cases, with one rater favoring FLAMeS in two additional cases. Across all testing datasets, FLAMeS achieved a mean Dice score of 0.74, a true positive rate of 0.84, and an F1 score of 0.78, consistently outperforming the benchmark methods. For other metrics, including positive predictive value, relative volume difference, and false positive rate, FLAMeS performed similarly or better than benchmark methods. Most lesions missed by FLAMeS were smaller than 10 mm³, whereas the benchmark methods missed larger lesions in addition to smaller ones. Conclusions: FLAMeS is an accurate, robust method for MS lesion segmentation that outperforms other publicly available methods.
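For readers unfamiliar with the overlap metrics quoted above, the following minimal sketch computes a voxel-wise Dice score and true positive rate from two flattened binary masks. It is an assumption that a voxel-wise definition applies here: the abstract does not say which of its metrics are voxel-wise and which are lesion-wise. The function name `dice_and_tpr` and the convention of returning 1.0 for empty masks are illustrative choices.

```python
def dice_and_tpr(pred, truth):
    """Voxel-wise Dice and true positive rate for two equal-length
    sequences of 0/1 labels (flattened binary masks)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    # Assumed convention: two empty masks count as perfect agreement.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    tpr = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, tpr


dice, tpr = dice_and_tpr([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In this toy example there are two true positives, one false positive, and one false negative, so Dice is 4/6 and the true positive rate is 2/3.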

SAMba-UNet: Synergizing SAM2 and Mamba in UNet with Heterogeneous Aggregation for Cardiac MRI Segmentation

Guohao Huo, Ruiting Dai, Hao Tang

arXiv preprint, May 22, 2025
To address the challenge of complex pathological feature extraction in automated cardiac MRI segmentation, this study proposes an innovative dual-encoder architecture named SAMba-UNet. The framework achieves cross-modal feature collaborative learning by integrating the vision foundation model SAM2, the state-space model Mamba, and the classical UNet. To mitigate domain discrepancies between medical and natural images, a Dynamic Feature Fusion Refiner is designed, which enhances small lesion feature extraction through multi-scale pooling and a dual-path calibration mechanism across channel and spatial dimensions. Furthermore, a Heterogeneous Omni-Attention Convergence Module (HOACM) is introduced, combining global contextual attention with branch-selective emphasis mechanisms to effectively fuse SAM2's local positional semantics and Mamba's long-range dependency modeling capabilities. Experiments on the ACDC cardiac MRI dataset demonstrate that the proposed model achieves a Dice coefficient of 0.9103 and an HD95 boundary error of 1.0859 mm, significantly outperforming existing methods, particularly in boundary localization for complex pathological structures such as right ventricular anomalies. This work provides an efficient and reliable solution for automated cardiac disease diagnosis, and the code will be open-sourced.

CMRINet: Joint Groupwise Registration and Segmentation for Cardiac Function Quantification from Cine-MRI

Mohamed S. Elmahdy, Marius Staring, Patrick J. H. de Koning, Samer Alabed, Mahan Salehi, Faisal Alandejani, Michael Sharkey, Ziad Aldabbagh, Andrew J. Swift, Rob J. van der Geest

arXiv preprint, May 22, 2025
Accurate and efficient quantification of cardiac function is essential for the estimation of prognosis of cardiovascular diseases (CVDs). One of the most commonly used metrics for evaluating cardiac pumping performance is left ventricular ejection fraction (LVEF). However, LVEF can be affected by factors such as inter-observer variability and varying pre-load and after-load conditions, which can reduce its reproducibility. Additionally, cardiac dysfunction may not always manifest as alterations in LVEF, such as in heart failure and cardiotoxicity diseases. An alternative measure that can provide a relatively load-independent quantitative assessment of myocardial contractility is myocardial strain and strain rate. By using LVEF in combination with myocardial strain, it is possible to obtain a thorough description of cardiac function. Automated estimation of LVEF and other volumetric measures from cine-MRI sequences can be achieved through segmentation models, while strain calculation requires the estimation of tissue displacement between sequential frames, which can be accomplished using registration models. These tasks are often performed separately, potentially limiting the assessment of cardiac function. To address this issue, in this study we propose an end-to-end deep learning (DL) model that jointly estimates groupwise (GW) registration and segmentation for cardiac cine-MRI images. The proposed anatomically-guided Deep GW network was trained and validated on a large dataset of 4-chamber view cine-MRI image series of 374 subjects. A quantitative comparison with conventional GW registration using elastix and two DL-based methods showed that the proposed model improved performance and substantially reduced computation time.
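The ejection fraction mentioned above follows the standard definition LVEF = (EDV - ESV) / EDV × 100, where EDV and ESV are the left ventricular end-diastolic and end-systolic volumes. The sketch below applies that formula to two given volumes. How the volumes are derived from segmentation masks is not described in the abstract and is left out, and the function name `lvef_percent` and the example values are hypothetical.

```python
def lvef_percent(edv_ml, esv_ml):
    """Left ventricular ejection fraction in percent, from end-diastolic
    and end-systolic volumes given in the same unit (e.g. mL)."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return (edv_ml - esv_ml) / edv_ml * 100


# Illustrative volumes only: 120 mL at end-diastole, 50 mL at end-systole.
ef = lvef_percent(120.0, 50.0)
```

Because the result depends only on two segmented volumes, any segmentation error at either cardiac phase propagates directly into the LVEF estimate.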