MLRU++: Multiscale Lightweight Residual UNETR++ with Attention for Efficient 3D Medical Image Segmentation

Nand Kumar Yadav, Rodrigue Rizk, William CW Chen, KC Santosh

arXiv preprint, Jul 22 2025
Accurate and efficient medical image segmentation is crucial but challenging due to anatomical variability and high computational demands on volumetric data. Recent hybrid CNN-Transformer architectures achieve state-of-the-art results but add significant complexity. In this paper, we propose MLRU++, a Multiscale Lightweight Residual UNETR++ architecture designed to balance segmentation accuracy and computational efficiency. It introduces two key innovations: a Lightweight Channel and Bottleneck Attention Module (LCBAM) that enhances contextual feature encoding with minimal overhead, and a Multiscale Bottleneck Block (M2B) in the decoder that captures fine-grained details via multi-resolution feature aggregation. Experiments on four publicly available benchmark datasets (Synapse, BTCV, ACDC, and Decathlon Lung) demonstrate that MLRU++ achieves state-of-the-art performance, with average Dice scores of 87.57% (Synapse), 93.00% (ACDC), and 81.12% (Lung). Compared to existing leading models, MLRU++ improves Dice scores by 5.38% and 2.12% on Synapse and ACDC, respectively, while significantly reducing parameter count and computational cost. Ablation studies evaluating LCBAM and M2B further confirm the effectiveness of the proposed architectural components. Results suggest that MLRU++ offers a practical and high-performing solution for 3D medical image segmentation tasks. Source code is available at: https://github.com/1027865/MLRUPP
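The Dice scores quoted in this abstract measure the overlap between a predicted mask and a reference mask. As a rough illustration only (this is not taken from the MLRU++ repository; the function name `dice_score` and the flat-list mask layout are assumptions made here), the metric can be sketched as:

```python
def dice_score(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for binary masks.

    `pred` and `target` are flat sequences of 0/1 voxel labels
    (a hypothetical layout chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty. Returning 1.0 is a common convention,
        # adopted here as an assumption.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # One voxel overlaps; the masks hold 2 and 1 foreground voxels.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

An average Dice score such as those reported is typically the mean of this quantity over classes and test cases; the exact averaging used by the authors is not stated in the abstract.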

Dyna3DGR: 4D Cardiac Motion Tracking with Dynamic 3D Gaussian Representation

Xueming Fu, Pei Wu, Yingtai Li, Xin Luo, Zihang Jiang, Junhao Mei, Jian Lu, Gao-Jun Teng, S. Kevin Zhou

arXiv preprint, Jul 22 2025
Accurate analysis of cardiac motion is crucial for evaluating cardiac function. While dynamic cardiac magnetic resonance imaging (CMR) can capture detailed tissue motion throughout the cardiac cycle, fine-grained 4D cardiac motion tracking remains challenging due to the homogeneous nature of myocardial tissue and the lack of distinctive features. Existing approaches can be broadly categorized as image-based or representation-based, each with its limitations. Image-based methods, including both traditional and deep learning-based registration approaches, either struggle with topological consistency or rely heavily on extensive training data. Representation-based methods, while promising, often suffer from loss of image-level details. To address these limitations, we propose Dynamic 3D Gaussian Representation (Dyna3DGR), a novel framework that combines explicit 3D Gaussian representation with implicit neural motion field modeling. Our method simultaneously optimizes cardiac structure and motion in a self-supervised manner, eliminating the need for extensive training data or point-to-point correspondences. Through differentiable volumetric rendering, Dyna3DGR efficiently bridges continuous motion representation with image-space alignment while preserving both topological and temporal consistency. Comprehensive evaluations on the ACDC dataset demonstrate that our approach surpasses state-of-the-art deep learning-based diffeomorphic registration methods in tracking accuracy. The code will be available at https://github.com/windrise/Dyna3DGR.

A Benchmark Framework for the Right Atrium Cavity Segmentation From LGE-MRIs.

Bai J, Zhu J, Chen Z, Yang Z, Lu Y, Li L, Li Q, Wang W, Zhang H, Wang K, Gan J, Zhao J, Lu H, Li S, Huang J, Chen X, Zhang X, Xu X, Li L, Tian Y, Campello VM, Lekadir K

PubMed paper, Jul 22 2025
The right atrium (RA) is critical for cardiac hemodynamics but is often overlooked in clinical diagnostics. This study presents a benchmark framework for RA cavity segmentation from late gadolinium-enhanced magnetic resonance imaging (LGE-MRIs), leveraging a two-stage strategy and a novel 3D deep learning network, RASnet. The architecture addresses challenges in class imbalance and anatomical variability by incorporating multi-path input, multi-scale feature fusion modules, Vision Transformers, context interaction mechanisms, and deep supervision. Evaluated on datasets comprising 354 LGE-MRIs, RASnet achieves SOTA performance with a Dice score of 92.19% on a primary dataset and demonstrates robust generalizability on an independent dataset. The proposed framework establishes a benchmark for RA cavity segmentation, enabling accurate and efficient analysis for cardiac imaging applications. Open-source code (https://github.com/zjinw/RAS) and data (https://zenodo.org/records/15524472) are provided to facilitate further research and clinical adoption.

A Hybrid CNN-VSSM model for Multi-View, Multi-Task Mammography Analysis: Robust Diagnosis with Attention-Based Fusion

Yalda Zafari, Roaa Elalfy, Mohamed Mabrok, Somaya Al-Maadeed, Tamer Khattab, Essam A. Rashed

arXiv preprint, Jul 22 2025
Early and accurate interpretation of screening mammograms is essential for effective breast cancer detection, yet it remains a complex challenge due to subtle imaging findings and diagnostic ambiguity. Many existing AI approaches fall short by focusing on single-view inputs or single-task outputs, limiting their clinical utility. To address these limitations, we propose a novel multi-view, multi-task hybrid deep learning framework that processes all four standard mammography views and jointly predicts diagnostic labels and BI-RADS scores for each breast. Our architecture integrates a hybrid CNN-VSSM backbone, combining convolutional encoders for rich local feature extraction with Visual State Space Models (VSSMs) to capture global contextual dependencies. To improve robustness and interpretability, we incorporate a gated attention-based fusion module that dynamically weights information across views, effectively handling cases with missing data. We conduct extensive experiments across diagnostic tasks of varying complexity, benchmarking our proposed hybrid models against baseline CNN architectures and VSSM models in both single-task and multi-task learning settings. Across all tasks, the hybrid models consistently outperform the baselines. In the binary BI-RADS 1 vs. 5 classification task, the shared hybrid model achieves an AUC of 0.9967 and an F1 score of 0.9830. For the more challenging ternary classification, it attains an F1 score of 0.7790, while in the five-class BI-RADS task, the best F1 score reaches 0.4904. These results highlight the effectiveness of the proposed hybrid framework and underscore both the potential and limitations of multi-task learning for improving diagnostic performance and enabling clinically meaningful mammography analysis.
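The abstract describes fusion that weights views dynamically and tolerates missing ones. One plausible reading, sketched below under stated assumptions, is a softmax over per-view gate scores restricted to the views that are present. The function name `gated_fusion`, the use of `None` for a missing view, and the externally supplied scores are all hypothetical; in the paper the gates would be learned, and the actual module may differ.

```python
import math


def gated_fusion(features, scores):
    """Fuse per-view feature vectors with softmax attention weights.

    features: list of equal-length float lists, or None for a missing view.
    scores:   one gate score per view (supplied directly in this sketch;
              a real model would predict them from the features).
    Returns (fused_vector, weights_over_available_views).
    """
    available = [(f, s) for f, s in zip(features, scores) if f is not None]
    if not available:
        raise ValueError("at least one view must be present")
    # Softmax over available views only, so missing views get no weight.
    max_score = max(s for _, s in available)
    exps = [math.exp(s - max_score) for _, s in available]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(available[0][0])
    fused = [
        sum(w * f[i] for w, (f, _) in zip(weights, available))
        for i in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    # Four mammography views, two of them missing.
    views = [[1.0, 0.0], [0.0, 1.0], None, None]
    print(gated_fusion(views, [0.0, 0.0, 5.0, 5.0]))
```

Because the softmax is renormalized over the views that exist, the scores attached to missing views have no influence on the result.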

MAN-GAN: a mask-adaptive normalization based generative adversarial networks for liver multi-phase CT image generation.

Zhao W, Chen W, Fan L, Shang Y, Wang Y, Situ W, Li W, Liu T, Yuan Y, Liu J

PubMed paper, Jul 22 2025
Liver multiphase enhanced computed tomography (MPECT) is vital in clinical practice, but its utility is limited by various factors. We aimed to develop a deep learning network capable of automatically generating MPECT images from standard non-contrast CT scans. Dataset 1 included 374 patients and was divided into three parts: a training set, a validation set and a test set. Dataset 2 included 144 patients with one specific liver disease and was used as an internal test dataset. We further collected another dataset comprising 83 patients for external validation. We then propose a Mask-Adaptive Normalization-based Generative Adversarial Network with Cycle-Consistency Loss (MAN-GAN) to achieve non-contrast CT to MPECT translation. To assess the efficiency of MAN-GAN, we conducted a comparative analysis with state-of-the-art methods commonly employed in diverse medical image synthesis tasks. Moreover, two subjective radiologist evaluation studies were performed to verify the clinical usefulness of the generated images. MAN-GAN outperformed the baseline network and other state-of-the-art methods in generating all three phases. These results were verified on internal and external datasets. According to radiological evaluation, the image quality of the generated images in all three phases is above average. Moreover, the similarities between real images and generated images in all three phases are satisfactory. MAN-GAN demonstrates the feasibility of liver MPECT image translation based on non-contrast images and achieves state-of-the-art performance via the subtraction strategy. It has great potential for solving the dilemma of liver CT contrast scanning and aiding further liver interaction clinical scenarios.

SarAdapter: Prioritizing Attention on Semantic-Aware Representative Tokens for Enhanced Medical Image Segmentation.

Jiang W, Li Y, Liu Z, An L, Quellec G, Ou C

PubMed paper, Jul 22 2025
Transformer-based segmentation methods exhibit considerable potential in medical image analysis. However, their improved performance often comes with increased computational complexity, limiting their application in resource-constrained medical settings. Prior methods follow two independent tracks: (i) accelerating existing networks via semantic-aware routing, and (ii) optimizing token adapter design to enhance network performance. Despite their directness, they encounter unavoidable defects (e.g., inflexible acceleration techniques or non-discriminative processing) that limit further improvement of the quality-complexity trade-off. To address these shortcomings, we integrate these schemes by proposing the semantic-aware adapter (SarAdapter), which employs a semantic-based routing strategy, leveraging neural operators (ViT and CNN) of varying complexities. Specifically, it merges semantically similar tokens into low-resolution regions while preserving semantically distinct tokens as high-resolution regions. Additionally, we introduce a Mixed-adapter unit, which adaptively selects convolutional operators of varying complexities to better model regions at different scales. We evaluate our method on four medical datasets from three modalities and show that it achieves a superior balance between accuracy, model size, and efficiency. Notably, our proposed method achieves state-of-the-art segmentation quality on the Synapse dataset while reducing the number of tokens by 65.6%, signifying a substantial improvement in the efficiency of ViTs for the segmentation task.

EICSeg: Universal Medical Image Segmentation via Explicit In-Context Learning.

Xie S, Zhang L, Niu Z, Ye F, Zhong Q, Xie D, Chen YW, Lin L

PubMed paper, Jul 22 2025
Deep learning models for medical image segmentation often struggle with task-specific characteristics, limiting their generalization to unseen tasks with new anatomies, labels, or modalities. Retraining or fine-tuning these models requires substantial human effort and computational resources. To address this, in-context learning (ICL) has emerged as a promising paradigm, enabling query image segmentation by conditioning on example image-mask pairs provided as prompts. Unlike previous approaches that rely on implicit modeling or non-end-to-end pipelines, we redefine the core interaction mechanism in ICL as an explicit retrieval process, termed E-ICL, benefiting from the emergence of vision foundation models (VFMs). E-ICL captures dense correspondences between queries and prompts at minimal learning cost and leverages them to dynamically weight multi-class prompt masks. Built upon E-ICL, we propose EICSeg, the first end-to-end ICL framework that integrates complementary VFMs for universal medical image segmentation. Specifically, we introduce a lightweight SD-Adapter to bridge the distinct functionalities of the VFMs, enabling more accurate segmentation predictions. To fully exploit the potential of EICSeg, we further design a scalable self-prompt training strategy and an adaptive token-to-image prompt selection mechanism, facilitating both efficient training and inference. EICSeg is trained on 47 datasets covering diverse modalities and segmentation targets. Experiments on nine unseen datasets demonstrate its strong few-shot generalization ability, achieving an average Dice score of 74.0%, outperforming existing in-context and few-shot methods by 4.5%, and reducing the gap to task-specific models to 10.8%. Even with a single prompt, EICSeg achieves a competitive average Dice score of 60.1%. Notably, it performs automatic segmentation without manual prompt engineering, delivering results comparable to interactive models while requiring minimal labeled data.
Source code will be available at https://github.com/zerone-fg/EICSeg.
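The idea of treating in-context segmentation as explicit retrieval (dense query-to-prompt correspondences used to weight multi-class prompt masks) can be illustrated with a toy example. The sketch below is not the EICSeg implementation: the function names, the cosine-similarity measure, the `temperature` parameter, and the per-token label layout are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_labels(query_feats, prompt_feats, prompt_labels,
                    num_classes, temperature=0.1):
    """Label each query token by similarity-weighted voting over prompt tokens.

    query_feats / prompt_feats: lists of feature vectors (one per token).
    prompt_labels: integer class of each prompt token (from the prompt mask).
    """
    predictions = []
    for q in query_feats:
        sims = [cosine(q, p) / temperature for p in prompt_feats]
        max_sim = max(sims)
        exps = [math.exp(s - max_sim) for s in sims]
        norm = sum(exps)
        # Accumulate the softmax correspondence weight per class.
        class_probs = [0.0] * num_classes
        for e, label in zip(exps, prompt_labels):
            class_probs[label] += e / norm
        predictions.append(class_probs.index(max(class_probs)))
    return predictions


if __name__ == "__main__":
    prompts = [[1.0, 0.1], [0.1, 1.0]]
    print(retrieve_labels([[1.0, 0.0], [0.0, 1.0]], prompts, [0, 1], 2))
```

In this toy setting the features are given directly; in the described framework they would come from vision foundation models, which is where the "minimal learning cost" of the retrieval step would arise.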

MedSR-Impact: Transformer-Based Super-Resolution for Lung CT Segmentation, Radiomics, Classification, and Prognosis

Marc Boubnovski Martell, Kristofer Linton-Reid, Mitchell Chen, Sumeet Hindocha, Benjamin Hunter, Marco A. Calzado, Richard Lee, Joram M. Posma, Eric O. Aboagye

arXiv preprint, Jul 21 2025
High-resolution volumetric computed tomography (CT) is essential for accurate diagnosis and treatment planning in thoracic diseases; however, it is limited by radiation dose and hardware costs. We present the Transformer Volumetric Super-Resolution Network (TVSRN-V2), a transformer-based super-resolution (SR) framework designed for practical deployment in clinical lung CT analysis. Built from scalable components, including Through-Plane Attention Blocks (TAB) and Swin Transformer V2, our model effectively reconstructs fine anatomical details in low-dose CT volumes and integrates seamlessly with downstream analysis pipelines. We evaluate its effectiveness on three critical lung cancer tasks (lobe segmentation, radiomics, and prognosis) across multiple clinical cohorts. To enhance robustness across variable acquisition protocols, we introduce pseudo-low-resolution augmentation, simulating scanner diversity without requiring private data. TVSRN-V2 demonstrates a significant improvement in segmentation accuracy (+4% Dice), higher radiomic feature reproducibility, and enhanced predictive performance (+0.06 C-index and AUC). These results indicate that SR-driven recovery of structural detail significantly enhances clinical decision support, positioning TVSRN-V2 as a well-engineered, clinically viable system for dose-efficient imaging and quantitative analysis in real-world CT workflows.
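The abstract does not specify how pseudo-low-resolution volumes are generated. One simple way to imitate a coarser through-plane resolution, shown here purely as an assumed illustration, is to average groups of consecutive slices. The function name `simulate_thick_slices` and the nested-list volume layout are hypothetical.

```python
def simulate_thick_slices(volume, factor):
    """Average every `factor` consecutive slices to mimic thicker slices.

    volume: list of 2D slices, each a list of rows of floats.
    factor: number of thin slices merged into one thick slice.
    Trailing slices that do not fill a complete group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    thick = []
    for start in range(0, len(volume) - factor + 1, factor):
        group = volume[start:start + factor]
        rows, cols = len(group[0]), len(group[0][0])
        thick.append([
            [sum(s[r][c] for s in group) / factor for c in range(cols)]
            for r in range(rows)
        ])
    return thick


if __name__ == "__main__":
    # Four 1x1 slices merged pairwise into two thicker slices.
    thin = [[[0.0]], [[2.0]], [[4.0]], [[6.0]]]
    print(simulate_thick_slices(thin, 2))
```

Varying `factor` during training would be one way to expose a super-resolution model to a range of slice thicknesses; whether the authors do this is not stated in the abstract.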