
Prompt learning with bounding box constraints for medical image segmentation

Mélanie Gaillochet, Mehrdad Noori, Sahar Dastani, Christian Desrosiers, Hervé Lombaert

arXiv preprint · Jul 3, 2025
Pixel-wise annotations are notoriously laborious and costly to obtain in the medical domain. To mitigate this burden, weakly supervised approaches based on bounding box annotations, which are much easier to acquire, offer a practical alternative. Vision foundation models have recently shown noteworthy segmentation performance when provided with prompts such as points or bounding boxes. Prompt learning exploits these models by adapting them to downstream tasks and automating segmentation, thereby reducing user intervention. However, existing prompt learning approaches depend on fully annotated segmentation masks. This paper proposes a novel framework that combines the representational power of foundation models with the annotation efficiency of weakly supervised segmentation. More specifically, our approach automates prompt generation for foundation models using only bounding box annotations. Our proposed optimization scheme integrates multiple constraints derived from box annotations with pseudo-labels generated by the prompted foundation model. Extensive experiments across multimodal datasets reveal that our weakly supervised method achieves an average Dice score of 84.90% in a limited data setting, outperforming existing fully-supervised and weakly-supervised approaches. The code is available at https://github.com/Minimel/box-prompt-learning-VFM.git
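
For intuition on how box annotations become training signals, the following is a minimal sketch of two constraints commonly derived from bounding boxes in weakly supervised segmentation (emptiness outside the box, tightness inside it). The function name and exact formulation are illustrative assumptions, not taken from the paper or its released code.

```python
import torch

def box_constraint_loss(probs, box_mask, eps=1e-6):
    """probs: (B, H, W) predicted foreground probabilities in [0, 1].
    box_mask: (B, H, W) float mask, 1 inside the annotated box."""
    # Emptiness: no foreground should appear outside the box.
    outside = probs * (1 - box_mask)
    l_out = outside.sum(dim=(1, 2)) / ((1 - box_mask).sum(dim=(1, 2)) + eps)

    # Tightness: a tight box implies the object touches every row and
    # column the box spans, so the max probability along each spanned
    # row/column should approach 1.
    inside = probs * box_mask
    row_max = inside.max(dim=2).values                  # (B, H)
    col_max = inside.max(dim=1).values                  # (B, W)
    rows = (box_mask.sum(dim=2) > 0).float()            # rows the box spans
    cols = (box_mask.sum(dim=1) > 0).float()
    l_in = (((1 - row_max) * rows).sum(dim=1) + ((1 - col_max) * cols).sum(dim=1)) \
           / (rows.sum(dim=1) + cols.sum(dim=1) + eps)
    return (l_out + l_in).mean()
```

In training, a term like this would be combined with a pseudo-label loss from the prompted foundation model, as the abstract describes.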

MedFormer: Hierarchical Medical Vision Transformer with Content-Aware Dual Sparse Selection Attention

Zunhui Xia, Hongxing Li, Libin Lan

arXiv preprint · Jul 3, 2025
Medical image recognition serves as a key way to aid in clinical diagnosis, enabling more accurate and timely identification of diseases and abnormalities. Vision transformer-based approaches have proven effective in handling various medical recognition tasks. However, these methods encounter two primary challenges. First, they are often task-specific and architecture-tailored, limiting their general applicability. Second, they usually either adopt full attention to model long-range dependencies, resulting in high computational costs, or rely on handcrafted sparse attention, potentially leading to suboptimal performance. To tackle these issues, we present MedFormer, an efficient medical vision transformer with two key ideas. First, it employs a pyramid scaling structure as a versatile backbone for various medical image recognition tasks, including image classification and dense prediction tasks such as semantic segmentation and lesion detection. This structure facilitates hierarchical feature representation while reducing the computational load of feature maps, which is highly beneficial for boosting performance. Second, it introduces a novel Dual Sparse Selection Attention (DSSA) with content awareness to improve computational efficiency and robustness against noise while maintaining high performance. As the core building technique of MedFormer, DSSA is explicitly designed to attend to the most relevant content. In addition, a detailed theoretical analysis has been conducted, demonstrating that MedFormer has superior generality and efficiency in comparison to existing medical vision transformers. Extensive experiments on a variety of imaging modality datasets consistently show that MedFormer is highly effective in enhancing performance across all three above-mentioned medical image recognition tasks. The code is available at https://github.com/XiaZunhui/MedFormer.
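
As a rough analogue of content-aware sparse selection, the sketch below implements plain top-k sparse attention in PyTorch. DSSA is dual-stage (region-level then token-level selection) and avoids forming the full score matrix, so treat this as a conceptual simplification only.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=16):
    """q, k, v: (B, N, D). Each query attends only to its top_k keys."""
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)   # (B, N, N)
    vals, idx = scores.topk(top_k, dim=-1)                    # content-aware selection
    attn = F.softmax(vals, dim=-1)                            # (B, N, top_k)
    # Gather the selected values and take the attention-weighted sum.
    idx = idx.unsqueeze(-1).expand(-1, -1, -1, v.shape[-1])   # (B, N, top_k, D)
    v_sel = torch.gather(v.unsqueeze(1).expand(-1, q.shape[1], -1, -1), 2, idx)
    return (attn.unsqueeze(-1) * v_sel).sum(dim=2)            # (B, N, D)
```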

TABNet: A Triplet Augmentation Self-Recovery Framework with Boundary-Aware Pseudo-Labels for Medical Image Segmentation

Peilin Zhang, Shaouxan Wua, Jun Feng, Zhuo Jin, Zhizezhang Gao, Jingkun Chen, Yaqiong Xing, Xiao Zhang

arXiv preprint · Jul 3, 2025
Background and objective: Medical image segmentation is a core task in various clinical applications. However, acquiring large-scale, fully annotated medical image datasets is both time-consuming and costly. Scribble annotations, as a form of sparse labeling, provide an efficient and cost-effective alternative for medical image segmentation. Yet the sparsity of scribble annotations limits feature learning in the target region and provides insufficient boundary supervision, which poses significant challenges for training segmentation networks. Methods: We propose TABNet, a novel weakly-supervised medical image segmentation framework consisting of two key components: the triplet augmentation self-recovery (TAS) module and the boundary-aware pseudo-label supervision (BAP) module. The TAS module enhances feature learning through three complementary augmentation strategies: intensity transformation improves the model's sensitivity to texture and contrast variations, cutout forces the network to capture local anatomical structures by masking key regions, and jigsaw augmentation strengthens the modeling of global anatomical layout by disrupting spatial continuity. By guiding the network to recover complete masks from diverse augmented inputs, TAS promotes a deeper semantic understanding of medical images under sparse supervision. The BAP module enhances pseudo-supervision accuracy and boundary modeling by fusing dual-branch predictions into a loss-weighted pseudo-label and introducing a boundary-aware loss for fine-grained contour refinement. Results: Experimental evaluations on two public datasets, ACDC and MSCMRseg, demonstrate that TABNet significantly outperforms state-of-the-art methods for scribble-based weakly supervised segmentation. Moreover, it achieves performance comparable to that of fully supervised methods.
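
The three TAS augmentation strategies map naturally onto short image transforms. The sketch below is illustrative; parameter values and the assumed intensity range ([0, 1]) are ours, not the paper's.

```python
import torch

def intensity_transform(img, gamma_range=(0.7, 1.5)):
    """Random gamma adjustment: perturbs texture and contrast."""
    gamma = torch.empty(1).uniform_(*gamma_range).item()
    return img.clamp(min=0) ** gamma

def cutout(img, size=32):
    """Zero out a random square so the net must infer local structure."""
    _, h, w = img.shape
    y = torch.randint(0, h - size, (1,)).item()
    x = torch.randint(0, w - size, (1,)).item()
    out = img.clone()
    out[:, y:y + size, x:x + size] = 0
    return out

def jigsaw(img, grid=4):
    """Shuffle a grid of patches to disrupt global spatial layout."""
    c, h, w = img.shape
    ph, pw = h // grid, w // grid
    patches = img.unfold(1, ph, ph).unfold(2, pw, pw)     # (C, g, g, ph, pw)
    patches = patches.reshape(c, grid * grid, ph, pw)[:, torch.randperm(grid * grid)]
    rows = [torch.cat(list(patches[:, i * grid:(i + 1) * grid].unbind(1)), dim=2)
            for i in range(grid)]
    return torch.cat(rows, dim=1)
```

The self-recovery objective then asks the network to predict the complete mask from each augmented view.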

CineMyoPS: Segmenting Myocardial Pathologies from Cine Cardiac MR

Wangbin Ding, Lei Li, Junyi Qiu, Bogen Lin, Mingjing Yang, Liqin Huang, Lianming Wu, Sihan Wang, Xiahai Zhuang

arXiv preprint · Jul 3, 2025
Myocardial infarction (MI) is a leading cause of death worldwide. Late gadolinium enhancement (LGE) and T2-weighted cardiac magnetic resonance (CMR) imaging can respectively identify scarring and edema areas, both of which are essential for MI risk stratification and prognosis assessment. Although combining complementary information from multi-sequence CMR is useful, acquiring these sequences can be time-consuming and prohibitive, e.g., due to the administration of contrast agents. Cine CMR is a rapid and contrast-free imaging technique that can visualize both motion and structural abnormalities of the myocardium induced by acute MI. Therefore, we present a new end-to-end deep neural network, referred to as CineMyoPS, to segment myocardial pathologies, i.e., scars and edema, solely from cine CMR images. Specifically, CineMyoPS extracts both motion and anatomy features associated with MI. Given the interdependence between these features, we design a consistency loss (resembling the co-training strategy) to facilitate their joint learning. Furthermore, we propose a time-series aggregation strategy to integrate MI-related features across the cardiac cycle, thereby enhancing segmentation accuracy for myocardial pathologies. Experimental results on a multi-center dataset demonstrate that CineMyoPS achieves promising performance in myocardial pathology segmentation, motion estimation, and anatomy segmentation.
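
The co-training-style consistency loss can be pictured as a symmetric divergence between the two branches' predictions; the sketch below assumes that form, which may differ from the loss actually used in CineMyoPS.

```python
import torch.nn.functional as F

def consistency_loss(logits_motion, logits_anatomy):
    """Symmetric KL divergence between motion- and anatomy-branch outputs."""
    p = F.log_softmax(logits_motion, dim=1)
    q = F.log_softmax(logits_anatomy, dim=1)
    return 0.5 * (F.kl_div(p, q, reduction="batchmean", log_target=True)
                  + F.kl_div(q, p, reduction="batchmean", log_target=True))
```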

Multi-task machine learning reveals the functional neuroanatomy fingerprint of mental processing

Wang, Z., Chen, Y., Pan, Y., Yan, J., Mao, W., Xiao, Z., Cao, G., Toussaint, P.-J., Guo, W., Zhao, B., Sun, H., Zhang, T., Evans, A. C., Jiang, X.

bioRxiv preprint · Jul 3, 2025
Mental processing delineates the functions of the human mind, encompassing a wide range of motor, sensory, emotional, and cognitive processes, each of which is underlain by neuroanatomical substrates. Identifying an accurate representation of the functional neuroanatomy substrates of mental processing could inform understanding of its neural mechanisms. The challenge is that it is unclear whether a specific mental process possesses a 'functional neuroanatomy fingerprint', i.e., a unique and reliable pattern of functional neuroanatomy that underlies the mental process. To address this question, we utilized a multi-task deep learning model to disentangle the functional neuroanatomy fingerprints of seven different and representative mental processes: Emotion, Gambling, Language, Motor, Relational, Social, and Working Memory. Results based on functional magnetic resonance imaging data from two independent cohorts of 1235 subjects from the US and China consistently show that each of the seven mental processes possesses a functional neuroanatomy fingerprint, represented by a unique set of functional activity weights over whole-brain regions characterizing the degree to which each region is involved in the mental process. The functional neuroanatomy fingerprint of a specific mental process exhibits high discrimination ability (93% classification accuracy and an AUC of 0.99) against those of the other mental processes, and is robust across different datasets and brain atlases. This study provides a solid functional neuroanatomy foundation for investigating the neural mechanisms of mental processing.
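
A fingerprint of this kind can be pictured as one learnable weight per brain region per mental process. The toy readout below (region count and all names are placeholders) illustrates the idea only; the study's actual model is a deeper multi-task network.

```python
import torch
import torch.nn as nn

class FingerprintReadout(nn.Module):
    """Toy multi-task readout: each task's weight row over brain regions
    plays the role of its functional neuroanatomy fingerprint."""
    def __init__(self, n_regions=246, n_tasks=7):
        super().__init__()
        self.region_weights = nn.Parameter(torch.randn(n_tasks, n_regions) * 0.01)
        self.bias = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, region_activity):                 # (B, n_regions)
        return region_activity @ self.region_weights.t() + self.bias

# After training, region_weights[task] would be inspected as that
# task's candidate fingerprint.
```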

Deep neural hashing for content-based medical image retrieval: A survey.

Manna A, Sista R, Sheet D

PubMed paper · Jul 3, 2025
The ever-growing digital repositories of medical data provide opportunities for advanced healthcare by forming the foundation of a digital healthcare ecosystem. Such an ecosystem facilitates digitized solutions for early diagnosis, evidence-based treatment, precision medicine, and more. Content-based medical image retrieval (CBMIR) plays a pivotal role in delivering advanced diagnostic healthcare within such an ecosystem. The concept of deep neural hashing (DNH) is introduced with CBMIR systems to aid in faster and more relevant retrieval from such large repositories. The fusion of DNH with CBMIR is an interesting and rapidly growing area whose potential, impact, and methods have not yet been summarized. This survey attempts to do so through an in-depth exploration of DNH methods for CBMIR. It portrays an end-to-end pipeline for DNH within a CBMIR system, discussing in detail concepts such as the design of the DNH network, diverse learning strategies, different loss functions, and evaluation metrics for retrieval performance. The learning strategies for DNH are further explored by categorizing them, based on the loss function, into pointwise, pairwise, and triplet-wise. Centered on this categorization, various existing methods are discussed in depth, with a focus on the key contributing aspects of each. Finally, the future vision for this field is shared by emphasizing three key aspects: current and immediate areas of research, translating current and near-future research into practical applications, and unexplored research topics for the future. In summary, this survey depicts the current state of research and the future vision of the field of CBMIR systems with DNH.
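
As a concrete instance of the triplet-wise category, here is a compact sketch of a standard triplet hashing objective with a quantization penalty; the surveyed methods vary considerably in their exact formulations.

```python
import torch
import torch.nn.functional as F

def triplet_hashing_loss(h_anchor, h_pos, h_neg, margin=4.0, lam=0.1):
    """h_*: (B, K) real-valued hash-layer outputs before binarization."""
    d_pos = (h_anchor - h_pos).pow(2).sum(dim=1)        # same-class distance
    d_neg = (h_anchor - h_neg).pow(2).sum(dim=1)        # different-class distance
    ranking = F.relu(d_pos - d_neg + margin).mean()     # pull positives, push negatives
    # Quantization penalty: keep outputs near +/-1 so sign(h) loses
    # little information when producing the binary retrieval codes.
    quant = (h_anchor.abs() - 1).pow(2).mean()
    return ranking + lam * quant

# Retrieval then compares binary codes, e.g. codes = torch.sign(h).
```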

Radiology report generation using automatic keyword adaptation, frequency-based multi-label classification and text-to-text large language models.

He Z, Wong ANN, Yoo JS

PubMed paper · Jul 3, 2025
Radiology reports are essential in medical imaging, providing critical insights for diagnosis, treatment, and patient management by bridging the gap between radiologists and referring physicians. However, the manual generation of radiology reports is time-consuming and labor-intensive, leading to inefficiencies and delays in clinical workflows, particularly as case volumes increase. Although deep learning approaches have shown promise in automating radiology report generation, existing methods, particularly those based on the encoder-decoder framework, suffer from significant limitations. These include a lack of explainability, due to the black-box features generated by the encoder, and limited adaptability to diverse clinical settings. In this study, we address these challenges by proposing a novel deep learning framework for radiology report generation that enhances explainability, accuracy, and adaptability. Our approach replaces traditional black-box visual features with transparent keyword lists, improving the interpretability of the feature extraction process. To generate these keyword lists, we apply a multi-label classification technique, further enhanced by an automatic keyword adaptation mechanism. This adaptation dynamically configures the multi-label classification to better fit specific clinical environments, reducing the reliance on manually curated reference keyword lists and improving model adaptability across diverse datasets. We also introduce a frequency-based multi-label classification strategy to address keyword imbalance, ensuring that rare but clinically significant terms are accurately identified. Finally, we leverage a pre-trained text-to-text large language model (LLM) to generate human-like, clinically relevant radiology reports from the extracted keyword lists, ensuring linguistic quality and clinical coherence. We evaluate our method on two public datasets, IU-XRay and MIMIC-CXR, demonstrating superior performance over state-of-the-art methods. Our framework not only improves the accuracy and reliability of radiology report generation but also enhances the explainability of the process, fostering greater trust in and adoption of AI-driven solutions in clinical practice. Comprehensive ablation studies confirm the robustness and effectiveness of each component, highlighting the significant contributions of our framework to advancing automated radiology reporting. In conclusion, we developed a novel deep-learning-based method that prepares high-quality, explainable radiology reports for chest X-ray images using multi-label classification and a text-to-text large language model. Our method addresses the lack of explainability in the current workflow and provides a clear, flexible automated pipeline that reduces the workload of radiologists and supports further applications in human-AI interactive communication.
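
One simple realization of frequency-based multi-label classification is to up-weight rare keywords in the loss. The sketch below assumes inverse-frequency positive weights; the paper's exact weighting scheme may differ.

```python
import torch
import torch.nn as nn

def make_keyword_criterion(label_matrix):
    """label_matrix: (n_reports, n_keywords) binary training labels."""
    freq = label_matrix.float().mean(dim=0).clamp(min=1e-4)  # keyword prevalence
    pos_weight = (1 - freq) / freq         # rare keywords get larger weights
    return nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# logits = keyword_classifier(image)                    # (B, n_keywords)
# loss = make_keyword_criterion(train_labels)(logits, targets)
# Keywords whose sigmoid scores exceed a threshold are then handed, as
# text, to the pre-trained text-to-text LLM that drafts the report.
```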

Transformer attention-based neural network for cognitive score estimation from sMRI data.

Li S, Zhang Y, Zou C, Zhang L, Li F, Liu Q

PubMed paper · Jul 3, 2025
Accurately predicting cognitive scores from structural MRI holds significant clinical value for understanding the pathological stages of dementia and forecasting Alzheimer's disease (AD). Existing deep learning methods often depend on anatomical priors, overlooking individual-specific structural differences during AD progression. To address these limitations, this work proposes a deep neural network that incorporates Transformer attention to jointly predict multiple cognitive scores, including ADAS, CDRSB, and MMSE. The architecture first employs a 3D convolutional neural network backbone to encode the sMRI, capturing preliminary local structural information. An improved Transformer attention block, integrated with 3D positional encoding and a 3D convolutional layer, then adaptively captures discriminative imaging features across the brain, focusing effectively on key cognition-related regions. Finally, an attention-aware regression network enables the joint prediction of multiple clinical scores. Experimental results demonstrate that our method outperforms existing traditional and deep learning methods on the ADNI dataset. Further qualitative analysis reveals that the dementia-related brain regions identified by the model hold important biological significance, effectively enhancing the performance of cognitive score prediction. Our code is publicly available at: https://github.com/lshsx/CTA_MRI.
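
The described pipeline (3D CNN encoder, Transformer attention, joint regression heads) can be outlined in a few lines of PyTorch. The sketch below omits the paper's 3D positional encoding and convolution-augmented attention block, and every layer size is a placeholder.

```python
import torch
import torch.nn as nn

class CognitiveScoreNet(nn.Module):
    def __init__(self, dim=128, n_scores=3):            # ADAS, CDRSB, MMSE
        super().__init__()
        self.backbone = nn.Sequential(                  # local 3D structure
            nn.Conv3d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_scores)            # joint score regression

    def forward(self, vol):                             # vol: (B, 1, D, H, W)
        feat = self.backbone(vol)                       # (B, C, d, h, w)
        tokens = feat.flatten(2).transpose(1, 2)        # (B, d*h*w, C)
        tokens = self.attn(tokens)                      # global attention
        return self.head(tokens.mean(dim=1))            # (B, n_scores)
```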

BrainAGE latent representation clustering is associated with longitudinal disease progression in early-onset Alzheimer's disease.

Manouvriez D, Kuchcinski G, Roca V, Sillaire AR, Bertoux M, Delbeuck X, Pruvo JP, Lecerf S, Pasquier F, Lebouvier T, Lopes R

PubMed paper · Jul 3, 2025
The early-onset Alzheimer's disease (EOAD) population is clinically, genetically, and pathologically heterogeneous. Identifying biomarkers related to disease progression is crucial for advancing clinical trials and improving therapeutic strategies. This study aims to differentiate EOAD patients with varying rates of progression using a Brain Age Gap Estimation (BrainAGE)-based clustering algorithm applied to structural magnetic resonance images (MRI). A retrospective analysis of a longitudinal cohort of 142 participants who met the criteria for early-onset probable Alzheimer's disease was conducted. Participants were assessed clinically, neuropsychologically, and with structural MRI at baseline and annually for 6 years. A BrainAGE deep learning model pre-trained on 3,227 3D T1-weighted MRIs of healthy subjects was used to extract encoded MRI representations at baseline. K-means clustering was then performed on these encoded representations to stratify the population. The resulting clusters were analyzed for disease severity, cognitive phenotype, and brain volumes at baseline and longitudinally. The optimal number of clusters was determined to be 2. Clusters differed significantly in BrainAGE scores (5.44 [±8] years vs 15.25 [±5] years, p < 0.001). The high-BrainAGE cluster was associated with older age (p = 0.001) and a higher proportion of female patients (p = 0.005), as well as greater disease severity based on Mini Mental State Examination (MMSE) scores (19.32 [±4.62] vs 14.14 [±6.93], p < 0.001) and gray matter volume (0.35 [±0.03] vs 0.32 [±0.02], p < 0.001). Longitudinal analyses revealed significant differences in disease progression (MMSE decline of -2.35 [±0.15] pts/year vs -3.02 [±0.25] pts/year, p = 0.02; CDR 1.58 [±0.10] pts/year vs 1.99 [±0.16] pts/year, p = 0.03). K-means clustering of BrainAGE encoded representations thus stratified EOAD patients by rate of disease progression. These findings underscore the potential of BrainAGE as a biomarker for better understanding and managing EOAD.
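
The clustering step itself is straightforward. Below is a sketch assuming scikit-learn's k-means with silhouette-based selection of the cluster count (the study settled on k = 2); the helper name is ours.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_latents(latents, k_range=range(2, 6), seed=0):
    """latents: (n_subjects, n_features) BrainAGE-encoded MRI representations."""
    best = (-1.0, None, None)                           # (score, k, labels)
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(latents)
        score = silhouette_score(latents, labels)       # higher = better separated
        if score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]                             # chosen k and assignments
```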

CT-Mamba: A hybrid convolutional State Space Model for low-dose CT denoising.

Li L, Wei W, Yang L, Zhang W, Dong J, Liu Y, Huang H, Zhao W

PubMed paper · Jul 3, 2025
Low-dose CT (LDCT) significantly reduces the radiation dose received by patients; however, dose reduction introduces additional noise and artifacts. Currently, denoising methods based on convolutional neural networks (CNNs) face limitations in long-range modeling capability, while Transformer-based denoising methods, although capable of powerful long-range modeling, suffer from high computational complexity. Furthermore, the denoised images predicted by deep learning techniques inevitably exhibit differences in noise distribution compared to normal-dose CT (NDCT) images, which can also impact final image quality and diagnostic outcomes. This paper proposes CT-Mamba, a hybrid convolutional State Space Model for LDCT image denoising. The model combines the local feature extraction advantages of CNNs with Mamba's strength in capturing long-range dependencies, enabling it to capture both local details and global context. Additionally, we introduce an innovative spatially coherent Z-shaped scanning scheme to ensure spatial continuity between adjacent pixels in the image. We design a Mamba-driven deep noise power spectrum (NPS) loss function to guide model training, ensuring that the noise texture of denoised LDCT images closely resembles that of NDCT images, thereby enhancing overall image quality and diagnostic value. Experimental results demonstrate that CT-Mamba performs excellently in reducing noise in LDCT images, enhancing detail preservation, and optimizing noise texture distribution, and that its outputs exhibit higher statistical similarity with the radiomics features of NDCT images. The proposed CT-Mamba demonstrates outstanding performance in LDCT denoising and holds promise as a representative approach for applying the Mamba framework to LDCT denoising tasks.
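
One plausible reading of an NPS loss is a distance between batch-averaged 2-D power spectra of the denoised-LDCT and NDCT images; the sketch below encodes that assumption and is not the paper's exact Mamba-driven formulation.

```python
import torch

def nps_loss(denoised, ndct):
    """denoised, ndct: (B, 1, H, W). Match batch-averaged power spectra
    so the comparison targets noise statistics rather than image content."""
    p_pred = torch.fft.fft2(denoised).abs() ** 2        # per-image power spectrum
    p_ref = torch.fft.fft2(ndct).abs() ** 2
    return (p_pred.mean(dim=0) - p_ref.mean(dim=0)).abs().mean()
```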