
RAU: Reference-based Anatomical Understanding with Vision Language Models

Yiwei Li, Yikang Liu, Jiaqi Guo, Lin Zhao, Zheyuan Zhang, Xiao Chen, Boris Mailhe, Ankush Mukherjee, Terrence Chen, Shanhui Sun

arXiv preprint · Sep 26, 2025
Anatomical understanding through deep learning is critical for automatic report generation, intra-operative navigation, and organ localization in medical imaging; however, its progress is constrained by the scarcity of expert-labeled data. A promising remedy is to leverage an annotated reference image to guide the interpretation of an unlabeled target. Although recent vision-language models (VLMs) exhibit non-trivial visual reasoning, their reference-based understanding and fine-grained localization remain limited. We introduce RAU, a framework for reference-based anatomical understanding with VLMs. We first show that a VLM learns to identify anatomical regions through relative spatial reasoning between reference and target images, trained on a moderately sized dataset. We validate this capability through visual question answering (VQA) and bounding box prediction. Next, we demonstrate that the VLM-derived spatial cues can be seamlessly integrated with the fine-grained segmentation capability of SAM2, enabling localization and pixel-level segmentation of small anatomical regions, such as vessel segments. Across two in-distribution and two out-of-distribution datasets, RAU consistently outperforms a SAM2 fine-tuning baseline using the same memory setup, yielding more accurate segmentations and more reliable localization. More importantly, its strong generalization ability makes it scalable to out-of-distribution datasets, a property crucial for medical image applications. To the best of our knowledge, RAU is the first to explore the capability of VLMs for reference-based identification, localization, and segmentation of anatomical structures in medical images. Its promising performance highlights the potential of VLM-driven approaches for anatomical understanding in automated clinical workflows.
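The pipeline described above hands VLM-derived spatial cues to SAM2 for pixel-level delineation. Below is a minimal sketch of that hand-off, assuming the public `sam2` package's `SAM2ImagePredictor` interface and treating the reference-guided VLM box prediction as an input supplied by the caller; it illustrates the idea rather than the authors' implementation.

```python
# Sketch: turn a VLM-predicted bounding box into a pixel-level mask with SAM2.
# Assumes the `sam2` package's SAM2ImagePredictor API; the box itself would come
# from the reference-based VLM described in the abstract.
import numpy as np
from sam2.sam2_image_predictor import SAM2ImagePredictor

def segment_from_vlm_box(target_image: np.ndarray, box_xyxy) -> np.ndarray:
    predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
    predictor.set_image(target_image)                    # HWC uint8 target image
    masks, scores, _ = predictor.predict(
        box=np.asarray(box_xyxy, dtype=np.float32),      # [x0, y0, x1, y1] from the VLM
        multimask_output=False,
    )
    return masks[0]                                      # binary HxW mask
```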

Deep Learning-Based Cross-Anatomy CT Synthesis Using Adapted nnResU-Net with Anatomical Feature Prioritized Loss

Javier Sequeiro González, Arthur Longuefosse, Miguel Díaz Benito, Álvaro García Martín, Fabien Baldacci

arXiv preprint · Sep 26, 2025
We present a patch-based 3D nnUNet adaptation for MR-to-CT and CBCT-to-CT image translation using the multicenter SynthRAD2025 dataset, covering head and neck (HN), thorax (TH), and abdomen (AB) regions. Our approach leverages two main network configurations: a standard UNet and a residual UNet, both adapted from nnUNet for image synthesis. We introduce the Anatomical Feature-Prioritized (AFP) loss, which compares multilayer features extracted from a compact segmentation network trained on TotalSegmentator labels, enhancing reconstruction of clinically relevant structures. Input volumes were normalized per case using z-score normalization for MRI, and clipping plus dataset-level z-score normalization for CBCT and CT. Training used 3D patches tailored to each anatomical region without additional data augmentation. Models were trained for 1000 and 1500 epochs, with AFP fine-tuning performed for 500 epochs using a combined L1+AFP objective. During inference, overlapping patches were aggregated via mean averaging with a step size of 0.3, and postprocessing included reverse z-score normalization. Both network configurations were applied across all regions, allowing a consistent model design while capturing local adaptations through residual learning and the AFP loss. Qualitative and quantitative evaluation revealed that residual networks combined with AFP yielded sharper reconstructions and improved anatomical fidelity, particularly for bone structures in MR-to-CT and lesions in CBCT-to-CT, while L1-only networks achieved slightly better intensity-based metrics. This methodology provides a stable solution for cross-modality medical image synthesis, demonstrating the effectiveness of combining the automatic nnUNet pipeline with residual learning and anatomically guided feature losses.
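As a rough illustration of the anatomically guided objective described above, the sketch below implements an AFP-style loss as an L1 distance between multi-layer features of a frozen compact segmentation network applied to synthetic and reference CT volumes; the feature extractor, layer selection, and weighting are assumptions rather than the paper's exact configuration.

```python
# Sketch of an Anatomical Feature-Prioritized (AFP) style loss: L1 distance between
# intermediate features of a frozen segmentation network (assumed here to return a
# list of feature maps) on synthetic vs. reference CT.
import torch
import torch.nn as nn

class AFPLoss(nn.Module):
    def __init__(self, feature_extractor: nn.Module, weight: float = 1.0):
        super().__init__()
        self.net = feature_extractor.eval()
        for p in self.net.parameters():                 # segmentation network stays frozen
            p.requires_grad_(False)
        self.weight = weight

    def forward(self, synth_ct: torch.Tensor, real_ct: torch.Tensor) -> torch.Tensor:
        feats_synth = self.net(synth_ct)                # list of multi-layer feature maps
        feats_real = self.net(real_ct)
        loss = sum(torch.mean(torch.abs(s - r)) for s, r in zip(feats_synth, feats_real))
        return self.weight * loss / len(feats_synth)

# Combined fine-tuning objective, per the abstract: total = L1(synth, real) + AFP(synth, real)
```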

COVID-19 Pneumonia Diagnosis Using Medical Images: Deep Learning-Based Transfer Learning Approach.

Dharmik A

PubMed paper · Sep 26, 2025
SARS-CoV-2, the causative agent of COVID-19, remains a global health concern due to its high transmissibility and evolving variants. Although vaccination efforts and therapeutic advancements have mitigated disease severity, emerging mutations continue to challenge diagnostics and containment strategies. As of mid-February 2025, global test positivity has risen to 11%, marking the highest level in over 6 months, despite widespread immunization efforts. Newer variants demonstrate enhanced host cell binding, increasing both infectivity and diagnostic complexity. This study aimed to evaluate the effectiveness of deep transfer learning in delivering a rapid, accurate, and mutation-resilient COVID-19 diagnosis from medical imaging, with a focus on scalability and accessibility. An automated detection system was developed using state-of-the-art convolutional neural networks, including VGG16 (Visual Geometry Group network-16 layers), ResNet50 (residual network-50 layers), ConvNeXtTiny (convolutional next-tiny), MobileNet (mobile network), NASNetMobile (neural architecture search network-mobile version), and DenseNet121 (densely connected convolutional network-121 layers), to detect COVID-19 from chest X-ray and computed tomography (CT) images. Among all the models evaluated, DenseNet121 emerged as the best-performing architecture for COVID-19 diagnosis using X-ray and CT images. It achieved an impressive accuracy of 98%, with a precision of 96.9%, a recall of 98.9%, an F1-score of 97.9%, and an area under the curve score of 99.8%, indicating a high degree of consistency and reliability in detecting both positive and negative cases. The confusion matrix showed minimal false positives and false negatives, underscoring the model's robustness in real-world diagnostic scenarios. Given its performance, DenseNet121 is a strong candidate for deployment in clinical settings and serves as a benchmark for future improvements in artificial intelligence-assisted diagnostic tools. The study results underscore the potential of artificial intelligence-powered diagnostics in supporting early detection and global pandemic response. With careful optimization, deep learning models can address critical gaps in testing, particularly in settings constrained by limited resources or emerging variants.
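For readers who want the gist of the transfer-learning setup, the following sketch builds a DenseNet121 classifier with an ImageNet-pretrained backbone and a new binary head; the input size, head layout, and optimizer settings are illustrative assumptions, not the study's exact protocol.

```python
# Sketch: DenseNet121 transfer learning for binary COVID-19 classification.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

base = DenseNet121(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                          # freeze backbone; fine-tune later if desired

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),      # COVID-19 positive vs. negative
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
)
```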

InfiMed-Foundation: Pioneering Advanced Multimodal Medical Models with Compute-Efficient Pre-Training and Multi-Stage Fine-Tuning

Guanghao Zhu, Zhitian Hou, Zeyu Liu, Zhijie Sang, Congkai Xie, Hongxia Yang

arXiv preprint · Sep 26, 2025
Multimodal large language models (MLLMs) have shown remarkable potential in various domains, yet their application in the medical field is hindered by several challenges. General-purpose MLLMs often lack the specialized knowledge required for medical tasks, leading to uncertain or hallucinatory responses. Knowledge distillation from advanced models struggles to capture domain-specific expertise in radiology and pharmacology. Additionally, the computational cost of continual pretraining with large-scale medical data poses significant efficiency challenges. To address these issues, we propose InfiMed-Foundation-1.7B and InfiMed-Foundation-4B, two medical-specific MLLMs designed to deliver state-of-the-art performance in medical applications. We combine high-quality general-purpose and medical multimodal data and propose a novel five-dimensional quality assessment framework to curate multimodal medical datasets. We employ low-to-high image resolution and multimodal sequence packing to enhance training efficiency, enabling the integration of extensive medical data. Furthermore, a three-stage supervised fine-tuning process ensures effective knowledge extraction for complex medical tasks. Evaluated on the MedEvalKit framework, InfiMed-Foundation-1.7B outperforms Qwen2.5VL-3B, while InfiMed-Foundation-4B surpasses HuatuoGPT-V-7B and MedGemma-27B-IT, demonstrating superior performance in medical visual question answering and diagnostic tasks. By addressing key challenges in data quality, training efficiency, and domain-specific knowledge extraction, our work paves the way for more reliable and effective AI-driven solutions in healthcare. The InfiMed-Foundation-4B model is available at https://huggingface.co/InfiX-ai/InfiMed-Foundation-4B.
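The abstract credits part of its training efficiency to multimodal sequence packing. The sketch below shows the generic idea with a first-fit-decreasing pass that groups tokenized samples into fixed-length packs to reduce padding; it is a hedged illustration, not the InfiMed-Foundation training code.

```python
# Sketch: first-fit-decreasing sequence packing to cut padding waste.
from typing import List

def pack_sequences(lengths: List[int], max_len: int = 4096) -> List[List[int]]:
    """Group sample indices so each pack's total token length fits max_len."""
    packs: List[List[int]] = []
    budgets: List[int] = []
    for idx, n in sorted(enumerate(lengths), key=lambda x: -x[1]):  # longest first
        for i, remaining in enumerate(budgets):
            if n <= remaining:                                      # fits in an open pack
                packs[i].append(idx)
                budgets[i] -= n
                break
        else:                                                       # open a new pack
            packs.append([idx])
            budgets.append(max_len - n)
    return packs
```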

Exploring learning transferability in deep segmentation of colorectal cancer liver metastases.

Abbas M, Badic B, Andrade-Miranda G, Bourbonne V, Jaouen V, Visvikis D, Conze PH

PubMed paper · Sep 26, 2025
Ensuring the seamless transfer of knowledge and models across various datasets and clinical contexts is of paramount importance in medical image segmentation. This is especially true for liver lesion segmentation, which plays a key role in pre-operative planning and treatment follow-up. Despite the progress of deep learning algorithms using Transformers, automatically segmenting small hepatic metastases remains a persistent challenge. This can be attributed to the degradation of small structures caused by the feature down-sampling inherent to many deep architectures, coupled with the imbalance between foreground metastasis voxels and background. While similar challenges have been observed for liver tumors originating from hepatocellular carcinoma, their manifestation in the context of liver metastasis delineation remains under-explored and requires well-defined guidelines. Through comprehensive experiments, this paper aims to bridge this gap and to demonstrate the impact of various transfer learning schemes from off-the-shelf datasets to a dataset containing liver metastases only. Our scale-specific evaluation reveals that models trained from scratch or with domain-specific pre-training demonstrate greater proficiency.
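A sketch of one transfer scheme of the kind compared above: initializing a segmentation network from weights pretrained on an off-the-shelf liver dataset before fine-tuning on metastasis cases (passing `pretrained_ckpt=None` corresponds to training from scratch). The model class and checkpoint layout are hypothetical placeholders.

```python
# Sketch: weight initialization for a transfer-learning vs. from-scratch comparison.
from typing import Optional
import torch

def init_for_finetuning(model: torch.nn.Module,
                        pretrained_ckpt: Optional[str],
                        freeze_encoder: bool = False) -> torch.nn.Module:
    if pretrained_ckpt is not None:                                # off-the-shelf pre-training
        state = torch.load(pretrained_ckpt, map_location="cpu")    # assumed to hold a state_dict
        model.load_state_dict(state, strict=False)                 # tolerate head/decoder mismatches
    if freeze_encoder and hasattr(model, "encoder"):
        for p in model.encoder.parameters():                       # optionally fine-tune decoder only
            p.requires_grad_(False)
    return model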

Bézier Meets Diffusion: Robust Generation Across Domains for Medical Image Segmentation

Chen Li, Meilong Xu, Xiaoling Hu, Weimin Lyu, Chao Chen

arXiv preprint · Sep 26, 2025
Training robust learning algorithms across different medical imaging modalities is challenging due to the large domain gap. Unsupervised domain adaptation (UDA) mitigates this problem by using annotated images from the source domain and unlabeled images from the target domain to train the deep models. Existing approaches often rely on GAN-based style transfer, but these methods struggle to capture cross-domain mappings in regions with high variability. In this paper, we propose a unified framework, Bézier Meets Diffusion, for cross-domain image generation. First, we introduce a Bézier-curve-based style transfer strategy that effectively reduces the domain gap between source and target domains. The transferred source images enable the training of a more robust segmentation model across domains. Thereafter, using pseudo-labels generated by this segmentation model on the target domain, we train a conditional diffusion model (CDM) to synthesize high-quality, labeled target-domain images. To mitigate the impact of noisy pseudo-labels, we further develop an uncertainty-guided score matching method that improves the robustness of CDM training. Extensive experiments on public datasets demonstrate that our approach generates realistic labeled images, significantly augmenting the target domain and improving segmentation performance.
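The Bézier-curve style transfer mentioned above can be pictured as a monotone intensity remapping: a random cubic Bézier curve warps normalized voxel intensities so that source images take on target-like appearance. The sketch below follows that generic recipe; the control-point sampling is an assumption, not the paper's exact strategy.

```python
# Sketch: random cubic Bézier intensity remapping of a normalized image.
import numpy as np

def bezier_intensity_transform(img: np.ndarray, rng=np.random) -> np.ndarray:
    """img: intensities scaled to [0, 1]; returns Bézier-remapped intensities."""
    p1 = rng.uniform(0.0, 1.0, size=2)                      # random inner control points
    p2 = rng.uniform(0.0, 1.0, size=2)
    pts = np.array([[0.0, 0.0], p1, p2, [1.0, 1.0]])        # endpoints fixed at (0,0), (1,1)
    t = np.linspace(0.0, 1.0, 1024)
    coeffs = np.stack([(1 - t) ** 3, 3 * (1 - t) ** 2 * t,
                       3 * (1 - t) * t ** 2, t ** 3])
    curve = coeffs.T @ pts                                   # (1024, 2): x(t), y(t)
    xs, ys = curve[:, 0], curve[:, 1]
    order = np.argsort(xs)                                   # np.interp needs increasing x
    return np.interp(img, xs[order], ys[order])
```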

Prediction of neoadjuvant chemotherapy efficacy in patients with HER2-low breast cancer based on ultrasound radiomics.

Peng Q, Ji Z, Xu N, Dong Z, Zhang T, Ding M, Qu L, Liu Y, Xie J, Jin F, Chen B, Song J, Zheng A

PubMed paper · Sep 26, 2025
Neoadjuvant chemotherapy (NAC) is a crucial therapeutic approach for treating breast cancer, yet accurately predicting treatment response remains a significant clinical challenge. Conventional ultrasound plays a vital role in assessing tumor morphology but lacks the ability to quantitatively capture intratumoral heterogeneity. Ultrasound radiomics, which extracts high-throughput quantitative imaging features, offers a novel approach to enhance NAC response prediction. This study aims to evaluate the predictive efficacy of ultrasound radiomics models based on pre-treatment, post-treatment, and combined imaging features for assessing the NAC response in patients with HER2-low breast cancer. This retrospective multicenter study included 359 patients with HER2-low breast cancer who underwent NAC between January 1, 2016, and December 31, 2020. A total of 488 radiomic features were extracted from pre- and post-treatment ultrasound images. Feature selection was conducted in two stages: first, Pearson correlation analysis (threshold: 0.65) was applied to remove highly correlated features and reduce redundancy; then, Recursive Feature Elimination with Cross-Validation (RFECV) was employed to identify the optimal feature subset for model construction. The dataset was divided into a training set (244 patients) and an external validation set (115 patients from independent centers). Model performance was assessed via the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, and F1 score. Three models were initially developed: (1) a pre-treatment model (AUC = 0.716), (2) a post-treatment model (AUC = 0.772), and (3) a combined pre- and post-treatment model (AUC = 0.762). After RFECV-based feature selection, the optimized models with reduced feature sets achieved: (1) pre-treatment model, AUC = 0.746; (2) post-treatment model, AUC = 0.712; and (3) combined model, AUC = 0.759. Ultrasound radiomics is a non-invasive and promising approach for predicting response to neoadjuvant chemotherapy in HER2-low breast cancer. The pre-treatment model yielded reliable performance after feature selection. While the combined model did not substantially enhance predictive accuracy, its stable performance suggests that longitudinal ultrasound imaging may help capture treatment-induced phenotypic changes. These findings offer preliminary support for individualized therapeutic decision-making.
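The two-stage feature selection described above (a Pearson redundancy filter at 0.65 followed by RFECV) can be sketched as below; the estimator, scoring metric, and fold count are illustrative assumptions.

```python
# Sketch: Pearson-correlation filter followed by RFECV on radiomic features.
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

def select_features(X: pd.DataFrame, y: np.ndarray, corr_thresh: float = 0.65):
    # Stage 1: drop one feature from every pair with |Pearson r| above the threshold.
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [c for c in upper.columns if (upper[c] > corr_thresh).any()]
    X_filtered = X.drop(columns=redundant)

    # Stage 2: recursive feature elimination with cross-validation.
    selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=5, scoring="roc_auc")
    selector.fit(X_filtered, y)
    return X_filtered.columns[selector.support_].tolist()
```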

Deep learning-based cardiac computed tomography angiography left atrial segmentation and quantification in atrial fibrillation patients: a multi-model comparative study.

Feng L, Lu W, Liu J, Chen Z, Jin J, Qian N, Pan J, Wang L, Xiang J, Jiang J, Wang Y

PubMed paper · Sep 26, 2025
Quantitative assessment of left atrial volume (LAV) is an important factor in the study of the pathogenesis of atrial fibrillation. However, automated left atrial segmentation with quantitative assessment usually faces many challenges. The main objective of this study was to find the optimal left atrial segmentation model based on cardiac computed tomography angiography (CTA) and to perform quantitative LAV measurement. A multi-center left atrial study cohort containing 182 cardiac CTAs from patients with atrial fibrillation was created, each case accompanied by expert image annotation from a cardiologist. Based on this left atrium dataset, five recent state-of-the-art (SOTA) medical image segmentation models were trained and validated for left atrium segmentation: DAResUNet, nnFormer, xLSTM-UNet, UNETR, and VNet. The optimal segmentation model was then used for consistency validation of the LAV measurements. DAResUNet achieved the best performance in DSC (0.924 ± 0.023) and JI (0.859 ± 0.065) among all models, while VNet was the best performer in HD (12.457 ± 6.831) and ASD (1.034 ± 0.178). The Bland-Altman plot demonstrated strong agreement (mean bias −5.69 mL, 95% LoA −19 to 7.6 mL) between the model's automatic predictions and manual measurements. Deep learning models based on a study cohort of 182 CTA left atrial images achieved competitive results in left atrium segmentation. LAV assessment based on deep learning models may provide a useful biomarker of the onset of atrial fibrillation.
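The Bland-Altman statistics quoted above (mean bias and 95% limits of agreement between automatic and manual LAV) reduce to a few lines; a minimal sketch:

```python
# Sketch: Bland-Altman bias and 95% limits of agreement for LAV measurements.
import numpy as np

def bland_altman(auto_lav: np.ndarray, manual_lav: np.ndarray):
    diff = auto_lav - manual_lav
    bias = diff.mean()                           # mean bias, e.g. -5.69 mL in this study
    sd = diff.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)   # 95% limits of agreement
    return bias, loa
```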

Pathomics-based machine learning models for optimizing LungPro navigational bronchoscopy in peripheral lung lesion diagnosis: a retrospective study.

Ying F, Bao Y, Ma X, Tan Y, Li S

PubMed paper · Sep 26, 2025
This study aimed to construct a pathomics-based machine learning model to enhance the diagnostic efficacy of LungPro navigational bronchoscopy for peripheral pulmonary lesions and to optimize the management strategy for LungPro-diagnosed negative lesions. Clinical data and hematoxylin and eosin (H&E)-stained whole slide images (WSIs) were collected from 144 consecutive patients undergoing LungPro virtual bronchoscopy at a single institution between January 2022 and December 2023. Patients were stratified into diagnosis-positive and diagnosis-negative cohorts based on histopathological or etiological confirmation. An artificial intelligence (AI) model was developed and validated using 94 diagnosis-positive cases. Logistic regression (LR) identified associations between clinical/imaging characteristics and risk factors for malignant pulmonary lesions. We implemented a convolutional neural network (CNN) with weakly supervised learning to extract image-level features, followed by multiple instance learning (MIL) for patient-level feature aggregation. Multiple machine learning (ML) algorithms were applied to model the extracted features. A multimodal diagnostic framework integrating clinical, imaging, and pathomics data was subsequently developed and evaluated on 50 LungPro-negative patients to assess its diagnostic performance and predictive validity. Univariable and multivariable logistic regression analyses identified age, lesion boundary, and mean computed tomography (CT) attenuation as independent risk factors for malignant peripheral pulmonary lesions (P < 0.05). A histopathological model using a MIL fusion strategy showed strong diagnostic performance for lung cancer, with area under the curve (AUC) values of 0.792 (95% CI 0.680-0.903) in the training cohort and 0.777 (95% CI 0.531-1.000) in the test cohort. Combining predictive clinical features with pathological characteristics enhanced diagnostic performance for peripheral pulmonary lesions to 0.848 (95% CI 0.6945-1.0000). In patients with initially negative LungPro biopsy results, the model identified 20 of 28 malignant lesions (sensitivity: 71.43%) and 15 of 22 benign lesions (specificity: 68.18%). Class activation mapping (CAM) validated the model by highlighting key malignant features, including conspicuous nucleoli and nuclear atypia. The fusion diagnostic model incorporating clinical and pathomic features markedly enhances the diagnostic accuracy of LungPro in this retrospective cohort. The model aids in the detection of subtle malignant characteristics, thereby offering evidence to support precise and targeted therapeutic interventions for lesions that LungPro classifies as negative in clinical settings.
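The patient-level aggregation step (MIL over per-patch CNN features) can be illustrated with a standard attention-based MIL pooling head; the feature dimension and attention form follow Ilse et al. (2018) and are assumptions, not the paper's exact fusion strategy.

```python
# Sketch: attention-based MIL pooling of per-patch features into a patient-level logit.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 512, hidden: int = 128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (num_patches, feat_dim) for one patient ("bag" of instances)
        weights = torch.softmax(self.attn(patch_feats), dim=0)   # (N, 1) attention weights
        bag = (weights * patch_feats).sum(dim=0)                 # weighted bag embedding
        return self.classifier(bag)                              # malignancy logit
```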

Learning KAN-based Implicit Neural Representations for Deformable Image Registration

Nikita Drozdov, Marat Zinovev, Dmitry Sorokin

arXiv preprint · Sep 26, 2025
Deformable image registration (DIR) is a cornerstone of medical image analysis, enabling spatial alignment for tasks like comparative studies and multi-modal fusion. While learning-based methods (e.g., CNNs, transformers) offer fast inference, they often require large training datasets and struggle to match the precision of classical iterative approaches on some organ types and imaging modalities. Implicit neural representations (INRs) have emerged as a promising alternative, parameterizing deformations as continuous mappings from coordinates to displacement vectors. However, this comes at the cost of requiring instance-specific optimization, making computational efficiency and seed-dependent learning stability critical factors for these methods. In this work, we propose KAN-IDIR and RandKAN-IDIR, the first integration of Kolmogorov-Arnold Networks (KANs) into deformable image registration with implicit neural representations (INRs). Our proposed randomized basis sampling strategy reduces the required number of basis functions in KAN while maintaining registration quality, thereby significantly lowering computational costs. We evaluated our approach on three diverse datasets (lung CT, brain MRI, cardiac MRI) and compared it with competing instance-specific learning-based approaches, dataset-trained deep learning models, and classical registration approaches. KAN-IDIR and RandKAN-IDIR achieved the highest accuracy among INR-based methods across all evaluated modalities and anatomies, with minimal computational overhead and superior learning stability across multiple random seeds. Additionally, we discovered that our RandKAN-IDIR model with randomized basis sampling slightly outperforms the model with learnable basis function indices, while eliminating its additional training-time complexity.
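To make the INR formulation concrete, the sketch below shows a coordinate network that maps normalized voxel coordinates to displacement vectors and is optimized per image pair; a plain MLP stands in for the KAN layers used by KAN-IDIR, so this is a hedged illustration of the setting rather than the proposed architecture.

```python
# Sketch: an implicit neural representation for deformable registration, where a
# coordinate network predicts displacements and is optimized per image pair.
import torch
import torch.nn as nn

class DisplacementINR(nn.Module):
    def __init__(self, dim: int = 3, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, dim),              # displacement vector per coordinate
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords)                  # (N, dim) displacements

# Instance-specific optimization (per image pair), schematically:
#   warped = sample(moving_image, coords + model(coords))
#   loss   = similarity(warped, fixed_values_at(coords)) + lambda * smoothness_penalty
```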