
SafeClick: Error-Tolerant Interactive Segmentation of Any Medical Volumes via Hierarchical Expert Consensus

Yifan Gao, Jiaxi Sheng, Wenbin Wu, Haoyue Li, Yaoxian Dong, Chaoyang Ge, Feng Yuan, Xin Gao

arXiv preprint · Jun 23, 2025
Foundation models for volumetric medical image segmentation have emerged as powerful tools in clinical workflows, enabling radiologists to delineate regions of interest through intuitive clicks. While these models demonstrate promising capabilities in segmenting previously unseen anatomical structures, their performance is strongly influenced by prompt quality. In clinical settings, radiologists often provide suboptimal prompts, which affects segmentation reliability and accuracy. To address this limitation, we present SafeClick, an error-tolerant interactive segmentation approach for medical volumes based on hierarchical expert consensus. SafeClick operates as a plug-and-play module compatible with foundation models including SAM 2 and MedSAM 2. The framework consists of two key components: a collaborative expert layer (CEL) that generates diverse feature representations through specialized transformer modules, and a consensus reasoning layer (CRL) that performs cross-referencing and adaptive integration of these features. This architecture transforms the segmentation process from a prompt-dependent operation to a robust framework capable of producing accurate results despite imperfect user inputs. Extensive experiments across 15 public datasets demonstrate that our plug-and-play approach consistently improves the performance of base foundation models, with particularly significant gains when working with imperfect prompts. The source code is available at https://github.com/yifangao112/SafeClick.
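
As an illustration of the plug-and-play idea described in this abstract, the sketch below shows a generic "collaborative experts + consensus" refinement head in PyTorch: several specialized transformer blocks produce diverse feature views, and an adaptive gate fuses them. It is a hypothetical sketch, not SafeClick's actual CEL/CRL implementation; all module names and dimensions are assumptions.

```python
# Hypothetical sketch of a "collaborative experts + consensus" refinement head.
# NOT the SafeClick implementation; module names and sizes are assumptions.
import torch
import torch.nn as nn

class ExpertConsensusHead(nn.Module):
    def __init__(self, dim: int = 256, num_experts: int = 4, num_heads: int = 8):
        super().__init__()
        # Collaborative expert layer: several specialized transformer blocks,
        # each producing a different view of the decoder features.
        self.experts = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
            for _ in range(num_experts)
        )
        # Consensus reasoning layer: predicts per-expert weights from the
        # cross-referenced expert outputs and adaptively fuses them.
        self.gate = nn.Sequential(nn.Linear(dim * num_experts, num_experts), nn.Softmax(dim=-1))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim) features from the frozen foundation model
        views = [expert(tokens) for expert in self.experts]      # num_experts x (B, N, D)
        stacked = torch.stack(views, dim=-2)                     # (B, N, E, D)
        weights = self.gate(stacked.flatten(-2))                 # (B, N, E)
        return (weights.unsqueeze(-1) * stacked).sum(dim=-2)     # (B, N, D) fused features

tokens = torch.randn(1, 1024, 256)
refined = ExpertConsensusHead()(tokens)   # plugged between backbone and mask decoder
```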

Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention

Saad Wazir, Daeyoung Kim

arXiv preprint · Jun 23, 2025
Segmenting biomarkers in medical images is crucial for various biotech applications. Despite advances, Transformer and CNN based methods often struggle with variations in staining and morphology, limiting feature extraction. In medical image segmentation, where datasets often have limited sample availability, recent state-of-the-art (SOTA) methods achieve higher accuracy by leveraging pre-trained encoders, whereas end-to-end methods tend to underperform. This is due to challenges in effectively transferring rich multiscale features from encoders to decoders, as well as limitations in decoder efficiency. To address these issues, we propose an architecture that captures multi-scale local and global contextual information and a novel decoder design, which effectively integrates features from the encoder, emphasizes important channels and regions, and reconstructs spatial dimensions to enhance segmentation accuracy. Our method, compatible with various encoders, outperforms SOTA methods, as demonstrated by experiments on four datasets and ablation studies. Specifically, our method achieves absolute performance gains of 2.76% on MoNuSeg, 3.12% on DSB, 2.87% on Electron Microscopy, and 4.03% on TNBC datasets compared to existing SOTA methods. Code: https://github.com/saadwazir/MCADS-Decoder
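
The two decoder ingredients named in this abstract, depth-to-space spatial restoration and channel emphasis, can be illustrated with a short PyTorch block: a pixel-shuffle upsampling step followed by a squeeze-and-excitation style channel gate. This is only a minimal sketch under those assumptions, not the authors' MCADS decoder, and the layer sizes are made up.

```python
# Illustrative decoder block: depth-to-space upsampling plus simple channel
# re-weighting. Not the paper's MCADS decoder; layer sizes are assumptions.
import torch
import torch.nn as nn

class DepthToSpaceBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        # Expand channels so PixelShuffle (depth-to-space) can trade them for resolution.
        self.expand = nn.Conv2d(in_ch, out_ch * scale * scale, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)        # (B, C*r^2, H, W) -> (B, C, H*r, W*r)
        # Squeeze-and-excitation style gate to emphasize informative channels.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // 4, out_ch, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.shuffle(self.expand(x))
        return x * self.se(x)

x = torch.randn(1, 256, 32, 32)
print(DepthToSpaceBlock(256, 128)(x).shape)   # torch.Size([1, 128, 64, 64])
```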

BrainSymphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity

Moein Khajehnejad, Forough Habibollahi, Adeel Razi

arXiv preprint · Jun 23, 2025
Existing foundation models for neuroimaging are often prohibitively large and data-intensive. We introduce BrainSymphony, a lightweight, parameter-efficient foundation model that achieves state-of-the-art performance while being pre-trained on significantly smaller public datasets. BrainSymphony's strong multimodal architecture processes functional MRI data through parallel spatial and temporal transformer streams, which are then efficiently distilled into a unified representation by a Perceiver module. Concurrently, it models structural connectivity from diffusion MRI using a novel signed graph transformer to encode the brain's anatomical structure. These powerful, modality-specific representations are then integrated via an adaptive fusion gate. Despite its compact design, our model consistently outperforms larger models on a diverse range of downstream benchmarks, including classification, prediction, and unsupervised network identification tasks. Furthermore, our model revealed novel insights into brain dynamics using attention maps on a unique external psilocybin neuroimaging dataset (pre- and post-administration). BrainSymphony establishes that architecturally-aware, multimodal models can surpass their larger counterparts, paving the way for more accessible and powerful research in computational neuroscience.
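
The "adaptive fusion gate" mentioned in the abstract can be pictured as a learned, feature-wise blend of the two modality embeddings. The sketch below is a hypothetical minimal version, not BrainSymphony's actual module; the embedding size is an assumption.

```python
# Hypothetical adaptive fusion gate between a functional-MRI embedding and a
# structural-connectivity embedding; not BrainSymphony's actual module.
import torch
import torch.nn as nn

class AdaptiveFusionGate(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        # The gate looks at both modality embeddings and decides, per feature,
        # how much to trust the functional vs. the structural stream.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, f_func: torch.Tensor, f_struct: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([f_func, f_struct], dim=-1))   # (B, dim), values in (0, 1)
        return g * f_func + (1.0 - g) * f_struct               # convex, feature-wise blend

fused = AdaptiveFusionGate()(torch.randn(8, 128), torch.randn(8, 128))
```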

Open Set Recognition for Endoscopic Image Classification: A Deep Learning Approach on the Kvasir Dataset

Kasra Moazzami, Seoyoun Son, John Lin, Sun Min Lee, Daniel Son, Hayeon Lee, Jeongho Lee, Seongji Lee

arXiv preprint · Jun 23, 2025
Endoscopic image classification plays a pivotal role in medical diagnostics by identifying anatomical landmarks and pathological findings. However, conventional closed-set classification frameworks are inherently limited in open-world clinical settings, where previously unseen conditions can arise and compromise model reliability. To address this, we explore the application of Open Set Recognition (OSR) techniques on the Kvasir dataset, a publicly available and diverse endoscopic image collection. In this study, we evaluate and compare the OSR capabilities of several representative deep learning architectures, including ResNet-50, Swin Transformer, and a hybrid ResNet-Transformer model, under both closed-set and open-set conditions. OpenMax is adopted as a baseline OSR method to assess the ability of these models to distinguish known classes from previously unseen categories. This work represents one of the first efforts to apply open set recognition to the Kvasir dataset and provides a foundational benchmark for evaluating OSR performance in medical image analysis. Our results offer practical insights into model behavior in clinically realistic settings and highlight the importance of OSR techniques for the safe deployment of AI systems in endoscopy.
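
OpenMax, the baseline OSR method named above, recalibrates a network's logits using per-class Weibull models of extreme distances. The sketch below shows only the simpler core idea under stated assumptions: model the tail of distances between correctly classified training activations (e.g., penultimate-layer features) and their class mean, then reject test samples whose distance is improbably large. It is a simplified stand-in, not the full OpenMax recalibration, and the tail size and rejection threshold are assumptions.

```python
# Simplified open-set rejection in the spirit of OpenMax (NOT the full method):
# fit a Weibull tail model to distances from each class's mean activation vector
# and flag test samples whose distance is extreme. Parameters are assumptions.
import numpy as np
from scipy.stats import weibull_min

def fit_class_models(train_feats, train_labels, tail_size=20):
    models = {}
    for c in np.unique(train_labels):
        feats = train_feats[train_labels == c]
        mav = feats.mean(axis=0)                              # mean activation vector
        dists = np.linalg.norm(feats - mav, axis=1)
        tail = np.sort(dists)[-tail_size:]                    # largest distances only
        shape, loc, scale = weibull_min.fit(tail, floc=0.0)
        models[c] = (mav, shape, loc, scale)
    return models

def predict_open_set(feat, models, reject_prob=0.95):
    # Pick the nearest class mean, then reject if the distance falls in the
    # extreme tail of that class's Weibull model of training distances.
    dists = {c: np.linalg.norm(feat - mav) for c, (mav, *_) in models.items()}
    c = min(dists, key=dists.get)
    mav, shape, loc, scale = models[c]
    tail_cdf = weibull_min.cdf(dists[c], shape, loc=loc, scale=scale)
    return "unknown" if tail_cdf > reject_prob else c
```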

[Incidental pulmonary nodules on CT imaging: what to do?].

van der Heijden EHFM, Snoeren M, Jacobs C

PubMed · Jun 23, 2025
Incidental pulmonary nodules are very frequently found on CT imaging and may represent (early-stage) lung cancers without any signs or symptoms. These incidental findings can be solid or ground-glass lesions and may be solitary or multiple. Careful and systematic evaluation of these imaging findings is needed to determine the risk of malignancy, based on imaging characteristics, patient factors such as smoking habits, prior cancers, or family history, and growth rate, preferably determined by volume measurements. When the risk of malignancy is elevated, minimally invasive image-guided biopsy is warranted, preferably by navigation bronchoscopy. We present two cases to illustrate this clinical workup: one with a benign solitary pulmonary nodule, and a second with multiple ground-glass opacities diagnosed as synchronous primary adenocarcinomas of the lung. This is followed by a review of the current status of computer- and artificial intelligence-aided diagnostic support and clinical workflow optimization.

Fine-tuned large language model for classifying CT-guided interventional radiology reports.

Yasaka K, Nishimura N, Fukushima T, Kubo T, Kiryu S, Abe O

PubMed · Jun 23, 2025
Background: Manual data curation was necessary to extract radiology reports due to the ambiguities of natural language. Purpose: To develop a fine-tuned large language model that classifies computed tomography (CT)-guided interventional radiology reports into technique categories and to compare its performance with that of human readers. Material and Methods: This retrospective study included patients who underwent CT-guided interventional radiology between August 2008 and November 2024. Patients were chronologically assigned to the training (n = 1142; 646 men; mean age = 64.1 ± 15.7 years), validation (n = 131; 83 men; mean age = 66.1 ± 16.1 years), and test (n = 332; 196 men; mean age = 66.1 ± 14.8 years) datasets. To establish a reference standard, reports were manually classified into categories 1 (drainage), 2 (lesion biopsy within fat or soft-tissue-density tissues), 3 (lung biopsy), and 4 (bone biopsy). A bidirectional encoder representations from transformers (BERT) model was fine-tuned on the training dataset, and the model with the best performance on the validation dataset was selected. Classification performance and the time required were compared between the best-performing model and two readers on the test dataset. Results: Categories 1/2/3/4 included 309/367/270/196, 30/42/40/19, and 75/124/78/55 patients for the training, validation, and test datasets, respectively. The model achieved an accuracy of 0.979 on the test dataset, significantly better than that of the readers (0.922-0.940) (P ≤ 0.012). The model classified reports 49.8-53.5 times faster than the readers. Conclusion: The fine-tuned large language model classified CT-guided interventional radiology reports into four categories with high accuracy and within a remarkably short time.
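
A BERT-style fine-tuning setup for a 4-class report classifier typically looks like the sketch below. The abstract does not specify the checkpoint, data format, or hyperparameters, so the HuggingFace checkpoint name, CSV paths, and training settings here are placeholders, not the paper's.

```python
# Generic BERT fine-tuning sketch for 4-class report classification.
# Checkpoint, file paths, and hyperparameters are placeholders, not the paper's.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

checkpoint = "bert-base-multilingual-cased"   # assumption; the paper's model is unspecified
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=4)

# Assumed CSV layout: a free-text "report" column and a "label" column in {0,1,2,3}
# (0 = drainage, 1 = soft-tissue biopsy, 2 = lung biopsy, 3 = bone biopsy).
data = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})
data = data.map(lambda x: tokenizer(x["report"], truncation=True, max_length=512), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ir-report-classifier",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```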

Multimodal deep learning for predicting neoadjuvant treatment outcomes in breast cancer: a systematic review.

Krasniqi E, Filomeno L, Arcuri T, Ferretti G, Gasparro S, Fulvi A, Roselli A, D'Onofrio L, Pizzuti L, Barba M, Maugeri-Saccà M, Botti C, Graziano F, Puccica I, Cappelli S, Pelle F, Cavicchi F, Villanucci A, Paris I, Calabrò F, Rea S, Costantini M, Perracchio L, Sanguineti G, Takanen S, Marucci L, Greco L, Kayal R, Moscetti L, Marchesini E, Calonaci N, Blandino G, Caravagna G, Vici P

PubMed · Jun 23, 2025
Pathological complete response (pCR) to neoadjuvant systemic therapy (NAST) is an established prognostic marker in breast cancer (BC). Multimodal deep learning (DL), integrating diverse data sources (radiology, pathology, omics, clinical), holds promise for improving pCR prediction accuracy. This systematic review synthesizes evidence on multimodal DL for pCR prediction and compares its performance against unimodal DL. Following PRISMA, we searched PubMed, Embase, and Web of Science (January 2015-April 2025) for studies applying DL to predict pCR in BC patients receiving NAST, using data from radiology, digital pathology (DP), multi-omics, and/or clinical records, and reporting AUC. Data on study design, DL architectures, and performance (AUC) were extracted. A narrative synthesis was conducted due to heterogeneity. Fifty-one studies, mostly retrospective (90.2%, median cohort 281), were included. Magnetic resonance imaging and DP were the most common primary modalities. Multimodal approaches were used in 52.9% of studies, often combining imaging with clinical data. Convolutional neural networks were the dominant architecture (88.2%). Longitudinal imaging improved prediction over baseline-only imaging (median AUC 0.91 vs. 0.82). Overall, the median AUC across studies was 0.88, with 35.3% achieving AUC ≥ 0.90. Multimodal models showed a modest but consistent improvement over unimodal approaches (median AUC 0.88 vs. 0.83). Omics and clinical text were rarely primary DL inputs. DL models demonstrate promising accuracy for pCR prediction, especially when integrating multiple modalities and longitudinal imaging. However, significant methodological heterogeneity, reliance on retrospective data, and limited external validation hinder clinical translation. Future research should prioritize prospective validation, integration of underutilized data (multi-omics, clinical), and explainable AI to advance DL predictors toward the clinical setting.

Intelligent Virtual Dental Implant Placement via 3D Segmentation Strategy.

Cai G, Wen B, Gong Z, Lin Y, Liu H, Zeng P, Shi M, Wang R, Chen Z

PubMed · Jun 23, 2025
Virtual dental implant placement in cone-beam computed tomography (CBCT) is a prerequisite for digital implant surgery and carries clinical significance. However, manual placement is a complex process that must meet essential clinical requirements of restoration orientation, bone adaptation, and anatomical safety. This complexity makes it challenging to balance multiple considerations comprehensively and to automate the entire workflow efficiently. This study aims to achieve intelligent virtual dental implant placement through a 3-dimensional (3D) segmentation strategy. Focusing on missing mandibular first molars, we developed a segmentation module based on nnU-Net to generate the virtual implant from the edentulous region of the CBCT and employed an approximation module for mathematical optimization. The generated virtual implant was integrated with the original CBCT to meet clinical requirements. A total of 190 CBCT scans from 4 centers were collected for model development and testing. The tool segmented the virtual implant with a surface Dice coefficient (sDice) of 0.903 on the internal and 0.884 on the external testing set. Compared to the ground truth, the average deviations of the implant platform, implant apex, and angle were 0.850 ± 0.554 mm, 1.442 ± 0.539 mm, and 4.927 ± 3.804° on the internal testing set and 0.822 ± 0.353 mm, 1.467 ± 0.560 mm, and 5.517 ± 2.850° on the external testing set, respectively. The 3D segmentation-based artificial intelligence tool demonstrated good performance in predicting both the dimension and position of virtual implants, showing significant clinical application potential in implant planning.
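
The deviation metrics reported above (platform deviation, apex deviation, and axis angle) can be computed from two implant axes. The sketch below shows one way to do this, under the assumption that each implant is summarized by its platform-center and apex-center coordinates in millimetres; it is an illustration, not the authors' evaluation code.

```python
# One way to compute the reported deviation metrics, assuming each implant is
# summarized by its platform-center and apex-center coordinates (in mm).
import numpy as np

def implant_deviations(pred_platform, pred_apex, gt_platform, gt_apex):
    pred_platform, pred_apex = np.asarray(pred_platform, float), np.asarray(pred_apex, float)
    gt_platform, gt_apex = np.asarray(gt_platform, float), np.asarray(gt_apex, float)

    platform_dev = np.linalg.norm(pred_platform - gt_platform)   # mm
    apex_dev = np.linalg.norm(pred_apex - gt_apex)               # mm

    # Angle between the two implant axes (platform -> apex direction vectors).
    v_pred = (pred_apex - pred_platform) / np.linalg.norm(pred_apex - pred_platform)
    v_gt = (gt_apex - gt_platform) / np.linalg.norm(gt_apex - gt_platform)
    angle_deg = np.degrees(np.arccos(np.clip(np.dot(v_pred, v_gt), -1.0, 1.0)))

    return platform_dev, apex_dev, angle_deg

print(implant_deviations([0, 0, 0], [0, 0, 11], [0.5, 0.3, 0.2], [0.8, 0.5, 11.4]))
```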

Development and validation of a SOTA-based system for biliopancreatic segmentation and station recognition in EUS.

Zhang J, Zhang J, Chen H, Tian F, Zhang Y, Zhou Y, Jiang Z

PubMed · Jun 23, 2025
Endoscopic ultrasound (EUS) is a vital tool for diagnosing biliopancreatic disease, offering detailed imaging to identify key abnormalities. Its interpretation demands expertise, which limits its accessibility for less trained practitioners; tools that assist in interpreting EUS images are therefore crucial for improving diagnostic accuracy and efficiency. The aim was to develop an AI-assisted EUS system for accurate pancreatic and biliopancreatic duct segmentation and to evaluate its impact on endoscopists' ability to identify biliopancreatic diseases during segmentation and anatomical localization. The EUS-AI system was designed to perform station positioning and anatomical structure segmentation. A total of 45,737 EUS images from 1852 patients were used for model training; 2881 images were reserved for internal testing, and 2747 images from 208 patients were used for external validation. An additional 340 images formed a man-machine competition test set. Several recent state-of-the-art (SOTA) deep learning algorithms were compared during development. For the station recognition (classification) task, the Mean Teacher algorithm achieved the highest accuracy, outperforming ResNet-50 and YOLOv8-CLS, with an average of 95.60% (92.07%-99.12%) on the internal test set and 92.72% (88.30%-97.15%) on the external test set. For segmentation, the U-Net v2 algorithm outperformed UNet++ and YOLOv8. The EUS-AI system was then constructed from the optimal model for each task, and a man-machine competition was conducted. The results demonstrated that the EUS-AI system significantly outperformed mid-level endoscopists in both station recognition (p < 0.001) and pancreas and biliopancreatic duct segmentation (p < 0.001, p = 0.004). The EUS-AI system is expected to significantly shorten the learning curve for pancreatic EUS examination and enhance procedural standardization.
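
The Mean Teacher algorithm credited above with the best station-recognition accuracy is a semi-supervised scheme: an exponential-moving-average (EMA) "teacher" copy of the student network provides consistency targets on unlabeled images. The sketch below shows only these generic ingredients, not the authors' training code; the backbone, class count, loss weight, and EMA decay are assumptions.

```python
# Generic Mean Teacher step: an EMA "teacher" copy of the student supplies
# consistency targets for unlabeled images. Backbone, class count, loss weight,
# and EMA decay are assumptions, not the paper's values.
import copy
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

student = resnet18(num_classes=12)            # e.g. 12 EUS stations (assumed)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

def train_step(x_labeled, y, x_unlabeled_weak, x_unlabeled_strong,
               consistency_weight=1.0, ema_decay=0.99):
    sup_loss = F.cross_entropy(student(x_labeled), y)
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x_unlabeled_weak), dim=1)
    student_probs = F.softmax(student(x_unlabeled_strong), dim=1)
    cons_loss = F.mse_loss(student_probs, teacher_probs)   # consistency on unlabeled data
    loss = sup_loss + consistency_weight * cons_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Teacher weights follow the student as an exponential moving average.
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(ema_decay).add_(s, alpha=1.0 - ema_decay)
    return loss.item()
```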

Chest X-ray Foundation Model with Global and Local Representations Integration.

Yang Z, Xu X, Zhang J, Wang G, Kalra MK, Yan P

PubMed · Jun 23, 2025
Chest X-ray (CXR) is the most frequently ordered imaging test, supporting diverse clinical tasks from thoracic disease detection to postoperative monitoring. However, task-specific classification models are limited in scope, require costly labeled data, and lack generalizability to out-of-distribution datasets. To address these challenges, we introduce CheXFound, a self-supervised vision foundation model that learns robust CXR representations and generalizes effectively across a wide range of downstream tasks. We pretrained CheXFound on a curated CXR-987K dataset comprising approximately 987K unique CXRs from 12 publicly available sources. We propose a Global and Local Representations Integration (GLoRI) head for downstream adaptation, which combines fine- and coarse-grained disease-specific local features with global image features for enhanced performance in multilabel classification. Our experimental results showed that CheXFound outperformed state-of-the-art models in classifying 40 disease findings across different prevalence levels on the CXR-LT 24 dataset and exhibited superior label efficiency on downstream tasks with limited training data. Additionally, CheXFound achieved significant improvements on downstream tasks with out-of-distribution datasets, including opportunistic cardiovascular disease risk estimation, mortality prediction, malpositioned tube detection, and anatomical structure segmentation. These results demonstrate CheXFound's strong generalization capabilities, which will enable diverse downstream adaptations with improved label efficiency in future applications. The project source code is publicly available at https://github.com/RPIDIAL/CheXFound.
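
The GLoRI head is described above only at a high level; the sketch below shows one plausible reading of "global and local representations integration" for multilabel CXR classification: a learned query per finding attends over patch tokens to extract a disease-specific local feature, which is then combined with the global image token before a classifier. It is a hypothetical sketch, not the official implementation; all dimensions and design details are assumptions.

```python
# Hypothetical reading of a global + local integration head for multilabel CXR
# classification. Not the official GLoRI implementation; sizes are assumptions.
import torch
import torch.nn as nn

class GlobalLocalHead(nn.Module):
    def __init__(self, dim: int = 768, num_findings: int = 40, num_heads: int = 8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_findings, dim))   # one query per finding
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, 1)                       # shared across findings

    def forward(self, global_tok: torch.Tensor, patch_toks: torch.Tensor) -> torch.Tensor:
        # global_tok: (B, dim) pooled image feature; patch_toks: (B, N, dim) local features
        q = self.queries.unsqueeze(0).expand(patch_toks.size(0), -1, -1)   # (B, F, dim)
        local, _ = self.attn(q, patch_toks, patch_toks)                    # (B, F, dim)
        g = global_tok.unsqueeze(1).expand_as(local)                       # (B, F, dim)
        return self.classifier(torch.cat([local, g], dim=-1)).squeeze(-1)  # (B, F) logits

logits = GlobalLocalHead()(torch.randn(2, 768), torch.randn(2, 196, 768))
```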