SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation

Guido Manni, Clemente Lauretti, Loredana Zollo, Paolo Soda

arXiv preprint · Aug 8, 2025
Deep learning has revolutionized medical imaging, but its effectiveness is severely limited by insufficient labeled training data. This paper introduces a novel GAN-based semi-supervised learning framework specifically designed for low labeled-data regimes, evaluated across settings with 5 to 50 labeled samples per class. Our approach integrates three specialized neural networks -- a generator for class-conditioned image translation, a discriminator for authenticity assessment and classification, and a dedicated classifier -- within a three-phase training framework. The method alternates between supervised training on limited labeled data and unsupervised learning that leverages abundant unlabeled images through image-to-image translation rather than generation from noise. We employ ensemble-based pseudo-labeling that combines confidence-weighted predictions from the discriminator and classifier with temporal consistency through exponential moving averaging, enabling reliable label estimation for unlabeled data. Comprehensive evaluation across eleven MedMNIST datasets demonstrates that our approach achieves statistically significant improvements over six state-of-the-art GAN-based semi-supervised methods, with particularly strong performance in the extreme 5-shot setting where the scarcity of labeled data is most challenging. The framework maintains its superiority across all evaluated settings (5, 10, 20, and 50 shots per class). Our approach offers a practical solution for medical imaging applications where annotation costs are prohibitive, enabling robust classification performance even with minimal labeled data. Code is available at https://github.com/GuidoManni/SPARSE.
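
The ensemble pseudo-labeling step described above can be sketched in a few lines. Below is a minimal NumPy illustration, assuming confidence-weighted averaging of the discriminator and classifier softmax outputs followed by exponential-moving-average smoothing; the weighting scheme, momentum, and confidence threshold are illustrative choices, not values taken from the paper or its repository.

```python
import numpy as np

def ema_update(prev, current, momentum=0.99):
    """Exponential moving average for temporal consistency of predictions."""
    return momentum * prev + (1.0 - momentum) * current

def ensemble_pseudo_labels(p_disc, p_clf, ema_probs, conf_threshold=0.9, momentum=0.99):
    """Fuse discriminator and classifier predictions on unlabeled images.

    p_disc, p_clf: (N, C) softmax outputs from the two networks.
    ema_probs:     (N, C) running EMA of previously fused probabilities.
    Returns the updated EMA, pseudo-label indices, and a mask of samples
    confident enough to use as pseudo-labeled training data.
    """
    # weight each network by its own confidence (max class probability)
    w_disc = p_disc.max(axis=1, keepdims=True)
    w_clf = p_clf.max(axis=1, keepdims=True)
    fused = (w_disc * p_disc + w_clf * p_clf) / (w_disc + w_clf)

    ema_probs = ema_update(ema_probs, fused, momentum)
    pseudo_labels = ema_probs.argmax(axis=1)
    keep = ema_probs.max(axis=1) >= conf_threshold
    return ema_probs, pseudo_labels, keep
```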

Development and validation of a transformer-based deep learning model for predicting distant metastasis in non-small cell lung cancer using ¹⁸FDG PET/CT images.

Hu N, Luo Y, Tang M, Yan G, Yuan S, Li F, Lei P

PubMed · Aug 8, 2025
This study aimed to develop and validate a hybrid deep learning (DL) model that integrates convolutional neural network (CNN) and vision transformer (ViT) architectures to predict distant metastasis (DM) in patients with non-small cell lung cancer (NSCLC) using ¹⁸F-FDG PET/CT images. A retrospective analysis was conducted on a cohort of consecutively registered patients who were newly diagnosed and untreated for NSCLC. A total of 167 patients with available PET/CT images were included in the analysis. DL features were extracted using a combination of CNN and ViT architectures, followed by feature selection, model construction, and evaluation of model performance using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The ViT-based DL model exhibited strong predictive capabilities in both the training and validation cohorts, achieving AUCs of 0.824 and 0.830 for CT features, and 0.602 and 0.694 for PET features, respectively. Notably, the model that integrated both PET and CT features achieved an AUC of 0.882 in the validation cohort, outperforming models that utilized either PET or CT features alone. Furthermore, this model outperformed the CNN model (ResNet-50), which achieved an AUC of 0.752 [95% CI 0.613, 0.890], p < 0.05. Decision curve analysis further supported the efficacy of the ViT-based DL model. The ViT-based DL model developed in this study demonstrates considerable potential in predicting DM in patients with NSCLC, potentially informing the creation of personalized treatment strategies. Future validation through prospective studies with larger cohorts is necessary.
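
The reported gain from combining PET and CT comes from a late-fusion step: deep features from both modalities are concatenated before classification. Below is a minimal sketch of that idea using synthetic placeholder features and a logistic-regression head with AUC evaluation; the paper's actual ViT/CNN feature extraction, feature selection, and cohort split are more involved.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 167                                # cohort size from the abstract
ct_feats = rng.normal(size=(n, 64))    # placeholder deep features extracted from CT
pet_feats = rng.normal(size=(n, 64))   # placeholder deep features extracted from PET
y = rng.integers(0, 2, size=n)         # placeholder distant-metastasis labels

# late fusion: concatenate PET and CT features before the classifier
X = np.concatenate([ct_feats, pet_feats], axis=1)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("validation AUC:", roc_auc_score(y_va, clf.predict_proba(X_va)[:, 1]))
```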

Explainable Cryobiopsy AI Model, CRAI, to Predict Disease Progression for Transbronchial Lung Cryobiopsies with Interstitial Pneumonia

Uegami, W., Okoshi, E. N., Lami, K., Nei, Y., Ozasa, M., Kataoka, K., Kitamura, Y., Kohashi, Y., Cooper, L. A. D., Sakanashi, H., Saito, Y., Kondoh, Y., the study group on CRYOSOLUTION, Fukuoka, J.

medRxiv preprint · Aug 8, 2025
Background: Interstitial lung disease (ILD) encompasses diverse pulmonary disorders with varied prognoses. Current pathological diagnoses suffer from inter-observer variability, necessitating more standardized approaches. We developed CRAI, an explainable ensemble artificial intelligence model that analyzes transbronchial lung cryobiopsy (TBLC) specimens and predicts patient outcomes. Methods: CRAI comprises seven modules for detecting histological features, generating 19 pathologically significant findings. A downstream XGBoost classifier was developed to predict disease progression from these findings. The model's performance was evaluated using respiratory function changes and survival analysis in cross-validation and external test cohorts. Findings: In the internal cross-validation (135 cases), the model predicted 105 cases without disease progression and 30 with disease progression. The annual Δ%FVC was -1.293 in the non-progressive group versus -5.198 in the progressive group, outperforming most pathologists' diagnoses. In the external test cohort (48 cases), the model predicted 38 non-progressive and 10 progressive cases. Survival analysis demonstrated significantly shorter survival times in the progressive group (p=0.034). Interpretation: CRAI provides a comprehensive, interpretable approach to analyzing TBLC specimens, offering potential for standardizing ILD diagnosis and predicting disease progression. The model could facilitate early identification of progressive cases and guide personalized therapeutic interventions. Funding: New Energy and Industrial Technology Development Organization (NEDO) and the Japanese Ministry of Health, Labour and Welfare.
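
The downstream prediction stage, an XGBoost classifier over the 19 findings produced by the histology modules, is straightforward to prototype. Here is a minimal sketch with placeholder finding scores and labels; the hyperparameters are assumptions, not the study's settings.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_cases, n_findings = 135, 19               # internal cohort size and finding count from the abstract
X = rng.random(size=(n_cases, n_findings))  # placeholder per-case scores for the 19 findings
y = rng.integers(0, 2, size=n_cases)        # placeholder progression labels (1 = progressive)

clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.05, eval_metric="logloss")
print("cross-validated AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```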

Multivariate Fields of Experts

Stanislas Ducotterd, Michael Unser

arXiv preprint · Aug 8, 2025
We introduce the multivariate fields of experts, a new framework for the learning of image priors. Our model generalizes existing fields of experts methods by incorporating multivariate potential functions constructed via Moreau envelopes of the $\ell_\infty$-norm. We demonstrate the effectiveness of our proposal across a range of inverse problems that include image denoising, deblurring, compressed-sensing magnetic-resonance imaging, and computed tomography. The proposed approach outperforms comparable univariate models and achieves performance close to that of deep-learning-based regularizers while being significantly faster, requiring fewer parameters, and being trained on substantially fewer data. In addition, our model retains a relatively high level of interpretability due to its structured design.
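
For readers unfamiliar with the construction, the potentials are built from the Moreau envelope of the ℓ∞-norm; the sketch below uses the standard definition, and the paper's exact parameterization (learned filters, per-filter smoothing) may differ.

```latex
% Moreau envelope of the \ell_\infty-norm with smoothing parameter \mu > 0:
\operatorname{env}_{\mu}\|\cdot\|_{\infty}(\mathbf{z})
  \;=\; \min_{\mathbf{u}} \left( \|\mathbf{u}\|_{\infty}
        + \tfrac{1}{2\mu}\,\|\mathbf{z}-\mathbf{u}\|_{2}^{2} \right)
% This smooth, convex multivariate potential is applied to grouped filter
% responses in a fields-of-experts-style regularizer:
R(\mathbf{x}) \;=\; \sum_{k} \operatorname{env}_{\mu_k}\|\cdot\|_{\infty}\!\left(\mathbf{W}_{k}\mathbf{x}\right)
```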

MedCLIP-SAMv2: Towards universal text-driven medical image segmentation.

Koleilat T, Asgariandehkordi H, Rivaz H, Xiao Y

PubMed · Aug 7, 2025
Segmentation of anatomical structures and pathologies in medical images is essential for modern disease diagnosis, clinical research, and treatment planning. While significant advancements have been made in deep learning-based segmentation techniques, many of these methods still suffer from limitations in data efficiency, generalizability, and interactivity. As a result, developing robust segmentation methods that require fewer labeled datasets remains a critical challenge in medical image analysis. Recently, the introduction of foundation models like CLIP and Segment-Anything-Model (SAM), with robust cross-domain representations, has paved the way for interactive and universal image segmentation. However, further exploration of these models for data-efficient segmentation in medical imaging is an active field of research. In this paper, we introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans using text prompts, in both zero-shot and weakly supervised settings. Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss, and leveraging the Multi-modal Information Bottleneck (M2IB) to create visual prompts for generating segmentation masks with SAM in the zero-shot setting. We also investigate using zero-shot segmentation labels in a weakly supervised paradigm to enhance segmentation quality further. Extensive validation across four diverse segmentation tasks and medical imaging modalities (breast tumor ultrasound, brain tumor MRI, lung X-ray, and lung CT) demonstrates the high accuracy of our proposed framework. Our code is available at https://github.com/HealthX-Lab/MedCLIP-SAMv2.
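
The fine-tuning objective combines decoupling (removing the positive pair from the contrastive denominator) with hard-negative weighting. A generic PyTorch sketch of that idea is below; it illustrates the mechanism but is not the paper's exact DHN-NCE formulation, and the temperature and weighting parameter are placeholder values.

```python
import torch
import torch.nn.functional as F

def hard_negative_contrastive_loss(img_emb, txt_emb, temperature=0.07, beta=1.0):
    """Decoupled, hard-negative-weighted image-text contrastive loss (illustrative only)."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature            # (B, B) cosine similarities
    B = logits.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=logits.device)

    pos = logits.diag()                             # matched image-text pairs
    neg = logits.masked_fill(eye, float("-inf"))    # decoupling: drop positives from the denominator

    # up-weight harder (more similar) negatives before summing the denominator
    w = torch.softmax(beta * neg, dim=1).detach()
    denom = torch.log((w * neg.exp()).sum(dim=1) + 1e-8)
    return (denom - pos).mean()
```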

Structured Report Generation for Breast Cancer Imaging Based on Large Language Modeling: A Comparative Analysis of GPT-4 and DeepSeek.

Chen K, Hou X, Li X, Xu W, Yi H

PubMed · Aug 7, 2025
The purpose of this study is to compare the performance of the GPT-4 and DeepSeek large language models in generating structured breast cancer multimodality imaging integrated reports from free-text radiology reports, including mammography, ultrasound, MRI, and PET/CT. A retrospective analysis was conducted on 1358 free-text reports from 501 breast cancer patients across two institutions. The study design involved synthesizing multimodal imaging data into structured reports with three components: primary lesion characteristics, metastatic lesions, and TNM staging. Input prompts were standardized for both models, with GPT-4 using predesigned instructions and DeepSeek requiring manual input. Reports were evaluated based on physician satisfaction using a Likert scale; descriptive accuracy including lesion localization, size, SUV, and metastasis assessment; and TNM staging correctness according to NCCN guidelines. Statistical analysis included McNemar tests for binary outcomes and correlation analysis for multiclass comparisons, with a significance threshold of P < .05. Physician satisfaction scores showed strong correlation between models, with r-values of 0.665 and 0.558 and P-values below .001. Both models demonstrated high accuracy in data extraction and integration. The mean accuracy for primary lesion features was 91.7% for GPT-4 and 92.1% for DeepSeek, while feature synthesis accuracy was 93.4% for GPT-4 and 93.9% for DeepSeek. Metastatic lesion identification showed comparable overall accuracy at 93.5% for GPT-4 and 94.4% for DeepSeek. GPT-4 performed better in pleural lesion detection with 94.9% accuracy compared to 79.5% for DeepSeek, whereas DeepSeek achieved higher accuracy in mesenteric metastasis identification at 87.5% vs 43.8% for GPT-4. TNM staging accuracy exceeded 92% for T-stage and 94% for M-stage, with N-stage accuracy improving beyond 90% when supplemented with physical exam data. Both GPT-4 and DeepSeek effectively generate structured breast cancer imaging reports with high accuracy in data mining, integration, and TNM staging. Integrating these models into clinical practice is expected to enhance report standardization and physician productivity.
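
The paired model-vs-model comparisons on binary outcomes use McNemar tests. A minimal statsmodels sketch with placeholder counts (not figures from the study):

```python
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 table of paired judgements on the same reports:
# rows = GPT-4 (correct, incorrect); columns = DeepSeek (correct, incorrect).
table = [[420, 35],
         [28, 18]]  # placeholder counts

result = mcnemar(table, exact=False, correction=True)
print(f"McNemar chi2 = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```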

A Multimodal Deep Learning Ensemble Framework for Building a Spine Surgery Triage System.

Siavashpour M, McCabe E, Nataraj A, Pareek N, Zaiane O, Gross D

PubMed · Aug 7, 2025
Spinal radiology reports and physician-completed questionnaires serve as crucial resources for medical decision-making for patients experiencing low back and neck pain. However, because this process is time-consuming, individuals with severe conditions may experience a deterioration in their health before receiving professional care. In this work, we propose an ensemble framework built on top of pre-trained BERT-based models that classifies patients by their need for surgery from different data modalities, including radiology reports and questionnaires. Our results demonstrate that our approach outperforms previous studies, effectively integrating information from multiple data modalities and serving as a valuable tool to assist physicians in decision-making.
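
One simple way to combine the per-modality BERT models is soft voting over their predicted surgery probabilities, as sketched below; the weights and decision threshold are assumptions, and the paper's ensemble may combine modalities differently.

```python
import numpy as np

def soft_vote(prob_report, prob_questionnaire, weights=(0.5, 0.5), threshold=0.5):
    """Soft-voting ensemble over per-modality 'needs surgery' probabilities.

    prob_report:        (N,) probabilities from a BERT model fine-tuned on radiology reports.
    prob_questionnaire: (N,) probabilities from a model fine-tuned on patient questionnaires.
    """
    fused = weights[0] * np.asarray(prob_report) + weights[1] * np.asarray(prob_questionnaire)
    return (fused >= threshold).astype(int), fused

# example: two patients; the modalities disagree on the second one
labels, scores = soft_vote([0.9, 0.4], [0.8, 0.7])
print(labels, scores)
```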

Role of AI in Clinical Decision-Making: An Analysis of FDA Medical Device Approvals.

Fernando P, Lyell D, Wang Y, Magrabi F

PubMed · Aug 7, 2025
The U.S. Food and Drug Administration (FDA) plays an important role in ensuring the safety and effectiveness of AI/ML-enabled devices through its regulatory processes. In recent years, there has been an increase in the number of these devices cleared by the FDA. This study analyzes 104 FDA-approved ML-enabled medical devices from May 2021 to April 2023, extending previous research to provide a contemporary perspective on this evolving landscape. We examined clinical task, device task, device input and output, ML method, and level of autonomy. Most approvals (n = 103) were via the 510(k) premarket notification pathway, indicating substantial equivalence to existing devices. Devices predominantly supported diagnostic tasks (n = 81). The majority of devices used imaging data (n = 99), with CT and MRI being the most common modalities. Device autonomy levels were distributed as follows: 52% assistive (requiring users to confirm or approve AI-provided information or decisions), 27% autonomous information, and 21% autonomous decision. The prevalence of assistive devices indicates a cautious approach to integrating ML into clinical decision-making, favoring support rather than replacement of human judgment.

MoMA: A Mixture-of-Multimodal-Agents Architecture for Enhancing Clinical Prediction Modelling

Jifan Gao, Mahmudur Rahman, John Caskey, Madeline Oguss, Ann O'Rourke, Randy Brown, Anne Stey, Anoop Mayampurath, Matthew M. Churpek, Guanhua Chen, Majid Afshar

arXiv preprint · Aug 7, 2025
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. When evaluated on three prediction tasks using real-world datasets with different modality combinations and prediction settings, MoMA outperforms current state-of-the-art methods, highlighting its enhanced accuracy and flexibility across various tasks.
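
The specialist/aggregator/predictor chain can be expressed as three sequential LLM calls. The sketch below is schematic, assuming a hypothetical call_llm helper and text inputs standing in for the non-textual modalities; the actual prompts, agents, and modality handling in MoMA are more elaborate.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a chat-completion call; swap in a real client."""
    raise NotImplementedError

def moma_predict(image_findings: str, lab_values: str, clinical_notes: str) -> str:
    # Specialist agents: convert non-textual modalities into textual summaries.
    img_summary = call_llm(f"Summarize these imaging findings:\n{image_findings}")
    lab_summary = call_llm(f"Summarize these laboratory results:\n{lab_values}")

    # Aggregator agent: fuse the modality summaries with the clinical notes.
    unified = call_llm(
        "Combine the following into one patient summary.\n"
        f"Imaging: {img_summary}\nLabs: {lab_summary}\nNotes: {clinical_notes}"
    )

    # Predictor agent: produce the clinical prediction from the unified summary.
    return call_llm(f"Given this summary, predict the outcome (yes/no):\n{unified}")
```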

MM2CT: MR-to-CT translation for multi-modal image fusion with mamba

Chaohui Gong, Zhiying Wu, Zisheng Huang, Gaofeng Meng, Zhen Lei, Hongbin Liu

arXiv preprint · Aug 7, 2025
Magnetic resonance (MR)-to-computed tomography (CT) translation offers significant advantages, including the elimination of radiation exposure associated with CT scans and the mitigation of imaging artifacts caused by patient motion. Existing approaches are based on single-modality MR-to-CT translation, with limited research exploring multimodal fusion. To address this limitation, we introduce Multi-modal MR-to-CT (MM2CT) translation, an innovative Mamba-based framework for multi-modal medical image synthesis that leverages both T1- and T2-weighted MRI data. Mamba effectively overcomes the limited local receptive field of CNNs and the high computational complexity of Transformers. MM2CT leverages this advantage to maintain long-range dependency modeling capabilities while achieving multi-modal MR feature integration. Additionally, we incorporate a dynamic local convolution module and a dynamic enhancement module to improve MR-to-CT synthesis. Experiments on a public pelvis dataset demonstrate that MM2CT achieves state-of-the-art performance in terms of Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR). Our code is publicly available at https://github.com/Gots-ch/MM2CT.
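
The reported metrics, SSIM and PSNR, are easy to reproduce for any synthesized CT slice; below is a minimal scikit-image sketch with synthetic arrays standing in for the reference and predicted CT.

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

rng = np.random.default_rng(0)
ct_true = rng.random((256, 256)).astype(np.float32)  # placeholder reference CT slice
ct_pred = np.clip(ct_true + 0.05 * rng.normal(size=ct_true.shape), 0, 1).astype(np.float32)

print("SSIM:", structural_similarity(ct_true, ct_pred, data_range=1.0))
print("PSNR (dB):", peak_signal_noise_ratio(ct_true, ct_pred, data_range=1.0))
```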