MaskedCLIP: Bridging the Masked and CLIP Space for Semi-Supervised Medical Vision-Language Pre-training

Lei Zhu, Jun Zhou, Rick Siow Mong Goh, Yong Liu

arXiv preprint, Jul 23 2025
Foundation models have recently gained tremendous popularity in medical image analysis. State-of-the-art methods leverage either paired image-text data via vision-language pre-training or unpaired image data via self-supervised pre-training to learn foundation models with generalizable image features that boost downstream task performance. However, learning foundation models exclusively on either paired or unpaired image data limits their ability to learn richer and more comprehensive image features. In this paper, we investigate a novel task termed semi-supervised vision-language pre-training, aiming to fully harness the potential of both paired and unpaired image data for foundation model learning. To this end, we propose MaskedCLIP, a synergistic masked image modeling and contrastive language-image pre-training framework for semi-supervised vision-language pre-training. The key challenge in combining paired and unpaired image data for learning a foundation model lies in the incompatible feature spaces derived from these two types of data. To address this issue, we propose to connect the masked feature space with the CLIP feature space via a bridge transformer. In this way, the more semantically specific CLIP features can benefit from the more general masked features for semantic feature extraction. We further propose a masked knowledge distillation loss to distill semantic knowledge from the original image features in the CLIP feature space back to the predicted masked image features in the masked feature space. With this mutually interactive design, our framework effectively leverages both paired and unpaired image data to learn more generalizable image features for downstream tasks. Extensive experiments on retinal image analysis demonstrate the effectiveness and data efficiency of our method.
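
A minimal PyTorch sketch may make the bridge design concrete: masked-encoder tokens are mapped into a CLIP-style feature space by a small "bridge" transformer, and a distillation loss pulls the predicted masked features toward frozen CLIP image features. All module names, depths, and dimensions below are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BridgeTransformer(nn.Module):
    """Maps masked-encoder tokens into the CLIP feature space (hypothetical)."""
    def __init__(self, dim=768, depth=2, heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)
        self.proj = nn.Linear(dim, dim)

    def forward(self, masked_tokens):
        return self.proj(self.blocks(masked_tokens))

def masked_distillation_loss(pred, target):
    """Cosine-style distillation: align predicted masked features with
    frozen CLIP image features (stop-gradient on the target)."""
    pred = F.normalize(pred, dim=-1)
    target = F.normalize(target.detach(), dim=-1)
    return (1.0 - (pred * target).sum(-1)).mean()

# Toy usage with random tensors standing in for real features.
tokens = torch.randn(4, 196, 768)      # masked-encoder outputs
clip_feats = torch.randn(4, 196, 768)  # frozen CLIP image features
loss = masked_distillation_loss(BridgeTransformer()(tokens), clip_feats)
```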

Kissing Spine and Other Imaging Predictors of Postoperative Cement Displacement Following Percutaneous Kyphoplasty: A Machine Learning Approach.

Zhao Y, Bo L, Qian L, Chen X, Wang Y, Cui L, Xin Y, Liu L

PubMed paper, Jul 23 2025
To investigate the risk factors associated with postoperative cement displacement following percutaneous kyphoplasty (PKP) in patients with osteoporotic vertebral compression fractures (OVCF), and to develop predictive models for clinical risk assessment. This retrospective study included 198 patients with OVCF who underwent PKP. Imaging and clinical variables were collected. Multiple machine learning models, including logistic regression, L1- and L2-regularized logistic regression, support vector machine (SVM), decision tree, gradient boosting, and random forest, were developed to predict cement displacement. The L1- and L2-regularized logistic regression models identified four key risk factors: kissing spine (L1: 1.11; L2: 0.91), incomplete anterior cortex (L1: -1.60; L2: -1.62), low vertebral body CT value (L1: -2.38; L2: -1.71), and large Cobb angle change (L1: 0.89; L2: 0.87). The SVM model achieved the best performance (accuracy: 0.983, precision: 0.875, recall: 1.000, F1-score: 0.933, specificity: 0.981, AUC: 0.997). The other models, including logistic regression, decision tree, gradient boosting, and random forest, also performed well but were slightly inferior to the SVM. Key predictors of cement displacement were identified, and machine learning models were developed for risk assessment. These findings can assist clinicians in identifying high-risk patients, optimizing treatment strategies, and improving patient outcomes.
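
For readers who want the modelling strategy in outline, the scikit-learn sketch below trains L1- and L2-regularized logistic regression alongside an SVM and reports accuracy and AUC. The synthetic arrays merely stand in for the study's four imaging predictors; they are not the patient data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
# Columns stand in for: kissing spine, anterior cortex integrity,
# vertebral body CT value, Cobb angle change (synthetic values).
X = rng.normal(size=(198, 4))
y = rng.integers(0, 2, size=198)  # cement displacement yes/no (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logreg_l1": LogisticRegression(penalty="l1", solver="liblinear"),
    "logreg_l2": LogisticRegression(penalty="l2"),
    "svm_rbf": SVC(kernel="rbf", probability=True),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    prob = model.predict_proba(X_te)[:, 1]
    pred = (prob >= 0.5).astype(int)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f} "
          f"auc={roc_auc_score(y_te, prob):.3f}")
```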

Mammo-Mamba: A Hybrid State-Space and Transformer Architecture with Sequential Mixture of Experts for Multi-View Mammography

Farnoush Bayatmakou, Reza Taleei, Nicole Simone, Arash Mohammadi

arXiv preprint, Jul 23 2025
Breast cancer (BC) remains one of the leading causes of cancer-related mortality among women, despite recent advances in Computer-Aided Diagnosis (CAD) systems. Accurate and efficient interpretation of multi-view mammograms is essential for early detection, driving a surge of interest in Artificial Intelligence (AI)-powered CAD models. While state-of-the-art multi-view mammogram classification models are largely based on Transformer architectures, their computational complexity scales quadratically with the number of image patches, highlighting the need for more efficient alternatives. To address this challenge, we propose Mammo-Mamba, a novel framework that integrates Selective State-Space Models (SSMs), transformer-based attention, and expert-driven feature refinement into a unified architecture. Mammo-Mamba extends the MambaVision backbone by introducing the Sequential Mixture of Experts (SeqMoE) mechanism through its customized SecMamba block. SecMamba is a modified MambaVision block that enhances representation learning in high-resolution mammographic images by enabling content-adaptive feature refinement. These blocks are integrated into the deeper stages of MambaVision, allowing the model to progressively adjust feature emphasis through dynamic expert gating, effectively mitigating the limitations of traditional Transformer models. Evaluated on the CBIS-DDSM benchmark dataset, Mammo-Mamba achieves superior classification performance across all key metrics while maintaining computational efficiency.
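
The expert-gating idea can be sketched as a residual mixture-of-experts block. The toy PyTorch module below is one plausible reading (gating weights computed from pooled tokens, experts applied in parallel), not the actual SecMamba/MambaVision implementation.

```python
import torch
import torch.nn as nn

class SeqMoEBlock(nn.Module):
    """Toy mixture-of-experts refinement block (illustrative only)."""
    def __init__(self, dim=256, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
             for _ in range(num_experts)])
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x):                                 # x: (batch, tokens, dim)
        weights = self.gate(x.mean(dim=1)).softmax(-1)    # per-sample expert gates
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)
        refined = (weights[:, :, None, None] * expert_out).sum(dim=1)
        return x + refined                                # residual refinement

x = torch.randn(2, 196, 256)
print(SeqMoEBlock()(x).shape)                             # torch.Size([2, 196, 256])
```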

Fetal neurobehavior and consciousness: a systematic review of 4D ultrasound evidence and ethical challenges.

Pramono MBA, Andonotopo W, Bachnas MA, Dewantiningrum J, Sanjaya INH, Sulistyowati S, Stanojevic M, Kurjak A

PubMed paper, Jul 23 2025
Recent advancements in four-dimensional (4D) ultrasonography have enabled detailed observation of fetal behavior in utero, including facial movements, limb gestures, and stimulus responses. These developments have prompted renewed inquiry into whether such behaviors are merely reflexive or represent early signs of integrated neural function. However, the relationship between fetal movement patterns and conscious awareness remains scientifically uncertain and ethically contested. A systematic review was conducted in accordance with PRISMA 2020 guidelines. Four databases (PubMed, Scopus, Embase, Web of Science) were searched for English-language articles published from 2000 to 2025, using keywords including "fetal behavior," "4D ultrasound," "neurodevelopment," and "consciousness." Studies were included if they involved human fetuses, used 4D ultrasound or functional imaging modalities, and offered interpretation relevant to neurobehavioral or ethical analysis. A structured appraisal using AMSTAR-2 was applied to assess study quality. Data were synthesized narratively to map fetal behaviors onto developmental milestones and evaluate their interpretive limits. Seventy-four studies met inclusion criteria, with 23 rated as high-quality. Fetal behaviors such as yawning, hand-to-face movement, and startle responses increased in complexity between 24 and 34 weeks of gestation. These patterns aligned with known neurodevelopmental events, including thalamocortical connectivity and cortical folding. However, no study provided definitive evidence linking observed behaviors to conscious experience. Emerging applications of artificial intelligence in ultrasound analysis were found to enhance pattern recognition but lack external validation. Fetal behavior observed via 4D ultrasound may reflect increasing neural integration but should not be equated with awareness. Interpretations must remain cautious, avoiding anthropomorphic assumptions. Ethical engagement requires attention to scientific limits, sociocultural diversity, and respect for maternal autonomy as imaging technologies continue to evolve.

VGS-ATD: Robust Distributed Learning for Multi-Label Medical Image Classification Under Heterogeneous and Imbalanced Conditions

Zehui Zhao, Laith Alzubaidi, Haider A. Alwzwazy, Jinglan Zhang, Yuantong Gu

arXiv preprint, Jul 23 2025
In recent years, advanced deep learning architectures have shown strong performance in medical imaging tasks. However, the traditional centralized learning paradigm poses serious privacy risks, as all data is collected and trained on a single server. To mitigate this challenge, decentralized approaches such as federated learning and swarm learning have emerged, allowing model training on local nodes while sharing only model weights. While these methods enhance privacy, they struggle with heterogeneous and imbalanced data and suffer from inefficiencies due to frequent communication and weight aggregation. More critically, the dynamic and complex nature of clinical environments demands scalable AI systems capable of continuously learning from diverse modalities and label sets. Yet both centralized and decentralized models are prone to catastrophic forgetting during system expansion, often requiring full model retraining to incorporate new data. To address these limitations, we propose VGS-ATD, a novel distributed learning framework. To validate VGS-ATD, we evaluated it in experiments spanning 30 datasets and 80 independent labels across distributed nodes, where it achieved an overall accuracy of 92.7%, outperforming centralized learning (84.9%) and swarm learning (72.99%); federated learning failed under these conditions due to its high computational resource requirements. VGS-ATD also demonstrated strong scalability, with only a 1% drop in accuracy on existing nodes after expansion, compared to a 20% drop for centralized learning, highlighting its resilience to catastrophic forgetting. Additionally, it reduced computational costs by up to 50% relative to both centralized and swarm learning, confirming its superior efficiency and scalability.
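
The expansion experiment is easy to emulate in miniature: train incrementally on per-node data, add a new node with shifted data, and measure the accuracy drop on the original nodes. The sketch below uses a plain SGD classifier as a stand-in for any distributed update rule, since the abstract does not disclose VGS-ATD's internals.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_node(n=500, shift=0.0):
    """Synthetic per-node data with a controllable distribution shift."""
    X = rng.normal(loc=shift, size=(n, 20))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

nodes = [make_node(shift=s) for s in (0.0, 0.5)]
model = SGDClassifier(loss="log_loss", random_state=0)
classes = np.array([0, 1])
for X, y in nodes:
    model.partial_fit(X, y, classes=classes)   # incremental per-node training

before = [accuracy_score(y, model.predict(X)) for X, y in nodes]
model.partial_fit(*make_node(shift=2.0), classes=classes)  # system expansion
after = [accuracy_score(y, model.predict(X)) for X, y in nodes]
print("existing-node accuracy before expansion:", np.round(before, 3))
print("existing-node accuracy after expansion: ", np.round(after, 3))
```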

Interpretable Deep Learning Approaches for Reliable GI Image Classification: A Study with the HyperKvasir Dataset

Wahid, S. B., Rothy, Z. T., News, R. K., Rieyan, S. A.

medRxiv preprint, Jul 23 2025
Deep learning has emerged as a promising tool for automating gastrointestinal (GI) disease diagnosis. However, multi-class GI disease classification remains underexplored. This study addresses this gap by presenting a framework that uses advanced models such as InceptionNetV3 and ResNet50, combined with boosting algorithms (XGBoost, LightGBM), to classify lower GI abnormalities. InceptionNetV3 with XGBoost achieved the best recall of 0.81 and an F1 score of 0.90. To assist clinicians in understanding model decisions, Grad-CAM, an explainable AI technique, was employed to highlight the critical regions influencing predictions, fostering trust in these systems. This approach significantly improves both the accuracy and reliability of GI disease diagnosis.
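
A hedged sketch of the hybrid CNN-plus-boosting design: a pretrained CNN serves as a frozen feature extractor and a gradient-boosting classifier is trained on the pooled features. Dataset loading and Grad-CAM are omitted; image shapes and the class count are placeholders rather than HyperKvasir specifics, and ResNet50 stands in for either backbone.

```python
import numpy as np
import torch
from torchvision.models import resnet50, ResNet50_Weights
from xgboost import XGBClassifier

backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()          # expose 2048-d pooled features
backbone.eval()

with torch.no_grad():
    images = torch.randn(32, 3, 224, 224)  # stand-in for GI endoscopy images
    feats = backbone(images).numpy()       # (32, 2048) feature matrix

labels = np.arange(32) % 5                 # stand-in multi-class labels
clf = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="mlogloss")
clf.fit(feats, labels)
print(clf.predict(feats[:4]))
```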

Interpretable AI Framework for Secure and Reliable Medical Image Analysis in IoMT Systems.

Matthew UO, Rosa RL, Saadi M, Rodriguez DZ

PubMed paper, Jul 23 2025
The integration of artificial intelligence (AI) into medical image analysis has transformed healthcare, offering unprecedented precision in diagnosis, treatment planning, and disease monitoring. However, its adoption within the Internet of Medical Things (IoMT) raises significant challenges related to transparency, trustworthiness, and security. This paper introduces a novel Explainable AI (XAI) framework tailored for Medical Cyber-Physical Systems (MCPS), addressing these challenges by combining deep neural networks with symbolic knowledge reasoning to deliver clinically interpretable insights. The framework incorporates an Enhanced Dynamic Confidence-Weighted Attention (Enhanced DCWA) mechanism, which improves interpretability and robustness by dynamically refining attention maps through adaptive normalization and multi-level confidence weighting. Additionally, a Resilient Observability and Detection Engine (RODE) leverages sparse observability principles to detect and mitigate adversarial threats, ensuring reliable performance in dynamic IoMT environments. Evaluations conducted on benchmark datasets, including CheXpert, RSNA Pneumonia Detection Challenge, and NIH Chest X-ray Dataset, demonstrate significant advancements in classification accuracy, adversarial robustness, and explainability. The framework achieves a 15% increase in lesion classification accuracy, a 30% reduction in robustness loss, and a 20% improvement in the Explainability Index compared to state-of-the-art methods.
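
One plausible reading of the confidence-weighted attention idea is sketched below: attention maps are rescaled by a per-query confidence score derived from attention entropy, then adaptively renormalized. This is an illustration consistent with the description above, not the paper's Enhanced DCWA implementation.

```python
import torch

def confidence_weighted_attention(q, k, v):
    """q, k, v: (batch, heads, tokens, dim). Illustrative confidence weighting."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    attn = scores.softmax(dim=-1)
    # Confidence from attention entropy: peaked maps get higher weight.
    entropy = -(attn * (attn + 1e-9).log()).sum(-1, keepdim=True)
    confidence = torch.exp(-entropy)               # (batch, heads, tokens, 1)
    attn = attn * confidence
    attn = attn / attn.sum(-1, keepdim=True)       # adaptive renormalization
    return attn @ v

q = k = v = torch.randn(2, 8, 16, 64)
print(confidence_weighted_attention(q, k, v).shape)  # torch.Size([2, 8, 16, 64])
```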

Machine learning approach effectively discriminates between Parkinson's disease and progressive supranuclear palsy: multi-level indices of rs-fMRI.

Cheng W, Liang X, Zeng W, Guo J, Yin Z, Dai J, Hong D, Zhou F, Li F, Fang X

PubMed paper, Jul 22 2025
Parkinson's disease (PD) and progressive supranuclear palsy (PSP) present with similar clinical symptoms, but their treatment options and clinical prognoses differ significantly. We therefore aimed to discriminate between PD and PSP based on multi-level indices of resting-state functional magnetic resonance imaging (rs-fMRI) via a machine learning approach. A total of 58 PD and 52 PSP patients were prospectively enrolled in this study. Participants were randomly allocated to a training set and a validation set in a 7:3 ratio. Various rs-fMRI indices were extracted, followed by comprehensive feature screening for each index. We constructed fifteen distinct combinations of indices and selected four machine learning algorithms for model development. Subsequently, different validation templates were employed to assess the classification results and investigate the relationship between the most significant features and clinical assessment scales. The classification performance of logistic regression (LR) and support vector machine (SVM) models based on multiple index combinations was significantly superior to that of other machine learning models and combinations when using the automated anatomical labeling (AAL) template; this finding was verified across different templates. The use of multiple rs-fMRI indices significantly enhances the performance of machine learning models and can effectively achieve automatic identification of PD and PSP at the individual level.
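
The multi-index strategy translates naturally into a feature-screening pipeline. In the schematic below, synthetic arrays stand in for AAL-parcellated features from several rs-fMRI indices (e.g., ALFF, ReHo, functional connectivity); each index is screened separately, then the survivors are concatenated and fed to LR and SVM classifiers.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 110                                        # 58 PD + 52 PSP
y = np.array([0] * 58 + [1] * 52)              # 0 = PD, 1 = PSP
indices = {"alff": 116, "reho": 116, "fc": 300}  # per-index feature counts

# Screen each rs-fMRI index separately, then concatenate the survivors.
screened = []
for name, dim in indices.items():
    X_idx = rng.normal(size=(n, dim))          # placeholder features
    screened.append(SelectKBest(f_classif, k=20).fit_transform(X_idx, y))
X = np.hstack(screened)

for name, clf in {"LR": LogisticRegression(max_iter=1000),
                  "SVM": SVC(kernel="rbf")}.items():
    pipe = make_pipeline(StandardScaler(), clf)
    print(f"{name}: mean CV accuracy = "
          f"{cross_val_score(pipe, X, y, cv=5).mean():.2f}")
```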

CLIF-Net: Intersection-guided Cross-view Fusion Network for Infection Detection from Cranial Ultrasound

Yu, M., Peterson, M. R., Burgoine, K., Harbaugh, T., Olupot-Olupot, P., Gladstone, M., Hagmann, C., Cowan, F. M., Weeks, A., Morton, S. U., Mulondo, R., Mbabazi-Kabachelor, E., Schiff, S. J., Monga, V.

medRxiv preprint, Jul 22 2025
This paper addresses the problem of detecting possible serious bacterial infection (pSBI) of infancy, i.e., a clinical presentation consistent with bacterial sepsis in newborn infants, using cranial ultrasound (cUS) images. The captured image set for each patient enables multi-view imagery: coronal and sagittal, with geometric overlap. To exploit this geometric relation, we develop a new learning framework, the intersection-guided Cross-view Local- and Image-level Fusion Network (CLIF-Net). Our technique employs two distinct convolutional neural network branches to extract features from coronal and sagittal images with newly developed multi-level fusion blocks. Specifically, we leverage the spatial positions of these images to locate their intersecting region. We then identify and enhance the semantic features from this region across multiple levels using cross-attention modules, facilitating the acquisition of mutually beneficial and more representative features from both views. The final enhanced features from the two views are then integrated and projected through the image-level fusion layer, outputting pSBI and non-pSBI class probabilities. We contend that our method of exploiting multi-view cUS images enables a first-of-its-kind, robust 3D representation tailored for pSBI detection. When evaluated on a dataset of 302 cUS scans from Mbale Regional Referral Hospital in Uganda, CLIF-Net demonstrates substantially enhanced performance, surpassing prevailing state-of-the-art infection detection techniques.
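
A toy sketch of the cross-view idea: features from the coronal and sagittal branches attend to each other via cross-attention before image-level fusion. Intersection localization is reduced here to plain token-to-token attention; the branch and fusion details are assumptions, not CLIF-Net's architecture.

```python
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Illustrative two-view cross-attention and image-level fusion head."""
    def __init__(self, dim=256, heads=8, num_classes=2):
        super().__init__()
        self.cor_to_sag = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.sag_to_cor = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, cor, sag):                       # (batch, tokens, dim) each
        cor_enh, _ = self.cor_to_sag(cor, sag, sag)    # coronal queries sagittal
        sag_enh, _ = self.sag_to_cor(sag, cor, cor)    # sagittal queries coronal
        fused = torch.cat([cor_enh.mean(1), sag_enh.mean(1)], dim=-1)
        return self.head(fused)                        # pSBI vs non-pSBI logits

cor, sag = torch.randn(2, 64, 256), torch.randn(2, 64, 256)
print(CrossViewFusion()(cor, sag).shape)               # torch.Size([2, 2])
```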

Artificial intelligence in thyroid eye disease imaging: A systematic review.

Zhang H, Li Z, Chan HC, Song X, Zhou H, Fan X

PubMed paper, Jul 22 2025
Thyroid eye disease (TED) is a common, complex orbital disorder characterized by soft-tissue changes visible on imaging. Artificial intelligence (AI) offers promise for improving TED diagnosis and treatment; however, no systematic review has yet characterized the research landscape, key challenges, and future directions. Following PRISMA guidelines, we searched multiple databases through January 2025 for studies applying AI to computed tomography (CT), magnetic resonance imaging, and nuclear, facial, or retinal imaging in TED patients. Using the APPRAISE-AI tool, we assessed study quality and included 41 studies covering various AI applications. Sample sizes ranged from 33 to 2,288 participants, predominantly East Asian. CT and facial imaging were the most common modalities, reported in 16 and 13 articles, respectively. Studies addressed clinical tasks (diagnosis, activity assessment, severity grading, and treatment prediction) and technical tasks (classification, segmentation, and image generation), with classification being the most frequent. Researchers primarily employed deep learning models such as residual networks (ResNet) and Visual Geometry Group (VGG) networks. Overall, the majority of studies were of moderate quality. Image-based AI shows strong potential to improve diagnostic accuracy and guide personalized treatment strategies in TED. Future research should prioritize robust study designs, the creation of public datasets, multimodal imaging integration, and interdisciplinary collaboration to accelerate clinical translation.