Page 85 of 142 (1420 results)

GH-UNet: group-wise hybrid convolution-VIT for robust medical image segmentation.

Wang S, Li G, Gao M, Zhuo L, Liu M, Ma Z, Zhao W, Fu X

PubMed paper, Jul 10 2025
Medical image segmentation is vital for accurate diagnosis. While U-Net-based models are effective, they struggle to capture long-range dependencies in complex anatomy. We propose GH-UNet, a Group-wise Hybrid Convolution-ViT model within the U-Net framework, to address this limitation. GH-UNet integrates a hybrid convolution-Transformer encoder for both local detail and global context modeling, a Group-wise Dynamic Gating (GDG) module for adaptive feature weighting, and a cascaded decoder for multi-scale integration. Both the encoder and GDG are modular, enabling compatibility with various CNN or ViT backbones. Extensive experiments on five public and one private dataset show GH-UNet consistently achieves superior performance. On ISIC2016, it surpasses H2Former with 1.37% and 1.94% gains in DICE and IOU, respectively, using only 38% of the parameters and 49.61% of the FLOPs. The code is freely accessible via: https://github.com/xiachashuanghua/GH-UNet .
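
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how DICE and IOU are commonly computed for binary masks. It is not taken from the GH-UNet repository; the function name `dice_and_iou` and the toy masks are hypothetical, and masks are assumed to be flat lists of 0/1 values.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Pixels labelled foreground in both masks
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks empty: treat as perfect agreement (a convention, not a rule)
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy example: 6-pixel masks (illustrative values only)
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
dice, iou = dice_and_iou(pred, target)
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why papers reporting both usually show gains in the same direction.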

Dataset and Benchmark for Enhancing Critical Retained Foreign Object Detection

Yuli Wang, Victoria R. Shi, Liwei Zhou, Richard Chin, Yuwei Dai, Yuanyun Hu, Cheng-Yi Li, Haoyue Guan, Jiashu Cheng, Yu Sun, Cheng Ting Lin, Ihab Kamel, Premal Trivedi, Pamela Johnson, John Eng, Harrison Bai

arXiv preprint, Jul 9 2025
Critical retained foreign objects (RFOs), including surgical instruments like sponges and needles, pose serious patient safety risks and carry significant financial and legal implications for healthcare institutions. Detecting critical RFOs using artificial intelligence remains challenging due to their rarity and the limited availability of chest X-ray datasets that specifically feature critical RFO cases. Existing datasets only contain non-critical RFOs, such as necklaces or zippers, further limiting their utility for developing clinically impactful detection algorithms. To address these limitations, we introduce "Hopkins RFOs Bench", the first and largest dataset of its kind, containing 144 chest X-ray images of critical RFO cases collected over 18 years from the Johns Hopkins Health System. Using this dataset, we benchmark several state-of-the-art object detection models, highlighting the need for enhanced detection methodologies for critical RFO cases. Recognizing data scarcity challenges, we further explore image synthesis methods to bridge this gap. We evaluate two advanced image synthesis methods, DeepDRR-RFO, a physics-based method, and RoentGen-RFO, a diffusion-based method, for creating realistic radiographs featuring critical RFOs. Our comprehensive analysis identifies the strengths and limitations of each synthesis method, providing insights into effectively utilizing synthetic data to enhance model training. The Hopkins RFOs Bench and our findings significantly advance the development of reliable, generalizable AI-driven solutions for detecting critical RFOs in clinical chest X-rays.

SimCortex: Collision-free Simultaneous Cortical Surfaces Reconstruction

Kaveh Moradkhani, R Jarrett Rushmore, Sylvain Bouix

arXiv preprint, Jul 9 2025
Accurate cortical surface reconstruction from magnetic resonance imaging (MRI) data is crucial for reliable neuroanatomical analyses. Current methods have to contend with complex cortical geometries and strict topological requirements, and they often produce surfaces with overlaps, self-intersections, and topological defects. To overcome these shortcomings, we introduce SimCortex, a deep learning framework that simultaneously reconstructs all brain surfaces (left/right white-matter and pial) from T1-weighted (T1w) MRI volumes while preserving topological properties. Our method first segments the T1w image into a nine-class tissue label map. From these segmentations, we generate subject-specific, collision-free initial surface meshes. These surfaces serve as precise initializations for subsequent multiscale diffeomorphic deformations. Employing stationary velocity fields (SVFs) integrated via scaling-and-squaring, our approach ensures smooth, topology-preserving transformations with significantly reduced surface collisions and self-intersections. Evaluations on standard datasets demonstrate that SimCortex dramatically reduces surface overlaps and self-intersections, surpassing current methods while maintaining state-of-the-art geometric accuracy.
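
The scaling-and-squaring integration mentioned above can be illustrated with a one-dimensional toy. This is a sketch under stated assumptions, not the SimCortex implementation: the helper names `interp` and `integrate_svf`, the 1D grid, the linear interpolation, and the choice of six squaring steps are all hypothetical. The idea is to scale the velocity down by 2^steps so the initial map is close to identity, then compose the map with itself `steps` times.

```python
import math


def interp(values, x, spacing):
    """Linearly interpolate grid samples at position x (clamped to the grid)."""
    pos = min(max(x / spacing, 0.0), len(values) - 1)
    i = min(int(pos), len(values) - 2)
    frac = pos - i
    return (1 - frac) * values[i] + frac * values[i + 1]


def integrate_svf(velocity, spacing, steps=6):
    """Approximate the displacement of exp(v) for a stationary 1D velocity field."""
    # Scaling: start from the small displacement v / 2^steps
    disp = [v / (2 ** steps) for v in velocity]
    # Squaring: phi <- phi o phi, i.e. u(x) <- u(x) + u(x + u(x))
    for _ in range(steps):
        disp = [
            d + interp(disp, i * spacing + d, spacing)
            for i, d in enumerate(disp)
        ]
    return disp


# Toy field v(x) = -0.5 * x on [0, 1]; the exact flow is x * exp(-0.5)
spacing = 0.1
grid = [i * spacing for i in range(11)]
velocity = [-0.5 * x for x in grid]
disp = integrate_svf(velocity, spacing)
warped = [x + d for x, d in zip(grid, disp)]
```

In this toy the warped positions stay strictly increasing, the 1D analogue of a map that does not fold; in 3D the same composition scheme is what keeps the deformation smooth and invertible when the per-step displacement is small.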

Cross-Modality Masked Learning for Survival Prediction in ICI Treated NSCLC Patients

Qilong Xing, Zikai Song, Bingxin Gong, Lian Yang, Junqing Yu, Wei Yang

arXiv preprint, Jul 9 2025
Accurate prognosis of non-small cell lung cancer (NSCLC) patients undergoing immunotherapy is essential for personalized treatment planning, enabling informed patient decisions, and improving both treatment outcomes and quality of life. However, the lack of large, relevant datasets and effective multi-modal feature fusion strategies poses significant challenges in this domain. To address these challenges, we present a large-scale dataset and introduce a novel framework for multi-modal feature fusion aimed at enhancing the accuracy of survival prediction. The dataset comprises 3D CT images and corresponding clinical records from NSCLC patients treated with immune checkpoint inhibitors (ICI), along with progression-free survival (PFS) and overall survival (OS) data. We further propose a cross-modality masked learning approach for medical feature fusion, consisting of two distinct branches, each tailored to its respective modality: a Slice-Depth Transformer for extracting 3D features from CT images and a graph-based Transformer for learning node features and relationships among clinical variables in tabular data. The fusion process is guided by a masked modality learning strategy, wherein the model utilizes the intact modality to reconstruct missing components. This mechanism improves the integration of modality-specific features, fostering more effective inter-modality relationships and feature interactions. Our approach demonstrates superior performance in multi-modal integration for NSCLC survival prediction, surpassing existing methods and setting a new benchmark for prognostic models in this context.

Enhancing automated detection and classification of dementia in individuals with cognitive impairment using artificial intelligence techniques.

Alotaibi SD, Alharbi AAK

PubMed paper, Jul 9 2025
Dementia is a degenerative and chronic disorder, increasingly prevalent among older adults, posing significant challenges in providing appropriate care. As the number of dementia cases continues to rise, delivering optimal care becomes more complex. Machine learning (ML) plays a crucial role in addressing this challenge by utilizing medical data to enhance care planning and management for individuals at risk of various types of dementia. Magnetic resonance imaging (MRI) is a commonly used method for analyzing neurological disorders. Recent evidence highlights the benefits of integrating artificial intelligence (AI) techniques with MRI, significantly enhancing the diagnostic accuracy for different forms of dementia. This paper explores the use of AI in the automated detection and classification of dementia, aiming to streamline early diagnosis and improve patient outcomes. Integrating ML models into clinical practice can transform dementia care by enabling early detection, personalized treatment plans, and more effective monitoring of disease progression. In this study, an Enhancing Automated Detection and Classification of Dementia in Thinking Inability Persons using Artificial Intelligence Techniques (EADCD-TIPAIT) technique is presented. The goal of the EADCD-TIPAIT technique is to detect and classify dementia in individuals with cognitive impairment using MRI. To this end, the EADCD-TIPAIT method first preprocesses the input data by scaling it with z-score normalization. Next, the EADCD-TIPAIT technique performs a binary greylag goose optimization (BGGO)-based feature selection approach to efficiently identify relevant features that distinguish between normal and dementia-affected brain regions. In addition, the wavelet neural network (WNN) classifier is employed to detect and classify dementia. Finally, the improved salp swarm algorithm (ISSA) is implemented to choose the WNN technique's hyperparameters optimally. The simulation of the EADCD-TIPAIT technique is examined on a dementia prediction dataset. The performance validation of the EADCD-TIPAIT approach showed a superior accuracy of 95.00% across diverse measures.
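
The z-score preprocessing step named in this abstract is standard and can be sketched in a few lines. This is a generic illustration, not the authors' code: the function name `z_score` and the sample values are hypothetical, and the population standard deviation is an assumed choice (the abstract does not say which variant is used).

```python
import statistics


def z_score(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # Constant feature: every value equals the mean
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Illustrative feature column (mean 5, population std 2)
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = z_score(feature)
```

In a train/test setting the mean and standard deviation would normally be estimated on the training split only and reused for the test split, so that no test information leaks into preprocessing.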

Securing Healthcare Data Integrity: Deepfake Detection Using Autonomous AI Approaches.

Hsu CC, Tsai MY, Yu CM

PubMed paper, Jul 9 2025
The rapid evolution of deepfake technology poses critical challenges to healthcare systems, particularly in safeguarding the integrity of medical imaging, electronic health records (EHR), and telemedicine platforms. As autonomous AI becomes increasingly integrated into smart healthcare, the potential misuse of deepfakes to manipulate sensitive healthcare data or impersonate medical professionals highlights the urgent need for robust and adaptive detection mechanisms. In this work, we propose DProm, a dynamic deepfake detection framework leveraging visual prompt tuning (VPT) with a pre-trained Swin Transformer. Unlike traditional static detection models, which struggle to adapt to rapidly evolving deepfake techniques, DProm fine-tunes a small set of visual prompts to efficiently adapt to new data distributions with minimal computational and storage requirements. Comprehensive experiments demonstrate that DProm achieves state-of-the-art performance in both static cross-dataset evaluations and dynamic scenarios, ensuring robust detection across diverse data distributions. By addressing the challenges of scalability, adaptability, and resource efficiency, DProm offers a transformative solution for enhancing the security and trustworthiness of autonomous AI systems in healthcare, paving the way for safer and more reliable smart healthcare applications.

4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu, Xinrui Jiang, Renjie Li, Jian Wang, Yide Zhang, Gengchen Mai, Lihong V. Wang, James Zou, Xiaoyu Wang, Ming-Hsuan Yang, Zhengzhong Tu

arXiv preprint, Jul 9 2025
We present 4KAgent, a unified agentic super-resolution generalist system designed to universally upscale any image to 4K resolution (and even higher, if applied iteratively). Our system can transform images from extremely low resolutions with severe degradations, for example, highly distorted inputs at 256x256, into crystal-clear, photorealistic 4K outputs. 4KAgent comprises three core components: (1) Profiling, a module that customizes the 4KAgent pipeline based on bespoke use cases; (2) A Perception Agent, which leverages vision-language models alongside image quality assessment experts to analyze the input image and make a tailored restoration plan; and (3) A Restoration Agent, which executes the plan, following a recursive execution-reflection paradigm, guided by a quality-driven mixture-of-expert policy to select the optimal output for each step. Additionally, 4KAgent embeds a specialized face restoration pipeline, significantly enhancing facial details in portrait and selfie photos. We rigorously evaluate our 4KAgent across 11 distinct task categories encompassing a total of 26 diverse benchmarks, setting new state-of-the-art on a broad spectrum of imaging domains. Our evaluations cover natural images, portrait photos, AI-generated content, satellite imagery, fluorescence microscopy, and medical imaging like fundoscopy, ultrasound, and X-ray, demonstrating superior performance in terms of both perceptual (e.g., NIQE, MUSIQ) and fidelity (e.g., PSNR) metrics. By establishing a novel agentic paradigm for low-level vision tasks, we aim to catalyze broader interest and innovation within vision-centric autonomous agents across diverse research communities. We will release all the code, models, and results at: https://4kagent.github.io.

Machine learning techniques for stroke prediction: A systematic review of algorithms, datasets, and regional gaps.

Soladoye AA, Aderinto N, Popoola MR, Adeyanju IA, Osonuga A, Olawade DB

PubMed paper, Jul 9 2025
Stroke is a leading cause of mortality and disability worldwide, with approximately 15 million people suffering strokes annually. Machine learning (ML) techniques have emerged as powerful tools for stroke prediction, enabling early identification of risk factors through data-driven approaches. However, the clinical utility and performance characteristics of these approaches require systematic evaluation. This study aimed to systematically review and analyze ML techniques used for stroke prediction, synthesize performance metrics across different prediction targets and data sources, evaluate their clinical applicability, and identify research trends focusing on patient population characteristics and stroke prevalence patterns. A systematic review was conducted following PRISMA guidelines. Five databases (Google Scholar, Lens, PubMed, ResearchGate, and Semantic Scholar) were searched for open-access publications on ML-based stroke prediction published between January 2013 and December 2024. Data were extracted on publication characteristics, datasets, ML methodologies, evaluation metrics, prediction targets (stroke occurrence vs. outcomes), data sources (EHR, imaging, biosignals), patient demographics, and stroke prevalence. Descriptive synthesis was performed due to substantial heterogeneity precluding quantitative meta-analysis. Fifty-eight studies were included, with peak publication output in 2021 (21 articles). Studies targeted three main prediction objectives: stroke occurrence prediction (n = 52, 62.7%), stroke outcome prediction (n = 19, 22.9%), and stroke type classification (n = 12, 14.4%). Data sources included electronic health records (n = 48, 57.8%), medical imaging (n = 21, 25.3%), and biosignals (n = 14, 16.9%). Systematic analysis revealed that ensemble methods consistently achieved the highest accuracies for stroke occurrence prediction (range: 90.4-97.8%), while deep learning excelled in imaging-based applications. African populations, despite having the highest stroke mortality rates globally, were represented in fewer than 4 studies. ML techniques show promising results for stroke prediction. However, significant gaps exist in representation of high-risk populations and real-world clinical validation. Future research should prioritize population-specific model development and clinical implementation frameworks.

Population-scale cross-sectional observational study for AI-powered TB screening on one million CXRs.

Munjal P, Mahrooqi AA, Rajan R, Jeremijenko A, Ahmad I, Akhtar MI, Pimentel MAF, Khan S

PubMed paper, Jul 9 2025
Traditional tuberculosis (TB) screening involves radiologists manually reviewing chest X-rays (CXR), which is time-consuming, error-prone, and limited by workforce shortages. Our AI model, AIRIS-TB (AI Radiology In Screening TB), aims to address these challenges by automating the reporting of all X-rays without any findings. AIRIS-TB was evaluated on over one million CXRs, achieving an AUC of 98.51% and overall false negative rate (FNR) of 1.57%, outperforming radiologists (1.85%) while maintaining a 0% TB-FNR. By selectively deferring only cases with findings to radiologists, the model has the potential to automate up to 80% of routine CXR reporting. Subgroup analysis revealed insignificant performance disparities across age, sex, HIV status, and region of origin, with sputum tests for suspected TB showing a strong correlation with model predictions. This large-scale validation demonstrates AIRIS-TB's safety and efficiency in high-volume TB screening programs, reducing radiologist workload without compromising diagnostic accuracy.
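
The two headline figures above, AUC and false negative rate (FNR), follow standard definitions that can be sketched as below. This is not the AIRIS-TB evaluation code: the function names, the 0.5 decision threshold, and the six toy cases are hypothetical, and the AUC uses the pairwise (Mann-Whitney) formulation, which is only practical for small samples.

```python
def false_negative_rate(labels, scores, threshold=0.5):
    """FNR = FN / (FN + TP): share of positive cases the model misses."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("no positive cases")
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


def auc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative cases")
    # Ties between a positive and a negative score count as half a win
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = abnormal CXR, 0 = normal CXR (illustrative values only)
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
fnr = false_negative_rate(labels, scores)
area = auc(labels, scores)
```

AUC is threshold-free, whereas FNR depends on the operating point, so a screening system that auto-reports normal studies is typically tuned toward a threshold that keeps FNR very low.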

Airway Segmentation Network for Enhanced Tubular Feature Extraction

Qibiao Wu, Yagang Wang, Qian Zhang

arXiv preprint, Jul 9 2025
Manual annotation of airway regions in computed tomography images is a time-consuming and expertise-dependent task. Automatic airway segmentation is therefore a prerequisite for enabling rapid bronchoscopic navigation and the clinical deployment of bronchoscopic robotic systems. Although convolutional neural network methods have gained considerable attention in airway segmentation, the unique tree-like structure of airways poses challenges for conventional and deformable convolutions, which often fail to focus on fine airway structures, leading to missed segments and discontinuities. To address this issue, this study proposes a novel tubular feature extraction network, named TfeNet. TfeNet introduces a novel direction-aware convolution operation that first applies spatial rotation transformations to adjust the sampling positions of linear convolution kernels. The deformed kernels are then represented as line segments or polylines in 3D space. Furthermore, a tubular feature fusion module (TFFM) is designed based on asymmetric convolution and residual connection strategies, enhancing the network's focus on subtle airway structures. Extensive experiments conducted on one public dataset and two datasets used in airway segmentation challenges demonstrate that the proposed TfeNet achieves more accurate and continuous airway structure predictions compared with existing methods. In particular, TfeNet achieves the highest overall score of 94.95% on the current largest airway segmentation dataset, Airway Tree Modeling (ATM22), and demonstrates advanced performance on the lung fibrosis dataset (AIIB23). The code is available at https://github.com/QibiaoWu/TfeNet.