Page 28 of 42411 results

Development of an Open-Source Algorithm for Automated Segmentation in Clinician-Led Paranasal Sinus Radiologic Research.

Darbari Kaul R, Zhong W, Liu S, Azemi G, Liang K, Zou E, Sacks PL, Thiel C, Campbell RG, Kalish L, Sacks R, Di Ieva A, Harvey RJ

pubmed · papers · May 27 2025
Artificial Intelligence (AI) research needs to be clinician led; however, expertise typically lies outside their skill set. Collaborations exist but are often commercially driven. Free and open-source computational algorithms and software expertise are required for meaningful clinically driven AI medical research. Deep learning algorithms automate segmenting regions of interest for analysis and clinical translation. Numerous studies have automatically segmented paranasal sinus computed tomography (CT) scans; however, openly accessible algorithms capturing the sinonasal cavity remain scarce. The purpose of this study was to validate and provide an open-source segmentation algorithm for paranasal sinus CTs for the otolaryngology research community. A cross-sectional comparative study was conducted with a deep learning algorithm, UNet++, modified for automatic segmentation of paranasal sinuses CTs and "ground-truth" manual segmentations. A dataset of 100 paranasal sinuses scans was manually segmented, with an 80/20 training/testing split. The algorithm is available at https://github.com/rheadkaul/SinusSegment. Primary outcomes included the Dice similarity coefficient (DSC) score, Intersection over Union (IoU), Hausdorff distance (HD), sensitivity, specificity, and visual similarity grading. Twenty scans representing 7300 slices were assessed. The mean DSC was 0.87 and IoU 0.80, with HD 33.61 mm. The mean sensitivity was 83.98% and specificity 99.81%. The median visual similarity grading score was 3 (good). There were no statistically significant differences in outcomes with normal or diseased paranasal sinus CTs. Automatic segmentation of CT paranasal sinuses yields good results when compared with manual segmentation. This study provides an open-source segmentation algorithm as a foundation and gateway for more complex AI-based analysis of large datasets.
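
The overlap metrics named in this abstract (DSC, IoU, sensitivity, specificity) can be computed from a pair of binary masks. The following is a minimal sketch, not the code from the SinusSegment repository; the function name `overlap_metrics` and the toy masks are assumptions made for illustration only.

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Return DSC, IoU, sensitivity and specificity for two binary masks.

    Hypothetical helper: assumes both masks have the same shape and that
    each denominator below is non-zero (no empty-mask handling).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)      # voxels labelled by both
    fp = np.sum(pred & ~truth)     # voxels labelled only by the algorithm
    fn = np.sum(~pred & truth)     # voxels labelled only by the annotator
    tn = np.sum(~pred & ~truth)    # background in both
    return {
        "dsc": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy 2x4 masks (made-up data, not from the study)
truth = [[1, 1, 0, 0], [1, 1, 0, 0]]
pred = [[1, 1, 1, 0], [1, 0, 0, 0]]
metrics = overlap_metrics(pred, truth)
print(metrics)
```

For a single pair of masks, DSC and IoU are linked by DSC = 2·IoU / (1 + IoU). The identity does not generally hold for means taken over many scans.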

Development of a No-Reference CT Image Quality Assessment Method Using RadImageNet Pre-trained Deep Learning Models.

Ohashi K, Nagatani Y, Yamazaki A, Yoshigoe M, Iwai K, Uemura R, Shimomura M, Tanimura K, Ishida T

pubmed · papers · May 27 2025
Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.
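
The evaluation described here correlates subjective scores with predicted quality scores. The abstract does not say which correlation coefficient was used, so the sketch below shows both a Pearson and a rank-based (Spearman) version as assumptions. The score arrays are invented for illustration, and the rank helper assumes there are no tied values.

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation between two 1-D score arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank correlation; this simple ranking assumes no ties."""
    def rank(a):
        return np.argsort(np.argsort(np.asarray(a)))
    return pearson(rank(x), rank(y))

# Hypothetical subjective scores and model outputs (not study data)
subjective = [1, 2, 3, 4, 5]
predicted = [1, 8, 27, 64, 125]   # monotonic but non-linear in the subjective score

r_pearson = pearson(subjective, predicted)
r_spearman = spearman(subjective, predicted)
print(r_pearson, r_spearman)
```

A monotonic but non-linear relationship gives a perfect rank correlation while the linear coefficient stays below 1, which is why the choice of coefficient matters when comparing reported values.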

An orchestration learning framework for ultrasound imaging: Prompt-Guided Hyper-Perception and Attention-Matching Downstream Synchronization.

Lin Z, Li S, Wang S, Gao Z, Sun Y, Lam CT, Hu X, Yang X, Ni D, Tan T

pubmed · papers · May 27 2025
Ultrasound imaging is pivotal in clinical diagnostics due to its affordability, portability, safety, real-time capability, and non-invasive nature. It is widely utilized for examining various organs, such as the breast, thyroid, ovary, and heart. However, the manual interpretation and annotation of ultrasound images are time-consuming and prone to variability among physicians. While single-task artificial intelligence (AI) solutions have been explored, they are not ideal for scaling AI applications in medical imaging. Foundation models, although a trending solution, often struggle with real-world medical datasets due to factors such as noise, variability, and an inability to flexibly align prior knowledge with task adaptation. To address these limitations, we propose an orchestration learning framework named PerceptGuide for general-purpose ultrasound classification and segmentation. Our framework incorporates a novel orchestration mechanism based on prompted hyper-perception, which adapts to the diverse inductive biases required by different ultrasound datasets. Unlike self-supervised pre-trained models, which require extensive fine-tuning, our approach leverages supervised pre-training to directly capture task-relevant features, providing a stronger foundation for multi-task and multi-organ ultrasound imaging. To support this research, we compiled a large-scale Multi-task, Multi-organ public ultrasound dataset (M²-US), featuring images from 9 organs and 16 datasets, encompassing both classification and segmentation tasks. Our approach employs four specific prompts (Object, Task, Input, and Position) to guide the model, ensuring task-specific adaptability. Additionally, a downstream synchronization training stage is introduced to fine-tune the model for new data, significantly improving generalization capabilities and enabling real-world applications.
Experimental results demonstrate the robustness and versatility of our framework in handling multi-task and multi-organ ultrasound image processing, outperforming both specialist models and existing general AI solutions. Compared to specialist models, our method improves segmentation from 82.26% to 86.45% and classification from 71.30% to 79.08%, while also significantly reducing model parameters.

Evolution of deep learning tooth segmentation from CT/CBCT images: a systematic review and meta-analysis.

Kot WY, Au Yeung SY, Leung YY, Leung PH, Yang WF

pubmed · papers · May 26 2025
Deep learning has been utilized to segment teeth from computed tomography (CT) or cone-beam CT (CBCT). However, the performance of deep learning remains unclear due to multiple models and diverse evaluation metrics. This systematic review and meta-analysis aims to evaluate the evolution and performance of deep learning in tooth segmentation. We systematically searched PubMed, Web of Science, Scopus, IEEE Xplore, arXiv.org, and ACM for studies investigating deep learning in human tooth segmentation from CT/CBCT. Included studies were assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Data were extracted for meta-analyses by random-effects models. A total of 30 studies were included in the systematic review, and 28 of them were included for meta-analyses. Various deep learning algorithms were categorized according to the backbone network, encompassing single-stage convolutional models, convolutional models with U-Net architecture, Transformer models, convolutional models with attention mechanisms, and combinations of multiple models. Convolutional models with U-Net architecture were the most commonly used deep learning algorithms. The integration of attention mechanisms within convolutional models has become a new topic. Twenty-nine evaluation metrics were identified, with Dice Similarity Coefficient (DSC) being the most popular. The pooled results were 0.93 [0.93, 0.93] for DSC, 0.86 [0.85, 0.87] for Intersection over Union (IoU), 0.22 [0.19, 0.24] for Average Symmetric Surface Distance (ASSD), 0.92 [0.90, 0.94] for sensitivity, 0.71 [0.26, 1.17] for 95% Hausdorff distance, and 0.96 [0.93, 0.98] for precision. No significant difference was observed in the segmentation of single-rooted or multi-rooted teeth. No obvious correlation between sample size and segmentation performance was observed.
Multiple deep learning algorithms have been successfully applied to tooth segmentation from CT/CBCT and their evolution has been well summarized and categorized according to their backbone structures. In future, studies are needed with standardized protocols and open labelled datasets.
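
The pooled estimates with intervals reported above come from random-effects models. The abstract does not name the estimator, so the sketch below assumes the commonly used DerSimonian-Laird method with a normal-approximation 95% interval. The function name and the per-study inputs are hypothetical, and at least two studies with unequal variances are assumed so that the denominator `c` is positive.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled estimate, 95% CI as a (low, high) tuple, tau^2).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)            # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # between-study variance
    w_star = 1.0 / (v + tau2)                     # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical per-study DSC values and variances (not from the review)
pooled, ci, tau2 = dersimonian_laird([0.90, 0.90, 0.90], [0.01, 0.02, 0.04])
print(pooled, ci, tau2)
```

When all studies report the same effect, the estimated between-study variance is zero and the pooled value equals that common effect; heterogeneous inputs widen the interval through `tau2`.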

Training a deep learning model to predict the anatomy irradiated in fluoroscopic x-ray images.

Guo L, Trujillo D, Duncan JR, Thomas MA

pubmed · papers · May 26 2025
Accurate patient dosimetry estimates from fluoroscopically-guided interventions (FGIs) are hindered by limited knowledge of the specific anatomy that was irradiated. Current methods use data reported by the equipment to estimate the patient anatomy exposed during each irradiation event. We propose a deep learning algorithm to automatically match 2D fluoroscopic images with corresponding anatomical regions in computational phantoms, enabling more precise patient dose estimates. Our method involves two main steps: (1) simulating 2D fluoroscopic images, and (2) developing a deep learning algorithm to predict anatomical coordinates from these images. For part (1), we utilized DeepDRR for fast and realistic simulation of 2D x-ray images from 3D computed tomography datasets. We generated a diverse set of simulated fluoroscopic images from various regions with different field sizes. In part (2), we employed a Residual Neural Network (ResNet) architecture combined with metadata processing to effectively integrate patient-specific information (age and gender) to learn the transformation between 2D images and specific anatomical coordinates in each representative phantom. For the Modified ResNet model, we defined an allowable error range of ± 10 mm. The proposed method achieved over 90% of predictions within ± 10 mm, with strong alignment between predicted and true coordinates as confirmed by Bland-Altman analysis. Most errors were within ± 2%, with outliers beyond ± 5% primarily in Z-coordinates for infant phantoms due to their limited representation in the training data. These findings highlight the model's accuracy and its potential for precise spatial localization, while emphasizing the need for improved performance in specific anatomical regions. In this work, a comprehensive simulated 2D fluoroscopy image dataset was developed, addressing the scarcity of real clinical datasets and enabling effective training of deep-learning models. 
The modified ResNet successfully achieved precise prediction of anatomical coordinates from the simulated fluoroscopic images, enabling the goal of more accurate patient-specific dosimetry.
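
Two of the checks mentioned in this abstract, the share of predictions within ± 10 mm and the Bland-Altman agreement analysis, can be sketched as follows. The helper names and coordinate values are assumptions for illustration, and the limits of agreement use the conventional bias ± 1.96 SD, which the abstract does not spell out.

```python
import numpy as np

def fraction_within(pred, true, tol_mm=10.0):
    """Fraction of predictions whose absolute error is at most tol_mm."""
    err = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.mean(np.abs(err) <= tol_mm))

def bland_altman(pred, true):
    """Return bias and lower/upper 95% limits of agreement."""
    diff = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))   # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical true and predicted coordinates along one axis, in mm
true_z = [100.0, 200.0, 300.0, 400.0]
pred_z = [100.0, 205.0, 295.0, 412.0]

share = fraction_within(pred_z, true_z)
bias, lower, upper = bland_altman(pred_z, true_z)
print(share, bias, lower, upper)
```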

Automated landmark-based mid-sagittal plane: reliability for 3-dimensional mandibular asymmetry assessment on head CT scans.

Alt S, Gajny L, Tilotta F, Schouman T, Dot G

pubmed · papers · May 26 2025
The determination of the mid-sagittal plane (MSP) on three-dimensional (3D) head imaging is key to the assessment of facial asymmetry. The aim of this study was to evaluate the reliability of an automated landmark-based MSP to quantify mandibular asymmetry on head computed tomography (CT) scans. A dataset of 368 CT scans, including orthognathic surgery patients, was automatically annotated with 3D cephalometric landmarks via a previously published deep learning-based method. Five of these landmarks were used to automatically construct an MSP orthogonal to the Frankfurt horizontal plane. The reliability of automatic MSP construction was compared with the reliability of manual MSP construction based on 6 manual localizations by 3 experienced operators on 19 randomly selected CT scans. The mandibular asymmetry of the 368 CT scans with respect to the MSP was calculated and compared with clinical expert judgment. The construction of the MSP was found to be highly reliable, both manually and automatically. The manual reproducibility 95% limit of agreement was less than 1 mm for -y translation and less than 1.1° for -x and -z rotation, and the automatic measurement lay within the confidence interval of the manual method. The automatic MSP construction was shown to be clinically relevant, with the mandibular asymmetry measures being consistent with the expertly assessed levels of asymmetry. The proposed automatic landmark-based MSP construction was found to be as reliable as manual construction and clinically relevant in assessing the mandibular asymmetry of 368 head CT scans. Once implemented in clinical software, fully automated landmark-based MSP construction could be clinically used to assess mandibular asymmetry on head CT scans.
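
One way to build a plane orthogonal to the Frankfurt horizontal plane from five landmarks, and to measure asymmetry against it, is sketched below. The abstract does not state which landmarks were used or how the plane was fitted, so everything here is an assumption: three landmarks define the Frankfurt plane, two midline landmarks fix the MSP, and asymmetry is taken as the distance between a mirrored left landmark and its right counterpart. All names and coordinates are hypothetical.

```python
import numpy as np

def unit(v):
    """Normalize a vector (assumes it is not the zero vector)."""
    return v / np.linalg.norm(v)

def msp_from_landmarks(po_r, po_l, orb, mid_a, mid_b):
    """Return (point on MSP, MSP unit normal, Frankfurt unit normal).

    Assumption: po_r, po_l, orb span the Frankfurt plane, and the line
    mid_a -> mid_b is not parallel to the Frankfurt normal.
    """
    n_fh = unit(np.cross(po_l - po_r, orb - po_r))
    # A normal perpendicular to n_fh makes the MSP orthogonal to the Frankfurt plane
    n_msp = unit(np.cross(n_fh, mid_b - mid_a))
    return mid_a, n_msp, n_fh

def mirror(point, origin, normal):
    """Reflect a point across the plane through origin with unit normal."""
    return point - 2.0 * np.dot(point - origin, normal) * normal

def asymmetry_mm(left, right, origin, normal):
    """Distance between the mirrored left landmark and the right landmark."""
    return float(np.linalg.norm(mirror(left, origin, normal) - right))

# Hypothetical coordinates in mm (x: left-right, y: front-back, z: up-down)
po_r, po_l = np.array([-60.0, 0.0, 0.0]), np.array([60.0, 0.0, 0.0])
orb = np.array([-30.0, 80.0, 0.0])
mid_a, mid_b = np.array([0.0, 90.0, 10.0]), np.array([0.0, 0.0, -5.0])

origin, n_msp, n_fh = msp_from_landmarks(po_r, po_l, orb, mid_a, mid_b)
asym = asymmetry_mm(np.array([45.0, 10.0, -80.0]),
                    np.array([-43.0, 10.0, -80.0]), origin, n_msp)
print(n_msp, asym)
```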

Advancements in Medical Image Classification through Fine-Tuning Natural Domain Foundation Models

Mobina Mansoori, Sajjad Shahabodini, Farnoush Bayatmakou, Jamshid Abouei, Konstantinos N. Plataniotis, Arash Mohammadi

arxiv · preprint · May 26 2025
Foundation models are large-scale models, pre-trained on massive datasets, that perform a wide range of tasks. These models have shown consistently improved results with the introduction of new methods. It is crucial to analyze how these trends impact the medical field and determine whether these advancements can drive meaningful change. This study investigates the application of recent state-of-the-art foundation models, DINOv2, MAE, VMamba, CoCa, SAM2, and AIMv2, for medical image classification. We explore their effectiveness on datasets including CBIS-DDSM for mammography, ISIC2019 for skin lesions, APTOS2019 for diabetic retinopathy, and CHEXPERT for chest radiographs. By fine-tuning these models and evaluating their configurations, we aim to understand the potential of these advancements in medical image classification. The results indicate that these advanced models significantly enhance classification outcomes, demonstrating robust performance despite limited labeled data. Based on our results, AIMv2, DINOv2, and SAM2 models outperformed others, demonstrating that progress in natural domain training has positively impacted the medical domain and improved classification outcomes. Our code is publicly available at: https://github.com/sajjad-sh33/Medical-Transfer-Learning.

Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging

Ho Hin Lee, Quan Liu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

arxiv · preprint · May 26 2025
In contrast to vision transformers, which model long-range dependencies through global self-attention, large kernel convolutions provide a more efficient and scalable alternative, particularly in high-resolution 3D volumetric settings. However, naively increasing kernel size often leads to optimization instability and degradation in performance. Motivated by the spatial bias observed in effective receptive fields (ERFs), we hypothesize that different kernel elements converge at variable rates during training. To support this, we derive a theoretical connection between element-wise gradients and first-order optimization, showing that structurally re-parameterized convolution blocks inherently induce spatially varying learning rates. Building on this insight, we introduce Rep3D, a 3D convolutional framework that incorporates a learnable spatial prior into large kernel training. A lightweight two-stage modulation network generates a receptive-biased scaling mask, adaptively re-weighting kernel updates and enabling local-to-global convergence behavior. Rep3D adopts a plain encoder design with large depthwise convolutions, avoiding the architectural complexity of multi-branch compositions. We evaluate Rep3D on five challenging 3D segmentation benchmarks and demonstrate consistent improvements over state-of-the-art baselines, including transformer-based and fixed-prior re-parameterization methods. By unifying spatial inductive bias with optimization-aware learning, Rep3D offers an interpretable and scalable solution for 3D medical image analysis. The source code is publicly available at https://github.com/leeh43/Rep3D.
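
The claim that structurally re-parameterized blocks induce spatially varying learning rates can be illustrated with a toy two-branch case. This sketch is not Rep3D's learned scaling mask. It assumes a 5×5×5 kernel in parallel with a 3×3×3 kernel, plain SGD, and a random array standing in for the loss gradient with respect to the merged kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
large = rng.normal(size=(5, 5, 5))   # large-kernel branch
small = rng.normal(size=(3, 3, 3))   # parallel small-kernel branch
center = (slice(1, 4),) * 3          # where the small kernel sits inside the large one

def merge(large_k, small_k):
    """Fold both branches into one equivalent kernel (convolution is linear)."""
    merged = large_k.copy()
    merged[center] += small_k
    return merged

lr = 0.1
grad_merged = rng.normal(size=(5, 5, 5))  # stand-in for dL/dW of the merged kernel

before = merge(large, small)
# By the chain rule each branch sees the merged-kernel gradient on its own support
large_new = large - lr * grad_merged
small_new = small - lr * grad_merged[center]
after = merge(large_new, small_new)

# Step size actually applied to each element of the merged kernel
effective_lr = (before - after) / grad_merged
print(effective_lr[2, 2, 2], effective_lr[0, 0, 0])
```

Elements covered by both branches move twice as far per step as the outer elements, so training the two-branch block behaves like training one kernel with a higher learning rate near its center.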

Advancing Limited-Angle CT Reconstruction Through Diffusion-Based Sinogram Completion

Jiaqi Guo, Santiago Lopez-Tapia, Aggelos K. Katsaggelos

arxiv · preprint · May 26 2025
Limited Angle Computed Tomography (LACT) often faces significant challenges due to missing angular information. Unlike previous methods that operate in the image domain, we propose a new method that focuses on sinogram inpainting. We leverage MR-SDEs, a variant of diffusion models that characterize the diffusion process with mean-reverting stochastic differential equations, to fill in missing angular data at the projection level. Furthermore, by combining distillation with constraining the output of the model using the pseudo-inverse of the inpainting matrix, the diffusion process is accelerated and completed in a single step, enabling efficient and accurate sinogram completion. A subsequent post-processing module back-projects the inpainted sinogram into the image domain and further refines the reconstruction, effectively suppressing artifacts while preserving critical structural details. Quantitative experimental results demonstrate that the proposed method achieves state-of-the-art performance in both perceptual and fidelity quality, offering a promising solution for LACT reconstruction in scientific and clinical applications.
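
Constraining a model output with the pseudo-inverse of the inpainting matrix can be sketched as a projection that restores the measured sinogram entries and leaves the inpainted ones untouched. The exact form used in the paper is not given in the abstract; the version below, x + A⁺(y − A x), is an assumption, and the tiny flattened "sinogram" and all names are hypothetical.

```python
import numpy as np

def enforce_measurements(x_pred, A, y):
    """Make x_pred consistent with measurements y = A @ x_true.

    Computes x_pred + pinv(A) @ (y - A @ x_pred); for a row-selection
    matrix A this overwrites measured entries and keeps the rest.
    """
    A_pinv = np.linalg.pinv(A)
    return x_pred + A_pinv @ (y - A @ x_pred)

rng = np.random.default_rng(0)
n = 6                                   # flattened sinogram length (toy size)
measured = [0, 1, 2, 3]                 # indices covered by the limited angular range
A = np.eye(n)[measured]                 # inpainting (selection) matrix, shape 4 x 6

x_true = rng.normal(size=n)
y = A @ x_true                          # available projections
x_pred = rng.normal(size=n)             # stand-in for a one-step model output

x_fixed = enforce_measurements(x_pred, A, y)
print(x_fixed)
```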

CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation

Jiong Wu, Yang Xing, Boxiao Yu, Wei Shao, Kuang Gong

arxiv · preprint · May 25 2025
Most publicly available medical segmentation datasets are only partially labeled, with annotations provided for a subset of anatomical structures. When multiple datasets are combined for training, this incomplete annotation poses challenges, as it limits the model's ability to learn shared anatomical representations among datasets. Furthermore, vision-only frameworks often fail to capture complex anatomical relationships and task-specific distinctions, leading to reduced segmentation accuracy and poor generalizability to unseen datasets. In this study, we proposed a novel CLIP-DINO Prompt-Driven Segmentation Network (CDPDNet), which combined a self-supervised vision transformer with CLIP-based text embedding and introduced task-specific text prompts to tackle these challenges. Specifically, the framework was constructed upon a convolutional neural network (CNN) and incorporated DINOv2 to extract both fine-grained and global visual features, which were then fused using a multi-head cross-attention module to overcome the limited long-range modeling capability of CNNs. In addition, CLIP-derived text embeddings were projected into the visual space to help model complex relationships among organs and tumors. To further address the partial label challenge and enhance inter-task discriminative capability, a Text-based Task Prompt Generation (TTPG) module that generated task-specific prompts was designed to guide the segmentation. Extensive experiments on multiple medical imaging datasets demonstrated that CDPDNet consistently outperformed existing state-of-the-art segmentation methods. Code and pretrained model are available at: https://github.com/wujiong-hub/CDPDNet.git.