
Can Large Language Models Challenge CNNs in Medical Image Analysis?

Shibbir Ahmed, Shahnewaz Karim Sakib, Anindya Bijoy Das

arXiv preprint · May 29, 2025
This study presents a multimodal AI framework designed for precisely classifying medical diagnostic images. Utilizing publicly available datasets, the proposed system compares the strengths of convolutional neural networks (CNNs) and different large language models (LLMs). This in-depth comparative analysis highlights key differences in diagnostic performance, execution efficiency, and environmental impact. Model evaluation was based on accuracy, F1-score, average execution time, average energy consumption, and estimated CO2 emissions. The findings indicate that although CNN-based models can outperform various multimodal techniques that incorporate both images and contextual information, applying additional filtering on top of LLMs can lead to substantial performance gains. These findings highlight the transformative potential of multimodal AI systems to enhance the reliability, efficiency, and scalability of medical diagnostics in clinical settings.
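The abstract's evaluation compares two model families on the same test set using accuracy, F1-score, and execution time (energy and CO2 are typically estimated from measured power draw over the same runs). The sketch below shows one plausible scoring harness, not the authors' code; `cnn_model` and `llm_pipeline` are hypothetical stand-ins for the two classifiers.

```python
# Illustrative sketch: scoring two classifiers on the same test set with the
# metrics named in the abstract. `cnn_model` and `llm_pipeline` are hypothetical.
import time
from sklearn.metrics import accuracy_score, f1_score

def evaluate(model_predict, images, labels):
    start = time.perf_counter()
    preds = [model_predict(img) for img in images]   # one predicted label per image
    elapsed = time.perf_counter() - start
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
        "avg_time_s": elapsed / len(images),          # proxy for execution efficiency
    }

# results = {name: evaluate(fn, test_images, test_labels)
#            for name, fn in [("cnn", cnn_model), ("llm", llm_pipeline)]}
```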

The use of imaging in the diagnosis and treatment of thromboembolic pulmonary hypertension.

Szewczuk K, Dzikowska-Diduch O, Gołębiowski M

PubMed · May 29, 2025
Chronic thromboembolic pulmonary hypertension (CTEPH) is a potentially life-threatening condition, classified as group 4 pulmonary hypertension (PH), caused by stenosis or occlusion of the pulmonary arteries due to unresolved thromboembolic material. The prognosis for untreated CTEPH patients is poor because the disease leads to elevated pulmonary artery pressure and right heart failure. Early and accurate diagnosis of CTEPH is crucial because it remains the only form of PH that is potentially curable. However, the diagnosis is often challenging and is frequently delayed or missed. This review discusses the current role of multimodal imaging in diagnosing CTEPH, guiding clinical decision-making, and monitoring post-treatment outcomes. The characteristic findings, strengths, and limitations of various imaging modalities, such as computed tomography, ventilation-perfusion lung scintigraphy, digital subtraction pulmonary angiography, and magnetic resonance imaging, are evaluated. Additionally, the role of artificial intelligence in improving the diagnosis and treatment outcomes of CTEPH is explored. Optimal patient assessment and therapeutic decision-making should ideally be conducted in specialized centers by a multidisciplinary team, utilizing data from imaging, pulmonary hemodynamics, and patient comorbidities.

RadCLIP: Enhancing Radiologic Image Analysis Through Contrastive Language-Image Pretraining.

Lu Z, Li H, Parikh NA, Dillman JR, He L

PubMed · May 28, 2025
The integration of artificial intelligence (AI) with radiology signifies a transformative era in medicine. Vision foundation models have been adopted to enhance radiologic imaging analysis. However, the inherent complexities of 2D and 3D radiologic data present unique challenges that existing models, which are typically pretrained on general nonmedical images, do not adequately address. To bridge this gap and harness the diagnostic precision required in radiologic imaging, we introduce radiologic contrastive language-image pretraining (RadCLIP): a cross-modal vision-language foundation model that utilizes a vision-language pretraining (VLP) framework to improve radiologic image analysis. Building on the contrastive language-image pretraining (CLIP) approach, RadCLIP incorporates a slice pooling mechanism designed for volumetric image analysis and is pretrained using a large, diverse dataset of radiologic image-text pairs. This pretraining effectively aligns radiologic images with their corresponding text annotations, resulting in a robust vision backbone for radiologic imaging. Extensive experiments demonstrate RadCLIP's superior performance in both unimodal radiologic image classification and cross-modal image-text matching, underscoring its significant promise for enhancing diagnostic accuracy and efficiency in clinical settings. Our key contributions include curating a large dataset featuring diverse radiologic 2D/3D image-text pairs, pretraining RadCLIP as a vision-language foundation model on this dataset, developing a slice pooling adapter with an attention mechanism for integrating 2D images, and conducting comprehensive evaluations of RadCLIP on various radiologic downstream tasks.
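The slice pooling adapter mentioned above aggregates per-slice 2D embeddings into a single volume-level embedding via attention. The sketch below is one plausible realization of that idea in PyTorch, not the published RadCLIP implementation.

```python
# Hedged sketch of attention-based slice pooling, assuming each 2D slice has already
# been embedded by the CLIP vision backbone.
import torch
import torch.nn as nn

class SlicePoolingAdapter(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.query = nn.Parameter(torch.zeros(1, 1, dim))          # learnable volume-level query
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, slice_embeddings: torch.Tensor) -> torch.Tensor:
        # slice_embeddings: (batch, n_slices, dim) from the 2D vision encoder
        q = self.query.expand(slice_embeddings.size(0), -1, -1)
        pooled, _ = self.attn(q, slice_embeddings, slice_embeddings)
        return pooled.squeeze(1)                                    # (batch, dim) volume embedding

# vol_emb = SlicePoolingAdapter(512)(torch.randn(2, 64, 512))      # e.g. 64 axial slices
```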

Integrating SEResNet101 and SE-VGG19 for advanced cervical lesion detection: a step forward in precision oncology.

Ye Y, Chen Y, Pan J, Li P, Ni F, He H

PubMed · May 28, 2025
Cervical cancer remains a significant global health issue, with accurate differentiation between low-grade (LSIL) and high-grade squamous intraepithelial lesions (HSIL) crucial for effective screening and management. Current methods, such as Pap smears and HPV testing, often fall short in sensitivity and specificity. Deep learning models hold the potential to enhance the accuracy of cervical cancer screening but require thorough evaluation to ascertain their practical utility. This study compares the performance of two advanced deep learning models, SEResNet101 and SE-VGG19, in classifying cervical lesions using a dataset of 3,305 high-quality colposcopy images. We assessed the models based on their accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The SEResNet101 model demonstrated superior performance over SE-VGG19 across all evaluated metrics. Specifically, SEResNet101 achieved a sensitivity of 95%, a specificity of 97%, and an AUC of 0.98, compared to 89% sensitivity, 93% specificity, and an AUC of 0.94 for SE-VGG19. These findings suggest that SEResNet101 could significantly reduce both over- and under-treatment rates by enhancing diagnostic precision. Our results indicate that SEResNet101 offers a promising enhancement over existing screening methods, integrating advanced deep learning algorithms to significantly improve the precision of cervical lesion classification. This study advocates for the inclusion of SEResNet101 in clinical workflows to enhance cervical cancer screening protocols, thereby improving patient outcomes. Future work should focus on multicentric trials to validate these findings and facilitate widespread clinical adoption.
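Both architectures in this study augment a standard backbone with squeeze-and-excitation (SE) channel attention. The block below is a generic SE unit as introduced in the SENet literature, shown for orientation; it is not the authors' exact code.

```python
# Minimal squeeze-and-excitation (SE) block, the channel-attention unit that
# SEResNet101 and SE-VGG19 build on.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                         # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        scale = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)   # excitation: per-channel weights
        return x * scale                                            # reweight feature maps

# y = SEBlock(256)(torch.randn(1, 256, 14, 14))
```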

Cascaded 3D Diffusion Models for Whole-body 3D 18-F FDG PET/CT synthesis from Demographics

Siyeop Yoon, Sifan Song, Pengfei Jin, Matthew Tivnan, Yujin Oh, Sekeun Kim, Dufan Wu, Xiang Li, Quanzheng Li

arXiv preprint · May 28, 2025
We propose a cascaded 3D diffusion model framework to synthesize high-fidelity 3D PET/CT volumes directly from demographic variables, addressing the growing need for realistic digital twins in oncologic imaging, virtual trials, and AI-driven data augmentation. Unlike deterministic phantoms, which rely on predefined anatomical and metabolic templates, our method employs a two-stage generative process. An initial score-based diffusion model synthesizes low-resolution PET/CT volumes from demographic variables alone, providing global anatomical structures and approximate metabolic activity. This is followed by a super-resolution residual diffusion model that refines spatial resolution. Our framework was trained on 18-F FDG PET/CT scans from the AutoPET dataset and evaluated using organ-wise volume and standardized uptake value (SUV) distributions, comparing synthetic and real data across demographic subgroups. The organ-wise comparison demonstrated strong concordance between synthetic and real images. In particular, most deviations in metabolic uptake values remained within 3-5% of the ground truth in subgroup analysis. These findings highlight the potential of cascaded 3D diffusion models to generate anatomically and metabolically accurate PET/CT images, offering a robust alternative to traditional phantoms and enabling scalable, population-informed synthetic imaging for clinical and research applications.
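The two-stage cascade described above (demographics to low-resolution volume, then residual super-resolution) can be wired roughly as in the schematic below. The samplers are hypothetical stand-ins for trained diffusion models, and the shapes and conditioning interface are assumptions for illustration only.

```python
# Schematic wiring of the cascade: a base diffusion sampler produces a low-resolution
# PET/CT volume from a demographic vector, and a second model refines an upsampled
# copy by predicting a residual. `base_sampler` and `sr_sampler` are hypothetical.
import torch
import torch.nn.functional as F

def sample_cascade(base_sampler, sr_sampler, demographics: torch.Tensor,
                   low_res=(32, 64, 64), high_res=(128, 256, 256)):
    # Stage 1: low-resolution volume conditioned on demographics (age, sex, BMI, ...)
    low = base_sampler(demographics, shape=(demographics.size(0), 2, *low_res))  # 2 channels: PET, CT
    # Upsample to the target grid
    up = F.interpolate(low, size=high_res, mode="trilinear", align_corners=False)
    # Stage 2: residual super-resolution refinement conditioned on the upsampled volume
    residual = sr_sampler(condition=up)
    return up + residual
```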

Estimation of time-to-total knee replacement surgery with multimodal modeling and artificial intelligence.

Cigdem O, Hedayati E, Rajamohan HR, Cho K, Chang G, Kijowski R, Deniz CM

PubMed · May 27, 2025
Current methods for predicting time-to-total knee replacement (TKR) do not provide enough information to make robust and accurate predictions. This study aimed to develop and evaluate an artificial intelligence-based model for predicting time-to-TKR by analyzing longitudinal knee data and identifying key features associated with accelerated knee osteoarthritis progression. A total of 547 subjects underwent TKR in the Osteoarthritis Initiative over nine years, and their longitudinal data were used for model training and testing. For external testing, 518 and 164 subjects from the Multi-Center Osteoarthritis Study and an internal hospital dataset, respectively, were used. Clinical variables, magnetic resonance (MR) images, radiographs, and quantitative and semi-quantitative image assessments were analyzed. Deep learning (DL) models were used to extract features from radiographs and MR images, and these DL features were combined with clinical and image-assessment features for survival analysis. A Lasso Cox feature selection method combined with a random survival forest model was used to estimate time-to-TKR. Using only clinical variables for time-to-TKR prediction yielded an estimation accuracy of 60.4% and a C-index of 62.9%. Combining DL features extracted from radiographs and MR images with clinical, quantitative, and semi-quantitative image assessment features achieved the highest accuracy of 73.2% (p = .001) and a C-index of 77.3%. The proposed predictive model demonstrates the potential of DL models and multimodal data fusion for accurately predicting time-to-TKR, which may help physicians personalize treatment strategies and improve patient outcomes.
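The two-step survival pipeline named above (a lasso-penalized Cox model for feature selection followed by a random survival forest) can be prototyped with scikit-survival. The sketch below runs on synthetic placeholder data; the penalty strength, forest settings, and feature construction are illustrative, not the study's tuned values.

```python
# Sketch of lasso-Cox feature selection followed by a random survival forest,
# on placeholder data (not the Osteoarthritis Initiative cohort).
import numpy as np
from sksurv.linear_model import CoxnetSurvivalAnalysis
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored
from sksurv.util import Surv

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 60))                                # placeholder multimodal feature matrix
linpred = X[:, 0] - 0.8 * X[:, 1] + 0.5 * X[:, 2]             # a few informative features
time = rng.exponential(scale=np.exp(-linpred))                # placeholder time-to-TKR
event = rng.random(500) < 0.7                                 # True = TKR observed, False = censored
y = Surv.from_arrays(event=event, time=time)

# Step 1: lasso-penalized Cox model for feature selection (l1_ratio=1.0 -> pure lasso)
cox = CoxnetSurvivalAnalysis(l1_ratio=1.0, alphas=[0.01]).fit(X, y)
selected = np.flatnonzero(cox.coef_[:, 0])                    # features with nonzero coefficients

# Step 2: random survival forest on the selected features
rsf = RandomSurvivalForest(n_estimators=200, random_state=0).fit(X[:, selected], y)
risk = rsf.predict(X[:, selected])
cindex = concordance_index_censored(event, time, risk)[0]
print(f"selected {selected.size} features, C-index = {cindex:.3f}")
```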

Rate and Patient Specific Risk Factors for Periprosthetic Acetabular Fractures during Primary Total Hip Arthroplasty using a Pressfit Cup.

Simon S, Gobi H, Mitterer JA, Frank BJ, Huber S, Aichmair A, Dominkus M, Hofstaetter JG

PubMed · May 26, 2025
Periprosthetic acetabular fractures following primary total hip arthroplasty (THA) with a cementless acetabular component range from occult to severe fractures. The aims of this study were to evaluate the perioperative periprosthetic acetabular fracture rate and patient-specific risk factors of a modular cementless acetabular component. We included 7,016 primary THAs (61.4% women, 38.6% men; age, 67 years; interquartile range, 58 to 74) that received a cementless, hydroxyapatite-coated, modular titanium press-fit acetabular component from a single manufacturer between January 2013 and September 2022. All perioperative radiographs and computed tomography (CT) scans were analyzed for all causes. Patient-specific data and the revision rate were retrieved, and radiographic measurements were performed using artificial intelligence-based software. Following matching on patient demographics, patients with and without periacetabular fractures were compared to identify patient-specific and radiographic risk factors. The fracture rate was 0.8% (56 of 7,016). Overall, 33.9% (19 of 56) were small occult fractures visible only on CT, and a further 37.5% (21 of 56) were stable small fractures; these two groups combined (40 of 56; 71.4%) were treated nonoperatively. Revision THA was necessary in 16 of 56 cases, resulting in an overall revision rate of 0.2% (16 of 7,016). Patient-specific risk factors were small acetabular component size (≤ 50), low body mass index (BMI) (< 24.5), higher age (> 68 years), female sex, a low lateral center-edge angle (< 24°), a high extrusion index (> 20%), a high Sharp angle (> 38°), and a high Tönnis angle (> 10°). A wide range of periprosthetic acetabular fractures were observed following primary cementless THA. In total, 71.4% of acetabular fractures were small cracks that did not necessitate revision surgery. By identifying patient-specific risk factors, such as advanced age, female sex, low BMI, and dysplastic hips, future complications may be reduced.

tUbe net: a generalisable deep learning tool for 3D vessel segmentation

Holroyd, N. A., Li, Z., Walsh, C., Brown, E. E., Shipley, R. J., Walker-Samuel, S.

bioRxiv preprint · May 26, 2025
Deep learning has become an invaluable tool for bioimage analysis but, while open-source cell annotation software such as Cellpose is widely used, an equivalent tool for three-dimensional (3D) vascular annotation does not exist. With the vascular system being directly impacted by a broad range of diseases, there is significant medical interest in quantitative analysis for vascular imaging. However, existing deep learning approaches for this task are specialised to particular tissue types or imaging modalities. We present a new deep learning model for segmentation of vasculature that is generalisable across tissues, modalities, scales and pathologies. To create a generalisable model, a 3D convolutional neural network was trained using data from multiple modalities including optical imaging, computed tomography and photoacoustic imaging. Through this varied training set, the model was forced to learn features of vessels that are common across modalities and scales. Following this, the general model was fine-tuned to different applications with a minimal amount of manually labelled ground truth data. It was found that the general model could be specialised to segment new datasets, with a high degree of accuracy, using as little as 0.3% of the volume of that dataset for fine-tuning. As such, this model enables users to produce accurate segmentations of 3D vascular networks without the need to label large amounts of training data.
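The fine-tuning step described above amounts to adapting the pretrained general network on a small labelled subvolume at a low learning rate. A generic PyTorch sketch of that loop is shown below; `model`, `small_loader`, and the checkpoint path are hypothetical stand-ins rather than the tUbe net code.

```python
# Generic fine-tuning loop for a pretrained 3D vessel segmentation network on a
# small labelled subset. All names here are placeholders.
import torch
import torch.nn as nn

def fine_tune(model: nn.Module, small_loader, epochs: int = 20, lr: float = 1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)        # low LR to preserve general vessel features
    loss_fn = nn.BCEWithLogitsLoss()                          # binary vessel / background mask
    model.train()
    for _ in range(epochs):
        for volume, mask in small_loader:                     # small labelled patches (~0.3% of the dataset)
            opt.zero_grad()
            loss = loss_fn(model(volume), mask)
            loss.backward()
            opt.step()
    return model

# model.load_state_dict(torch.load("general_vessel_model.pt"))  # hypothetical checkpoint
# model = fine_tune(model, small_loader)
```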

Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging

Ho Hin Lee, Quan Liu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

arXiv preprint · May 26, 2025
In contrast to vision transformers, which model long-range dependencies through global self-attention, large kernel convolutions provide a more efficient and scalable alternative, particularly in high-resolution 3D volumetric settings. However, naively increasing kernel size often leads to optimization instability and degradation in performance. Motivated by the spatial bias observed in effective receptive fields (ERFs), we hypothesize that different kernel elements converge at variable rates during training. To support this, we derive a theoretical connection between element-wise gradients and first-order optimization, showing that structurally re-parameterized convolution blocks inherently induce spatially varying learning rates. Building on this insight, we introduce Rep3D, a 3D convolutional framework that incorporates a learnable spatial prior into large kernel training. A lightweight two-stage modulation network generates a receptive-biased scaling mask, adaptively re-weighting kernel updates and enabling local-to-global convergence behavior. Rep3D adopts a plain encoder design with large depthwise convolutions, avoiding the architectural complexity of multi-branch compositions. We evaluate Rep3D on five challenging 3D segmentation benchmarks and demonstrate consistent improvements over state-of-the-art baselines, including transformer-based and fixed-prior re-parameterization methods. By unifying spatial inductive bias with optimization-aware learning, Rep3D offers an interpretable and scalable solution for 3D medical image analysis. The source code is publicly available at https://github.com/leeh43/Rep3D.
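One way to read the mechanism described above is a large depthwise 3D kernel whose effective weights are rescaled elementwise by a spatial mask produced by a small modulation network, so that different kernel positions effectively receive different update magnitudes. The sketch below illustrates that idea; it is a hedged reconstruction, not the implementation released at the repository linked above.

```python
# Illustration of a spatially re-weighted large depthwise 3D kernel: a small
# modulation network produces a scaling mask over the kernel's spatial extent.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReweightedLargeKernel3d(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 13):
        super().__init__()
        self.channels, self.k = channels, kernel_size
        self.weight = nn.Parameter(0.01 * torch.randn(channels, 1, *(kernel_size,) * 3))
        # lightweight two-stage modulation path over the kernel's spatial extent (assumption)
        self.modulate = nn.Sequential(
            nn.Conv3d(1, 4, 3, padding=1), nn.GELU(),
            nn.Conv3d(4, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = self.weight.mean(dim=0, keepdim=True)          # (1, 1, k, k, k) kernel summary
        mask = self.modulate(base)                             # receptive-biased scaling mask
        w = self.weight * mask                                 # spatially re-weighted kernel
        return F.conv3d(x, w, padding=self.k // 2, groups=self.channels)

# y = ReweightedLargeKernel3d(32)(torch.randn(1, 32, 48, 48, 48))
```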