
Dong L, Cai X, Ge H, Sun L, Pan X, Sun F, Meng Q

PubMed · Aug 25, 2025
To develop and evaluate a Dual-modality Complementary Feature Attention Network (DCFAN) that integrates spatial and stiffness information from B-mode ultrasound and shear wave elastography (SWE) for improved breast tumor classification and axillary lymph node (ALN) metastasis prediction. A total of 387 paired B-mode and SWE images from 218 patients were retrospectively analyzed. The proposed DCFAN incorporates attention mechanisms to effectively fuse structural features from B-mode ultrasound with stiffness features from SWE. Two classification tasks were performed: (1) differentiating benign from malignant tumors, and (2) classifying benign tumors, malignant tumors without ALN metastasis, and malignant tumors with ALN metastasis. Model performance was assessed using accuracy, sensitivity, specificity, and AUC, and compared with conventional CNN-based models and two radiologists with varying experience. In Task 1, DCFAN achieved an accuracy of 94.36% ± 1.45% and the highest AUC of 0.97. In Task 2, it attained 91.70% ± 3.77% accuracy and an average AUC of 0.83. The multimodal approach significantly outperformed the single-modality models in both tasks. Notably, in Task 1, DCFAN demonstrated higher specificity (94.9%) compared to the experienced radiologist (p = 0.002), and yielded higher F1-scores than both radiologists. It also outperformed several state-of-the-art deep learning models in diagnostic accuracy. DCFAN demonstrated robust and superior performance over existing CNN-based methods and radiologists in both breast tumor classification and ALN metastasis prediction. This approach may serve as a valuable assistive tool to enhance diagnostic accuracy in breast ultrasound.
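The abstract does not include implementation details; the following is a minimal PyTorch sketch of the general idea of attention-weighted fusion of a structural (B-mode) branch and a stiffness (SWE) branch. All layer sizes, channel counts, and the fusion scheme are illustrative assumptions, not the published DCFAN architecture.

```python
# Minimal sketch: attention-weighted fusion of B-mode and SWE features.
# Layer sizes and the fusion scheme are assumptions, not the published DCFAN.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, num_classes: int = 2, channels: int = 64):
        super().__init__()
        # One lightweight convolutional branch per modality.
        self.bmode_branch = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.swe_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),  # SWE assumed as a 3-channel color map
            nn.AdaptiveAvgPool2d(1))
        # Channel attention decides how much each modality contributes.
        self.attention = nn.Sequential(
            nn.Linear(2 * channels, 2), nn.Softmax(dim=1))
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, bmode: torch.Tensor, swe: torch.Tensor) -> torch.Tensor:
        fb = self.bmode_branch(bmode).flatten(1)            # (N, C) structural features
        fs = self.swe_branch(swe).flatten(1)                 # (N, C) stiffness features
        w = self.attention(torch.cat([fb, fs], dim=1))       # (N, 2) modality weights
        fused = w[:, :1] * fb + w[:, 1:] * fs                # attention-weighted fusion
        return self.classifier(fused)

# Task 2 setting: benign / malignant without ALN / malignant with ALN.
model = DualBranchFusion(num_classes=3)
logits = model(torch.randn(2, 1, 224, 224), torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 3])
```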

Guo L, Liu C, Soultanidis G

PubMed · Aug 25, 2025
Motion in clinical positron emission tomography (PET) examinations degrades image quality and quantification, requiring tailored correction strategies. Recent advancements integrate external devices and/or data-driven motion tracking with image registration and motion modeling, particularly deep learning-based methods, to address complex motion scenarios. The development of total-body PET systems with long axial field-of-view enables advanced motion correction by leveraging extended coverage and continuous acquisition. These innovations enhance the accuracy of motion estimation and correction across various clinical applications, improve quantitative reliability in static and dynamic imaging, and enable more precise assessments in oncology, neurology, and cardiovascular PET studies.

Wang X, Li P, Li Y, Zhang R, Duan F, Wang D

PubMed · Aug 25, 2025
To develop and validate predictive models based on ¹⁸F-fluorodeoxyglucose positron emission tomography/computed tomography (¹⁸F-FDG PET/CT) radiomics and a clinical model for differentiating invasive adenocarcinoma (IAC) from non-invasive ground-glass nodules (GGNs) in early-stage lung cancer. A total of 164 patients with GGNs histologically confirmed as part of the lung adenocarcinoma spectrum (including both invasive and non-invasive subtypes) underwent preoperative ¹⁸F-FDG PET/CT and surgery and were included in the analysis. Radiomic features were extracted from PET and CT images. Models were constructed using support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) classifiers. Five predictive models (CT, PET, PET/CT, Clinical, Combined) were evaluated using receiver operating characteristic (ROC) curves, decision curve analysis (DCA), and calibration curves. Statistical comparisons were performed using DeLong's test, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). The Combined model, integrating PET/CT radiomic features with the clinical model, achieved the highest diagnostic performance (AUC: 0.950 in training, 0.911 in testing). It consistently showed superior IDI and NRI across both cohorts and significantly outperformed the clinical model (DeLong p = 0.027), confirming its enhanced predictive power through multimodal integration. A clinical nomogram was constructed from the final model to support individualized risk stratification. Integrating PET/CT radiomic features with a clinical model significantly enhances the preoperative prediction of GGN invasiveness. This multimodal approach may assist in preoperative risk stratification and support personalized surgical decision-making in early-stage lung adenocarcinoma.
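As a rough illustration of the modeling step, the sketch below fits the three classifiers named in the abstract (SVM, RF, XGBoost) on a placeholder radiomic feature matrix and reports test AUC. The feature extraction, the actual cohort, and the clinical model are not reproduced; it assumes scikit-learn and xgboost are installed, and the data here are random stand-ins.

```python
# Hedged sketch of the classifier comparison on placeholder radiomic features.
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(164, 50))        # placeholder radiomic feature matrix
y = rng.integers(0, 2, size=164)      # 1 = invasive adenocarcinoma (IAC), 0 = non-invasive

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "SVM": SVC(probability=True),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "XGBoost": XGBClassifier(n_estimators=200, eval_metric="logloss"),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```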

Utsav Ratna Tuladhar, Richard Simon, Doran Mix, Michael Richards

arXiv preprint · Aug 25, 2025
Abdominal aortic aneurysms (AAA) pose a significant clinical risk due to their potential for rupture, which is often asymptomatic but can be fatal. Although maximum diameter is commonly used for risk assessment, diameter alone is insufficient because it does not capture the material properties of the vessel wall, which play a critical role in determining rupture risk. To overcome this limitation, we propose a deep learning-based framework for elasticity imaging of AAAs with 2D ultrasound. Leveraging finite element simulations, we generate a diverse dataset of displacement fields with their corresponding modulus distributions. We train a model with a U-Net architecture and a normalized mean squared error (NMSE) loss to infer the spatial modulus distribution from the axial and lateral components of the displacement fields. This model is evaluated across three experimental domains: digital phantom data from 3D COMSOL simulations, physical phantom experiments using biomechanically distinct vessel models, and clinical ultrasound exams from AAA patients. Our simulated results demonstrate that the proposed deep learning model is able to reconstruct modulus distributions, achieving an NMSE score of 0.73%. Similarly, in the physical phantom experiments, the predicted modulus ratio closely matches the expected values, affirming the model's ability to generalize beyond simulation. We compare our approach with an iterative method, which shows comparable performance but higher computation time. In contrast, the deep learning method provides quick and effective estimates of tissue stiffness from ultrasound images, which could help assess the risk of AAA rupture without invasive procedures.
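The NMSE objective mentioned in the abstract can be written compactly; the sketch below is a hedged PyTorch version, with a single convolution standing in for the U-Net and random tensors standing in for the displacement and modulus data.

```python
# Sketch of the normalized mean squared error (NMSE) objective:
# NMSE = ||pred - target||^2 / ||target||^2, averaged over the batch.
import torch

def nmse_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    err = (pred - target).flatten(1).pow(2).sum(dim=1)   # squared error per sample
    ref = target.flatten(1).pow(2).sum(dim=1)            # target energy per sample
    return (err / (ref + eps)).mean()

# Toy usage: 2-channel input (axial + lateral displacement), 1-channel modulus map.
disp = torch.randn(4, 2, 128, 128)
modulus_true = torch.randn(4, 1, 128, 128)
net = torch.nn.Conv2d(2, 1, kernel_size=3, padding=1)    # stand-in for the U-Net
loss = nmse_loss(net(disp), modulus_true)
loss.backward()
print(float(loss))
```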

Le Zhang, Fuping Wu, Arun Thirunavukarasu, Kevin Bronik, Thomas Nichols, Bartlomiej W. Papiez

arXiv preprint · Aug 25, 2025
Large annotated datasets are vital for training segmentation models, but pixel-level labeling is time-consuming, error-prone, and often requires scarce expert annotators, especially in medical imaging. In contrast, coarse annotations are quicker, cheaper, and easier to produce, even by non-experts. In this paper, we propose to use coarse drawings of both positive (target) and negative (background) classes in the image, even with noisy pixels, to train a convolutional neural network (CNN) for semantic segmentation. We present a method for learning the true segmentation label distributions from purely noisy coarse annotations using two coupled CNNs; the two networks are kept separate by enforcing high fidelity to the characteristics of the noisy training annotations. We further add a complementary label-learning term that encourages estimation of the negative label distribution. To illustrate the properties of our method, we first use a toy segmentation dataset based on MNIST. We then present quantitative results on publicly available datasets: the Cityscapes dataset for multi-class segmentation, and retinal images for medical applications. In all experiments, our method outperforms state-of-the-art methods, particularly when the proportion of coarse annotations is small relative to the available dense annotations.
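To make the complementary-label idea concrete, here is a small PyTorch sketch of one plausible form of the term: positive scribbles receive ordinary cross-entropy, while negative scribbles only assert "not the target class". The coupled-CNN training and label-distribution estimation from the paper are not reproduced, and this exact loss form is an assumption.

```python
# Hedged sketch of scribble supervision with a complementary (negative) term.
import torch
import torch.nn.functional as F

def scribble_loss(logits, pos_mask, neg_mask):
    """logits: (N, 2, H, W); pos_mask / neg_mask: (N, H, W) boolean scribbles."""
    probs = F.softmax(logits, dim=1)
    p_target = probs[:, 1]                                   # probability of the target class
    pos_term = -torch.log(p_target[pos_mask] + 1e-8)         # standard positive supervision
    neg_term = -torch.log(1.0 - p_target[neg_mask] + 1e-8)   # complementary: "not target here"
    return pos_term.mean() + neg_term.mean()

logits = torch.randn(1, 2, 64, 64, requires_grad=True)
pos = torch.zeros(1, 64, 64, dtype=torch.bool); pos[:, 20:30, 20:30] = True
neg = torch.zeros(1, 64, 64, dtype=torch.bool); neg[:, :5, :] = True
scribble_loss(logits, pos, neg).backward()
```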

Ravi Shankar Prasad, Dinesh Singh

arXiv preprint · Aug 25, 2025
Craniofacial reconstruction in forensics is one of the processes used to identify victims of crime and natural disasters. Identifying an individual from their remains plays a crucial role when all other identification methods fail. Traditional methods for this task, such as clay-based craniofacial reconstruction, require expert domain knowledge and are time-consuming. At the same time, other probabilistic generative models, such as the statistical shape model or the Basel face model, fail to capture cross-domain attributes of the skull and face. Given these limitations, we propose a generic framework for craniofacial reconstruction from 2D X-ray images. We use various generative models (e.g., CycleGANs and cGANs) and fine-tune the generator and discriminator to produce more realistic images in two distinct domains: the skull and the face of an individual. This is the first time 2D X-rays have been used as a representation of the skull by generative models for craniofacial reconstruction. We evaluate the quality of the generated faces using FID, IS, and SSIM scores. Finally, we propose a retrieval framework in which the query is the generated face image and the gallery is a database of real faces. Experimental results indicate that this can be an effective tool for forensic science.
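The retrieval step lends itself to a short sketch: rank the real-face gallery by cosine similarity to the embedding of the generated face. The embedding network (e.g., a pretrained face encoder) is assumed and not shown, and the vectors below are random placeholders.

```python
# Hedged sketch of the retrieval step: query = generated face, gallery = real faces.
import numpy as np

def rank_gallery(query_emb: np.ndarray, gallery_embs: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least similar (cosine similarity)."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                           # similarity of every gallery face to the query
    return np.argsort(-sims)

rng = np.random.default_rng(0)
query = rng.normal(size=512)               # embedding of the GAN-generated face (placeholder)
gallery = rng.normal(size=(100, 512))      # embeddings of 100 real faces (placeholder)
print(rank_gallery(query, gallery)[:5])    # top-5 candidate identities
```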

Fatemeh Ziaeetabar

arXiv preprint · Aug 25, 2025
Vision foundation models (FMs) have become the predominant architecture in computer vision, providing highly transferable representations learned from large-scale, multimodal corpora. Nonetheless, they exhibit persistent limitations on tasks that require explicit reasoning over entities, roles, and spatio-temporal relations. Such relational competence is indispensable for fine-grained human activity recognition, egocentric video understanding, and multimodal medical image analysis, where spatial, temporal, and semantic dependencies are decisive for performance. We advance the position that next-generation FMs should incorporate explicit relational interfaces, instantiated as dynamic relational graphs (graphs whose topology and edge semantics are inferred from the input and task context). We illustrate this position with cross-domain evidence from recent systems in human manipulation action recognition and brain tumor segmentation, showing that augmenting FMs with lightweight, context-adaptive graph-reasoning modules improves fine-grained semantic fidelity, out-of-distribution robustness, interpretability, and computational efficiency relative to FM-only baselines. Importantly, by reasoning sparsely over semantic nodes, such hybrids also achieve favorable memory and hardware efficiency, enabling deployment under practical resource constraints. We conclude with a targeted research agenda for FM-graph hybrids, prioritizing learned dynamic graph construction, multi-level relational reasoning (e.g., part-object-scene in activity understanding, or region-organ in medical imaging), cross-modal fusion, and evaluation protocols that directly probe relational competence in structured vision tasks.
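As a toy illustration of the proposed relational interface, the sketch below builds a dynamic relational graph whose edge weights are inferred from the node features themselves and performs one round of message passing. Dimensions and the scoring function are illustrative assumptions, not a specification from the cited systems.

```python
# Hedged sketch of a "dynamic relational graph" layer: edges are inferred from
# the input nodes, then used for one round of relational message passing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGraphLayer(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (N, num_nodes, dim) semantic entities pooled from an FM backbone.
        scores = self.q(nodes) @ self.k(nodes).transpose(1, 2) / nodes.shape[-1] ** 0.5
        adj = F.softmax(scores, dim=-1)        # input-dependent edge weights (the dynamic graph)
        return nodes + adj @ self.v(nodes)     # message passing over the inferred graph

layer = DynamicGraphLayer(dim=64)
out = layer(torch.randn(2, 16, 64))            # 16 semantic nodes per sample
print(out.shape)
```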

Xingyu Ai, Shaoyu Wang, Zhiyuan Jia, Ao Xu, Hongming Shan, Jianhua Ma, Qiegen Liu

arXiv preprint · Aug 25, 2025
During raw-data acquisition in CT imaging, diverse factors can degrade the collected sinograms, with undersampling and noise leading to severe artifacts and noise in reconstructed images and compromising diagnostic accuracy. Conventional correction methods rely on manually designed algorithms or fixed empirical parameters, but these approaches often lack generalizability across heterogeneous artifact types. To address these limitations, we propose UniSino, a foundation model for universal CT sinogram standardization. Unlike existing foundation models that operate in the image domain, UniSino standardizes data directly in the projection domain, which enables stronger generalization across diverse undersampling scenarios. Its training framework incorporates the physical characteristics of sinograms, enhancing generalization and enabling robust performance across multiple subtasks spanning four benchmark datasets. Experimental results demonstrate that UniSino achieves superior reconstruction quality in both single and mixed undersampling cases, exhibiting exceptional robustness and generalization in sinogram enhancement for CT imaging. The code is available at: https://github.com/yqx7150/UniSino.
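For readers unfamiliar with the projection-domain setting, the sketch below simulates the kind of sparse-view degradation UniSino targets, using scikit-image's Radon transform: a sinogram acquired at one sixth of the angles produces streak artifacts when reconstructed. This is only a didactic illustration of the problem, not the authors' pipeline.

```python
# Hedged illustration of sparse-view sinogram degradation (not the UniSino method).
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

image = rescale(shepp_logan_phantom(), 0.5)          # ground-truth test image
full_angles = np.linspace(0.0, 180.0, 180, endpoint=False)
sparse_angles = full_angles[::6]                      # 6x angular undersampling

sino_sparse = radon(image, theta=sparse_angles)       # degraded projection data (sinogram)
recon = iradon(sino_sparse, theta=sparse_angles, filter_name="ramp")
print("reconstruction RMSE:", np.sqrt(np.mean((recon - image) ** 2)))
```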

Nur Amirah Abd Hamid, Mohd Shahrizal Rusli, Muhammad Thaqif Iman Mohd Taufek, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

arXiv preprint · Aug 25, 2025
Accurate prediction of clinical scores is critical for early detection and prognosis of Alzheimer's disease (AD). While existing approaches primarily focus on forecasting the ADAS-Cog global score, they often overlook the predictive value of its sub-scores (13 items), which capture domain-specific cognitive decline. In this study, we propose a multi-task learning (MTL) framework that jointly predicts the global ADAS-Cog score and its 13 sub-scores at Month 24 using baseline MRI and longitudinal clinical scores from baseline and Month 6. The main goal is to examine how each sub-score, particularly those associated with MRI features, contributes to the prediction of the global score, an aspect largely neglected in prior MTL studies. We employ Vision Transformer (ViT) and Swin Transformer architectures to extract imaging features, which are fused with longitudinal clinical inputs to model cognitive progression. Our results show that incorporating sub-score learning improves global score prediction. Sub-score-level analysis reveals that a small subset, especially Q1 (Word Recall), Q4 (Delayed Recall), and Q8 (Word Recognition), consistently dominates the predicted global score. However, some of these influential sub-scores exhibit high prediction errors, pointing to model instability. Further analysis suggests that this is caused by clinical feature dominance, where the model prioritizes easily predictable clinical scores over more complex MRI-derived features. These findings emphasize the need for improved multimodal fusion and adaptive loss weighting to achieve more balanced learning. Our study demonstrates the value of sub-score-informed modeling and provides insights into building more interpretable and clinically robust AD prediction frameworks. (GitHub repository provided.)
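A minimal sketch of the joint objective implied by the abstract is given below: one regression head for the global ADAS-Cog score and one for the 13 sub-scores, trained with a weighted sum of MSE terms. The head sizes, fused features, and loss weighting are assumptions, not the authors' configuration.

```python
# Hedged sketch of the multi-task regression heads and joint loss.
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    def __init__(self, feat_dim: int = 256, num_subscores: int = 13):
        super().__init__()
        self.global_head = nn.Linear(feat_dim, 1)             # ADAS-Cog total at Month 24
        self.sub_head = nn.Linear(feat_dim, num_subscores)    # 13 item-level sub-scores

    def forward(self, features):
        return self.global_head(features).squeeze(-1), self.sub_head(features)

def mtl_loss(pred_global, pred_sub, true_global, true_sub, alpha: float = 0.5):
    mse = nn.functional.mse_loss
    return mse(pred_global, true_global) + alpha * mse(pred_sub, true_sub)

feats = torch.randn(8, 256)                 # placeholder for fused ViT/Swin + clinical features
head = MultiTaskHead()
g, s = head(feats)
loss = mtl_loss(g, s, torch.randn(8), torch.randn(8, 13))
loss.backward()
print(float(loss))
```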

Maham Nazir, Muhammad Aqeel, Francesco Setti

arXiv preprint · Aug 25, 2025
Medical image segmentation models struggle with rare abnormalities due to scarce annotated pathological data. We propose DiffAug, a novel framework that combines text-guided diffusion-based generation with automatic segmentation validation to address this challenge. Our approach uses latent diffusion models conditioned on medical text descriptions and spatial masks to synthesize abnormalities via inpainting on normal images. Generated samples undergo dynamic quality validation through a latent-space segmentation network that ensures accurate localization while enabling single-step inference. The text prompts, derived from medical literature, guide the generation of diverse abnormality types without requiring manual annotation. Our validation mechanism filters synthetic samples based on spatial accuracy, maintaining quality while operating efficiently through direct latent estimation. Evaluated on three medical imaging benchmarks (CVC-ClinicDB, Kvasir-SEG, REFUGE2), our framework achieves state-of-the-art performance with 8-10% Dice improvements over baselines and reduces false negative rates by up to 28% for challenging cases such as small polyps and flat lesions, which are critical for early detection in screening applications.
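The validation mechanism can be sketched as a simple filter: keep a synthetic sample only if a segmentation predicted for it overlaps the mask used to condition the generator. The sketch below uses a Dice threshold and a stand-in validator; the diffusion inpainting and the latent-space network themselves are not reproduced.

```python
# Hedged sketch of filtering synthetic samples by spatial accuracy (Dice overlap).
import numpy as np

def dice(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    return float(2 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps))

def filter_synthetic(samples, segment, dice_threshold: float = 0.7):
    """samples: list of (image, conditioning_mask); keep only spatially accurate ones."""
    kept = []
    for image, cond_mask in samples:
        pred_mask = segment(image)                       # validator's predicted lesion mask
        if dice(pred_mask, cond_mask) >= dice_threshold:
            kept.append((image, cond_mask))
    return kept

# Toy usage with a trivial "validator" that thresholds intensity.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), bool); mask[20:40, 20:40] = True
image = mask.astype(float) + 0.1 * rng.normal(size=(64, 64))
print(len(filter_synthetic([(image, mask)], segment=lambda im: im > 0.5)))
```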