
Shao Y, Cen HS, Dhananjay A, Pawan SJ, Lei X, Gill IS, D'souza A, Duddalwar VA

PubMed · Jun 12, 2025
This study aimed to evaluate radiomic models' ability to predict hypoxia-related biomarker expression in clear cell renal cell carcinoma (ccRCC). Clinical and molecular data from 190 patients were extracted from The Cancer Genome Atlas-Kidney Renal Clear Cell Carcinoma dataset, and corresponding CT imaging data were manually segmented from The Cancer Imaging Archive. A panel of 2,824 radiomic features was analyzed, and robust features with high inter-scanner reproducibility were selected. Gene expression data for 13 hypoxia-related biomarkers were stratified by tumor grade (1/2 vs. 3/4) and stage (I/II vs. III/IV) and analyzed using the Wilcoxon rank sum test. Machine learning modeling was conducted using the High-Performance Random Forest (RF) procedure in SAS Enterprise Miner 15.1, with significance set at P < 0.05. Descriptive univariate analysis revealed significantly lower expression of several biomarkers in high-grade and late-stage tumors, with KLF6 showing the most notable decrease. The RF model effectively predicted the expression of KLF6, ETS1, and BCL2, as well as underexpression of PLOD2 and PPARGC1A. Stratified performance assessment showed improved predictive ability for RORA, BCL2, and KLF6 in high-grade tumors and for ETS1 across grades, with no significant performance difference across grade or stage. The RF model demonstrated modest but significant associations between texture metrics derived from clinical CT scans, such as GLDM and GLCM features, and key hypoxia-related biomarkers including KLF6, BCL2, ETS1, and PLOD2. These findings suggest that radiomic analysis could support ccRCC risk stratification and personalized treatment planning by providing non-invasive insights into tumor biology.
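A minimal sketch of the modeling step described above, using scikit-learn's random forest as a stand-in for the SAS High-Performance Random Forest procedure; the file names, feature table layout, and median-based dichotomization of KLF6 expression are illustrative assumptions, not details from the study.

```python
# Hypothetical radiomics-to-biomarker pipeline; scikit-learn replaces SAS HPFOREST.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# radiomics: rows = patients, columns = reproducible radiomic features
# expression: per-patient KLF6 expression, dichotomized at the cohort median
radiomics = pd.read_csv("radiomic_features.csv", index_col="patient_id")
expression = pd.read_csv("klf6_expression.csv", index_col="patient_id")["KLF6"]
label = (expression > expression.median()).astype(int)

model = RandomForestClassifier(n_estimators=500, random_state=0)
auc = cross_val_score(model, radiomics.loc[label.index], label,
                      cv=5, scoring="roc_auc")
print(f"KLF6 over/under-expression AUC: {auc.mean():.2f} +/- {auc.std():.2f}")
```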

Fang Y, Wang M, Song Q, Cao C, Gao Z, Song B, Min X, Li A

PubMed · Jun 12, 2025
Accurate and non-invasive prediction of epidermal growth factor receptor (EGFR) mutation is crucial for the diagnosis and treatment of non-small cell lung cancer (NSCLC). While computed tomography (CT) imaging shows promise in identifying EGFR mutation, current prediction methods heavily rely on fully supervised learning, which overlooks the substantial heterogeneity of tumors and therefore leads to suboptimal results. To tackle the tumor heterogeneity issue, this study introduces a novel weakly supervised method named TransMIEL, which leverages multiple instance learning techniques for accurate EGFR mutation prediction. Specifically, we first propose an innovative instance enhancement learning (IEL) strategy that strengthens the discriminative power of instance features for complex tumor CT images by exploring self-derived soft pseudo-labels. Next, to improve tumor representation capability, we design a spatial-aware transformer (SAT) that fully captures inter-instance relationships of different pathological subregions to mirror the diagnostic processes of radiologists. Finally, an instance adaptive gating (IAG) module is developed to effectively emphasize the contribution of informative instance features in heterogeneous tumors, facilitating dynamic instance feature aggregation and improving model generalization. Experimental results demonstrate that TransMIEL significantly outperforms existing fully and weakly supervised methods on both public and in-house NSCLC datasets. Additionally, visualization results show that our approach can highlight intra-tumor and peri-tumor areas relevant to EGFR mutation status. Therefore, our method holds significant potential as an effective tool for EGFR prediction and offers a novel perspective for future research on tumor heterogeneity.
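For illustration, a minimal PyTorch sketch of gated attention pooling over instance features, in the spirit of the instance adaptive gating (IAG) idea; the exact TransMIEL module is not described here, so the dimensions and gating form are assumptions.

```python
# Gated attention pooling over a bag of instance features (illustrative only).
import torch
import torch.nn as nn

class GatedInstancePooling(nn.Module):
    def __init__(self, dim=256, hidden=128):
        super().__init__()
        self.attn = nn.Linear(dim, hidden)    # content branch
        self.gate = nn.Linear(dim, hidden)    # gating branch
        self.score = nn.Linear(hidden, 1)

    def forward(self, instances):             # (n_instances, dim)
        a = torch.tanh(self.attn(instances))
        g = torch.sigmoid(self.gate(instances))
        w = torch.softmax(self.score(a * g), dim=0)   # per-instance weights
        return (w * instances).sum(dim=0)      # bag-level feature, shape (dim,)

bag = torch.randn(32, 256)                     # e.g., 32 tumor sub-regions
print(GatedInstancePooling()(bag).shape)       # torch.Size([256])
```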

Wang J, Mateen M, Xiang D, Zhu W, Shi F, Huang J, Sun K, Dai J, Xu J, Zhang S, Chen X

PubMed · Jun 12, 2025
Deep learning (DL) requires large amounts of labeled data, which is extremely time-consuming and labor-intensive to obtain for medical image segmentation tasks. Meta-learning focuses on developing learning strategies that enable quick adaptation to new tasks with limited labeled data. However, rich-class medical image segmentation datasets for constructing meta-learning multi-tasks are currently unavailable. In addition, data collected from various healthcare sites and devices may present significant distribution differences, potentially degrading a model's performance. In this paper, we propose a task augmentation-based meta-learning method for retinal image segmentation (TAMS) to alleviate the labor-intensive annotation demand. A retinal Lesion Simulation Algorithm (LSA) is proposed to automatically generate multi-class retinal disease datasets with pixel-level segmentation labels, so that meta-learning tasks can be augmented without collecting data from various sources. In addition, a novel simulation function library is designed to control the generation process and ensure interpretability. Moreover, a generative simulation network (GSNet) with an improved adversarial training strategy is introduced to maintain high-quality representations of complex retinal diseases. TAMS is evaluated on three different OCT and CFP image datasets, and comprehensive experiments demonstrate that TAMS achieves segmentation performance superior to state-of-the-art models.
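A conceptual sketch of task augmentation for episodic meta-learning, where each task is built from simulated lesions rather than collected data; the simulate_lesion function below is a toy placeholder for the paper's LSA/GSNet pipeline, not a reproduction of it.

```python
# Episodic task sampling from simulated lesions (toy stand-in for LSA/GSNet).
import numpy as np

def simulate_lesion(rng, size=64):
    """Toy stand-in: an elliptical 'lesion' mask stamped on a noisy retina-like image."""
    yy, xx = np.mgrid[:size, :size]
    cy, cx = rng.integers(16, 48, size=2)
    ry, rx = rng.integers(4, 12, size=2)
    mask = (((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2) <= 1.0
    image = rng.normal(0.5, 0.1, (size, size)) + 0.4 * mask
    return image.astype(np.float32), mask.astype(np.uint8)

def sample_episode(rng, n_support=5, n_query=5):
    """One meta-learning task: support/query pairs drawn from a simulated lesion class."""
    pairs = [simulate_lesion(rng) for _ in range(n_support + n_query)]
    return pairs[:n_support], pairs[n_support:]

support, query = sample_episode(np.random.default_rng(0))
print(len(support), len(query), support[0][0].shape)   # 5 5 (64, 64)
```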

Nicholas Summerfield, Qisheng He, Alex Kuo, Ahmed I. Ghanem, Simeng Zhu, Chase Ruff, Joshua Pan, Anudeep Kumar, Prashant Nagpal, Jiwei Zhao, Ming Dong, Carri K. Glide-Hurst

arXiv preprint · Jun 12, 2025
Cardiac substructures are essential in thoracic radiation therapy planning to minimize the risk of radiation-induced heart disease. Deep learning (DL) offers efficient methods to reduce contouring burden but lacks generalizability across different modalities and overlapping structures. This work introduces and validates a Modality-AGnostic Image Cascade (MAGIC) for comprehensive and multi-modal cardiac substructure segmentation. MAGIC is implemented through replicated encoding and decoding branches of an nnU-Net-based, U-shaped backbone, preserving the function of a single model. Twenty cardiac substructures (heart, chambers, great vessels (GVs), valves, coronary arteries (CAs), and conduction nodes) from simulation CT (Sim-CT), low-field MR-Linac, and cardiac CT angiography (CCTA) modalities were manually delineated and used to train (n=76), validate (n=15), and test (n=30) MAGIC. Twelve comparison models (four segmentation subgroups across three modalities) were equivalently trained. All methods were compared for training efficiency and against reference contours using the Dice Similarity Coefficient (DSC) and a two-tailed Wilcoxon Signed-Rank test (threshold, p<0.05). Average DSC scores were 0.75(0.16) for Sim-CT, 0.68(0.21) for MR-Linac, and 0.80(0.16) for CCTA. MAGIC outperforms the comparison models in 57% of cases, with limited statistical differences. MAGIC offers an effective and accurate segmentation solution that is lightweight and capable of segmenting multiple modalities and overlapping structures in a single model. MAGIC further enables clinical implementation by simplifying the computational requirements and offering unparalleled flexibility for clinical settings.
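A brief sketch of the evaluation protocol described above: per-case Dice Similarity Coefficient followed by a two-tailed Wilcoxon signed-rank test between two models. The arrays below are random stand-ins for real contours, and the perturbation rates are arbitrary.

```python
# Per-case DSC plus a paired Wilcoxon signed-rank comparison (toy data).
import numpy as np
from scipy.stats import wilcoxon

def dice(pred, ref, eps=1e-8):
    pred, ref = pred.astype(bool), ref.astype(bool)
    return (2.0 * np.logical_and(pred, ref).sum() + eps) / (pred.sum() + ref.sum() + eps)

rng = np.random.default_rng(0)
ref = rng.random((30, 64, 64, 64)) > 0.5           # 30 test cases, reference masks
magic = ref ^ (rng.random(ref.shape) > 0.95)        # lightly perturbed predictions
baseline = ref ^ (rng.random(ref.shape) > 0.90)     # more heavily perturbed predictions

dsc_magic = [dice(p, r) for p, r in zip(magic, ref)]
dsc_base = [dice(p, r) for p, r in zip(baseline, ref)]
stat, p = wilcoxon(dsc_magic, dsc_base, alternative="two-sided")
print(f"mean DSC {np.mean(dsc_magic):.3f} vs {np.mean(dsc_base):.3f}, p = {p:.4f}")
```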

Zhu L, Yu NY, Ahmed SK, Ashman JB, Toesca DS, Grams MP, Deufel CL, Duan J, Chen Q, Rong Y

PubMed · Jun 12, 2025
Lattice radiation therapy (LRT) is a form of spatially fractionated radiation therapy that allows increased total dose delivery, aiming for improved treatment response without an increase in toxicities, and is commonly utilized for palliation of bulky tumors. The LRT treatment planning process is complex, while eligible patients often have an urgent need for expedited treatment start. In this study, we aimed to develop a simulation-free workflow for volumetric modulated arc therapy (VMAT)-based LRT planning via deep learning-predicted synthetic CT (sCT) to expedite treatment initiation. Two deep learning models were initially trained using a 3D U-Net architecture to generate sCT from diagnostic CTs (dCT) of the thoracic and abdominal regions using a training dataset of 50 patients. The models were then tested on an independent dataset of 15 patients using image similarity analysis with mean absolute error (MAE) and structural similarity index measure (SSIM) as metrics. VMAT-based LRT plans were generated based on sCT and recalculated on the planning CT (pCT) for dosimetric accuracy comparison. Differences in dose volume histogram (DVH) metrics between pCT and sCT plans were assessed using the Wilcoxon signed-rank test. The final sCT prediction model demonstrated high image similarity to pCT, with a MAE and SSIM of 38.93 ± 14.79 Hounsfield Units (HU) and 0.92 ± 0.05 for the thoracic region, and 73.60 ± 22.90 HU and 0.90 ± 0.03 for the abdominal region, respectively. There were no statistically significant differences between sCT and pCT plans in terms of organ-at-risk and target volume DVH parameters, including maximum dose (Dmax), mean dose (Dmean), and dose delivered to 90% (D90%) and 50% (D50%) of the target volume, except for minimum dose (Dmin) and D10%. With demonstrated high image similarity and adequate dose agreement between sCT and pCT, our study serves as a proof of concept for using deep learning-predicted sCT in a simulation-free treatment planning workflow for VMAT-based LRT.
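A short sketch of the image-similarity metrics reported above (MAE in HU and SSIM), computed on co-registered synthetic and planning CT volumes; the HU range used to set the SSIM data range and the toy volumes are assumptions.

```python
# MAE (HU) and SSIM between a synthetic CT and a planning CT (toy volumes).
import numpy as np
from skimage.metrics import structural_similarity as ssim

def sct_similarity(sct_hu, pct_hu, hu_range=(-1024.0, 3071.0)):
    mae = np.abs(sct_hu - pct_hu).mean()
    ssim_val = ssim(sct_hu, pct_hu, data_range=hu_range[1] - hu_range[0])
    return mae, ssim_val

rng = np.random.default_rng(0)
pct = rng.uniform(-1000, 1000, (64, 128, 128))   # stand-in planning CT volume
sct = pct + rng.normal(0, 40, pct.shape)         # stand-in synthetic CT volume
mae, s = sct_similarity(sct, pct)
print(f"MAE = {mae:.1f} HU, SSIM = {s:.3f}")
```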

Zhenhuan Zhou

arXiv preprint · Jun 12, 2025
Medical image segmentation is a fundamental and key technology in computer-aided diagnosis and treatment. Previous methods can be broadly classified into three categories: convolutional neural network (CNN)-based, Transformer-based, and hybrid architectures that combine both. However, each of them has its own limitations, such as restricted receptive fields in CNNs or the computational overhead caused by the quadratic complexity of Transformers. Recently, the Receptance Weighted Key Value (RWKV) model has emerged as a promising alternative for various vision tasks, offering strong long-range modeling capabilities with linear computational complexity. Some studies have also adapted RWKV to medical image segmentation tasks, achieving competitive performance. However, most of these studies focus on modifications to the Vision-RWKV (VRWKV) mechanism and train models from scratch, without exploring the potential advantages of leveraging pre-trained VRWKV models for medical image segmentation tasks. In this paper, we propose Med-URWKV, a pure RWKV-based architecture built upon the U-Net framework, which incorporates ImageNet-based pretraining to further explore the potential of RWKV in medical image segmentation tasks. To the best of our knowledge, Med-URWKV is the first pure RWKV segmentation model in the medical field that can directly reuse a large-scale pre-trained VRWKV encoder. Experimental results on seven datasets demonstrate that Med-URWKV achieves comparable or even superior segmentation performance compared to other carefully optimized RWKV models trained from scratch. This validates the effectiveness of using a pretrained VRWKV encoder to enhance model performance. The code will be released.
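A generic PyTorch sketch of the underlying idea, reusing a pretrained multi-scale encoder inside a U-shaped segmentation model; the placeholder CNN encoder below stands in for the ImageNet-pretrained VRWKV encoder, whose interface is not reproduced here, and all channel sizes are assumptions.

```python
# U-shaped segmenter wrapping a (notionally pretrained) multi-scale encoder.
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Placeholder multi-scale encoder; the real model would plug in pretrained VRWKV."""
    def __init__(self, chs=(64, 128, 256)):
        super().__init__()
        self.stages, in_c = nn.ModuleList(), 1
        for c in chs:
            self.stages.append(nn.Sequential(nn.Conv2d(in_c, c, 3, 2, 1), nn.ReLU()))
            in_c = c

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats                                 # features at 1/2, 1/4, 1/8 scale

class UShapedSegmenter(nn.Module):
    def __init__(self, encoder, chs=(64, 128, 256), n_classes=2):
        super().__init__()
        self.encoder = encoder                       # pretrained, optionally frozen
        self.ups = nn.ModuleList(
            nn.ConvTranspose2d(c_in, c_out, 2, stride=2)
            for c_in, c_out in zip(chs[:0:-1], chs[-2::-1])
        )
        self.head = nn.Conv2d(chs[0], n_classes, 1)

    def forward(self, x):
        feats = self.encoder(x)
        y = feats[-1]
        for up, skip in zip(self.ups, feats[-2::-1]):
            y = up(y) + skip                         # decoder with skip connections
        return self.head(y)                          # logits at 1/2 input resolution

model = UShapedSegmenter(ToyEncoder())
print(model(torch.randn(1, 1, 128, 128)).shape)      # torch.Size([1, 2, 64, 64])
```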

Yu Huang, Zelin Peng, Yichen Zhao, Piao Yang, Xiaokang Yang, Wei Shen

arXiv preprint · Jun 12, 2025
Medical image segmentation is crucial for clinical diagnosis, yet existing models are limited by their reliance on explicit human instructions and lack the active reasoning capabilities needed to understand complex clinical questions. While recent advancements in multimodal large language models (MLLMs) have improved medical question-answering (QA) tasks, most methods struggle to generate precise segmentation masks, limiting their application in automatic medical diagnosis. In this paper, we introduce medical image reasoning segmentation, a novel task that aims to generate segmentation masks based on complex and implicit medical instructions. To address this, we propose MedSeg-R, an end-to-end framework that leverages the reasoning abilities of MLLMs to interpret clinical questions while also producing corresponding precise segmentation masks for medical images. It is built on two core components: 1) a global context understanding module that interprets images and comprehends complex medical instructions to generate multi-modal intermediate tokens, and 2) a pixel-level grounding module that decodes these tokens to produce precise segmentation masks and textual responses. Furthermore, we introduce MedSeg-QA, a large-scale dataset tailored for the medical image reasoning segmentation task. It includes over 10,000 image-mask pairs and multi-turn conversations, automatically annotated using large language models and refined through physician reviews. Experiments show MedSeg-R's superior performance across several benchmarks, achieving high segmentation accuracy and enabling interpretable textual analysis of medical images.
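A minimal sketch of the pixel-level grounding idea: a special segmentation-token embedding emitted by the language model is projected and correlated with image features to produce mask logits, in the spirit of LISA-style grounding. The dimensions and projection head are illustrative assumptions rather than the paper's exact design.

```python
# Project an LLM segmentation-token embedding and correlate it with image features.
import torch
import torch.nn as nn

class GroundingHead(nn.Module):
    def __init__(self, llm_dim=4096, img_dim=256):
        super().__init__()
        self.proj = nn.Linear(llm_dim, img_dim)

    def forward(self, seg_token, img_feats):          # (llm_dim,), (img_dim, H, W)
        q = self.proj(seg_token)                      # (img_dim,)
        logits = torch.einsum("c,chw->hw", q, img_feats)
        return logits                                 # upsample + sigmoid downstream

head = GroundingHead()
mask_logits = head(torch.randn(4096), torch.randn(256, 64, 64))
print(mask_logits.shape)                              # torch.Size([64, 64])
```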

Yuliang Zhu, Jing Cheng, Qi Xie, Zhuo-Xu Cui, Qingyong Zhu, Yuanyuan Liu, Xin Liu, Jianfeng Ren, Chengbo Wang, Dong Liang

arXiv preprint · Jun 12, 2025
Dynamic Magnetic Resonance Imaging (MRI) exhibits transformation symmetries, including spatial rotation symmetry within individual frames and temporal symmetry along the time dimension. Explicit incorporation of these symmetry priors in the reconstruction model can significantly improve image quality, especially under aggressive undersampling scenarios. Recently, equivariant convolutional neural networks (ECNNs) have shown great promise in exploiting spatial symmetry priors. However, existing ECNNs critically fail to model temporal symmetry, arguably the most universal and informative structural prior in dynamic MRI reconstruction. To tackle this issue, we propose a novel Deep Unrolling Network with Spatiotemporal Rotation Equivariance (DUN-SRE) for dynamic MRI reconstruction. DUN-SRE establishes spatiotemporal equivariance through a (2+1)D equivariant convolutional architecture. In particular, it integrates both the data consistency and proximal mapping modules into a unified deep unrolling framework. This architecture ensures rigorous propagation of spatiotemporal rotation symmetry constraints throughout the reconstruction process, enabling more physically accurate modeling of cardiac motion dynamics in cine MRI. In addition, a high-fidelity group filter parameterization mechanism is developed to maintain representation precision while enforcing symmetry constraints. Comprehensive experiments on cardiac cine MRI datasets demonstrate that DUN-SRE achieves state-of-the-art performance, particularly in preserving rotation-symmetric structures, and offers strong generalization to a broad range of dynamic MRI reconstruction tasks.
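A rough sketch of one unrolled iteration in this spirit: a learned spatiotemporal proximal step followed by a k-space data-consistency step. The (2+1)D equivariant proximal network is replaced here by a plain (2+1)D convolution stack; group-equivariant filters and the paper's filter parameterization are not implemented.

```python
# One unrolled iteration: (2+1)D proximal stack + Cartesian k-space data consistency.
import torch
import torch.nn as nn

class Spatiotemporal21D(nn.Module):
    """Factorized (2+1)D conv: spatial 1x3x3 followed by temporal 3x1x1."""
    def __init__(self, ch=2):
        super().__init__()
        self.spatial = nn.Conv3d(ch, ch, (1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(ch, ch, (3, 1, 1), padding=(1, 0, 0))

    def forward(self, x):                     # (B, 2, T, H, W): real/imag channels
        return self.temporal(torch.relu(self.spatial(x)))

def data_consistency(x, y, mask):
    """Re-impose acquired k-space samples; x, y are complex (B, T, H, W)."""
    k = torch.fft.fft2(x, norm="ortho")
    k = torch.where(mask.bool(), y, k)
    return torch.fft.ifft2(k, norm="ortho")

x = torch.randn(1, 8, 64, 64, dtype=torch.complex64)   # current image estimate
y = torch.randn_like(x)                                 # acquired undersampled k-space
mask = (torch.rand(1, 8, 64, 64) > 0.75).float()        # Cartesian sampling pattern
print(data_consistency(x, y, mask).shape)               # torch.Size([1, 8, 64, 64])
print(Spatiotemporal21D()(torch.randn(1, 2, 8, 64, 64)).shape)
```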

Andrea Moglia, Matteo Leccardi, Matteo Cavicchioli, Alice Maccarini, Marco Marcon, Luca Mainardi, Pietro Cerveri

arXiv preprint · Jun 12, 2025
Following the successful paradigm shift of large language models, leveraging pre-training on a massive corpus of data and fine-tuning on different downstream tasks, generalist models have made their foray into computer vision. The introduction of the Segment Anything Model (SAM) set a milestone for segmentation of natural images, inspiring the design of a multitude of architectures for medical image segmentation. In this survey we offer a comprehensive and in-depth investigation of generalist models for medical image segmentation. We start with an introduction to the fundamental concepts underpinning their development. Then, we provide a taxonomy of the different variants of SAM, covering zero-shot, few-shot, fine-tuning, and adapter settings, the recent SAM 2, other innovative models trained on images alone, and others trained on both text and images. We thoroughly analyze their performance at the level of both primary research and best-in-literature results, followed by a rigorous comparison with state-of-the-art task-specific models. We emphasize the need to address challenges in terms of compliance with regulatory frameworks, privacy and security laws, budget, and trustworthy artificial intelligence (AI). Finally, we share our perspective on future directions concerning synthetic data, early fusion, lessons learnt from generalist models in natural language processing, agentic AI and physical AI, and clinical translation.

Konstantinos Vilouras, Ilias Stogiannidis, Junyu Yan, Alison Q. O'Neil, Sotirios A. Tsaftaris

arXiv preprint · Jun 12, 2025
Latent Diffusion Models have shown remarkable results in text-guided image synthesis in recent years. In the domain of natural (RGB) images, recent works have shown that such models can be adapted to various vision-language downstream tasks with little to no supervision involved. In contrast, text-to-image Latent Diffusion Models remain relatively underexplored in the field of medical imaging, primarily due to limited data availability (e.g., due to privacy concerns). In this work, focusing on the chest X-ray modality, we first demonstrate that a standard text-conditioned Latent Diffusion Model has not learned to align clinically relevant information in free-text radiology reports with the corresponding areas of the given scan. Then, to alleviate this issue, we propose a fine-tuning framework to improve multi-modal alignment in a pre-trained model such that it can be efficiently repurposed for downstream tasks such as phrase grounding. Our method sets a new state-of-the-art on a standard benchmark dataset (MS-CXR), while also exhibiting robust performance on out-of-distribution data (VinDr-CXR). Our code will be made publicly available.
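A conceptual sketch of how phrase grounding can be read out of a text-conditioned diffusion model: average the cross-attention that image locations pay to the phrase's text tokens across layers and heads, then renormalize into a heatmap. The tensor shapes are assumptions, and the collection of attention maps from the model (e.g., via forward hooks) is omitted.

```python
# Turn collected cross-attention maps into a phrase-grounding heatmap (toy shapes).
import torch

def phrase_heatmap(cross_attn, phrase_token_ids, spatial_hw):
    """cross_attn: (layers, heads, H*W, n_text_tokens) attention probabilities."""
    sel = cross_attn[..., phrase_token_ids]           # keep only the phrase's tokens
    heat = sel.mean(dim=(0, 1, 3))                    # average layers, heads, tokens
    heat = heat.reshape(spatial_hw)
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)

attn = torch.rand(4, 8, 16 * 16, 77).softmax(dim=-1)   # stand-in attention maps
print(phrase_heatmap(attn, [5, 6, 7], (16, 16)).shape)  # torch.Size([16, 16])
```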