Sort by:
Page 152 of 2092082 results

MobNas ensembled model for breast cancer prediction.

Shahzad T, Saqib SM, Mazhar T, Iqbal M, Almogren A, Ghadi YY, Saeed MM, Hamam H

pubmed logopapersMay 25 2025
Breast cancer poses a real and immense threat to humankind, thus a need to develop a way of diagnosing this devastating disease early, accurately, and in a simpler manner. Thus, while substantial progress has been made in developing machine learning algorithms, deep learning, and transfer learning models, issues with diagnostic accuracy and minimizing diagnostic errors persist. This paper introduces MobNAS, a model that uses MobileNetV2 and NASNetLarge to sort breast cancer images into benign, malignant, or normal classes. The study employs a multi-class classification design and uses a publicly available dataset comprising 1,578 ultrasound images, including 891 benign, 421 malignant, and 266 normal cases. By deploying MobileNetV2, it is easy to work well on devices with less computational capability than is used by NASNetLarge, which enhances its applicability and effectiveness in other tasks. The performance of the proposed MobNAS model was tested on the breast cancer image dataset, and the accuracy level achieved was 97%, the Mean Absolute Error (MAE) was 0.05, and the Matthews Correlation Coefficient (MCC) was 95%. From the findings of this research, it is evident that MobNAS can enhance diagnostic accuracy and reduce existing shortcomings in breast cancer detection.

Sex-related differences and associated transcriptional signatures in the brain ventricular system and cerebrospinal fluid development in full-term neonates.

Sun Y, Fu C, Gu L, Zhao H, Feng Y, Jin C

pubmed logopapersMay 25 2025
The cerebrospinal fluid (CSF) is known to serve as a unique environment for neurodevelopment, with specific proteins secreted by epithelial cells of the choroid plexus (CP) playing crucial roles in cortical development and cell differentiation. Sex-related differences in the brain in early life have been widely identified, but few studies have investigated the neonatal CSF system and associated transcriptional signatures. This study included 75 full-term neonates [44 males and 31 females; gestational age (GA) = 37-42 weeks] without significant MRI abnormalities from the dHCP (developing Human Connectome Project) database. Deep-learning automated segmentation was used to measure various metrics of the brain ventricular system and CSF. Sex-related differences and relationships with postnatal age were analyzed by linear regression. Correlations between the CP and CSF space metrics were also examined. LASSO regression was further applied to identify the key genes contributing to the sex-related CSF system differences by using regional gene expression data from the Allen Human Brain Atlas. Right lateral ventricles [2.42 ± 0.98 vs. 2.04 ± 0.45 cm3 (mean ± standard deviation), p = 0.036] and right CP (0.16 ± 0.07 vs. 0.13 ± 0.04 cm3, p = 0.024) were larger in males, with a stronger volume correlation (male/female correlation coefficients r: 0.798 vs. 0.649, p < 1 × 10<sup>- 4</sup>). No difference was found in total CSF volume, while peripheral CSF (male/female β: 1.218 vs. 1.064) and CP (male/female β: 0.008 vs. 0.005) exhibited relatively faster growth in males. Additionally, the volumes of the lateral ventricular system, third ventricle, peripheral CSF, and total CSF were significantly correlated with their corresponding CP volume (r: 0.362 to 0.799, p < 0.05). DERL2 (Degradation in Endoplasmic Reticulum Protein 2) (r = 0.1319) and MRPL48 (Mitochondrial Large Ribosomal Subunit Protein) (r=-0.0370) were identified as potential key genes associated with sex-related differences in CSF system. Male neonates present larger volumes and faster growth of the right lateral ventricle, likely linked to corresponding CP volume and growth pattern. The downregulation of DERL2 and upregulation of MRPL48 may contribute to these sex-related variations in the CSF system, suggesting a molecular basis for sex-specific brain development.

A novel network architecture for post-applicator placement CT auto-contouring in cervical cancer HDR brachytherapy.

Lei Y, Chao M, Yang K, Gupta V, Yoshida EJ, Wang T, Yang X, Liu T

pubmed logopapersMay 25 2025
High-dose-rate brachytherapy (HDR-BT) is an integral part of treatment for locally advanced cervical cancer, requiring accurate segmentation of the high-risk clinical target volume (HR-CTV) and organs at risk (OARs) on post-applicator CT (pCT) for precise and safe dose delivery. Manual contouring, however, is time-consuming and highly variable, with challenges heightened in cervical HDR-BT due to complex anatomy and low tissue contrast. An effective auto-contouring solution could significantly enhance efficiency, consistency, and accuracy in cervical HDR-BT planning. To develop a machine learning-based approach that improves the accuracy and efficiency of HR-CTV and OAR segmentation on pCT images for cervical HDR-BT. The proposed method employs two sequential deep learning models to segment target and OARs from planning CT data. The intuitive model, a U-Net, initially segments simpler structures such as the bladder and HR-CTV, utilizing shallow features and iodine contrast agents. Building on this, the sophisticated model targets complex structures like the sigmoid, rectum, and bowel, addressing challenges from low contrast, anatomical proximity, and imaging artifacts. This model incorporates spatial information from the intuitive model and uses total variation regularization to improve segmentation smoothness by applying a penalty to changes in gradient. This dual-model approach improves accuracy and consistency in segmenting high-risk clinical target volumes and organs at risk in cervical HDR-BT. To validate the proposed method, 32 cervical cancer patients treated with tandem and ovoid (T&O) HDR brachytherapy (3-5 fractions, 115 CT images) were retrospectively selected. The method's performance was assessed using four-fold cross-validation, comparing segmentation results to manual contours across five metrics: Dice similarity coefficient (DSC), 95% Hausdorff distance (HD<sub>95</sub>), mean surface distance (MSD), center-of-mass distance (CMD), and volume difference (VD). Dosimetric evaluations included D90 for HR-CTV and D2cc for OARs. The proposed method demonstrates high segmentation accuracy for HR-CTV, bladder, and rectum, achieving DSC values of 0.79 ± 0.06, 0.83 ± 0.10, and 0.76 ± 0.15, MSD values of 1.92 ± 0.77 mm, 2.24 ± 1.20 mm, and 4.18 ± 3.74 mm, and absolute VD values of 5.34 ± 4.85 cc, 17.16 ± 17.38 cc, and 18.54 ± 16.83 cc, respectively. Despite challenges in bowel and sigmoid segmentation due to poor soft tissue contrast in CT and variability in manual contouring (ground truth volumes of 128.48 ± 95.9 cc and 51.87 ± 40.67 cc), the method significantly outperforms two state-of-the-art methods on DSC, MSD, and CMD metrics (p-value < 0.05). For HR-CTV, the mean absolute D90 difference was 0.42 ± 1.17 Gy (p-value > 0.05), less than 5% of the prescription dose. Over 75% of cases showed changes within ± 0.5 Gy, and fewer than 10% exceeded ± 1 Gy. The mean and variation in structure volume and D2cc parameters between manual and segmented contours for OARs showed no significant differences (p-value > 0.05), with mean absolute D2cc differences within 0.5 Gy, except for the bladder, which exhibited higher variability (0.97 Gy). Our innovative auto-contouring method showed promising results in segmenting HR-CTV and OARs from pCT, potentially enhancing the efficiency of HDR BT cervical treatment planning. Further validation and clinical implementation are required to fully realize its clinical benefits.

PolyPose: Localizing Deformable Anatomy in 3D from Sparse 2D X-ray Images using Polyrigid Transforms

Vivek Gopalakrishnan, Neel Dey, Polina Golland

arxiv logopreprintMay 25 2025
Determining the 3D pose of a patient from a limited set of 2D X-ray images is a critical task in interventional settings. While preoperative volumetric imaging (e.g., CT and MRI) provides precise 3D localization and visualization of anatomical targets, these modalities cannot be acquired during procedures, where fast 2D imaging (X-ray) is used instead. To integrate volumetric guidance into intraoperative procedures, we present PolyPose, a simple and robust method for deformable 2D/3D registration. PolyPose parameterizes complex 3D deformation fields as a composition of rigid transforms, leveraging the biological constraint that individual bones do not bend in typical motion. Unlike existing methods that either assume no inter-joint movement or fail outright in this under-determined setting, our polyrigid formulation enforces anatomically plausible priors that respect the piecewise rigid nature of human movement. This approach eliminates the need for expensive deformation regularizers that require patient- and procedure-specific hyperparameter optimization. Across extensive experiments on diverse datasets from orthopedic surgery and radiotherapy, we show that this strong inductive bias enables PolyPose to successfully align the patient's preoperative volume to as few as two X-ray images, thereby providing crucial 3D guidance in challenging sparse-view and limited-angle settings where current registration methods fail.

Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning

Shaohao Rui, Kaitao Chen, Weijie Ma, Xiaosong Wang

arxiv logopreprintMay 25 2025
Recent advances in reinforcement learning with verifiable, rule-based rewards have greatly enhanced the reasoning capabilities and out-of-distribution generalization of VLMs/LLMs, obviating the need for manually crafted reasoning chains. Despite these promising developments in the general domain, their translation to medical imaging remains limited. Current medical reinforcement fine-tuning (RFT) methods predominantly focus on close-ended VQA, thereby restricting the model's ability to engage in world knowledge retrieval and flexible task adaptation. More critically, these methods fall short of addressing the critical clinical demand for open-ended, reasoning-intensive decision-making. To bridge this gap, we introduce \textbf{MedCCO}, the first multimodal reinforcement learning framework tailored for medical VQA that unifies close-ended and open-ended data within a curriculum-driven RFT paradigm. Specifically, MedCCO is initially fine-tuned on a diverse set of close-ended medical VQA tasks to establish domain-grounded reasoning capabilities, and is then progressively adapted to open-ended tasks to foster deeper knowledge enhancement and clinical interpretability. We validate MedCCO across eight challenging medical VQA benchmarks, spanning both close-ended and open-ended settings. Experimental results show that MedCCO consistently enhances performance and generalization, achieving a 11.4\% accuracy gain across three in-domain tasks, and a 5.7\% improvement on five out-of-domain benchmarks. These findings highlight the promise of curriculum-guided RL in advancing robust, clinically-relevant reasoning in medical multimodal language models.

SPARS: Self-Play Adversarial Reinforcement Learning for Segmentation of Liver Tumours

Catalina Tan, Yipeng Hu, Shaheer U. Saeed

arxiv logopreprintMay 25 2025
Accurate tumour segmentation is vital for various targeted diagnostic and therapeutic procedures for cancer, e.g., planning biopsies or tumour ablations. Manual delineation is extremely labour-intensive, requiring substantial expert time. Fully-supervised machine learning models aim to automate such localisation tasks, but require a large number of costly and often subjective 3D voxel-level labels for training. The high-variance and subjectivity in such labels impacts model generalisability, even when large datasets are available. Histopathology labels may offer more objective labels but the infeasibility of acquiring pixel-level annotations to develop tumour localisation methods based on histology remains challenging in-vivo. In this work, we propose a novel weakly-supervised semantic segmentation framework called SPARS (Self-Play Adversarial Reinforcement Learning for Segmentation), which utilises an object presence classifier, trained on a small number of image-level binary cancer presence labels, to localise cancerous regions on CT scans. Such binary labels of patient-level cancer presence can be sourced more feasibly from biopsies and histopathology reports, enabling a more objective cancer localisation on medical images. Evaluating with real patient data, we observed that SPARS yielded a mean dice score of $77.3 \pm 9.4$, which outperformed other weakly-supervised methods by large margins. This performance was comparable with recent fully-supervised methods that require voxel-level annotations. Our results demonstrate the potential of using SPARS to reduce the need for extensive human-annotated labels to detect cancer in real-world healthcare settings.

MedITok: A Unified Tokenizer for Medical Image Synthesis and Interpretation

Chenglong Ma, Yuanfeng Ji, Jin Ye, Zilong Li, Chenhui Wang, Junzhi Ning, Wei Li, Lihao Liu, Qiushan Guo, Tianbin Li, Junjun He, Hongming Shan

arxiv logopreprintMay 25 2025
Advanced autoregressive models have reshaped multimodal AI. However, their transformative potential in medical imaging remains largely untapped due to the absence of a unified visual tokenizer -- one capable of capturing fine-grained visual structures for faithful image reconstruction and realistic image synthesis, as well as rich semantics for accurate diagnosis and image interpretation. To this end, we present MedITok, the first unified tokenizer tailored for medical images, encoding both low-level structural details and high-level clinical semantics within a unified latent space. To balance these competing objectives, we introduce a novel two-stage training framework: a visual representation alignment stage that cold-starts the tokenizer reconstruction learning with a visual semantic constraint, followed by a textual semantic representation alignment stage that infuses detailed clinical semantics into the latent space. Trained on the meticulously collected large-scale dataset with over 30 million medical images and 2 million image-caption pairs, MedITok achieves state-of-the-art performance on more than 30 datasets across 9 imaging modalities and 4 different tasks. By providing a unified token space for autoregressive modeling, MedITok supports a wide range of tasks in clinical diagnostics and generative healthcare applications. Model and code will be made publicly available at: https://github.com/Masaaki-75/meditok.

CardioCoT: Hierarchical Reasoning for Multimodal Survival Analysis

Shaohao Rui, Haoyang Su, Jinyi Xiang, Lian-Ming Wu, Xiaosong Wang

arxiv logopreprintMay 25 2025
Accurate prediction of major adverse cardiovascular events recurrence risk in acute myocardial infarction patients based on postoperative cardiac MRI and associated clinical notes is crucial for precision treatment and personalized intervention. Existing methods primarily focus on risk stratification capability while overlooking the need for intermediate robust reasoning and model interpretability in clinical practice. Moreover, end-to-end risk prediction using LLM/VLM faces significant challenges due to data limitations and modeling complexity. To bridge this gap, we propose CardioCoT, a novel two-stage hierarchical reasoning-enhanced survival analysis framework designed to enhance both model interpretability and predictive performance. In the first stage, we employ an evidence-augmented self-refinement mechanism to guide LLM/VLMs in generating robust hierarchical reasoning trajectories based on associated radiological findings. In the second stage, we integrate the reasoning trajectories with imaging data for risk model training and prediction. CardioCoT demonstrates superior performance in MACE recurrence risk prediction while providing interpretable reasoning processes, offering valuable insights for clinical decision-making.

CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation

Jiong Wu, Yang Xing, Boxiao Yu, Wei Shao, Kuang Gong

arxiv logopreprintMay 25 2025
Most publicly available medical segmentation datasets are only partially labeled, with annotations provided for a subset of anatomical structures. When multiple datasets are combined for training, this incomplete annotation poses challenges, as it limits the model's ability to learn shared anatomical representations among datasets. Furthermore, vision-only frameworks often fail to capture complex anatomical relationships and task-specific distinctions, leading to reduced segmentation accuracy and poor generalizability to unseen datasets. In this study, we proposed a novel CLIP-DINO Prompt-Driven Segmentation Network (CDPDNet), which combined a self-supervised vision transformer with CLIP-based text embedding and introduced task-specific text prompts to tackle these challenges. Specifically, the framework was constructed upon a convolutional neural network (CNN) and incorporated DINOv2 to extract both fine-grained and global visual features, which were then fused using a multi-head cross-attention module to overcome the limited long-range modeling capability of CNNs. In addition, CLIP-derived text embeddings were projected into the visual space to help model complex relationships among organs and tumors. To further address the partial label challenge and enhance inter-task discriminative capability, a Text-based Task Prompt Generation (TTPG) module that generated task-specific prompts was designed to guide the segmentation. Extensive experiments on multiple medical imaging datasets demonstrated that CDPDNet consistently outperformed existing state-of-the-art segmentation methods. Code and pretrained model are available at: https://github.com/wujiong-hub/CDPDNet.git.

CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation

Jiong Wu, Yang Xing, Boxiao Yu, Wei Shao, Kuang Gong

arxiv logopreprintMay 25 2025
Most publicly available medical segmentation datasets are only partially labeled, with annotations provided for a subset of anatomical structures. When multiple datasets are combined for training, this incomplete annotation poses challenges, as it limits the model's ability to learn shared anatomical representations among datasets. Furthermore, vision-only frameworks often fail to capture complex anatomical relationships and task-specific distinctions, leading to reduced segmentation accuracy and poor generalizability to unseen datasets. In this study, we proposed a novel CLIP-DINO Prompt-Driven Segmentation Network (CDPDNet), which combined a self-supervised vision transformer with CLIP-based text embedding and introduced task-specific text prompts to tackle these challenges. Specifically, the framework was constructed upon a convolutional neural network (CNN) and incorporated DINOv2 to extract both fine-grained and global visual features, which were then fused using a multi-head cross-attention module to overcome the limited long-range modeling capability of CNNs. In addition, CLIP-derived text embeddings were projected into the visual space to help model complex relationships among organs and tumors. To further address the partial label challenge and enhance inter-task discriminative capability, a Text-based Task Prompt Generation (TTPG) module that generated task-specific prompts was designed to guide the segmentation. Extensive experiments on multiple medical imaging datasets demonstrated that CDPDNet consistently outperformed existing state-of-the-art segmentation methods. Code and pretrained model are available at: https://github.com/wujiong-hub/CDPDNet.git.
Page 152 of 2092082 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.