
Mamba-based deformable medical image registration with an annotated brain MR-CT dataset.

Wang Y, Guo T, Yuan W, Shu S, Meng C, Bai X

PubMed · Jul 1, 2025
Deformable registration is essential in medical image analysis, especially for handling various multi- and mono-modal registration tasks in neuroimaging. Existing studies have left brain MR-CT registration largely unexplored, and learning-based methods still face challenges in improving both accuracy and efficiency. To broaden the practice of multi-modal registration in the brain, we present SR-Reg, a new benchmark dataset comprising 180 volumetric paired MR-CT images with annotated anatomical regions. Building on this foundation, we introduce MambaMorph, a novel deformable registration network that uses the efficient state space model Mamba for global feature learning together with a fine-grained feature extractor for low-level embedding. Experimental results demonstrate that MambaMorph surpasses advanced ConvNet-based and Transformer-based networks across several multi- and mono-modal tasks, showing notable gains in both accuracy and efficiency. Code and dataset are available at https://github.com/mileswyn/MambaMorph.
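
For orientation, the snippet below sketches the spatial warping step that learning-based deformable registration networks typically end with: a dense displacement field resamples the moving volume. This is a generic, VoxelMorph-style illustration in PyTorch, not the authors' MambaMorph code; all shapes and names are placeholders.

# Minimal sketch of displacement-field warping for deformable registration.
import torch
import torch.nn.functional as F

def warp(moving: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a moving volume (N, C, D, H, W) by a dense displacement
    field `flow` (N, 3, D, H, W) given in voxel units."""
    n, _, d, h, w = moving.shape
    # Identity sampling grid in voxel coordinates (z, y, x channels).
    zz, yy, xx = torch.meshgrid(
        torch.arange(d), torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((zz, yy, xx), dim=0).float().to(moving.device)
    coords = grid.unsqueeze(0) + flow  # displaced voxel coordinates
    # Normalize each axis to [-1, 1] and reorder to (x, y, z) for grid_sample.
    for i, size in enumerate((d, h, w)):
        coords[:, i] = 2.0 * coords[:, i] / (size - 1) - 1.0
    coords = coords.permute(0, 2, 3, 4, 1)[..., [2, 1, 0]]
    return F.grid_sample(moving, coords, align_corners=True)

moved = warp(torch.rand(1, 1, 32, 32, 32), torch.zeros(1, 3, 32, 32, 32))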

Prediction of PD-L1 expression in NSCLC patients using PET/CT radiomics and prognostic modelling for immunotherapy in PD-L1-positive NSCLC patients.

Peng M, Wang M, Yang X, Wang Y, Xie L, An W, Ge F, Yang C, Wang K

PubMed · Jul 1, 2025
To develop a positron emission tomography/computed tomography (PET/CT)-based radiomics model for predicting programmed cell death ligand 1 (PD-L1) expression in non-small cell lung cancer (NSCLC) patients and estimating progression-free survival (PFS) and overall survival (OS) in PD-L1-positive patients undergoing first-line immunotherapy. We retrospectively analysed 143 NSCLC patients who underwent pretreatment 18F-fluorodeoxyglucose (18F-FDG) PET/CT scans, of whom 86 were PD-L1-positive. Clinical data collected included gender, age, smoking history, Tumor-Node-Metastasis (TNM) stage, pathologic type, laboratory parameters, and PET metabolic parameters. Four machine learning algorithms (Bayes, logistic regression, random forest, and support vector machine (SVM)) were used to build models. Predictive performance was validated using receiver operating characteristic (ROC) curves. Univariate and multivariate Cox analyses identified independent predictors of OS and PFS in PD-L1-positive patients undergoing immunotherapy, and a nomogram was created to predict OS. A total of 20 models were built for predicting PD-L1 expression. The clinical combined PET/CT radiomics model based on the SVM algorithm performed best (area under the curve for the training and test sets: 0.914 and 0.877, respectively). The Cox analyses showed that smoking history independently predicted PFS. SUVmean, monocyte percentage, and white blood cell count were independent predictors of OS, and a nomogram was created to predict 1-year, 2-year, and 3-year OS from these three factors. We developed PET/CT-based machine learning models to help predict PD-L1 expression in NSCLC patients and identified independent predictors of PFS and OS in PD-L1-positive patients receiving immunotherapy, thereby aiding precision treatment.
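
As a rough illustration of the modelling step, the scikit-learn sketch below fits an SVM with probability outputs on a combined clinical-plus-radiomics feature matrix and reports the test-set ROC AUC. The data, feature count, and split are synthetic placeholders, not the study's.

# Hedged sketch: SVM-based PD-L1 prediction with ROC-AUC evaluation.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(143, 30))      # placeholder clinical + radiomics features
y = rng.integers(0, 2, size=143)    # placeholder PD-L1 labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
model.fit(X_tr, y_tr)
print("test AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))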

World of Forms: Deformable geometric templates for one-shot surface meshing in coronary CT angiography.

van Herten RLM, Lagogiannis I, Wolterink JM, Bruns S, Meulendijks ER, Dey D, de Groot JR, Henriques JP, Planken RN, Saitta S, Išgum I

PubMed · Jul 1, 2025
Deep learning-based medical image segmentation and surface mesh generation typically involve a sequential pipeline from image to segmentation to meshes, often requiring large training datasets while making limited use of prior geometric knowledge. This may lead to topological inconsistencies and suboptimal performance in low-data regimes. To address these challenges, we propose a data-efficient deep learning method for direct 3D anatomical object surface meshing using geometric priors. Our approach employs a multi-resolution graph neural network that operates on a prior geometric template which is deformed to fit the object boundaries of interest. We show how different templates may be used for different surface meshing targets, and introduce a novel masked autoencoder pretraining strategy for 3D spherical data. The proposed method outperforms nnUNet in a one-shot setting for segmentation of the pericardium, the left ventricle (LV) cavity, and the LV myocardium. Similarly, the method outperforms other lumen segmentation methods operating on multi-planar reformatted images. Results further indicate that mesh quality is on par with or improves upon marching cubes post-processing of voxel mask predictions, while remaining flexible in the choice of mesh triangulation prior, thus paving the way for more accurate and topologically consistent 3D medical object surface meshing.
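
A toy sketch of the core idea follows: a network predicts per-vertex displacements that deform a fixed geometric template toward the object boundary. The MLP below stands in for the paper's multi-resolution graph neural network, and all shapes are invented.

# Illustrative template deformation: predicted offsets move prior vertices.
import torch
import torch.nn as nn

class TemplateDeformer(nn.Module):
    def __init__(self, feat_dim: int = 16):
        super().__init__()
        # Stand-in for a multi-resolution GNN: an MLP over vertex features.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, template_xyz: torch.Tensor, vert_feats: torch.Tensor):
        # template_xyz: (V, 3) prior mesh vertices; vert_feats: (V, F) image
        # features sampled at each vertex. Returns deformed vertices.
        return template_xyz + self.head(vert_feats)

sphere = torch.randn(642, 3)   # placeholder spherical-template vertices
feats = torch.randn(642, 16)   # placeholder per-vertex image features
deformed = TemplateDeformer()(sphere, feats)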

TIER-LOC: Visual Query-based Video Clip Localization in fetal ultrasound videos with a multi-tier transformer.

Mishra D, Saha P, Zhao H, Hernandez-Cruz N, Patey O, Papageorghiou AT, Noble JA

PubMed · Jul 1, 2025
In this paper, we introduce the Visual Query-based task of Video Clip Localization (VQ-VCL) for medical video understanding. Specifically, we aim to retrieve a video clip containing frames similar to a given exemplar frame from a given input video. To solve the task, we propose a novel visual query-based video clip localization model called TIER-LOC. TIER-LOC is designed to improve video clip retrieval, especially in fine-grained videos, by extracting features at different levels, i.e., coarse to fine-grained, referred to as TIERS. The aim is to utilize multi-tier features to detect subtle differences and adapt to scale or resolution variations, leading to improved video clip retrieval. TIER-LOC has three main components: (1) a Multi-Tier Spatio-Temporal Transformer that fuses spatio-temporal features extracted from multiple tiers of video frames with features from multiple tiers of the visual query, enabling better video understanding; (2) a Multi-Tier, Dual Anchor Contrastive Loss to deal with real-world annotation noise, which can be notable at event boundaries and in videos featuring highly similar objects; (3) a Temporal Uncertainty-Aware Localization Loss designed to reduce the model's sensitivity to imprecise event boundaries. This is achieved by relaxing hard boundary constraints, allowing the model to learn underlying class patterns rather than be influenced by individual noisy samples. To demonstrate the efficacy of TIER-LOC, we evaluate it on two ultrasound video datasets and an open-source egocentric video dataset. First, we develop a sonographer workflow assistive task model to detect standard-frame clips in fetal ultrasound heart sweeps. Second, we assess the model's performance in retrieving standard-frame clips for detecting fetal anomalies in routine ultrasound scans, using the large-scale PULSE dataset. Lastly, we test the model on an open-source computer vision dataset by creating a VQ-VCL fine-grained video dataset based on the Ego4D dataset. Our model outperforms the best-performing state-of-the-art model by 7%, 4%, and 4% on the three video datasets, respectively.
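
To fix ideas, the snippet below implements a generic InfoNCE-style contrastive loss between a visual-query embedding and candidate frame embeddings. It is a simplified stand-in for the paper's Multi-Tier Dual Anchor Contrastive Loss; the temperature and dimensions are invented.

# Generic query-to-frame contrastive loss (InfoNCE), for illustration only.
import torch
import torch.nn.functional as F

def query_frame_contrastive(query, frames, pos_idx, tau: float = 0.07):
    """query: (D,) embedding; frames: (T, D); pos_idx: matching frame index."""
    q = F.normalize(query, dim=0)
    f = F.normalize(frames, dim=1)
    logits = f @ q / tau                       # (T,) similarity scores
    target = torch.tensor(pos_idx)
    return F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))

loss = query_frame_contrastive(torch.randn(128), torch.randn(50, 128), 7)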

Automated vertebrae identification and segmentation with structural uncertainty analysis in longitudinal CT scans of patients with multiple myeloma.

Madzia-Madzou DK, Jak M, de Keizer B, Verlaan JJ, Minnema MC, Gilhuijs K

PubMed · Jul 1, 2025
To optimize deep learning-based vertebrae segmentation in longitudinal CT scans of multiple myeloma patients using structural uncertainty analysis. Retrospective CT scans from 474 multiple myeloma patients were divided into a training cohort (179 patients, 349 scans, 2005-2011) and a test cohort (295 patients, 671 scans, 2012-2020). An enhanced segmentation pipeline was developed on the training cohort. It integrated vertebrae segmentation using an open-source deep learning method (Payer's) with a post-hoc structural uncertainty analysis. This analysis identified inconsistencies, automatically correcting them or flagging uncertain regions for human review. Segmentation quality was assessed through vertebral shape analysis using topology. Metrics included 'identification rate', 'longitudinal vertebral match rate', 'success rate', and 'series success rate', and were evaluated across age and sex subgroups. Statistical analysis included McNemar and Wilcoxon signed-rank tests, with p < 0.05 indicating significant improvement. Payer's method achieved an identification rate of 95.8% and a success rate of 86.7%. The proposed pipeline automatically improved these metrics to 98.8% and 96.0%, respectively (p < 0.001). Additionally, 3.6% of scans were marked for human inspection, increasing the success rate from 96.0% to 98.8% (p < 0.001). The vertebral match rate increased from 97.0% to 99.7% (p < 0.001), and the series success rate from 80.0% to 95.4% (p < 0.001). Subgroup analysis showed more consistent performance across age and sex groups. The proposed pipeline significantly outperforms Payer's method, enhancing segmentation accuracy and reducing longitudinal matching errors while minimizing evaluation workload. Its uncertainty analysis ensures robust performance, making it a valuable tool for longitudinal studies in multiple myeloma.
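
The McNemar comparison reported above can be reproduced in outline with statsmodels on paired per-scan outcomes, as sketched below; the counts in the 2x2 table are made up for illustration.

# Hedged sketch: McNemar test on paired success/failure outcomes.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Rows: baseline success/failure; columns: proposed-pipeline success/failure.
table = np.array([[560, 5],
                  [62, 44]])   # illustrative counts, not the study's
result = mcnemar(table, exact=True)
print(f"McNemar p-value: {result.pvalue:.4g}")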

Cascade learning in multi-task encoder-decoder networks for concurrent bone segmentation and glenohumeral joint clinical assessment in shoulder CT scans.

Marsilio L, Marzorati D, Rossi M, Moglia A, Mainardi L, Manzotti A, Cerveri P

PubMed · Jul 1, 2025
Osteoarthritis is a degenerative condition that affects bones and cartilage, often leading to structural changes, including osteophyte formation, bone density loss, and the narrowing of joint spaces. Over time, this process may disrupt glenohumeral (GH) joint functionality, requiring targeted treatment. Various options are available to restore joint function, ranging from conservative management to surgical intervention, depending on the severity of the condition. This work introduces a deep learning framework for processing shoulder CT scans. It features semantic segmentation of the proximal humerus and scapula, 3D reconstruction of bone surfaces, identification of the GH joint region, and staging of three common osteoarthritis-related conditions: osteophyte formation (OS), GH space reduction (JS), and humeroscapular alignment (HSA). Each condition was stratified into multiple severity stages, offering a comprehensive analysis of shoulder bone structure pathology. The pipeline comprised two cascaded CNN architectures: 3D CEL-UNet for segmentation and 3D Arthro-Net for threefold classification. A retrospective dataset of 571 CT scans featuring patients with various degrees of GH osteoarthritis-related pathology was used to train, validate, and test the pipeline. Median root mean squared error and Hausdorff distance for 3D reconstruction were 0.22 mm and 1.48 mm for the humerus and 0.24 mm and 1.48 mm for the scapula, outperforming state-of-the-art architectures and making the framework potentially suitable for patient-specific instrumentation (PSI)-based preoperative planning in shoulder arthroplasty. Classification accuracy for OS, JS, and HSA consistently reached around 90% across all three categories. The computational time for the entire inference pipeline was less than 15 s, showcasing the framework's efficiency and compatibility with orthopedic radiology practice. The achieved reconstruction and classification accuracy, combined with the rapid processing time, represent a promising advancement toward the clinical translation of artificial intelligence tools. This progress aims to streamline the preoperative planning pipeline, delivering high-quality bone surfaces and supporting surgeons in selecting the most suitable surgical approach for the unique joint condition of each patient.
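
In outline, a cascade of this kind runs as sketched below: a segmentation network yields bone labels, a joint-region crop is taken, and a classifier stages each condition. The stub networks are placeholders, not CEL-UNet or Arthro-Net, and the crop indices are arbitrary.

# Toy two-stage cascade: segment, crop the joint region, then classify.
import torch
import torch.nn as nn

seg_net = nn.Conv3d(1, 3, kernel_size=3, padding=1)       # stub segmenter
cls_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(3))   # stub 3-stage head

ct = torch.randn(1, 1, 64, 64, 64)                        # placeholder CT
mask = seg_net(ct).argmax(dim=1, keepdim=True)            # bg/humerus/scapula
roi = ct[..., 16:48, 16:48, 16:48] * (mask[..., 16:48, 16:48, 16:48] > 0)
stage_logits = cls_net(roi)                               # e.g. OS severity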

Rethinking boundary detection in deep learning-based medical image segmentation.

Lin Y, Zhang D, Fang X, Chen Y, Cheng KT, Chen H

PubMed · Jul 1, 2025
Medical image segmentation is a pivotal task in medical image analysis and computer vision. While current methods have shown promise in accurately segmenting major regions of interest, precise segmentation of boundary areas remains challenging. In this study, we propose a novel network architecture named CTO, which combines Convolutional Neural Networks (CNNs), Vision Transformer (ViT) models, and explicit edge detection operators to tackle this challenge. CTO surpasses existing methods in segmentation accuracy and strikes a better balance between accuracy and efficiency, without the need for additional data inputs or label injections. Specifically, CTO adheres to the canonical encoder-decoder network paradigm, with a dual-stream encoder comprising a mainstream CNN for capturing local features and an auxiliary StitchViT stream for integrating long-range dependencies. Furthermore, to enhance the model's ability to learn boundary areas, we introduce a boundary-guided decoder network that employs binary boundary masks generated by dedicated edge detection operators to provide explicit guidance during the decoding process. We validate the performance of CTO through extensive experiments on seven challenging medical image segmentation datasets: ISIC 2016, PH2, ISIC 2018, CoNIC, LiTS17, BraTS, and BTCV. Our results demonstrate that CTO achieves state-of-the-art accuracy on these datasets while maintaining competitive model complexity. The code has been released at: CTO.
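
As an illustration of the explicit edge prior, the snippet below derives a binary boundary mask from a foreground mask with a Sobel operator; the kernel and threshold choices are illustrative and not taken from the paper.

# Sobel-based binary boundary mask, the kind of edge map a
# boundary-guided decoder can consume.
import torch
import torch.nn.functional as F

def sobel_boundary(mask: torch.Tensor, thresh: float = 0.1) -> torch.Tensor:
    """mask: (N, 1, H, W) soft or binary foreground mask."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    k = torch.stack((kx, ky)).unsqueeze(1)      # (2, 1, 3, 3) x/y kernels
    grad = F.conv2d(mask, k, padding=1)         # per-direction gradients
    magnitude = grad.pow(2).sum(dim=1, keepdim=True).sqrt()
    return (magnitude > thresh).float()

edges = sobel_boundary((torch.rand(1, 1, 64, 64) > 0.5).float())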

Reconstruction-based approach for chest X-ray image segmentation and enhanced multi-label chest disease classification.

Hage Chehade A, Abdallah N, Marion JM, Hatt M, Oueidat M, Chauvet P

PubMed · Jul 1, 2025
U-Net is a commonly used model for medical image segmentation. However, when applied to chest X-ray images that show pathologies, it often fails to include these critical pathological areas in the generated masks. To address this limitation, we developed a novel CycleGAN-based approach to precise segmentation and mask generation that encompasses the areas affected by pathologies within the region of interest, allowing the extraction of radiomic features relevant to those pathologies. Furthermore, we adopted a feature selection approach to focus the analysis on the most significant features. The results of the proposed pipeline are promising, with an average accuracy of 92.05% and an average AUC of 89.48% for multi-label classification of effusion and infiltration from the ChestX-ray14 dataset, using the XGBoost model. Furthermore, applying our methodology to classification of the 14 diseases in the ChestX-ray14 dataset resulted in an average AUC of 83.12%, outperforming previous studies. This research highlights the importance of effective pathological mask generation and feature selection for accurate classification of chest diseases. The promising results of our approach underscore its potential for broader applications in the classification of chest diseases.
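
A rough sketch of the classification stage, univariate feature selection followed by an XGBoost classifier with AUC evaluation, is shown below on synthetic placeholder data; it does not reproduce the study's radiomics extraction.

# Hedged sketch: feature selection + XGBoost with AUC evaluation.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))     # placeholder radiomic feature matrix
y = rng.integers(0, 2, size=500)    # placeholder disease labels

X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)   # keep top features
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, stratify=y, random_state=0)
clf = XGBClassifier(n_estimators=200, eval_metric="logloss").fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))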

ConnectomeAE: Multimodal brain connectome-based dual-branch autoencoder and its application in the diagnosis of brain diseases.

Zheng Q, Nan P, Cui Y, Li L

PubMed · Jul 1, 2025
Exploring the dependencies between multimodal brain networks and integrating node features to enhance brain disease diagnosis remains a significant challenge. Some work has examined only brain connectivity changes in patients, ignoring important information carried by radiomics features, such as the shape and texture of individual brain regions in structural images. To this end, this study proposed a novel deep learning approach that integrates multimodal brain connectome information and regional radiomics features for brain disease diagnosis. A dual-branch autoencoder (ConnectomeAE) based on multimodal brain connectomes was proposed for brain disease diagnosis. Specifically, a matrix of radiomics features extracted from structural magnetic resonance images (MRI) was used as input to the Rad_AE branch for learning important brain region features. Functional brain networks built from functional MRI were used as inputs to the Cycle_AE branch for capturing brain disease-related connections. By separately learning node features and connection features from multimodal brain networks, the method demonstrates strong adaptability in diagnosing different brain diseases. ConnectomeAE was validated on two publicly available datasets. The experimental results show that ConnectomeAE achieved excellent diagnostic performance, with an accuracy of 70.7% for autism spectrum disorder and 90.5% for Alzheimer's disease. A comparison of training time with other methods indicated that ConnectomeAE exhibits a simplicity and efficiency suitable for clinical application. Furthermore, the interpretability analysis of the model aligned with previous studies, further supporting the biological basis of ConnectomeAE. ConnectomeAE could effectively leverage the complementary information between multimodal brain connectomes for brain disease diagnosis. By separately learning radiomic node features and connectivity features, ConnectomeAE demonstrated good adaptability to different brain disease classification tasks.
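
A toy dual-branch autoencoder in PyTorch is sketched below: one branch encodes regional radiomics-like node features, the other a functional connectivity matrix, and the fused latents feed a diagnosis head. All dimensions are invented and the architecture only loosely mirrors ConnectomeAE.

# Toy dual-branch autoencoder with a fused classification head.
import torch
import torch.nn as nn

class DualBranchAE(nn.Module):
    def __init__(self, n_rois: int = 90, rad_dim: int = 20, z: int = 64):
        super().__init__()
        self.rad_enc = nn.Sequential(nn.Flatten(), nn.LazyLinear(z), nn.ReLU())
        self.con_enc = nn.Sequential(nn.Flatten(), nn.LazyLinear(z), nn.ReLU())
        self.rad_dec = nn.Linear(z, n_rois * rad_dim)   # reconstructs radiomics
        self.con_dec = nn.Linear(z, n_rois * n_rois)    # reconstructs connectome
        self.cls = nn.Linear(2 * z, 2)                  # fused diagnosis head

    def forward(self, radiomics, connectome):
        zr, zc = self.rad_enc(radiomics), self.con_enc(connectome)
        logits = self.cls(torch.cat((zr, zc), dim=1))
        return logits, self.rad_dec(zr), self.con_dec(zc)

logits, rec_r, rec_c = DualBranchAE()(torch.randn(4, 90, 20),
                                      torch.randn(4, 90, 90))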

MDAL: Modality-difference-based active learning for multimodal medical image analysis via contrastive learning and pointwise mutual information.

Wang H, Jin Q, Du X, Wang L, Guo Q, Li H, Wang M, Song Z

PubMed · Jul 1, 2025
Multimodal medical images reveal different characteristics of the same anatomy or lesion, offering significant clinical value. Deep learning has achieved widespread success in medical image analysis with large-scale labeled datasets. However, annotating medical images is expensive and labor-intensive for doctors, and the variation between modalities further increases the annotation cost for multimodal images. This study aims to minimize the annotation cost of multimodal medical image analysis. We propose MDAL, a novel active learning framework based on modality differences for multimodal medical images. MDAL quantifies sample-wise modality differences through pointwise mutual information estimated by multimodal contrastive learning. We hypothesize that samples with larger modality differences are more informative for annotation and propose two sampling strategies based on these differences: MaxMD and DiverseMD. Moreover, MDAL can select informative samples in one shot without initial labeled data. We evaluated MDAL on public brain glioma and meningioma segmentation datasets and an in-house ovarian cancer classification dataset. MDAL outperforms other advanced active learning competitors. Moreover, when using only 20%, 20%, and 15% of the labeled samples in these datasets, MDAL reaches 99.6%, 99.9%, and 99.3% of the performance of supervised training with the fully labeled dataset, respectively. These results show that MDAL can significantly reduce the annotation cost of multimodal medical image analysis. We expect MDAL can be further extended to other multimodal medical data for lower annotation costs.
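
As a crude proxy for the paper's PMI-based scoring, the sketch below ranks samples by negative cosine similarity between their two modal embeddings and selects the k most different ones for annotation (a MaxMD-like rule); the embeddings are random placeholders.

# MaxMD-style selection: pick samples whose modalities disagree most.
import torch
import torch.nn.functional as F

def max_md_select(emb_a: torch.Tensor, emb_b: torch.Tensor, k: int):
    """emb_a, emb_b: (N, D) embeddings of the same samples in two modalities."""
    sim = F.cosine_similarity(emb_a, emb_b, dim=1)   # high sim = small difference
    return torch.topk(-sim, k).indices               # indices of k most different

picked = max_md_select(torch.randn(100, 32), torch.randn(100, 32), k=15)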