Page 61 of 134 (1333 results)

Scaling Chest X-ray Foundation Models from Mixed Supervisions for Dense Prediction.

Wang F, Yu L

PubMed · Jul 16, 2025
Foundation models have significantly advanced chest X-ray diagnosis through their ability to transfer across diseases and tasks. However, previous work has relied predominantly on self-supervised learning from medical image-text pairs; this coarse pair-level supervision falls short on dense prediction tasks, limiting applicability to detailed diagnostics. In this paper, we introduce a Dense Chest X-ray Foundation Model (DCXFM), which utilizes mixed supervision types (i.e., text, labels, and segmentation masks) to significantly enhance the scalability of foundation models across various medical tasks. Our model involves two training stages: we first employ a novel self-distilled multimodal pretraining paradigm to exploit text and label supervision, along with local-to-global self-distillation and soft cross-modal contrastive alignment strategies to enhance localization capabilities. Subsequently, we introduce an efficient cost aggregation module, comprising spatial and class aggregation mechanisms, to further advance dense prediction on densely annotated datasets. Comprehensive evaluations on three tasks (phrase grounding, zero-shot semantic segmentation, and zero-shot classification) demonstrate DCXFM's superior performance over other state-of-the-art medical image-text pretraining models. Remarkably, DCXFM exhibits powerful zero-shot capabilities across various datasets in phrase grounding and zero-shot semantic segmentation, underscoring its strong generalization in dense prediction tasks.
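The "soft cross-modal contrastive alignment" mentioned in this abstract can be illustrated with a small sketch: instead of hard one-hot image-text positives, each image is scored against a soft target distribution over reports, so partially matching pairs still contribute alignment signal. The function below is a hypothetical numpy illustration of that general idea, not the paper's implementation; the temperature `tau` and the provenance of `soft_targets` (e.g., derived from label overlap) are assumptions.

```python
import numpy as np

def soft_contrastive_loss(img_emb, txt_emb, soft_targets, tau=0.07):
    """Soft cross-modal contrastive alignment (illustrative sketch).

    img_emb, txt_emb: (N, D) embedding matrices for N image-text pairs.
    soft_targets: (N, N) row-stochastic matrix; row i is image i's target
    distribution over texts (one-hot rows recover standard InfoNCE).
    """
    # L2-normalise so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / tau                       # (N, N) similarities
    # Row-wise log-softmax over texts
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against the soft targets, averaged over images
    return float(-(soft_targets * log_p).sum(axis=1).mean())
```

With one-hot targets and well-separated embeddings the loss is near zero; smearing the targets toward mismatched pairs increases it, which is the signal the soft alignment exploits.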

Benchmarking and Explaining Deep Learning Cortical Lesion MRI Segmentation in Multiple Sclerosis

Nataliia Molchanova, Alessandro Cagol, Mario Ocampo-Pineda, Po-Jui Lu, Matthias Weigel, Xinjie Chen, Erin Beck, Charidimos Tsagkas, Daniel Reich, Colin Vanden Bulcke, Anna Stolting, Serena Borrelli, Pietro Maggi, Adrien Depeursinge, Cristina Granziera, Henning Mueller, Pedro M. Gordaliza, Meritxell Bach Cuadra

arXiv preprint · Jul 16, 2025
Cortical lesions (CLs) have emerged as valuable biomarkers in multiple sclerosis (MS), offering high diagnostic specificity and prognostic relevance. However, their routine clinical integration remains limited due to their subtle appearance on magnetic resonance imaging (MRI), challenges in expert annotation, and a lack of standardized automated methods. We propose a comprehensive multi-centric benchmark of CL detection and segmentation in MRI. A total of 656 MRI scans, including clinical trial and research data from four institutions, were acquired at 3T and 7T using MP2RAGE and MPRAGE sequences with expert-consensus annotations. We rely on the self-configuring nnU-Net framework, designed for medical imaging segmentation, and propose adaptations tailored to improving CL detection. We evaluated model generalization through out-of-distribution testing, demonstrating strong lesion detection capabilities with F1-scores of 0.64 in-domain and 0.50 out-of-domain. We also analyze internal model features and model errors for a better understanding of AI decision-making. Our study examines how data variability, lesion ambiguity, and protocol differences impact model performance, and offers recommendations to address these barriers to clinical adoption. To support reproducibility, the implementation and models will be publicly accessible and ready to use at https://github.com/Medical-Image-Analysis-Laboratory/ and https://doi.org/10.5281/zenodo.15911797.
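Lesion detection in this setting is typically scored lesion-wise rather than voxel-wise: each reference lesion counts as detected if a predicted lesion overlaps it sufficiently. As a rough sketch of how an F1-score like those reported above could be computed from integer-labelled lesion maps (the 25% overlap criterion here is an assumption, not the paper's matching rule):

```python
import numpy as np

def lesionwise_f1(ref_labels, pred_labels, min_overlap=0.25):
    """Lesion-wise detection F1 (illustrative sketch).

    ref_labels / pred_labels: integer-labelled lesion maps
    (0 = background, 1..K = individual lesions, e.g. from
    connected-component labelling). A reference lesion counts as a
    true positive if some predicted lesion covers >= min_overlap of it.
    """
    ref_ids = [i for i in np.unique(ref_labels) if i != 0]
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    tp, matched_preds = 0, set()
    for r in ref_ids:
        mask = ref_labels == r
        for p in pred_ids:
            if (pred_labels[mask] == p).mean() >= min_overlap:
                tp += 1
                matched_preds.add(p)
                break
    fn = len(ref_ids) - tp                 # missed reference lesions
    fp = len(pred_ids) - len(matched_preds)  # spurious predicted lesions
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```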

Generate to Ground: Multimodal Text Conditioning Boosts Phrase Grounding in Medical Vision-Language Models

Felix Nützel, Mischa Dombrowski, Bernhard Kainz

arXiv preprint · Jul 16, 2025
Phrase grounding, i.e., mapping natural language phrases to specific image regions, holds significant potential for disease localization in medical imaging through clinical reports. While current state-of-the-art methods rely on discriminative, self-supervised contrastive models, we demonstrate that generative text-to-image diffusion models, leveraging cross-attention maps, can achieve superior zero-shot phrase grounding performance. Contrary to prior assumptions, we show that fine-tuning diffusion models with a frozen, domain-specific language model, such as CXR-BERT, substantially outperforms domain-agnostic counterparts. This setup achieves remarkable improvements, with mIoU scores doubling those of current discriminative methods. These findings highlight the underexplored potential of generative models for phrase grounding tasks. To further enhance performance, we introduce Bimodal Bias Merging (BBM), a novel post-processing technique that aligns text and image biases to identify regions of high certainty. BBM refines cross-attention maps, achieving even greater localization accuracy. Our results establish generative approaches as a more effective paradigm for phrase grounding in the medical imaging domain, paving the way for more robust and interpretable applications in clinical practice. The source code and model weights are available at https://github.com/Felix-012/generate_to_ground.
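One way to score phrase grounding from cross-attention maps, as in the mIoU results above, is to binarize each map at a high threshold and compare the result against the ground-truth region. The snippet below is an illustrative sketch; the quantile-based threshold is an assumption rather than the paper's exact post-processing.

```python
import numpy as np

def grounding_iou(attn_map, gt_mask, quantile=0.9):
    """Threshold a cross-attention map into a binary grounding mask,
    then score it with intersection-over-union (illustrative sketch;
    the fixed-quantile threshold is an assumption)."""
    thresh = np.quantile(attn_map, quantile)
    pred = attn_map >= thresh
    inter = np.logical_and(pred, gt_mask).sum()
    union = np.logical_or(pred, gt_mask).sum()
    return inter / union if union else 1.0
```

Averaging this score over phrases yields an mIoU of the kind used to compare grounding methods.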

Automatic segmentation of liver structures in multi-phase MRI using variants of nnU-Net and Swin UNETR.

Raab F, Strotzer Q, Stroszczynski C, Fellner C, Einspieler I, Haimerl M, Lang EW

PubMed · Jul 16, 2025
Accurate segmentation of the liver parenchyma, portal veins, hepatic veins, and lesions from MRI is important for hepatic disease monitoring and treatment. Multi-phase contrast-enhanced imaging is superior to single-phase approaches in distinguishing hepatic structures, but automated approaches for detailed segmentation of hepatic structures are lacking. This study evaluates deep learning architectures for segmenting liver structures from multi-phase Gd-EOB-DTPA-enhanced T1-weighted VIBE MRI scans. We utilized 458 T1-weighted VIBE scans of pathological livers, with 78 manually labeled for liver parenchyma, hepatic and portal veins, aorta, lesions, and ascites. An additional dataset of 47 labeled subjects was used for cross-scanner evaluation. Three models were evaluated using nested cross-validation: the conventional nnU-Net, the ResEnc nnU-Net, and the Swin UNETR. The late arterial phase was identified as the optimal fixed phase for co-registration. Both nnU-Net variants outperformed Swin UNETR across most tasks. The conventional nnU-Net achieved the highest segmentation performance for liver parenchyma (DSC: 0.97; 95% CI 0.97, 0.98), portal vein (DSC: 0.83; 95% CI 0.80, 0.87), and hepatic vein (DSC: 0.78; 95% CI 0.77, 0.80). Lesion and ascites segmentation proved challenging for all models, with the conventional nnU-Net performing best. This study demonstrates the effectiveness of deep learning, particularly nnU-Net variants, for detailed liver structure segmentation from multi-phase MRI. The developed models and preprocessing pipeline offer potential for improved liver disease assessment and surgical planning in clinical practice.
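The Dice similarity coefficient (DSC) reported throughout these abstracts has a simple definition, 2|A∩B| / (|A| + |B|) for binary masks A and B. A minimal numpy version:

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient (DSC) between two binary masks.

    eps guards against division by zero when both masks are empty.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum() + eps)
```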

SML-Net: Semi-supervised multi-task learning network for carotid plaque segmentation and classification.

Gan H, Liu L, Wang F, Yang Z, Huang Z, Zhou R

PubMed · Jul 16, 2025
Carotid ultrasound image segmentation and classification are crucial for assessing the severity of carotid plaques, a major cause of ischemic stroke. Although many methods have been employed for carotid plaque segmentation and classification, treating these tasks separately neglects their interrelatedness. Currently, there is limited research exploring the key information in both plaque and background regions, and collecting and annotating extensive segmentation data is costly and time-intensive. To address these two issues, we propose an end-to-end semi-supervised multi-task learning network (SML-Net), which can classify plaques while performing segmentation. SML-Net identifies regions by extracting image features and fuses multi-scale features to improve semi-supervised segmentation. It effectively utilizes the plaque and background regions from the segmentation results and extracts features from various dimensions, thereby facilitating the classification task. Our experimental results indicate that SML-Net achieves a plaque classification accuracy of 86.59% and a Dice Similarity Coefficient (DSC) of 82.36%. Compared to the leading single-task network, SML-Net improves DSC by 1.2% and accuracy by 1.84%. Similarly, compared to the best-performing multi-task network, our method achieves a 1.05% increase in DSC and a 2.15% improvement in classification accuracy.

AI-Powered Segmentation and Prognosis with Missing MRI in Pediatric Brain Tumors

Chrysochoou, D., Gandhi, D., Adib, S., Familiar, A., Khalili, N., Khalili, N., Ware, J. B., Tu, W., Jain, P., Anderson, H., Haldar, S., Storm, P. B., Franson, A., Prados, M., Kline, C., Mueller, S., Resnick, A., Vossough, A., Davatzikos, C., Nabavizadeh, A., Fathi Kazerooni, A.

medRxiv preprint · Jul 16, 2025
Importance: Brain MRI is the main imaging modality for pediatric brain tumors (PBTs); however, incomplete MRI exams are common in pediatric neuro-oncology settings and pose a barrier to the development and application of deep learning (DL) models, such as tumor segmentation and prognostic risk estimation. Objective: To evaluate DL-based strategies (image-dropout training and generative image synthesis) and heuristic imputation approaches for handling missing MRI sequences in PBT imaging from clinical acquisition protocols, and to determine their impact on segmentation accuracy and prognostic risk estimation. Design: This cohort study included 715 patients from the Children's Brain Tumor Network (CBTN) and BraTS-PEDs, and 43 patients with longitudinal MRI (157 timepoints) from the PNOC003/007 clinical trials. We developed a dropout-trained nnU-Net tumor segmentation model that randomly omitted FLAIR and/or T1w (no contrast) sequences during training to simulate missing inputs. We compared this against three imputation approaches: a generative model for image synthesis, copy-substitution heuristics, and zeroed missing inputs. Model-generated tumor volumes from each segmentation method were compared and evaluated against ground truth (expert manual segmentations) and incorporated into time-varying Cox regression models for survival analysis. Setting: Multi-institutional PBT datasets and longitudinal clinical trial cohorts. Participants: All patients had multi-parametric MRI and expert manual segmentations. The PNOC cohort had a median of three imaging timepoints and associated clinical data. Main Outcomes and Measures: Segmentation accuracy (Dice scores), image quality metrics for synthesized scans (SSIM, PSNR, MSE), and survival discrimination (C-index, hazard ratios). Results: The dropout model achieved robust segmentation under missing MRI, with a ≤0.04 Dice drop and a stable C-index of 0.65 compared to complete-input performance. DL-based MRI synthesis achieved high image quality (SSIM > 0.90) and removed artifacts, benefiting visual interpretability. Performance was consistent across cohorts and missing-data scenarios. Conclusion and Relevance: Modality-dropout training yields robust segmentation and risk stratification on incomplete pediatric MRI without the computational and clinical complexity of synthesis approaches. Image synthesis, though less effective for these tasks, provides complementary benefits for artifact removal and qualitative assessment of missing or corrupted MRI scans. Together, these approaches can facilitate broader deployment of AI tools in real-world pediatric neuro-oncology settings.
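The image-dropout training strategy evaluated in this study amounts to randomly zeroing the droppable input channels during training, so the network learns to segment from whatever subset of sequences is available at inference. A minimal sketch of such an augmentation, assuming a channel-first multi-parametric stack where the FLAIR and non-contrast T1w channel indices are known (the indices and dropout probability here are illustrative, not the paper's settings):

```python
import numpy as np

def dropout_modalities(x, droppable=(1, 2), p=0.5, rng=None):
    """Modality-dropout augmentation (sketch of the idea).

    x: (C, H, W[, D]) multi-parametric MRI stack. Channels listed in
    `droppable` (assumed here to index FLAIR and non-contrast T1w) are
    each zeroed independently with probability p during training.
    """
    rng = rng or np.random.default_rng()
    x = x.copy()  # never mutate the caller's array
    for c in droppable:
        if rng.random() < p:
            x[c] = 0.0
    return x
```

At inference, a genuinely missing sequence is simply fed in as a zeroed channel, matching what the network saw during training.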

OMT and tensor SVD-based deep learning model for segmentation and predicting genetic markers of glioma: A multicenter study.

Zhu Z, Wang H, Li T, Huang TM, Yang H, Tao Z, Tan ZH, Zhou J, Chen S, Ye M, Zhang Z, Li F, Liu D, Wang M, Lu J, Zhang W, Li X, Chen Q, Jiang Z, Chen F, Zhang X, Lin WW, Yau ST, Zhang B

PubMed · Jul 15, 2025
Glioma is the most common primary malignant brain tumor, and preoperative genetic profiling is essential for the management of glioma patients. Our study focused on tumor region segmentation and on predicting World Health Organization (WHO) grade, isocitrate dehydrogenase (IDH) mutation, and 1p/19q codeletion status using deep learning models on preoperative MRI. To achieve accurate tumor segmentation, we developed an optimal mass transport (OMT) approach to transform irregular MRI brain images into tensors. In addition, we proposed an algebraic preclassification (APC) model utilizing multimode OMT tensor singular value decomposition (SVD) to estimate preclassification probabilities. The fully automated deep learning model named OMT-APC was used for multitask classification. Our study incorporated preoperative brain MRI data from 3,565 glioma patients across 16 datasets spanning Asia, Europe, and America. Among these, 2,551 patients from 5 datasets were used for training and internal validation, while 1,014 patients from 11 datasets, including 242 patients from The Cancer Genome Atlas (TCGA), were used as an independent external test set. The OMT segmentation model achieved a mean lesion-wise Dice score of 0.880. The OMT-APC model was evaluated on the TCGA dataset, achieving accuracies of 0.855, 0.917, and 0.809, with AUC scores of 0.845, 0.908, and 0.769 for WHO grade, IDH mutation, and 1p/19q codeletion, respectively, outperforming four radiologists on all tasks. These results highlight the effectiveness of our OMT and tensor SVD-based methods in brain tumor genetic profiling, suggesting promising applications for algebraic and geometric methods in medical image analysis.

Direct-to-Treatment Adaptive Radiation Therapy: Live Planning of Spine Metastases Using Novel Cone Beam Computed Tomography.

McGrath KM, MacDonald RL, Robar JL, Cherpak A

PubMed · Jul 15, 2025
Cone beam computed tomography (CBCT)-based online adaptive radiation therapy is carried out using a synthetic CT (sCT) created through deformable registration between a patient-specific fan-beam computed tomography (FBCT) scan and the daily CBCT. Ethos 2.0 allows for plan calculation directly on HyperSight CBCT and uses artificial intelligence-informed tools for daily contouring without a priori information. This breaks an important link between daily adaptive sessions and initial reference plan preparation. This study explores adaptive radiation therapy for spine metastases without prior patient-specific imaging or treatment planning. We hypothesize that adaptive plans can be created when patient-specific positioning and anatomy are incorporated only once the patient has arrived at the treatment unit. An Ethos 2.0 emulator was used to create initial reference plans on 10 patient-specific FBCTs. Reference plans were also created using FBCTs of (1) a library patient with clinically acceptable contours and (2) a water-equivalent phantom with placeholder contours. Adaptive sessions were simulated for each patient using the 3 different starting points, and the resulting adaptive plans were compared to determine the significance of patient-specific information acquired prior to the start of treatment. The library patient and phantom reference plans did not generate adaptive plans that differed significantly from the standard workflow on any clinical constraint for target coverage and organ-at-risk sparing (P > .2). Gamma comparison between the 3 adaptive plans for each patient (3%/3 mm) demonstrated overall similarity of dose distributions (pass rate > 95%) for all but 2 cases. Failures occurred mainly in low-dose regions, highlighting differences in the fluence used to achieve the same clinical goals. This study confirmed the feasibility of a procedure for treatment of spine metastases that does not rely on previously acquired patient-specific imaging, contours, or plans. Reference-free direct-to-treatment workflows are possible and can condense a multistep process to a single location with dedicated resources.
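The 3%/3 mm gamma comparison used above combines a dose-difference criterion with a distance-to-agreement criterion: a point passes if some nearby reference point agrees in dose once both tolerances are taken into account. A simplified 1D global-gamma sketch (real implementations work on 2D/3D dose grids, interpolate between sample points, and often apply a low-dose cutoff):

```python
import numpy as np

def gamma_pass_rate(ref, eva, spacing_mm=1.0, dd=0.03, dta_mm=3.0):
    """Global 1D gamma analysis (3%/3 mm by default), illustrative sketch.

    ref, eva: 1D dose profiles on the same grid. The dose-difference
    criterion dd is taken relative to the reference maximum (global
    normalisation); dta_mm is the distance-to-agreement tolerance.
    """
    x = np.arange(len(ref)) * spacing_mm
    dmax = ref.max()
    gammas = []
    for i, de in enumerate(eva):
        dose_term = (de - ref) / (dd * dmax)   # scaled dose differences
        dist_term = (x[i] - x) / dta_mm        # scaled distances
        gammas.append(np.sqrt(dose_term**2 + dist_term**2).min())
    return (np.array(gammas) <= 1.0).mean()
```

A pass rate above 95%, as reported in the study, means at least 95% of evaluated points have a gamma value of at most 1.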

Artificial Intelligence-Empowered Multistep Integrated Radiation Therapy Workflow for Nasopharyngeal Carcinoma.

Yang YX, Yang X, Jiang XB, Lin L, Wang GY, Sun WZ, Zhang K, Li BH, Li H, Jia LC, Wei ZQ, Liu YF, Fu DN, Tang JX, Zhang W, Zhou JJ, Diao WC, Wang YJ, Chen XM, Xu CD, Lin LW, Wu JY, Wu JW, Peng LX, Pan JF, Liu BZ, Feng C, Huang XY, Zhou GQ, Sun Y

PubMed · Jul 15, 2025
To establish an artificial intelligence (AI)-empowered multistep integrated (MSI) radiation therapy (RT) workflow for patients with nasopharyngeal carcinoma (NPC) and evaluate its feasibility and clinical performance. Patients with NPC scheduled for the MSI RT workflow were prospectively enrolled. This workflow integrates RT procedures from computed tomography (CT) scan to beam delivery, all performed with the patient on the treatment couch. Workflow performance, tumor response, patient-reported acute toxicities, and quality of life were evaluated. From March 2022 to October 2023, 120 newly diagnosed, nonmetastatic patients with NPC were enrolled. Of these, 117 completed the workflow with a median duration of 23.2 minutes (range, 16.3-45.8). Median translation errors were 0.2 mm (from CT scan to planning approval) and 0.1 mm (during beam delivery). AI-generated contours required minimal revision for the high-risk clinical target volume and organs at risk, minor revision for the involved cervical lymph nodes and low-risk clinical target volume (median Dice similarity coefficients (DSC), 0.98 and 0.94), and more revision for the gross tumor at the primary site and the involved retropharyngeal lymph nodes (median DSC, 0.84). Of 117 AI-generated plans, 108 (92.3%) passed after the first optimization, with ≥97.8% of target volumes receiving ≥100% of the prescribed dose. Dosimetric constraints were met for most organs at risk, except the thyroid and submandibular glands. One hundred and fifteen patients achieved a complete response at week 12 post-RT, while 14 patients rated at least one acute toxicity as "very severe" between the start of RT and week 12 post-RT. The AI-empowered MSI RT workflow for patients with NPC is clinically feasible in a single-institution setting compared with the standard, human-driven RT workflow.

Patient-Specific Deep Learning Tracking Framework for Real-Time 2D Target Localization in Magnetic Resonance Imaging-Guided Radiation Therapy.

Lombardo E, Velezmoro L, Marschner SN, Rabe M, Tejero C, Papadopoulou CI, Sui Z, Reiner M, Corradini S, Belka C, Kurz C, Riboldi M, Landry G

PubMed · Jul 15, 2025
We propose a tumor tracking framework for 2D cine magnetic resonance imaging (MRI) based on a pair of deep learning (DL) models relying on patient-specific (PS) training. The chosen DL models are: (1) an image registration transformer and (2) an auto-segmentation convolutional neural network (CNN). We collected over 1,400,000 cine MRI frames from 219 patients treated on a 0.35 T MRI-linac, plus 7,500 frames from an additional 35 patients that were manually labeled and subdivided into fine-tuning, validation, and testing sets. The transformer was first trained on the unlabeled data (without segmentations). We then continued training (with segmentations) either on the fine-tuning set or, for PS models, on 8 randomly selected frames from the first 5 seconds of each patient's cine MRI. The PS auto-segmentation CNN was trained from scratch with the same 8 frames for each patient, without pre-training. Furthermore, we implemented B-spline image registration as a conventional model, as well as different baselines. Output segmentations of all models were compared on the testing set using the Dice similarity coefficient, the 50% and 95% Hausdorff distance (HD<sub>50%</sub>/HD<sub>95%</sub>), and the root-mean-square error of the target centroid in the superior-inferior direction. The PS transformer and CNN significantly outperformed all other models, achieving a median (interquartile range) Dice similarity coefficient of 0.92 (0.03)/0.90 (0.04), HD<sub>50%</sub> of 1.0 (0.1)/1.0 (0.4) mm, HD<sub>95%</sub> of 3.1 (1.9)/3.8 (2.0) mm, and superior-inferior centroid root-mean-square error of 0.7 (0.4)/0.9 (1.0) mm on the testing set. Their inference time was about 36/8 ms per frame, and PS fine-tuning required 3 min for labeling and 8/4 min for training. The transformer was better than the CNN in 9/12 patients, the CNN better in 1/12 patients, and the 2 PS models achieved the same performance on the remaining 2/12 testing patients.
For targets in the thorax, abdomen, and pelvis, we found 2 PS DL models to provide accurate real-time target localization during MRI-guided radiotherapy.
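The superior-inferior centroid error used to evaluate tracking above can be computed directly from the predicted and reference masks of each cine frame. A minimal sketch, assuming the SI direction is the first image axis, the masks are non-empty, and the pixel spacing is known (all assumptions for illustration):

```python
import numpy as np

def si_centroid_rmse(pred_masks, gt_masks, pixel_mm=1.0):
    """RMSE of the target centroid along the superior-inferior axis
    (assumed to be axis 0 of each 2D cine frame; illustrative sketch).

    pred_masks, gt_masks: sequences of non-empty binary masks, one
    pair per frame; pixel_mm converts rows to millimetres.
    """
    errs = []
    for p, g in zip(pred_masks, gt_masks):
        cp = np.argwhere(p)[:, 0].mean()   # SI centroid of prediction
        cg = np.argwhere(g)[:, 0].mean()   # SI centroid of ground truth
        errs.append((cp - cg) * pixel_mm)
    return float(np.sqrt(np.mean(np.square(errs))))
```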
