DMAF-Net: An Effective Modality Rebalancing Framework for Incomplete Multi-Modal Medical Image Segmentation

Libin Lan, Hongxing Li, Zunhui Xia, Yudong Zhang

arxiv logopreprintJun 13 2025
Incomplete multi-modal medical image segmentation faces critical challenges from modality imbalance, including imbalanced modality missing rates and heterogeneous modality contributions. Because they rely on idealized assumptions of complete modality availability, existing methods fail to dynamically balance contributions and neglect the structural relationships between modalities, resulting in suboptimal performance in real-world clinical scenarios. To address these limitations, we propose a novel model, the Dynamic Modality-Aware Fusion Network (DMAF-Net), built on three key ideas. First, it introduces a Dynamic Modality-Aware Fusion (DMAF) module that suppresses missing-modality interference by combining transformer attention with adaptive masking, weighting modality contributions dynamically through attention maps. Second, it designs a synergistic Relation Distillation and Prototype Distillation framework that enforces global-local feature alignment via covariance consistency and masked graph attention, while ensuring semantic consistency through cross-modal class-specific prototype alignment. Third, it presents a Dynamic Training Monitoring (DTM) strategy that stabilizes optimization under imbalanced missing rates by tracking distillation gaps in real time, and balances convergence speeds across modalities by adaptively reweighting losses and scaling gradients. Extensive experiments on BraTS2020 and MyoPS2020 demonstrate that DMAF-Net outperforms existing methods for incomplete multi-modal medical image segmentation. Our code is available at https://github.com/violet-42/DMAF-Net.
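For a concrete picture of the masked, attention-based fusion the abstract describes, here is a minimal PyTorch sketch: transformer attention over per-modality features with a key-padding mask that zeroes out missing modalities, so the attention maps expose each modality's contribution. All shapes, names, and design choices are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# Hedged sketch of attention-based fusion with adaptive masking of missing
# modalities, in the spirit of the DMAF module; not the authors' code.
import torch
import torch.nn as nn

class MaskedModalityFusion(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, dim))  # learned fusion query

    def forward(self, feats: torch.Tensor, missing: torch.Tensor):
        # feats: (B, M, C), one feature vector per modality
        # missing: (B, M) bool, True where a modality is absent
        q = self.query.expand(feats.size(0), -1, -1)
        fused, weights = self.attn(q, feats, feats, key_padding_mask=missing)
        # `weights` (B, 1, M) gives per-modality contributions; masked
        # (missing) modalities receive zero attention weight.
        return fused.squeeze(1), weights.squeeze(1)

# Example: 2 cases, 4 MR sequences, 64-dim features; case 0 lacks
# modality 3, case 1 lacks modalities 1 and 2.
fusion = MaskedModalityFusion(dim=64)
feats = torch.randn(2, 4, 64)
missing = torch.tensor([[False, False, False, True],
                        [False, True, True, False]])
fused, w = fusion(feats, missing)
print(fused.shape, w)  # torch.Size([2, 64]); weights sum to 1 per case
```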

Clinically reported covert cerebrovascular disease and risk of neurological disease: a whole-population cohort of 395,273 people using natural language processing

Iveson, M. H., Mukherjee, M., Davidson, E. M., Zhang, H., Sherlock, L., Ball, E. L., Mair, G., Hosking, A., Whalley, H., Poon, M. T. C., Wardlaw, J. M., Kent, D., Tobin, R., Grover, C., Alex, B., Whiteley, W. N.

medrxiv logopreprintJun 13 2025
Importance: Understanding the relevance of covert cerebrovascular disease (CCD) for later health will allow clinicians to more effectively monitor and target interventions.
Objective: To examine the association between clinically reported CCD, measured using natural language processing (NLP), and subsequent disease risk.
Design, Setting and Participants: We conducted a retrospective e-cohort study using linked health record data. From all people with clinical brain imaging in Scotland from 2010 to 2018, we selected people with no prior hospitalisation for neurological disease. The data were analysed from March 2024 to June 2025.
Exposure: Four phenotypes were identified with NLP of imaging reports: white matter hypoattenuation or hyperintensities (WMH), lacunes, cortical infarcts and cerebral atrophy.
Main Outcomes and Measures: Adjusted hazard ratios (aHR) for stroke, dementia, and Parkinson's disease (conditions previously associated with CCD), epilepsy (a brain-based control condition) and colorectal cancer (a non-brain control condition), adjusted for age, sex, deprivation, region, scan modality, and pre-scan healthcare, were calculated for each phenotype.
Results: From 395,273 people with brain imaging and no history of neurological disease, 145,978 (37%) had ≥1 phenotype. For each phenotype, the aHR of any stroke was: WMH 1.4 (95% CI: 1.3-1.4), lacunes 1.6 (1.5-1.6), cortical infarct 1.7 (1.6-1.8), and cerebral atrophy 1.1 (1.0-1.1). The aHR of any dementia was: WMH 1.3 (1.3-1.3), lacunes 1.0 (0.9-1.0), cortical infarct 1.1 (1.0-1.1), and cerebral atrophy 1.7 (1.7-1.7). The aHR of Parkinson's disease was, in people with a report of: WMH 1.1 (1.0-1.2), lacunes 1.1 (0.9-1.2), cortical infarct 0.7 (0.6-0.9), and cerebral atrophy 1.4 (1.3-1.5). The aHRs between CCD phenotypes and epilepsy and colorectal cancer overlapped the null.
Conclusions and Relevance: NLP identified CCD and atrophy phenotypes from routine clinical image reports, and these had important associations with future stroke, dementia and Parkinson's disease. Prevention of neurological disease in people with CCD should be a priority for healthcare providers and policymakers.
Key Points. Question: Are measures of covert cerebrovascular disease (CCD) associated with the risk of subsequent disease (stroke, dementia, Parkinson's disease, epilepsy, and colorectal cancer)? Findings: This study used a validated NLP algorithm to identify CCD (white matter hypoattenuation/hyperintensities, lacunes, cortical infarcts) and cerebral atrophy from both MRI and computed tomography (CT) imaging reports generated during routine healthcare in >395K people in Scotland. In adjusted models, we demonstrate a higher risk of dementia (particularly Alzheimer's disease) in people with atrophy, and a higher risk of stroke in people with cortical infarcts. Associations with an age-associated control outcome (colorectal cancer) were neutral, supporting a causal relationship. The study also highlights differential associations between cerebral atrophy and dementia, and between cortical infarcts and stroke risk. Meaning: CCD or atrophy on brain imaging reports in routine clinical practice is associated with a higher risk of stroke or dementia. Evidence is needed to support treatment strategies to reduce this risk. NLP can identify these important, otherwise uncoded, disease phenotypes, allowing research at scale into imaging-based biomarkers of dementia and stroke.
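As a point of reference for how such adjusted hazard ratios are typically computed, here is a minimal sketch using the open-source lifelines package. The file name, column names, and covariate coding are hypothetical stand-ins; the study's actual data and model specification are not public here.

```python
# Hedged sketch of an adjusted Cox proportional-hazards model of the kind
# behind the reported aHRs; column names and data are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("cohort.csv")  # one row per person; categorical covariates
                                # assumed already encoded as numeric dummies
cols = ["time_to_stroke", "stroke_event",                  # follow-up time, event flag
        "wmh", "lacunes", "cortical_infarct", "atrophy",   # NLP phenotypes
        "age", "sex", "deprivation", "region",
        "scan_modality", "prescan_healthcare"]             # adjustment covariates

cph = CoxPHFitter()
cph.fit(df[cols], duration_col="time_to_stroke", event_col="stroke_event")
cph.print_summary()  # the exp(coef) column gives hazard ratios with 95% CIs
```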

Protocol of the observational study STRATUM-OS: First step in the development and validation of the STRATUM tool based on multimodal data processing to assist surgery in patients affected by intra-axial brain tumours

Fabelo, H., Ramallo-Farina, Y., Morera, J., Pineiro, J. F., Lagares, A., Jimenez-Roldan, L., Burstrom, G., Garcia-Bello, M. A., Garcia-Perez, L., Falero, R., Gonzalez, M., Duque, S., Rodriguez-Jimenez, C., Hernandez, M., Delgado-Sanchez, J. J., Paredes, A. B., Hernandez, G., Ponce, P., Leon, R., Gonzalez-Martin, J. M., Rodriguez-Esparragon, F., Callico, G. M., Wagner, A. M., Clavo, B., STRATUM,

medrxiv logopreprintJun 13 2025
Introduction: Integrated digital diagnostics can support complex surgeries in many anatomic sites, and brain tumour surgery represents one of the most complex cases. Neurosurgeons face several challenges during brain tumour surgeries, such as differentiating critical tissue from brain tumour margins. To overcome these challenges, the STRATUM project will develop a 3D decision support tool for brain surgery guidance and diagnostics based on multimodal data processing, including hyperspectral imaging, integrated as a point-of-care computing tool in neurosurgical workflows. This paper reports the protocol for the development and technical validation of the STRATUM tool.
Methods and Analysis: This international, multicentre, prospective, open, observational cohort study, STRATUM-OS (study: 28 months; pre-recruitment: 2 months; recruitment: 20 months; follow-up: 6 months), with no control group, will collect data from 320 patients undergoing standard neurosurgical procedures to: (1) develop and technically validate the STRATUM tool, and (2) collect the outcome measures for comparing the standard procedure versus the standard procedure plus the use of the STRATUM tool during surgery in a subsequent historically controlled, non-randomized clinical trial.
Ethics and Dissemination: The protocol was approved by the participating Ethics Committees. Results will be disseminated in scientific conferences and peer-reviewed journals.
Trial Registration Number: [Pending Number]
Strengths and limitations of this study:
- STRATUM-OS will be the first multicentre prospective observational study to develop and technically validate a 3D decision support tool for brain surgery guidance and diagnostics in real time based on artificial intelligence and multimodal data processing, including the emerging hyperspectral imaging modality.
- This study encompasses a prospective collection of multimodal pre-, intra-, and postoperative medical data, including innovative imaging modalities, from patients with intra-axial brain tumours.
- This large observational study will act as a historical control in a subsequent clinical trial to evaluate a fully working prototype.
- Although the estimated sample size is deemed adequate for the purpose of the study, the complexity of the clinical context and the type of surgery could potentially lead to under-recruitment and under-representation of less prevalent tumour types.

Enhancing Privacy: The Utility of Stand-Alone Synthetic CT and MRI for Tumor and Bone Segmentation

André Ferreira, Kunpeng Xie, Caroline Wilpert, Gustavo Correia, Felix Barajas Ordonez, Tiago Gil Oliveira, Maike Bode, Robert Siepmann, Frank Hölzle, Rainer Röhrig, Jens Kleesiek, Daniel Truhn, Jan Egger, Victor Alves, Behrus Puladi

arxiv logopreprintJun 13 2025
AI requires extensive datasets, while medical data are subject to strict data protection. Anonymization is essential but challenging for some regions, such as the head, where identifying structures overlap with regions of clinical interest. Synthetic data offer a potential solution, but studies often lack rigorous evaluation of realism and utility. We therefore investigate to what extent synthetic data can replace real data in segmentation tasks. We employed head and neck cancer CT scans and brain glioma MRI scans from two large datasets. Synthetic data were generated using generative adversarial networks and diffusion models. We evaluated the quality of the synthetic data using MAE, MS-SSIM, radiomics, and a Visual Turing Test (VTT) performed by 5 radiologists, and their usefulness in segmentation tasks using the DSC. Radiomics indicates high fidelity of synthetic MRIs but falls short of highly realistic CT tissue, with correlation coefficients of 0.8784 and 0.5461 for MRI and CT tumors, respectively. DSC results indicate limited utility of synthetic data: tumor segmentation achieved DSC=0.064 on CT and 0.834 on MRI, while bone segmentation achieved a mean DSC=0.841. A relation between DSC and correlation is observed but is limited by the complexity of the task. VTT results show the utility of synthetic CTs, albeit with limited educational applications. Synthetic data can be used independently for the segmentation task, although limited by the complexity of the structures to segment. Advancing generative models to better tolerate heterogeneous inputs and learn subtle details is essential for enhancing their realism and expanding their application potential.
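For concreteness, here is a small sketch of two of the metrics named above: MAE for image fidelity and the Dice similarity coefficient (DSC) for segmentation utility. The array shapes and thresholds are illustrative assumptions, not the study's evaluation code.

```python
# Hedged sketch of MAE and DSC as used to score synthetic images and
# downstream segmentations; inputs are placeholder arrays.
import numpy as np

def mae(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Mean absolute error between a real and a synthetic image."""
    return float(np.mean(np.abs(real - synthetic)))

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks."""
    p, t = pred.astype(bool), truth.astype(bool)
    return float(2.0 * np.logical_and(p, t).sum() / (p.sum() + t.sum() + eps))

rng = np.random.default_rng(0)
real = rng.random((64, 64, 64)).astype(np.float32)
synth = real + rng.normal(0, 0.05, real.shape).astype(np.float32)
print(f"MAE={mae(real, synth):.4f}, DSC={dice(real > 0.5, synth > 0.5):.3f}")
```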

3D Skin Segmentation Methods in Medical Imaging: A Comparison

Martina Paccini, Giuseppe Patanè

arxiv logopreprintJun 13 2025
Automatic segmentation of anatomical structures is critical in medical image analysis, aiding diagnostics and treatment planning. Skin segmentation plays a key role in registering and visualising multimodal imaging data. 3D skin segmentation enables applications in personalised medicine, surgical planning, and remote monitoring, offering realistic patient models for treatment simulation, procedural visualisation, and continuous condition tracking. This paper analyses and compares algorithmic and AI-driven skin segmentation approaches, emphasising key factors to consider when selecting a strategy based on data availability and application requirements. We evaluate an iterative region-growing algorithm and the TotalSegmentator, a deep learning-based approach, across different imaging modalities and anatomical regions. Our tests show that AI segmentation excels in automation but struggles with MRI due to its CT-based training, while the graphics-based method performs better for MRIs but introduces more noise. AI-driven segmentation also automates patient bed removal in CT, whereas the graphics-based method requires manual intervention.
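As a sketch of the algorithmic side of this comparison, here is a toy iterative region-growing routine: starting from a seed voxel, it repeatedly absorbs neighbouring voxels whose intensity lies within a tolerance of the current region mean. The tolerance, connectivity, and stopping rule are illustrative assumptions, not the paper's exact method.

```python
# Hedged sketch of iterative region growing for (skin) segmentation;
# parameters and stopping criterion are illustrative.
import numpy as np
from scipy import ndimage

def region_grow(volume: np.ndarray, seed: tuple, tol: float = 50.0) -> np.ndarray:
    mask = np.zeros(volume.shape, dtype=bool)
    mask[seed] = True
    conn = ndimage.generate_binary_structure(volume.ndim, 1)  # face-connected
    while True:
        frontier = ndimage.binary_dilation(mask, conn) & ~mask
        accept = frontier & (np.abs(volume - volume[mask].mean()) < tol)
        if not accept.any():
            return mask  # no frontier voxel is within tolerance; stop
        mask |= accept
```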

High-Fidelity 3D Imaging of Dental Scenes Using Gaussian Splatting.

Jin CX, Li MX, Yu H, Gao Y, Guo YP, Xia GS, Huang C

pubmed logopapersJun 13 2025
Three-dimensional visualization is increasingly used in dentistry for diagnostics, education, and treatment design. The accurate replication of geometry and color is crucial for these applications. Image-based rendering, which uses 2-dimensional photos to generate photo-realistic 3-dimensional representations, provides an affordable and practical option, aiding both regular and remote health care. This study explores an advanced novel view synthesis (NVS) method called Gaussian splatting (GS), a differentiable image-based rendering approach, to assess its feasibility for dental scene capturing. The rendering quality and resource usage were compared with representative NVS methods. In addition, the linear measurement trueness of extracted craniofacial meshes was evaluated against a commercial facial scanner and 3 smartphone facial scanning apps, while teeth meshes were assessed against 2 intraoral scanners and a desktop scanner. GS-based representation demonstrated superior rendering quality, achieving the highest visual quality, fastest rendering speed, and lowest resource usage. The craniofacial measurements showed similar trueness to commercial facial scanners. The dental measurements had larger deviations than intraoral and desktop scanners did, although all deviations remained within clinically acceptable limits. The GS-based representation shows great potential for developing a convenient and cost-effective method of capturing dental scenes, offering a balance between color fidelity and trueness suitable for clinical applications.
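Rendered novel views are commonly scored against held-out photographs with metrics such as PSNR and SSIM; the sketch below shows that comparison with scikit-image on placeholder images. The abstract does not name the exact rendering-quality metrics used, so treat this as an assumed, generic evaluation.

```python
# Hedged sketch of scoring a rendered view against a held-out photo;
# the specific metrics used by the study are an assumption here.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
ground_truth = rng.random((256, 256, 3)).astype(np.float32)  # held-out photo
rendered = np.clip(ground_truth + rng.normal(0, 0.02, ground_truth.shape),
                   0, 1).astype(np.float32)                  # placeholder render

psnr = peak_signal_noise_ratio(ground_truth, rendered, data_range=1.0)
ssim = structural_similarity(ground_truth, rendered, channel_axis=-1, data_range=1.0)
print(f"PSNR={psnr:.2f} dB, SSIM={ssim:.3f}")
```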

MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models

Yu Huang, Zelin Peng, Yichen Zhao, Piao Yang, Xiaokang Yang, Wei Shen

arxiv logopreprintJun 12 2025
Medical image segmentation is crucial for clinical diagnosis, yet existing models are limited by their reliance on explicit human instructions and lack the active reasoning capabilities to understand complex clinical questions. While recent advancements in multimodal large language models (MLLMs) have improved medical question-answering (QA) tasks, most methods struggle to generate precise segmentation masks, limiting their application in automatic medical diagnosis. In this paper, we introduce medical image reasoning segmentation, a novel task that aims to generate segmentation masks based on complex and implicit medical instructions. To address this, we propose MedSeg-R, an end-to-end framework that leverages the reasoning abilities of MLLMs to interpret clinical questions while also producing precise segmentation masks for the corresponding medical images. It is built on two core components: 1) a global context understanding module that interprets images and comprehends complex medical instructions to generate multi-modal intermediate tokens, and 2) a pixel-level grounding module that decodes these tokens to produce precise segmentation masks and textual responses. Furthermore, we introduce MedSeg-QA, a large-scale dataset tailored for the medical image reasoning segmentation task. It includes over 10,000 image-mask pairs and multi-turn conversations, automatically annotated using large language models and refined through physician reviews. Experiments show MedSeg-R's superior performance across several benchmarks, achieving high segmentation accuracy and enabling interpretable textual analysis of medical images.
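To make the pixel-level grounding idea concrete, here is a schematic sketch in which a segmentation token emitted by the language model is dotted against per-pixel image embeddings to form a mask logit map (SAM-style decoding). All shapes and names are assumptions for illustration, not the MedSeg-R implementation.

```python
# Hedged sketch of decoding a segmentation token into a mask by taking
# its inner product with per-pixel embeddings; shapes are assumptions.
import torch

def ground_token(seg_token: torch.Tensor, pix_emb: torch.Tensor) -> torch.Tensor:
    # seg_token: (B, C) intermediate token; pix_emb: (B, C, H, W) image features
    logits = torch.einsum("bc,bchw->bhw", seg_token, pix_emb)
    return logits.sigmoid()  # (B, H, W) per-pixel mask probabilities

mask = ground_token(torch.randn(1, 256), torch.randn(1, 256, 64, 64))
print(mask.shape)  # torch.Size([1, 64, 64])
```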

Modality-AGnostic Image Cascade (MAGIC) for Multi-Modality Cardiac Substructure Segmentation

Nicholas Summerfield, Qisheng He, Alex Kuo, Ahmed I. Ghanem, Simeng Zhu, Chase Ruff, Joshua Pan, Anudeep Kumar, Prashant Nagpal, Jiwei Zhao, Ming Dong, Carri K. Glide-Hurst

arxiv logopreprintJun 12 2025
Cardiac substructures are essential in thoracic radiation therapy planning to minimize the risk of radiation-induced heart disease. Deep learning (DL) offers efficient methods to reduce contouring burden but lacks generalizability across different modalities and overlapping structures. This work introduces and validates a Modality-AGnostic Image Cascade (MAGIC) for comprehensive and multi-modal cardiac substructure segmentation. MAGIC is implemented through replicated encoding and decoding branches of an nnU-Net-based, U-shaped backbone, conserving the function of a single model. Twenty cardiac substructures (heart, chambers, great vessels (GVs), valves, coronary arteries (CAs), and conduction nodes) from simulation CT (Sim-CT), low-field MR-Linac, and cardiac CT angiography (CCTA) modalities were manually delineated and used to train (n=76), validate (n=15), and test (n=30) MAGIC. Twelve comparison models (four segmentation subgroups across three modalities) were equivalently trained. All methods were compared for training efficiency and against reference contours using the Dice Similarity Coefficient (DSC) and a two-tailed Wilcoxon signed-rank test (threshold, p<0.05). Average DSC scores were 0.75(0.16) for Sim-CT, 0.68(0.21) for MR-Linac, and 0.80(0.16) for CCTA. MAGIC outperforms the comparison models in 57% of cases, with limited statistical differences. MAGIC offers an effective and accurate segmentation solution that is lightweight and capable of segmenting multiple modalities and overlapping structures in a single model. MAGIC further enables clinical implementation by simplifying the computational requirements and offering unparalleled flexibility for clinical settings.
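The statistical comparison described above is a standard paired test; a minimal sketch with SciPy over hypothetical per-case DSC arrays:

```python
# Hedged sketch of the two-tailed Wilcoxon signed-rank test over paired
# per-case DSC scores; the arrays below are placeholders, not study data.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
dsc_magic = rng.uniform(0.60, 0.90, size=30)                 # placeholder MAGIC DSCs
dsc_baseline = dsc_magic - rng.normal(0.01, 0.03, size=30)   # placeholder comparison

stat, p = wilcoxon(dsc_magic, dsc_baseline, alternative="two-sided")
print(f"W={stat:.1f}, p={p:.4f} (difference significant if p < 0.05)")
```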

A strategy for the automatic diagnostic pipeline towards feature-based models: a primer with pleural invasion prediction from preoperative PET/CT images.

Kong X, Zhang A, Zhou X, Zhao M, Liu J, Zhang X, Zhang W, Meng X, Li N, Yang Z

pubmed logopapersJun 12 2025
This study explores the feasibility of automating the application of nomograms in clinical medicine, demonstrated through the task of preoperative pleural invasion prediction in non-small cell lung cancer patients using PET/CT imaging. The automatic pipeline involves multimodal segmentation, feature extraction, and model prediction, and is validated on a cohort of 1116 patients from two medical centers. The feature-based diagnostic model outperformed both the radiomics model and individual machine learning models. The segmentation models for CT and PET images achieved mean Dice similarity coefficients of 0.85 and 0.89, respectively, and the segmented lung contours showed high consistency with the actual contours. The automatic diagnostic system achieved an accuracy of 0.87 on the internal test set and 0.82 on the external test set, demonstrating overall diagnostic performance comparable to the human-based diagnostic model. In comparative analysis, the automatic diagnostic system showed superior performance relative to other segmentation and diagnostic pipelines. The proposed automatic diagnostic system provides an interpretable, automated solution for predicting pleural invasion in non-small cell lung cancer.
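A minimal sketch of the final, feature-based prediction stage: tabular features extracted from the segmented PET/CT regions feed a standard classifier, in the spirit of the nomogram-style model described above. The feature names and data below are illustrative placeholders, not the study's model.

```python
# Hedged sketch of a feature-based classifier for pleural invasion;
# features and labels are synthetic placeholders, not study data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))     # e.g. SUVmax, lesion size, pleural contact, density
y = rng.integers(0, 2, size=200)  # placeholder invasion labels

clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, y)
print(clf.predict_proba(X[:1])[0, 1])  # predicted probability of invasion
```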

Score-based Generative Diffusion Models to Synthesize Full-dose FDG Brain PET from MRI in Epilepsy Patients

Jiaqi Wu, Jiahong Ouyang, Farshad Moradi, Mohammad Mehdi Khalighi, Greg Zaharchuk

arxiv logopreprintJun 12 2025
Fluorodeoxyglucose (FDG) PET to evaluate patients with epilepsy is one of the most common applications for simultaneous PET/MRI, given the need to image both brain structure and metabolism, but is suboptimal due to the radiation dose in this young population. Little work has been done on synthesizing diagnostic-quality PET images from MRI data, or from MRI data with ultralow-dose PET, using advanced generative AI methods such as diffusion models, with attention to clinical evaluations tailored for the epilepsy population. Here we compared the performance of diffusion- and non-diffusion-based deep learning models for the MRI-to-PET image translation task for epilepsy imaging using simultaneous PET/MRI in 52 subjects (40 train/2 validate/10 hold-out test). We tested three models: two score-based generative diffusion models (SGM-Karras Diffusion [SGM-KD] and SGM-variance preserving [SGM-VP]) and a Transformer-Unet. We report results on standard image processing metrics as well as clinically relevant metrics, including congruency measures (Congruence Index and Congruency Mean Absolute Error) that assess hemispheric metabolic asymmetry, a key part of the clinical analysis of these images. The SGM-KD produced the best qualitative and quantitative results when synthesizing PET purely from T1w and T2 FLAIR images, with the lowest mean absolute error in whole-brain standardized uptake value ratio (SUVR) and the highest intraclass correlation coefficient. When 1% low-dose PET images are included in the inputs, all models improve significantly and are interchangeable in quantitative performance and visual quality. In summary, SGMs hold great potential for pure MRI-to-PET translation, while all three model types can synthesize full-dose FDG-PET accurately using MRI and ultralow-dose PET.
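Hemispheric metabolic asymmetry of the kind the congruency metrics target can be summarized with a simple left/right asymmetry index; the sketch below compares such indices between real and synthetic PET. The paper's exact Congruence Index and Congruency MAE definitions may differ, so this is an illustrative assumption.

```python
# Hedged sketch of comparing left/right asymmetry between real and
# synthetic PET; the exact congruency definitions are assumptions.
import numpy as np

def asymmetry_index(left: np.ndarray, right: np.ndarray) -> float:
    l, r = float(left.mean()), float(right.mean())
    return (l - r) / (l + r + 1e-8)

def congruency_mae(real_pairs, synth_pairs) -> float:
    # each *_pairs: iterable of (left_region_values, right_region_values)
    real_ai = np.array([asymmetry_index(l, r) for l, r in real_pairs])
    synth_ai = np.array([asymmetry_index(l, r) for l, r in synth_pairs])
    return float(np.mean(np.abs(real_ai - synth_ai)))

rng = np.random.default_rng(0)
pairs = [(rng.random(100), rng.random(100)) for _ in range(10)]  # placeholder regions
print(congruency_mae(pairs, pairs))  # 0.0 for identical inputs
```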
