
Deep-learning based multi-modal models for brain age, cognition and amyloid pathology prediction.

Wang C, Zhang W, Ni M, Wang Q, Liu C, Dai L, Zhang M, Shen Y, Gao F

PubMed · May 31, 2025
Magnetic resonance imaging (MRI), combined with artificial intelligence techniques, has improved our understanding of brain structural change and enabled the estimation of brain age. Neurodegenerative disorders, such as Alzheimer's disease (AD), have been linked to accelerated brain aging. In this study, we aimed to develop a deep-learning framework that processes and integrates MRI images to more accurately predict brain age, cognitive function, and amyloid pathology. We collected over 10,000 T1-weighted MRI scans from more than 7,000 individuals across six cohorts. We designed a multi-modal deep-learning framework that employs 3D convolutional neural networks to analyze MRI and additional neural networks to evaluate demographic data. Our initial model focused on predicting brain age, serving as a foundational model from which we developed separate models for cognitive function and amyloid plaque prediction through transfer learning. The brain age prediction model achieved a mean absolute error (MAE) of 3.302 years for the cognitively normal population in the ADNI test dataset. The gap between predicted brain age and chronological age increases significantly as cognition declines. The cognition prediction model achieved a root mean square error (RMSE) of 0.334 for the Clinical Dementia Rating (CDR) regression task and an area under the curve (AUC) of approximately 0.95 for identifying dementia patients. Dementia-related brain regions, such as the medial temporal lobe, were identified by our model. Finally, the amyloid plaque prediction model achieved an AUC of approximately 0.8 for dementia patients. These findings indicate that the present predictive models can identify subtle changes in brain structure, enabling precise estimates of brain age, cognitive status, and amyloid pathology. Such models could facilitate the use of MRI as a non-invasive diagnostic tool for neurodegenerative diseases, including AD.
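
The abstract describes reusing the brain-age network as a foundation for the cognition and amyloid models via transfer learning. Below is a minimal, hypothetical PyTorch sketch of that pattern (pretrained backbone reused, task head swapped); the architecture, layer sizes, and training details are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the backbone, layer names, and hyperparameters
# below are assumptions, not the published model.
import torch
import torch.nn as nn

class Brain3DCNN(nn.Module):
    """Toy 3D CNN backbone standing in for the brain-age model."""
    def __init__(self, out_dim=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(32, out_dim)   # regression head (e.g. brain age)

    def forward(self, x):
        z = self.features(x).flatten(1)
        return self.head(z)

# 1) Pretrain on brain age (regression) -- training loop omitted.
brain_age_model = Brain3DCNN(out_dim=1)

# 2) Transfer: reuse the feature extractor, swap the head for a new target,
#    e.g. CDR regression or amyloid-positivity classification.
cdr_model = Brain3DCNN(out_dim=1)
cdr_model.features.load_state_dict(brain_age_model.features.state_dict())
for p in cdr_model.features.parameters():
    p.requires_grad = False           # freeze the pretrained backbone
cdr_model.head = nn.Linear(32, 1)     # fresh task-specific head

optimizer = torch.optim.Adam(cdr_model.head.parameters(), lr=1e-3)
```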

Development and validation of a 3-D deep learning system for diabetic macular oedema classification on optical coherence tomography images.

Zhu H, Ji J, Lin JW, Wang J, Zheng Y, Xie P, Liu C, Ng TK, Huang J, Xiong Y, Wu H, Lin L, Zhang M, Zhang G

PubMed · May 31, 2025
To develop and validate an automated diabetic macular oedema (DME) classification system based on images from different three-dimensional optical coherence tomography (3-D OCT) devices. A multicentre, platform-based development study using retrospective and cross-sectional data. Data were subjected to a two-level grading system by trained graders and a retina specialist, and categorised into three types: no DME, non-centre-involved DME and centre-involved DME (CI-DME). A 3-D convolutional neural network algorithm was used to develop the DME classification system. The deep learning (DL) performance was compared with that of diabetic retinopathy experts. Data were collected from the Joint Shantou International Eye Center of Shantou University and the Chinese University of Hong Kong, Chaozhou People's Hospital and The Second Affiliated Hospital of Shantou University Medical College from January 2010 to December 2023. 7790 volumes of 7146 eyes from 4254 patients were annotated, of which 6281 were used as the development set and 1509 as the external validation set, split based on the centres. Accuracy, F1-score, sensitivity, specificity, area under the receiver operating characteristic curve (AUROC) and Cohen's kappa were calculated to evaluate the performance of the DL algorithm. In classifying DME versus non-DME, our model achieved AUROCs of 0.990 (95% CI 0.983 to 0.996) and 0.916 (95% CI 0.902 to 0.930) for the hold-out testing dataset and the external validation dataset, respectively. In distinguishing CI-DME from non-centre-involved DME, our model achieved AUROCs of 0.859 (95% CI 0.812 to 0.906) and 0.881 (95% CI 0.859 to 0.902), respectively. In addition, our system showed performance (Cohen's κ: 0.85 and 0.75) comparable to that of the retina experts (Cohen's κ: 0.58-0.92 and 0.70-0.71). Our DL system achieved high accuracy in multiclass DME classification on 3-D OCT images and can be applied to population-based DME screening.
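
For readers reproducing the evaluation, the sketch below shows one generic way to compute the reported multiclass AUROC and Cohen's kappa with scikit-learn; the labels and probabilities are dummy placeholders, not the study's data.

```python
# Generic metric computation for a three-class DME task; arrays are synthetic.
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

y_true = np.array([0, 1, 1, 0, 2, 2, 1, 0])        # 0=no DME, 1=non-CI-DME, 2=CI-DME
y_score = np.array([                                 # predicted class probabilities
    [0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.1, 0.6, 0.3], [0.8, 0.1, 0.1],
    [0.1, 0.2, 0.7], [0.05, 0.15, 0.8], [0.3, 0.5, 0.2], [0.7, 0.2, 0.1],
])
y_pred = y_score.argmax(axis=1)

# One-vs-rest AUROC for the multiclass setting, and grader agreement via kappa.
print("AUROC (OvR):", roc_auc_score(y_true, y_score, multi_class="ovr"))
print("Cohen's kappa:", cohen_kappa_score(y_true, y_pred))
```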

LiDSCUNet++: A lightweight depth separable convolutional UNet++ for vertebral column segmentation and spondylosis detection.

Agrawal KK, Kumar G

PubMed · May 31, 2025
Accurate computer-aided diagnosis systems rely on precise segmentation of the vertebral column to assist physicians in diagnosing various disorders. However, segmenting spinal disks and bones becomes challenging in the presence of abnormalities and complex anatomical structures. While Deep Convolutional Neural Networks (DCNNs) achieve remarkable results in medical image segmentation, their performance is limited by data insufficiency and the high computational complexity of existing solutions. This paper introduces LiDSCUNet++, a lightweight deep learning framework based on depthwise-separable and pointwise convolutions integrated with UNet++ for vertebral column segmentation. The model segments vertebral anomalies from dog radiographs, and the results are further processed by YOLOv8 for automated detection of Spondylosis Deformans. LiDSCUNet++ delivers comparable segmentation performance while significantly reducing trainable parameters, memory usage, energy consumption, and computational time, making it an efficient and practical solution for medical image analysis.
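
The efficiency gain in LiDSCUNet++ comes from replacing standard convolutions with depthwise-separable (depthwise plus pointwise) convolutions. A minimal, generic PyTorch block illustrating that building block (not the published implementation) is shown below.

```python
# Generic depthwise-separable convolution: a per-channel depthwise conv
# followed by a 1x1 pointwise conv that mixes channels.
import torch.nn as nn

class DepthwiseSeparableConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        # Depthwise: one filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch)
        # Pointwise: 1x1 conv recombines channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Weight count for 64 -> 128 channels with a 3x3 kernel (biases ignored):
# standard conv: 64*128*3*3 = 73,728; separable: 64*3*3 + 64*128 = 8,768.
```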

ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Ruiming Min, Minghao Liu

arXiv preprint · May 31, 2025
With the advancement of modern medicine and the development of technologies such as MRI, CT, and cellular analysis, it has become increasingly critical for clinicians to accurately interpret various diagnostic images. However, modern medical education often faces challenges due to limited access to high-quality teaching materials, stemming from privacy concerns and a shortage of educational resources (Balogh et al., 2015). In this context, image data generated by machine learning models, particularly generative models, presents a promising solution. These models can create diverse and comparable imaging datasets without compromising patient privacy, thereby supporting modern medical education. In this study, we explore the use of convolutional neural networks (CNNs) and CycleGAN (Zhu et al., 2017) for generating synthetic medical images. The source code is available at https://github.com/mliuby/COMP4211-Project.
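
The core mechanism behind CycleGAN is the cycle-consistency constraint between two generators mapping in opposite directions. The toy sketch below illustrates that loss term only; the one-layer "generators" and dummy tensors are placeholders, not the project's networks.

```python
# Cycle-consistency term of the CycleGAN objective on placeholder generators.
import torch
import torch.nn as nn

G_AB = nn.Conv2d(1, 1, 3, padding=1)   # domain A -> B (e.g. healthy -> diseased)
G_BA = nn.Conv2d(1, 1, 3, padding=1)   # domain B -> A

real_a = torch.randn(2, 1, 64, 64)     # dummy images from domain A
real_b = torch.randn(2, 1, 64, 64)     # dummy images from domain B

# Each image should survive a round trip through both generators.
cycle_loss = nn.functional.l1_loss(G_BA(G_AB(real_a)), real_a) + \
             nn.functional.l1_loss(G_AB(G_BA(real_b)), real_b)
# In the full CycleGAN objective this term is added to the two adversarial losses.
cycle_loss.backward()
```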

Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Daniele Molino, Camillo Maria Caruso, Filippo Ruffini, Paolo Soda, Valerio Guarrasi

arXiv preprint · May 31, 2025
Objective: While recent advances in text-conditioned generative models have enabled the synthesis of realistic medical images, progress has been largely confined to 2D modalities such as chest X-rays. Extending text-to-image generation to volumetric Computed Tomography (CT) remains a significant challenge, due to its high dimensionality, anatomical complexity, and the absence of robust frameworks that align vision-language data in 3D medical imaging. Methods: We introduce a novel architecture for Text-to-CT generation that combines a latent diffusion model with a 3D contrastive vision-language pretraining scheme. Our approach leverages a dual-encoder CLIP-style model trained on paired CT volumes and radiology reports to establish a shared embedding space, which serves as the conditioning input for generation. CT volumes are compressed into a low-dimensional latent space via a pretrained volumetric VAE, enabling efficient 3D denoising diffusion without requiring external super-resolution stages. Results: We evaluate our method on the CT-RATE dataset and conduct a comprehensive assessment of image fidelity, clinical relevance, and semantic alignment. Our model achieves competitive performance across all tasks, significantly outperforming prior baselines for text-to-CT generation. Moreover, we demonstrate that CT scans synthesized by our framework can effectively augment real data, improving downstream diagnostic performance. Conclusion: Our results show that modality-specific vision-language alignment is a key component for high-quality 3D medical image generation. By integrating contrastive pretraining and volumetric diffusion, our method offers a scalable and controllable solution for synthesizing clinically meaningful CT volumes from text, paving the way for new applications in data augmentation, medical education, and automated clinical simulation.
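
A hedged sketch of the CLIP-style contrastive alignment step the paper builds on is shown below: paired CT and report embeddings are pulled together with a symmetric cross-entropy loss. The placeholder encoders, feature sizes, and temperature are assumptions for illustration, not the authors' architecture.

```python
# Toy CLIP-style contrastive alignment of CT and report embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

ct_encoder = nn.Linear(512, 128)       # stands in for a 3D CT volume encoder
text_encoder = nn.Linear(768, 128)     # stands in for a radiology-report encoder

ct_feats = torch.randn(8, 512)         # pooled features of 8 CT volumes (dummy)
txt_feats = torch.randn(8, 768)        # features of the 8 paired reports (dummy)

z_ct = F.normalize(ct_encoder(ct_feats), dim=-1)
z_txt = F.normalize(text_encoder(txt_feats), dim=-1)

logits = z_ct @ z_txt.t() / 0.07       # cosine similarities with temperature
targets = torch.arange(8)              # matching pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
loss.backward()
# The trained text embedding then serves as conditioning for the latent diffusion model.
```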

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Wei Dai, Peilin Chen, Chanakya Ekbote, Paul Pu Liang

arXiv preprint · May 31, 2025
Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating performance imbalance caused by skewed clinical data distributions. Trained on 2.61 million instruction tuning pairs spanning 9 clinical domains, we show that DRPO training boosts diagnostic performance by 43% in macro-F1 on average across all visual domains as compared to other critic-free training methods like GRPO. Furthermore, with QoQ-Med trained on intensive segmentation data, it is able to highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while reaching the performance of OpenAI o4-mini. To foster reproducibility and downstream research, we release (i) the full model weights, (ii) the modular training pipeline, and (iii) all intermediate reasoning traces at https://github.com/DDVD233/QoQ_Med.
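
The abstract specifies only that DRPO hierarchically scales normalized rewards by domain rarity and modality difficulty. The sketch below illustrates that general idea with an assumed weighting scheme; it is not the published DRPO objective.

```python
# Assumed illustration of domain-aware reward scaling on GRPO-style
# group-normalized rewards; the rarity/difficulty weights are made up.
import numpy as np

def scaled_advantages(rewards, domain_counts, domain, difficulty):
    """Normalize rewards within a rollout group, then up-weight rare/hard domains."""
    r = np.asarray(rewards, dtype=float)
    adv = (r - r.mean()) / (r.std() + 1e-8)          # group-normalized reward
    rarity = 1.0 / np.log1p(domain_counts[domain])    # rarer domain -> larger weight (assumption)
    return adv * rarity * difficulty                  # hierarchical scaling (assumption)

counts = {"ct": 900_000, "ecg": 40_000}               # hypothetical per-domain sample counts
print(scaled_advantages([0.2, 0.8, 0.5], counts, "ecg", difficulty=1.3))
```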

MR2US-Pro: Prostate MR to Ultrasound Image Translation and Registration Based on Diffusion Models

Xudong Ma, Nantheera Anantrasirichai, Stefanos Bolomytis, Alin Achim

arXiv preprint · May 31, 2025
The diagnosis of prostate cancer increasingly depends on multimodal imaging, particularly magnetic resonance imaging (MRI) and transrectal ultrasound (TRUS). However, accurate registration between these modalities remains a fundamental challenge due to the differences in dimensionality and anatomical representations. In this work, we present a novel framework that addresses these challenges through a two-stage process: TRUS 3D reconstruction followed by cross-modal registration. Unlike existing TRUS 3D reconstruction methods that rely heavily on external probe tracking information, we propose a totally probe-location-independent approach that leverages the natural correlation between sagittal and transverse TRUS views. With the help of our clustering-based feature matching method, we enable the spatial localization of 2D frames without any additional probe tracking information. For the registration stage, we introduce an unsupervised diffusion-based framework guided by modality translation. Unlike existing methods that translate one modality into another, we map both MR and US into a pseudo intermediate modality. This design enables us to customize it to retain only registration-critical features, greatly easing registration. To further enhance anatomical alignment, we incorporate an anatomy-aware registration strategy that prioritizes internal structural coherence while adaptively reducing the influence of boundary inconsistencies. Extensive validation demonstrates that our approach outperforms state-of-the-art methods by achieving superior registration accuracy with physically realistic deformations in a completely unsupervised fashion.

Relationship between spleen volume and diameter for assessment of response to treatment on CT in patients with hematologic malignancies enrolled in clinical trials.

Hasenstab KA, Lu J, Leong LT, Bossard E, Pylarinou-Sinclair E, Devi K, Cunha GM

PubMed · May 31, 2025
To investigate the relationship between spleen diameter (d) and volume (v) in patients with hematologic malignancies (HM) by determining volumetric thresholds that best correlate with established diameter thresholds for assessing response to treatment. Exploratorily, to interrogate the impact of volumetric measurements on response categories and as a predictor of response. Secondary analysis of prospectively collected clinical trial data of 382 patients with HM. Spleen diameters were computed following the Lugano criteria and volumes using deep learning segmentation. The relationship between d and v was estimated using a power regression model, and volumetric thresholds ([Formula: see text]) for treatment response were estimated; a threshold search was performed to determine the percentage change ([Formula: see text]) and minimum volumetric increase ([Formula: see text]) that maximize agreement with the Lugano criteria. The predictive performance of spleen diameter and volume for clinical response was investigated using a random forest model. [Formula: see text] describes the relationship between spleen diameter and volume. [Formula: see text] for splenomegaly was 546 cm³. [Formula: see text], [Formula: see text], and [Formula: see text] for assessing response resulting in the highest agreement with the Lugano criteria were 570 cm³, 73%, and 170 cm³, respectively. Predictive performance for response did not differ significantly between diameter and volume (P = 0.78). This study provides empirical spleen volume thresholds and percentage changes that best correlate with the diameter thresholds of the Lugano criteria for assessment of response to treatment in patients with HM. In our dataset, use of spleen volumetric thresholds versus diameter thresholds resulted in similar response assessment categories and did not signal differences in predictive value for response.
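
As a worked illustration of the power-regression step, the snippet below fits v = a·d^b on synthetic numbers and maps a diameter cut-off to a volumetric one; the data and fitted coefficients are made up, since the article's formula appears only as "[Formula: see text]".

```python
# Illustrative power-law fit between spleen diameter (cm) and volume (cm^3)
# on synthetic data; not the study's fitted coefficients.
import numpy as np

d = np.array([8.0, 10.0, 12.0, 14.0, 16.0])          # diameters (cm), synthetic
v = np.array([150.0, 290.0, 520.0, 800.0, 1180.0])   # volumes (cm^3), synthetic

# Power regression via linear least squares in log space: log v = log a + b log d
b, log_a = np.polyfit(np.log(d), np.log(v), 1)
a = np.exp(log_a)
print(f"v ~= {a:.2f} * d^{b:.2f}")

# A diameter-based cut-off (e.g. the 13 cm splenomegaly threshold) then maps
# to a volumetric threshold through the fitted curve.
print("volume threshold at d = 13 cm:", a * 13.0 ** b)
```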

Dual-energy CT-based virtual monoenergetic imaging via unsupervised learning.

Liu CK, Chang HY, Huang HM

PubMed · May 31, 2025
Since its development, virtual monoenergetic imaging (VMI) derived from dual-energy computed tomography (DECT) has been shown to be valuable in many clinical applications. However, DECT-based VMI shows increased noise at low keV levels. In this study, we proposed an unsupervised learning method to generate VMI from DECT. This means that neither training data nor labels (i.e., high-quality VMIs) are required. Specifically, DECT images were fed into a deep learning (DL)-based model expected to output VMI. Based on the theory that VMI obtained from image-space data is a linear combination of DECT images, we used the model output (i.e., the predicted VMI) to recalculate the DECT images. By minimizing the difference between the measured and recalculated DECT images, the DL-based model constrains itself to generate VMI from DECT images. We investigated whether the proposed DL-based method can improve the quality of VMIs. The experimental results obtained from patient data showed that the DL-based VMIs had better image quality than the conventional DECT-based VMIs. Moreover, the CT number differences between the DECT-based and DL-based VMIs were distributed within ±10 HU for bone and ±5 HU for brain, fat, and muscle. Except for bone, no statistically significant difference in CT number measurements was found between the DECT-based and DL-based VMIs (p > 0.01). Our preliminary results show that DL has the potential to generate high-quality VMIs directly from DECT in an unsupervised manner.
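
The self-consistency training loop sketched below follows the idea stated in the abstract: the network's outputs are linearly recombined into the two DECT images and compared with the measurements, so no VMI labels are needed. The two-channel output design and the fixed mixing weights are assumptions made for illustration, not the paper's formulation.

```python
# Hedged sketch of label-free training via DECT self-consistency; the network,
# output design, and mixing weights are illustrative assumptions.
import torch
import torch.nn as nn

net = nn.Sequential(                        # toy stand-in for the DL model
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, 3, padding=1),         # two VMI-like basis outputs (assumption)
)

# Assumed fixed weights mapping the basis outputs back to low/high-kV images.
W = torch.tensor([[0.6, 0.4],
                  [0.3, 0.7]])

def recalc_dect(basis):                     # basis: (B, 2, H, W)
    return torch.einsum("kc,bchw->bkhw", W, basis)

dect = torch.randn(1, 2, 64, 64)            # measured low/high-kV images (dummy)
basis = net(dect)
loss = nn.functional.mse_loss(recalc_dect(basis), dect)
loss.backward()                             # no ground-truth VMI labels needed
```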

CineMA: A Foundation Model for Cine Cardiac MRI

Yunguan Fu, Weixi Yi, Charlotte Manisty, Anish N Bhuva, Thomas A Treibel, James C Moon, Matthew J Clarkson, Rhodri Huw Davies, Yipeng Hu

arXiv preprint · May 31, 2025
Cardiac magnetic resonance (CMR) is a key investigation in clinical cardiovascular medicine and has been used extensively in population research. However, extracting clinically important measurements such as ejection fraction for diagnosing cardiovascular diseases remains time-consuming and subjective. We developed CineMA, a foundation AI model automating these tasks with limited labels. CineMA is a self-supervised autoencoder model trained on 74,916 cine CMR studies to reconstruct images from masked inputs. After fine-tuning, it was evaluated across eight datasets on 23 tasks from four categories: ventricle and myocardium segmentation, left and right ventricle ejection fraction calculation, disease detection and classification, and landmark localisation. CineMA is the first foundation model for cine CMR to match or outperform convolutional neural networks (CNNs). CineMA demonstrated greater label efficiency than CNNs, achieving comparable or better performance with fewer annotations. This reduces the burden of clinician labelling and supports replacing task-specific training with fine-tuning foundation models in future cardiac imaging applications. Models and code for pre-training and fine-tuning are available at https://github.com/mathpluscode/CineMA, democratising access to high-performance models that otherwise require substantial computational resources, promoting reproducibility and accelerating clinical translation.
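
A minimal sketch of the masked-reconstruction pretraining objective described for CineMA is given below; the tiny convolutional autoencoder, patch size, and masking ratio are illustrative stand-ins, not the released model.

```python
# MAE-style pretraining sketch: mask random patches and reconstruct them.
import torch
import torch.nn as nn

autoencoder = nn.Sequential(                    # toy stand-in for the CineMA network
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

def random_patch_mask(x, patch=16, ratio=0.75):
    """Zero out a random subset of patch x patch blocks."""
    b, _, h, w = x.shape
    keep = (torch.rand(b, 1, h // patch, w // patch) > ratio).float()
    mask = keep.repeat_interleave(patch, dim=2).repeat_interleave(patch, dim=3)
    return x * mask, mask

frames = torch.randn(4, 1, 128, 128)            # dummy cine CMR frames
masked, mask = random_patch_mask(frames)
recon = autoencoder(masked)
# Reconstruction loss computed only on the masked-out regions.
loss = ((recon - frames) ** 2 * (1 - mask)).mean()
loss.backward()
```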