GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models.

Zotova D, Pinon N, Trombetta R, Bouet R, Jung J, Lartizien C

pubmed · Jun 1 2025
Research in cross-modal medical image translation has been very productive over the past few years, with GAN-based architectures showing promising performance in tackling the scarce availability of large curated multi-modality datasets. However, only a few of these studies assessed the task-based performance of the synthetic data, especially for training deep models. We design and compare different GAN-based frameworks for generating synthetic brain [18F]fluorodeoxyglucose (FDG) PET images from T1-weighted MRI data. We first perform standard qualitative and quantitative visual quality evaluation. Then, we further explore the impact of using these fake PET data to train a deep unsupervised anomaly detection (UAD) model designed to detect subtle epilepsy lesions in T1 MRI and FDG PET images. We introduce novel diagnostic task-oriented quality metrics for the synthetic FDG PET data tailored to our unsupervised detection task, then use these fake data to train a use-case UAD model combining deep representation learning based on siamese autoencoders with an OC-SVM density support estimation model. This model is trained on normal subjects only and allows the detection of any variation from the pattern of the normal population. We compare the detection performance of models trained on 35 real T1 MR images of normal subjects paired either with 35 true PET images or with 35 synthetic PET images generated by the best-performing generative models. Performance analysis is conducted on 17 exams of epilepsy patients undergoing surgery. The best-performing GAN-based models generate realistic fake PET images of control subjects, with SSIM and PSNR values around 0.9 and 23.8, respectively, that are in distribution (ID) with respect to the true control dataset. The best UAD model trained on these synthetic normative PET data reaches 74% sensitivity. Our results confirm that GAN-based models are best suited for MR T1 to FDG PET translation, outperforming transformer and diffusion models. We also demonstrate the diagnostic value of these synthetic data for training UAD models and for evaluation on clinical exams of epilepsy patients. Our code and the normative image dataset are available.
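
As a rough illustration of the detection stage described above, the sketch below fits a one-class SVM on latent features of normal subjects and scores a new exam; the feature dimensions and all names are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical latent features from a siamese autoencoder: rows are
# patches from normal subjects (train) or from one test exam.
rng = np.random.default_rng(0)
train_latents = rng.normal(size=(5000, 64))  # normal population only
test_latents = rng.normal(size=(800, 64))    # one patient exam

# Estimate the support of the normal population; anything outside the
# learned boundary counts as a deviation from the normal pattern.
ocsvm = OneClassSVM(kernel="rbf", nu=0.03, gamma="scale")
ocsvm.fit(train_latents)

# Signed distance to the boundary: negative scores are anomalous.
scores = ocsvm.decision_function(test_latents)
anomaly_mask = scores < 0
print(f"{anomaly_mask.mean():.1%} of patches flagged as anomalous")
```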

Boosting polyp screening with improved point-teacher weakly semi-supervised learning.

Du X, Zhang X, Chen J, Li L

pubmed · Jun 1 2025
Polyps, like a silent time bomb in the gut, are always lurking and can explode into deadly colorectal cancer at any time. Many methods attempt to maximize the early detection of colon polyps through screening; however, two challenges remain: (i) the scarcity of per-pixel annotation data, together with clinical features of polyps such as blurred boundaries and low contrast, leads to poor performance; (ii) existing weakly semi-supervised methods that directly use pseudo-labels to supervise the student tend to ignore the value of the intermediate features in the teacher. To adapt the point-prompt teacher model to the challenging scenario of complex medical images and limited annotation data, we leverage the diverse inductive biases of CNNs and Transformers to extract robust and complementary representations of polyp features (boundary and context). At the same time, a newly designed teacher-student intermediate feature distillation method is introduced, rather than using pseudo-labels alone to guide student learning. Comprehensive experiments demonstrate that our proposed method effectively handles scenarios with limited annotations and exhibits good segmentation performance. All code is available at https://github.com/dxqllp/WSS-Polyp.
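
The intermediate-feature distillation idea, supervising the student with the teacher's internal representations rather than pseudo-labels alone, can be sketched roughly as follows (PyTorch; the projection layer, feature shapes, and the plain MSE objective are assumptions, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_feats, teacher_feats, proj):
    """MSE between projected student features and detached teacher
    features, averaged over the distilled layers."""
    loss = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        loss = loss + F.mse_loss(proj(fs), ft.detach())
    return loss / len(student_feats)

# Hypothetical shapes: two intermediate feature maps per network.
student_feats = [torch.randn(2, 64, 32, 32, requires_grad=True),
                 torch.randn(2, 64, 16, 16, requires_grad=True)]
teacher_feats = [torch.randn(2, 128, 32, 32),
                 torch.randn(2, 128, 16, 16)]
proj = torch.nn.Conv2d(64, 128, kernel_size=1)  # align channel dims

loss = distillation_loss(student_feats, teacher_feats, proj)
loss.backward()
```

In full training this term would be added to the pseudo-label supervision rather than replacing it.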

Efficient slice anomaly detection network for 3D brain MRI Volume.

Zhang Z, Mohsenzadeh Y

pubmed · Jun 1 2025
Current anomaly detection methods excel on benchmark industrial data but struggle with natural images and medical data due to varying definitions of 'normal' and 'abnormal', which makes accurate identification of deviations in these fields particularly challenging. For 3D brain MRI data in particular, the state-of-the-art models are all reconstruction-based 3D convolutional neural networks, which are memory-intensive and time-consuming and produce noisy outputs that require further post-processing. We propose a framework called Simple Slice-based Network (SimpleSliceNet), which uses a model pre-trained on ImageNet and fine-tuned on a separate MRI dataset as a 2D slice feature extractor to reduce computational cost. We aggregate the extracted features to perform anomaly detection on 3D brain MRI volumes. Our model integrates a conditional normalizing flow to calculate the log-likelihood of features and employs a contrastive loss to enhance anomaly detection accuracy. The results indicate improved performance, demonstrating our model's adaptability and effectiveness on the challenges inherent in brain MRI data. In addition, on large-scale 3D brain volumes, SimpleSliceNet outperforms state-of-the-art 2D and 3D models in accuracy, memory usage, and time consumption. Code is available at: https://github.com/Jarvisarmy/SimpleSliceNet.
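
A minimal sketch of the slice-based scoring pipeline, assuming a ResNet-18 backbone and a Gaussian (Mahalanobis) score as a stand-in for the paper's conditional normalizing flow; all statistics below are placeholders:

```python
import torch
from torchvision.models import resnet18

# 2D slice feature extractor (in the paper: ImageNet-pretrained and
# fine-tuned on MRI; weights are omitted to keep this sketch light).
backbone = resnet18(weights=None)
backbone.fc = torch.nn.Identity()
backbone.eval()

def slice_scores(volume, mu, prec):
    """Score each 2D slice of a 3D volume; higher = more anomalous."""
    with torch.no_grad():
        feats = backbone(volume.repeat(1, 3, 1, 1))  # (n_slices, 512)
    diff = feats - mu
    return (diff @ prec * diff).sum(dim=1)  # Mahalanobis distance

volume = torch.randn(32, 1, 224, 224)  # hypothetical volume, 32 slices
mu = torch.zeros(512)                  # stats fitted on normal slices
prec = torch.eye(512)                  # precision matrix placeholder
scores = slice_scores(volume, mu, prec)
volume_score = scores.max()            # aggregate: most anomalous slice
```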

GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification.

Golhar MV, Bobrow TL, Ngamruengphong S, Durr NJ

pubmed · Jun 1 2025
A major challenge in applying deep learning to medical imaging is the paucity of annotated data. This study explores synthetic images for data augmentation to address this challenge in colonoscopy lesion classification. We demonstrate that synthetic colonoscopy images generated by Generative Adversarial Network (GAN) inversion can be used as training data to improve the polyp classification performance of deep learning models. We invert pairs of images with the same label into a semantically rich and disentangled latent space and manipulate the latent representations to produce new synthetic images that retain the label of the input pairs. We perform image modality translation (style transfer) between white light and narrow-band imaging (NBI), and we generate realistic synthetic lesion images by interpolating between original training images to increase the variety of lesion shapes in the training dataset. Our experiments show that GAN inversion can produce multiple colonoscopy data augmentations that improve downstream polyp classification performance by 2.7% in F1-score and 4.9% in sensitivity over other methods, including state-of-the-art data augmentation. Testing on unseen out-of-domain data also showed an improvement of 2.9% in F1-score and 2.7% in sensitivity. This approach outperforms other colonoscopy data augmentation techniques, does not require re-training multiple generative models, and effectively uses information from diverse public datasets, even those not designed for the targeted downstream task, resulting in strong domain generalizability. Project code and model: https://github.com/DurrLab/GAN-Inversion.
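
The interpolation-based augmentation can be sketched as below; the inversion encoder and generator are hypothetical stand-ins for the pretrained GAN components, and the key assumption is that a convex combination of two same-label latents yields an image with that label:

```python
import torch

def invert(image):          # stand-in for GAN inversion to latent space
    return torch.randn(1, 512)

def generate(latent):       # stand-in for the pretrained GAN generator
    return torch.randn(1, 3, 256, 256)

# Two training images sharing the same lesion label.
img_a, img_b = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
z_a, z_b = invert(img_a), invert(img_b)

alpha = torch.rand(1)                    # random mixing coefficient
z_new = alpha * z_a + (1 - alpha) * z_b  # interpolate in latent space
augmented = generate(z_new)              # new synthetic image, same label
```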

Adaptive Weighting Based Metal Artifact Reduction in CT Images.

Wang H, Wu Y, Wang Y, Wei D, Wu X, Ma J, Zheng Y

pubmed · Jun 1 2025
For the metal artifact reduction (MAR) task in computed tomography (CT) imaging, most existing deep-learning-based approaches select a single Hounsfield unit (HU) window followed by a normalization operation to preprocess CT images. However, in practical clinical scenarios, different body tissues and organs are often inspected under varying window settings for good contrast. Methods trained on a fixed single window remove metal artifacts insufficiently when transferred to other windows. To alleviate this problem, a few works have proposed reconstructing CT images under multiple window configurations. Although they achieve good reconstruction performance across windows, they supervise learning for each window with equal weights on the training set. To improve learning flexibility and model generalizability, we propose an adaptive weighting algorithm, called AdaW, for multiple-window metal artifact reduction, which can be applied to different deep MAR network backbones. Specifically, we first formulate the multiple-window learning task as a bi-level optimization problem. We then derive an adaptive weighting optimization algorithm in which the learning process for MAR under each window is automatically weighted via a learning-to-learn paradigm based on the training and validation sets. This design is substantiated through theoretical analysis. Across different network backbones, experimental comparisons on five datasets covering different body sites validate the effectiveness of AdaW in improving generalization performance, as well as its broad applicability. We will release the code at https://github.com/hongwang01/AdaW.
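
From the description above, the bi-level formulation can be reconstructed roughly as follows; the notation (per-window weights w on the simplex, MAR network parameters theta, K windows) is assumed rather than taken from the paper:

```latex
% Hedged reconstruction of the bi-level problem; notation assumed.
\begin{aligned}
w^{*} &= \arg\min_{w \in \Delta^{K-1}}
          \mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(w)\bigr)
  && \text{(upper level: tune window weights on validation data)} \\
\text{s.t.}\quad \theta^{*}(w) &= \arg\min_{\theta}
          \sum_{k=1}^{K} w_{k}\,\mathcal{L}_{\mathrm{train}}^{(k)}(\theta)
  && \text{(lower level: train the MAR model under weighted windows)}
\end{aligned}
```

In a learning-to-learn scheme the two levels are typically optimized in alternation, with the window weights updated against the validation loss of the lower-level solution.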

Automated Ensemble Multimodal Machine Learning for Healthcare.

Imrie F, Denner S, Brunschwig LS, Maier-Hein K, van der Schaar M

pubmed · Jun 1 2025
The application of machine learning in medicine and healthcare has led to the creation of numerous diagnostic and prognostic models. However, despite their success, current approaches generally issue predictions using data from a single modality. This stands in stark contrast to clinician decision-making, which draws on diverse information from multiple sources. While several multimodal machine learning approaches exist, significant challenges remain in developing multimodal systems, hindering clinical adoption. In this paper, we introduce a multimodal framework, AutoPrognosis-M, that integrates structured clinical (tabular) data and medical imaging using automated machine learning. AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies. In an illustrative application using a multimodal skin lesion dataset, we highlight the importance of multimodal machine learning and the power of combining multiple fusion strategies using ensemble learning. We have open-sourced our framework as a tool for the community and hope it will accelerate the uptake of multimodal machine learning in healthcare and spur further innovation.
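
Of the fusion strategies mentioned, late fusion is the simplest to illustrate: unimodal models are trained independently and their predicted probabilities combined. The sketch below assumes two already-trained models and a fixed mixing weight; the framework's actual strategies and ensembling are more elaborate.

```python
import numpy as np

def late_fusion(prob_tabular, prob_image, w=0.5):
    """Weighted average of class probabilities from two unimodal models."""
    return w * prob_tabular + (1 - w) * prob_image

# Hypothetical predicted probabilities for 4 lesions, 2 classes.
prob_tabular = np.array([[0.8, 0.2], [0.4, 0.6], [0.7, 0.3], [0.1, 0.9]])
prob_image = np.array([[0.6, 0.4], [0.3, 0.7], [0.9, 0.1], [0.2, 0.8]])

fused = late_fusion(prob_tabular, prob_image, w=0.5)
pred = fused.argmax(axis=1)  # final class per lesion
```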

Physician-level classification performance across multiple imaging domains with a diagnostic medical foundation model and a large dataset of annotated medical images

Thieme, A. H., Miri, T., Marra, A. R., Kobayashi, T., Rodriguez-Nava, G., Li, Y., Barba, T., Er, A. G., Benzler, J., Gertler, M., Riechers, M., Hinze, C., Zheng, Y., Pelz, K., Nagaraj, D., Chen, A., Loeser, A., Ruehle, A., Zamboglou, C., Alyahya, L., Uhlig, M., Machiraju, G., Weimann, K., Lippert, C., Conrad, T., Ma, J., Novoa, R., Moor, M., Hernandez-Boussard, T., Alawad, M., Salinas, J. L., Mittermaier, M., Gevaert, O.

medrxiv preprint · May 31 2025
A diagnostic medical foundation model (MedFM) is an artificial intelligence (AI) system engineered to accurately determine diagnoses across various medical imaging modalities and specialties. To train MedFM, we created the PubMed Central Medical Images Dataset (PMCMID), the largest annotated medical image dataset to date, comprising 16,126,659 images from 3,021,780 medical publications. Using AI- and ontology-based methods, we identified 4,482,237 medical images (e.g., clinical photos, X-rays, ultrasounds) and generated comprehensive annotations. To optimize MedFM's performance and assess biases, 13,266 images were manually annotated to establish a multimodal benchmark. MedFM achieved physician-level performance in diagnosis tasks spanning radiology, dermatology, and infectious diseases without requiring task-specific training. Additionally, we developed the Image2Paper app, which allows clinicians to upload medical images and retrieve relevant literature; the correct diagnosis appeared within the top ten results in 88.4% of cases, and at least one relevant differential diagnosis in 93.0%. MedFM and PMCMID have been made publicly available. Funding: Research reported here was partially supported by the National Cancer Institute (NCI) (R01 CA260271), the Saudi Company for Artificial Intelligence (SCAI) Authority, and the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under the project DAKI-FWS (01MK21009E). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
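
An Image2Paper-style retrieval step might look like the sketch below: embed the uploaded image, then rank publication images by cosine similarity. Everything here, the embedding dimension, the corpus, the ranking function, is a hypothetical stand-in; the app's internals are not described in the abstract.

```python
import numpy as np

def cosine_rank(query_emb, corpus_embs, top_k=10):
    """Return indices of the top_k most similar corpus images."""
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:top_k]

rng = np.random.default_rng(0)
corpus_embs = rng.normal(size=(10_000, 256))  # embeddings of dataset images
query_emb = rng.normal(size=256)              # embedding of uploaded image
top_hits = cosine_rank(query_emb, corpus_embs, top_k=10)
```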

ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Ruiming Min, Minghao Liu

arxiv preprint · May 31 2025
With the advancement of modern medicine and the development of technologies such as MRI, CT, and cellular analysis, it has become increasingly critical for clinicians to accurately interpret various diagnostic images. However, modern medical education often faces challenges due to limited access to high-quality teaching materials, stemming from privacy concerns and a shortage of educational resources (Balogh et al., 2015). In this context, image data generated by machine learning models, particularly generative models, presents a promising solution. These models can create diverse and comparable imaging datasets without compromising patient privacy, thereby supporting modern medical education. In this study, we explore the use of convolutional neural networks (CNNs) and CycleGAN (Zhu et al., 2017) for generating synthetic medical images. The source code is available at https://github.com/mliuby/COMP4211-Project.
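
The core CycleGAN training signal is cycle consistency: translating an image to the other domain and back should reconstruct it. A minimal sketch follows; the two generators below are toy stand-ins for the real networks, and the adversarial terms are omitted:

```python
import torch
import torch.nn.functional as F

# Toy generators G: X -> Y and Fb: Y -> X (stand-ins for real networks).
G = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)
Fb = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

x = torch.randn(4, 3, 128, 128)  # e.g., images of one disease stage
y = torch.randn(4, 3, 128, 128)  # e.g., images of another stage

# Cycle-consistency loss (Zhu et al., 2017): F(G(x)) ~ x and G(F(y)) ~ y.
cycle_loss = F.l1_loss(Fb(G(x)), x) + F.l1_loss(G(Fb(y)), y)
cycle_loss.backward()  # combined with adversarial losses in full training
```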

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Wei Dai, Peilin Chen, Chanakya Ekbote, Paul Pu Liang

arxiv preprint · May 31 2025
Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating the performance imbalance caused by skewed clinical data distributions. Training on 2.61 million instruction-tuning pairs spanning 9 clinical domains, we show that DRPO boosts diagnostic performance by 43% in macro-F1 on average across all visual domains compared with other critic-free training methods such as GRPO. Furthermore, because QoQ-Med is trained on intensive segmentation data, it can highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while matching the performance of OpenAI o4-mini. To foster reproducibility and downstream research, we release (i) the full model weights, (ii) the modular training pipeline, and (iii) all intermediate reasoning traces at https://github.com/DDVD233/QoQ_Med.
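
The domain-aware scaling in DRPO can be caricatured as below: normalized rewards are scaled up for rare domains and difficult modalities, then renormalized. The exact functional form is an assumption; only the qualitative behavior (rarity and difficulty increase the scale) is taken from the abstract.

```python
import torch

def scale_rewards(rewards, domain_freq, modality_difficulty):
    """Upweight rewards from rare domains and hard modalities."""
    rarity = 1.0 / domain_freq.clamp(min=1e-6)  # rarer domain, larger scale
    scale = rarity * modality_difficulty
    return rewards * scale / scale.mean()       # keep magnitude stable

rewards = torch.tensor([0.5, -0.2, 0.9, 0.1])         # normalized rewards
domain_freq = torch.tensor([0.40, 0.40, 0.05, 0.15])  # skewed domain mix
modality_difficulty = torch.tensor([1.0, 1.0, 2.0, 1.5])
scaled = scale_rewards(rewards, domain_freq, modality_difficulty)
```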