Sort by:
Page 50 of 59587 results

Boosting polyp screening with improved point-teacher weakly semi-supervised.

Du X, Zhang X, Chen J, Li L

pubmed logopapersJun 1 2025
Polyps, like a silent time bomb in the gut, are always lurking and can explode into deadly colorectal cancer at any time. Many methods are attempted to maximize the early detection of colon polyps by screening, however, there are still face some challenges: (i) the scarcity of per-pixel annotation data and clinical features such as the blurred boundary and low contrast of polyps result in poor performance. (ii) existing weakly semi-supervised methods directly using pseudo-labels to supervise student tend to ignore the value brought by intermediate features in the teacher. To adapt the point-prompt teacher model to the challenging scenarios of complex medical images and limited annotation data, we creatively leverage the diverse inductive biases of CNN and Transformer to extract robust and complementary representation of polyp features (boundary and context). At the same time, a novel designed teacher-student intermediate feature distillation method is introduced rather than just using pseudo-labels to guide student learning. Comprehensive experiments demonstrate that our proposed method effectively handles scenarios with limited annotations and exhibits good segmentation performance. All code is available at https://github.com/dxqllp/WSS-Polyp.

Generating Synthetic T2*-Weighted Gradient Echo Images of the Knee with an Open-source Deep Learning Model.

Vrettos K, Vassalou EE, Vamvakerou G, Karantanas AH, Klontzas ME

pubmed logopapersJun 1 2025
Routine knee MRI protocols for 1.5 T and 3 T scanners, do not include T2*-w gradient echo (T2*W) images, which are useful in several clinical scenarios such as the assessment of cartilage, synovial blooming (deposition of hemosiderin), chondrocalcinosis and the evaluation of the physis in pediatric patients. Herein, we aimed to develop an open-source deep learning model that creates synthetic T2*W images of the knee using fat-suppressed intermediate-weighted images. A cycleGAN model was trained with 12,118 sagittal knee MR images and tested on an independent set of 2996 images. Diagnostic interchangeability of synthetic T2*W images was assessed against a series of findings. Voxel intensity of four tissues was evaluated with Bland-Altman plots. Image quality was assessed with the use of root mean squared error (NRMSE), structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR). Code, model and a standalone executable file are provided on github. The model achieved a median NRMSE, PSNR and SSIM of 0.5, 17.4, and 0.5, respectively. Images were found interchangeable with an intraclass correlation coefficient >0.95 for all findings. Mean voxel intensity was equal between synthetic and conventional images. Four types of artifacts were identified: geometrical distortion (86/163 cases), object insertion/omission (11/163 cases), a wrap-around-like (26/163 cases) and an incomplete fat-suppression artifact (120/163 cases), which had a median 0 impact (no impact) on the diagnosis. In conclusion, the developed open-source GAN model creates synthetic T2*W images of the knee of high diagnostic value and quality. The identified artifacts had no or minor effect on the diagnostic value of the images.

FedBCD: Federated Ultrasound Video and Image Joint Learning for Breast Cancer Diagnosis.

Deng T, Huang C, Cai M, Liu Y, Liu M, Lin J, Shi Z, Zhao B, Huang J, Liang C, Han G, Liu Z, Wang Y, Han C

pubmed logopapersJun 1 2025
Ultrasonography plays an essential role in breast cancer diagnosis. Current deep learning based studies train the models on either images or videos in a centralized learning manner, lacking consideration of joint benefits between two different modality models or the privacy issue of data centralization. In this study, we propose the first decentralized learning solution for joint learning with breast ultrasound video and image, called FedBCD. To enable the model to learn from images and videos simultaneously and seamlessly in client-level local training, we propose a Joint Ultrasound Video and Image Learning (JUVIL) model to bridge the dimension gap between video and image data by incorporating temporal and spatial adapters. The parameter-efficient design of JUVIL with trainable adapters and frozen backbone further reduces the computational cost and communication burden of federated learning, finally improving the overall efficiency. Moreover, considering conventional model-wise aggregation may lead to unstable federated training due to different modalities, data capacities in different clients, and different functionalities across layers. We further propose a Fisher information matrix (FIM) guided Layer-wise Aggregation method named FILA. By measuring layer-wise sensitivity with FIM, FILA assigns higher contributions to the clients with lower sensitivity, improving personalized performance during federated training. Extensive experiments on three image clients and one video client demonstrate the benefits of joint learning architecture, especially for the ones with small-scale data. FedBCD significantly outperforms nine federated learning methods on both video-based and image-based diagnoses, demonstrating the superiority and potential for clinical practice. Code is released at https://github.com/tianpeng-deng/FedBCD.

Adaptive Weighting Based Metal Artifact Reduction in CT Images.

Wang H, Wu Y, Wang Y, Wei D, Wu X, Ma J, Zheng Y

pubmed logopapersJun 1 2025
Against the metal artifact reduction (MAR) task in computed tomography (CT) imaging, most of the existing deep-learning-based approaches generally select a single Hounsfield unit (HU) window followed by a normalization operation to preprocess CT images. However, in practical clinical scenarios, different body tissues and organs are often inspected under varying window settings for good contrast. The methods trained on a fixed single window would lead to insufficient removal of metal artifacts when being transferred to deal with other windows. To alleviate this problem, few works have proposed to reconstruct the CT images under multiple-window configurations. Albeit achieving good reconstruction performance for different windows, they adopt to directly supervise each window learning in an equal weighting way based on the training set. To improve the learning flexibility and model generalizability, in this paper, we propose an adaptive weighting algorithm, called AdaW, for the multiple-window metal artifact reduction, which can be applied to different deep MAR network backbones. Specifically, we first formulate the multiple window learning task as a bi-level optimization problem. Then we derive an adaptive weighting optimization algorithm where the learning process for MAR under each window is automatically weighted via a learning-to-learn paradigm based on the training set and validation set. This rationality is finely substantiated through theoretical analysis. Based on different network backbones, experimental comparisons executed on five datasets with different body sites comprehensively validate the effectiveness of AdaW in helping improve the generalization performance as well as its good applicability. We will release the code at https://github.com/hongwang01/AdaW.

GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification.

Golhar MV, Bobrow TL, Ngamruengphong S, Durr NJ

pubmed logopapersJun 1 2025
A major challenge in applying deep learning to medical imaging is the paucity of annotated data. This study explores the use of synthetic images for data augmentation to address the challenge of limited annotated data in colonoscopy lesion classification. We demonstrate that synthetic colonoscopy images generated by Generative Adversarial Network (GAN) inversion can be used as training data to improve polyp classification performance by deep learning models. We invert pairs of images with the same label to a semantically rich and disentangled latent space and manipulate latent representations to produce new synthetic images. These synthetic images maintain the same label as the input pairs. We perform image modality translation (style transfer) between white light and narrow-band imaging (NBI). We also generate realistic synthetic lesion images by interpolating between original training images to increase the variety of lesion shapes in the training dataset. Our experiments show that GAN inversion can produce multiple colonoscopy data augmentations that improve the downstream polyp classification performance by 2.7% in F1-score and 4.9% in sensitivity over other methods, including state-of-the-art data augmentation. Testing on unseen out-of-domain data also showcased an improvement of 2.9% in F1-score and 2.7% in sensitivity. This approach outperforms other colonoscopy data augmentation techniques and does not require re-training multiple generative models. It also effectively uses information from diverse public datasets, even those not specifically designed for the targeted downstream task, resulting in strong domain generalizability. Project code and model: https://github.com/DurrLab/GAN-Inversion.

Automated Ensemble Multimodal Machine Learning for Healthcare.

Imrie F, Denner S, Brunschwig LS, Maier-Hein K, van der Schaar M

pubmed logopapersJun 1 2025
The application of machine learning in medicine and healthcare has led to the creation of numerous diagnostic and prognostic models. However, despite their success, current approaches generally issue predictions using data from a single modality. This stands in stark contrast with clinician decision-making which employs diverse information from multiple sources. While several multimodal machine learning approaches exist, significant challenges in developing multimodal systems remain that are hindering clinical adoption. In this paper, we introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning. AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies. In an illustrative application using a multimodal skin lesion dataset, we highlight the importance of multimodal machine learning and the power of combining multiple fusion strategies using ensemble learning. We have open-sourced our framework as a tool for the community and hope it will accelerate the uptake of multimodal machine learning in healthcare and spur further innovation.

Deep learning-based MRI reconstruction with Artificial Fourier Transform Network (AFTNet).

Yang Y, Zhang Y, Li Z, Tian JS, Dagommer M, Guo J

pubmed logopapersJun 1 2025
Deep complex-valued neural networks (CVNNs) provide a powerful way to leverage complex number operations and representations and have succeeded in several phase-based applications. However, previous networks have not fully explored the impact of complex-valued networks in the frequency domain. Here, we introduce a unified complex-valued deep learning framework - Artificial Fourier Transform Network (AFTNet) - which combines domain-manifold learning and CVNNs. AFTNet can be readily used to solve image inverse problems in domain transformation, especially for accelerated magnetic resonance imaging (MRI) reconstruction and other applications. While conventional methods typically utilize magnitude images or treat the real and imaginary components of k-space data as separate channels, our approach directly processes raw k-space data in the frequency domain, utilizing complex-valued operations. This allows for a mapping between the frequency (k-space) and image domain to be determined through cross-domain learning. We show that AFTNet achieves superior accelerated MRI reconstruction compared to existing approaches. Furthermore, our approach can be applied to various tasks, such as denoised magnetic resonance spectroscopy (MRS) reconstruction and datasets with various contrasts. The AFTNet presented here is a valuable preprocessing component for different preclinical studies and provides an innovative alternative for solving inverse problems in imaging and spectroscopy. The code is available at: https://github.com/yanting-yang/AFT-Net.

TDSF-Net: Tensor Decomposition-Based Subspace Fusion Network for Multimodal Medical Image Classification.

Zhang Y, Xu G, Zhao M, Wang H, Shi F, Chen S

pubmed logopapersJun 1 2025
Data from multimodalities bring complementary information for deep learning-based medical image classification models. However, data fusion methods simply concatenating features or images barely consider the correlations or complementarities among different modalities and easily suffer from exponential growth in dimensions and computational complexity when the modality increases. Consequently, this article proposes a subspace fusion network with tensor decomposition (TD) to heighten multimodal medical image classification. We first introduce a Tucker low-rank TD module to map the high-level dimensional tensor to the low-rank subspace, reducing the redundancy caused by multimodal data and high-dimensional features. Then, a cross-tensor attention mechanism is utilized to fuse features from the subspace into a high-dimension tensor, enhancing the representation ability of extracted features and constructing the interaction information among components in the subspace. Extensive comparison experiments with state-of-the-art (SOTA) methods are conducted on one self-established and three public multimodal medical image datasets, verifying the effectiveness and generalization ability of the proposed method. The code is available at https://github.com/1zhang-yi/TDSFNet.

CineMA: A Foundation Model for Cine Cardiac MRI

Yunguan Fu, Weixi Yi, Charlotte Manisty, Anish N Bhuva, Thomas A Treibel, James C Moon, Matthew J Clarkson, Rhodri Huw Davies, Yipeng Hu

arxiv logopreprintMay 31 2025
Cardiac magnetic resonance (CMR) is a key investigation in clinical cardiovascular medicine and has been used extensively in population research. However, extracting clinically important measurements such as ejection fraction for diagnosing cardiovascular diseases remains time-consuming and subjective. We developed CineMA, a foundation AI model automating these tasks with limited labels. CineMA is a self-supervised autoencoder model trained on 74,916 cine CMR studies to reconstruct images from masked inputs. After fine-tuning, it was evaluated across eight datasets on 23 tasks from four categories: ventricle and myocardium segmentation, left and right ventricle ejection fraction calculation, disease detection and classification, and landmark localisation. CineMA is the first foundation model for cine CMR to match or outperform convolutional neural networks (CNNs). CineMA demonstrated greater label efficiency than CNNs, achieving comparable or better performance with fewer annotations. This reduces the burden of clinician labelling and supports replacing task-specific training with fine-tuning foundation models in future cardiac imaging applications. Models and code for pre-training and fine-tuning are available at https://github.com/mathpluscode/CineMA, democratising access to high-performance models that otherwise require substantial computational resources, promoting reproducibility and accelerating clinical translation.

Physician-level classification performance across multiple imaging domains with a diagnostic medical foundation model and a large dataset of annotated medical images

Thieme, A. H., Miri, T., Marra, A. R., Kobayashi, T., Rodriguez-Nava, G., Li, Y., Barba, T., Er, A. G., Benzler, J., Gertler, M., Riechers, M., Hinze, C., Zheng, Y., Pelz, K., Nagaraj, D., Chen, A., Loeser, A., Ruehle, A., Zamboglou, C., Alyahya, L., Uhlig, M., Machiraju, G., Weimann, K., Lippert, C., Conrad, T., Ma, J., Novoa, R., Moor, M., Hernandez-Boussard, T., Alawad, M., Salinas, J. L., Mittermaier, M., Gevaert, O.

medrxiv logopreprintMay 31 2025
A diagnostic medical foundation model (MedFM) is an artificial intelligence (AI) system engineered to accurately determine diagnoses across various medical imaging modalities and specialties. To train MedFM, we created the PubMed Central Medical Images Dataset (PMCMID), the largest annotated medical image dataset to date, comprising 16,126,659 images from 3,021,780 medical publications. Using AI- and ontology-based methods, we identified 4,482,237 medical images (e.g., clinical photos, X-rays, ultrasounds) and generated comprehensive annotations. To optimize MedFMs performance and assess biases, 13,266 images were manually annotated to establish a multimodal benchmark. MedFM achieved physician-level performance in diagnosis tasks spanning radiology, dermatology, and infectious diseases without requiring specific training. Additionally, we developed the Image2Paper app, allowing clinicians to upload medical images and retrieve relevant literature. The correct diagnoses appeared within the top ten results in 88.4% and at least one relevant differential diagnosis in 93.0%. MedFM and PMCMID were made publicly available. FundingResearch reported here was partially supported by the National Cancer Institute (NCI) (R01 CA260271), the Saudi Company for Artificial Intelligence (SCAI) Authority, and the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under the project DAKI-FWS (01MK21009E). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Page 50 of 59587 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.