RELICT-NI: Replica Detection in Synthetic Neuroimaging - A Study on Noncontrast CT and Time-of-Flight MRA.
Authors
Affiliations (5)
- CLAIM - Charité Lab for AI in Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, 10117, Berlin, Germany.
- CLAIM - Charité Lab for AI in Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, 10117, Berlin, Germany.
- Department of Neurosurgery, Mie Chuo Medical Center, 2158-5 Myojin-cho, Hisai, 514-1101, Tsu, Japan.
- Department of Neurosurgery, Mie University Graduate School of Medicine, 2-174 Edobashi, Tsu, 514-8507, Japan.
- Department of Neurosurgery, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, 10117, Berlin, Germany.
Abstract
Synthetic neuroimaging data has the potential to augment training data and improve the generalizability of deep learning models. However, memorization in generative models can lead to unintended leakage of sensitive patient information, limiting model utility and jeopardizing patient privacy. We propose RELICT-NI (REpLIca deteCTion-NeuroImaging), a framework for detecting replicas in synthetic neuroimaging datasets. RELICT-NI evaluates image similarity using three complementary approaches: (1) image-level analysis, (2) feature-level analysis via a pretrained medical foundation model, and (3) segmentation-level analysis. RELICT-NI was validated on two clinically relevant neuroimaging use cases: non-contrast head CT (NCCT) with intracerebral hemorrhage (N = 774) and time-of-flight MR angiography (TOF-MRA) of the Circle of Willis (N = 1,782). Expert visual scoring was used as the reference for identifying replicas. The replica classification performance of each method was assessed by its balanced accuracy at the optimal threshold. The reference visual rating identified 45 of 50 generated images as replicas for the NCCT use case and 5 of 50 for the TOF-MRA use case. For the NCCT use case, both image-level and feature-level analyses achieved perfect replica detection (balanced accuracy = 1) at optimal thresholds. For the TOF-MRA use case, no method classified replicas perfectly at any threshold; the segmentation-level analysis achieved the highest balanced accuracy (0.79). Replica detection is a crucial but often neglected validation step in developing deep generative models in neuroimaging. The proposed RELICT-NI framework provides a standardized, easy-to-use tool for replica detection and aims to facilitate responsible and ethical synthesis of neuroimaging data. It is an important step towards standardized and rigorous validation practices for generative models in neuroimaging.
Our method promotes the secure sharing of neuroimaging data and facilitates the development of robust deep learning models.
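The evaluation metric named in the abstract, balanced accuracy at the optimal threshold, can be illustrated with a minimal sketch. It assumes each generated image has a single similarity score (higher meaning closer to a training image) and a binary expert label (1 = replica). The function names, the threshold sweep over observed scores, and the toy values are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence, Tuple


def balanced_accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = replica)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one replica and one non-replica")
    return 0.5 * (tp / pos + tn / neg)


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Try every observed score as a cut-off (assumed sweep strategy).

    An image is called a replica when its score is >= the threshold.
    Returns (threshold, balanced accuracy) for the best cut-off found;
    ties keep the lowest threshold.
    """
    best_t, best_acc = float("nan"), -1.0
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        acc = balanced_accuracy(labels, preds)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy data (hypothetical): replicas and non-replicas separate cleanly.
separable = best_threshold([0.95, 0.90, 0.40, 0.30], [1, 1, 0, 0])

# Toy data (hypothetical): score distributions overlap, so no threshold is perfect.
overlapping = best_threshold([0.90, 0.50, 0.60, 0.20], [1, 1, 0, 0])
```

With separable scores, one threshold yields a balanced accuracy of 1, as reported for the NCCT use case. With overlapping scores, every threshold misclassifies at least one image, which is the situation described for TOF-MRA.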