Synthetic data generation for tabular health records: A systematic review

M Hernandez, G Epelde, A Alberdi, R Cilla, D Rankin - Neurocomputing, 2022 - Elsevier
Synthetic data generation (SDG) research has been ongoing for some time with promising
results in different application domains, including healthcare, biometrics and energy …

A survey of generative adversarial networks for synthesizing structured electronic health records

GO Ghosheh, J Li, T Zhu - ACM Computing Surveys, 2024 - dl.acm.org
Electronic Health Records (EHRs) are a valuable asset to facilitate clinical research and
point of care applications; however, many challenges such as data privacy concerns impede …

Evaluating identity disclosure risk in fully synthetic health data: model development and validation

K El Emam, L Mosquera, J Bass - Journal of Medical Internet Research, 2020 - jmir.org
Background: There has been growing interest in data synthesis for enabling the sharing of
data for secondary analysis; however, there is a need for a comprehensive privacy risk …

Can synthetic data be a proxy for real clinical trial data? A validation study

Z Azizi, C Zheng, L Mosquera, L Pilote, K El Emam - BMJ Open, 2021 - bmjopen.bmj.com
Objectives: There are increasing requirements to make research data, especially clinical trial
data, more broadly available for secondary analyses. However, data availability remains a …

Utility metrics for evaluating synthetic health data generation methods: validation study

K El Emam, L Mosquera, X Fang… - JMIR medical …, 2022 - medinform.jmir.org
Background: A regular task by developers and users of synthetic data generation (SDG)
methods is to evaluate and compare the utility of these methods. Multiple utility metrics have …

Synthetic tabular data evaluation in the health domain covering resemblance, utility, and privacy dimensions

M Hernandez, G Epelde, A Alberdi… - … of information in …, 2023 - thieme-connect.com
Background: Synthetic tabular data generation is a potentially valuable technology with great
promise for data augmentation and privacy preservation. However, prior to adoption, an …

FoGGAN: Generating realistic Parkinson's disease freezing of gait data using GANs

N Peppes, P Tsakanikas, E Daskalakis, T Alexakis… - Sensors, 2023 - mdpi.com
Data scarcity in the healthcare domain is a major drawback for most state-of-the-art
technologies engaging artificial intelligence. The unavailability of quality data due to both …

Validating a membership disclosure metric for synthetic health data

K El Emam, L Mosquera, X Fang - JAMIA Open, 2022 - academic.oup.com
Background: One of the increasingly accepted methods to evaluate the privacy of synthetic
data is by measuring the risk of membership disclosure. This is a measure of the F1 …

Optimizing the synthesis of clinical trial data using sequential trees

K El Emam, L Mosquera, C Zheng - Journal of the American …, 2021 - academic.oup.com
Objective: With the growing demand for sharing clinical trial data, scalable methods to
enable privacy protective access to high-utility data are needed. Data synthesis is one such …

Synthetic electronic health records generated with variational graph autoencoders

G Nikolentzos, M Vazirgiannis, C Xypolopoulos… - NPJ Digital …, 2023 - nature.com
Data-driven medical care delivery must always respect patient privacy—a requirement that
is not easily met. This issue has impeded improvements to healthcare software and has …