Machine learning for synthetic data generation: a review

Y Lu, M Shen, H Wang, X Wang, C van Rechem… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning heavily relies on data, but real-world applications often encounter various
data-related issues. These include data of poor quality, insufficient data points leading to …

[BOOK][B] Synthetic data for deep learning

SI Nikolenko - 2021 - Springer
You are holding in your hands… oh, come on, who holds books like this in their hands
anymore? Anyway, you are reading this, and it means that I have managed to release one of …

More than privacy: Adopting differential privacy in game-theoretic mechanism design

L Zhang, T Zhu, P Xiong, W Zhou, PS Yu - ACM Computing Surveys …, 2021 - dl.acm.org
The vast majority of artificial intelligence solutions are founded on game theory, and
differential privacy is emerging as perhaps the most rigorous and widely adopted privacy …

Differentially private diffusion models

T Dockhorn, T Cao, A Vahdat, K Kreis - arXiv preprint arXiv:2210.09929, 2022 - arxiv.org
While modern machine learning models rely on increasingly large training datasets, data is
often limited in privacy-sensitive domains. Generative models trained with differential privacy …

DP-CGAN: Differentially private synthetic data and label generation

R Torkzadehmahani, P Kairouz… - Proceedings of the …, 2019 - openaccess.thecvf.com
Generative Adversarial Networks (GANs) are one of the well-known models to
generate synthetic data including images, especially for research communities that cannot …

Generative models for effective ML on private, decentralized datasets

S Augenstein, HB McMahan, D Ramage… - arXiv preprint arXiv …, 2019 - arxiv.org
To improve real-world applications of machine learning, experienced modelers develop
intuition about their datasets, their models, and how the two interact. Manual inspection of …

GS-WGAN: A gradient-sanitized approach for learning differentially private generators

D Chen, T Orekondy, M Fritz - Advances in Neural …, 2020 - proceedings.neurips.cc
The wide-spread availability of rich data has fueled the growth of machine learning
applications in numerous domains. However, growth in domains with highly-sensitive data …

Winning the NIST Contest: A scalable and general approach to differentially private synthetic data

R McKenna, G Miklau, D Sheldon - arXiv preprint arXiv:2108.04978, 2021 - arxiv.org
We propose a general approach for differentially private synthetic data generation, that
consists of three steps: (1) select a collection of low-dimensional marginals, (2) measure …

Using GANs for sharing networked time series data: Challenges, initial promise, and open questions

Z Lin, A Jain, C Wang, G Fanti, V Sekar - Proceedings of the ACM …, 2020 - dl.acm.org
Limited data access is a longstanding barrier to data-driven research and development in
the networked systems community. In this work, we explore if and how generative …

PrivSyn: Differentially private data synthesis

Z Zhang, T Wang, N Li, J Honorio, M Backes… - 30th USENIX Security …, 2021 - usenix.org
In differential privacy (DP), a challenging problem is to generate synthetic datasets that
efficiently capture the useful information in the private data. The synthetic dataset enables …