Data augmentation using LLMs: Data perspectives, learning paradigms and challenges

B Ding, C Qin, R Zhao, T Luo, X Li… - Findings of the …, 2024 - aclanthology.org
In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has
emerged as a pivotal technique for enhancing model performance by diversifying training …

A survey of generative adversarial networks for synthesizing structured electronic health records

GO Ghosheh, J Li, T Zhu - ACM Computing Surveys, 2024 - dl.acm.org
Electronic Health Records (EHRs) are a valuable asset to facilitate clinical research and
point of care applications; however, many challenges such as data privacy concerns impede …

A comprehensive survey on generative diffusion models for structured data

H Koo, TE Kim - arXiv e-prints, 2023 - ui.adsabs.harvard.edu
In recent years, generative diffusion models have achieved a rapid paradigm shift in deep
generative models by showing groundbreaking performance across various applications …

A survey on diffusion models for time series and spatio-temporal data

Y Yang, M Jin, H Wen, C Zhang, Y Liang, L Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
The study of time series data is crucial for understanding trends and anomalies over time,
enabling predictive insights across various sectors. Spatio-temporal data, on the other hand …

Guided discrete diffusion for electronic health record generation

J Han, Z Chen, Y Li, Y Kou, E Halperin… - arXiv preprint arXiv …, 2024 - arxiv.org
Electronic health records (EHRs) are a pivotal data source that enables numerous
applications in computational medicine, e.g., disease progression prediction, clinical trial …

Fast and reliable generation of EHR time series via diffusion models

M Tian, B Chen, A Guo, S Jiang, AR Zhang - arXiv preprint arXiv …, 2023 - arxiv.org
Electronic Health Records (EHRs) are rich sources of patient-level data, including laboratory
tests, medications, and diagnoses, offering valuable resources for medical data analysis …

A survey of research on electronic health record data generation with diffusion models.

魏博伦, 张贤坤 - Application Research of Computers …, 2024 - search.ebscohost.com
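Electronic health record (EHR) data in the medical domain contain a wealth of valuable biomedical knowledge
and provide an important resource for medical data analysis. However, privacy protection and restrictions on data sharing have become the main bottleneck for research …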

A Survey on Generative Diffusion Models for Structured Data

H Koo - arXiv preprint arXiv:2306.04139, 2023 - arxiv.org
In recent years, generative diffusion models have achieved a rapid paradigm shift in deep
generative models by showing groundbreaking performance across various applications …

SynSUM--Synthetic Benchmark with Structured and Unstructured Medical Records

P Rabaey, H Arno, S Heytens, T Demeester - arXiv preprint arXiv …, 2024 - arxiv.org
We present the SynSUM benchmark, a synthetic dataset linking unstructured clinical notes
to structured background variables. The dataset consists of 10,000 artificial patient records …

Fast Sampling via De-randomization for Discrete Diffusion Models

Z Chen, H Yuan, Y Li, Y Kou, J Zhang, Q Gu - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models have emerged as powerful tools for high-quality data generation, such as
image generation. Despite its success in continuous spaces, discrete diffusion models …