Beyond privacy: Navigating the opportunities and challenges of synthetic data

B van Breugel, M van der Schaar - arXiv preprint arXiv:2304.03722, 2023 - arxiv.org
Generating synthetic data through generative models is gaining interest in the ML
community and beyond. In the past, synthetic data was often regarded as a means to private …

A deep learning approach for intrusion detection in Internet of Things using focal loss function

AS Dina, AB Siddique, D Manivannan - Internet of Things, 2023 - Elsevier
Abstract: Internet of Things (IoT) is likely to revolutionize healthcare, energy, education,
transportation, manufacturing, military, agriculture, and other industries. However, for the …

Attribute-Centric and Synthetic Data Based Privacy Preserving Methods: A Systematic Review

A Majeed - Journal of Cybersecurity and Privacy, 2023 - mdpi.com
Anonymization techniques are widely used to make personal data broadly available for
analytics/data-mining purposes while preserving the privacy of the personal information …

Synthetic data, real errors: how (not) to publish and use synthetic data

B van Breugel, Z Qian… - … on Machine Learning, 2023 - proceedings.mlr.press
Generating synthetic data through generative models is gaining interest in the ML
community and beyond, promising a future where datasets can be tailored to individual …

Can you rely on your model evaluation? Improving model evaluation with synthetic test data

B van Breugel, N Seedat, F Imrie… - Advances in Neural …, 2024 - proceedings.neurips.cc
Evaluating the performance of machine learning models on diverse and underrepresented
subgroups is essential for ensuring fairness and reliability in real-world applications …

Addressing the class imbalance problem in network intrusion detection systems using data resampling and deep learning

A Abdelkhalek, M Mashaly - The Journal of Supercomputing, 2023 - Springer
Network intrusion detection systems (NIDS) are the most common tool used to detect
malicious attacks on a network. They help prevent the ever-increasing different attacks and …

Reimagining synthetic tabular data generation through data-centric AI: A comprehensive benchmark

L Hansen, N Seedat… - Advances in Neural …, 2023 - proceedings.neurips.cc
Synthetic data serves as an alternative in training machine learning models, particularly
when real-world data is limited or inaccessible. However, ensuring that synthetic data …

Synthetic data in biomedicine via generative artificial intelligence

B van Breugel, T Liu, D Oglic… - Nature Reviews …, 2024 - nature.com
The creation and application of data in biomedicine and healthcare often face privacy
constraints, bias, distributional shifts, underrepresentation of certain groups and data …

MRI-based radiomics combined with deep learning for distinguishing IDH-mutant WHO grade 4 astrocytomas from IDH-wild-type glioblastomas

SA Hosseini, E Hosseini, G Hajianfar, I Shiri, S Servaes… - Cancers, 2023 - mdpi.com
Simple Summary: To differentiate IDH-mutant grade 4 astrocytomas from IDH-wild-type
glioblastomas, two MRI sequences (post-contrast T1 and T2-FLAIR) were acquired from 57 …

CTGAN-MOS: Conditional generative adversarial network based minority-class-augmented oversampling scheme for imbalanced problems

A Majeed, SO Hwang - IEEE Access, 2023 - ieeexplore.ieee.org
This paper proposes a novel data augmentation scheme called the conditional generative
adversarial network minority-class-augmented oversampling scheme (CTGAN-MOS) for …