Deep learning on a data diet: Finding important examples early in training

M Paul, S Ganguli… - Advances in neural …, 2021 - proceedings.neurips.cc
Recent success in deep learning has partially been driven by training increasingly
overparametrized networks on ever larger datasets. It is therefore natural to ask: how much …
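
The score the title alludes to can be computed cheaply early in training. Below is a minimal PyTorch sketch of an EL2N-style error-norm score; the paper also proposes the gradient-norm GraNd score and averages scores across several early checkpoints and runs, which this one-shot version omits. `model` and `loader` are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def el2n_scores(model, loader, device="cpu"):
    """EL2N-style score: L2 norm of the prediction error (softmax - one-hot).
    High-scoring examples are kept; low-scoring ones are candidates for pruning."""
    model.eval()
    scores = []
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            probs = F.softmax(model(x), dim=1)
            onehot = F.one_hot(y, num_classes=probs.shape[1]).float()
            scores.append((probs - onehot).norm(dim=1))
    return torch.cat(scores)
```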

Overview of deep learning-based CSI feedback in massive MIMO systems

J Guo, CK Wen, S Jin, GY Li - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Many performance gains achieved by massive multiple-input multiple-output (MIMO) depend on
the accuracy of the downlink channel state information (CSI) at the transmitter (base station) …
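
The dominant approach surveyed there treats CSI feedback as learned compression: an autoencoder whose encoder runs at the user equipment and whose decoder runs at the base station. A minimal CsiNet-flavoured sketch follows; dimensions are illustrative, and the Tanh output assumes channel entries normalized to [-1, 1].

```python
import torch.nn as nn

class CSIFeedbackAE(nn.Module):
    """Encoder (at the UE) compresses the channel matrix into a short
    codeword for feedback; decoder (at the BS) reconstructs it."""
    def __init__(self, n_ant=32, n_sub=32, code_dim=128):
        super().__init__()
        flat = 2 * n_ant * n_sub            # real + imaginary parts, flattened
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(flat, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, flat), nn.Tanh())

    def forward(self, h):                   # h: (batch, 2, n_ant, n_sub)
        code = self.encoder(h)              # low-rate feedback payload
        return self.decoder(code).view_as(h)
```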

GRAD-MATCH: Gradient matching based data subset selection for efficient deep model training

K Killamsetty, S Durga… - International …, 2021 - proceedings.mlr.press
The great success of modern machine learning models on large datasets is contingent on
extensive computational resources with high financial and environmental costs. One way to …
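
The core idea is to pick a weighted subset whose gradient sum tracks the full-data gradient. A one-shot matching-pursuit sketch over precomputed per-example gradients is shown below; GRAD-MATCH itself uses orthogonal matching pursuit with learned per-example weights and last-layer gradient approximations, so this is an illustration rather than the paper's algorithm.

```python
import numpy as np

def gradient_match_select(G, budget):
    """Greedily pick rows of G (per-example gradients, shape (n, d)) so that
    the mean gradient of the chosen subset approximates the full mean."""
    target = G.mean(axis=0)
    residual = target.copy()
    chosen = []
    for _ in range(budget):
        scores = G @ residual               # alignment with the unexplained part
        scores[chosen] = -np.inf            # no repeats
        i = int(np.argmax(scores))
        chosen.append(i)
        residual -= G[i] / budget           # each pick covers 1/budget of the mean
    return chosen
```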

GLISTER: Generalization based data subset selection for efficient and robust learning

K Killamsetty, D Sivasubramanian… - Proceedings of the …, 2021 - ojs.aaai.org
Large-scale machine learning and deep models are extremely data-hungry. Unfortunately,
obtaining large amounts of labeled data is expensive, and training state-of-the-art models …
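
GLISTER frames selection as bilevel optimization: choose the training subset that, after a model update, most improves held-out validation loss. To first order, that reduces to scoring each example by the inner product of its gradient with the validation gradient. The one-shot sketch below illustrates this; the actual algorithm selects greedily and re-estimates gradients along the way.

```python
import numpy as np

def glister_style_select(G_train, g_val, budget, lr=0.1):
    """Score each training example by the first-order drop in validation
    loss from one SGD step on it, lr * <g_i, g_val>, and keep the top-k."""
    gains = lr * (G_train @ g_val)
    return np.argsort(-gains)[:budget]
```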

GCR: Gradient coreset based replay buffer selection for continual learning

R Tiwari, K Killamsetty, R Iyer… - Proceedings of the …, 2022 - openaccess.thecvf.com
Continual learning (CL) aims to develop techniques by which a single model adapts to an
increasing number of tasks encountered sequentially, thereby potentially leveraging …
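
In the continual-learning setting, the same gradient-coreset idea decides which past examples are retained for replay. GCR solves a weighted gradient-matching problem per task; the streaming sketch below merely evicts the buffer entry whose gradient is least aligned with a running mean gradient, as a simplified illustration of gradient-scored retention.

```python
import numpy as np

class GradientScoredBuffer:
    """Replay buffer that keeps examples whose gradients best summarize
    what has been seen, instead of sampling uniformly."""
    def __init__(self, capacity):
        self.capacity, self.items, self.scores = capacity, [], []
        self.mean_g = None

    def add(self, example, g):              # g: this example's gradient vector
        self.mean_g = g if self.mean_g is None else 0.99 * self.mean_g + 0.01 * g
        score = float(g @ self.mean_g)
        if len(self.items) < self.capacity:
            self.items.append(example); self.scores.append(score)
        else:
            j = int(np.argmin(self.scores))
            if score > self.scores[j]:      # evict the least representative entry
                self.items[j], self.scores[j] = example, score
```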

Coresets for data-efficient training of machine learning models

B Mirzasoleiman, J Bilmes… - … Conference on Machine …, 2020 - proceedings.mlr.press
Incremental gradient (IG) methods, such as stochastic gradient descent and its variants, are
commonly used for large scale optimization in machine learning. Despite the sustained effort …
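
Their coreset construction is submodular: greedily pick examples so that every training point has a representative with a similar gradient, then weight each pick by how many points it represents. A facility-location sketch with plain dot-product similarity follows; the paper works with upper bounds on gradient differences rather than raw gradients.

```python
import numpy as np

def craig_style_coreset(G, budget):
    """Greedy facility location over gradient similarity.
    G: per-example gradients, shape (n, d); budget <= n."""
    sim = G @ G.T                               # (n, n) pairwise similarity
    n = len(G)
    chosen, best = [], np.full(n, -np.inf)      # best coverage so far per point
    for _ in range(budget):
        gains = np.maximum(sim, best).sum(axis=1)   # coverage if candidate is added
        gains[chosen] = -np.inf
        i = int(np.argmax(gains))
        chosen.append(i)
        best = np.maximum(best, sim[i])
    weights = np.bincount(sim[:, chosen].argmax(axis=1), minlength=budget)
    return chosen, weights                      # weight = cluster size each pick represents
```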

Random features for kernel approximation: A survey on algorithms, theory, and beyond

F Liu, X Huang, Y Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The class of random features is one of the most popular techniques to speed up kernel
methods in large-scale problems. Related works have been recognized by the NeurIPS Test …
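
The canonical instance is random Fourier features (Rahimi and Recht): sample frequencies from the kernel's spectral density so that an explicit low-dimensional feature map approximates the kernel, z(x)·z(y) ≈ k(x, y). A sketch for the RBF kernel:

```python
import numpy as np

def random_fourier_features(X, n_features=256, gamma=1.0, seed=0):
    """Feature map approximating k(x, y) = exp(-gamma * ||x - y||^2).
    Frequencies are Gaussian with std sqrt(2 * gamma), the RBF spectral density."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

A linear model fit on these features then approximates the corresponding kernel machine at cost linear in the number of examples.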

Optimal experimental design: Formulations and computations

X Huan, J Jagalur, Y Marzouk - Acta Numerica, 2024 - cambridge.org
Questions of 'how best to acquire data' are essential to modelling and prediction in the
natural and social sciences, engineering applications, and beyond. Optimal experimental …
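
One classical formulation covered by such surveys is D-optimal design for linear models: choose measurements maximizing the log-determinant of the Fisher information matrix. By the matrix determinant lemma, each greedy step only needs the quadratic form x^T M^{-1} x; a sketch over a finite candidate pool:

```python
import numpy as np

def greedy_d_optimal(X, budget, reg=1e-6):
    """Greedy D-optimal design: repeatedly add the candidate row of X that
    most increases log det(M), M = X_S^T X_S + reg * I, since
    log det(M + x x^T) - log det(M) = log(1 + x^T M^{-1} x)."""
    d = X.shape[1]
    M = reg * np.eye(d)                      # regularized information matrix
    chosen = []
    for _ in range(budget):
        Minv = np.linalg.inv(M)
        gains = np.einsum("nd,dk,nk->n", X, Minv, X)   # x^T M^{-1} x per candidate
        gains[chosen] = -np.inf
        i = int(np.argmax(gains))
        chosen.append(i)
        M += np.outer(X[i], X[i])
    return chosen
```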

RETRIEVE: Coreset selection for efficient and robust semi-supervised learning

K Killamsetty, X Zhao, F Chen… - Advances in neural …, 2021 - proceedings.neurips.cc
Semi-supervised learning (SSL) algorithms have had great success in recent years in
limited labeled data regimes. However, the current state-of-the-art SSL algorithms are …

Compressed gastric image generation based on soft-label dataset distillation for medical data sharing

G Li, R Togo, T Ogawa, M Haseyama - Computer Methods and Programs in …, 2022 - Elsevier
Background and objective: Sharing of medical data is required to enable the cross-agency
flow of healthcare information and construct high-accuracy computer-aided diagnosis …
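
Dataset distillation learns a tiny synthetic dataset, here with learnable soft labels as in the soft-label variant, such that a model trained on it performs well on real data; only the synthetic set then needs to be shared. Below is a minimal differentiate-through-one-SGD-step sketch with a linear probe; dimensions are illustrative and this is not the paper's medical-imaging pipeline.

```python
import torch
import torch.nn.functional as F

d, c, m = 64, 10, 20                        # feature dim, classes, synthetic set size
syn_x = torch.randn(m, d, requires_grad=True)       # learnable synthetic inputs
syn_y = torch.zeros(m, c, requires_grad=True)       # logits of learnable soft labels
opt = torch.optim.Adam([syn_x, syn_y], lr=0.01)

def distill_step(real_x, real_y, inner_lr=0.5):
    """Train a fresh linear probe one SGD step on the synthetic set, measure
    its loss on real data, and backprop through the step into syn_x, syn_y."""
    w = (0.01 * torch.randn(d, c)).requires_grad_()
    inner = F.cross_entropy(syn_x @ w, syn_y.softmax(dim=1))
    (g,) = torch.autograd.grad(inner, w, create_graph=True)
    w1 = w - inner_lr * g                   # one differentiable SGD step
    outer = F.cross_entropy(real_x @ w1, real_y)    # how well the distilled set taught
    opt.zero_grad(); outer.backward(); opt.step()
    return outer.item()
```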