Optimal experimental design for infinite-dimensional Bayesian inverse problems governed by PDEs: A review

A Alexanderian - Inverse Problems, 2021 - iopscience.iop.org
We present a review of methods for optimal experimental design (OED) for Bayesian inverse
problems governed by partial differential equations with infinite-dimensional parameters …

GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration

J Gardner, G Pleiss, KQ Weinberger… - Advances in neural …, 2018 - proceedings.neurips.cc
Despite advances in scalable models, the inference tools used for Gaussian processes
(GPs) have yet to fully capitalize on developments in computing hardware. We present an …
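The blackbox matrix-matrix idea is to reduce GP inference to iterative routines that only touch the kernel matrix through matrix-vector products, replacing a Cholesky factorization with conjugate gradients. A minimal NumPy sketch of that core primitive, assuming a toy RBF kernel (the kernel, lengthscale, and jitter values here are illustrative, not GPyTorch's API):

```python
import numpy as np

def conjugate_gradients(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for SPD A using only matrix-vector products with A."""
    x = np.zeros_like(b)
    r = b - matvec(x)          # initial residual
    p = r.copy()               # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy RBF kernel matrix with jitter, standing in for a GP covariance.
pts = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * (pts[:, None] - pts[None, :]) ** 2 / 0.1) + 1e-4 * np.eye(20)
y = np.sin(2 * np.pi * pts)
x = conjugate_gradients(lambda v: K @ v, y)   # x approximates K^{-1} y
```

Because the solver only ever calls `matvec`, the same loop works when the kernel matrix is never formed explicitly, which is what makes GPU batching of these products attractive.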

PyHessian: Neural networks through the lens of the Hessian

Z Yao, A Gholami, K Keutzer… - 2020 IEEE international …, 2020 - ieeexplore.ieee.org
We present PyHessian, a new scalable framework that enables fast computation of
Hessian (i.e., second-order derivative) information for deep neural networks. PyHessian …
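The primitive underlying such tools is the Hessian-vector product, which exposes curvature without ever materializing the Hessian. A hedged sketch, assuming a finite-difference approximation of the gradient map (PyHessian itself computes exact products via automatic differentiation; the quadratic test problem is illustrative):

```python
import numpy as np

def hvp(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) @ v by central differences of the gradient,
    avoiding explicit construction of the Hessian."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

# Quadratic test problem f(x) = 0.5 x^T A x, so grad(x) = A x and H = A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x
x0 = np.array([1.0, -2.0])
v = np.array([1.0, 1.0])
Hv = hvp(grad, x0, v)   # approximates A @ v = [5.0, 4.0]
```

Power iteration on `hvp` then yields top Hessian eigenvalues, and randomized probes yield trace and spectral-density estimates, all at the cost of a handful of gradient evaluations per product.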

HAWQ-V2: Hessian aware trace-weighted quantization of neural networks

Z Dong, Z Yao, D Arfeen, A Gholami… - Advances in neural …, 2020 - proceedings.neurips.cc
Quantization is an effective method for reducing the memory footprint and inference time of
neural networks. However, ultra-low-precision quantization could lead to significant …
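Trace-weighted schemes need the Hessian trace per layer, which is typically estimated with Hutchinson's method: for Rademacher vectors v, E[vᵀHv] = tr(H), so the trace follows from Hessian-vector products alone. A minimal sketch, assuming an explicit toy "Hessian" in place of a network's curvature operator:

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=1000, seed=0):
    """Estimate tr(H) from Hessian-vector products only:
    E[v^T H v] = tr(H) when v has i.i.d. Rademacher entries."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        total += v @ hvp(v)
    return total / n_samples

# Toy symmetric "Hessian" with known trace 6.0.
H = np.array([[2.0, 0.1, 0.0],
              [0.1, 3.0, 0.2],
              [0.0, 0.2, 1.0]])
est = hutchinson_trace(lambda v: H @ v, dim=3)
```

Layers whose estimated trace is large are then kept at higher precision, since their loss is more sensitive to quantization perturbations.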

How to train your neural ODE: the world of Jacobian and kinetic regularization

C Finlay, JH Jacobsen, L Nurbekyan… - … on machine learning, 2020 - proceedings.mlr.press
Training neural ODEs on large datasets has not been tractable due to the necessity of
allowing the adaptive numerical ODE solver to refine its step size to very small values. In …
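Kinetic regularization penalizes the integral of ‖f‖² along the learned trajectory, encouraging straight, slowly-varying flows that adaptive solvers can integrate with large steps. A sketch of the accumulation, assuming a fixed-step forward-Euler integrator as a stand-in for the adaptive solver (function names and the linear test flow are illustrative):

```python
import numpy as np

def integrate_with_kinetic_penalty(f, x0, t0=0.0, t1=1.0, n_steps=50):
    """Forward-Euler integration of dx/dt = f(t, x) that also accumulates
    the kinetic regularizer: the time-average of ||f||^2 along the path."""
    x = np.asarray(x0, dtype=float)
    dt = (t1 - t0) / n_steps
    kinetic = 0.0
    for i in range(n_steps):
        v = f(t0 + i * dt, x)
        kinetic += float(v @ v) * dt   # Riemann-sum estimate of the integral
        x = x + dt * v
    return x, kinetic / (t1 - t0)

# Linear flow dx/dt = -x: the trajectory contracts toward the origin.
xT, kin = integrate_with_kinetic_penalty(lambda t, x: -x, np.array([1.0, 1.0]))
```

In training, `kin` would be added to the task loss with a weight, so that gradient descent trades a small amount of fit for trajectories the solver finds cheap.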

Gradient norm aware minimization seeks first-order flatness and improves generalization

X Zhang, R Xu, H Yu, H Zou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recently, flat minima have been shown to be effective for improving generalization, and
sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the current definition of …
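For context, the SAM baseline that this line of work builds on evaluates the gradient at an adversarially perturbed point and applies it at the original weights. A hedged sketch on a toy quadratic (the gradient-norm-aware variant in the paper modifies this objective; step sizes and the test loss here are illustrative):

```python
import numpy as np

def sam_step(loss_grad, w, lr=0.1, rho=0.05):
    """One sharpness-aware minimization step: ascend to the worst point
    within a radius-rho ball, then descend using that point's gradient."""
    g = loss_grad(w)
    g_norm = np.linalg.norm(g) + 1e-12
    w_adv = w + rho * g / g_norm       # inner ascent step
    return w - lr * loss_grad(w_adv)   # outer descent at the original w

# Toy loss L(w) = 0.5 ||w||^2, whose gradient is simply w.
grad = lambda w: w
w = np.array([2.0, -1.0])
for _ in range(50):
    w = sam_step(grad, w)
```

The perturbation radius `rho` controls how far the inner step probes for sharpness; first-order flatness measures such as the gradient norm refine what "worst nearby point" means.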

Generative data augmentation for commonsense reasoning

Y Yang, C Malaviya, J Fernandez… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent advances in commonsense reasoning depend on large-scale human-annotated
training data to achieve peak performance. However, manual curation of training examples …

There are many consistent explanations of unlabeled data: Why you should average

B Athiwaratkun, M Finzi, P Izmailov… - arXiv preprint arXiv …, 2018 - arxiv.org
Presently, the most successful approaches to semi-supervised learning are based on
consistency regularization, whereby a model is trained to be robust to small perturbations of …
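The averaging argument is that late SGD iterates bounce around a flat region, so their mean lands nearer the center than any single iterate. A minimal sketch on a noisy quadratic, assuming plain tail averaging (the schedule and constants are illustrative, not the paper's training setup):

```python
import numpy as np

def average_weights(trajectory):
    """Average the weights collected along an SGD trajectory:
    the mean of late iterates is typically closer to the optimum
    than the final iterate when gradients are noisy."""
    return np.mean(np.stack(trajectory), axis=0)

# SGD on 0.5||w||^2 with noisy gradients; iterates jitter around w* = 0.
rng = np.random.default_rng(0)
w = np.array([3.0, -3.0])
trajectory = []
for _ in range(500):
    g = w + rng.normal(scale=1.0, size=2)   # noisy gradient
    w = w - 0.1 * g
    trajectory.append(w.copy())
w_avg = average_weights(trajectory[100:])   # discard the burn-in, average the tail
```

With consistency-regularized objectives, the same recipe applies to network weights rather than a two-dimensional vector; the averaged solution sits in the flatter part of the loss surface.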

The Hessian penalty: A weak prior for unsupervised disentanglement

W Peebles, J Peebles, JY Zhu, A Efros… - Computer Vision–ECCV …, 2020 - Springer
Existing disentanglement methods for deep generative models rely on hand-picked priors
and complex encoder-based architectures. In this paper, we propose the Hessian Penalty, a …
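The penalty targets the off-diagonal Hessian entries of the generator with respect to its latent code, estimated without any explicit Hessian: for Rademacher vectors v, the variance of vᵀHv isolates the off-diagonal terms, and vᵀHv itself comes from a second-order finite difference. A sketch under those assumptions (the toy scalar functions and sample counts are illustrative):

```python
import numpy as np

def hessian_penalty(f, x, eps=0.1, n_samples=200, seed=0):
    """Monte-Carlo estimate of an off-diagonal Hessian penalty:
    Var_v[v^T H v] over Rademacher v, where v^T H v is approximated
    by a second-order central finite difference of f."""
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=x.shape)
        samples.append((f(x + eps * v) - 2 * f(x) + f(x - eps * v)) / eps ** 2)
    return np.var(samples)

x = np.array([0.5, -0.3])
separable = lambda z: z[0] ** 2 + 3 * z[1] ** 2   # diagonal Hessian: penalty ~ 0
entangled = lambda z: z[0] * z[1]                  # off-diagonal Hessian: penalty > 0
```

For the separable function every probe returns the same value of vᵀHv, so the variance vanishes; any cross-term between latent dimensions makes the probes disagree and the penalty positive.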

Network quantization with element-wise gradient scaling

J Lee, D Kim, B Ham - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Network quantization aims at reducing the bit-widths of weights and/or activations, which is
particularly important for implementing deep neural networks with limited hardware resources. Most …
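Because rounding has zero gradient almost everywhere, quantized training usually backpropagates through it with the straight-through estimator; element-wise gradient scaling instead rescales each gradient component using its sign and the per-element quantization error. The sketch below is a hedged approximation of that idea, not the paper's exact rule or implementation (the quantizer, `delta`, and all values are illustrative):

```python
import numpy as np

def quantize(x, n_levels=4):
    """Uniform quantization of x in [0, 1] to n_levels discrete values."""
    return np.round(x * (n_levels - 1)) / (n_levels - 1)

def ewgs_backward(grad_out, x, x_q, delta=0.2):
    """Backward pass sketch: the plain straight-through estimator copies
    grad_out unchanged through the rounding (delta=0); here each element
    is additionally scaled by its sign times the quantization error."""
    return grad_out * (1.0 + delta * np.sign(grad_out) * (x - x_q))

x = np.array([0.10, 0.40, 0.85])
x_q = quantize(x)                                   # forward: [0, 1/3, 1]
g = ewgs_backward(np.array([1.0, -1.0, 1.0]), x, x_q)
```

Setting `delta=0` recovers the straight-through estimator exactly, so the scaling can be read as a per-element correction on top of that baseline.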