A review of the gumbel-max trick and its extensions for discrete stochasticity in machine learning
IAM Huijben, W Kool, MB Paulus… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …
Illuminating protein space with a programmable generative model
Three billion years of evolution has produced a tremendous diversity of protein molecules,
but the full potential of proteins is likely to be much greater. Accessing this potential has …
Data selection for language models via importance resampling
Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and
domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting …
Graph neural networks: foundation, frontiers and applications
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …
A contrastive framework for neural text generation
Text generation is of great importance to many natural language processing applications.
However, maximization-based decoding methods (e.g., beam search) of neural language …
Argmax flows and multinomial diffusion: Learning categorical distributions
E Hoogeboom, D Nielsen, P Jaini… - Advances in Neural …, 2021 - proceedings.neurips.cc
Generative flows and diffusion models have been predominantly trained on ordinal data, for
example natural images. This paper introduces two extensions of flows and diffusion for …
MIST: Multi-modal iterative spatial-temporal transformer for long-form video question answering
To build Video Question Answering (VideoQA) systems capable of assisting
humans in daily activities, seeking answers from long-form videos with diverse and complex …
Pixel contrastive-consistent semi-supervised semantic segmentation
We present a novel semi-supervised semantic segmentation method which jointly achieves
two desiderata of segmentation model regularities: the label-space consistency property …
Unsupervised data augmentation for consistency training
Semi-supervised learning lately has shown much promise in improving deep learning
models when labeled data is scarce. Common among recent approaches is the use of …
Self-evaluation guided beam search for reasoning
Breaking down a problem into intermediate steps has demonstrated impressive
performance in Large Language Model (LLM) reasoning. However, the growth of the …