Stochastic beams and where to find them: The Gumbel-top-k trick for sampling sequences without...

IAM Huijben, W Kool, MB Paulus… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …

Cited by 103 Related articles All 9 versions

[PDF] nature.com

Illuminating protein space with a programmable generative model

JB Ingraham, M Baranov, Z Costello, KW Barber… - Nature, 2023 - nature.com

Three billion years of evolution has produced a tremendous diversity of protein molecules,
but the full potential of proteins is likely to be much greater. Accessing this potential has …

Cited by 310 Related articles All 17 versions

[PDF] neurips.cc

Data selection for language models via importance resampling

SM Xie, S Santurkar, T Ma… - Advances in Neural …, 2023 - proceedings.neurips.cc

Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and
domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting …

Cited by 130 Related articles All 5 versions

[PDF] github.io

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org

The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

Cited by 432 Related articles All 11 versions

[PDF] neurips.cc

A contrastive framework for neural text generation

Y Su, T Lan, Y Wang, D Yogatama… - Advances in Neural …, 2022 - proceedings.neurips.cc

Text generation is of great importance to many natural language processing applications.
However, maximization-based decoding methods (e.g., beam search) of neural language …

Cited by 194 Related articles All 7 versions

[PDF] neurips.cc

Argmax flows and multinomial diffusion: Learning categorical distributions

E Hoogeboom, D Nielsen, P Jaini… - Advances in Neural …, 2021 - proceedings.neurips.cc

Generative flows and diffusion models have been predominantly trained on ordinal data, for
example natural images. This paper introduces two extensions of flows and diffusion for …

Cited by 364 Related articles All 7 versions

[PDF] thecvf.com

MIST: Multi-modal iterative spatial-temporal transformer for long-form video question answering

D Gao, L Zhou, L Ji, L Zhu, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

To build Video Question Answering (VideoQA) systems capable of assisting
humans in daily activities, seeking answers from long-form videos with diverse and complex …

Cited by 89 Related articles All 8 versions

[PDF] thecvf.com

Pixel contrastive-consistent semi-supervised semantic segmentation

Y Zhong, B Yuan, H Wu, Z Yuan… - Proceedings of the …, 2021 - openaccess.thecvf.com

We present a novel semi-supervised semantic segmentation method which jointly achieves
two desiderata of segmentation model regularities: the label-space consistency property …

Cited by 200 Related articles All 7 versions

[PDF] neurips.cc

Unsupervised data augmentation for consistency training

Q Xie, Z Dai, E Hovy, T Luong… - Advances in neural …, 2020 - proceedings.neurips.cc

Semi-supervised learning lately has shown much promise in improving deep learning
models when labeled data is scarce. Common among recent approaches is the use of …

Cited by 2607 Related articles All 12 versions

[PDF] neurips.cc

Self-evaluation guided beam search for reasoning

Y Xie, K Kawaguchi, Y Zhao, JX Zhao… - Advances in …, 2024 - proceedings.neurips.cc

Breaking down a problem into intermediate steps has demonstrated impressive
performance in Large Language Model (LLM) reasoning. However, the growth of the …

Cited by 68 Related articles All 6 versions