Attention Is All You Need. (NIPS), 2017
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder.
The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture …
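The architecture this entry refers to (the Transformer) is built around scaled dot-product attention, softmax(QKᵀ/√d_k)·V. A minimal NumPy sketch of that single operation, with toy shapes chosen purely for illustration (not the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D query/key/value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                              # weighted sum of values

# Toy usage: 3 queries attending over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)         # shape (3, 8)
```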
On the state of the art of evaluation in neural language models
Ongoing innovations in recurrent neural network architectures have provided a steady influx
of apparently state-of-the-art results on language modelling benchmarks. However, these …
Unsupervised opinion summarization as copycat-review generation
Opinion summarization is the task of automatically creating summaries that reflect subjective
information expressed in multiple documents, such as product reviews. While the majority of …
Robust speech recognition via large-scale weak supervision
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …
SimVLM: Simple visual language model pretraining with weak supervision
With recent progress in joint modeling of visual and textual representations, Vision-
Language Pretraining (VLP) has achieved impressive performance on many multimodal …
Unifying vision-and-language tasks via text generation
Existing methods for vision-and-language learning typically require designing task-specific
architectures and objectives for each task. For example, a multi-label answer classifier for …
Training graph neural networks with 1000 layers
Deep graph neural networks (GNNs) have achieved excellent results on various tasks on
increasingly large graph datasets with millions of nodes and edges. However, memory …
CTRL: A conditional transformer language model for controllable generation
Large-scale language models show promising text generation capabilities, but users cannot
easily control particular aspects of the generated text. We release CTRL, a 1.63 billion …
VirTex: Learning visual representations from textual annotations
The de-facto approach to many vision tasks is to start from pretrained visual representations,
typically learned via supervised training on ImageNet. Recent methods have explored …
Generalization through memorization: Nearest neighbor language models
We introduce $k$NN-LMs, which extend a pre-trained neural language model (LM) by
linearly interpolating it with a $k$-nearest neighbors ($k$NN) model. The nearest …
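The interpolation this snippet describes is $p(y|x) = \lambda \, p_{kNN}(y|x) + (1-\lambda) \, p_{LM}(y|x)$. A minimal sketch under assumed details: a flat NumPy datastore, squared-L2 distance, and illustrative defaults k=8 and λ=0.25 (the paper tunes these per dataset):

```python
import numpy as np

def knn_lm_probs(p_lm, context_vec, keys, next_tokens, vocab_size, k=8, lam=0.25):
    """Blend an LM's next-token distribution with a kNN distribution:
    p(y|x) = lam * p_kNN(y|x) + (1 - lam) * p_LM(y|x).

    keys:        (N, d) stored context vectors
    next_tokens: (N,)   token id observed after each stored context
    """
    dists = ((keys - context_vec) ** 2).sum(axis=1)  # squared L2 to every key
    nearest = np.argsort(dists)[:k]                  # indices of k closest keys
    w = np.exp(-(dists[nearest] - dists[nearest].min()))
    w /= w.sum()                                     # softmax over -distance
    p_knn = np.zeros(vocab_size)
    for i, weight in zip(nearest, w):
        p_knn[next_tokens[i]] += weight              # mass on neighbor's next token
    return lam * p_knn + (1 - lam) * p_lm
```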