Language modeling with gated convolutional networks

YN Dauphin, A Fan, M Auli… - … conference on machine …, 2017 - proceedings.mlr.press
The predominant approach to language modeling to date is based on recurrent neural
networks. Their success on this task is often linked to their ability to capture unbounded …
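The core mechanism of this paper is the gated linear unit (GLU), which computes h = (X∗W + b) ⊗ σ(X∗V + c) over a causal convolution. A minimal PyTorch sketch of one such layer; kernel width and channel count are illustrative:

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """One gated convolutional layer: h = conv(x) * sigmoid(gate(x))."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        # Left-pad so the convolution is causal and cannot see future tokens.
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)
        self.conv = nn.Conv1d(channels, channels, kernel_size)  # X*W + b
        self.gate = nn.Conv1d(channels, channels, kernel_size)  # X*V + c

    def forward(self, x):  # x: (batch, channels, seq_len)
        x = self.pad(x)
        return self.conv(x) * torch.sigmoid(self.gate(x))
```

Stacking such blocks gives a fixed, bounded receptive field rather than the unbounded history of an RNN, which the paper argues is sufficient in practice.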

Recurrent neural networks with top-k gains for session-based recommendations

B Hidasi, A Karatzoglou - Proceedings of the 27th ACM international …, 2018 - dl.acm.org
RNNs have been shown to be excellent models for sequential data, and in particular for data
that is generated by users in a session-based manner. The use of RNNs provides …
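The paper's main contribution is a family of ranking losses over sampled negative items; its BPR-max loss can be sketched roughly as follows (tensor shapes and the regularization weight are illustrative):

```python
import torch

def bpr_max_loss(pos_score, neg_scores, reg=1.0):
    """BPR-max ranking loss over sampled negatives.

    pos_score:  (batch,)            score of the target (next) item
    neg_scores: (batch, n_samples)  scores of sampled negative items
    """
    # Softmax weights over the negatives: hard negatives dominate the loss.
    w = torch.softmax(neg_scores, dim=1)
    diff = torch.sigmoid(pos_score.unsqueeze(1) - neg_scores)
    loss = -torch.log((w * diff).sum(dim=1) + 1e-10)
    # Score regularization on the negatives, also weighted by w.
    loss = loss + reg * (w * neg_scores ** 2).sum(dim=1)
    return loss.mean()
```

Weighting the pairwise terms by a softmax over the negative scores focuses the gradient on the hardest negatives in the sample.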

A survey of convolutional neural networks and their applications in intelligent transportation systems

Y Ma, S Cheng, Y Ma, Y Ma - Journal of Traffic and Transportation Engineering, 2021 - transport.chd.edu.cn
From the three angles of feature transmission mode, spatial dimension, and feature dimension, this survey discusses recent directions for improving convolutional neural network architectures, introduces the working principles of convolutional layers, pooling layers, activation functions, and optimization algorithms, and, from the four categories of value-based, rank-based, probability-based, and transform-domain …
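The layer types the survey reviews compose into the standard convolution, activation, pooling pipeline; a minimal PyTorch illustration with arbitrary channel counts:

```python
import torch.nn as nn

# Convolution -> activation -> pooling, the basic pipeline the survey reviews.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer
    nn.ReLU(),                                   # activation function
    nn.MaxPool2d(2),                             # value-based pooling layer
)
```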

Vision-radar fusion for robotics bev detections: A survey

A Singh - 2023 IEEE Intelligent Vehicles Symposium (IV), 2023 - ieeexplore.ieee.org
Due to the growing need to build autonomous robotic perception systems, sensor fusion
has attracted considerable attention among researchers and engineers seeking to make the best use of cross …

An introduction to neural information retrieval

B Mitra, N Craswell - Foundations and Trends® in Information …, 2018 - nowpublishers.com
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to
rank search results in response to a query. Traditional learning to rank models employ …
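One of the simplest architectures such surveys cover is the representation-based ranker, which encodes query and document independently with a shared network and scores them by vector similarity (DSSM-style). A minimal sketch in that spirit; the bag-of-words encoder and dimensions are illustrative, not a specific model from the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseRanker(nn.Module):
    """Representation-based ranker: embed query and document with a
    shared encoder, score by cosine similarity."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # bag-of-words encoder
        self.proj = nn.Linear(dim, dim)

    def encode(self, token_ids):  # token_ids: (batch, n_tokens)
        return torch.tanh(self.proj(self.embed(token_ids)))

    def forward(self, query_ids, doc_ids):
        q, d = self.encode(query_ids), self.encode(doc_ids)
        return F.cosine_similarity(q, d)  # higher score = ranked earlier
```

Interaction-based models, the other family the survey contrasts, instead score term-level matches between query and document.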

Efficient softmax approximation for GPUs

A Joulin, M Cissé, D Grangier… - … conference on machine …, 2017 - proceedings.mlr.press
We propose an approximate strategy to efficiently train neural network based language
models over very large vocabularies. Our approach, called adaptive softmax, circumvents …
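Adaptive softmax partitions the vocabulary into a small head of frequent words and progressively larger, lower-dimensional tail clusters of rare words, so most training steps only touch the head. PyTorch ships an implementation of this technique; the cutoffs below are illustrative frequency-band boundaries:

```python
import torch
import torch.nn as nn

vocab_size, hidden = 100_000, 512
crit = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2000, 10000, 50000],  # head/tail cluster boundaries (illustrative)
)

h = torch.randn(32, hidden)                    # hidden states from a language model
targets = torch.randint(0, vocab_size, (32,))  # gold next-word ids
out = crit(h, targets)
print(out.loss)  # average negative log-likelihood
```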

Tree-to-sequence attentional neural machine translation

A Eriguchi, K Hashimoto, Y Tsuruoka - arXiv preprint arXiv:1603.06075, 2016 - arxiv.org
Most of the existing Neural Machine Translation (NMT) models focus on the conversion of
sequential data and do not directly use syntactic information. We propose a novel end-to …
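The proposed encoder follows the source sentence's phrase-structure parse bottom-up with a tree-LSTM. A sketch of the node-combination step, assuming a binary parse; the gate layout follows the standard binary tree-LSTM and dimensions are illustrative:

```python
import torch
import torch.nn as nn

class BinaryTreeLSTMCell(nn.Module):
    """Combine two child states (h, c) into a parent state, applied
    bottom-up along a binary constituency parse."""
    def __init__(self, dim):
        super().__init__()
        # Input, output, and update gates plus one forget gate per child.
        self.W = nn.Linear(2 * dim, 5 * dim)

    def forward(self, hl, cl, hr, cr):
        i, o, u, fl, fr = self.W(torch.cat([hl, hr], dim=-1)).chunk(5, dim=-1)
        c = (torch.sigmoid(i) * torch.tanh(u)
             + torch.sigmoid(fl) * cl + torch.sigmoid(fr) * cr)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c
```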

Learning to parse and translate improves neural machine translation

A Eriguchi, Y Tsuruoka, K Cho - arXiv preprint arXiv:1702.03525, 2017 - arxiv.org
There has been relatively little attention to incorporating linguistic priors into neural machine
translation. Much of the previous work was further constrained to considering linguistic priors …
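The approach trains translation and parsing jointly over a shared encoder. A hypothetical training-step sketch; model.encoder, model.decoder.nll, model.parser.nll, and the weight lam are invented names for illustration, not the paper's API:

```python
# Hypothetical joint objective: a shared encoder feeds both a translation
# decoder and a parsing head, and the two losses are optimized together.
def joint_loss(model, src, tgt, parse_actions, lam=0.1):
    enc = model.encoder(src)
    l_translate = model.decoder.nll(enc, tgt)       # standard NMT loss
    l_parse = model.parser.nll(enc, parse_actions)  # parsing loss
    return l_translate + lam * l_parse              # lam: task trade-off weight
```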

Frustratingly short attention spans in neural language modeling

M Daniluk, T Rocktäschel, J Welbl, S Riedel - arXiv preprint arXiv …, 2017 - arxiv.org
Neural language models predict the next token using a latent representation of the
immediate token history. Recently, various methods for augmenting neural language models …
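The paper's key-value attention splits each output vector into a key half (used for matching) and a value half (used to build the context), and finds that attending over only a few recent tokens works about as well as longer spans. A rough sketch; the window length and shapes are illustrative, and the sequence is assumed longer than the window:

```python
import torch

def key_value_attention(outputs):
    """Key-value attention over a short history window.

    outputs: (seq_len, dim) RNN output vectors for one sequence,
             with seq_len > window. Returns a context vector for
             the last position.
    """
    keys, values = outputs.chunk(2, dim=-1)  # (seq_len, dim/2) each
    window = 5                               # the paper finds short spans suffice
    q = keys[-1]                             # query: the current key
    k, v = keys[-1 - window:-1], values[-1 - window:-1]
    scores = torch.softmax(k @ q, dim=0)     # attention over the window
    return scores @ v                        # weighted sum of value halves
```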

Von mises-fisher loss for training sequence to sequence models with continuous outputs

S Kumar, Y Tsvetkov - arXiv preprint arXiv:1812.04616, 2018 - arxiv.org
The Softmax function is used in the final layer of nearly all existing sequence-to-sequence
models for language generation. However, it is usually the slowest layer to compute, which …
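The paper replaces the softmax with a continuous output: the decoder emits a vector trained toward the unit-norm embedding of the gold word under a von Mises-Fisher likelihood. The exact loss includes a Bessel-function log-normalizer; the sketch below swaps in plain cosine distance, a simplified stand-in that keeps the direction-regression idea but is not the paper's NLLvMF:

```python
import torch
import torch.nn.functional as F

def cosine_output_loss(pred, target_embed):
    """Simplified stand-in for the vMF output loss: regress a continuous
    decoder output toward the unit-normalized embedding of the gold word,
    instead of computing a softmax over the vocabulary.

    pred:         (batch, dim) decoder output vectors
    target_embed: (batch, dim) pretrained embeddings of the gold words
    """
    target = F.normalize(target_embed, dim=-1)
    return (1.0 - F.cosine_similarity(pred, target, dim=-1)).mean()
```

Decoding then reduces to a nearest-neighbor lookup in embedding space, removing the vocabulary-sized softmax entirely.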