Language modeling with gated convolutional networks

YN Dauphin, A Fan, M Auli… - … conference on machine …, 2017 - proceedings.mlr.press
The predominant approach to language modeling to date is based on recurrent neural
networks. Their success on this task is often linked to their ability to capture unbounded …
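The core mechanism of this paper is the gated linear unit (GLU), which computes h = (X∗W + b) ⊗ σ(X∗V + c) over a causal convolution. A minimal PyTorch sketch of one such layer; kernel width and channel count are illustrative:

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """One gated convolutional layer: h = conv(x) * sigmoid(gate(x))."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        # Left-pad so the convolution is causal and cannot see future tokens.
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)
        self.conv = nn.Conv1d(channels, channels, kernel_size)  # X*W + b
        self.gate = nn.Conv1d(channels, channels, kernel_size)  # X*V + c

    def forward(self, x):  # x: (batch, channels, seq_len)
        x = self.pad(x)
        return self.conv(x) * torch.sigmoid(self.gate(x))
```

Stacking such blocks gives a fixed, bounded receptive field rather than the unbounded history of an RNN, which the paper argues is sufficient in practice.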

Recurrent neural networks with top-k gains for session-based recommendations

B Hidasi, A Karatzoglou - Proceedings of the 27th ACM international …, 2018 - dl.acm.org
RNNs have been shown to be excellent models for sequential data, and in particular for data
that is generated by users in a session-based manner. The use of RNNs provides …
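The paper's main contribution is a family of ranking losses over sampled negative items; its BPR-max loss can be sketched roughly as follows (tensor shapes and the regularization weight are illustrative):

```python
import torch

def bpr_max_loss(pos_score, neg_scores, reg=1.0):
    """BPR-max ranking loss over sampled negatives.

    pos_score:  (batch,)            score of the target (next) item
    neg_scores: (batch, n_samples)  scores of sampled negative items
    """
    # Softmax weights over the negatives: hard negatives dominate the loss.
    w = torch.softmax(neg_scores, dim=1)
    diff = torch.sigmoid(pos_score.unsqueeze(1) - neg_scores)
    loss = -torch.log((w * diff).sum(dim=1) + 1e-10)
    # Score regularization on the negatives, also weighted by w.
    loss = loss + reg * (w * neg_scores ** 2).sum(dim=1)
    return loss.mean()
```

Weighting the pairwise terms by a softmax over the negative scores focuses the gradient on the hardest negatives in the sample.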

A survey of convolutional neural networks and their applications in intelligent transportation systems

Y Ma, S Cheng, Y Ma, Y Ma - Journal of Traffic and Transportation Engineering, 2021 - transport.chd.edu.cn
From the three angles of feature transmission mode, spatial dimension, and feature dimension, this survey discusses recent directions for improving convolutional neural network architectures, introduces the working principles of convolutional layers, pooling layers, activation functions, and optimization algorithms, and, from the four categories of value-based, rank-based, probability-based, and transform-domain …
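The layer types the survey reviews compose into the standard convolution, activation, pooling pipeline; a minimal PyTorch illustration with arbitrary channel counts:

```python
import torch.nn as nn

# Convolution -> activation -> pooling, the basic pipeline the survey reviews.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer
    nn.ReLU(),                                   # activation function
    nn.MaxPool2d(2),                             # value-based pooling layer
)
```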

Vision-radar fusion for robotics bev detections: A survey

A Singh - 2023 IEEE Intelligent Vehicles Symposium (IV), 2023 - ieeexplore.ieee.org
Due to the growing need to build autonomous robotic perception systems, sensor fusion
has attracted considerable attention among researchers and engineers seeking to make the best use of cross …

An introduction to neural information retrieval

B Mitra, N Craswell - Foundations and Trends® in Information …, 2018 - nowpublishers.com
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to
rank search results in response to a query. Traditional learning to rank models employ …
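One of the simplest architectures such surveys cover is the representation-based ranker, which encodes query and document independently with a shared network and scores them by vector similarity (DSSM-style). A minimal sketch in that spirit; the bag-of-words encoder and dimensions are illustrative, not a specific model from the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseRanker(nn.Module):
    """Representation-based ranker: embed query and document with a
    shared encoder, score by cosine similarity."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # bag-of-words encoder
        self.proj = nn.Linear(dim, dim)

    def encode(self, token_ids):  # token_ids: (batch, n_tokens)
        return torch.tanh(self.proj(self.embed(token_ids)))

    def forward(self, query_ids, doc_ids):
        q, d = self.encode(query_ids), self.encode(doc_ids)
        return F.cosine_similarity(q, d)  # higher score = ranked earlier
```

Interaction-based models, the other family the survey contrasts, instead score term-level matches between query and document.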

Efficient softmax approximation for GPUs

A Joulin, M Cissé, D Grangier… - … conference on machine …, 2017 - proceedings.mlr.press
We propose an approximate strategy to efficiently train neural network based language
models over very large vocabularies. Our approach, called adaptive softmax, circumvents …
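Adaptive softmax partitions the vocabulary into a small head of frequent words and progressively larger, lower-dimensional tail clusters of rare words, so most training steps only touch the head. PyTorch ships an implementation of this technique; the cutoffs below are illustrative frequency-band boundaries:

```python
import torch
import torch.nn as nn

vocab_size, hidden = 100_000, 512
crit = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2000, 10000, 50000],  # head/tail cluster boundaries (illustrative)
)

h = torch.randn(32, hidden)                    # hidden states from a language model
targets = torch.randint(0, vocab_size, (32,))  # gold next-word ids
out = crit(h, targets)
print(out.loss)  # average negative log-likelihood
```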

Tree-to-sequence attentional neural machine translation

A Eriguchi, K Hashimoto, Y Tsuruoka - arXiv preprint arXiv:1603.06075, 2016 - arxiv.org
Most of the existing Neural Machine Translation (NMT) models focus on the conversion of
sequential data and do not directly use syntactic information. We propose a novel end-to …
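The proposed encoder follows the source sentence's phrase-structure parse bottom-up with a tree-LSTM. A sketch of the node-combination step, assuming a binary parse; the gate layout follows the standard binary tree-LSTM and dimensions are illustrative:

```python
import torch
import torch.nn as nn

class BinaryTreeLSTMCell(nn.Module):
    """Combine two child states (h, c) into a parent state, applied
    bottom-up along a binary constituency parse."""
    def __init__(self, dim):
        super().__init__()
        # Input, output, and update gates plus one forget gate per child.
        self.W = nn.Linear(2 * dim, 5 * dim)

    def forward(self, hl, cl, hr, cr):
        i, o, u, fl, fr = self.W(torch.cat([hl, hr], dim=-1)).chunk(5, dim=-1)
        c = (torch.sigmoid(i) * torch.tanh(u)
             + torch.sigmoid(fl) * cl + torch.sigmoid(fr) * cr)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c
```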

Learning to parse and translate improves neural machine translation

A Eriguchi, Y Tsuruoka, K Cho - arXiv preprint arXiv:1702.03525, 2017 - arxiv.org
There has been relatively little attention to incorporating linguistic priors into neural machine
translation. Much of the previous work was further constrained to considering linguistic priors …
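The approach trains translation and parsing jointly over a shared encoder. A hypothetical training-step sketch; model.encoder, model.decoder.nll, model.parser.nll, and the weight lam are invented names for illustration, not the paper's API:

```python
# Hypothetical joint objective: a shared encoder feeds both a translation
# decoder and a parsing head, and the two losses are optimized together.
def joint_loss(model, src, tgt, parse_actions, lam=0.1):
    enc = model.encoder(src)
    l_translate = model.decoder.nll(enc, tgt)       # standard NMT loss
    l_parse = model.parser.nll(enc, parse_actions)  # parsing loss
    return l_translate + lam * l_parse              # lam: task trade-off weight
```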

Frustratingly short attention spans in neural language modeling

M Daniluk, T Rocktäschel, J Welbl, S Riedel - arXiv preprint arXiv …, 2017 - arxiv.org
Neural language models predict the next token using a latent representation of the
immediate token history. Recently, various methods for augmenting neural language models …
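The paper's key-value attention splits each output vector into a key half (used for matching) and a value half (used to build the context), and finds that attending over only a few recent tokens works about as well as longer spans. A rough sketch; the window length and shapes are illustrative, and the sequence is assumed longer than the window:

```python
import torch

def key_value_attention(outputs):
    """Key-value attention over a short history window.

    outputs: (seq_len, dim) RNN output vectors for one sequence,
             with seq_len > window. Returns a context vector for
             the last position.
    """
    keys, values = outputs.chunk(2, dim=-1)  # (seq_len, dim/2) each
    window = 5                               # the paper finds short spans suffice
    q = keys[-1]                             # query: the current key
    k, v = keys[-1 - window:-1], values[-1 - window:-1]
    scores = torch.softmax(k @ q, dim=0)     # attention over the window
    return scores @ v                        # weighted sum of value halves
```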

Von mises-fisher loss for training sequence to sequence models with continuous outputs

S Kumar, Y Tsvetkov - arXiv preprint arXiv:1812.04616, 2018 - arxiv.org
The Softmax function is used in the final layer of nearly all existing sequence-to-sequence
models for language generation. However, it is usually the slowest layer to compute, which …
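The paper replaces the softmax with a continuous output: the decoder emits a vector trained toward the unit-norm embedding of the gold word under a von Mises-Fisher likelihood. The exact loss includes a Bessel-function log-normalizer; the sketch below swaps in plain cosine distance, a simplified stand-in that keeps the direction-regression idea but is not the paper's NLLvMF:

```python
import torch
import torch.nn.functional as F

def cosine_output_loss(pred, target_embed):
    """Simplified stand-in for the vMF output loss: regress a continuous
    decoder output toward the unit-normalized embedding of the gold word,
    instead of computing a softmax over the vocabulary.

    pred:         (batch, dim) decoder output vectors
    target_embed: (batch, dim) pretrained embeddings of the gold words
    """
    target = F.normalize(target_embed, dim=-1)
    return (1.0 - F.cosine_similarity(pred, target, dim=-1)).mean()
```

Decoding then reduces to a nearest-neighbor lookup in embedding space, removing the vocabulary-sized softmax entirely.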