Attention Is All You Need. (NIPS), 2017

A Vaswani, N Shazeer, N Parmar, J Uszkoreit… - arXiv preprint arXiv …, 2017 - codetds.com
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best-performing models also connect the encoder and decoder through an attention mechanism. We propose a new, simple network architecture …

The best of both worlds: Combining recent advances in neural machine translation

MX Chen, O Firat, A Bapna, M Johnson… - arXiv preprint arXiv …, 2018 - arxiv.org
The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling
for Machine Translation (MT). The classic RNN-based approaches to MT were first out …

Machine learning for healthcare: on the verge of a major shift in healthcare epidemiology

J Wiens, ES Shenoy - Clinical infectious diseases, 2018 - academic.oup.com
The increasing availability of electronic health data presents a major opportunity in
healthcare for both discovery and practical applications to improve healthcare. However, for …

Learning deep transformer models for machine translation

Q Wang, B Li, T Xiao, J Zhu, C Li, DF Wong… - arXiv preprint arXiv …, 2019 - arxiv.org
Transformer is the state-of-the-art model in recent machine translation evaluations. Two
strands of research are promising to improve models of this kind: the first uses wide …

Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network

A Sherstinsky - Physica D: Nonlinear Phenomena, 2020 - Elsevier
Because of their effectiveness in broad practical applications, LSTM networks have received
a wealth of coverage in scientific journals, technical blogs, and implementation guides …

Self-attentive sequential recommendation

WC Kang, J McAuley - 2018 IEEE international conference on …, 2018 - ieeexplore.ieee.org
Sequential dynamics are a key feature of many modern recommender systems, which seek
to capture the 'context' of users' activities on the basis of actions they have performed recently …

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

What you can cram into a single vector: Probing sentence embeddings for linguistic properties

A Conneau, G Kruszewski, G Lample, L Barrault… - arXiv preprint arXiv …, 2018 - arxiv.org
Although much effort has recently been devoted to training high-quality sentence
embeddings, we still have a poor understanding of what they are capturing. "Downstream" …

Massively multilingual neural machine translation in the wild: Findings and challenges

N Arivazhagan, A Bapna, O Firat, D Lepikhin… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce our efforts towards building a universal neural machine translation (NMT)
system capable of translating between any language pair. We set a milestone towards this …

Attention is all you need

A Vaswani - Advances in Neural Information Processing Systems, 2017 - user.phil.hhu.de
The dominant sequence transduction models are based on complex recurrent
or convolutional neural networks in an encoder and decoder configuration. The best …