Step size matters in deep learning

K Nar, S Sastry - Advances in Neural Information …, 2018 - proceedings.neurips.cc
Training a neural network with the gradient descent algorithm gives rise to a discrete-time
nonlinear dynamical system. Consequently, behaviors that are typically observed in these …

STNDT: Modeling neural population activity with spatiotemporal transformers

T Le, E Shlizerman - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Modeling neural population dynamics underlying noisy single-trial spiking activities is
essential for relating neural observation and behavior. A recent non-recurrent method …

Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems

J Smith, S Linderman… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recurrent neural networks (RNNs) are powerful models for processing time-series data, but
it remains challenging to understand how they function. Improving this understanding is of …

Teaching recurrent neural networks to infer global temporal structure from local examples

JZ Kim, Z Lu, E Nozari, GJ Pappas… - Nature Machine …, 2021 - nature.com
The ability to store and manipulate information is a hallmark of computational systems.
Whereas computers are carefully engineered to represent and perform mathematical …

Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network

A Sherstinsky - Physica D: Nonlinear Phenomena, 2020 - Elsevier
Because of their effectiveness in broad practical applications, LSTM networks have received
a wealth of coverage in scientific journals, technical blogs, and implementation guides …