Random feature attention

H Peng, N Pappas, D Yogatama, R Schwartz… - arXiv preprint arXiv …, 2021 - arxiv.org
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their
core is an attention function which models pairwise interactions between the inputs at every …

A generalizable and accessible approach to machine learning with global satellite imagery

E Rolf, J Proctor, T Carleton, I Bolliger… - Nature …, 2021 - nature.com
Combining satellite imagery with machine learning (SIML) has the potential to address
global challenges by remotely estimating socioeconomic and environmental conditions in …

Implicit kernel learning

CL Li, WC Chang, Y Mroueh, Y Yang… - The 22nd …, 2019 - proceedings.mlr.press
Kernels are powerful and versatile tools in machine learning and statistics. Although the
notion of universal kernels and characteristic kernels has been studied, kernel selection still …

Software and application patterns for explanation methods

M Alber - Explainable AI: interpreting, explaining and visualizing …, 2019 - Springer
Deep neural networks have successfully pervaded many application domains and are
increasingly used in critical decision processes. Understanding their workings is desirable …

Uncertainty-aware (UNA) bases for deep Bayesian regression using multi-headed auxiliary networks

S Thakur, C Lorsung, Y Yacoby, F Doshi-Velez… - arXiv preprint arXiv …, 2020 - arxiv.org
Neural Linear Models (NLM) are deep Bayesian models that produce predictive
uncertainties by learning features from the data and then performing Bayesian linear …

Predicting pairwise relations with neural similarity encoders

F Horn, KR Müller - arXiv preprint arXiv:1702.01824, 2017 - arxiv.org
Matrix factorization is at the heart of many machine learning algorithms, for example,
dimensionality reduction (e.g., kernel PCA) or recommender systems relying on collaborative …

Detecting Local Insights from Global Labels: Supervised and Zero-Shot Sequence Labeling via a Convolutional Decomposition

A Schmaltz - Computational Linguistics, 2021 - direct.mit.edu
We propose a new, more actionable view of neural network interpretability and data analysis
by leveraging the remarkable matching effectiveness of representations derived from deep …

How to iNNvestigate neural networks' predictions!

M Alber, S Lapuschkin, P Seegerer, M Hägele… - 2018 - openreview.net
In recent years, deep neural networks have revolutionized many application domains of
machine learning and are key components of many critical decision or predictive processes …

[BOOK][B] Towards Efficient and Generalizable Natural Language Processing

H Peng - 2022 - search.proquest.com
Natural language processing (NLP) is undergoing a paradigm shift. Scaling up in terms of the
sizes of models and data plays an increasingly important role. Despite the remarkable …

Understanding uncertainty in Bayesian deep learning

C Lorsung - arXiv preprint arXiv:2106.13055, 2021 - arxiv.org
Neural Linear Models (NLM) are deep Bayesian models that produce predictive uncertainty
by learning features from the data and then performing Bayesian linear regression over …