A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

A Abdelfattah, H Anzt, EG Boman… - … Journal of High …, 2021 - journals.sagepub.com
The efficient utilization of mixed-precision numerical linear algebra algorithms can offer
attractive acceleration to scientific computing applications. Especially with the hardware …

[BOOK][B] Numerical methods for least squares problems

Å Björck - 2024 - SIAM
More than 25 years have passed since the first edition of this book was published in
1996. Least squares and least-norm problems have become more significant with every …

BLIS: A framework for rapidly instantiating BLAS functionality

FG Van Zee, RA Van De Geijn - ACM Transactions on Mathematical …, 2015 - dl.acm.org
The BLAS-like Library Instantiation Software (BLIS) framework is a new infrastructure for
rapidly instantiating Basic Linear Algebra Subprograms (BLAS) functionality. Its fundamental …

Elemental: A new framework for distributed memory dense matrix computations

J Poulson, B Marker, RA Van de Geijn… - ACM Transactions on …, 2013 - dl.acm.org
Parallelizing dense matrix computations to distributed memory architectures is a well-studied
subject and generally considered to be among the best understood domains of …

Efficient orthogonal parametrisation of recurrent neural networks using householder reflections

Z Mhammedi, A Hellicar, A Rahman… - … on Machine Learning, 2017 - proceedings.mlr.press
The problem of learning long-term dependencies in sequences using Recurrent Neural
Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve …

Low synchronization Gram–Schmidt and generalized minimal residual algorithms

K Świrydowicz, J Langou, S Ananthan… - … Linear Algebra with …, 2021 - Wiley Online Library
The Gram–Schmidt process uses orthogonal projection to construct the A = QR
factorization of a matrix. When Q has linearly independent columns, the operator P = I − Q …

Householder QR factorization with randomization for column pivoting (HQRRP)

PG Martinsson, G Quintana-Ortí, N Heavner… - SIAM Journal on …, 2017 - SIAM
A fundamental problem when adding column pivoting to the Householder QR factorization is
that only about half of the computation can be cast in terms of high performing matrix-matrix …

Statistical structure analysis in MRI brain tumor segmentation

X Xuan, Q Liao - … Conference on Image and Graphics (ICIG …, 2007 - ieeexplore.ieee.org
Automated MRI (Magnetic Resonance Imaging) brain tumor segmentation is a difficult task
due to the variance and complexity of tumors. In this paper, a statistical structure analysis …

Gated Delta Networks: Improving Mamba2 with Delta Rule

S Yang, J Kautz, A Hatamizadeh - arXiv preprint arXiv:2412.06464, 2024 - arxiv.org
Linear Transformers have gained attention as efficient alternatives to standard Transformers,
but their performance in retrieval and long-context tasks has been limited. To address these …

randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization

PG Martinsson, G Quintana-Orti… - ACM Transactions on …, 2019 - dl.acm.org
A randomized algorithm for computing a so-called UTV factorization efficiently is presented.
Given a matrix A, the algorithm “randUTV” computes a factorization A = UTV*, where U and V …