Domain wall QCD with physical quark masses

T Blum, PA Boyle, NH Christ, J Frison, N Garron… - Physical Review D, 2016 - APS
We present results for several light hadronic quantities (f_π, f_K, B_K, m_ud, m_s, t_0^{1/2}, w_0)
obtained from simulations of 2+1 flavor domain wall lattice QCD with large physical volumes …

[BOOK][B] Parallel programming

T Rauber, G Rünger - 2013 - Springer
Innovations in hardware architecture, such as hyper-threading or multicore processors,
make parallel computing resources available for computer systems in different areas …

Efficient vector quantization of LPC parameters at 24 bits/frame

KK Paliwal, BS Atal - IEEE transactions on speech and audio …, 1993 - ieeexplore.ieee.org
For low bit rate speech coding applications, it is important to quantize the LPC parameters
accurately using as few bits as possible. Though vector quantizers are more efficient than …

A massively parallel tensor contraction framework for coupled-cluster computations

E Solomonik, D Matthews, JR Hammond… - Journal of Parallel and …, 2014 - Elsevier
Precise calculation of molecular electronic wavefunctions by methods such as coupled-
cluster requires the computation of tensor contractions, the cost of which has polynomial …

Quantum service-oriented computing: current landscape and challenges

E Moguel, J Rojo, D Valencia, J Berrocal… - Software Quality …, 2022 - Springer
The development that quantum computing technologies are achieving is beginning to attract
the interest of companies that could potentially be users of quantum software. Thus, it is …

Evaluation of Blue Gene/Q hardware support for transactional memories

A Wang, M Gaudet, P Wu, JN Amaral… - Proceedings of the 21st …, 2012 - dl.acm.org
This paper describes an end-to-end system implementation of the transactional memory
(TM) programming model on top of the hardware transactional memory (HTM) of the Blue …

Scalable hierarchical aggregation protocol (SHArP): A hardware architecture for efficient data reduction

RL Graham, D Bureddy, P Lui… - … in HPC (COMHPC), 2016 - ieeexplore.ieee.org
Increased system size and a greater reliance on utilizing system parallelism to achieve
computational needs, requires innovative system architectures to meet the simulation …

Transactional memory architecture and implementation for IBM System z

C Jacobi, T Slegel, D Greiner - 2012 45th Annual IEEE/ACM …, 2012 - ieeexplore.ieee.org
We present the introduction of transactional memory into the next generation IBM System z
CPU. We first describe the instruction-set architecture features, including requirements for …

Evaluation of hardware data prefetchers on server processors

M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …

Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC?

N Rajovic, PM Carpenter, I Gelado, N Puzovic… - Proceedings of the …, 2013 - dl.acm.org
In the late 1990s, powerful economic forces led to the adoption of commodity desktop
processors in high-performance computing. This transformation has been so effective that …