Domain wall QCD with physical quark masses
We present results for several light hadronic quantities (f_π, f_K, B_K, m_ud, m_s, t_0^{1/2}, w_0)
obtained from simulations of 2+1 flavor domain wall lattice QCD with large physical volumes …
[BOOK][B] Parallel programming
T Rauber, G Rünger - 2013 - Springer
Innovations in hardware architecture, such as hyper-threading or multicore processors,
make parallel computing resources available for computer systems in different areas …
Efficient vector quantization of LPC parameters at 24 bits/frame
KK Paliwal, BS Atal - IEEE transactions on speech and audio …, 1993 - ieeexplore.ieee.org
For low bit rate speech coding applications, it is important to quantize the LPC parameters
accurately using as few bits as possible. Though vector quantizers are more efficient than …
A massively parallel tensor contraction framework for coupled-cluster computations
Precise calculation of molecular electronic wavefunctions by methods such as coupled-
cluster requires the computation of tensor contractions, the cost of which has polynomial …
Quantum service-oriented computing: current landscape and challenges
The development that quantum computing technologies are achieving is beginning to attract
the interest of companies that could potentially be users of quantum software. Thus, it is …
Evaluation of Blue Gene/Q hardware support for transactional memories
This paper describes an end-to-end system implementation of the transactional memory
(TM) programming model on top of the hardware transactional memory (HTM) of the Blue …
Scalable hierarchical aggregation protocol (SHArP): A hardware architecture for efficient data reduction
RL Graham, D Bureddy, P Lui… - … in HPC (COMHPC), 2016 - ieeexplore.ieee.org
Increased system size and a greater reliance on utilizing system parallelism to achieve
computational needs, requires innovative system architectures to meet the simulation …
Transactional memory architecture and implementation for IBM System z
C Jacobi, T Slegel, D Greiner - 2012 45th Annual IEEE/ACM …, 2012 - ieeexplore.ieee.org
We present the introduction of transactional memory into the next generation IBM System z
CPU. We first describe the instruction-set architecture features, including requirements for …
Evaluation of hardware data prefetchers on server processors
M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …
Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC?
In the late 1990s, powerful economic forces led to the adoption of commodity desktop
processors in high-performance computing. This transformation has been so effective that …