Quantum package 2.0: An open-source determinant-driven suite of programs

Y Garniron, T Applencourt, K Gasperich… - Journal of chemical …, 2019 - ACS Publications
Quantum chemistry is a discipline which relies heavily on very expensive numerical
computations. The scaling of correlated wave function methods lies, in their standard …

CERE: LLVM-based codelet extractor and replayer for piecewise benchmarking and optimization

PDO Castro, C Akel, E Petit, M Popov… - ACM Transactions on …, 2015 - dl.acm.org
This article presents Codelet Extractor and REplayer (CERE), an open-source framework for
code isolation. CERE finds and extracts the hotspots of an application as isolated fragments …

Kerncraft: A tool for analytic performance modeling of loop kernels

J Hammer, J Eitzinger, G Hager, G Wellein - Tools for High Performance …, 2017 - Springer
Achieving optimal program performance requires deep insight into the interaction between
hardware and software. For software developers without an in-depth background in …

Quantum Monte Carlo with very large multideterminant wavefunctions

A Scemama, T Applencourt, E Giner… - Journal of …, 2016 - Wiley Online Library
An algorithm to compute efficiently the first two derivatives of (very) large multideterminant
wavefunctions for quantum Monte Carlo calculations is presented. The calculation of …

An automated tool for analysis and tuning of GPU-accelerated code in HPC applications

K Zhou, X Meng, R Sai, D Grubisic… - … on Parallel and …, 2021 - ieeexplore.ieee.org
The US Department of Energy's fastest supercomputers and forthcoming exascale systems
employ Graphics Processing Units (GPUs) to increase the computational performance of …

Automatic loop kernel analysis and performance modeling with Kerncraft

J Hammer, G Hager, J Eitzinger, G Wellein - Proceedings of the 6th …, 2015 - dl.acm.org
Analytic performance models are essential for understanding the performance
characteristics of loop kernels, which consume a major part of CPU cycles in computational …

Mira: A framework for static performance analysis

K Meng, B Norris - 2017 IEEE International Conference on …, 2017 - ieeexplore.ieee.org
The performance model of an application can provide understanding about its runtime
behavior on particular hardware. Such information can be analyzed by developers for …

Comparing performance of C compilers optimizations on different multicore architectures

RS Machado, RB Almeida, AD Jardim… - … architecture and high …, 2017 - ieeexplore.ieee.org
Multithread programming tools become popular for exploitation of high performance
processing with the dissemination of multicore processors. In this context, it is also popular …

Is source-code isolation viable for performance characterization?

C Akel, Y Kashnikov… - 2013 42nd …, 2013 - ieeexplore.ieee.org
Source-code isolation finds and extracts the hotspots of an application as independent
isolated fragments of code, called codelets. Codelets can be modified, compiled, run, and …

VeriTracer: Context-enriched tracer for floating-point arithmetic analysis

Y Chatelain, PDO Castro, E Petit… - 2018 IEEE 25th …, 2018 - ieeexplore.ieee.org
VeriTracer automatically instruments a code and traces the accuracy of floating-point
variables over time. VeriTracer enriches the visual traces with contextual information such as …