Issues and challenges in the performance analysis of real disk arrays
E Varki, A Merchant, J Xu, X Qiu - IEEE Transactions on …, 2004 - ieeexplore.ieee.org
The performance modeling and analysis of disk arrays is challenging due to the presence of
multiple disks, large array caches, and sophisticated array controllers. Moreover, storage …
Understanding and optimizing asynchronous low-precision stochastic gradient descent
Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in
machine learning and other domains. Since this is likely to continue for the foreseeable …
Synthetic traces for trace-driven simulation of cache memories
Two techniques for producing synthetic address traces that produce good emulations of the
locality of reference of real programs are presented. The first algorithm generates synthetic …
Systematic energy characterization of CMP/SMT processor systems via automated micro-benchmarks
R Bertran, A Buyuktosunoglu, MS Gupta… - 2012 45th Annual …, 2012 - ieeexplore.ieee.org
Microprocessor-based systems today are composed of multi-core, multi-threaded
processors with complex cache hierarchies and gigabytes of main memory. Accurate …
Improved automatic testcase synthesis for performance model validation
RH Bell Jr, LK John - Proceedings of the 19th annual international …, 2005 - dl.acm.org
Performance simulation tools must be validated during the design process as functional
models and early hardware are developed, so that designers can be sure of the …
Performance cloning: A technique for disseminating proprietary applications as benchmarks
A Joshi, L Eeckhout, RH Bell… - 2006 IEEE International …, 2006 - ieeexplore.ieee.org
Many embedded real world applications are intellectual property, and vendors hesitate to
share these proprietary applications with computer architects and designers. This poses a …
Synthesizing memory-level parallelism aware miniature clones for SPEC CPU2006 and ImplantBench workloads
We generate and provide miniature synthetic benchmark clones for modern workloads to
solve two pre-silicon design challenges, namely: 1) huge simulation time (weeks to months) …
EMISSARY: Enhanced Miss Awareness Replacement Policy for L2 Instruction Caching
For decades, architects have designed cache replacement policies to reduce cache misses.
Since not all cache misses affect processor performance equally, researchers have also …
Synchronizing namespaces with invertible bloom filters
W Fu, HB Abraham, P Crowley - 2015 ACM/IEEE Symposium …, 2015 - ieeexplore.ieee.org
Data synchronization, long a staple in file systems, is emerging as a significant communications
primitive. In a distributed system, data synchronization resolves differences among …
Fast and accurate exploration of multi-level caches using hierarchical reuse distance
Exploring the design space of the memory hierarchy requires the use of effective
methodologies, tools, and models to evaluate different parameter values. Reuse distance is …