Benchmarking TPU, GPU, and CPU platforms for deep learning
Training deep learning models is compute-intensive and there is an industry-wide trend
towards hardware specialization to improve performance. To systematically benchmark …
DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …
Fathom: Reference workloads for modern deep learning methods
Deep learning has been popularized by its recent successes on challenging artificial
intelligence problems. One of the reasons for its dominance is also an ongoing challenge …
REVAMP: A systematic framework for heterogeneous CGRA realization
Coarse-Grained Reconfigurable Architectures (CGRAs) provide an excellent balance
between performance, energy efficiency, and flexibility. However, increasingly sophisticated …
A systematic methodology for analysis of deep learning hardware and software platforms
Training deep learning models is compute-intensive and there is an industry-wide trend
towards hardware and software specialization to improve performance. To systematically …
Dynamic resource management of heterogeneous mobile platforms via imitation learning
The complexity of heterogeneous mobile platforms is growing at a rate faster than our ability
to manage them optimally at runtime. For example, state-of-the-art systems-on-chip (SoCs) …
The accelerator wall: Limits of chip specialization
A Fuchs, D Wentzlaff - 2019 IEEE International Symposium on …, 2019 - ieeexplore.ieee.org
Specializing chips using hardware accelerators has become the prime means to alleviate
the gap between the growing computational demands and the stagnating transistor budgets …
Dypo: Dynamic pareto-optimal configuration selection for heterogeneous mpsocs
Modern multiprocessor systems-on-chip (MpSoCs) offer tremendous power and
performance optimization opportunities by tuning thousands of potential voltage, frequency …
An energy-aware online learning framework for resource management in heterogeneous platforms
Mobile platforms must satisfy the contradictory requirements of fast response time and
minimum energy consumption as a function of dynamically changing applications. To …
A deep Q-learning approach for dynamic management of heterogeneous processors
Heterogeneous multiprocessor system-on-chips (SoCs) provide a wide range of parameters
that can be managed dynamically. For example, one can control the type (big/little), number …