A very fast simulator for exploring the many-core future

O Certner, Z Li, A Raman… - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
O Certner, Z Li, A Raman, O Temam
2011 IEEE International Parallel & Distributed Processing Symposium, 2011. ieeexplore.ieee.org
Although multi-core architectures with a large number of cores ("many-cores") are considered the future of computing systems, there are currently few practical tools to quickly explore both their design and general program scalability. In this paper, we present SiMany, a discrete-event-based many-core simulator able to support more than a thousand cores while being orders of magnitude faster than existing flexible approaches. One of the difficult challenges for a reasonably realistic many-core simulation is to faithfully model the potentially high concurrency a program can exhibit. SiMany uses a novel virtual time synchronization technique, called spatial synchronization, to achieve this goal in a completely local and distributed fashion, which diminishes interactions and preserves locality. Compared to previous simulators, it raises the level of abstraction by focusing on modeling concurrent interactions between cores, which enables fast, coarse comparisons of high-level architecture design choices and parallel program performance. Sequential pieces of code are executed natively for maximal speed. We exercise the simulator with a set of dwarf-like task-based benchmarks with dynamic control flow and irregular data structures. Scalability results are validated through comparison with a cycle-level simulator up to 64 cores. They are also shown to be consistent with well-known benchmark characteristics. We finally demonstrate how SiMany can be used to efficiently compare the benchmarks' behavior over a wide range of architectural organizations, such as polymorphic architectures and networks of clusters.