There and back again: Optimizing the interconnect in networks of memory cubes
ACM SIGARCH Computer Architecture News, 2017 (dl.acm.org)
High-performance computing, enterprise, and datacenter servers are driving demands for higher total memory capacity as well as memory performance. Memory "cubes" with high per-package capacity (from 3D integration) along with high-speed point-to-point interconnects provide a scalable memory system architecture with the potential to deliver both capacity and performance. Multiple such cubes connected together can form a "Memory Network" (MN), but the design space for such MNs is quite vast, including multiple topology types and multiple memory technologies per memory cube.
In this work, we first analyze several MN topologies with different mixes of memory package technologies to understand the key tradeoffs and bottlenecks for such systems. We find that most of an MN's performance challenges arise from the interconnection network that binds the memory cubes together. In particular, the arbitration scheme used to route through the MN, the ratio of NVM to DRAM, and the specific topology used all have a dramatic impact on performance and energy. Our initial analysis indicates that introducing non-volatile memory to the MN presents a unique tradeoff between memory array latency and network latency. We observe that placing NVM cubes in a specific order in the MN improves performance by reducing the network size/diameter, up to a certain NVM-to-DRAM ratio. Novel MN topologies and arbitration schemes also improve performance and energy by reducing the hop count of requests and responses in the MN. Based on our analyses, we introduce three techniques to address MN latency issues: (1) a distance-based arbitration scheme that improves queuing latencies throughout the network, (2) a skip-list topology, derived from the classic data structure, that improves network latency and link usage, and (3) the MetaCube, a denser memory cube that leverages advanced packaging technologies to improve latency by reducing MN size.
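To make the distance-based arbitration idea concrete, the sketch below shows one way a router could favor packets that have already traveled farther, so long-haul requests are not repeatedly starved at every intermediate hop. The abstract does not specify the mechanism; the `Packet` fields, the `arbitrate` function, and the FIFO tie-breaking rule are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of distance-based arbitration at a memory-network router.
# All names here (Packet, arbitrate) are hypothetical; the paper only states
# that arbitration accounts for distance traveled through the MN.

class Packet:
    def __init__(self, pkt_id, hops_traveled, arrival_time):
        self.pkt_id = pkt_id
        self.hops_traveled = hops_traveled  # hops accumulated so far in the MN
        self.arrival_time = arrival_time    # when the packet reached this router

def arbitrate(contenders):
    """Pick a winner among packets contending for one output link.

    Packets that have traveled more hops win; ties fall back to
    first-come-first-served (earlier arrival wins).
    """
    return max(contenders, key=lambda p: (p.hops_traveled, -p.arrival_time))

# Example: a request five hops from its source beats a local one-hop request.
a = Packet("req-A", hops_traveled=5, arrival_time=10)
b = Packet("req-B", hops_traveled=1, arrival_time=8)
assert arbitrate([a, b]).pkt_id == "req-A"
```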
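The skip-list topology can be pictured as a chain of cubes augmented with express links at power-of-two strides, echoing the levels of a skip list: local links preserve the chain, while express links shrink the network diameter from O(n) toward O(log n) hops. The deterministic link placement and greedy forward routing below are a minimal sketch of that idea (assuming src ≤ dst); the paper's actual construction may differ.

```python
# Minimal sketch of a deterministic skip-list-style chain of memory cubes.
# Link placement and routing are illustrative assumptions, not the authors' design.

def links(n_cubes, max_level):
    """Cube i gets a level-k express link to i + 2**k when i is a multiple of 2**k."""
    edges = set()
    for k in range(max_level + 1):
        stride = 1 << k
        for i in range(0, n_cubes - stride, stride):
            edges.add((i, i + stride))
    return edges

def route(src, dst, max_level):
    """Greedy routing: take the longest express link that does not overshoot dst.

    Assumes forward traffic (src <= dst) for brevity.
    """
    hops, cur = 0, src
    while cur != dst:
        for k in range(max_level, -1, -1):
            stride = 1 << k
            if cur % stride == 0 and cur + stride <= dst:
                cur += stride
                hops += 1
                break
    return hops

# In a 16-cube chain, end-to-end traffic needs 15 hops without express links;
# with levels up to stride 8, the same trip takes 4 hops (8 + 4 + 2 + 1).
assert (0, 8) in links(16, max_level=3)
assert route(0, 15, max_level=3) == 4
```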