Hi-fi playback: Tolerating position errors in shift operations of racetrack memory

C Zhang, G Sun, X Zhang, W Zhang, W Zhao… - Proceedings of the …, 2015 - dl.acm.org
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Racetrack memory is an emerging non-volatile memory based on spintronic domain wall technology. It can achieve ultra-high storage density, and its read/write speed is comparable to that of SRAM. Because of the tape-like structure of its storage cell, a "shift" operation is needed to access racetrack memory. Prior research has therefore focused mainly on minimizing the shift latency and energy of racetrack memory while leveraging its ultra-high storage density. The reliability of the shift operation, however, has not been well addressed. In fact, racetrack memory suffers from unsuccessful shifts due to domain misalignment. Such a problem is called a "position error" in this work. It can reduce the mean-time-to-failure (MTTF) of racetrack memory to an intolerable level. Even worse, conventional error correction codes (ECCs), which are designed for "bit errors", cannot protect racetrack memory from position errors.
In this work, we investigate the position error model of a shift operation and categorize position errors into two types: "stop-in-middle" errors and "out-of-step" errors. To eliminate stop-in-middle errors, we propose a technique called sub-threshold shift (STS), which performs a more reliable shift in two stages. To detect and recover from out-of-step errors, we propose a protection mechanism called position error correction code (p-ECC). We first describe how to design a p-ECC for different protection strengths and analyze the corresponding design overhead. We then show how to reduce the area cost of p-ECC by leveraging the "overhead region" in a racetrack memory stripe. With these protection mechanisms, we introduce a position-error-aware shift architecture. Experimental results demonstrate that, with our techniques, the overall MTTF of racetrack memory improves from 1.33 μs to more than 69 years, with only 0.2% performance degradation. Trade-offs among reliability, area, performance, and energy are also explored with comprehensive discussion.
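The distinction between bit errors and position errors can be illustrated with a toy model. The sketch below is not the paper's p-ECC construction, which the abstract does not specify. It assumes that an out-of-step error means the stripe moves one domain more or fewer than requested, and it uses a hypothetical cyclic marker (`marker`, with period `PERIOD`) stored alongside each domain. All names here are illustrative assumptions.

```python
# Toy model only: NOT the paper's p-ECC. Every name below is hypothetical.
PERIOD = 3  # assumed marker period; enough to tell step errors -1, 0, +1 apart


def marker(position: int) -> int:
    """Cyclic marker assumed to be stored next to the domain at `position`."""
    return position % PERIOD


def shift(position: int, requested: int, step_error: int = 0) -> int:
    """Position actually reached; `step_error` models an out-of-step shift."""
    return position + requested + step_error


def diagnose(expected_position: int, observed_marker: int) -> int:
    """Infer a step error in {-1, 0, +1} from the marker seen at the port."""
    diff = (observed_marker - marker(expected_position)) % PERIOD
    return diff if diff <= 1 else diff - PERIOD


def read_with_check(data, start, requested, step_error=0):
    """Return (raw bit read, diagnosed step error, bit after corrective shift)."""
    expected = start + requested
    actual = shift(start, requested, step_error)
    err = diagnose(expected, marker(actual))
    corrected = actual - err  # shift back by the diagnosed error
    return data[actual], err, data[corrected]


if __name__ == "__main__":
    stripe = [1, 0, 1, 1, 0, 0, 1, 0]
    # Over-shift by one domain: the raw read is a valid bit from the wrong position.
    print(read_with_check(stripe, start=2, requested=3, step_error=1))
```

In this model the raw read after an over-shift is a well-formed bit from a neighbouring domain, so a code that only checks bit values has nothing to flag; the positional marker is what exposes the misalignment. A period of 3 cannot distinguish an error of +2 from an error of -1, which is one way to see why stronger protection would cost more overhead.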