Managing Performance-Reliability Tradeoffs in Multicore Processors

William J. Song, Saibal Mukhopadhyay and Sudhakar Yalamanchili. “Managing Performance-Reliability Tradeoffs in Multicore Processors.” The 2015 IEEE International Reliability Physics Symposium (IRPS 2015). April 2015 (Best Student Paper).


There is a fundamental tradeoff between processor performance and lifetime reliability. High-throughput operation increases power and heat dissipation, which adversely affect lifetime reliability. Conversely, lifetime reliability favors low utilization to reduce stress and avoid failures. A key challenge in understanding this tradeoff is connecting application characteristics to device-level degradation behaviors. Using full-system microarchitecture and physics simulation, the performance-reliability tradeoff in a multicore processor is analyzed by introducing a metric, the throughput-lifetime product (TLP). The analysis reveals that reducing the variance of the degradation distribution across the multicore die effectively extends processor lifetime with minimal impact on performance. This concept is referred to as dynamic reliability variance management (DRVM). We discuss three possible microarchitectural techniques that perform DRVM and improve the TLP: i) phase-aware thread migration, ii) dynamic voltage scaling, and iii) turbo-mode execution combined with DRVM. The simulation results with selected PARSEC and SPLASH-2 benchmarks show that DRVM techniques improve processor lifetime by up to 15% or enhance the throughput-lifetime tradeoff by 12% without adding extra design margins or spare components on the multicore die.
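
The intuition behind the TLP metric and variance reduction can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions that are not taken from the paper: degradation is assumed to accumulate linearly in time, the processor is assumed to fail when its most-degraded core reaches a threshold, and thread rotation is assumed to average the per-core degradation rates at a small throughput cost. The rates, the `migration_cost` value, and the helper names `lifetime` and `tlp` are all hypothetical.

```python
from statistics import pvariance


def lifetime(rates, threshold=1.0):
    """Years until the fastest-degrading core reaches the failure threshold.

    Assumption: linear degradation, and the first core failure ends the
    processor's life (no spare cores).
    """
    return threshold / max(rates)


def tlp(throughput, life):
    """Throughput-lifetime product: the plain product of the two quantities."""
    return throughput * life


# Hypothetical per-core degradation rates (fraction of threshold per year).
# One core hosts a hot thread and degrades twice as fast as the others.
pinned = [0.20, 0.10, 0.10, 0.10]

# Assumed effect of rotating threads across cores: every core sees the mean rate.
mean_rate = sum(pinned) / len(pinned)
balanced = [mean_rate] * len(pinned)

base_throughput = 1.0   # normalized throughput without migration
migration_cost = 0.02   # assumed fractional throughput loss from migration

life_pinned = lifetime(pinned)
life_balanced = lifetime(balanced)

tlp_pinned = tlp(base_throughput, life_pinned)
tlp_balanced = tlp(base_throughput * (1.0 - migration_cost), life_balanced)

print("variance:", pvariance(pinned), "->", pvariance(balanced))
print("lifetime:", life_pinned, "->", life_balanced)
print("TLP:     ", tlp_pinned, "->", tlp_balanced)
```

In this toy model the total degradation is unchanged; only its spread across cores shrinks, which is what lengthens the time to first failure. The magnitudes are illustrative and are not comparable to the simulation results reported above.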




author={William J. Song and Saibal Mukhopadhyay and Sudhakar Yalamanchili},
booktitle={The 2015 IEEE International Reliability Physics Symposium (IRPS 2015)},
title={Managing Performance-Reliability Tradeoffs in Multicore Processors},