A Unified Approach to Adaptive Code Selection for Modern Systems

Joseph Blomstedt
M.S. Thesis, Department of Electrical and Computer Engineering, University of Colorado. May, 2008.
Code compilation and deployment is a much more complex affair today than it was in years past. Historically, the primary metric of compiled code was raw performance, and an individual simply compiled his code using the best available compiler and optimization flags to get the best resulting binary. Applications that required the utmost performance would take the step even further and utilize advanced techniques such as profile-guided optimization. In recent years, however, additional metrics have arisen alongside performance, such as power consumption, temperature dissipation, and overall system throughput. To further complicate the matter, the trend in modern microprocessor design is increasingly focused on multi-threaded architectures, and, by virtue of many engineering decisions, all existing and proposed multi-threaded designs include some number of hardware subsystems that are shared between the various contexts of execution. Contention over these shared resources provides an additional dimension to evaluate. Simply compiling a given application and evaluating its performance in isolation is no longer the optimal approach, since characteristics of the eventual dynamic co-scheduled workload will invariably have unevaluated effects.
To address this issue, this thesis presents ARCA, a compilation toolchain designed to generate numerous versions of a piece of code, profile and classify the various versions, and subsequently select between the different code versions at runtime based on dynamic workload characteristics. The primary contributions of the work are a version analysis and classification approach that allows a large set of candidate versions to be culled down to a useful subset, a set of three dynamic adaptation algorithms that guide the runtime selection between the available code versions, and the demonstration of a unified framework capable of addressing a number of emerging issues.
In addition to these contributions, this work showcases the ARCA framework with two case studies: one focusing on performance improvement on multi-threaded systems, and one focusing on the trade-off between temperature dissipation and performance. Additionally, this work presents future research opportunities that the techniques presented in this thesis are well suited to address.
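
The idea of selecting between code versions at runtime can be illustrated with a small sketch. This is not ARCA and not one of the three adaptation algorithms from the thesis, which the abstract does not describe. It is a generic sample-then-exploit scheme, and the names VersionSelector, resample_every, and clock are hypothetical. The sketch assumes every version computes the same result and that wall-clock time is the only metric of interest.

```python
import time
from typing import Callable, Dict, List, Optional

Version = Callable[[List[int]], int]


class VersionSelector:
    """Illustrative runtime selector (hypothetical, not ARCA's algorithm).

    Every `resample_every` calls it times each candidate version once on the
    live input and then keeps using the fastest one until the next resample.
    """

    def __init__(self, versions: Dict[str, Version], resample_every: int = 50,
                 clock: Callable[[], float] = time.perf_counter) -> None:
        if not versions:
            raise ValueError("at least one version is required")
        self.versions = versions
        self.resample_every = resample_every
        self.clock = clock  # injectable so the selector can be tested
        self.calls = 0
        self.current: Optional[str] = None

    def _resample(self, data: List[int]) -> int:
        # Run every version once, remember which was cheapest right now.
        best_name, best_cost, result = None, float("inf"), 0
        for name, fn in self.versions.items():
            start = self.clock()
            result = fn(data)
            cost = self.clock() - start
            if cost < best_cost:
                best_name, best_cost = name, cost
        self.current = best_name
        return result  # all versions are assumed to return the same value

    def __call__(self, data: List[int]) -> int:
        due = self.current is None or self.calls % self.resample_every == 0
        self.calls += 1
        if due:
            return self._resample(data)
        return self.versions[self.current](data)


def sum_loop(xs: List[int]) -> int:
    """One candidate version: explicit loop."""
    total = 0
    for x in xs:
        total += x
    return total


def sum_builtin(xs: List[int]) -> int:
    """Another candidate version: built-in sum."""
    return sum(xs)


if __name__ == "__main__":
    select = VersionSelector({"loop": sum_loop, "builtin": sum_builtin})
    for _ in range(200):
        select(list(range(1000)))
    print("currently selected version:", select.current)
```

Resampling periodically, rather than choosing once, is what lets the choice follow a changing co-scheduled workload. A real system would also have to bound the cost of running the slower versions during each resample.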
