Investigating the Potential of a GPU-based Math Library

Daniel Fay.
M.S. Thesis, Department of Electrical and Computer Engineering, University of Colorado. August, 2007.
In the last few years, Graphics Processing Units (GPUs) have evolved from graphics-specific integrated circuits into high performance programmable vector/stream processors. Contemporary GPUs provide a compelling platform for running compute-intensive applications. In addition to tens of gigabytes per second of memory bandwidth, they also possess vast computation resources capable of achieving hundreds of gigaFLOPS of single precision floating-point computation power. Moreover, the consumer-oriented focus of contemporary GPUs means that even the highest end graphics cards cost well under a thousand dollars. Developments on the software side have also made GPU systems far more accessible for general-purpose use: new programming languages reduce the need for GPU programmers to understand esoteric graphics concepts, and high speed interconnect technologies improve CPU-GPU communication. Developing a high performance math library is one way to help programmers make full use of increasingly powerful GPUs as well as to study the potential of using GPUs for general purpose applications. Math functions are a critical part of many high performance applications, and their use consumes a large percentage of many programs' CPU time. In order for a GPU-based math library to be useful, it must provide accurate results. Similarly, it must show a performance and/or power consumption advantage over a CPU-based math library. This thesis investigates the potential of porting Apple, Inc.'s vForce math library to four different GPUs found in current Apple computers. Using this hardware, the thesis investigates whether current GPU technology can be gainfully employed to run a high performance math library. The thesis evaluates the GPU-ported library on each of the four GPUs using three metrics: accuracy, performance, and power. Comparisons are also made among the four GPUs tested as well as against the CPU version of vForce.
