A Detailed Study of the Numerical Accuracy of GPU-Implemented Math Functions
Dan Fay, Ali Sazegari, Daniel A. Connors.
Supercomputing '06 Workshop on General-Purpose GPU Computing: Practice And Experience.
November 2006.
|
Modern programmable GPUs have demonstrated their ability to
significantly accelerate certain important classes of non-graphics
applications; however, GPUs' slipshod support for floating-point
arithmetic severely limits their usefulness for general-purpose
computing. Current GPUs do not support double-precision computation
and their single-precision support glosses over important aspects of
the IEEE-754 floating-point standard. Producing correctly rounded
results and providing proper closure of the number system are critical
for the adaptation of GPUs to general-purpose computing.
Previous studies of GPUs' numerical accuracy quantified only the
"overall" accuracy of different arithmetic and math functions on the
GPU by providing an average error and/or an error bound for each
operation. Since many algorithms' correctness depends on the precise,
consistent results provided by the IEEE-754 floating-point
standard[5], it is also essential to exactly quantify the GPUs'
correctness for important edge cases. These edge cases deliberately
expose numeric errors likely to occur in IEEE-754 implementations,
such as inputs that involve denormalized numbers, +/- 0, infinities,
and Not a Number (NaN).
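
As an illustration of the edge-case classes listed above, the following minimal sketch sorts a value into those classes by inspecting its single-precision bit pattern. It runs on the CPU using Python's standard struct module; the function names float32_bits and classify_float32 are hypothetical and are not taken from the paper's test suite.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def classify_float32(x):
    """Name the IEEE-754 class of x once it is rounded to single precision."""
    bits = float32_bits(x)
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent field
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0xFF:
        # All-ones exponent: infinity if the fraction is zero, NaN otherwise.
        return "nan" if fraction else sign + "inf"
    if exponent == 0:
        # All-zeros exponent: signed zero or a denormalized number.
        return (sign + "zero") if fraction == 0 else (sign + "denormal")
    return sign + "normal"


for value in (1.0, -0.0, 1e-40, float("inf"), float("nan")):
    print(repr(value), classify_float32(value))
```

A test harness could use such a classifier to confirm that each edge-case class is represented among its inputs before they are sent to the GPU.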
GPUs must also provide the programmer with robust results: given
the same input, the same program should produce the same output
regardless of the GPU platform. Such robustness needs to exist not
only between different GPU vendors, but also between different GPU
software platforms (shader language compiler, driver and operating
system) and between vendors' GPU families.
To investigate the issues of edge-case correctness and robustness,
we tested the accuracy of the basic arithmetic operators (add,
subtract, multiply and divide) as well as other important math
functions (sine, cosine, tangent, exponential, etc.). These tests
were run on a variety of different GPU platforms from both ATi and
nVIDIA. For the math functions, we tested the GPUs' results produced
using the math functions built into the OpenGL Shading Language
(GLSL) along with the results produced with a GPU port of the
high-performance Cephes Math Library. Finally, we compared these
results against reference values produced by libm as well as by
vForce, Apple Computer's high-performance vectorized math library.
Our results show that there are serious errors in the GPUs'
results at certain edge cases, in addition to the incorrect handling
of denormalized numbers. One example of this is the incorrect
handling of -0. This causes problems with division; for example,
+1/-0 should equal -infinity, not +infinity. Another example is the
square root, where the sqrt() function in GLSL completely ignores the
sign of the operand, returning a positive normal number instead of a
NaN for negative inputs. Finally, we have observed inconsistencies between GPUs from
different vendors. An example of this is with 0/0: the nVIDIA GeForce
FX 7300 correctly produces a NaN result, while the ATi x1600 hardware
produces an incorrect result of +0.
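
The reference results that IEEE-754 prescribes for the cases named above can be sketched on the CPU as follows. This is only an illustration, not the paper's test code: Python raises an exception on float division by zero and on the square root of a negative number, so the hypothetical helpers ieee_divide and ieee_sqrt spell out the standard's rules explicitly.

```python
import math


def ieee_divide(a, b):
    """Divide a by b, following the IEEE-754 rules for zero divisors."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if b == 0.0:
        if a == 0.0:
            return math.nan  # 0/0 is an invalid operation and yields NaN
        # Nonzero / zero yields infinity whose sign is the product of the signs.
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(math.inf, sign)
    return a / b


def ieee_sqrt(x):
    """Square root following IEEE-754: NaN below zero, sign of zero preserved."""
    if math.isnan(x) or x < 0.0:
        return math.nan
    if x == 0.0:
        return x  # sqrt(+0) is +0 and sqrt(-0) is -0
    return math.sqrt(x)


print(ieee_divide(1.0, -0.0))   # -inf
print(ieee_divide(0.0, 0.0))    # nan
print(ieee_sqrt(-4.0))          # nan
```

Comparing a GPU's output against values like these is one way to flag the -0, sqrt() and 0/0 discrepancies described above.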