The Design of Cost-Effective Stride-Prefetching for Modern Processors
Hassan Al-Sukhni, James Holt, Daniel A. Connors, Mike Snyder, Matt Smittle,
Brian Grayson
4th Workshop on Memory Performance Issues (WMPI-2006)
February 2006.
|
Data prefetching of regular access patterns is an effective mechanism
to hide the memory latency for modern microprocessors. However, to be
included in an architecture design, prefetching systems must be
cost-effective and have little impact on the microarchitecture. For
example, while many proposed prefetching systems use the full program
counter (PC) to help detect patterns with arbitrary strides, such
systems are impractical and prohibitively costly. To overcome the issues
related to using the entire PC for effective prefetching, this
paper combines other instruction attributes with a small subset of the
PC to help detect the regularity in program data accesses. Such
detection is enabled by a finite state machine that resolves data
stream allocation, maintains prefetch priorities, and manages prefetch
run-ahead. The experimental results suggest that as few as 4 bits
of the PC are sufficient to come within 1% of the prefetching
effectiveness achieved with the full PC.
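
The general idea of indexing stride detection by a few PC bits can be illustrated with a minimal sketch. It is not the paper's design: the class names, the 2-bit confidence counter, the threshold, the run-ahead distance, and the assumption of 4-byte-aligned instructions are all hypothetical choices made for illustration. The additional instruction attributes and the finite state machine described in the abstract are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class StreamEntry:
    """One tracked data stream (hypothetical layout, not from the paper)."""
    last_addr: int = 0
    stride: int = 0
    confidence: int = 0  # saturating counter in the range 0..3
    valid: bool = False


class StridePrefetcher:
    """Stride detector indexed by a small subset of the PC bits."""

    def __init__(self, pc_bits=4, threshold=2, distance=1):
        self.mask = (1 << pc_bits) - 1
        self.threshold = threshold  # confidence needed before prefetching
        self.distance = distance    # how many strides to run ahead
        self.table = [StreamEntry() for _ in range(1 << pc_bits)]

    def index(self, pc):
        # Assumption: instructions are 4-byte aligned, so the two lowest
        # PC bits carry no information and are dropped before masking.
        return (pc >> 2) & self.mask

    def access(self, pc, addr):
        """Record a load; return an address to prefetch, or None."""
        entry = self.table[self.index(pc)]
        if not entry.valid:
            entry.last_addr, entry.stride = addr, 0
            entry.confidence, entry.valid = 0, True
            return None
        stride = addr - entry.last_addr
        if stride != 0 and stride == entry.stride:
            # Same non-zero stride seen again: gain confidence.
            entry.confidence = min(entry.confidence + 1, 3)
        else:
            # Stride changed: retrain on the new value.
            entry.stride = stride
            entry.confidence = 0
        entry.last_addr = addr
        if entry.confidence >= self.threshold:
            return addr + entry.stride * self.distance
        return None


if __name__ == "__main__":
    prefetcher = StridePrefetcher(pc_bits=4)
    for addr in (0x100, 0x140, 0x180, 0x1C0):
        print(hex(addr), prefetcher.access(0x1000, addr))
```

With only 4 index bits, loads whose PCs share those bits (for example `0x1000` and `0x1040` in this sketch) land in the same entry and can disturb each other's stride training. Aliasing of this kind is presumably one reason the paper supplements the PC subset with other instruction attributes.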