PREFETCH address+n*stride
Instead of computing n we can use a lookahead PC
Notes:
If one loop iteration takes less time than a memory access, a prefetch issued only one iteration ahead arrives too late. To compensate, we can multiply the stride by some n before adding it to the address, which issues the prefetch n loop iterations ahead of the current one. For example, if n equals two, the prefetch completes in time as long as the memory latency is no more than two loop iterations. The problem, though, is that n is difficult to estimate.
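As a rough illustration of the address computation on the slide, the sketch below picks n as the memory latency divided by the loop iteration time, rounded up, and then forms address + n*stride. The function names and the rounding rule are assumptions made for this example, not something the notes specify. In real hardware neither the latency nor the iteration time is known this precisely, which is why n is hard to estimate.

```python
def prefetch_distance(memory_latency, loop_time):
    """Smallest n (in loop iterations) such that n * loop_time >= memory_latency.

    Both arguments are in cycles. Assumption: they are known exactly,
    which is rarely true in practice.
    """
    n = (memory_latency + loop_time - 1) // loop_time  # integer ceiling
    return max(1, n)  # always look at least one iteration ahead


def prefetch_address(address, stride, n):
    """Address to prefetch n iterations ahead of the current access."""
    return address + n * stride


# Example: latency of 100 cycles, 50 cycles per iteration, 8-byte stride.
n = prefetch_distance(100, 50)
target = prefetch_address(1000, 8, n)
```

With these numbers n comes out as two, matching the example in the notes: the prefetch is issued two iterations before the data is needed.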
Instead of computing n, Chen and Baer proposed using a lookahead PC to time prefetches better. The lookahead PC is designed to run ahead of the regular PC by an amount equal to the memory latency. By indexing the reference prediction table with the lookahead PC rather than the regular PC, we can issue prefetches just in time.
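The toy simulation below sketches the lookahead idea under strong simplifying assumptions; it is not Chen and Baer's actual design. It assumes one instruction per cycle, a single loop with one strided load, perfect branch prediction, and a lookahead PC that is always exactly `latency` cycles ahead. The function name, the table layout, and the stored `iteration` field (a stand-in for tracking how many iterations the lookahead PC is ahead) are all hypothetical.

```python
def simulate_lookahead(loop_len, load_slot, base, stride, latency, cycles):
    """Return (issue_cycle, address) pairs for prefetches triggered by the lookahead PC."""
    rpt = {}     # reference prediction table, indexed by instruction address
    issued = []
    for cycle in range(cycles):
        # Regular PC: executes dynamic instruction number `cycle`.
        pc, iteration = cycle % loop_len, cycle // loop_len
        if pc == load_slot:
            addr = base + iteration * stride
            old = rpt.get(pc)
            rpt[pc] = {
                "prev_addr": addr,
                # The stride is only known after the load has executed twice.
                "stride": addr - old["prev_addr"] if old else None,
                "iteration": iteration,  # modeling shortcut, see lead-in
            }

        # Lookahead PC: examines the instruction `latency` cycles ahead.
        ahead = cycle + latency
        la_pc, la_iteration = ahead % loop_len, ahead // loop_len
        entry = rpt.get(la_pc)
        if entry and entry["stride"] is not None:
            # Number of iterations the lookahead PC is ahead of the last
            # executed instance of this load.
            times = la_iteration - entry["iteration"]
            if times > 0:
                issued.append((cycle, entry["prev_addr"] + times * entry["stride"]))
    return issued


# A 4-instruction loop with the load in slot 0, 8-byte stride, 8-cycle latency.
prefetches = simulate_lookahead(loop_len=4, load_slot=0, base=1000,
                                stride=8, latency=8, cycles=12)
```

In this model each prefetch is issued exactly `latency` cycles before the regular PC reaches the corresponding load, so no explicit n has to be chosen: the distance in iterations falls out of how far ahead the lookahead PC is.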