Slide 6 of 30
Notes:
Cache prefetching overcomes this restriction by bringing data into the L1 cache or an on-chip buffer, hiding as much of the cache miss penalty as possible. No register is allocated, and the address is usually computed speculatively. This slide shows the ideal case, in which the prefetch address was correct and the prefetch was scheduled early enough to eliminate the processor stall completely.
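As a rough illustration of a prefetch that targets the cache rather than a register, here is a minimal sketch using the GCC/Clang builtin `__builtin_prefetch`. The function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are assumptions made for this example, not something taken from the slides, and a suitable distance depends on the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sums an array while hinting that a later element will be needed soon.
 * The prefetch only moves a line toward the cache; it has no destination
 * register and does not change the result of the computation. */
long sum_with_prefetch(const long *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* Arguments: address, 0 = read access, 3 = high temporal locality. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
#endif
        total += data[i];
    }
    return total;
}
```

On compilers without the builtin the loop falls back to a plain sum, which reflects that the prefetch is only a performance hint.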
Prefetch timing is important. If the prefetch is issued too late, only part of the cache miss latency is hidden. If it is issued much too early, the prefetched data may be evicted from the cache before it is referenced.
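The three timing cases can be sketched with a toy model. Everything here is an assumption for illustration: the function `stall_cycles`, its parameters, and especially `eviction_window`, which treats eviction as a hard cutoff even though in a real cache it is only a possibility.

```c
#include <assert.h>

/* Toy model of the stall seen by the processor at the point of use.
 *   miss_latency    - cycles needed to fetch the line from memory
 *   lead            - cycles between issuing the prefetch and the use
 *   eviction_window - assumed number of cycles a prefetched line
 *                     survives in the cache before being evicted */
int stall_cycles(int miss_latency, int lead, int eviction_window)
{
    assert(miss_latency >= 0 && lead >= 0 && eviction_window >= 0);
    if (lead > eviction_window)
        return miss_latency;        /* much too early: line evicted, full miss */
    if (lead >= miss_latency)
        return 0;                   /* ideal: the miss is fully hidden */
    return miss_latency - lead;     /* too late: only part of the miss is hidden */
}
```

For example, with a 100-cycle miss and a 1000-cycle eviction window, a lead of 40 cycles leaves a 60-cycle stall, a lead of 100 cycles leaves none, and a lead of 5000 cycles pays the full 100 cycles again.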