> I've pointed out to Matthew that using speculative loads (for PA2.0)
> would help to preload cachelines.

On the x86 Athlon boxes it helps massively for memcpy performance too, as a
lot of our page copies are not cache->cache. The other place Arjan found to be
a big win, which isn't yet mainstream, is prefetching task structs in the
scheduler.
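
For anyone who wants to see the shape of the memcpy idea, here is a rough userspace sketch. It is not the kernel's copy routine. It assumes GCC's __builtin_prefetch and a 64-byte cacheline, and the function name copy_with_prefetch and the prefetch distance of 4 lines are made up for illustration; the right distance would need measuring on the hardware in question.

```c
#include <stddef.h>
#include <string.h>

#define CACHELINE      64   /* assumed cacheline size in bytes */
#define PREFETCH_AHEAD 4    /* assumed distance, in cachelines; tune by measurement */

/* Hypothetical helper: copy len bytes from src to dst, issuing a read
 * prefetch a few cachelines ahead of the line currently being copied. */
static void copy_with_prefetch(void *dst, const void *src, size_t len)
{
    const char *s = src;
    char *d = dst;
    size_t off;

    for (off = 0; off + CACHELINE <= len; off += CACHELINE) {
        /* Only prefetch addresses that are still inside the source buffer. */
        if (off + PREFETCH_AHEAD * CACHELINE < len)
            __builtin_prefetch(s + off + PREFETCH_AHEAD * CACHELINE, 0, 0);
        memcpy(d + off, s + off, CACHELINE);
    }

    /* Copy whatever is left over after the last full cacheline. */
    memcpy(d + off, s + off, len - off);
}
```

The prefetch is only a hint, so the copy is correct with or without it. Any gain would come from the source lines already being in flight when the copy reaches them, which is the case that matters when the data is not already in cache.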