I need help understanding this:

a 3 GHz CPU ....
Whenever it fetches some data from the fastest cache, it takes just one CPU cycle. All of a sudden,
 it needs a piece of data that is not in any of the caches, so it has to wait for a full main memory
access. With some of the fastest memory available today, it takes 6 memory cycles to fetch a
 cache line from main memory. However, even with memory buses running at 512 MHz,
 the CPU will still have to wait about 36 CPU cycles. Using faster processors or slower
 memory will make things even worse.


Why does the CPU have to wait exactly 36 cycles (how was that calculated), and why would using a faster processor make things worse?
The exact number of cycles, I cannot say. It may be documented for a given piece of hardware.

But say your piece of data is in RAM. To use it, the hardware has to load a block of RAM (a cache line) into the cache, which involves a replacement policy (it figures out which line already in there needs to be evicted to make room), and that takes a moment. Then there is the data copy: a cache line is a block of bytes rather than a single value, so there is a moment to copy it all.

Memory and CPU are distinct hardware, and each has its own speed. If the CPU needs data from memory, and memory takes 1 second to respond (just an easy example, it's not this slow), then a 1 Hz CPU loses 1 cycle waiting on RAM. A 3 GHz CPU loses 3 billion cycles in that same second. The faster processor does not 'make it worse', it just 'spends more CPU clock cycles waiting'. The amount of wall-clock time spent waiting is identical for both CPUs, though. The author is looking at it from a wasted-CPU-cycles perspective, so he is correct, but that is just one way to think about the situation. Actual time spent is another way to look at it, and overall the faster CPU will at some point get ahead of the slower one once the data it needs stays in the cache for a while.
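To make the "same wait, different cycle count" point concrete, here is a minimal sketch. The function name `cycles_lost` is made up for illustration, and the 1-second wait is the deliberately exaggerated figure from the example above, not a real memory latency.

```python
def cycles_lost(cpu_hz: float, wait_seconds: float) -> float:
    """CPU cycles that elapse while the CPU sits waiting for memory."""
    return cpu_hz * wait_seconds


# Same wall-clock wait for both CPUs; only the clock rate differs.
WAIT_SECONDS = 1.0  # exaggerated on purpose, as in the example above

for name, hz in [("1 Hz", 1.0), ("3 GHz", 3e9)]:
    lost = cycles_lost(hz, WAIT_SECONDS)
    print(f"{name} CPU: waits {WAIT_SECONDS} s, loses {lost:,.0f} cycles")
```

The wait in seconds is the same in both rows; only the number of cycles that tick by during it changes.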

I don't recall how the layered cache part works in terms of burning time to copy and shuffle. But that may be part of the effect as well, if you want to get really down in the weeds.
why does the CPU have to wait 36 cycles exactly

It's just math.

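At 512 MHz, one memory cycle is 1 / 512,000,000 = about 1.953 x 10^-9 seconds. If it takes 6 memory cycles, that's about 1.172 x 10^-8 seconds.

If a CPU runs at 3 GHz, then 1 CPU cycle is about 3.333 x 10^-10 seconds.
Dividing the time for 6 memory cycles by the time for 1 CPU cycle gives about 35.2, which is roughly the 36 quoted. (You get exactly 36 if you count 1 GHz as 1024 MHz: 3072 / 512 = 6 CPU cycles per memory cycle, times 6 memory cycles.)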
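The same arithmetic as a small sketch, in case you want to plug in other clock speeds. The function name `stall_cycles` is just an illustrative choice, and the model assumes the only cost is the stated number of memory cycles (no cache lookup or bus overhead).

```python
def stall_cycles(cpu_hz: float, mem_hz: float, mem_cycles: int) -> float:
    """CPU cycles that pass while the memory takes mem_cycles of its own clock."""
    cpu_cycles_per_mem_cycle = cpu_hz / mem_hz
    return mem_cycles * cpu_cycles_per_mem_cycle


# Decimal (SI) prefixes: 1 GHz = 10^9 Hz, 1 MHz = 10^6 Hz
si = stall_cycles(cpu_hz=3e9, mem_hz=512e6, mem_cycles=6)

# Binary-style prefixes: 1 GHz = 2^30 Hz, 1 MHz = 2^20 Hz
binary = stall_cycles(cpu_hz=3 * 2**30, mem_hz=512 * 2**20, mem_cycles=6)

print(f"SI prefixes:     {si:.2f} CPU cycles")
print(f"binary prefixes: {binary:.2f} CPU cycles")
```

Raising `cpu_hz` or lowering `mem_hz` makes the result grow, which is the "even worse" part of the quoted passage.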

If the processor is faster or if the memory is slower, then it takes even more CPU cycles to access main memory, which means the CPU is blocked for even more cycles.
