A fair warning: this isn’t going to be terse and to the point. It’s a story (but packed with information, more or less useful).
A lesson: you’ll learn how the exact same hardware configuration and the exact same software settings can still cost your PC up to 10% of its speed. And you can’t do anything about it.
A few years ago I had to buy an LCD monitor. I was still using old, bulky CRTs, and I was horrified when I compared them side by side with a standard LCD. To summarize, I have a big problem with a number of aspects, but especially color banding. That is a thing in MOST consumer LCD monitors, because they “cheat” the spec and tell you they are 8-bit monitors when instead nearly all of them are 6-bit + FRC, a trick to “simulate” a higher number of colors, leading to horrid color banding. True 10-bit HDR monitors don’t have this shit, but of course the price tag is different…
But none of that matters; the important part is that when it was overdue time to buy the LCD I decided to go with one of the most common models by DELL, one that usually had pretty good reviews and was generally recommended. Turns out the monitor is kind of bad, and after some research I found out the problem: starting from a certain point in time, DELL replaced the internal panel with a cheaper one from a completely different producer. I bought the EXACT same model, same ID, that got good reviews for years. And so I learned a new lesson: sometimes hardware vendors change the product and sell you something that isn’t what you think it is. And it’s not simple to find out, unless you take the thing apart yourself…
There were then variations on the same theme. For example, I remember that when AMD pushed out the R9 290 they sent reviewers specially selected samples that performed much better than the actual retail cards people bought. And all the problems with thermal throttling and BIOS updates that followed.
Now I’m building a new system, to replace… a dual-core E8400 with 4GB of RAM. An overdue update.
Since prices are a bit crazy right now, I decided to stay in the previous generation but aim relatively higher than I normally would. So: Intel i9 10900K(F). For comparison, another PC I have (but don’t use) runs a 4770K, which is relatively recent. It has 4 cores at 3.5GHz, 3.9 in turbo. It also overclocks fairly easily, even if the temperature goes up fast; in general it can do 4.2 even on air.
But what I knew, up to this point, is that when the CPU is busy it can easily reach its 3.9 turbo and sustain it. It’s a real speed: what’s advertised is what you get.
Now I’m reading about the 10900K, and it’s a mess. It starts at 3.7GHz by default, so just 200MHz above the base 4770K, but nominally it goes up to 5.3! That, though, is a totally different story from the 3.9 turbo of the 4770K.
It basically turns your home PC into a phone, where the performance written on paper is useless, because the CPU can keep it only in very short bursts before power and thermal limits kick in, and the 5.3 applies only to 2 of the 10 cores. The all-core turbo is set at 4.8, but that too is a lie, because it’s still burst-limited. It can go to 4.8 on all cores… for a minute or less.
And then come the motherboard manufacturers, who by default enable their new performance mode to “remove the limits”. This is called MCE (Multi-Core Enhancement) and it’s based on a terrible idea, because removing those limits means the CPU sucks an unreasonable amount of energy, gets extremely hot very quickly, and then throttles back FASTER than it did with the default settings. The result is you consume more energy while losing performance. INNOVATION!
So, I bought a 10900K, and I have no idea what speed it can ACTUALLY sustain. It’s going to be trial and error, and learning technical details that look more like alchemy: setting obscure PL1/PL2 power limits and all that stuff.
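To give an idea of how those limits work: Intel CPUs have a sustained power limit (PL1), a short-term burst limit (PL2), and a time constant (Tau) that decides how long the burst lasts. Here is a minimal toy sketch of the idea in Python, assuming the stock 10900K values (125W / 250W / 56s) and a simplified moving-average model; the real firmware algorithm is more involved, so treat the numbers as illustrative only:

```python
# Toy model of Intel's PL1/PL2/Tau turbo budget, the mechanism behind
# the "burst" behaviour described above. Values are the stock 10900K
# limits; the moving average is a simplification of the real firmware.
PL1, PL2, TAU = 125.0, 250.0, 56.0  # sustained watts, burst watts, seconds

def burst_seconds(draw_w: float, dt: float = 1.0, horizon_s: float = 600.0) -> float:
    """Seconds the CPU can draw `draw_w` before the rolling average of
    power crosses PL1 and the clocks get clamped back down."""
    avg, t = 0.0, 0.0
    while t < horizon_s:
        avg += (dt / TAU) * (draw_w - avg)  # exponential moving average
        if avg > PL1:
            return t
        t += dt
    return t

print(burst_seconds(250.0))  # ~38 s of full PL2 draw in this toy model
print(burst_seconds(120.0))  # stays under PL1: sustained for the whole horizon
```

Which lines up with the “a minute or less” behaviour: the advertised all-core turbo is real only until this budget runs out. MCE just raises the limits sky-high, so instead of the firmware clamping gracefully, the thermals do it for you.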
This CPU is a cheater.
But the problem here isn’t a CPU, and it isn’t even the problem of specific hardware parts that perform better or worse due to the pure luck known as the “silicon lottery”. The theme here is something I just discovered, that seems to make a rather big difference, but that no one usually considers.
What do we generally know? Nothing too complex. One usually goes for as much RAM capacity as one can reasonably afford. Then you look at the maximum frequency, since that provides the bandwidth, and then maybe you look at the timings. It gets tricky because there isn’t any exact formula to decide when lower timings are better than higher frequency. But that’s all, right?
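There is at least a rough rule of thumb for the latency side of that trade-off: first-word latency in nanoseconds is CL × 2000 divided by the transfer rate, since DDR transfers twice per clock. It doesn’t settle the whole debate (bandwidth matters too), but a quick sketch shows why it’s not as simple as “bigger CL = worse”:

```python
def first_word_latency_ns(transfer_rate_mts: int, cas_latency: int) -> float:
    """Approximate first-word latency in ns: CL cycles at the I/O clock.
    DDR transfers twice per clock, so one cycle lasts 2000/rate ns."""
    return cas_latency * 2000 / transfer_rate_mts

print(first_word_latency_ns(3200, 16))  # 10.0 ns
print(first_word_latency_ns(3600, 16))  # ~8.9 ns: same CL, shorter cycles
print(first_word_latency_ns(3600, 18))  # 10.0 ns: same latency as 3200 CL16,
                                        # but with more bandwidth on top
```

So 3600 CL18 already matches 3200 CL16 on latency while winning on bandwidth; the genuinely hard calls are the in-between cases.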
No one gives a shit about these things, because when you look at a benchmark the difference is minimal:
See? That’s +3 fps, going from 3600 to 4000. Who cares?
That’s, again, why no one cares about RAM. You get the amount you want, find a decent frequency within the budget you have, and done. No one will ever notice the difference.
But there’s this other thing that not many know about. You know about single channel and dual channel. These days you generally buy modules in pairs, for dual channel. It’s generally for the best: you can go to four, but there’s no difference because they still run in dual channel. And so you stick to two, because two are usually easier to keep stable than four.
What you might not know about is… single rank and dual rank.
The greatest absurdity here is that THERE’S NO WAY to find out when you buy a module. It’s a surprise. Either you find out by looking at a special code (which no online shop will show you), or you’ll have to plug in the memory and use software to probe the module itself.
What are single rank and dual rank? Just the way the memory chips are arranged on the memory module. But the big deal is that it makes a rather significant DIFFERENCE IN PERFORMANCE, because with dual rank the memory controller basically gets broader parallel access to the chips (it can interleave requests between the two ranks), so it is faster. Real-world faster, not just in some benchmarks. I’ve seen tests where single rank loses up to 10 fps, and that’s WAY MORE than the typical difference shown above.
What makes it worse is that the EXACT same memory module, same producer, same ID, can be single rank or dual rank. On paper, single rank is “newer” and better: it means the individual memory chips are twice as dense, so half as many are needed for the same capacity. But since they are accessed as a single rank, the module ends up slower.
Now… Manufacturers move from dual rank to single rank, because for them it’s more efficient to produce. And more and more people (like me) find out that what until yesterday was a standard dual-rank model today becomes single rank, ending up with much different performance despite having bought the EXACT same hardware part, at the same price. Welcome: your brand new PC is now 10% slower. Same modules, same prices… at some point the hardware vendors made a sneaky update and started selling something that is 10% slower. And almost no one is aware of this, while maybe still fretting about frequency and timings that are a drop in the ocean when it comes to performance.
This is especially hitting what I suppose is now one of the most common, quite mainstream targets: 2x16GB kits. It’s these that are now commonly “upgraded” to single rank.
It also makes the whole deal with dual channel more complex, because single/dual rank applies on top of it. FOUR single-rank modules perform like TWO dual-rank ones. This means that if your memory is single rank YOU GET MORE BANDWIDTH by having FOUR modules onboard instead of two. This throws out of the window the rule that once you are in dual channel there’s no difference between two and four modules. Four is better (if single rank).
Yet it might still be slightly worse to have four sticks, because driving four sticks (two ranks per channel) stresses the CPU memory controller a bit more, and especially with Ryzen this might be a step too far. So, roughly from best to worst:
- two sticks in dual rank,
- or four in single rank,
- and last, two sticks that are single rank, even if you are in that lovely dual channel.
In many cases, with the exact same timings, a dual-rank kit at 3200 performs up to 10% FASTER than a single-rank kit at 3600.
But you don’t know what to buy, because no one cares to tell you whether the module is single or dual rank.
I bought this:
It’s a fairly typical 2x16GB 3600 CL16 kit. You can look at the “specifications” and you’ll find the usual information, but no one will tell you if it’s dual or single rank. Neither will the online shops. And you can’t know in advance, because Crucial produced the exact same module BOTH in dual and single rank.
You can tell from the product code: if it ends with M16FE1 it’s dual rank, if it ends with M8FB1 it’s single rank. But this code isn’t shown by the shops that sell it. You’ll find out when you receive it.
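Once the sticks are installed you can at least probe them in software. On Linux, `sudo dmidecode -t memory` prints a `Rank:` field for each module; a small sketch that pulls those fields out (the sample below is an abbreviated, made-up dump, not real output):

```python
import re

def ranks_per_module(dmidecode_text: str) -> list[int]:
    """Extract the 'Rank: N' field of each Memory Device entry
    from the output of `dmidecode -t memory`."""
    return [int(n) for n in
            re.findall(r"^\s*Rank:\s*(\d+)", dmidecode_text, re.MULTILINE)]

# Abbreviated sample of what dmidecode reports per populated slot:
sample = """\
Memory Device
    Size: 16 GB
    Rank: 2
Memory Device
    Size: 16 GB
    Rank: 2
"""
print(ranks_per_module(sample))  # [2, 2] -> both sticks dual rank, what you want
```

On Windows, tools like CPU-Z show the same information in the SPD tab. Cold comfort, since by then you’ve already paid.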
This is a Crucial rep from a year ago: “We only use Micron tuned die in our Crucial Ballistix line, that’s something we are committed to!”
Yes, because they care!
… Then later that year they moved from the E-die to a B-die, which for these modules means single rank and significantly worse performance. Now the great majority of the models in circulation are the “updated”, nerfed model. Same price, -10% performance, ACTUAL real-use performance, and something only the hardware geeks find out about.
People went crazy when the Meltdown mitigations forced a loss of performance that was minimal, at least outside of specific workloads. Now we have something significant that changes from one day to the next, and pretty much no one knows about it. It’s hidden away in some serial code.
Even the difference between different major CPUs isn’t 10%, in many real use cases.
(By the way, this also means that buying 4 sticks of 8GB each might be significantly better, and cheaper, than buying 2 sticks of 16GB each. With four sticks you are guaranteed two ranks per channel, and it’s also easier to find good timings with smaller modules.)