The problems with e-cores are multiple. When you are using ALL cores 100% of the time, you lose a LOT of performance, and AVX-512 is not even allowed, even on the early Alder Lake chips before BIOS updates. I could go on. For Intel there may be a reason for them on desktop; for distributed computing, none. Perf/watt is then a useless metric.
Well, I think Stefan's original concern was w.r.t. what the perf/W will be for Zen5.
I'm suggesting that if it's for MT workloads, then E cores could have improved that, but it looks like there won’t be any on Zen5 DT.
So then the question is what perf/W improvement we can expect from the regular Zen5 cores. There's not been any leaks or even speculation about that IIRC. Or has someone found any such info, or would care to guesstimate?
Edit: do you see e-cores in Intel server chips? Not that I know of (could be wrong). AMD c-cores have 100% functionality and just run a little slower for more density.
There is a 288 e-core (and therefore thread) only server chip coming soon. It will be stomped by 384 thread Turin, in both performance and functionality...
Only 352? What is this heresy :-D
I think Stefan was talking about 32-128 fat cores with AVX-512 for DT performance, but not speaking for him. Yes, he and I both already have EPYC systems, and Intel won't do, due to efficiency. I have 352 Genoa cores myself.
Be interesting to see areal density too.
Now that we've smoked out the rat, I can't wait for details on Zen 5 LP.
Not just the perf, but what did they take out, power draw, etc.
Someone already mentioned that with the split AVX 512 implementation they had in Raphael, they can keep AVX 512 including in the LP cores. But will they?
I want a full breakdown of the thing, see what was taken off, etc. I love low power stuff.
But Genoa is 5nm, and Turin is supposed to be 3 or 4 nm?
Be interesting to see areal density too.
Given Bergamo was only a 1.33x increase in cores over Genoa and the Zen5 successor is supposed to be more like 1.5x there must be a significant difference in layout there too.
We know it isn't 8-wide decode. It does something in decode, but we don't know exactly what; two fetch blocks is all that is listed. Is that parallel, used for branches, etc.?
I'm preparing a little video (no I'm not trying to be a MLID/RGT, it's a different kind of video) about Zen 5, can we recap what we know about its internals?
- 8 wide decode
- Same or higher clocks
- SPECINT +40%
- full width AVX 512 implem
What else?
I’ve heard from people that I consider reliable that it uses N4X. I’ve got a hard time believing it since N4X was regarded as a bit of a meme.
Turin is N4P or N4X (not sure which tbh), which is in the same family as N5. Just better/more refined.
Early Q3 is one estimate.
Does anyone know when the 9000 series will hit the stores? Microcenter here I come!
April but now July.
What about @adroc_thurston saying it would already be available? Surely he is an insider, no?
On a single socket? Are you able to utilize them properly or do you need to resort to putting your workloads in VMs for better core occupancy?
I have 352 Genoa cores myself.
Not sure if the Win11 scheduler has been improved but Linux is supposedly better at dealing with hybrid cores: https://www.phoronix.com/news/Linux-6.5-Intel-Hybrid-Sched
Some distributed computing enthusiasts do have Intel p+e CPUs, but even though an e core performs roughly similarly to one p HT thread, these CPUs are still awkward to handle in a distributed computing node. Just recently I heard of weird issues with Windows' CPU time accounting on these CPUs. And way before that I saw several reports of performance problems of multithreaded distributed computing applications on these CPUs, which are completely to be expected and can only be worked around by restricting the application to run on cores of the same type.
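For anyone who wants to try the "restrict the application to cores of the same type" workaround on Linux, here is a minimal sketch. It assumes a hybrid Intel CPU on a recent kernel that exposes the core lists at `/sys/devices/cpu_core/cpus` and `/sys/devices/cpu_atom/cpus`; those paths are an assumption about your system, not something from the posts above, and the helper names `parse_cpu_list` and `pin_to_core_type` are made up for illustration.

```python
import os
from pathlib import Path

# Assumed sysfs locations of the P-core and E-core lists on hybrid Intel CPUs.
# They may be absent on other hardware or older kernels.
P_CORE_LIST = Path("/sys/devices/cpu_core/cpus")
E_CORE_LIST = Path("/sys/devices/cpu_atom/cpus")


def parse_cpu_list(text):
    """Turn a kernel CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_core_type(list_file, pid=0):
    """Restrict a process (0 = the caller) to the CPUs named in list_file."""
    cpus = parse_cpu_list(list_file.read_text())
    if not cpus:
        raise RuntimeError(f"no CPUs listed in {list_file}")
    os.sched_setaffinity(pid, cpus)  # Linux-only call
    return cpus


if __name__ == "__main__":
    if P_CORE_LIST.exists():
        print("Pinned to P-cores:", sorted(pin_to_core_type(P_CORE_LIST)))
    else:
        print("No hybrid core list found; affinity left unchanged.")
```

Child processes inherit the affinity mask, so calling this before launching the worker threads or processes should keep the whole job on one core type. Swap in `E_CORE_LIST` to do the opposite.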
Special instructions to accelerate AI workloads. (Source: AMD slides)
You forgot the AI gimmick.
Intel can't even afford to put in 16 fat cores. Power consumption would either shoot beyond 500W or the cores would be power starved if Intel limits the TDP.
I think Stefan was talking about 32-128 fat cores with AVX-512 for DT performance
What about: Possibly stagnant or even lower SMT uplift, despite beefier execution resources, due to much improved frontend; source: Hopium? :-)
Possibly more performant SMT (due to beefier execution resources). (Source: Hopium)
My hopium is for >30% ST uplift.
What about: Possibly stagnant or even lower SMT uplift, despite beefier execution resources, due to much improved frontend; source: Hopium? :-)
It will hopefully be higher in certain cases. Some applications/games will benefit more from the expanded resources than others that are bottlenecked elsewhere, either due to bad programming or simple limits of x86 instruction execution.
My hopium is for >30% ST uplift.