Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
My take, I guess, is just my own, but it's close to yours.
Well, not owning Zen 1 and Zen 3 myself, I don't ultimately trust my own assessments of them. Though back in the day, the Zen 1-->2 step evidently was a big one in perf/host and perf/W thanks to the Glofo 14nm --> TSMC 7nm switch, but not only due to that as the Zen 2 core and SOC update was far from a straightforward shrink.

The step which lies ahead, TSMC 5nm --> 4nm, will be nothing in comparison. Yet AMD appears to widen the core a lot, presumably with a lot of smarts put into the frontend so that this width can actually be put to use, while practically keeping the power budget per core unchanged. I am really curious how that will turn out in power-limited loads.

Efficiency is key for us,
Yep, as the aggregate core count in the household reaches certain above-average levels, and many of these cores are actually used 24/7 (be it for Citizen Science or for engineering jobs etc.), small things like the electric bill, the heat load in the home, or which computer to attach to which power circuit do become more of a concern. I find myself thinking more often in terms of perf/host and perf/W than perf/core. So, while the (alas rather circular) iso-clock performance discussions (vulgo: IPC) here in this thread are surely interesting, what I am looking forward to more is to eventually get to see perf/W figures.
 

StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
For compute nodes,
– CPUs with cores of uneven per-core performance,
– area-optimized cores
are not attractive. You'd want
+ CPUs with homogeneous cores,
+ cores and SOCs which are optimized towards a certain point between the three targets performance, performance efficiency, and performance density.
The particular location of the optimization sweet spot depends on your cost structure (e.g. whether or not there are software licensing costs involved; whether or not rack space is at a premium to you…).

Edit, that's also true for home computers, if used for computing in the narrower sense, "HPC at home" if you will. E.g. when I built my first two dual-socket computers a while back, I needed not just plain perf/dollar (which would have been much better with desktop computers) but also perf/node (due to synchronization overhead in my application, which was too high over Ethernet for my purpose) and perf/core (due to scaling difficulties in this application). If CPUs with "e cores" had been available back at that time, they would not have been what I needed due to the latter aspect. Edit 2, nowadays I have accumulated enough computers that "rack space" (shelf space actually) is definitely a criterion to me too. (Energy consumption more so, though.)
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,867
336
126
For compute nodes,
– CPUs with cores of uneven per-core performance,
– area-optimized cores
are not attractive. You'd want
+ CPUs with homogeneous cores,
+ cores and SOCs which are optimized towards a certain point between the three targets performance, performance efficiency, and performance density.
The particular location of the optimization sweet spot depends on your cost structure (e.g. whether or not there are software licensing costs involved; whether or not rack space is at a premium to you…).

Edit, that's also true for home computers, if used for computing in the narrower sense, "HPC at home" if you will. E.g. when I built my first two dual-socket computers a while back, I needed not just plain perf/dollar (which would have been much better with desktop computers) but also perf/node (due to synchronization overhead in my application, which was too high over Ethernet for my purpose) and perf/core (due to scaling difficulties in this application). If CPUs with "e cores" had been available back at that time, they would not have been what I needed due to the latter aspect. Edit 2, nowadays I have accumulated enough computers that "rack space" (shelf space actually) is definitely a criterion to me too. (Energy consumption more so, though.)
I guess it depends on what workloads you are running. Most people with DT systems do not have them mounted in racks. So space is not really a concern.

I think for a typical DT user with mixed workloads this is more important:
1. Max ST performance up to a certain amount of cores, e.g. ~8C.
2. For use cases with above ~8C, you want max MT perf, max perf/watt, and max E core count for lowest price.

For 1) you want P cores, and for 2) you want E cores. For those only needing 1), they can be satisfied with ~8C P cores only.
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,867
336
126
Why not up to 16 threads, that is, 16C?

Because Intel can't provide more than 8?
Could be more than 8C, which is why I wrote ~8C.

But I think at some point, if you're targeting max MT perf, max perf/watt, and max core count at lowest price, then using only P cores often does not make sense (although there are exceptions).

Ideally you’d like to have separate SKUs tailored for each individual user’s needs. But that’s not realistic.

That said, I think e.g. 8P+0E, 8P+16E, 16P+16E, and 8P+24E could all make sense on DT. The question is how many SKUs are realistic to provide. But I think at least some mix of P and E cores would be good. Basically all CPU manufacturers other than AMD already provide that, both on mobile and desktop. I guess they’re doing it for a reason.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,670
14,676
136
If you want more P cores, then why don’t you buy an EPYC CPU. Or are you too poor for that?
I think Stefan was talking about 32-128 fat cores with AVX-512 for DT performance, but I'm not speaking for him. Yes, he and I both already have EPYC systems, and Intel won't do, due to efficiency. I have 352 Genoa cores myself.
 
Reactions: Tlh97 and StefanR5R

Abwx

Lifer
Apr 2, 2011
11,103
3,780
136
Could be more than 8C, which is why I wrote ~8C.

But I think at some point, if you're targeting max MT perf, max perf/watt, and max core count at lowest price, then using only P cores often does not make sense (although there are exceptions).

Ideally you’d like to have separate SKUs tailored for each individual user’s needs. But that’s not realistic.

That said, I think e.g. 8P+0E, 8P+16E, 16P+16E, and 8P+24E could all make sense on DT. The question is how many SKUs are realistic to provide. But I think at least some mix of P and E cores would be good. Basically all CPU manufacturers other than AMD already provide that, both on mobile and desktop. I guess they’re doing it for a reason.

You said mixed workloads, so that assumes some multitasking. Hence some encoding, for instance, would easily make use of 8 threads while you would still have 8 other strong threads for other tasks, because I don't think anyone does nothing with their PC while it is encoding some files.

Now, Intel's mixed cores can work as well, but from 8 to 16 threads they would be at a big disadvantage, peaking at 16 threads with 8P + 8E facing 16P. As the thread count increases beyond 16, the disadvantage would shrink, but only progressively, as things would become equivalent only when getting close to 32 threads.
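
To put rough numbers on the scaling argument above, here is a minimal sketch of a toy throughput model. Everything in it is an assumption rather than measured data: the relative E core throughput `e_rel`, the SMT gain `smt_gain`, the thread placement order (P cores first, then E cores, then SMT siblings), and the choice of a 32-thread 8P+16E part as the hybrid CPU.

```python
def throughput(threads, p_cores, e_cores, e_rel=0.55, smt_gain=0.30):
    """Estimate aggregate throughput, in units of one P core running one thread.

    Hypothetical placement order: idle P cores first, then E cores, then the
    second SMT thread of each P core. E cores are assumed to have no SMT.
    e_rel and smt_gain are placeholder values, not benchmark results.
    """
    on_p = min(threads, p_cores)                    # one thread per P core
    on_e = min(threads - on_p, e_cores)             # overflow goes to E cores
    on_smt = min(threads - on_p - on_e, p_cores)    # then SMT siblings
    return on_p * 1.0 + on_e * e_rel + on_smt * smt_gain


def gap(threads, e_rel=0.55, smt_gain=0.30):
    """Throughput lead of a 16P CPU over an 8P+16E CPU at a given thread count."""
    homogeneous = throughput(threads, 16, 0, e_rel, smt_gain)
    hybrid = throughput(threads, 8, 16, e_rel, smt_gain)
    return homogeneous - hybrid
```

With these placeholder values, `gap(8)` is 0, the lead of the 16P part is largest at 16 threads, and it shrinks as further threads land on E cores. Whether the two parts actually meet near 32 threads depends entirely on `e_rel` and `smt_gain`.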
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,867
336
126
I think Stefan was talking about 32-128 fat cores with AVX-512 for DT performance, but I'm not speaking for him. Yes, he and I both already have EPYC systems, and Intel won't do, due to efficiency. I have 352 Genoa cores myself.
Well, I think Stefan's original concern was w.r.t. what the perf/W will be for Zen5.

I'm suggesting that if it's for MT workloads, then E cores could have improved that, but it looks like there won’t be any on Zen5 DT.

So then the question is what perf/W improvement we can expect from the regular Zen5 cores. There's not been any leaks or even speculation about that IIRC. Or has someone found any such info, or would care to guesstimate?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,670
14,676
136
Well, I think Stefan's original concern was w.r.t. what the perf/W will be for Zen5.

I'm suggesting that if it's for MT workloads, then E cores could have improved that, but it looks like there won’t be any on Zen5 DT.

So then the question is what perf/W improvement we can expect from the regular Zen5 cores. There's not been any leaks or even speculation about that IIRC. Or has someone found any such info, or would care to guesstimate?
The problems with e-cores are multiple. When you are using ALL cores 100% of the time, you lose a LOT of performance, and AVX-512 is not even allowed, not even on the early Alder Lake chips before the BIOS updates. I could go on. Intel may have a reason for them on desktop; for distributed computing, there is none. Perf/watt then is a useless metric.

Edit: Do you see e-cores in Intel server chips? Not that I know of (I could be wrong). AMD's c-cores have 100% functionality; they just run a little slower in exchange for more density.
 

StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
Intel's e cores are currently only available in hybrid CPUs, that is, in combination with differently performing p cores. Already for this one reason, Intel's e cores are not suitable in compute nodes, as they require homogeneous performance and features across all cores.

Some distributed computing enthusiasts do have Intel p+e CPUs, but even though an e core performs roughly similar to one p HT thread, these CPUs are still awkward to handle in a distributed computing node. Just recently I heard of weird issues with Windows' CPU time accounting on these CPUs. And way before that I saw several reports of performance problems of multithreaded distributed computing applications on these CPUs, which are completely to be expected and can only be worked around by restricting the application to run on cores of same type. [EDIT: I wrote this post before seeing @Markfw's post which already points this out.]
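
For what it's worth, the workaround mentioned above (restricting an application to cores of the same type) could look roughly like this on Linux. This is only a sketch: `restrict_to_cpus` is a made-up helper name, it relies on `os.sched_setaffinity`, which Python offers on Linux but not on Windows, and which logical CPU numbers belong to which core type differs per machine, so the caller has to find that out first.

```python
import os


def restrict_to_cpus(cpu_ids):
    """Pin the calling process to the given logical CPUs (Linux only).

    CPUs which the process is not currently allowed to use are dropped.
    Returns the affinity set that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(0)       # 0 means "the calling process"
    target = set(cpu_ids) & allowed
    if not target:
        raise ValueError("none of the requested CPUs is available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


# Hypothetical usage: if logical CPUs 0..15 were the P core threads on a
# given machine, restrict_to_cpus(range(16)) would keep the process (and
# threads it starts afterwards) off the E cores.
```

Child processes and threads created after the call inherit the restricted set, so it is enough to do this once before the application spawns its workers.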

Sometime soon, pure e core server CPUs should become available from Intel. But these many-small-cores CPUs are not targeted to HPC. They could be useful in some sub-niches of the HPC niche but won't be as flexible as pure p core CPUs.

Somewhat similarly, Zen 4 dense is targeted to cloud hyperscalers and edge computing at this time, and certainly not to HPC. It remains to be seen whether or not Zen 5 dense's targets will extend further, but AFAICT classic HPC will still be served by Zen 5 non-dense.

On the topic of "efficiency": Let's not forget what Intel's e cores are primarily efficient at: In die area. Would have been nice if Intel had called them "ae cores", for "area efficient cores". In #10,842 I chose the terms area optimized, performance density optimized, and performance efficiency optimized. They are correlated to a degree but not the same. For FP performance density and efficiency, you don't go for cores with cut-down FP execution units. For cache-churning applications, performance and energy efficiency suffer if you go for CPUs with cut down caches…

EDIT 2: For energy efficiency at the application level, energy efficiency of the host (and rack…) is what counts. Due to base consumption by memory, cooling, and so on, very low power CPU cores are typically not putting you into the energy efficiency sweet spot of compute nodes, even in the lucky case that your application is able to scale to a large number of cores with negligible overhead.
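
A quick back-of-the-envelope sketch of the base consumption point. The model is simply host perf/W = (cores × perf per core) / (base power + cores × power per core). All figures in the usage note are invented placeholders for illustration, not measurements of any real CPU or host.

```python
def host_perf_per_watt(cores, perf_per_core, watts_per_core, base_watts):
    """Perf/W of a whole host, including its base consumption.

    base_watts stands for everything that draws power regardless of the
    core operating point (memory, fans, PSU losses and so on).
    Units of perf_per_core are arbitrary; only ratios matter.
    """
    total_perf = cores * perf_per_core
    total_watts = base_watts + cores * watts_per_core
    return total_perf / total_watts
```

Take two made-up operating points on a 64-core host with 200 W base consumption: a very low power point (1.0 perf at 1 W per core) and a moderate one (1.8 perf at 3 W per core). Looking at the cores alone, the low power point is far more efficient (1.0 vs. 0.6 perf/W). At host level, `host_perf_per_watt(64, 1.0, 1.0, 200.0)` comes out lower than `host_perf_per_watt(64, 1.8, 3.0, 200.0)`, because the base consumption is spread over less work.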
 