Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Exist50 · Jul 26, 2023

DrMrLordX said:
Only it isn't just the game being statically bound to the wrong CCD, it can also be the OS allowing threads to wander between CCDs.

That's just a scheduling problem, and a more difficult one for the 7950X3D than for a theoretical hybrid chip at that. Having two CCDs which are both best at ST, just under different workload conditions, is much harder to deal with than knowing that one will always be faster. We see that in practice games never just get pinned to the E-cores on Intel hybrid chips, so I think Windows scheduling is already where it needs to be in that regard.

DrMrLordX said:
If even a few threads wander to a high-density, low-clock CCD on a Zen5 desktop chip then performance could crater.

As mentioned, shouldn't be any worse than what we see today. If the OS allows a thread to wander, it would presumably be a perf-insensitive one. The hit from the other CCD seems to be more about the cross-CCD latency than clock speed. Might get one or two games that need patching, but I doubt it'd be anything significant.

DrMrLordX said:
Traditionally they have been in the x86 market.

Yes, the top SKU usually puts out the top gaming numbers, but the difference is almost entirely clock speed, which can be normalized by overclocking, or AMD just making different SKU decisions. Either way, the gap is negligible vs a single CCD 8c chip. If you have a 4090 and money to burn, sure, go for it, but for your average gamer, even enthusiast, grab the 8c X3D and you'll never have to worry about it. Though I am assuming that AMD fixes the v-cache frequency penalty sooner or later.

DrMrLordX said:
If AMD joins Intel in the box of "8c is enough" and instead encourages the proliferation of many slow cores in one form or another, then it's going to discourage a certain amount of innovation in that regard.

Is that not the current reality anyway? The 8c CCD boundary is a very similar problem to Intel's 8 P-core limitation. Long term, I'm sure game devs will find ways to use all that compute sitting idle. And when you think about it, density vs peak performance might be an appealing tradeoff for consoles...

Markfw · Jul 26, 2023

Exist50 said:
There is nothing Intel-specific about what I wrote. I've shown you benchmarks before, and will not waste the time to repeat myself. Productivity apps make good use of hybrid, and there's no reason to believe that would not apply to an AMD hybrid offering.

But clearly you just wanted an excuse to go on your typical rant. I'd be shocked if you could go a single comment without mentioning your obsessive hatred of Intel.

Since AMD has no hybrid chips, you must be talking about Intel. And with Bergamo, they have made it clear that they will NOT be going with a hybrid approach.

As far as my "hatred of Intel", that's not true. I just would never buy or even use their chips vs AMD (currently) as they are slower and use more power. Those are facts, not opinion.

Geddagod · Jul 26, 2023

Markfw said:
And with Bergamo, they have made it clear that they will NOT be going with a hybrid approach.

They have not

Markfw said:
As far as my "hatred of Intel", that's not true.

lol

eek2121 · Jul 26, 2023

Geddagod said:
They have not

AMD did, in fact, say that Bergamo was server only. They also did say they would not be taking the same approach as Intel. Neither of those statements rules out a hybrid approach, but the few serious rumors I have seen thus far indicate AMD is sticking with 8-16 cores on the desktop. They are even reusing the naming scheme.

Best guess for Zen 5 9000 series (desktop):

Healthy IPC increase, same clocks or slight clock regression, decent performance uplift of around 15-25% at similar power limits. Improved GPU. AI Instructions. Better power management. New features.

Motherboard chipsets and BIOS maturity much improved.

Mobile? Potentially a different story. Mobile could very well be a monolithic hybrid design. 4-8 high frequency cores, 8-16 lower clocking small cores.

This is also the only scenario that doesn’t drastically increase costs for AMD, and mobile could use more cores for both efficiency and performance reasons.

On the desktop you have a 170W power limit, so 16 big cores works great. You could never do that effectively with a 15-30W power limit, however, so on mobile we could see hybrid. The leaks also line up best with this scenario.

Oh and a bold prediction: Arrow Lake will be a huge step up for Intel in terms of perf/watt and will be an okay perf uplift, but Zen 5 will be faster thanks to a healthy IPC improvement.

Regarding your comments about @Markfw: uncalled for, IMO. Many of us are pretty neutral about who makes our hardware, as long as it performs great. Intel is significantly trailing AMD in perf/watt and so many of us won’t use them because of that. If they flipped tomorrow and started winning that race again, or they released a chip that significantly outperformed AMD, many of us would probably buy it. Same reason I bought a 4090: NVIDIA has a better product at the high end. AMD needs to step up the GPU game if they want to win me over.

Finally, sometimes folks here are super pessimistic about Intel’s ability to execute because they have overpromised and underdelivered so many times. There is nothing wrong with that. Results are proven with actions, not words.

Geddagod · Jul 26, 2023

eek2121 said:
AMD did, in fact, say that Bergamo was server only.

No one here is contesting that?

eek2121 said:
They also did say they would not be taking the same approach as Intel.

Yes.

eek2121 said:
Neither of those statements rules out a hybrid approach,

So what @Markfw said was false, and was exactly what I disagreed with. Nothing about Bergamo has shown that AMD is not going with a hybrid approach. Does Intel's Sierra Forest prove that Intel isn't going with a hybrid approach with Arrow Lake, then?
AMD has not made it clear they ruled out a hybrid approach.
A lot of news sites are clickbaiting (cough wccftech) readers with this in their titles, but let's look at the exact quote from AMD:
"One is the notion that P-Cores and E-Cores that the competition uses is not the approach that we plan on taking at all"
Well, that makes it cut and dried, right? Look at the sentence right after it:
"Because I think the reality is that when you get to the point of having core types with different ISA capabilities or IPC or things like that, it makes it very complicated to ensure that the right workloads are scheduled on the right cores, consistently"
Different ISA and different IPC are both problems AMD avoids with Zen 4C. So it's obvious that when AMD says they won't use "P+E-cores like the competition", they aren't ruling out using P and E cores together in a product; they are talking about the development and design of the P and E cores themselves.
AMD mentions that using P+E cores in desktop is a harder argument to make, but that there are more obvious places where we will see a hybrid approach more quickly: laptops.

eek2121 said:
Mobile? Potentially a different story. Mobile could very well be a monolithic hybrid design. 4-8 high frequency cores, 8-16 lower clocking small cores.

Why wait for Zen 5? Apparently Zen 4 mobile will have Zen 4C cores as well.

eek2121 said:
Arrow Lake will be a huge step up for Intel in terms of perf/watt and will be an okay perf uplift, but Zen 5 will be faster thanks to a healthy IPC improvement.

That Igor leak for ARL makes it look like Zen 5 will be a good 10-20% faster in ST and MT.

eek2121 said:
Regarding your comments about @Markfw , uncalled for.

All I said was "lol"
But I suggest you look through Markfw's comment history on Intel threads or about Intel.

eek2121 said:
Finally, sometimes folks here are super pessimistic about Intel’s ability to execute because they have overpromised and underdelivered so many times. There is nothing wrong with that. Results are proven with actions, not words.

There's a difference between pessimism and outright fanboying, and I suspect many of us here on AnandTech have noticed Markfw cross that line numerous times, in a repetitive fashion that has become recognizable as a pattern.
But, out of respect for both Markfw and AnandTech's rules (as I very recently got the ban hammer for a day for cussing lol), I won't talk about his antics beyond this response for a bit.

Abwx · Jul 26, 2023

Geddagod said:
There's a difference between pessimism and outright fanboying, and I suspect many of us here on AnandTech have noticed Markfw cross that line numerous times, in a repetitive fashion that has become recognizable as a pattern.
But, out of respect for both Markfw and AnandTech's rules (as I very recently got the ban hammer for a day for cussing lol), I won't talk about his antics beyond this response for a bit.

You should look at a longer time frame: there was a time when he used Intel CPUs exclusively for years, and not a single AMD one IIRC...

moinmoin · Jul 26, 2023

DrMrLordX said:
That's a decision made by AMD when establishing clockspeed and power targets for each CCD under specific utilization scenarios.

How did you manage to bring Intel's outdated turbo tables into a Zen thread? The only frequency hardcoded in today's Zen chips should be the upper limit, everything else depends on cooling headroom.

DrMrLordX · Jul 26, 2023

Exist50 said:
Is that not the current reality anyway? The 8c CCD boundary is a very similar problem to Intel's 8 P-core limitation. Long term, I'm sure game devs will find ways to use all that compute sitting idle. And when you think about it, density vs peak performance might be an appealing tradeoff for consoles...

Intel has introduced this problem by going with an 8P core limit, instead of chasing higher counts of P cores the way AMD did up through the 7950X. If AMD takes the same path then there will be no real alternative. For the foreseeable future, we'll be stuck on 8 main cores and then low clock/low performance core spam. AMD has already leaned into that approach with the 7950X3D. Now whether you think game+scheduler will have a harder time balancing threads on a 7950X3D than on a hypothetical Zen5 with dense cores is purely speculation, but I do think having a dense CCD with low clocks could be just as problematic, since it will have the same ISA etc. and will be indistinguishable from the main CCD except when it comes to clocks. So the scheduler won't necessarily know what's going on until it teases a few threads onto CCD1. Could lead to the infamous ping-ponging of threads moving between CCDs.

Hopefully Zen5 will do more to address interconnect speeds/latencies, making cache locality less of an issue for inter-CCD workloads (such as future games needing/wanting more than 8c).

moinmoin said:
How did you manage to bring Intel's outdated turbo tables into a Zen thread?

I didn't. See below.

moinmoin said:
The only frequency hardcoded in today's Zen chips should be the upper limit, everything else depends on cooling headroom.

Based on what we know from messing with PBO + Curve Optimizer, AMD has the capability to set whatever power/clockspeed target they want on a per-core or per-CCD basis. There's a lot of room for fine-grain control here. If AMD wanted to keep CCD0's clockspeed target high(er) when utilizing CCD1 then they could probably do so, at the cost of clocks on CCD1. Out-of-the-box they don't do so, and they still haven't offered a "gamer" mode where they attempt such behavior (instead, they let you just disable the second CCD). Again there's no free lunch, but they certainly have the controls available to shift power budget towards CCD0. Of course taking that behavior too far could lead to the same things I'm speculating could happen on future Zen5 products with a dense CCD: CCD1's clocks could become low enough that moving threads there would tank overall performance. In which case AMD might be better off raising power budget a bit for gaming workloads, but only for CCD0. As it stands, most 2 CCD Zen CPUs do not reach their full power budget when running games.

moinmoin · Jul 26, 2023

DrMrLordX said:
Based on what we know from messing with PBO + Curve Optimizer, AMD has the capability to set whatever power/clockspeed target they want on a per-core or per-CCD basis. There's a lot of room for fine-grain control here. If AMD wanted to keep CCD0's clockspeed target high(er) when utilizing CCD1 then they could probably do so, at the cost of clocks on CCD1. Out-of-the-box they don't do so, and they still haven't offered a "gamer" mode where they attempt such behavior (instead, they let you just disable the second CCD). Again there's no free lunch, but they certainly have the controls available to shift power budget towards CCD0. Of course taking that behavior too far could lead to the same things I'm speculating could happen on future Zen5 products with a dense CCD: CCD1's clocks could become low enough that moving threads there would tank overall performance. In which case AMD might be better off raising power budget a bit for gaming workloads, but only for CCD0. As it stands, most 2 CCD Zen CPUs do not reach their full power budget when running games.

I think we are talking about different things. The part you are talking about is essentially binning; of course different cores and different CCDs have different qualities. Of course they are set accordingly: you don't want processes to end up on weak cores or the weak CCD if better cores/CCD are available.

What I was talking about is different voltage/frequency curves. Mobile cores, and likely also the dense Zen cores, have a lower voltage/frequency curve that ends in a wall at a rather low frequency: going any higher kills all efficiency, but below that essentially all frequencies are more efficient (Intel's E-cores showcase the same behaviour). Normal Zen cores have a higher voltage/frequency curve with a gentler slope at the upper end, making higher frequencies somewhat more feasible. On desktop those higher frequencies are mostly used. But if all cores of a strong CCD try to max out their frequency, power consumption (= heat) can hit the upper limit instead, throttling the frequency of all cores again. Depending on how both curves turn out, it is possible that at a given TDP a non-dense CCD with all 8 cores loaded runs at a lower frequency than 8 cores of a dense CCD. That's all.
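
A toy model can make the curve argument concrete. The sketch below assumes per-core power of roughly P = k * V^2 * f with a linear voltage/frequency curve; every constant in it (offsets, slope, k, the 3.5 and 5.5 GHz walls) is an invented illustration value, not data for any real Zen core.

```python
# Toy model of the voltage/frequency argument. Every constant is a made-up
# illustration value, not a measurement of any Zen core.

def core_power_w(freq_ghz, v_offset, v_slope=0.1, k=2.0):
    """Power of one core, P = k * V^2 * f, with a linear V(f) curve."""
    voltage = v_offset + v_slope * freq_ghz
    return k * voltage ** 2 * freq_ghz


def max_all_core_freq(budget_w, cores, v_offset, f_wall):
    """Highest frequency (GHz) all cores can hold within budget_w, capped at f_wall."""
    per_core_w = budget_w / cores
    if core_power_w(f_wall, v_offset) <= per_core_w:
        return f_wall  # the frequency wall, not power, is the limit
    lo, hi = 0.0, f_wall
    for _ in range(60):  # bisection works because power rises monotonically with f
        mid = (lo + hi) / 2
        if core_power_w(mid, v_offset) <= per_core_w:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical cores: "classic" needs more voltage but clocks up to 5.5 GHz,
# "dense" needs less voltage but hits its wall at 3.5 GHz.
CLASSIC = {"v_offset": 0.7, "f_wall": 5.5}
DENSE = {"v_offset": 0.6, "f_wall": 3.5}

for budget_w in (40, 140):
    classic = max_all_core_freq(budget_w, 8, **CLASSIC)
    dense = max_all_core_freq(budget_w, 8, **DENSE)
    print(f"{budget_w:>3} W: classic {classic:.2f} GHz, dense {dense:.2f} GHz")
```

With these invented constants the dense CCD holds the higher all-core clock at the 40 W budget, while at 140 W the classic CCD reaches its 5.5 GHz ceiling and the dense one is stuck at its wall. Whether real silicon behaves this way depends entirely on the actual curves.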

dacostafilipe · Jul 26, 2023

Geddagod said:
Nothing about Bergamo has shown that AMD is not going with a hybrid approach

Let's say they put Zen4 and Zen4c in an APU. Would it really be hybrid? Isn't "the hybrid" we talk about more of something like ARM big.LITTLE? If the cores are 100% similar in terms of functionality but only differ in terms of performance (frequency, cache, ...) could we really call it hybrid? We already have CPUs where single cores boost/perform better than others ... where does hybrid start?

Exist50 · Jul 26, 2023

dacostafilipe said:
Let's say they put Zen4 and Zen4c in an APU. Would it really be hybrid?

AMD themselves call it hybrid, which I think is sufficient by itself. The scheduler doesn't care about the underlying RTL or physical design. Different means to the same end.

dr1337 · Jul 26, 2023

Exist50 said:
We see that in practice games never just get pinned to the E-cores on Intel hybrid chips, so I think Windows scheduling is already where it needs to be in that regard.

There are many benchmarks for games that run a lot better with E-cores off, WoW being a very prominent example. Frankly I've seen nothing good to be said by either vendor about their relationship with Windows thread scheduling. I am not sure why you keep posting this rhetoric that there is some AMD scheduling problem when it literally crops up in Intel chips all the time too.

Here's the deal: even with the few games that hate the 7950X3D, the V-cache 16c chip is still faster on average in games than the base 7950X. Just as Intel has better performance with E-cores 95% of the time, so does AMD with the asymmetric CCDs. I think there legitimately would be a customer base for a 7990X3D/8990X3D with 8+16 with V-cache on the 8c die only, even if a handful of games need Process Lasso for maximum performance.

The people considering a 24c/32t 14900K/13900K are most likely looking for raw thread count first and foremost, with strong gaming performance second. AMD could sell 48 threads in a halo gaming SKU, and I have zero doubt it would chart better than a normal 8+8 chip even in games, all the while being an MT monster.
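
For readers unfamiliar with what a tool like Process Lasso does here: it restricts a game to the logical CPUs of one CCD. A minimal sketch of the same idea on Linux follows, using only the standard library. The helper names are hypothetical, and the CPU numbering (cores numbered CCD by CCD, SMT siblings as a second block) is an assumption that has to be checked against the real topology of the machine.

```python
import os


def ccd_logical_cpus(ccd_index, cores_per_ccd=8, total_cores=16, smt=True):
    """Logical CPU IDs for one CCD.

    ASSUMPTION: cores are numbered CCD by CCD, and SMT siblings follow as a
    second block (core i pairs with i + total_cores). Real numbering varies
    by OS and firmware, so verify it before relying on this.
    """
    first = ccd_index * cores_per_ccd
    cpus = set(range(first, first + cores_per_ccd))
    if smt:
        cpus |= {cpu + total_cores for cpu in cpus}
    return cpus


def pin_to_ccd(pid, ccd_index):
    """Restrict a process to one CCD. Returns False if nothing was changed."""
    if not hasattr(os, "sched_setaffinity"):  # the call only exists on Linux
        return False
    # Keep only CPUs the process is currently allowed to run on.
    target = ccd_logical_cpus(ccd_index) & os.sched_getaffinity(pid)
    if not target:
        return False
    os.sched_setaffinity(pid, target)
    return True
```

Calling `pin_to_ccd(0, 0)` would pin the calling process itself (pid 0) to the first CCD under the assumed numbering. Pinning keeps threads from wandering across CCDs, but it also takes the decision away from the scheduler, which is the tradeoff being argued about in this thread.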

Exist50 · Jul 26, 2023

dr1337 said:
There are many benchmarks for games that run a lot better with E-cores off

That is certainly not what Raptor Lake benchmarks show.

dr1337 said:
I am not sure why you keep posting this rhetoric that there is some AMD scheduling problem when it literally crops up in Intel chips all the time too.

Huh? I was explaining why hybrid would not be an issue for gaming, if AMD decides to go that direction. I think 8+16 would be the natural evolution for AMD's halo product.

eek2121 · Jul 26, 2023

moinmoin said:
How did you manage to bring Intel's outdated turbo tables into a Zen thread? The only frequency hardcoded in today's Zen chips should be the upper limit, everything else depends on cooling headroom.

AMD does indeed put limits into their silicon (and AGESA). This was discussed elsewhere and I believe AMD also mentioned this during a video once. FMAX is only one of those limits.

Those limits are disabled when overclocking, of course.

moinmoin · Jul 26, 2023

dacostafilipe said:
where does hybrid start?

I think it's mostly a PR term at this point, filled with different meanings by different manufacturers. AMD is being pushed to have a "hybrid" solution of its own after Intel's push for "marketing cores" is widely seen as successful at ensuring Intel's competitiveness in desktop. Intel's E-cores are mainly good for being area efficient, and so are AMD's dense cores.

But I'm getting the impression AMD likes to call the inclusion of the AI Engine hybrid as well.

Exist50 · Jul 26, 2023

moinmoin said:
But I'm getting the impression AMD likes to call the inclusion of the AI Engine hybrid as well.

That's more under the umbrella of what they used to call "Fusion". I think if AMD talks about hybrid in marketing, they will either use the term for the mix of CPU cores, or just invent a new term altogether. Well, that would at least be the sane way to brand things. We'll see what their marketing department thinks...

moinmoin · Jul 26, 2023

Exist50 said:
That's more under the umbrella of what they used to call "Fusion". I think if AMD talks about hybrid in marketing, they will either use the term for the mix of CPU cores, or just invent a new term altogether. Well, that would at least be the sane way to brand things. We'll see what their marketing department thinks...

No, Fusion is from a decade ago. I was thinking of Papermaster's interview from May:

"Paul Alcorn: So, it's probably safe to say that a hybrid architecture will be coming to client [consumer PCs]?

Mark Papermaster: Absolutely. It's already there today, and you'll see more coming."

AMD to Make Hybrid CPUs, Also Using AI for Chip Design: CTO Papermaster at ITF World (www.tomshardware.com)

There is no big.LITTLE-like product from AMD on the market right now. (Big) Phoenix with its AI Engine, however, is already available, while Little Phoenix with its supposed mix of cores is MIA (and people like Ian Cutress talk like it doesn't exist/isn't mixing cores). Or what do you think Papermaster is referring to?

jpiniero · Jul 26, 2023

moinmoin said:
There is no big.LITTLE-like product from AMD on the market right now. (Big) Phoenix with its AI Engine, however, is already available, while Little Phoenix with its supposed mix of cores is MIA (and people like Ian Cutress talk like it doesn't exist/isn't mixing cores). Or what do you think Papermaster is referring to?

Little Phoenix is probably what he's talking about.

Exist50 · Jul 26, 2023

moinmoin said:
No, Fusion is from a decade ago. I was thinking of Papermaster's interview from May:

"Paul Alcorn: So, it's probably safe to say that a hybrid architecture will be coming to client [consumer PCs]?

Mark Papermaster: Absolutely. It's already there today, and you'll see more coming."

There is no big.LITTLE-like product from AMD on the market right now. (Big) Phoenix with its AI Engine, however, is already available, while Little Phoenix with its supposed mix of cores is MIA (and people like Ian Cutress talk like it doesn't exist/isn't mixing cores). Or what do you think Papermaster is referring to?

I think it's useful to include the context:

But what you'll also see is more variations of the cores themselves, you'll see high-performance cores mixed with power-efficient cores mixed with acceleration. So where, Paul, we're moving to now is not just variations in core density, but variations in the type of core, and how you configure the cores. It's not only how you've optimized for either performance or energy efficiency, but stacked cache for applications that can take advantage of it, and accelerators that you put around it.

When you go to the data center, you're also going to see a variation. Certain workloads move more slowly [...] You might be in that sweet spot of 16 to 32 cores on a server. But many businesses are indeed adding point AI applications and analytics. As AI moves from not only being in the cloud, where the heavy training and large language model inferencing will continue, but you're going to see AI applications in the edge. And it's going to be in enterprise data centers as well. They're also going to need different core counts and accelerators.
Paul Alcorn: So, it's probably safe to say that a hybrid architecture will be coming to client [consumer PCs]?

Mark Papermaster: Absolutely. It's already there today, and you'll see more coming.

So it's tough to say what he means in particular, but he is lumping in V-cache SKUs under the "hybrid" umbrella, so maybe that's the intended reference?

All just semantics at the end of the day, but I hope they don't overload existing terms too much, at least in anything public-facing.

dr1337 · Jul 26, 2023

Exist50 said:
That is certainly not what Raptor Lake benchmarks show.

9 games having a significant preference for E cores disabled certainly isn't nothing. And with at least 25 games being basically equal, within a realistic margin of error, it would seem E cores are seriously under-utilized in gaming. Makes me wonder now how this chart would look if both configurations were at maximum OC. Surely turning off 16 extra cores would net more headroom for the P cores to clock a bit higher.

Exist50 · Jul 26, 2023

dr1337 said:
9 games having a significant preference for E cores disabled certainly isn't nothing

But the average is 1%, and at 1080p with a 4090. Higher resolution and ray tracing would all be likely to flip that gap. Either way, I don't think that's a meaningful difference. Plus, most of the penalty is likely to be the still extant (albeit smaller) hit to ring clock speeds, which AMD would not have a problem with in a two CCD hybrid approach.

dr1337 said:
it would seem E cores are seriously under-utilized in gaming

Well yeah, games in general make terrible use of modern high end chips. That's what I was saying earlier about the second CCD for the 7950X being useless for gaming. If you have the game pinned to one CCD, as is typical right now, whatever you do to the other is unlikely to matter. That's why I think it's much more sensible to talk about the productivity benefit of a hybrid setup.

dr1337 said:
Surely turning off 16 extra cores would net more headroom for the P cores to clock a bit higher.

For both Intel and AMD, gaming workloads simply do not push the processors' power limits. For example, ComputerBase measured 141W for the 13900k, 105W for the 7950X, and 72W for the 7950X3D. All well below their respective boost power limits. Thermal density (cooling) and silicon limitations are more likely to limit gaming performance than power.

Also, that's a problem better suited for the scheduler to solve. If some threads can run more efficiently on dense cores, that will better free up headroom than disabling them.
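
To put the ComputerBase gaming figures quoted above in proportion, the short calculation below divides each by the chip's stock boost power limit. The limit values are an assumption added for illustration (PL2 for Intel, PPT for AMD) and are not from the post; check them against the vendor specifications.

```python
# Average gaming power quoted in the post (attributed to ComputerBase), in watts.
measured_w = {"13900K": 141, "7950X": 105, "7950X3D": 72}

# ASSUMPTION: stock boost power limits in watts (PL2 for Intel, PPT for AMD).
# These are not from the post; verify them against the vendor specs.
limit_w = {"13900K": 253, "7950X": 230, "7950X3D": 162}


def limit_share(cpu):
    """Fraction of the assumed boost power limit used while gaming."""
    return measured_w[cpu] / limit_w[cpu]


for cpu in measured_w:
    print(f"{cpu}: {measured_w[cpu]} W of {limit_w[cpu]} W = {limit_share(cpu):.0%}")
```

Under those assumed limits all three chips sit at roughly half of their boost budget while gaming, which is consistent with the point that power is not the first constraint here.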

eek2121 · Jul 27, 2023

Here is your answer for hybrid:

AMD confirms Ryzen 3 7440U features 6-core Phoenix2 APU with Zen4 hybrid design (videocardz.com)

The smaller AMD Phoenix APU features hybrid architecture. Ryzen 7540U and 7440U have a combination of Zen4 and Zen4c cores. In a recent exchange with XDA-Developers, AMD has officially acknowledged that the AMD Phoenix APU used in the Ryzen 3 7440U...

If this is true, looks like it is launching on the mid-low end.

AMD managed to get the die size down to 137 mm². I am curious as to how they are doing this. Multi-die or mixing HP/HD libs on one die? Hopefully we get details soon.

Side Note: Would love to see them add (“small”) cores to the IO die for Zen 5. That would be one way to add cores without having to make multiple types of chiplets.

biostud · Jul 27, 2023

Can the "c" and regular cores be made on a monolithic silicon or do they require two different pieces of silicon?

Geddagod · Jul 27, 2023

biostud said:
Can the "c" and regular cores be made on a monolithic silicon or do they require two different pieces of silicon?

Zen 4C uses the same node as Zen 4 IIRC so it shouldn't be a problem
