Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Kepler_L2 · May 14, 2024

SteinFG said:
With a quad-core LP cluster on the SOC die, Strix Halo will have 20 cores, right? I assume Zen 6 desktop will also get an LP cluster (because it'll share the SOC tile with the Fire Range successor, most likely), thus slightly bumping core count too.

Not exactly, the LP island is supposed to be invisible and transition seamlessly depending on the load.

Mahboi · May 14, 2024

SteinFG said:
With a quad-core LP cluster on the SOC die, Strix Halo will have 20 cores, right? I assume Zen 6 desktop will also get an LP cluster (because it'll share the SOC tile with the Fire Range successor, most likely), thus slightly bumping core count too.

My bet was Strix Point (4 Z5, 8 Z5c) + LP island of 4 Z5 LP cores on the SoC.
Why do you think it'll have 20 cores?

Kepler_L2 · May 14, 2024

Mahboi said:
My bet was Strix Point (4 Z5, 8 Z5c) + LP island of 4 Z5 LP cores on the SoC.
Why do you think it'll have 20 cores?

Halo doesn't have any Zen5c cores.

Mahboi · May 14, 2024

Kepler_L2 said:
Halo doesn't have any Zen5c cores.

That's a weird choice for a mobile product...
They're really going for the Phat thing eh.

biostud · May 14, 2024

adroc_thurston said:
MBP buyers? they sort of know what they're doing, those things start at 2 kilobuck.

Sure, some pros do, but there are still people who buy the most expensive because it must be better and then use their MBP purely for office work.

I don't have a problem with this as people can buy their computer for whatever reason, I'm just questioning how small a fraction of buyers knows the differences between an AMD, Intel, or Apple based computer.

I might be biased because I work with teenagers in a school environment though.

Joe NYC · May 14, 2024

SteinFG said:
With a quad-core LP cluster on the SOC die, Strix Halo will have 20 cores, right? I assume Zen 6 desktop will also get an LP cluster (because it'll share the SOC tile with the Fire Range successor, most likely), thus slightly bumping core count too.

The LP cores will be on Strix Halo SoC die, which will likely be N3E.

Whether AMD migrates the LP cores to the N6 node remains to be seen (unlikely), but it would be useful.

ryanjagtap · May 14, 2024

There is also TSMC's N4C node as an option. It uses the same design infrastructure as the N4P node, but with fewer masking layers and some rearchitecting in the SRAM and standard cells. Most probably a cheaper N4P with some minor changes.

https://www.anandtech.com/show/21371/tsmc-preps-lower-cost-4nm-n4c-process-for-2025

ToTTenTranz · May 14, 2024

Joe NYC said:
Strix Scenario
- 128 bit (memory width) N6 base die with memory controllers, MALL, IO, analog
- ~20 CU GPU stacked on top ~80 mm2

I think it's borderline criminal that Strix Point doesn't have any LLC for the GPU. That APU will be the king of 540p rasterization I guess.

It would be interesting to know how bandwidth-intensive all the temporal upscalers and frame generators are, though. I don't know if anyone ever did such a study.

Kepler_L2 said:
Halo doesn't have any Zen5c cores.

Why though? AMD isn't making CCDs with Zen5c?
Doesn't this mean a chop on power efficiency for lighter loads, or do LP cores actually take over the regular ones when lighter loads are detected?

I also wonder if Halo's Zen5 CCDs can take Vcache on top.

Joe NYC said:
The LP cores will be on Strix Halo SoC die, which will likely be N3E.

Whether AMD migrates the LP cores to the N6 node remains to be seen (unlikely), but it would be useful.

It's always useful to put cores into higher-end nodes (good scaling for power and area).
AFAIK it's cache, ROPs and anything analog that can be put into older nodes without losing too much.

Kepler_L2 · May 14, 2024

ToTTenTranz said:
I think it's borderline criminal that Strix Point doesn't have any LLC for the GPU.

Thanks Microsoft

ToTTenTranz said:
Why though? AMD isn't making CCDs with Zen5c?
Doesn't this mean a chop on power efficiency for lighter loads, or do LP cores actually take over the regular ones when lighter loads are detected?

Lighter loads -> work is transferred to LP cores. Heavier loads -> work is transferred to real cores. Supposed to happen automatically and transparently to the OS, but we'll see how it works out.

ToTTenTranz said:
I also wonder if Halo's Zen5 CCDs can take Vcache on top.

Doubt it, you're never going to be bottlenecked by Zen5 with just 40 CUs of RDNA3.5

Timorous · May 14, 2024

Kepler_L2 said:
you're never going to be bottlenecked by Zen5 with just 40 CUs of RDNA3.5

Stellaris and HOI4 say 'give me moar tics'

Mahboi · May 14, 2024

Kepler_L2 said:
Lighter loads -> work is transferred to LP cores. Heavier loads -> work is transferred to real cores. Supposed to happen automatically and transparently to the OS, but we'll see how it works out.

That bit is obvious, but the compressed cores are supposed to be slightly more efficient in a constrained power situation, and also they're just less area.
I honestly find it a bit jarring that they decided to go all or nothing. I doubt that almost any workload will truly benefit from 12 Z5 cores vs 4 Z5 cores for "pure perf" and 8 Z5c for "slightly below pure perf". It's just a really strange design decision.

ToTTenTranz · May 14, 2024

Kepler_L2 said:
Doubt it, you're never going to be bottlenecked by Zen5 with just 40 CUs of RDNA3.5

Have you ever heard the tale of Darth Unoptimize the Enlarger of Frametimes?

BorisTheBlade82 · May 14, 2024

Mahboi said:
That bit is obvious, but the compressed cores are supposed to be slightly more efficient in a constrained power situation, and also they're just less area.
I honestly find it a bit jarring that they decided to go all or nothing. I doubt that almost any workload will truly benefit from 12 Z5 cores vs 4 Z5 cores for "pure perf" and 8 Z5c for "slightly below pure perf". It's just a really strange design decision.

They simply will not have designed an 8c Zen5c CCD just for such a niche product.

BorisTheBlade82 · May 14, 2024

Kepler_L2 said:
Thanks Microsoft

Lighter loads -> work is transferred to LP cores. Heavier loads -> work is transferred to real cores. Supposed to happen automatically and transparently to the OS, but we'll see how it works out.

Will be highly interesting to see how this works out and what strategy for switching the threads over they might employ.
Will they switch on a thread level, on a core level (thread pairs), what thresholds, etc. ...
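
To make the threshold question concrete, here is a minimal sketch of one possible strategy: hysteresis with separate up and down thresholds, so a thread does not bounce between clusters when load hovers near a single cutoff. Everything in it is an assumption for illustration only: the function names, the 0.60/0.30 thresholds, and the idea of a normalized per-thread load value are hypothetical, and nothing here reflects how AMD's hardware or firmware actually decides.

```python
# Hypothetical hysteresis policy for moving work between an LP cluster
# and the "real" cores. Thresholds and names are illustrative assumptions.

def pick_cluster(current, load, up=0.60, down=0.30):
    """Return the cluster ('LP' or 'BIG') to use for the next interval.

    current: cluster the work is running on now
    load:    normalized utilization in [0, 1] (assumed metric)
    up:      load above which work leaves the LP cluster
    down:    load below which work returns to the LP cluster
    """
    if current == "LP" and load > up:
        return "BIG"
    if current == "BIG" and load < down:
        return "LP"
    # Between the two thresholds, stay put to avoid ping-ponging.
    return current


def simulate(loads, start="LP"):
    """Apply the policy to a sequence of load samples and return the trace."""
    cluster = start
    trace = []
    for load in loads:
        cluster = pick_cluster(cluster, load)
        trace.append(cluster)
    return trace


if __name__ == "__main__":
    print(simulate([0.1, 0.5, 0.7, 0.5, 0.4, 0.2]))
```

The gap between the two thresholds is the design choice worth watching: a wide gap keeps the big cores awake longer after a burst, while a narrow one saves power but risks frequent migrations.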

maddie · May 14, 2024

Mahboi said:
That bit is obvious, but the compressed cores are supposed to be slightly more efficient in a constrained power situation, and also they're just less area.
I honestly find it a bit jarring that they decided to go all or nothing. I doubt that almost any workload will truly benefit from 12 Z5 cores vs 4 Z5 cores for "pure perf" and 8 Z5c for "slightly below pure perf". It's just a really strange design decision.

Halo uses chiplets. How would you partition the APU functions using the existing chiplets as much as possible?

jpiniero · May 14, 2024

Kepler_L2 said:
Thanks Microsoft

You'll laugh but I think if Strix Halo ends up being popular with OEMs, it'll be because of AI.

Jan Olšan · May 14, 2024

Makaveli said:
They have also used comments from this thread to source articles.

I did that too... Sometimes there is a news drought, sometimes the info is too attractive. IIRC, there were cases when leaked docs were directly pasted here. Obviously you have to be super prudent in trying to judge what may be a legit informed person and what's a random forum dude making stuff up. No offense...

Kepler_L2 said:
Thanks Microsoft

Lighter loads -> work is transferred to LP cores. Heavier loads -> work is transferred to real cores. Supposed to happen automatically and transparently to the OS, but we'll see how it works out.

Doubt it, you're never going to be bottlenecked by Zen5 with just 40 CUs of RDNA3.5

Wait wait wait, where did the talk of LP island/LP cores in Strix (Halo) IOD come from?

Was there some leak saying so that I missed, or is this just some wild seronxposting? This feature was #1 on my wishlist (that or some advanced packaging chiplet tech that would not have power overhead)...

BorisTheBlade82 · May 14, 2024

Jan Olšan said:
I did that too... Sometimes there is a news drought, sometimes the info is too attractive. IIRC, there were cases when leaked docs were directly pasted here. Obviously you have to be super prudent in trying to judge what may be a legit informed person and what's a random forum dude making stuff up. No offense...

Wait wait wait, where did the talk of LP island/LP cores in Strix (Halo) IOD come from?

Was there some leak saying so that I missed, or is this just some wild seronxposting? This feature was #1 on my wishlist (that or some advanced packaging chiplet tech that would not have power overhead)...

Advanced packaging with less overhead is pretty much a given for Halo right now.
LP island was something that a forum member guessed and @uzzi38 / @Kepler_L2 / @adroc_thurston confirmed accidentally in the last couple of pages. But of course it is up to you how much you want to make of it.

Mopetar · May 14, 2024

Joe NYC said:
Strix Halo (and Mac) pose a different question: "Do you really need a costly and power inefficient dGPU if iGPU can do the same job for less money and less power consumption".

The M3 Max has a lot of the die devoted to the GPU, but for encoding video/audio it'll use dedicated hardware blocks, assuming it's a codec Apple supports. That's what makes it so fast. There are certainly users who need all of those GPU cores, but for a lot of Apple users they're seldom going to be fully utilized.

Apple can get away with this approach more easily since they control the software as well as the hardware. They add accelerators for their own ProRes format that Final Cut uses and don't necessarily care about other formats. They only recently added AV1 hardware decoders to their products, but will probably never add a dedicated hardware encoder.

That's what I mean when I say it's more difficult for PC makers to play in the same space as Apple. Apple just does things the way they want and if it's not for you, then you're not buying a Mac. The PC space is by far more flexible, but it comes at the expense of doing some of the targeted things that Apple does.

Doug S · May 14, 2024

biostud said:
Sure, some pros do, but there are still people who buy the most expensive because it must be better and then use their MBP purely for office work.

I don't have a problem with this as people can buy their computer for whatever reason, I'm just questioning how small a fraction of buyers knows the differences between an AMD, Intel, or Apple based computer.

I might be biased because I work with teenagers in a school environment though.

I think most people know an Apple computer is different than the rest, though they may not know exactly how other than something like "it won't run all your programs". That's not fair but that's probably what they've picked up over the years. A lot of people probably think Intel is found in "real" PCs, and AMD are found in cheap or off brand PCs. Again, that's not fair but that's probably what they've picked up over the years.

People have a lot of biases that may have a grain of truth or have been true in the past, and it takes a lot of time for them to be overcome.

Saylick · May 14, 2024

Mahboi said:
Well duh.
The real question is what will Zen 5 LP do and not do. And what will it do "well enough".

There's a world of difference between a low power core that can scroll through a document or webpage but needs to awaken the big cores on every single new tab or page load, and one that can do web/document editing and possibly also low power video decoding without awakening the rest.
I could easily see a low power island of 4 Zen 5 LP cores that could still have enough punch to let you watch Youtube/Twitch/Netflix or browse forums or Discord as a total beatdown on the market. 10+ hours of full scrolling/viewing/handling basic tasks without a single worry about battery.

Even if you have to set some limits like say "1080p video decoding but nothing above" or set some "low power mode" in Word or Chrome, you'd have a real monster product.
And with STX Halo you'd also have a strong graphical beast for when you want to play to boot.

My guess is that 4 Zen 5 LP cores can likely do a lot, especially if they retain much of the IPC uplift over Zen 4 that the vanilla Zen 5 core enjoys. What they strip out might be the additional transistors that allow Zen 5 to clock high. Combine that with the use of TSMC N3E FinFlex and a high density library, and a hypothetical Zen 5 LP that only clocks to 2.5 GHz can probably punch pretty high while sipping power in a much smaller area. Considering that MTL's LP-E cores have roughly similar IPC to Skylake (in INT; in FP it's lower), Zen 5 should have like 50% higher IPC. Those MTL LP-E cores are also clocked to 2.5 GHz but there's only 2 of them. 4 Zen 5 LP with 50% higher ST and 3x the MT of the 2 MTL LP-E cores will cover a lot of typical use scenarios.
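
As a sanity check on those ratios, here is a rough back-of-the-envelope sketch. It assumes throughput scales as cores x IPC x clock, normalizes the MTL LP-E core's IPC to 1.0, and takes the guessed 50% IPC advantage and 2.5 GHz clock for a hypothetical Zen 5 LP at face value. The function and variable names are made up for illustration, and perfect multi-thread scaling is an assumption, not a measurement.

```python
# Back-of-the-envelope check of the ST/MT ratios, assuming
# throughput ~ cores * IPC * clock and perfect MT scaling.

def relative_throughput(cores, ipc, clock_ghz):
    """Idealized throughput in arbitrary units."""
    return cores * ipc * clock_ghz


# Assumed figures taken from the post (IPC normalized to MTL LP-E = 1.0).
mtl_lpe = {"cores": 2, "ipc": 1.0, "clock_ghz": 2.5}
zen5_lp = {"cores": 4, "ipc": 1.5, "clock_ghz": 2.5}  # hypothetical core

# Single-thread: compare one core of each.
st_ratio = (relative_throughput(1, zen5_lp["ipc"], zen5_lp["clock_ghz"])
            / relative_throughput(1, mtl_lpe["ipc"], mtl_lpe["clock_ghz"]))

# Multi-thread: compare the full clusters.
mt_ratio = (relative_throughput(**zen5_lp)
            / relative_throughput(**mtl_lpe))

if __name__ == "__main__":
    print(f"ST ratio: {st_ratio:.2f}x, MT ratio: {mt_ratio:.2f}x")
```

Under these assumptions the numbers come out to 1.5x ST and 3x MT, which matches the estimate above; dropping the clock toward 1.8 GHz would scale both ratios down proportionally.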

adroc_thurston · May 14, 2024

Saylick said:
What they strip out might be the additional transistors that allow Zen 5 to clock high.

Dense already does that, big dawg.
Read the ISSCC slideware.

Saylick said:
a hypothetical Zen 5 LP that only clocks to 2.5 GHz

Doesn't even need to go that high, most industry LITTLEs are 1.8ish.

Saylick · May 14, 2024

adroc_thurston said:
Dense already does that, big dawg.
Read the ISSCC slideware.

Yeah, I was trying to find it after I saw your comment. But the combined effect of reducing buffer cells due to a lower clock target + prolific use of a high density library should result in a very compact, power efficient core. There's nothing novel to this approach, but it will be interesting to see how small AMD got it down to.

Edit: Found some of the slides: https://www.allaboutcircuits.com/ne...zen-4cthe-area-optimized-cloud-computing-core

But it isn't entirely clear to me how much of the core area they shaved off from removing logic transistors. They were able to shrink the SRAM by using a different layout, but that's not the logic itself. For this reason, I think Dylan's Zen 4c analysis is more thought-provoking: https://www.semianalysis.com/p/zen-4c-amds-response-to-hyperscale

adroc_thurston said:
Doesn't even need to go that high, most industry LITTLEs are 1.8ish.

Sure, whatever it takes to serve >80% of the typical user's needs.

Mopetar · May 14, 2024

jpiniero said:
You'll laugh but I think if Strix Halo ends up being popular with OEMs, it'll be because of AI.

I think that it'll be most popular with CAD users or people dealing with 3D graphics.

For serious AI work you'll want a top end workstation if you're doing the work locally. If you aren't, a $200 Chromebook that can SSH into a workstation is about as good as anything else.

adroc_thurston · May 14, 2024

Saylick said:
But the combined effect of reducing buffer cells due to a lower clock target + prolific use of a high density library should result in a very compact, power efficient core

AMD's also doing some real funky SRAM stuff for dense.

Saylick said:
but it will be interesting to see how small AMD got it down to.

The real question is just how castrated Z5LP really is.

Saylick said:
Sure, whatever it takes to serve >80% of the typical user's needs.

yeah needa ask what's the standard (for Apple) LITTLE cluster residency.

Mopetar said:
I think that it'll be most popular with CAD users or people dealing with 3D graphics.

see? very smart.
Getting the ISV certs and making sure Radeon runs fine with antique CAD garbage will be tricky, but the upside is funny.
