Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Kaluan · Apr 24, 2023

I wonder if Strix Point will have only 1 monolithic die (up to 4C+8c/16CU), or split into 2 like PHX. Same for Grey Hawk refresh.

Those, along with PHX2 and Mendocino may fit into fanless designs.

RDNA3+ is allegedly implementing the scalar arithmetic logic units planned for RDNA4 (Kepler said), along with reworks of whatever got busted in the current design (my guess), so it will likely clock much higher/be more efficient and have fewer workarounds/bottlenecks in feeding data through the graphics pipeline.

Anhiel · Apr 24, 2023

At that size it makes no sense to use chiplets. They'll need another chiplet design for lower end products. I'm guessing that work will go along with the embedded lineup.

Anyhow, the more interesting question is whether the IO die for Strix Halo is made of two chiplets each with 20CU and 128-bit memory controller/connections.

Obviously, the number of different chiplets has increased for the Zen5 generation. That's gonna cost a lot, and it comes with larger and more complex organization/package production overhead.
I just hope pricing won't be too bad for Strix Halo SKUs or it won't sell much better than Apple's M-MAX, Ultras etc. Apple still has a bandwidth advantage here.

And since it seems MTL might have 2 Crestmont cores in their SOC, Intel has stepped up their idle power usage advantage. In these new leaks Zen5c/d cores still don't seem to offer any real power saving advantage beyond lowering CLK. Well, the 3% savings I get could as well be a rounding error here. So it seems they only offer an area advantage. Good enough for mobile and high core density. But without going to a smaller process node I have my doubts about their competitiveness against Intel's e-cores and ARM.

BorisTheBlade82 · Apr 24, 2023

Kaluan said:
I wonder if Strix Point will have only 1 monolithic die (up to 4C+8c/16CU), or split into 2 like PHX. Same for Grey Hawk refresh.

How come you think that PHX is not monolithic? Might you have confused that with Dragon Range (Raphael, but for mobile)? And did you mean Hawk Point, which basically is to Phoenix Point what Lucienne was to Renoir?

Tigerick · Apr 26, 2023

Guys, I have compiled a list of AMD's upcoming mobile APUs for next year. Man, AMD is really targeting the full range of notebook segments and price points. There are six new models; I am trying to list the main specs and launch dates. Again, this is based on leaks from RGT and MLID, plus some speculation and corrections of mine. So if you have any insights please let me know, and I will update the table accordingly.

Name	Model	Launch Date	Node	CPU cores	L3 Cache	Memory LPDDR5x	Memory BW	GPU	ALU	IC	AIE (TOPS)	TDP
Escher	R3 8050	2025	N4P	2xZen5 + 4xZen5c	12 MB	64-bit 8533	68 GB/s	RDNA3+ 4 CU 256SP	512	NA	?	15W
STX	R7 8050	Q3 2024	N4P	4xZen5 + 8xZen5c	24 MB	128-bit 8533	136 GB/s	RDNA3+ 8CU 512SP	1024	NA	20	15-45W
STX Halo/ Sarlak	R9 8050	Q4 2024	N4P + N4P	8xZen5 + 8xZen5c	40 MB	256-bit 8533	272 GB/s	RDNA3+ 20CU 1280SP	2560	32MB	40	55W+
				6xZen5 + 8xZen5c	32 MB	192-bit 8533	204 GB/s	RDNA3+ 16CU 1024SP	2048	?	?
Fire Range	R7&9 8055	Q3 2024	N4P x 3	16xZen5	64 MB	128-bit		RDNA3+ 2CU 128SP	256	NA	20	55W+
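
A quick sanity check on the Memory BW column, as a rough sketch: the figures look like plain bus width times transfer rate. This assumes peak theoretical bandwidth with no protocol overhead, and the function name `peak_bandwidth_gbs` is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Peak theoretical bandwidth in GB/s (decimal), ignoring any overhead."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000


# Bus widths from the table above, all at LPDDR5x-8533.
for width in (64, 128, 192, 256):
    print(f"{width}-bit: {peak_bandwidth_gbs(width, 8533):.1f} GB/s")
```

The 64-, 128- and 192-bit rows line up with the table once rounded down. The 256-bit row works out closer to 273 GB/s, so the 272 in the table looks like 2 x 136.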

soresu · Apr 26, 2023

Tigerick said:
Guys, I have compiled a list of AMD's upcoming mobile APUs for next year. Man, AMD is really targeting the full range of notebook segments and price points. There are six new models; I am trying to list the main specs and launch dates. Again, this is based on leaks from RGT and MLID, plus some speculation and corrections of mine. So if you have any insights please let me know, and I will update the table accordingly.

Name	Model	Launch Date	Node	CPU cores	L3 Cache	Memory LPDDR5x	Memory BW	GPU	ALU	IC	TDP
PHX2	R5 7040	Q3 2023	N4 115mm2	2xZen4 + 2xZen4c	4+4=8MB	64-bit 7500	60 GB/s	RDNA3 4CU 256SP	256	NA	15W
STX3	R5 8050	Q3 2024	N4P	2xZen5 + 4xZen5c	12 MB	64-bit 8533	68 GB/s	RDNA3+ 4CU 256SP	512	NA	15W
PHX+ Hawk Point	R7&9 8040	Q1 2024	N4P	8xZen4	16 MB	128-bit 8533	136 GB/s	RDNA3+ 12CU 768SP	1536	NA	15-45W
STX2	R7 8050	Q3 2024	N4P	4xZen5 + 8xZen5c	24 MB	128-bit 8533	136 GB/s	RDNA3+ 8CU 512SP	1024	NA	15-45W
STX1 Halo	R9 8050	Q3 2024	N4P + N4P	8xZen5 + 8xZen5c	32 MB	256-bit 8533	272 GB/s	RDNA3+ 20CU 1280SP	2560	32MB	25-120W
				6xZen5 + 8xZen5c	28 MB	128-bit 8533	136 GB/s	RDNA3+ 10CU 640SP	1280	32MB
Fire Range	R7&9 8055	Q3 2024	N4P x 3	16xZen5	64 MB	128-bit		RDNA3+ 2CU 128SP	256	NA	55W+

I have my doubts about a 20 CU APU needing 120W TDP when there is a 12 CU APU in that table that only draws a max of 45W.

Unless Zen5 needs a dramatic increase in power of course 🤔

Abwx · Apr 26, 2023

soresu said:
I have my doubts about a 20 CU APU needing 120W TDP when there is a 12 CU APU in that table that only draws a max of 45W.

Unless Zen5 needs a dramatic increase in power of course 🤔

If CPU + GPU use 45W with 12CUs then a 20CU APU will use at most 70W, and that's at the same node and uarch.

maddie · Apr 26, 2023

soresu said:
I have my doubts about a 20 CU APU needing 120W TDP when there is a 12 CU APU in that table that only draws a max of 45W.

Unless Zen5 needs a dramatic increase in power of course 🤔

Needs?

soresu · Apr 26, 2023

maddie said:
Needs?

For the same frequency and core count.

maddie · Apr 26, 2023

soresu said:
For the same frequency and core count.

I looked for, and didn't see frequency in that table, hence the question.

Panino Manino · Apr 26, 2023

And here we go again:

Is there any news from his own sources, or is it just what is already known plus his projections?

moinmoin · Apr 26, 2023

Needs...

Through all the fanless configuration talk here before, I kept scratching my head, since my experience with my 6c6t 4500U is that it runs at 3-4W whole-system most of the time, with a TDP limit of 9W only making a difference in longer-running heavy compute tasks (especially iGPU). So from that POV the problem isn't AMD lacking a product for low-TDP use, but manufacturers not taking existing chips and limiting them to such low TDPs.

Instead, the default TDP setting seems to be on the rise in laptops as well now.

uzzi38 · Apr 26, 2023

Panino Manino said:
And here we go again:

Is there any news from his own sources, or is it just what is already known plus his projections?

I've seen two screenshots from this and my only response is

Anhiel · Apr 26, 2023

soresu said:
I have my doubts about a 20 CU APU needing 120W TDP when there is a 12 CU APU in that table that only draws a max of 45W.

Unless Zen5 needs a dramatic increase in power of course 🤔

Abwx said:
If CPU + GPU use 45W with 12CUs then a 20CU APU will use at most 70W, and that's at the same node and uarch.

The iGPU should consume around 37.5W. The uncertainty here is the RDNA4 modification.
Supposedly, the cores are clocked similarly to the Ryzen 9 7940HS: with a boost of 5.2GHz and all-core ~4.9GHz, the 16 cores should consume around 2x40W. So 120W looks legit. (Compared to the 7950X we can see here how horribly bad TDP gets with higher clocks from here on out.)
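
To make the sum above explicit, here is a minimal sketch. All three inputs are the estimates from this post, not measurements, and treating the 16 cores as two 8-core groups at 40W each is just one reading of "2x40W"; `package_estimate` is an illustrative name.

```python
# Estimates taken from the post above; none of these are measured values.
IGPU_WATTS = 37.5        # estimated iGPU draw
CORE_GROUP_WATTS = 40.0  # estimated draw per 8-core group at ~4.9GHz all-core
CORE_GROUPS = 2          # assumption: 16 cores treated as 2 x 8


def package_estimate(igpu_w: float, group_w: float, groups: int) -> float:
    """Add the iGPU estimate to the combined CPU core-group estimate."""
    return igpu_w + group_w * groups


total = package_estimate(IGPU_WATTS, CORE_GROUP_WATTS, CORE_GROUPS)
print(f"Estimated package power: {total:.1f} W")
```

That comes to 117.5W, which is in the neighborhood of the rumored 120W figure.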

Panino Manino · Apr 26, 2023

uzzi38 said:
I've seen two screenshots from this and my only response is

He says his exclusive info is chips with 2MB and 3MB L2 (1% IPC for ST and 4% and 7% IPC for MT), with zero latency increase.

Saylick · Apr 26, 2023

lol at the literal sketch of a ladder on top of that AMD slide.

Ultimately, it looks like AMD is moving from a ring bus to a mesh of some sort? That way, data from one core doesn't have to traverse all the way around the ends of the loop to get to a core that is directly across from it.

Joe NYC · Apr 26, 2023

Panino Manino said:
And here we go again:

Is there any news from his own sources, or is it just what is already known plus his projections?

2 takeaways for me:
- the new bus, that can scale to 16 unified cores in a single CCD.
- larger L2 without latency penalty being possible in future cores

Two of those combined lead me to believe that AMD plans on dropping L3 entirely from future generations of processors. Unknown if it will be Zen 5 or Zen 6.

Abwx · Apr 26, 2023

Anhiel said:
The iGPU should consume around 37.5W. The uncertainty here is the RDNA4 modification.
Supposedly, the cores are clocked similarly to the Ryzen 9 7940HS: with a boost of 5.2GHz and all-core ~4.9GHz, the 16 cores should consume around 2x40W. So 120W looks legit. (Compared to the 7950X we can see here how horribly bad TDP gets with higher clocks from here on out.)

At 45W the CPU and uncore use 10W, 35W is left for the GPU, if you increase the CU count by 67% from 12 to 20 CUs then GPU power will increase accordingly to 35 x 1.67 = 58W.
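
The scaling argument above can be sketched like this. It assumes GPU power grows linearly with CU count at the same clocks, node and uarch, and that the CPU + uncore share stays at 10W; both are assumptions from the post, and `scale_gpu_power` is an illustrative name.

```python
def scale_gpu_power(base_gpu_w: float, base_cus: int, target_cus: int) -> float:
    """Scale GPU power linearly with CU count (same clocks, node and uarch assumed)."""
    return base_gpu_w * target_cus / base_cus


CPU_UNCORE_W = 10.0               # CPU + uncore share of the 45W budget, per the post
GPU_W_12CU = 45.0 - CPU_UNCORE_W  # what is left for the 12 CU GPU

gpu_w_20cu = scale_gpu_power(GPU_W_12CU, 12, 20)
print(f"GPU at 20 CUs: {gpu_w_20cu:.1f} W")
print(f"Package at 20 CUs: {gpu_w_20cu + CPU_UNCORE_W:.1f} W")
```

With these inputs the GPU lands at roughly 58W and the whole package just under 70W.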

A/// · Apr 26, 2023

stop linking to rumor mills with no credibility thank you.

Kepler_L2 · Apr 26, 2023

Tigerick said:
Guys, I have compiled a list of AMD's upcoming mobile APUs for next year. Man, AMD is really targeting the full range of notebook segments and price points. There are six new models; I am trying to list the main specs and launch dates. Again, this is based on leaks from RGT and MLID, plus some speculation and corrections of mine. So if you have any insights please let me know, and I will update the table accordingly.

Name	Model	Launch Date	Node	CPU cores	L3 Cache	Memory LPDDR5x	Memory BW	GPU	ALU	IC	TDP
PHX2	R5 7040	Q3 2023	N4 115mm2	2xZen4 + 2xZen4c	4+4=8MB	64-bit 7500	60 GB/s	RDNA3 4CU 256SP	256	NA	15W
STX3	R5 8050	Q3 2024	N4P	2xZen5 + 4xZen5c	12 MB	64-bit 8533	68 GB/s	RDNA3+ 4CU 256SP	512	NA	15W
PHX+ Hawk Point	R7&9 8040	Q1 2024	N4P	8xZen4	16 MB	128-bit 8533	136 GB/s	RDNA3+ 12CU 768SP	1536	NA	15-45W
STX2	R7 8050	Q3 2024	N4P	4xZen5 + 8xZen5c	24 MB	128-bit 8533	136 GB/s	RDNA3+ 8CU 512SP	1024	NA	15-45W
STX1 Halo	R9 8050	Q3 2024	N4P + N4P	8xZen5 + 8xZen5c	32 MB	256-bit 8533	272 GB/s	RDNA3+ 20CU 1280SP	2560	32MB	25-120W
				6xZen5 + 8xZen5c	28 MB	128-bit 8533	136 GB/s	RDNA3+ 10CU 640SP	1280	32MB
Fire Range	R7&9 8055	Q3 2024	N4P x 3	16xZen5	64 MB	128-bit		RDNA3+ 2CU 128SP	256	NA	55W+

AFAIK Strix Point (4+8, 16 CU, 4nm monolithic) is STX1, it was originally 8+4 and 3nm but got redefined due to TSMC issues. Strix Halo is (at least internally) called SAR(Sarlak) and STX3 has been cancelled entirely.

itsmydamnation · Apr 26, 2023

Joe NYC said:
2 takeaways for me:
- the new bus, that can scale to 16 unified cores in a single CCD.
- larger L2 without latency penalty being possible in future cores

Two of those combined lead me to believe that AMD plans on dropping L3 entirely from future generations of processors. Unknown if it will be Zen 5 or Zen 6.

how do you do cache coherency .

Why would you drop L3 when you could do what IBM has done.

You people make 0 sense all the damn time!

Panino Manino · Apr 26, 2023

itsmydamnation said:
how do you do cache coherency .

Why would you drop L3 when you could do what IBM has done.

You people make 0 sense all the damn time!

I think what he means is making L1 and L2 as big as possible and having L3 only as V-Cache.

Joe NYC · Apr 26, 2023

itsmydamnation said:
how do you do cache coherency .

Why would you drop L3 when you could do what IBM has done.

You people make 0 sense all the damn time!

I meant drop L3 from main CCD die. (sorry, about not spelling it out).

More cores (up to 16) crowding the CCD and L2 size decreasing the need for L3 would seem like a good way to push L3 out of the CCD and maybe only into the V-Cache or some other level of cache, such as system level cache.

BTW, those large L2s, if they acted together the same way as what IBM outlined, would work even better than having L3 present on the die.

Anhiel · Apr 26, 2023

Abwx said:
At 45W the CPU and uncore use 10W, 35W is left for the GPU, if you increase the CU count by 67% from 12 to 20 CUs then GPU power will increase accordingly to 35 x 1.67 = 58W.

Dunno which SKU you refer to but I'm pretty sure your information is wrong for many reasons.
One, we are looking at an N5 to N4 transition; two, the Radeon 780M (N5, 12CU) has a 15W TDP. These are the facts. Look it up on TechPowerUp.
Assuming the minor changes from RDNA4 improvement and assuming the basic 16CU eats up the little N4 30% power savings.. this gives 2.5x for 40CU, hence, my 37.5W.
What's been left out here is the 7? core AI engine we don't know anything about. I suppose it can be dropped from ordinary usage scenario.
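
Spelled out as a minimal sketch, the 37.5W figure is just the 15W baseline scaled by CU count. The 16 CU baseline and the 40 CU target are the assumptions stated in this post, and `igpu_estimate` is an illustrative name.

```python
BASELINE_W = 15.0   # Radeon 780M figure cited in the post
BASELINE_CUS = 16   # assumption from the post: N4 savings absorb the growth to 16 CUs
TARGET_CUS = 40     # CU count assumed in the post


def igpu_estimate(baseline_w: float, baseline_cus: int, target_cus: int) -> float:
    """Scale the baseline iGPU power linearly with CU count."""
    return baseline_w * target_cus / baseline_cus


print(f"Scale factor: {TARGET_CUS / BASELINE_CUS}x")
print(f"Estimated iGPU power: {igpu_estimate(BASELINE_W, BASELINE_CUS, TARGET_CUS)} W")
```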

Anhiel · Apr 26, 2023

That cache "leak" above has so many errors especially when it contradicts the server source referred to.

Joe NYC said:
I meant drop L3 from main CCD die. (sorry, about not spelling it out).

More cores (up to 16) crowding the CCD and L2 size decreasing the need for L3 would seem like a good way to push L3 out of the CCD and maybe only into the V-Cache or some other level of cache, such as system level cache.

BTW, those large L2s, if they acted together the same way as what IBM outlined, would work even better than having L3 present on the die.

Doesn't make a lot of sense at this "early" point in time, not to mention the voltage sensitivity of having L3 as a V-cache die on top.
The recent burnt-out CPUs are proof of the current unsafe design/control.

I don't think L2 has grown as significantly as suggested, for the simple reason that the die size has to be kept as close to the previous size as possible despite node shrinks, which were mostly used to add other stuff (5-wide expansion) or to squeeze in the c/d variants. Not to mention SRAM doesn't shrink well.

The interesting rumor bit is that L2 is somehow unified. I can only guess it's somewhat similar to IBM's Z15 virtual cache or moving toward that kind of solution. I'm guessing it's more like keeping core-bound data in closer L2 cells and larger data in more distant, area-sharing cells. It's still conjecture at this point. Hence, depending on how you view it, the L2 has grown even if it hasn't.

Tigerick · Apr 27, 2023

soresu said:
I have my doubts about a 20 CU APU needing 120W TDP when there is a 12 CU APU in that table that only draws a max of 45W.

Unless Zen5 needs a dramatic increase in power of course 🤔

Yeah, 120W does seem high; even the 16-core Zen4 7945HX only requires 75W. Besides high clocks, the reasons I can think of are the additional 128-bit memory bus (that would require 32-bit x 4) and the huge amount of cache (FYI, the M2 Pro has a total L2+L3 cache of 60MB; STX Halo has 64MB total cache excluding the L2 cache of each CPU core. I would assume at least 16MB of L2 cache, so the total cache of Halo would be at least 80MB). That's why I don't believe the 96MB rumors unless Zen5/5c has 2MB of L2 cache each...

Name	Model	Launch Date	Node	CPU cores	L3 Cache	Memory LPDDR5x	Memory BW	GPU	ALU	IC	TDP
PHX2	R5 7040	Q3 2023	N4 115mm2	2xZen4 + 2xZen4c	4+4=8MB	64-bit 7500	60 GB/s	RDNA3 4CU 256SP	256	NA	15W
STX3	R5 8050	Q3 2024	N4P	2xZen5 + 4xZen5c	12 MB	64-bit 8533	68 GB/s	RDNA3+ 4CU 256SP	512	NA	15W
PHX+ Hawk Point	R7&9 8040	Q1 2024	N4P	8xZen4	16 MB	128-bit 8533	136 GB/s	RDNA3+ 12CU 768SP	1536	NA	15-45W
STX2	R7 8050	Q3 2024	N4P	4xZen5 + 8xZen5c	24 MB	128-bit 8533	136 GB/s	RDNA3+ 8CU 512SP	1024	NA	15-45W
STX1 Halo	R9 8050	Q3 2024	N4P + N4P	8xZen5 + 8xZen5c	32 MB	256-bit 8533	272 GB/s	RDNA3+ 20CU 1280SP	2560	32MB	25-120W
				6xZen5 + 8xZen5c	28 MB	128-bit 8533	136 GB/s	RDNA3+ 10CU 640SP	1280	32MB
Fire Range	R7&9 8055	Q3 2024	N4P x 3	16xZen5	64 MB	128-bit		RDNA3+ 2CU 128SP	256	NA	55W+
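
For the cache totals discussed above, a rough sketch of the arithmetic. It reads the IC column as a 32MB cache added on top of the 32MB L3 (giving the 64MB quoted), counts 16 cores for the top STX Halo row, and treats L2 per core as the unknown; `total_cache_mb` is an illustrative name.

```python
L3_MB = 32   # L3 Cache column, top STX Halo row
IC_MB = 32   # IC column, same row
CORES = 16   # 8x Zen5 + 8x Zen5c


def total_cache_mb(l2_per_core_mb: int) -> int:
    """Total cache for a given per-core L2 size, in MB."""
    return L3_MB + IC_MB + CORES * l2_per_core_mb


for l2 in (1, 2):
    print(f"{l2}MB L2 per core -> {total_cache_mb(l2)}MB total")
```

So 1MB of L2 per core gives the 80MB floor, and the 96MB rumor only works out if every core carries 2MB of L2.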
