Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Philste

Member
Oct 13, 2023
70
131
66
I'll preface this by saying I don't know Strix Point perf numbers, but I still wouldn't expect a 50% bump over PHX personally.

I'm expecting more along the lines of ~25-30% perf at best. Memory bandwidth is the concern really, PHX seems massively mem bw bound by 28w.
Don't hate me for this, but right now I wonder if Strix' iGPU will be any faster than Phoenix. If Strix really doesn't have a bigger GPU Cache (which is likely, because Microsoft wants big AIE) I think it will be completely memory bound.

I mean look at Phoenix:
7600 is ~8-10% faster than 6650XT at mostly the same clocks in most reviews, so that's what RDNA3 delivers over RDNA2. 780M clocks ~17% (2.8GHz vs 2.4GHz) higher than 680M. So the expected performance uplift of 780M would be in the 25-30% region. However, with the same RAM it's barely 10% faster, more like 5%. If Strix just dumps 2 more WGPs in there, I don't see how it is any faster than 780M at ISO RAM.
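
For anyone who wants to check the arithmetic, here is a rough sketch of that estimate. It assumes the architectural gain and the clock gain simply multiply, which is an idealization, and it uses only the figures quoted in the post (the function name is made up for illustration). With 2.8GHz vs 2.4GHz the clock ratio works out to about 17%.

```python
# Back-of-the-envelope estimate of the 780M's expected uplift over the 680M.
# Inputs are the figures quoted in the post; nothing here is measured.
# ASSUMPTION: architectural gain and clock gain combine multiplicatively.

def expected_uplift(arch_gain: float, clock_new_ghz: float, clock_old_ghz: float) -> float:
    """Return the combined uplift as a fraction (0.26 means 26%)."""
    return (1.0 + arch_gain) * (clock_new_ghz / clock_old_ghz) - 1.0

clock_gain = 2.8 / 2.4 - 1.0
low = expected_uplift(0.08, 2.8, 2.4)   # RDNA3 over RDNA2 at +8%
high = expected_uplift(0.10, 2.8, 2.4)  # RDNA3 over RDNA2 at +10%

print(f"clock gain: {clock_gain:.1%}")
print(f"expected uplift: {low:.1%} to {high:.1%}")
```

That lands inside the 25-30% region, so the gap to the observed 5-10% is what would have to be explained by memory bandwidth.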
 

FlameTail

Platinum Member
Dec 15, 2021
2,356
1,276
106
Don't hate me for this, but right now I wonder if Strix' iGPU will be any faster than Phoenix. If Strix really doesn't have a bigger GPU Cache (which is likely, because Microsoft wants big AIE) I think it will be completely memory bound.

I mean look at Phoenix:
7600 is ~8-10% faster than 6650XT at mostly the same clocks in most reviews, so that's what RDNA3 delivers over RDNA2. 780M clocks ~17% (2.8GHz vs 2.4GHz) higher than 680M. So the expected performance uplift of 780M would be in the 25-30% region. However, with the same RAM it's barely 10% faster, more like 5%. If Strix just dumps 2 more WGPs in there, I don't see how it is any faster than 780M at ISO RAM.
Strix Point vs Phoenix
225 mm² vs 178 mm²
LPDDR5X-8533 vs LPDDR5X-7500

There's a minor uplift in RAM bandwidth. Also, they are adding 47 mm² of silicon. Surely that's enough to squeeze in a bigger GPU cache?
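
To put a number on the bandwidth uplift, here is a small sketch of peak theoretical bandwidth for the two memory speeds listed. The 128-bit bus width is an assumption on my part (it is not stated in this thread), and the function name is just for illustration. Real-world bandwidth will be lower than these peak figures.

```python
# Peak theoretical DRAM bandwidth for the two memory speeds quoted above.
# ASSUMPTION: a 128-bit memory bus on both parts (not stated in the thread).

BUS_WIDTH_BITS = 128  # assumed

def peak_bandwidth_gbs(transfer_rate_mts: int, bus_width_bits: int = BUS_WIDTH_BITS) -> float:
    """Peak bandwidth in GB/s = transfers per second * bytes per transfer."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9

phx = peak_bandwidth_gbs(7500)  # LPDDR5X-7500
stx = peak_bandwidth_gbs(8533)  # LPDDR5X-8533
uplift = stx / phx - 1.0
area_delta_mm2 = 225 - 178      # die sizes quoted above

print(f"PHX: {phx:.1f} GB/s, STX: {stx:.1f} GB/s, uplift: {uplift:.1%}")
print(f"extra silicon: {area_delta_mm2} mm^2")
```

The ratio does not depend on the assumed bus width as long as it is the same on both parts, and it only applies if the OEM actually fits the fastest supported memory.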
 

Philste

Member
Oct 13, 2023
70
131
66
LPDDR5X-8533 vs LPDDR5X-7500
That's what it supports, but the OEM decides what's in there in the end.
There's a minor uplift in RAM bandwidth. Also, they are adding 47 mm² of silicon. Surely that's enough to squeeze in a bigger GPU cache?
ZEN5 is a lot wider than ZEN4; there's a reason it uses a 4+8 design that probably takes the same space as an 8-core ZEN5 design. 50% more L2, 50% more L3, 2 more WGPs that may also be bigger because of RDNA3.5. Probably more/more modern IO (PCIe 5.0?). Bigger AIE. That's a lot to fit in those 47mm², don't you think? It's N4 vs N4(P?) after all.
 

Hitman928

Diamond Member
Apr 15, 2012
5,392
8,278
136
Defining real-world benches might be helpful for this discussion?

Not to jump too much into this as I actually put more value in spec_int, but your real world application comparison actually favors Zen4. You have Cinebench r20 three times, Cinebench r23 once, then 2 other tests. So basically it shows that RPL leads Zen4 in Cinebench by about 10% but then Zen4 leads in 7Zip by about 10% and the "tie breaker" would be Wprime where Zen4 leads by about 20%. This leaves you with Zen4 being 6% faster per clock on average across the 3 different workloads. Even if you count CB R20 and R23 separately, Zen4 would still be about 2% faster on average in the shown tests.
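
A quick sketch of that averaging, for reference. The post does not say which kind of mean was used, so the geometric mean here is my assumption; the inputs are the rounded per-workload leads quoted above, expressed as Zen4/RPL ratios, and the `geomean` helper is just for illustration.

```python
import math

# Per-clock Zen4/RPL ratios, using the rounded leads quoted in the post.
# ASSUMPTION: geometric mean (the post does not say which mean was used).

def geomean(ratios):
    """Geometric mean of a list of positive ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

cinebench = 1 / 1.10  # RPL leads by ~10%
sevenzip = 1.10       # Zen4 leads by ~10%
wprime = 1.20         # Zen4 leads by ~20%

three_workloads = geomean([cinebench, sevenzip, wprime])
cb_counted_twice = geomean([cinebench, cinebench, sevenzip, wprime])

print(f"3 workloads: Zen4 ahead by {three_workloads - 1:.1%}")
print(f"CB R20 and R23 counted separately: Zen4 ahead by {cb_counted_twice - 1:.1%}")
```

Both results line up with the roughly 6% and 2% figures above, and they show how much the answer moves just by deciding how many times Cinebench gets counted.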
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Strix Point:
STX
TSMC N4P 225mm²
4c Zen 5 L3: 16 MB L2: 4 MB
8c Zen 5C L3: 16 MB L2: 8 MB
8 WGP RDNA3+
64 AIE tile
DDR5-5600 / LPDDR5X-8533
28-35+ W
Got corrected immediately to 8MB L3 for the ZEN5c CCX. So the 4 ZEN5 Cores get 16MB L3, the 8 ZEN5c Cores get 8MB L3.
Hmm, if that correction is right, then the total L3 cache of STX is 24MB... 50% larger than PHX
Describing two separate caches by the sum of their sizes is alright when you discuss just area, but functionally that's a rather idealized quantification and overstates the usefulness of these caches in many relevant scenarios. Personally I would expect the designers of a mobile CPU to go for a unified last level cache. But who knows, CPUs with stranger properties have been released before.
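
A tiny sketch of the difference between the summed figure and what a core would see locally. This assumes the two-CCX split from the rumored spec above, which is not confirmed, and takes PHX's L3 as 16 MB, which is what the "50% larger" remark implies.

```python
# L3 sizes in MB from the rumored spec quoted above.
# ASSUMPTION: two separate CCXs, each with its own L3 (unconfirmed).
# PHX's 16 MB L3 is implied by the "50% larger than PHX" remark.

stx_l3_mb = {"zen5_ccx": 16, "zen5c_ccx": 8}
PHX_L3_MB = 16

total_mb = sum(stx_l3_mb.values())
total_vs_phx = total_mb / PHX_L3_MB - 1.0
zen5c_local_vs_phx = stx_l3_mb["zen5c_ccx"] / PHX_L3_MB - 1.0

print(f"summed L3: {total_mb} MB ({total_vs_phx:+.0%} vs PHX)")
print(f"L3 local to a Zen 5c core: {stx_l3_mb['zen5c_ccx']} MB ({zen5c_local_vs_phx:+.0%} vs PHX)")
```

So the same layout reads as +50% by area and -50% from the point of view of a Zen 5c core, if the caches really are split.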
 

misuspita

Senior member
Jul 15, 2006
403
464
136
Clocks may not be that high on Strix Halo. Depends on how high the power draw is going to go.
How high can they go? CPU wise, 40-65W is about enough... For GPU, sky's the limit. I wish they would do 100W at least, but I admit I have no idea if it is a mobile only or mixed desktop mobile part

I am really curious about this one, 'cause if it really is that powerful GPU-wise, it's an instabuy for me. My current 5700G can chug along until 2025 just fine.
 

qmech

Member
Jan 29, 2022
82
179
66
We receive Zen6 DT codename before Zen5 release


I don't know, there's no new information about production recently. But some gossip suggests Zen5 would utilize the XDNA2 architecture.

I am not sure I would call the leaked roadmap "gossip". It has been accurate as far as Hawk Point is concerned.

A leaked roadmap is not as good a source as an official roadmap, but still somewhat above "gossip".
 

Joe NYC

Platinum Member
Jun 26, 2021
2,072
2,585
106
My first message here ... hi everybody !!!


I've been reading you silently for several months ... and ... I admit it : I am an enthusiast ignorant ... BUT , I would like to know ...

My question is :

I would like to know, if AMD were able to 'roll out' Strix Point in ... let's say May ... why would they do it in October instead ??

I can think of 'commercial reasons' ... to give Hawk Point some time to sell ...

On the other hand ... from my lack of knowledge, I think the sooner Strix Point (Zen 5) hits retailers' shelves ... the more market share AMD will take away from Intel ... at least as far as the laptop market is concerned.


Please ... shed some 'light' ...


Regards.

I think the answer to this (Strix Point) is different from, say, Zen 5 desktop. Zen 5 desktop can be put in the retail box and shipped right away when it is ready. A notebook chip is sent to OEMs, and then the OEMs have to be ready with their notebooks to be able to ship them.

But the broader question is a valid one. Tom of MLID raised it in a recent podcast. Namely, with Zen 4 being so strong, what is the reason to release Zen 5?

I think delaying a new (stronger) product is a mistake in a competitive environment. Releasing the product puts it into the higher competitive slots, which should lead to higher sales and higher revenue. Postponing the release would forgo this advantage for a period of time, which is the equivalent of leaving money on the table.
 

misuspita

Senior member
Jul 15, 2006
403
464
136
I don't mind pricey if it's also silent(y). A 13700 + 4070 is around 2k now, so if they manage to sell a miniPC at 1000-1500, I'd buy it. Also, if they can cool that combo, they could, potentially, cool a 180W NUC
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
Describing two separate caches by the sum of their sizes is alright when you discuss just area, but functionally that's a rather idealized quantification and overstates the usefulness of these caches in many relevant scenarios. Personally I would expect the designers of a mobile CPU to go for a unified last level cache. But who knows, CPUs with stranger properties have been released before.
L3 cache is not separate; I don't know why you think it's separate.
 

ikjadoon

Member
Sep 4, 2006
119
172
126
@ikjadoon, why are you first asking @Markfw to define real-world benchmarks, then leave the real world and provide Cinebench 1T figures? The Cinebench benchmark measures performance of a rendering engine which, in the real world, is used for widely parallel problems and, importantly, in its much more effective GPGPU implementation, not in the plain CPU implementation anymore these days.

I understand that @Markfw was primarily referring to fully parallel scientific computing. E.g. molecular dynamics, telescope data processing, number-theoretical transforms... That's often n×1-copies×threads, sometimes n×m-copies×threads. I am running such stuff myself and occasionally implemented actual reproducible benchmarks based on selected workunits of such science tasks.

...

Anyway; this side discussion was started with a statement that Intel had "better IPC" (without further qualification) and the implication that this is one of the aspects why the poster thought that AMD should offer Zen 5 based products as soon as they can. Meanwhile we have seen that there are indeed situations in which higher clock-normalized performance is observed on Intel CPUs, while on the other hand everybody can have a different opinion on how much this fact can or should influence AMD's time-to-market efforts. :-)

That's the nail on the head, Stefan. "Real-world" is too vague to be a useful term. It's not far from the "best" benchmark. Best for whom, and under what circumstances? Everyone draws their line somewhere else.

Surely you'd agree why defining it makes sense.

It's ideally also why "IPC" without qualifiers as a catch-all term should be retired (both the practical reason that it's workload-specific and the pedantic reason that we don't measure instructions) in favour of "performance in XYZ workload at identical clocks".

Discussing IPC without a workload is like discussing frames per second without a game. Or, since we commonly take an average, it matters what gets averaged. The industry relies on SPEC, but if SPEC doesn't fit one's use-case, then one should define the use-case.

Agreed: I don't think any vendor is reading 1) these forums and 2) these discussions to magically adjust their launch timing, "The AnandTech forum posters nailed it: we must do more now."

Not to jump too much into this as I actually put more value in spec_int, but your real world application comparison actually favors Zen4. You have Cinebench r20 three times, Cinebench r23 once, then 2 other tests. So basically it shows that RPL leads Zen4 in Cinebench by about 10% but then Zen4 leads in 7Zip by about 10% and the "tie breaker" would be Wprime where Zen4 leads by about 20%. This leaves you with Zen4 being 6% faster per clock on average across the 3 different workloads. Even if you count CB R20 and R23 separately, Zen4 would still be about 2% faster on average in the shown tests.

I agree on spec_int.

Yes; I meant to share data, not necessarily conclude "Zen4 is lower IPC" → the IPC will depend on the benchmark. Thank you for checking the math; inverting wprime (Zen4 = 121.4%) & the geometric mean is 2.75% in favour of Zen4.

That's a great point on why it matters what goes into the average used to define "IPC".
 

Philste

Member
Oct 13, 2023
70
131
66
L3 cache is not separate; I don't know why you think it's separate.
If Strix doesn't change AMD's use of CCXs then it kinda is separate. The 4 ZEN5 cores will have fast access to 16MB L3 and the 8 ZEN5c cores will have fast access to 8MB L3. Cross-CCX latencies are usually much worse.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
If Strix doesn't change AMD's use of CCXs then it kinda is separate. The 4 ZEN5 cores will have fast access to 16MB L3 and the 8 ZEN5c cores will have fast access to 8MB L3. Cross-CCX latencies are usually much worse.
Aren't you comparing it to desktop CPUs, which have separate CCDs?
Does PHX2's cache look to you like it's separate? But it's true that it's only 6 cores in total.
Strix Point will most likely have a single CCX.
 