Question Zen 6 Speculation Thread

igor_kavinski · Wednesday at 2:10 PM

Josh128 said:
It's just a flat-out lie by AMD.

Their statement is based on both CPUs at 4 GHz.

Abwx · Wednesday at 2:10 PM

Josh128 said:
It's not for nT. IPC is not a relevant metric for nT because clocks vary with power efficiency and draw. Even if it were, it only improves by 1% vs ST, going from 10% to 11%, which is explainable by run-to-run variance. It's just a flat-out lie by AMD.

You don't get it: it also matters for MT, since it increases SMT scaling. Didn't you notice that this scaling was much improved by Zen 5? FTR, it's 44% for 7-Zip.

Besides, the 9950X and 9700X have lower all-core clocks than the 7950X/7700X, yet their better scaling more than compensates for the lower clocks.

StefanR5R · Wednesday at 2:51 PM

Folks argue over single-digit percentage differences but (some of them) can't even tell the terms "instructions per cycle", "iso-clock performance", and "performance per clock" apart.

Josh128 · Wednesday at 8:52 PM

StefanR5R said:
Folks argue over single-digit percentage differences but (some of them) can't even tell the terms "instructions per cycle", "iso-clock performance", and "performance per clock" apart.

Both CPUs boost to 5.6 GHz in R23 ST AFAIK. Why do tests at stock show an uplift barely more than half of AMD's claim? Instead of proclaiming that someone can't do such and such, why don't we test it? Someone here with a Zen 4 could lock their CPU to 4 GHz, then 5 GHz, do an R23 ST run at each, and post the results, and I'll do the same. Then we can see whether the stock boost clocks are off by enough to make up the difference I showed. I've seen no evidence from my own 9900X that it's not hitting a steady 5.6 GHz in the R23 ST test.
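The locked-clock test proposed here boils down to simple arithmetic, sketched below. The function names and the scores in the usage lines are hypothetical placeholders, not measurements from anyone in this thread; the point is only to show how an iso-clock uplift and a stock-clock uplift relate.

```python
def perf_per_clock(score: float, clock_ghz: float) -> float:
    """Benchmark points per GHz for a run at a locked clock."""
    return score / clock_ghz


def iso_clock_uplift(old_score: float, new_score: float) -> float:
    """Relative uplift when both CPUs are locked to the same clock."""
    return new_score / old_score - 1.0


def stock_uplift(iso_uplift: float, old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Uplift expected at stock, assuming the score scales linearly with clock."""
    return (1.0 + iso_uplift) * (new_clock_ghz / old_clock_ghz) - 1.0


# Placeholder scores (NOT real results), both runs locked to 4 GHz.
old_score, new_score = 1000.0, 1100.0
uplift = iso_clock_uplift(old_score, new_score)
print(f"iso-clock uplift: {uplift:.1%}")
# If both parts really hold the same 5.6 GHz at stock, the stock uplift is the same.
print(f"stock uplift at equal clocks: {stock_uplift(uplift, 5.6, 5.6):.1%}")
```

If the stock uplift measured in practice comes out lower than the iso-clock one, the gap has to come from the actual sustained clocks differing, which is exactly what the proposed runs would reveal.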

Joe NYC · Thursday at 12:22 AM

TLDR: a lot of what we heard before about Zen 6 CCD:
- 12 cores
- 48 MB L3 on CCD
- V-Cache die will have 96 MB SRAM
- multiple layers of V-Cache possible, which would mean 240 MB L3 with 2 layers.
- mentions Zen 6 FP IPC uplift 7-9% (ok, but who cares about FP uplift?)
- 32-core Zen 6c CCD clocking very well: 4 GHz all-core on the 32-core CCD, which is higher than the boost clock on Turin (the biggest news in the video)

More than half of the video is Tom's tariff rant.
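For reference, the 240 MB figure in the list is what you get from two 96 MB V-Cache layers stacked on a 48 MB base L3. A quick sketch of that sum follows; the base value is inferred from the quoted total and should be treated as an assumption, and the names are made up for illustration.

```python
BASE_L3_MB = 48        # assumption: base CCD L3 implied by the 240 MB total
VCACHE_LAYER_MB = 96   # per-layer V-Cache SRAM, as listed above


def total_l3_mb(layers: int) -> int:
    """Total L3 per CCD with the given number of stacked V-Cache layers."""
    return BASE_L3_MB + layers * VCACHE_LAYER_MB


for layers in range(3):
    print(f"{layers} layer(s): {total_l3_mb(layers)} MB L3")
```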

adroc_thurston · Thursday at 12:45 AM

Joe NYC said:
(ok, but who cares about FP uplift?)

me.
It's funny they even got anything after Zen5 turbojuiced it.

Joe NYC · Thursday at 1:03 AM

adroc_thurston said:
me.
It's funny they even got anything after Zen5 turbojuiced it.

Yeah, I didn't really expect anything from FP IPC, just some clock speed increases under load.

But the Zen 6c clocks - HUGE.

Joe NYC · Thursday at 1:07 AM

adroc_thurston said:
me.
It's funny they even got anything after Zen5 turbojuiced it.

BTW, it is possible that very little FP IPC uplift comes from the core itself and most of it comes from better memory bandwidth and latency.

adroc_thurston · Thursday at 1:12 AM

Joe NYC said:
BTW, it is possible that very little FP IPC uplift comes from the core itself and most of it comes from better memory bandwidth and latency.

You can't separate these so idk.

Io Magnesso · Thursday at 2:07 AM

Joe NYC said:
BTW, it is possible that very little FP IPC uplift comes from the core itself and most of it comes from better memory bandwidth and latency.

Of course, for a single thread it doesn't really matter...
In multi-threaded workloads, though, there was a result comparing a Turin EPYC and a Ryzen 9 9950X at the same core count, and the EPYC scored much higher because of its bandwidth advantage.

branch_suggestion · Thursday at 3:14 AM

Joe NYC said:
TLDR: a lot of what we heard before about Zen 6 CCD:
- 12 cores
- 48 MB L3 on CCD
- V-Cache die will have 96 MB SRAM
- multiple layers of V-Cache possible, which would mean 240 MB L3 with 2 layers.
- mentions Zen 6 FP IPC uplift 7-9% (ok, but who cares about FP uplift?)
- 32-core Zen 6c CCD clocking very well: 4 GHz all-core on the 32-core CCD, which is higher than the boost clock on Turin (the biggest news in the video)

More than half of the video is Tom's tariff rant.

The relevant material:

6-8% FP IPC when Z5 already pushed it so far is a good sign for INT IPC, which has far more low-hanging fruit and a good transistor budget to improve it.
2-hi stacking has been possible for some time, but the benefit is likely too small to justify shipping it: it adds cost and time and lowers final yield, which is painful waste.
13-16% server workload PPC could mean all sorts of things. Is it per-thread, per-core or per-socket PPC?
That matters a lot, e.g. 96c Turin socket perf is 40% higher than 96c Genoa, but 1T perf is not much more than 20% higher, thanks to higher clocks and 16% IPC.
It could also be Z5c vs Z6c, which is a better case for Z6 due to 2x higher L3/core.
I lean toward INT IPC being at least 13% higher, though; all of this is argued from that figure and the 70% socket perf number AMD gave.
Finally, clocks are probably the highest-confidence bit of info here. ~4 GHz all-core boost with near 4.5 GHz maximum boost for Z6c is huge, and it is a leading indicator that clocks will likely make up the majority of the overall performance uplift, 1T and nT.
I think this is quite literally another Zen 4 situation. That got 29% perf/core at the end of the day; for this one I would lean toward 25%.
Combine that with a third more cores (256c vs 192c) and you get ~66% more socket perf for Venice-D vs Turin-D, the rest of the gap being interconnect, uncore and memory for a comfortable 70%.
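The ~66% estimate can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning in the post: the 25% per-core figure is the poster's guess, the 256 and 192 core counts are the Venice Dense and Turin Dense figures used later in the thread, and the variable names are made up for illustration.

```python
turin_d_cores = 192      # Turin Dense core count, as used later in the thread
venice_d_cores = 256     # Venice Dense core count, as used later in the thread
per_core_uplift = 0.25   # the poster's guess for perf per core
amd_socket_claim = 0.70  # socket perf number attributed to AMD in the post

core_ratio = venice_d_cores / turin_d_cores
socket_uplift = core_ratio * (1.0 + per_core_uplift) - 1.0
print(f"cores x per-core perf: {socket_uplift:.1%} more socket perf")

# Whatever is left must come from interconnect, uncore and memory.
residual = (1.0 + amd_socket_claim) / (1.0 + socket_uplift) - 1.0
print(f"left for interconnect/uncore/memory: {residual:.1%}")
```

The factors multiply rather than add, which is why a 33% core-count increase and a 25% per-core gain land close to 67% instead of 58%.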

yuri69 · Thursday at 4:22 AM

branch_suggestion said:
Finally, clocks are probably the highest-confidence bit of info here. ~4 GHz all-core boost with near 4.5 GHz maximum boost for Z6c is huge, and it is a leading indicator that clocks will likely make up the majority of the overall performance uplift, 1T and nT.

Keep in mind that the multi-thread clock boost is partly enabled by SP7 raising the TDP from 500W to 600W (or more with cTDP?). This doesn't apply to AM5, which has been at its design limit for quite a while.

branch_suggestion · Thursday at 4:53 AM

yuri69 said:
Keep in mind that the multi-thread clock boost is partly enabled by SP7 raising the TDP from 500W to 600W (or more with cTDP?). This doesn't apply to AM5, which has been at its design limit for quite a while.

It really depends on the comparison.
256c Venice Dense gets strictly less power per core than 192c Turin Dense, even with 20% more socket power.
You do save a bit of power from the packaging, but the bump to 16ch memory is a negative offset.
And Z5c has a pretty strict upper clock limit no matter how much power you throw at it. The Turin Dense version is the best case for it, being on N3E and all, but sadly nobody has really benched it properly AFAIK.
How this translates to the vanilla cores is really hard to say. It is a bigger node bump in that case, but once you reach 6 GHz it gets really hard to scale further.
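The power-per-core point is easy to check with the TDP figures quoted earlier in this exchange (500W to 600W). A rough sketch, assuming the whole socket TDP is divided evenly across cores, which ignores the uncore, IO and memory share and is therefore only an upper bound per core:

```python
def watts_per_core(socket_tdp_w: float, cores: int) -> float:
    """Naive upper bound: whole socket TDP divided evenly across cores."""
    return socket_tdp_w / cores


turin_dense = watts_per_core(500, 192)   # 500W TDP, 192 cores
venice_dense = watts_per_core(600, 256)  # 600W TDP, 256 cores

print(f"Turin Dense:  {turin_dense:.2f} W/core")
print(f"Venice Dense: {venice_dense:.2f} W/core")
print(f"change: {venice_dense / turin_dense - 1.0:.0%}")
```

So a 20% larger power budget spread over 33% more cores still leaves each core with about 10% less power under this naive split.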

Josh128 · Thursday at 7:54 AM

Joe NYC said:
Yeah, I didn't really expect anything from FP IPC, just some clock speed increases under load.

But the Zen 6c clocks - HUGE.

Well, we KNOW that Zen 6C is 2nm silicon. We don't know what everything else is using (we just don't). So if non-C comes out using some form of N3, everything fueling the 3x-microwave-oven clock frequency hype train could be blown to bits. Best to stay grounded at this time. This did appear to have the trappings of a quality leak, though.

About floating point uplift, 6-8% is pretty damned good if true. A hypothetical 11900X (I swear I'll puke if they really call it that) boosting to 6.2 GHz would score approximately 2640 ST in R23. That's heroic ST levels, folks. If an SKU boosts to 6.5 GHz, it's about 2750. Face-melting perf. All aboard!!
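As a sanity check on those two estimates, the sketch below works out the implied points per GHz and what pure linear clock scaling from the 6.2 GHz figure would give at 6.5 GHz. It uses only the numbers quoted in the post; linear scaling with clock is an assumption, and real scores usually scale a little worse than that.

```python
# Estimates quoted in the post: boost clock in GHz -> estimated R23 ST score.
estimates = {6.2: 2640, 6.5: 2750}


def points_per_ghz(score: float, clock_ghz: float) -> float:
    """Implied benchmark points per GHz of boost clock."""
    return score / clock_ghz


for clock_ghz, score in estimates.items():
    print(f"{clock_ghz} GHz -> {points_per_ghz(score, clock_ghz):.1f} points per GHz")

# Pure linear clock scaling from the 6.2 GHz estimate up to 6.5 GHz.
scaled = estimates[6.2] * 6.5 / 6.2
print(f"6.2 GHz estimate scaled linearly to 6.5 GHz: {scaled:.0f}")
```

The two estimates imply nearly the same points per GHz, and the linearly scaled value lands slightly above the quoted 2750, so the post's numbers are consistent with near-linear clock scaling.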

igor_kavinski · Thursday at 8:01 AM

branch_suggestion said:
How this translates to the vanilla cores is really hard to say. It is a bigger node bump in that case, but once you reach 6 GHz it gets really hard to scale further.

Yes. This is what has me confused about some posters being so optimistic about scaling beyond 6 GHz. I can accept 6.0 or 6.2 GHz max (because Intel's gotten there, so we have real-world proof that it can be done), but at the same time, Intel doing it was kind of a do-or-die thing because they desperately wanted an edge over the competition. It wasn't something they got working easily, and running a 14900KS at that clock speed consistently isn't guaranteed due to thermal issues (you really need the CPU to be delidded and custom cooled).

And while we do see some Zen 5 samples hitting 6 GHz, it is very telling that AMD didn't bin any CCDs for that speed. It means that it's a very hard thing to achieve. Why wouldn't they want, for example, a 9995X3D SKU that advertises 6 GHz turbo on the box? TSMC N2 may allow up to 6.2 GHz with lower Vmax, but trying to get more speed out of it will likely involve overvolting for stability, which will lead to hotspots and insane cooling requirements. With no evidence so far, I think only roughly 6 GHz will be possible on air cooling, and 6.2 GHz may be possible in short bursts with good AIOs, while custom-cooled rigs will be the only ones to reap the maximum benefits from the new process. Even these may struggle to get to 6.5 GHz. Again, this is all based on what's out there right now. If N2 is a magical process that uses innovative materials to bypass physical limits, then obviously we are all in for a treat.

inquiss · Thursday at 8:07 AM

igor_kavinski said:
Yes. This is what has me confused about some posters being so optimistic about scaling beyond 6 GHz. I can accept 6.0 or 6.2 GHz max (because Intel's gotten there, so we have real-world proof that it can be done), but at the same time, Intel doing it was kind of a do-or-die thing because they desperately wanted an edge over the competition. It wasn't something they got working easily, and running a 14900KS at that clock speed consistently isn't guaranteed due to thermal issues (you really need the CPU to be delidded and custom cooled).

And while we do see some Zen 5 samples hitting 6 GHz, it is very telling that AMD didn't bin any CCDs for that speed. It means that it's a very hard thing to achieve. Why wouldn't they want, for example, a 9995X3D SKU that advertises 6 GHz turbo on the box? TSMC N2 may allow up to 6.2 GHz with lower Vmax, but trying to get more speed out of it will likely involve overvolting for stability, which will lead to hotspots and insane cooling requirements. With no evidence so far, I think only roughly 6 GHz will be possible on air cooling, and 6.2 GHz may be possible in short bursts with good AIOs, while custom-cooled rigs will be the only ones to reap the maximum benefits from the new process. Even these may struggle to get to 6.5 GHz. Again, this is all based on what's out there right now. If N2 is a magical process that uses innovative materials to bypass physical limits, then obviously we are all in for a treat.

It's just a new node with new knobs to turn

Josh128 · Thursday at 8:11 AM

igor_kavinski said:
think roughly 6 GHz only will be possible on air cooling and 6.2 GHz may be possible in short bursts with good AIOs while custom cooled rigs will be the only ones to reap the maximum benefits from the new process. Even these may struggle to get to 6.5 GHz. Again, this is all based on what's out there right now. If N2 is a magical process that uses innovative materials to bypass physical limits, then obviously we are all in for a treat.

This is the way. Expecting more than 6.2 is just a recipe for disappointment. If it happens, it's just a welcome bonus. I'd like to reiterate that we still don't have any concrete evidence that anything but Zen 6C will be using 2nm.

LightningZ71 · Thursday at 8:39 AM

Two things:
1) It's possible that at least a portion of the improvement to FP IPC MAY be related to AMD addressing a latency regression for one or more FP instructions in Zen 5. If they do that much, then the remaining percentage of improvement could just be further tweaks to existing functional units.

2) This will be AMD's first consumer CCD to use NanoFlex/FinFlex. I don't think people are grasping the importance of that sort of thing when it comes to core clocks. If they can get fine-grained enough with it, they can apply different transistor fin arrangements to speed-critical sections while keeping the rest in more efficient and compact arrangements. Doing that alone can be worth several hundred MHz of peak boost clock frequency. I don't think that 6.5 GHz is completely out of the question here. I CERTAINLY don't think that it'll be a sustained clock, even in strictly 1T scenarios, unless they are VERY lightweight.

MS_AT · Thursday at 9:25 AM

LightningZ71 said:
It's possible that at least a portion of the improvement to FP IPC MAY be related to AMD addressing a latency regression for one or more instructions in FP for Zen5. If they do that much, then the remaining percentage of improvement could just be further tweaks to existing functional units.

After Mystical had a chat with AMD engineers, he had low hopes that it could be easily fixed, but well, maybe they have found a way.

As a reminder, the problem is related to the FP schedulers. In AMD's own words:

The floating point schedulers have a slow region, in the oldest entries of a scheduler and only when the scheduler is full. If an operation is in the slow region and it is dependent on a 1-cycle latency operation, it will see a 1 cycle latency penalty.
There is no penalty for operations in the slow region that depend on longer latency operations or loads.
There is no penalty for any operations in the fast region.
To write a latency test that does not see this penalty, the test needs to keep the FP schedulers from filling up.
The latency test could interleave NOPs to prevent the scheduler from filling up.

The quote is from the publicly available Zen5_Instruction_Latencies Excel spreadsheet, which is packaged together with the Software Optimization Guide for Zen 5 available from AMD.

So if you know how to measure, you can show that these instructions did not regress in latency, at least in theory.
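To make the quoted rule concrete, here is a toy encoding of it as a decision function. This is only a reading of AMD's wording, not a model of the actual hardware: the function name and parameters are made up, and a real latency test would be written in assembly with NOPs interleaved as the quote suggests.

```python
def observed_dependency_latency(
    producer_latency: int,
    producer_is_load: bool,
    scheduler_full: bool,
    in_oldest_entries: bool,
) -> int:
    """Cycles a dependent FP op waits on its producer, per the quoted rule."""
    # The slow region exists only in the oldest entries of a full scheduler.
    in_slow_region = scheduler_full and in_oldest_entries
    # Only dependencies on 1-cycle, non-load producers pay the extra cycle.
    penalized = in_slow_region and producer_latency == 1 and not producer_is_load
    return producer_latency + (1 if penalized else 0)


# A naive back-to-back dependency chain that fills the scheduler:
print(observed_dependency_latency(1, False, scheduler_full=True, in_oldest_entries=True))
# The same chain with NOPs interleaved so the scheduler never fills up:
print(observed_dependency_latency(1, False, scheduler_full=False, in_oldest_entries=True))
```

Under this reading, a test that keeps the scheduler full would report 2 cycles for a 1-cycle instruction, while the same test with the scheduler kept from filling would report the true 1 cycle.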

igor_kavinski · Thursday at 9:46 AM

Josh128 said:
I'd like to reiterate that we still don't have any concrete evidence that anything but Zen 6C will be using 2nm.

Yeah. Don't see any reason why AMD would skip N3P or N3X.

Joe NYC · Thursday at 10:01 AM

Josh128 said:
So if non-C comes out using some form of N3

Why would you say that if all leaks say it will be some version of N2?

Didn't we have like 10 pages of the same argument a couple of weeks ago?

Joe NYC · Thursday at 10:03 AM

igor_kavinski said:
Yes. This is what has me confused about some posters being so optimistic about scaling beyond 6 GHz. I can accept 6.0 or 6.2 GHz max

Intel got there with their 10nm++++ node. AMD will be using N2.

igor_kavinski · Thursday at 10:13 AM

Joe NYC said:
AMD will be using N2.

Wouldn't N2 be severely capacity constrained?

inquiss · Thursday at 10:16 AM

igor_kavinski said:
Wouldn't N2 be severely capacity constrained?

Only the top-tier parts have it. The bulk of mass production is on N3-class nodes. All they need to produce is DIY desktop and server. They're also the first and only customer initially.

Joe NYC · Thursday at 10:29 AM

igor_kavinski said:
Wouldn't N2 be severely capacity constrained?

Since AMD was the first to place its orders for N2 - no.
