Question Zen 6 Speculation Thread

Page 148

OneEng2

Senior member
Sep 19, 2022
603
849
106

It would appear that real world HPC loads thrive on Zen 5 and AVX512 on a 16 core 32 thread 9950X. It doesn't appear to be bandwidth limited.
did you mean Venice
Yes.
... ooor you can use the memory it's actually designed for, and use DDR12800 MCRDIMM.
MRDIMM 12800 with 16 channels doesn't get you 1.7TB/sec (my math could be wrong here). You are right though. DDR8000 doesn't make that number either.

Even MRDIMM 8800 gets you 2.25TB/sec of bandwidth so that doesn't seem right either.

Does anyone have some math that makes sense for 1.7TB/sec?
 

LightningZ71

Platinum Member
Mar 10, 2017
2,236
2,739
136
Is it possible that we see 3D cache offered on Zen6c Epyc? I can't imagine that anything is going to find 16MB L3 roomy on a 16 core CCX. I also can't imagine it's any more per CCX as that's already 32MB per CCD (2 x CCX) and SRAM isn't exactly scaling amazingly...
 
Reactions: Tlh97 and Joe NYC

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
Woot! Still, this doesn't mean that all flavors of Venice are N2 (although it could mean that).

4 CCDs per IOD and max 4 IOD -> each CCD is 16 cores.

I heard somewhere that SP7 could support up to 1000W. I fully expect that Venice will draw more power than Zen 5 due to more cores and much more powerful IOD.

See my math above. I am pretty sure we are looking at a 16c CCD.

Oh, dear, looks like a new round of trolling...
 

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
The linked baidu post also shows CCD--IOD arrangements on SP7 (pictured: 2 IODs, 8 32c CCDs) and SP8 (pictured: 2 IODs, 8 12c CCDs).

The part of the baidu post that does not make sense to me is that SP8 is shown with classic Zen 6, 12 core CCD (high performance cores) but memory bandwidth is only 1/2.

I guess, with much faster memory, and less than 1/2 the cores (96c vs 256c), is it possible that AMD decided that for classic Zen 6, 8 memory channels are sufficient?
 

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
Is it possible that we see 3D cache offered on Zen6c Epyc? I can't imagine that anything is going to find 16MB L3 roomy on a 16 core CCX. I also can't imagine it's any more per CCX as that's already 32MB per CCD (2 x CCX) and SRAM isn't exactly scaling amazingly...

32 core Zen 6c (unlike Zen 5c and Zen 4c) has full 4MB L3 per core, or 128 MB L3 per CCD. That's a big improvement. Given that AMD doubled L3 on main CCD, I doubt they will offer V-Cache version of Zen 6c.

But the classic Zen 6 will have V-Cache capability in every implementation (according to MLID), which is not a stretch, since it is likely the same die. So from desktop to premium notebook to server.

It is Zen 7 that is supposed to go full 3D.
 

adroc_thurston

Diamond Member
Jul 2, 2023
5,785
8,108
96

StefanR5R

Elite Member
Dec 10, 2016
6,506
10,122
136
@OneEng2, whether or not a computer is main memory bandwidth constrained very much depends on the dataset size (the hot part of it, that is), and on the access patterns of the algorithm. (And more; e.g. in case of a multi-CCX CPU on whether program threads are scheduled on same or different CCXs; one or the other way makes a difference under some circumstances.) Hence, the tersest correct answer to "Is memory channel count N × memory bus speed M sufficient?" is neither "Yes" nor "No" but always "It depends".

@Joe NYC, as for SP8's channel count: I consider it not a regression from SP5, but a progression from SP6 and continuity from SP3; also continuity from the many SP5 based servers which implement only 8 channels per socket too. (Apropos, will any new CPUs be offered for the SP6 platform, or will it remain on Zen 4 until eol?)
 
Reactions: Tlh97 and Joe NYC

OneEng2

Senior member
Sep 19, 2022
603
849
106
You sure you have the same calculator as the rest of the world?

8GB/s per 1GT/s x 12.8GT/s x 16 channels = 1638.4GB/s
Possibly not!

If it's 8 bytes per transfer I see your math. Thanks.

So Venice D gets 6.4 GB/sec/core.
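
The arithmetic in this exchange can be checked with a few lines of Python. This is only a sketch: it assumes 8 bytes per transfer per channel (a 64-bit data path), 16 channels per socket, and a 256-core Venice D, all taken from the posts above. The function and variable names are made up for illustration, and the result is a theoretical peak, not sustained bandwidth.

```python
# Theoretical peak DRAM bandwidth per socket.
# Assumptions (from the thread, not from any datasheet):
#   - 8 bytes per transfer per channel
#   - 16 memory channels
#   - 256 cores on Venice D

def peak_bandwidth_gbs(transfer_rate_gts, channels, bytes_per_transfer=8):
    """Return theoretical peak bandwidth in GB/s."""
    return bytes_per_transfer * transfer_rate_gts * channels

CHANNELS = 16
CORES = 256

# Transfer rates in GT/s for the memory types mentioned in the thread.
rates_gts = {"DDR 8000": 8.0, "MRDIMM 8800": 8.8, "MRDIMM 12800": 12.8}

for name, rate in rates_gts.items():
    total = peak_bandwidth_gbs(rate, CHANNELS)
    print(f"{name}: {total:.1f} GB/s total, {total / CORES:.2f} GB/s per core")

venice_d_total = peak_bandwidth_gbs(12.8, CHANNELS)
venice_d_per_core = venice_d_total / CORES
```

With MRDIMM 12800 this reproduces the 1638.4GB/s and 6.4 GB/sec/core figures quoted above.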

Does this come at a latency penalty over DDR8000?
 

StefanR5R

Elite Member
Dec 10, 2016
6,506
10,122
136
Does [MR-DIMM] come at a latency penalty over DDR8000?
MR-DIMM/MCR-DIMM promoters are quick to point out that there is lower latency in load scenarios near the bandwidth limit. I haven't seen discussions of latency during low-to-medium bandwidth utilization.

In theory, the fact that you need additional buffers and multiplexers means that there must be additional latency. In practice, I don't know if this matters given DRAM's inherent latency and the sort of latencies which we have in R-DIMMs and, notably, in LR-DIMMs today. — Edit: Are there any DDR5 LR-DIMMs out there? Available DDR4 LR-DIMMs have the same latencies (on paper) as DDR4 R-DIMMs with same transfer rate.

(BTW, MR-DIMM apparently copies LR-DIMM's advantage of larger memory size per DIMM, by way of more possible ranks per module, which is good for those who need RAM capacity.)

Another property of MR-DIMM to keep in mind is that you only enjoy the increased bandwidth in memory accesses which are a multiple of 128 Bytes (compared to 64 Bytes with [R-]DIMM = one traditional cache line).

So to be clear, are we saying each channel of memory has 2 banks of 6400MT/s memory giving each channel 12800MT/s?
Yep.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
It switched like last year when Turin-D was N3e and $$$.

I wonder how well it does vs. Arm hordes such as Graviton, in TCO.

Disregard everything not actual product roadmaps.

That particular tidbit makes sense though, especially in the light of what you just posted, that the Dense is now the premium product - that it would be the first to feature full 3D design.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
@Joe NYC, as for SP8's channel count: I consider it not a regression from SP5, but a progression from SP6 and continuity from SP3; also continuity from the many SP5 based servers which implement only 8 channels per socket too. (Apropos, will any new CPUs be offered for the SP6 platform, or will it remain on Zen 4 until eol?)

I have not paid enough attention. Has there been no Zen 5 announced for the socket? Given that Turin Dense is a more expensive chip, and the platform is supposed to be low cost, it becomes a little contradictory...
 

DrMrLordX

Lifer
Apr 27, 2000
22,596
12,484
136
bad example since Neoverse is communist thirdworldism for hyperscalers. It's not gonna last much longer.

Considering how much $$$ Amazon makes off Graviton, I can't see that being true. Is ARM going to cut them off or make them pay an unacceptably large sum to keep it going?
 

Joe NYC

Diamond Member
Jun 26, 2021
3,094
4,506
106
Yeah they make money at ARM's expense.

Yeah, it's gonna be "pay real licensing costs or shop our merchant Si" kinda deal.

That's what the ARM intentions / agenda seems to be. I wonder if ARM can pull it off.

ARM had a setback suing QCOM. Different basis of lawsuit vs. ARM's issues with Amazon. Only the same in that ARM wants a lot more money from licensing than before.
 

OneEng2

Senior member
Sep 19, 2022
603
849
106
EPYC D and normal EPYC do not target the same workloads IMO.


There are certainly compute intensive loads that would favor Zen with lots of cache vs. Zen with hardly any cache (the difference between Zen c and Zen cores). Of course, if the normal Venice (not D) does have only 92 cores, it won't be competitive in many DC applications compared to Diamond Rapids. Also it is completely nonsensical to imagine that standard Venice would be a huge step back from standard Turin.

Additionally, considering the massive uplift in bandwidth from 576GB/sec (DDR6000) to 1.6TB/sec (~300%), I don't see the logic in Venice not supporting even higher core counts. In fact, considering the die size that the 32c variant is at (I am guessing ~170mm2), it would not be impossible for AMD to create a 48c CCD which would be more like 255mm2.
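
For anyone who wants to check these two estimates, here is a rough sketch. It assumes the 576 figure is GB/s for 12 channels of DDR 6000 at 8 bytes per transfer (which is what reproduces that number), that the new platform is 16 channels at 12.8GT/s, and that CCD area scales linearly with core count from the guessed ~170mm2. Linear scaling is a naive assumption, and all names here are hypothetical.

```python
# Bandwidth uplift and a naive die-area scaling estimate.
# All inputs are the thread's own guesses, not confirmed specifications.

BYTES_PER_TRANSFER = 8

def peak_bandwidth_gbs(rate_gts, channels):
    """Theoretical peak bandwidth in GB/s."""
    return BYTES_PER_TRANSFER * rate_gts * channels

old_bw = peak_bandwidth_gbs(6.0, 12)    # 12 channels of DDR 6000 (assumed)
new_bw = peak_bandwidth_gbs(12.8, 16)   # 16 channels of MRDIMM 12800
uplift_ratio = new_bw / old_bw

def scaled_area_mm2(base_area_mm2, base_cores, target_cores):
    """Scale die area linearly with core count (ignores fixed overheads)."""
    return base_area_mm2 * target_cores / base_cores

area_48c = scaled_area_mm2(170, 32, 48)

print(f"{old_bw:.0f} GB/s -> {new_bw:.1f} GB/s, {uplift_ratio:.2f}x")
print(f"48c CCD estimate: {area_48c:.0f} mm2")
```

Under those assumptions the new figure is a bit under 3x the old one, and the 48c estimate lands on the 255mm2 mentioned above.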

As with all other discussions we have had on this subject, the question really boils down to .... do they need it?

Clearwater Forest is rumored to be 288 cores. Unless those Darkmont cores have some pretty spiffy upgrades to have SMT (which I personally believe would require a ground-up redesign), then it is likely that, like today, a single Zen 6c will = 1.5 Darkmont cores, placing the performance of a 256 core Venice D well out of performance range of Clearwater Forest.

Now, Diamond Rapids from what I can gather isn't looking as impressive with rumors still floating near 128c and 12 channels of memory.

Still, I would think that a 128c Panther Cove X would best a 92 core Zen 6 all day long (assuming it did have SMT).
 

adroc_thurston

Diamond Member
Jul 2, 2023
5,785
8,108
96
EPYC D and normal EPYC do not target the same workloads IMO
They do now.
Additionally, considering the massive uplift in bandwidth from 576GB/sec (DDR6000) to 1.6TB/sec (~300%), I don't see the logic in Venice not supporting even higher core counts. In fact, considering the die size that the 32c variant is at (I am guessing ~170mm2), it would not be impossible for AMD to create a 48c CCD which would be more like 255mm2.
The core count is determined by what CSPs want in terms of socket core density and core@DRAM ratios.
Clearwater Forest is rumored to be 288 cores.
CWF is DOA with no customer wins.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,045
15,992
136
EPYC D and normal EPYC do not target the same workloads IMO.


There are certainly compute intensive loads that would favor Zen with lots of cache vs. Zen with hardly any cache (the difference between Zen c and Zen cores). Of course, if the normal Venice (not D) does have only 92 cores, it won't be competitive in many DC applications compared to Diamond Rapids. Also it is completely nonsensical to imagine that standard Venice would be a huge step back from standard Turin.

Additionally, considering the massive uplift in bandwidth from 576GB/sec (DDR6000) to 1.6TB/sec (~300%), I don't see the logic in Venice not supporting even higher core counts. In fact, considering the die size that the 32c variant is at (I am guessing ~170mm2), it would not be impossible for AMD to create a 48c CCD which would be more like 255mm2.

As with all other discussions we have had on this subject, the question really boils down to .... do they need it?

Clearwater Forest is rumored to be 288 cores. Unless those Darkmont cores have some pretty spiffy upgrades to have SMT (which I personally believe would require a ground-up redesign), then it is likely that, like today, a single Zen 6c will = 1.5 Darkmont cores, placing the performance of a 256 core Venice D well out of performance range of Clearwater Forest.

Now, Diamond Rapids from what I can gather isn't looking as impressive with rumors still floating near 128c and 12 channels of memory.

Still, I would think that a 128c Panther Cove X would best a 92 core Zen 6 all day long (assuming it did have SMT).
I can't find anyplace that says Venice MIGHT have 92 cores. Only 96 and 128.
I also don't think the 96 core would be in competition with Panther Lake, only the 128 core varieties, and I think Zen 6 would beat them. We will see.
 
Reactions: Tlh97 and OneEng2

MS_AT

Senior member
Jul 15, 2024
677
1,368
96
It would appear that real world HPC loads thrive on Zen 5 and AVX512 on a 16 core 32 thread 9950X. It doesn't appear to be bandwidth limited.
You either do not understand what we are talking about or are trolling. To validate your claim you would need to compare the 9950X against a 16C Epyc F chip (F to minimize clock difference) and then compare their performance in those benchmarks. Places where Epyc wins are memory bound, assuming the software is CCD aware and things like that.

The fact that the 9950X won in the comparisons you posted does not mean it is not memory bandwidth limited. It means that either these workloads are not bound by memory bandwidth or that, of the chips that participated in the test, it had the best bandwidth-to-compute ratio.
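
One way to picture the bandwidth-to-compute argument is a simple roofline-style estimate, where attainable throughput is the smaller of peak compute and bandwidth times arithmetic intensity. The sketch below uses invented round numbers for two hypothetical chips with equal peak compute and different memory bandwidth. They are not measurements or specs of the 9950X or of any Epyc part.

```python
# Roofline-style sketch: when does extra memory bandwidth help?
# All machine numbers below are hypothetical and chosen only for illustration.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Throughput is capped by compute or by bandwidth * arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

def is_memory_bound(peak_gflops, bandwidth_gbs, flops_per_byte):
    """True when the bandwidth ceiling sits below the compute ceiling."""
    return bandwidth_gbs * flops_per_byte < peak_gflops

# Same peak compute, different memory bandwidth (made-up values).
narrow = {"peak_gflops": 1000.0, "bandwidth_gbs": 100.0}
wide = {"peak_gflops": 1000.0, "bandwidth_gbs": 400.0}

for intensity in (1.0, 5.0, 50.0):  # FLOPs performed per byte fetched from DRAM
    n = attainable_gflops(flops_per_byte=intensity, **narrow)
    w = attainable_gflops(flops_per_byte=intensity, **wide)
    bound = is_memory_bound(flops_per_byte=intensity, **narrow)
    print(f"intensity {intensity}: narrow={n:.0f}, wide={w:.0f}, "
          f"narrow memory bound: {bound}")
```

In this toy model the wider chip only pulls ahead at low arithmetic intensity, and the two tie once the workload is compute bound on both. So a benchmark win for one chip says as much about the workload's intensity as about the chip.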
 