Discussion RDNA4 + CDNA3 Architectures Thread

Page 40

DisEnchantment

Golden Member
Mar 3, 2017
1,623
5,894
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of there being no host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 
Last edited:

jpiniero

Lifer
Oct 1, 2010
14,688
5,318
136
BTW, 40CU is a low amount for a 250mm2 N4P GPU.

If you took the 7700XT's configuration as is, it would be 311 mm2 (albeit with 60 CUs physically). You're not really getting much of a shrinkage with N4P compared to N5 in logic... and not much of a cache shrinkage from N6 either. And you figure AMD is going to spend a bit of transistors on RT at the very least.

More than 40 is possible I think but 60's too much unless it's closer to 300 mm2.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
If you took the 7700XT's configuration as is, it would be 311 mm2 (albeit with 60 CUs physically). You're not really getting much of a shrinkage with N4P compared to N5 in logic... and not much of a cache shrinkage from N6 either. And you figure AMD is going to spend a bit of transistors on RT at the very least.

More than 40 is possible I think but 60's too much unless it's closer to 300 mm2.
I just said 40CU is too little.
250mm2 vs 311mm2 would be a 20% die shrink.
Of course N4P doesn't provide the necessary scaling, but let's not forget that AMD's chiplet design also spends extra die space on the interconnects in both the MCDs and the GCD. A monolithic die should save some of that space.
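For what it's worth, the ~20% figure checks out; here is a quick sanity check (both die sizes are the thread's estimates, not official numbers):

```python
# Die sizes in mm^2, taken from the discussion above (both are estimates).
speculated_n4p_die = 250   # rumored ~250mm2 RDNA4 die
monolithic_estimate = 311  # 7700XT-like config ported as-is

shrink = (monolithic_estimate - speculated_n4p_die) / monolithic_estimate
print(f"area reduction: {shrink:.1%}")  # -> area reduction: 19.6%
```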
 

Ajay

Lifer
Jan 8, 2001
15,628
7,956
136
I just said 40CU is too little.
250mm2 vs 311mm2 would be a 20% die shrink.
Of course N4P doesn't provide the necessary scaling, but let's not forget that AMD's chiplet design also spends extra die space on the interconnects in both the MCDs and the GCD. A monolithic die should save some of that space.
The new chips will now need a direct connection to memory - so I don't see any space saving. Sadly.
 

moinmoin

Diamond Member
Jun 1, 2017
4,975
7,736
136
So it appears most government supercomputers use NVidia GPUs. Where is this supposed lockout?
Exascale supercomputers curiously seem to lack Nvidia GPUs so far.

Also, this is about avoiding lock-in to Nvidia's ecosystem, something AMD, along with Intel and many other industry players, is pushing for. What "lockout" are you talking about?

I don't get this. Nvidia isn't going anywhere. So it is sustainable, because Nvidia has established itself as a reliable partner. The risk to Nvidia is that, as more competitors arrive on the scene, their 'walled garden' would be threatened if those competitors opt for open source alternatives.
Can't help you if you really want to make yourself completely dependent on another company's proprietary ecosystem. While Nvidia is no Microsoft or Google (both prone to hyping and then cancelling tech all the time), it's still a very costly and risky area to go all in on without plan B alternatives.

Recent AMD supercomputer wins are no indication of some kind of government policy blocking NVidia.
It's not about blocking Nvidia, it's about building and preserving choices, something Nvidia obviously wants to avoid happening with its proprietary ecosystem.

And AMD has fitting solutions for alternative plans without lock-in, that's all.
 
Reactions: Joe NYC and Tlh97

Heartbreaker

Diamond Member
Apr 3, 2006
4,238
5,244
136
Exascale supercomputers curiously seem to lack Nvidia GPUs so far.

Also, this is about avoiding lock-in to Nvidia's ecosystem, something AMD, along with Intel and many other industry players, is pushing for. What "lockout" are you talking about?

You said "Nvidia is still a complete non-starter in areas where this matters", and when I asked where, you said: "All the government supercomputer contracts"

That implies that NVidia was somehow blocked from these contracts, but NVidia was used for many of them.
 
Last edited:

krawcmac

Junior Member
Nov 21, 2014
7
13
81
You said "Nvidia is still a complete non-starter in areas where this matters", and when I asked where, you said: "All the government supercomputer contracts"

That implies that NVidia was somehow blocked from these contracts, but NVidia was used for many of them.
Why do you discuss Nvidia's ability/inability to compete for gov. contracts in an RDNA4 thread? Maybe you should start a new thread.
 

moinmoin

Diamond Member
Jun 1, 2017
4,975
7,736
136
You said "Nvidia is still a complete non-starter in areas where this matters", and when I asked where, you said: "All the government supercomputer contracts"

That implies that NVidia was somehow blocked from these contracts, but NVidia was used for many of them.
This being an AMD thread, I'm referring exclusively to AMD, so "all the government supercomputer contracts AMD is currently getting". And that's the area where Nvidia does appear to be a complete non-starter; at least I haven't yet seen any government exascale supercomputers that include Nvidia (not that Nvidia should care with the current AI HPC hype going on, but this is an AMD thread).

Whether Nvidia was outright blocked from those, or one of the prerequisites was to have an open source driver or ecosystem, I don't know (though the latter would perfectly explain why Nvidia suddenly started opening up small parts of its drivers last year). Maybe you can find out for us in an Nvidia thread.
 
Reactions: Joe NYC
Mar 11, 2004
23,099
5,578
146
It is a business decision, my friend. AMD is expecting N32 & N33 to have a shelf life of two years, especially since N32 is only available as a desktop variant....

Anyhow, below is my speculation and price estimates of RDNA4 & RDNA5 lineups for reference:

SRP   | 2023 (N5/N6)                       | 2024 (N4P) | 2025/2026 (N3E)
$999  | 7900XTX 24GB 384-bit GDDR6, 96 CU  |            | N51 24GB 384-bit GDDR7, 210 CU
$899  | 7900XT 20GB 320-bit GDDR6, 84 CU   |            | N51L 20GB 320-bit GDDR7, ? CU
$699  |                                    |            | N52 16GB 256-bit GDDR7, 140 CU
$599  |                                    |            | N52L ??
$499  | 7800XT 16GB 256-bit GDDR6, 60 CU   |            | N53 12GB 128-bit GDDR7, 70 CU
$449  | 7700XT 12GB 192-bit GDDR6, 54 CU   |            |
$399  |                                    | ?          | ?
$299  |                                    | ?          | ?
$269  | 7600 8GB 128-bit GDDR6, 32 CU      |            |

  • I know it is premature to list prices, let alone WGP numbers, but we have to start somewhere. When AMD starts a new design, they think about price, positioning, process node and memory bandwidth requirements. Hence the table; I could be wrong, but I can also modify it later; that's how we learn.
  • We don't know much about RDNA5, but at least we know they are going to support GDDR7. And since AMD is going to employ a chiplet design, the CU count should be a multiple of a base number. That number is hard to know, but because the 270/3 number appeared before, I can make a wild guess: if that number is divided by 4, we get the maximum "theoretical" CU numbers above. The CU number also aligns with my calculation of bandwidth improvement that I listed on my front page for the Blackwell series. Feel free to disagree with me if you have different numbers in mind.
  • There is a wide gap between $500 and $900 without any GPU from AMD. I think I just figured out how AMD is going to fill it with N52; yes, that means AMD is going to price the upcoming 8800XT with 16GB at $699, if my calculation is correct.
  • If NV drops the price of the RTX4080 to $999 next year, and the latest rumor is NV will release the RTX5000 series at the end of next year, we could see a sharp price drop for the N31 series; remember the price cut of the RX6950XT from $1,099 to $599??? Let's hope AMD won't drop like that, but if AMD is not releasing RDNA5 until the end of 2025... screwups and delays do have consequences. Hmm, maybe that is the reason the debaited guy is leaving????
PS: I am confused about AMD's RDNA4 lineup now, so let's focus on RDNA5

Based on that, and the assumption RDNA3 will see price drops (guessing the 7700XT goes to $399, and I expect the 7900XT goes to $799 or maybe $749, and the 7600 probably goes to $249), it seems to me the smart move for RDNA4 would be a 72CU at $649 or $599 and a 46 or 48 CU at $375 or $349. The lower one could be the 8600 and the upper one the 8800, clearly not topping the 7700XT or 7900XT, but beating a 7600 and 7800.

This is wishful thinking, but maybe there's a dark horse third option for the memory config: HBM. It'd be a way for AMD to get some return out of trialing an alternative while getting GPU chiplets working properly. It would let them skip the large Infinity Cache, and would also net benefits over GDDR, so they can skip the apparently expensive GDDR7 (expensive enough that even Nvidia supposedly is gonna widen their memory bus and stick with GDDR6X).

It would let them hit efficiency improvements, which could be important for mobile (where they basically have the two tiers, and the theoretical 8800 and 8600 would work well). It would also be suitable for certain pro workloads where they can offer higher memory configs in a still compact package. They could offer Nano style cards on desktop, which I think could be popular if the performance is good. They can do a 4-high stack of HBM3 for 12GB on the lower one, and then a 6-high stack for 18GB. Then Pro models aimed at large modeling could do 24GB or 36GB from single stacks.

I think it could also open up a potential third use as well: just a 1-2 high stack in an APU config, where they do a 20-24 CU GPU chiplet and a CPU chiplet (12-16 Zen 5c cores, with HBM again letting them get by with less cache). Or if they go for a higher stack or 2 stacks, 16GB+ would enable them to skip the normal DDR memory as well. It would easily beat the current stuff and would be perfect for handheld gaming devices.
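A rough sketch of the HBM capacity math behind those configs, assuming 24 Gbit (3 GB) HBM3 dies per layer; the per-die density is an assumption, not a confirmed spec:

```python
GB_PER_LAYER = 3  # assumes 24 Gbit HBM3 dies; 16 Gbit dies would give 2 GB

def stack_capacity_gb(layers: int) -> int:
    """Capacity of a single HBM stack with the given layer count."""
    return layers * GB_PER_LAYER

for layers in (4, 6, 8, 12):
    print(f"{layers}-high stack: {stack_capacity_gb(layers)} GB")
# 4-high: 12 GB, 6-high: 18 GB, 8-high: 24 GB, 12-high: 36 GB
```

Under that assumption the 4-high and 6-high consumer stacks land on the 12GB/18GB figures above, and 8-high/12-high single stacks give the 24GB/36GB Pro configs.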
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,238
5,244
136
Why do you discuss Nvidia's ability/inability to compete for gov. contracts in RDNA4 thread? Maybe you should start a new thread.

I'm not the one that brought it up. If someone is going to make unsupported claims, they should be prepared to back them up.

Whether Nvidia was outright blocked from those, or one of the prerequisites was to have an open source driver or ecosystem, I don't know (though the latter would perfectly explain why Nvidia suddenly started opening up small parts of its drivers last year). Maybe you can find out for us in an Nvidia thread.

So no evidence at all that there is an open source requirement, or anything else blocking NVidia. You just made that up.
 
Last edited:
Reactions: DeathReborn

moinmoin

Diamond Member
Jun 1, 2017
4,975
7,736
136
So no evidence at all that there is an open source requirement, or anything else blocking NVidia. You just made that up.
No, I said...
The best thing AMD has going for it
...actually having very decent open source solutions where it matters. Nvidia is still a complete non-starter in areas where this matters.
In the current round of exascale supercomputer government contracts, only Intel (incidentally also pushing open source as part of its oneAPI) and AMD won something, while the winners of previous gens, IBM (which has no GPU tech) and Nvidia (which had no open source strategy to speak of until last year), didn't.

And those exascale supercomputer government contracts matter to AMD, as they are what formed its roadmap through its dry years of little cash flow, with the help of the DOE's numerous Forward programs supporting exascale supercomputing efforts from a decade ago onward (Intel, IBM, Nvidia and many more were also supported as part of them).

So yes, I completely stand by what I wrote.
 
Reactions: Joe NYC and Tlh97

jpiniero

Lifer
Oct 1, 2010
14,688
5,318
136
I don't think it's an Open Source requirement at all. The one thing nVidia lacks is a server CPU, which they are working on... but I think it's more that they are a PITA to deal with.

AMD and even Intel are much better partners.
 
Reactions: Joe NYC

Heartbreaker

Diamond Member
Apr 3, 2006
4,238
5,244
136
No, I said...


In the current round of exascale supercomputer government contracts, only Intel (incidentally also pushing open source as part of its oneAPI) and AMD won something, while the winners of previous gens, IBM (which has no GPU tech) and Nvidia (which had no open source strategy to speak of until last year), didn't.

And those exascale supercomputer government contracts matter to AMD, as they are what formed its roadmap through its dry years of little cash flow, with the help of the DOE's numerous Forward programs supporting exascale supercomputing efforts from a decade ago onward (Intel, IBM, Nvidia and many more were also supported as part of them).

So yes, I completely stand by what I wrote.

The fact that Intel won with a GPU with no track record, and still hasn't delivered, likely means the real reason for the GPU choice here is price, and Intel and to a lesser extent AMD were willing to sacrifice margin more than NVidia.

Europe, which is notably more socialist and more likely to demand Open Source, is building its first Exascale Supercomputer with NVidia GPUs:

 

Ajay

Lifer
Jan 8, 2001
15,628
7,956
136
The fact that Intel won with a GPU with no track record, and still hasn't delivered, likely means the real reason for the GPU choice here is price, and Intel and to a lesser extent AMD were willing to sacrifice margin more than NVidia.

Europe, which is notably more socialist and more likely to demand Open Source, is building its first Exascale Supercomputer with NVidia GPUs:

The US government/military is basically funding advanced research at US semiconductor companies with these projects to maintain our edge. They like to spread the money around as much as they can. That's why Intel won (plus heavy lobbying). The technobabble is just CYA, really.
 

Ajay

Lifer
Jan 8, 2001
15,628
7,956
136
Can't help you if you really want to make yourself completely dependent on another company's proprietary ecosystem. While Nvidia is no Microsoft or Google (both prone to hyping and then cancelling tech all the time), it's still a very costly and risky area to go all in on without plan B alternatives.
I don’t want it or not want it. It’s just that right now, companies can ramp up faster with Nvidia HW/SW. The open source solutions need more funding and engineering talent devoted to them to catch up. I hope they succeed - competition is really vital in tech, and in most other businesses.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
I think you guys have the wrong thread. Make another one where you can discuss it. Thanks.

Based on that, and the assumption RDNA3 will see price drops (guessing the 7700XT goes to $399, and I expect the 7900XT goes to $799 or maybe $749, and the 7600 probably goes to $249), it seems to me the smart move for RDNA4 would be a 72CU at $649 or $599 and a 46 or 48 CU at $375 or $349. The lower one could be the 8600 and the upper one the 8800, clearly not topping the 7700XT or 7900XT, but beating a 7600 and 7800.
I was also thinking about such a configuration, but with much higher clocks compared to RDNA3.
If the clock speed were the same, then the cut-down variant (72CU -> 64CU) would be too close to the 7800XT.
 
Reactions: Tlh97 and krawcmac

branch_suggestion

Senior member
Aug 4, 2023
201
432
96
The fact that Intel won with a GPU with no track record, and still hasn't delivered, likely means the real reason for the GPU choice here is price, and Intel and to a lesser extent AMD were willing to sacrifice margin more than NVidia.

Europe, which is notably more socialist and more likely to demand Open Source, is building its first Exascale Supercomputer with NVidia GPUs:

Atos/SiPearl are the primary contractors; even if most of the FLOPS are NV-derived, they are an addition to the project, not the driver.
Not to mention it is mostly a Fugaku clone focusing on high BW/FLOP.
NV lost the exascale CORAL-2 contracts due to having no solid CPU choice at the time, and to NV not wanting to use Slingshot. Lastly there is of course cost: AMD and Intel were willing to go lower.
So they get scraps and memey US/European university supercomputers instead.
 

soresu

Platinum Member
Dec 19, 2014
2,721
1,921
136
Not to mention it is mostly a Fugaku clone focusing on high BW/FLOP.
Fugaku uses a custom ARM64 CPU core (A64FX) designed just for that purpose, vs the more off-the-shelf Neoverse V1 IP in Rhea.

The only significant similarities here are that both use the ARM64 v8-A ISA and SVE1 SIMD units.

Beyond that there are significant SIMD capacity differences.

Each A64FX core has 2x 512-bit-wide units, while V1 has only 2x 256-bit-wide units - though I have no idea how well V1 and A64FX compare beyond that in scalar compute.
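The width difference translates directly into peak vector throughput. A small sketch of FP64 FLOPs per core per cycle, assuming each SIMD unit retires one FMA (2 FLOPs per lane) per cycle, which is an assumption about issue rates rather than a measured figure:

```python
def fp64_flops_per_cycle(units: int, width_bits: int) -> int:
    """Peak FP64 FLOPs per core per cycle for a given SIMD configuration."""
    lanes = width_bits // 64  # FP64 lanes per SIMD unit
    return units * lanes * 2  # x2 because an FMA counts as two FLOPs

print("A64FX:", fp64_flops_per_cycle(2, 512))        # 32 FLOPs/cycle/core
print("Neoverse V1:", fp64_flops_per_cycle(2, 256))  # 16 FLOPs/cycle/core
```

So on paper A64FX has 2x the per-core, per-cycle FP64 vector throughput, before accounting for clocks or scalar differences.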
 
Reactions: Tlh97 and Joe NYC

tajoh111

Senior member
Mar 28, 2005
299
312
136
MI300 is coming out too late.

Particularly if you look at Nvidia's latest roadmap.


I suspect most of the MI300 volume this year is going to El Capitan.

Meaning most of the MI300s sold commercially will be sold in 2024.

This is going to be compared against Blackwell, which is being unveiled in March of 2024. With Nvidia going to a yearly release cadence for datacenter, they have the ability to use the fastest memory and manufacturing tech, giving AMD brutal competition. Rumors point to Blackwell selling in high volume in the 2nd half of 2024.

This has other consequences for AMD because this type of volume and money may cause AMD to have supply issues with TSMC.

With Nvidia likely being a 100 billion revenue company next year, and net profit in the 40 to 50 billion range (a single quarter being greater than AMD's total profit in the last decade), they will have the grunt to wipe out AMD's AI data center plans through strangulation of supply chains, developer support and accelerated roadmaps.

Hopper is sold out for 2024, which translates into about 80 billion dollars (2 million units at 40k each). Add in other Nvidia revenue like Blackwell and gaming and it is simply a monstrous amount of revenue. This gives Nvidia the financial horsepower to produce a 3nm data center chip in 2024, which is going to be compared against the 5nm MI300.
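The $80B figure is just the rumored unit count times the rumored ASP; both inputs are the post's assumptions, not reported financials:

```python
rumored_units = 2_000_000  # rumored Hopper units for 2024
rumored_asp = 40_000       # rumored average selling price in USD

revenue = rumored_units * rumored_asp
print(f"${revenue / 1e9:.0f}B")  # -> $80B
```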

By the time AMD gets to 3nm, Nvidia will be on 2nm. AMD is losing its position at TSMC.


This will have consequences for AMD in the rest of its product roadmaps as Intel uses TSMC more and AMD loses clout at TSMC. AMD has a hard choice ahead: which division is going to be sacrificed in order for the rest of the products to succeed? I think consumer graphics is going to be the item that gets heavy cuts again, like in the past. The cancellation of Navi 41 was a prelude to this, I think.

AMD needs to be more forward thinking. Nvidia dominated supercomputers in the past and still has 5 of the top 10 supercomputers in the world.

https://www.top500.org/lists/top500/list/2023/06/

Look at the rest of the list and it is still dominated by Nvidia. Nvidia has just moved on to bigger and better things. A couple of supercomputers from the US Government worth 1.2 billion dollars every 5 years is decent money for AMD..... but for Nvidia that's soon to be the weekly sales of H100, with better margins to boot.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
MI300 is coming out too late.

Particularly if you look at Nvidia's latest roadmap.


I suspect most of the MI300 volume this year is going to El Capitan.

Meaning most of the MI300s sold commercially will be sold in 2024.

This is going to be compared against Blackwell, which is being unveiled in March of 2024. With Nvidia going to a yearly release cadence for datacenter, they have the ability to use the fastest memory and manufacturing tech, giving AMD brutal competition. Rumors point to Blackwell selling in high volume in the 2nd half of 2024.

This has other consequences for AMD because this type of volume and money may cause AMD to have supply issues with TSMC.

With Nvidia likely being a 100 billion revenue company next year, and net profit in the 40 to 50 billion range (a single quarter being greater than AMD's total profit in the last decade), they will have the grunt to wipe out AMD's AI data center plans through strangulation of supply chains, developer support and accelerated roadmaps.

Hopper is sold out for 2024, which translates into about 80 billion dollars (2 million units at 40k each). Add in other Nvidia revenue like Blackwell and gaming and it is simply a monstrous amount of revenue. This gives Nvidia the financial horsepower to produce a 3nm data center chip in 2024, which is going to be compared against the 5nm MI300.

By the time AMD gets to 3nm, Nvidia will be on 2nm. AMD is losing its position at TSMC.


This will have consequences for AMD in the rest of its product roadmaps as Intel uses TSMC more and AMD loses clout at TSMC. AMD has a hard choice ahead: which division is going to be sacrificed in order for the rest of the products to succeed? I think consumer graphics is going to be the item that gets heavy cuts again, like in the past. The cancellation of Navi 41 was a prelude to this, I think.

AMD needs to be more forward thinking. Nvidia dominated supercomputers in the past and still has 5 of the top 10 supercomputers in the world.

https://www.top500.org/lists/top500/list/2023/06/

Look at the rest of the list and it is still dominated by Nvidia. Nvidia has just moved on to bigger and better things. A couple of supercomputers from the US Government worth 1.2 billion dollars every 5 years is decent money for AMD..... but for Nvidia that's soon to be the weekly sales of H100, with better margins to boot.
Thanks for presenting Nvidia's superiority in a thread mostly about RDNA4.
I really appreciate your and the others' effort in spamming this thread with pretty much unrelated stuff.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
Nvidia's mindshare is so overwhelming that 75-80% of an AMD thread has to become an NV thread.
I'm glad AMD finally kicked their (department) head out; they've really needed someone who knows how to compete with Nvidia in mindshare terms, and that guy clearly could not.

Anyway, ticking it over in my head, CU counts are probably
40CU/128bit
60CU/192bit
80CU/256bit
120CU/384bit

160 is too big and 100 is too odd, and this setup matches CU to bandwidth perfectly.
I wonder if they'll revisit the X3D for GPU variation, at least on the highest end chip.
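Those guesses all keep the same CU-per-bus-bit ratio, which is the "matches perfectly" claim above; a quick check (the configs themselves are the post's speculation):

```python
# Speculated configs from the list above: CU count -> bus width in bits
configs = {40: 128, 60: 192, 80: 256, 120: 384}

ratios = {cu / bus for cu, bus in configs.items()}
print(ratios)  # {0.3125} -- a single ratio across the whole speculated stack
assert len(ratios) == 1, "CU/bandwidth ratio is not constant"
```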
 