Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,775
6,759
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of there being no host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 
Last edited:

blckgrffn

Diamond Member
May 1, 2003
9,663
4,281
136
www.teamjuchems.com
Unless future consoles change, the RAM pool will be shared. Consoles have previously had a single pool of RAM shared by the CPU and GPU, with some reserved for the OS, so on paper a console may have 16GB, but in practice you may have half that.

Note that I have not looked at the PS5 Pro/switch 2, and don’t follow consoles closely in general.

That's not how it's being done in reality though, in terms of percentage.

PS5 has 12.5GB of the shared VRAM pool for a game. Series X has 13.5GB for a game, but there are two pools of RAM: a faster 10GB pool (560GB/s) and a 6GB pool (336GB/s). It seems like a resourceful developer could likely do some optimization to use the full 13.5GB of RAM on the Series X.

So the amount of ram usable for a game on console is more like 80%.

FWIW, the PS5 Pro somehow appears to have 13.7GB of memory exposed to the developers but it doesn't appear the Pro has more memory. It's unclear if a future update would expose more memory on the base PS5.

The Switch 2, from what I have gathered has 12GB total memory and 2GB system reserved.

There are CPU cores and time slices on the GPUs generally reserved as well.

It seems like the hypervisor/system reserved amount won't grow as fast as the console memory, so I would think 24GB of high-speed RAM with 3-3.5GB reserved by the system (up from ~2.5GB) would net 20+GB of usable memory, which would be roughly a 60% increase in available memory capacity.

If the next-gen consoles increase RAM from 16 to 24 GB, and the math holds so they move from ~13GB available to ~21GB available, that will be a smaller gen-on-gen increase (in percentage terms) than the jump from ~6GB generally available on PS4/Xbox One to ~13GB (a little more than double). But that trend is already on course: the generation before had ~256MB available out of 512MB total (more for the Xbox that gen) and jumped all the way to 6GB! 20GB (maybe 21?) is also outside the PC hardware comfort zone as of right now, so that would renew our 8GB thread, but for 16GB cards next.

32GB consoles would be great; even if they doubled the RAM reserved by the system for some reason, it would move the line to ~24GB+. It seems like a 256-bit GDDR7 pool of 24GB would more than double the effective memory bandwidth as well, if the same speed were used as on the RTX 5080: the PS5 is 448 GB/s and the 5080 is 960 GB/s. ~60% more memory and ~100% more memory bandwidth, paired with a solid Zen 5c CPU bump and something like RDNA4+ (some custom stuff in there), seems like it would move the needle. Guess we'll see.
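Quick sanity check on those percentages (a minimal sketch; the reserved amounts are my guesses from this post, not confirmed specs):

```python
# Back-of-the-envelope check of the numbers above; the reserved amounts
# are this post's assumptions, not confirmed specs.
current_total, current_usable = 16, 13.0      # GB, current-gen ballpark
next_total, next_reserved = 24, 3.5           # GB, hypothetical next gen
next_usable = next_total - next_reserved      # 20.5 GB

print(f"usable capacity gain: {next_usable / current_usable - 1:.0%}")  # ~58%

ps5_bw, gddr7_256bit_bw = 448, 960            # GB/s, PS5 vs RTX 5080-class GDDR7
print(f"bandwidth gain: {gddr7_256bit_bw / ps5_bw - 1:.0%}")            # ~114%
```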
 
Last edited:

marees

Golden Member
Apr 28, 2024
1,146
1,637
96
A few doubts/questions:

  1. Is HPC strictly CDNA, or does it also include RDNA?
  2. Is an N4X RDNA 4 chip for AI a possibility?
  3. If an RDNA 4 chip is being planned for AI (using GDDR7), what could the memory configurations be?
  4. Any leaks / engineering samples / shipping manifests / Linux support etc.?


 
Reactions: Vikv1918

Vikv1918

Junior Member
Mar 12, 2025
20
42
51
I've ranted about this before when the 9070XT launched and I'm gonna ask it again because the 9060XT has launched (yelling into the void):
Why no AMD laptop dGPUs? There is no reason why AMD can't sell the 9060 chip to laptop makers and produce $800 9060M laptops that would easily beat the 5060 at iso-power. Don't tell me it's a lack of profits, because Nvidia profitably sells millions of $800-$1000 60-class laptops. AMD can go even cheaper because of GDDR6 vs GDDR7. It's not like they are suffering from a supply bottleneck of 4nm or GDDR6 chips. Gaming laptops are 50% of dGPU market share now. It makes no sense at all to ignore 50% of the market and not even ATTEMPT to take market share. Nothing short of some kind of conspiracy theory makes sense to me. And making dGPU laptops doesn't really hurt their current model of selling x86 handhelds and Strix Halos; it's just complementary to that. No one has ever given me a convincing explanation for this.
 
Jul 27, 2020
24,948
17,344
146
No one has ever given me a convincing explanation for this.
There has to be a really bad experience behind AMD's reluctance to try to penetrate the laptop market. I believe the RX 6400/6500 XT dies were originally meant for laptops? I've also heard from people that their AMD Advantage laptops were not without problems. If you have to return a laptop, you tell your friends who then tell you, oh who told you to go AMD dGPU in a laptop? No wonder it was crap. It's partly AMD's fault for not making good enough problem-free mobile GPUs, part OEMs who don't want to be stuck with unsellable inventory and part consumers who are too accustomed to the green sticker.

By the way, same reason why we don't see a lot of laptops with Intel ARC GPUs. People would be reluctant to spend the same amount of money on them that they spend on green laptops and if they are sold for less, why sell them at all?
 
Reactions: Tlh97

marees

Golden Member
Apr 28, 2024
1,146
1,637
96
There has to be a really bad experience behind AMD's reluctance to try to penetrate the laptop market. I believe the RX 6400/6500 XT dies were originally meant for laptops? I've also heard from people that their AMD Advantage laptops were not without problems. If you have to return a laptop, you tell your friends who then tell you, oh who told you to go AMD dGPU in a laptop? No wonder it was crap. It's partly AMD's fault for not making good enough problem-free mobile GPUs, part OEMs who don't want to be stuck with unsellable inventory and part consumers who are too accustomed to the green sticker.

By the way, same reason why we don't see a lot of laptops with Intel ARC GPUs. People would be reluctant to spend the same amount of money on them that they spend on green laptops and if they are sold for less, why sell them at all?
It must be the system integrators/OEMs. They seem to have an issue with RDNA laptops. AMD seems to have given up on this market completely and would rather focus on the halo. All eggs in the halo basket, it seems.
 

Kepler_L2

Senior member
Sep 6, 2020
828
3,340
136
I've ranted about this before when the 9070XT launched and I'm gonna ask it again because the 9060XT has launched (yelling into the void):
Why no AMD laptop dGPUs? There is no reason why AMD can't sell the 9060 chip to laptop makers and produce $800 9060M laptops that would easily beat the 5060 at iso-power. Don't tell me it's a lack of profits, because Nvidia profitably sells millions of $800-$1000 60-class laptops. AMD can go even cheaper because of GDDR6 vs GDDR7. It's not like they are suffering from a supply bottleneck of 4nm or GDDR6 chips. Gaming laptops are 50% of dGPU market share now. It makes no sense at all to ignore 50% of the market and not even ATTEMPT to take market share. Nothing short of some kind of conspiracy theory makes sense to me. And making dGPU laptops doesn't really hurt their current model of selling x86 handhelds and Strix Halos; it's just complementary to that. No one has ever given me a convincing explanation for this.
No green sticker no sales.
 

inquiss

Senior member
Oct 13, 2010
451
668
136
It must be the system integrators/OEMs. They seem to have an issue with RDNA laptops. AMD seems to have given up on this market completely and would rather focus on the halo. All eggs in the halo basket, it seems.
It's not OEMs that have an issue, it's consumers. Consumers don't buy AMD graphics cards. DIY is your informed minority, well a portion of it. The general population...Nvidia or nothing.
 

Rigg

Senior member
May 6, 2020
704
1,781
136
The odds seem pretty good that his Monster Hunter RT data for 8 GB cards is 💩.
Looks like the odds might not be as good as I thought. I had a back and forth with a guy on the TPU forums and figured a couple of things out that I wasn't aware of. The MHW texture issue on 8 GB cards that Daniel Owen has been highlighting has a bit of an asterisk. He's mentioned in at least one of his videos that he's using the free benchmark with a mod that removes the cut-scenes. The free benchmark includes the Highest (Hi-Res Textures) option and has it enabled by default when the ultra preset is selected. The hi-res texture pack isn't included in the actual game by default though. It's a free DLC. They have a minimum system requirement of 16 GB clearly displayed on the download page. The texture loading issue on the 8 GB cards seems to be specific to this Highest (Hi-Res Textures) setting. I'd venture a guess that Daniel isn't aware of any of this since he doesn't appear to own the game. I doubt W1z is testing with the texture pack so his data for this game is most likely fine.

 
Last edited:

basix

Member
Oct 4, 2024
120
251
96
The PS6 is most likely going to have 24 GB. 4 GB chips should be available at some point, which would mean the 128-bit cards have 16 GB.
What about 8-channel LPDDR6? It would deliver ~1TB/s of bandwidth, like 256-bit with 32Gbps GDDR7. But instead of being limited to probably 24GByte, one could use 32GByte of LPDDR6 (and LPDDR should be cheaper than GDDR7 per gigabyte).
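The GDDR7 side of that comparison is just the usual bus-width x data-rate math; a minimal sketch (the LPDDR6 line is left open, since I'm not assuming specific channel widths or speeds):

```python
# Peak bandwidth = bus width (bits) * per-pin data rate (Gbps) / 8.
# GDDR7 figures are from the post; the LPDDR6 line only shows what any
# configuration would need in aggregate -- not a claim about LPDDR6 specs.
def peak_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits * gbps_per_pin / 8

gddr7 = peak_gb_s(256, 32)                       # 1024 GB/s, the ~1 TB/s above
print(f"256-bit @ 32 Gbps GDDR7: {gddr7:.0f} GB/s")

# To match it, width * rate must total ~8192 Gbit/s in aggregate,
# e.g. a 512-bit LPDDR6 interface would need ~16 Gbps per pin.
print(f"512-bit LPDDR6 needs {gddr7 * 8 / 512:.0f} Gbps/pin to match")
```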

A few doubts/questions:

  1. Is HPC strictly CDNA, or does it also include RDNA?
  2. Is an N4X RDNA 4 chip for AI a possibility?
  3. If an RDNA 4 chip is being planned for AI (using GDDR7), what could the memory configurations be?
  4. Any leaks / engineering samples / shipping manifests / Linux support etc.?



  1. Not sure what you mean there. You mean things like the Radeon AI Pro R9700?
  2. Could be. N4X is compatible with N4P and might give an additional boost. But it would only be a refresh of the existing N48 and maybe also N44 chips. The most important thing here would be GDDR7, not a clock-rate bump due to N4X. For gamers, the clock-rate bump and 12GByte on N44 would be the most interesting aspects.
  3. That would be easy: 32GByte or 48GByte on N48-based cards (rough math in the sketch below).
  4. Besides the Radeon AI Pro R9700 with 32GByte for professional use cases, which has already been released, not that I am aware of.
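For point 3, the capacity math is simple if one assumes N48 keeps its 256-bit bus and runs clamshell; a sketch under those assumptions:

```python
# Rough math behind the 32/48 GByte figures. Assumptions: N48 keeps its
# 256-bit bus, GDDR7 devices are 32 bits wide, clamshell doubles the count.
bus_width_bits = 256
devices = bus_width_bits // 32          # 8 devices single-sided
clamshell_devices = devices * 2         # 16 devices in clamshell mode

for density_gb in (2, 3):               # 16 Gbit and 24 Gbit GDDR7 parts
    print(f"{density_gb} GB chips -> {clamshell_devices * density_gb} GB total")
# 2 GB chips -> 32 GB total
# 3 GB chips -> 48 GB total
```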
 
Last edited:
Reactions: marees

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,491
30,975
146
Looks like the odds might not be as good as I thought. I had a back and forth with a guy on the TPU forums and figured a couple of things out that I wasn't aware of.
First: Your avatar there is hilarious. He looks like he's thinking about Denny's Nvidia bytes 😆

The screenshots that guy posted with his 4060 look like petroleum jelly is smeared over them; horrible.
The MHW texture issue on 8 GB cards that Daniel Owen has been highlighting has a bit of an asterisk. He's mentioned in at least one of his videos that he's using the free benchmark with a mod that removes the cut-scenes. The free benchmark includes the Highest (Hi-Res Textures) option and has it enabled by default when the ultra preset is selected. The hi-res texture pack isn't included in the actual game by default though. It's a free DLC. They have a minimum system requirement of 16 GB clearly displayed on the download page. The texture loading issue on the 8 GB cards seems to be specific to this Highest (Hi-Res Textures) setting.
16GB requirement is because it states the game is expected to run at 4k with frame generation. I booted the Bazzite PC I am on, into 11 pro and did a run of the bench. Then I did the same settings run but with Ultra textures and highest texture streaming instead of high.

1. 1080 high - FSR AA - ray tracing low - frame generation on.
2. 1080 high - FSR AA - ray tracing low - frame generation on - ultra textures - highest texture streaming

Both passes blew past 8GB used. Ultra uses an extra 2GB at 1080; pretty brutal. But nowhere near 16GB
I doubt W1z is testing with the texture pack so his data for this game is most likely fine.
High is enough at 1080 native with a little RT and FG to overflow the framebuffer. I don't have an 8GB card to test with, so I can't prove the textures would still fail to load, but based on the vram usage I saw, it seems likely. EDIT: At least on AMD. Nvidia might squeak in under the limit most of the time? I did see over 10GB in the grassy section of the bench. I'll have to run it on the 3060 12GB and see how much vram it uses compared to the 6800.

Pattern on her jacket at her elbow looks bad on high.


 
Last edited:

Rigg

Senior member
May 6, 2020
704
1,781
136
First: Your avatar there is hilarious. He looks like he's thinking about Denny's Nvidia bytes 😆

The screenshots that guy posted with his 4060 look like petroleum jelly is smeared over them; horrible.

16GB requirement is because it states the game is expected to run at 4k with frame generation. I booted the Bazzite PC I am on, into 11 pro and did a run of the bench. Then I did the same settings run but with Ultra textures and highest texture streaming instead of high.

1. 1080 high - FSR AA - ray tracing low - frame generation on.
2. 1080 high - FSR AA - ray tracing low - frame generation on - ultra textures - highest texture streaming

Both passes blew past 8GB used. Ultra uses an extra 2GB at 1080; pretty brutal. But nowhere near 16GB

High is enough at 1080 native with a little RT and FG to overflow the framebuffer. I don't have an 8GB card to test with, so I can't prove the textures would still fail to load, but based on the vram usage I saw, it seems likely. EDIT: At least on AMD. Nvidia might squeak in under the limit most of the time? I did see over 10GB in the grassy section of the bench. I'll have to run it on the 3060 12GB and see how much vram it uses compared to the 6800.

Pattern on her jacket at her elbow looks bad on high.



LOL!

I'm also running Bazzite (GNOME desktop version) and don't have an 8 GB card to test with. I agree that the guy's screenshots don't look great (blurry from FXAA+TAA?), but that last set of screenshots he took pretty well convinced me the texture loading issue is specific to the hi-res textures and not necessarily related to exceeding the VRAM buffer. The settings I had him try would've had to overflow the frame buffer but didn't seem to cause the textures not to load. I'm also confused about the 16 GB requirement, since you correctly point out that turning the hi-res texture setting on alone doesn't use anywhere near 16 GB. I think I still have more questions than answers regarding this issue. If anybody here has an appropriate system to do some testing for us, that would be awesome.
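For anyone testing on the Linux side with an AMD card: amdgpu exposes its VRAM counters in sysfs, so a rough polling sketch like this (the card0 index is an assumption, adjust to however your GPU enumerates) could log peak usage during a benchmark run:

```python
import time

# Polls the amdgpu VRAM counters in sysfs (Linux only); run alongside
# the benchmark and Ctrl+C when the run finishes.
VRAM_USED = "/sys/class/drm/card0/device/mem_info_vram_used"
VRAM_TOTAL = "/sys/class/drm/card0/device/mem_info_vram_total"

def read_mib(path: str) -> float:
    with open(path) as f:
        return int(f.read()) / 2**20          # sysfs reports bytes

total = read_mib(VRAM_TOTAL)
peak = 0.0
try:
    while True:
        used = read_mib(VRAM_USED)
        peak = max(peak, used)
        print(f"VRAM: {used:.0f} / {total:.0f} MiB (peak {peak:.0f})", end="\r")
        time.sleep(1)
except KeyboardInterrupt:
    print(f"\npeak VRAM during run: {peak:.0f} MiB")
```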
 

Jan Olšan

Senior member
Jan 12, 2017
530
1,050
136
If you're referring to neural texture compression that's not really "upscaling", it's just a more efficient encoding format using low-precision math.
I wonder how efficient it can be, though? The memory savings will be partially offset by the extra memory eaten up by the decompression neural network models.
And the textures have to stay in the "neural" format if we want to have any savings (because decompressing them to a regular format on load into GPU memory doesn't help at all), so every texture sampling instance will invoke neural network inference (or at least every load of such texture data into the caches?). That is non-trivial overhead/bottleneck compared to the regular texture sampling process.
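Just to illustrate what I mean by the overhead (a purely conceptual sketch, not how NVIDIA's actual pipeline works; the function names and tensor shapes are made up for illustration): a conventional sample is a single hardware-filtered fetch, while sampling a neural-format texture means fetching latent data and then running a small MLP per sample.

```python
import numpy as np

# Conceptual sketch only, not any vendor's real implementation.
def conventional_sample(texture, uv):
    h, w, _ = texture.shape
    x, y = int(uv[0] * (w - 1)), int(uv[1] * (h - 1))
    return texture[y, x]                      # one cheap fetch, done in hardware

def neural_sample(latent_texture, uv, mlp_weights):
    h, w, _ = latent_texture.shape
    x, y = int(uv[0] * (w - 1)), int(uv[1] * (h - 1))
    z = latent_texture[y, x]                  # smaller in memory than raw texels...
    for W, b in mlp_weights:                  # ...but every sample pays for this math
        z = np.maximum(W @ z + b, 0.0)
    return z

rng = np.random.default_rng(0)
latent = rng.standard_normal((256, 256, 8), dtype=np.float32)   # 8-channel latent grid
mlp = [(rng.standard_normal((16, 8)), np.zeros(16)),
       (rng.standard_normal((3, 16)), np.zeros(3))]
print(neural_sample(latent, (0.5, 0.5), mlp))                    # reconstructed "RGB"
```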

I don't like the idea of burdening texture data sampling to this degree. The advantage of today's approach is that texture sampling is very cheap, and big textures are a relatively cheap way to make game graphics better and prettier. That gives a great baseline, so you can then spend compute performance on shader effects and postprocessing, or ray tracing. These neural textures sound like a poor idea that takes compute performance away from better uses and potentially harms the visual quality of the textures, and for what gains? A 20% reduction in texture memory footprint? So that they can save $15 on a graphics card, but then the in-game performance of the GPU drops half a tier, which is worth $50-100 to the buyer?

Now, neural materials could make sense, because they use the neural network to gain compute performance (the proposition is that neural hallucinating can simulate the material faster / with less compute overhead than conventional shaders). Neural textures sacrifice compute performance to somewhat reduce memory usage. I think that's the wrong strategy/tradeoff; it's probably more efficient to give the card a bit more memory so that you can keep using regular, fast texture formats and gain performance that way.

Nvidia may be under a "when you have a tensor core, everything looks like a neural network use case" fallacy here; it wouldn't surprise me if we found out this is a mistaken approach.
 
Last edited:

lucasworais

Member
Dec 11, 2022
49
80
61
In theory Radeon is improving. RDNA is a good architecture; however, is AMD really closing the gap?

Once again there's discussion and angry comments that Nvidia is selling less for a higher price:

View attachment 125211

But seeing these numbers I realise that each generation Nvidia renames a lower-tier die as a higher-tier card, and yet it's enough to compete with AMD.
If Jensen wanted, he could really have "killed" Radeon, but as it is, AMD's GPUs are no more than another tool Jensen uses to boost the value of his products and brand.


It's depressing.
It gets even worse when you think that AMD cards don't even beat their Nvidia competitors; they are just a slightly cheaper option.
I can only dream of the day when AMD offers a better-performing card for the same price as Nvidia.
 

soresu

Diamond Member
Dec 19, 2014
3,832
3,176
136

AMD announcing MI350 series....









Instinct MIxxx accelerators are not just in supercomputers; they are talking up partners like Meta using them, as well as the AI company owned by he who will not be named.
 
Last edited:
Reactions: bearmoo and Mopetar

gaav87

Senior member
Apr 27, 2024
659
1,272
96
Ok so I found this:
GFX1250 has the same number of SIMD units as RDNA3/4 (2 per CU) and 2 CUs per WGP.
GFX1250 is targeted as "arch, 32", I guess 32-bit, vs "MI" 64-bit throughput. GFX950 is likely the Instinct MI355X announced today, right?
So going by this and my detective skills, GFX1250 is a consumer GPU, not an Instinct compute GPU based on CDNA (Compute DNA).
RX 9090XT ? xD

 