Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment · Mar 23, 2022

With the GFX940 patches in full swing since the first week of March, it is looking like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Since RDNA2, they have shortened the window in which they push support for new devices, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe it is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits

History for llvm/lib/Target/AMDGPU - llvm/llvm-project

github.com

Or Phoronix

More AMD "GFX940" Enablement Work Landing In LLVM - Phoronix

www.phoronix.com

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper's problem is that no host CPU capable of PCIe 5 will be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.

Previous thread on CDNA2 and RDNA3 here

Question - Speculation: RDNA3 + CDNA2 Architectures Thread

Man I have been dying to make this one for a while now. First rumours for RDNA3 are here so new thread time! Just going to start off with this one for now: kopite7kimi on Twitter: "@VideoCardz Ah, I mean a simple mcm design with 10240 cores is not enough. Because the lift from RDNA2 to RDNA3...

forums.anandtech.com

igor_kavinski · May 3, 2024

It just shows that most humans are risk-averse and change-averse. People pretending to be wise by going with the popular or trendy thing rather than experimenting and hoping to discover something new. Blindly accepting common misconceptions and falsehoods. No wonder humanity has screwed planet Earth so bad.

Joe NYC · May 3, 2024

Mopetar said:
Best way to move more units of big expensive datacenter GPUs is to make smaller consumer GPUs that don't use as many wafers.

What are NVidia customers going to do in that scenario, switch to Intel?

The tradeoff argument would be valid only if TSMC had a shortage of capacity, that is, if TSMC were at full utilization.

I have not listened to their latest investor conference call, but we can safely assume that TSMC is not at full capacity on the nodes in question (N5, N4).

As for Intel, Intel is also using TSMC. Their old Arc cards are on N6, on which TSMC has endless capacity...

gdansk · May 3, 2024

igor_kavinski said:
It just shows that most humans are risk-averse and change-averse. People pretending to be wise by going with the popular or trendy thing rather than experimenting and hoping to discover something new. Blindly accepting common misconceptions and falsehoods. No wonder humanity has screwed planet Earth so bad.

It's not like Radeon is LSD; it's just a slightly cheaper GPU.

gaav87 · May 3, 2024

Aapje said:
This is not really true, as shaders are compiled exactly so that the final code can be optimized for the architecture of the card. And Nvidia and AMD can handcode parts of the shader for certain games to optimize it further.

An issue is that the compiler for dual issue was really poor and probably only made modest gains. See the compiler section in: https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/

How is my statement "Games are often optimized for wave32 execution." false ?

Gains from dual issue were next to none. Check older gameplay comparisons on YT of the RX 7900 GRE vs the RX 6950 XT at the same core and memory speed. Only UE5 games got more than a 15% performance increase. The rest were ±0, with even some losses. Dual issue on RDNA3 was still 128b. It doesn't matter if two neighbouring SIMDs could do the same calculation; it is still 128b if the result needs to be equal. This would only increase performance in dot products or ML/AI, as seen in SD, where the 7900 GRE is about 2x faster than the 6950 XT.

I think AMD separated the RT calculations from the ALUs. And they either went with 6x 32-wide SIMDs per WGP for 192b of data, as two FMAs need 1536B, so each piece of data needs to be reused 8 times instead of 12x as in RDNA3. That way they can market +50% performance/watt "slides". And the boost clock is around 2750 MHz.
Or they went with some crazy 8x 16-wide SIMD combo and the boost clock is 3050 MHz.
I do not see the N7->N4 clock increase at iso power being over 25-30% (N7->N4P is 26% and N7->N4X is 30%).
Would smaller SIMDs allow them to increase the clock speed? (Not sure about that.)
Reference RDNA2 6800 XT or 6800 cards boost to around 2200 MHz at 250W, and 3050 MHz would be around a 40-45% clock increase, which is crazy.
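
For anyone who wants to sanity-check the clock percentages in this post, here is a minimal Python sketch. It assumes the roughly 2200 MHz RDNA2 reference boost and the two speculated boost clocks (2750 MHz and 3050 MHz) quoted above; the function name is just illustrative.

```python
def clock_increase_pct(base_mhz: float, target_mhz: float) -> float:
    """Relative clock increase from base to target, in percent."""
    return (target_mhz - base_mhz) / base_mhz * 100.0


BASE_MHZ = 2200  # approximate RDNA2 6800 / 6800 XT boost clock quoted above

for target_mhz in (2750, 3050):  # the two speculated boost clocks
    pct = clock_increase_pct(BASE_MHZ, target_mhz)
    print(f"{BASE_MHZ} -> {target_mhz} MHz: +{pct:.1f}%")
```

With these inputs the 2750 MHz case works out to a 25% increase and the 3050 MHz case to roughly 39%, slightly under the 40-45% ballpark.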

adroc_thurston · May 3, 2024

gaav87 said:
How is my statement "Games are often optimized for wave32 execution." false ?

They aren't.
Shader compiler does the 'optimization'.

gaav87 said:
Gains from dual issue were next to none.

You're not supposed to have much if any. It's an opportunistic throughput hack.

gaav87 said:
I think AMD separated the RT calculations from the ALUs. And they either went with 6x 32-wide SIMDs per WGP for 192b of data, as two FMAs need 1536B, so each piece of data needs to be reused 8 times instead of 12x as in RDNA3. That way they can market +50% performance/watt "slides". And the boost clock is around 2750 MHz.
Or they went with some crazy 8x 16-wide SIMD combo and the boost clock is 3050 MHz.

that's really-really not what happened.
They're not doing ALU spam.
I get it, you understand next to nothing about GPUs.
But no need to write essays about it.

gaav87 said:
3050 MHz would be around a 40-45% clock increase, which is crazy.

baby clocks

Aapje · May 3, 2024

gaav87 said:
How is my statement "Games are often optimized for wave32 execution." false ?

You don't seem to have read my reasons, so to repeat myself:
- Games aren't specifically optimized for certain cards, the shaders are compiled for the specific card on the PC of the user
- The shader compiler for RDNA3 is/was bad and often doesn't take advantage of dual issue opportunities (see the link from my previous message that you didn't read)

And what adroc says is correct as well, dual issue is a bit of a hack that works much better for compute and not so well for games. It can only work for very specific things, since both operations have to work on the same data.

gaav87 · May 3, 2024

adroc_thurston said:
They aren't.
Shader compiler does the 'optimization'.

You're not supposed to have much if any. It's an opportunistic throughput hack.

that's really-really not what happened.
They're not doing ALU spam.
I get it, you understand next to nothing about GPUs.
But no need to write essays about it.

baby clocks

1. Known shaders are w32; they can optimize shaders by hand for w64...
2. That's what I said, so why are you quoting me?
3. Maybe not next to nothing, but I'm just a civil structural engineer who reads the white papers for fun as a hobby, Mr. Know-It-All.
You don't need to be rude af.

So you think they changed nothing from rdna3 and magically managed to get a 500-600mhz clock increase ? xD

adroc_thurston · May 3, 2024

Aapje said:
The shader compiler for RDNA3 is/was bad and often doesn't take advantage of dual issue opportunities (see the link from my previous message that you didn't read)

You don't need dual issue when you can just emit w64.

gaav87 said:
1. Known shaders are w32; they can optimize shaders by hand for w64...

They aren't.
Shader compiler compiles them to either w32 or w64.

gaav87 said:
Maybe not next to nothing, but I'm just a civil structural engineer who reads the white papers for fun as a hobby, Mr. Know-It-All.

Well yeah then you gotta read up on GPU programming basics and shader toolchains.

gaav87 said:
So you think they changed nothing from rdna3 and magically managed to get a 500-600mhz clock increase ? xD

RDNA3.5 gets a 500-600Mhz clock increase.
RDNA4 is an unrelated microarchitecture.

gaav87 · May 3, 2024

adroc_thurston said:
RDNA3.5 gets a 500-600Mhz clock increase.

Source ?

Also I'm sad that instead of saying why my reasoning was wrong you insulted me. Why would 6x 32wide simd's be a bad idea ?

ToTTenTranz · May 3, 2024

adroc_thurston said:
RDNA3.5 gets a 500-600Mhz clock increase.
RDNA4 is an unrelated microarchitecture.

So PS5 Pro's GPU clock ceiling should be 2.7-2.8GHz?

Does RDNA4 clock below or above RDNA3.5?

adroc_thurston · May 3, 2024

gaav87 said:
Also I'm sad that instead of saying why my reasoning was wrong you insulted me.

I'm telling you to read up on basics.
You need to understand how GPUs work before making some WGP design assumptions.

gaav87 said:
Why would 6x 32wide simd's be a bad idea ?

You need to upsize every other part of the WGP to make that work.
Hurts fmax, major gamble too.

ToTTenTranz said:
So PS5 Pro's GPU clock ceiling should be 2.7-2.8GHz?

Not the slightest idea, since consoles live in a separate la-la-land wrt binning targets.

ToTTenTranz said:
Does RDNA4 clock below or above RDNA3.5?

Bout the same.

jpiniero · May 3, 2024

ToTTenTranz said:
So PS5 Pro's GPU clock ceiling should be 2.7-2.8GHz?

Clock is ~2.1 GHz according to current rumors. Presumably it can still hit the base PS5 GPU clocks if needed with the shaders cut back down.

gaav87 · May 3, 2024

adroc_thurston said:
You need to upsize every other part of the WGP to make that work.
Hurts fmax, major gamble too.

I know that it hurts clocks and that you need to upsize everything; that's why I said 2750 MHz boost clock. It's still possible, so why is it a gamble?
And what about 2x 4 16-wide SIMDs with shared 2x 64 wave slots, so 128 wavefronts?
Hint-Assisted Wavefront Scheduler would allow selective out-of-order execution.
Idk why you assumed I do not know the basics when my WGP designs are possible.

adroc_thurston · May 3, 2024

gaav87 said:
I know that it hurts clocks

Then it shouldn't exist!

gaav87 said:
It's still possible, so why is it a gamble?

Because you can sim it only so well and so far.

gaav87 said:
And what about 2x 4 16-wide SIMDs with shared 2x 64 wavefronts?

You don't need more SIMDs.
More isn't better.

gaav87 said:
Hint-Assisted Wavefront Scheduler would allow selective out-of-order execution.

the what? GPUs are strictly in-order.

gaav87 said:
Idk why you assumed I do not know the basics when my WGP designs are possible.

You don't since your ideas contradict the way modern shader cores converged on each other.
The biggest big boy shader core to this day is still SMX from Kepler, and guess what? It sucked.

Panino Manino · May 3, 2024

Blame RTG/AMD, not him.
I'll say it again, Radeon is in the gutter, searching for scraps. It's not Alex's fault, or even Jensen's fault.

adroc_thurston · May 3, 2024

Panino Manino said:
Blame RTG/AMD, not him.

It's him.

Panino Manino said:
I'll say it again, Radeon is in the gutter, searching for scraps

branch_suggestion · May 3, 2024

B3D™ virus is spreading.

PJVol · May 4, 2024

Aapje said:
- The shader compiler for RDNA3 is/was bad and often doesn't take advantage of dual issue

There's not much room for it to optimize given a number of restrictions on using VOPD

SolidQ · May 4, 2024

igor_kavinski said:
It just shows that most humans are risk-averse and change-averse. People pretending to be wise by going with the popular or trendy thing rather than experimenting and hoping to discover something new. Blindly accepting common misconceptions and falsehoods. No wonder humanity has screwed planet Earth so bad.

Here are a few examples from another forum.

And there are a lot of other examples. As you can see, for these people only NV exists.

ToTTenTranz · May 4, 2024

SolidQ said:
gonna show few examples from other forum.

Where is that from? Beyond3D?

Last I saw, there was a guy there trying to make the point that Nvidia invented PC gaming. Unironically and with lots of support from other members.

The thing got completely out of control when DF's Alex started posting there. Dude was treated like the second coming.

igor_kavinski · May 4, 2024

Reading all that, I'm thinking AT is a kindergarten with well behaved kids

Or AT has comparatively stellar moderation with very efficient mods who nip the crazies in the bud before they have a chance to blossom

GodisanAtheist · May 4, 2024

igor_kavinski said:
Reading all that, I'm thinking AT is a kindergarten with well behaved kids

Or AT has comparatively stellar moderation with very efficient mods who nip the crazies in the bud before they have a chance to blossom

- *Looks over at ATPN* uh yeah keep telling yourself that my guy

Timorous · May 4, 2024

GodisanAtheist said:
- *Looks over at ATPN* uh yeah keep telling yourself that my guy

ATPN is where they house the crazies so they don't infect the rest of the board.

Mahboi · May 4, 2024

Timorous said:
ATPN is where they house the crazies so they don't infect the rest of the board.

I'll go back when I feel like utterly wasting my time.
But yeah it's not possible to have a conversation there.

DAPUNISHER · May 4, 2024

Timorous said:
ATPN is where they house the crazies so they don't infect the rest of the board.

The rules are different for the social forums. In the tech forums everyone is expected to maintain a much higher level of decorum. That includes no profanity, personal attacks, and trolling.

branch_suggestion said:
B3D™ virus is spreading.

Anyone who will follow the forum guidelines is welcome here. Brand preference is also perfectly fine. As long as they keep the pom-pom shaking and trash talk of other teams to the mild variety, and in their favorite vendor's threads, no rules are broken. Going to other vendors' threads to antagonize, troll, or trash talk the "other teams" is when the guilty will be punished. If anyone with a brand preference does not like what the "other teams" are saying in their threads? Don't read those threads. Jumping in to defend your team is not permitted. As the Offspring rocked so hard - gotta keep em separated!

GodisanAtheist said:
- *Looks over at ATPN* uh yeah keep telling yourself that my guy

I know you and your sense of humor. I am a fan of it. But here's a heads up. This technically falls under the "no moderator callouts" rule. The only place to question and complain about moderation is the moderation discussions forum. Which ironically perhaps, our corporate overlords have not fixed for all members to access yet. Hence, I will not enforce the rule about PMing mods directly given the situation, and you can PM me any time. Even if you have beef with me, I will get you in group chat with the Administration so your grievances can be addressed.

I hope this post helps out and informs the newer members here. Welcome aboard and happy posting.

Mod DAPUNISHER
