Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment · Mar 23, 2022

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around 3 quarters to land support in LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (maybe to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits

History for llvm/lib/Target/AMDGPU - llvm/llvm-project

github.com

Or Phoronix

More AMD "GFX940" Enablement Work Landing In LLVM - Phoronix

www.phoronix.com

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.

Previous thread on CDNA2 and RDNA3 here

Question - Speculation: RDNA3 + CDNA2 Architectures Thread

Man I have been dying to make this one for a while now. First rumours for RDNA3 are here so new thread time! Just going to start off with this one for now: kopite7kimi on Twitter: "@VideoCardz Ah, I mean a simple mcm design with 10240 cores is not enough. Because the lift from RDNA2 to RDNA3...

forums.anandtech.com

adroc_thurston · May 2, 2024

SolidQ said:
Because they think that if AMD can't compete with NV, they're out (there are enough posts about it), and they're also trusting Intel to compete with NV.

It's just reddit mind poison and wanting Radeon to die for committing the sin most grave: not making green slop cheaper.

SolidQ said:
If you read different gaming forums, they always praise DLSS to heaven and call it "Black Magic".

That's the magic of technically independent shills like Digital Foundry.

SolidQ · May 2, 2024

adroc_thurston said:
It's just reddit mind poison and wanting Radeon to die for committing the sin most grave: not making green slop cheaper.

That's what they don't understand: if there were no AMD, the RTX 4060 would cost $1k.

adroc_thurston said:
That's the magic of technically independent shills like Digital Foundry.

Apart from some of their retro videos, which I love watching, I never take them seriously. Their methodology relies on old games and RT games; where are the UE5 games in their card comparisons?
Even Daniel Owen explains it much better. He's literally saying RT doesn't matter in the low/mid segments.
Like AMD 26 fps, NV 35fps, both unplayable.

adroc_thurston · May 2, 2024

SolidQ said:
That's what they don't understand: if there were no AMD, the RTX 4060 would cost $1k.

Not really, NV still gotta move units.

SolidQ said:
Apart from some of their retro videos, which I love watching, I never take them seriously. Their methodology relies on old games and RT games; where are the UE5 games in their card comparisons?

Point is, the kids do.
They're technically clueless but they work very well at sprinkling magic dust onto whatever NV FOTM 'feature' is.

Mopetar · May 2, 2024

adroc_thurston said:
Not really, NV still gotta move units.

They'd keep their prices the same, but the card you get for that price would be a lot less. Effectively customers end up paying more when there's no competition.

Intel did the same thing keeping their desktop chips at 4 cores even when they could have moved up to at least 6 cores without any big hassle.

adroc_thurston · May 2, 2024

Mopetar said:
They'd keep their prices the same, but the card you get for that price would be a lot less. Effectively customers end up paying more when there's no competition.

Naa they gotta move units.

Mopetar said:
Intel did the same thing keeping their desktop chips at 4 cores even when they could have moved up to at least 6 cores without any big hassle.

That's DIY, no one cares about DIY in client PC.
GPU sales are over half DIY, that's like the difference.

Tup3x · May 2, 2024

SolidQ said:
If you read different gaming forums, they always praise DLSS to heaven and call it "Black Magic".

Well, compared to FSR2... It's no contest. FSR2 is mediocre as an upscaling method and quite bad as an anti-aliasing method. Compared to that, almost anything is black magic. I sure hope things get improved in the latest version, since the current version looks bad even at native resolution.

adroc_thurston · May 2, 2024

Tup3x said:
Compared to that almost anything is black magic

At a lot higher computational costs!

Saylick · May 2, 2024

adroc_thurston said:
At a lot higher computational costs!

And die costs as well (allegedly). Just look at how much room that Tensor Core takes up. It's like the same size as the CUDA cores.

ToTTenTranz · May 2, 2024

adroc_thurston said:
how many times do people have to say it's not happening

How low can AMD's marketshare on consumer dGPUs go before they bow out of that market, though?
Because I don't remember it ever being as bad as the last couple years. At 10% in mid 2022 they were pretty close to irrelevancy.

I'm specifically talking about consumer dGPUs, because on iGPUs (PC and console SoCs) and server they seem healthy enough. People keep hoping for AMD to get stronger options so they can buy cheaper Nvidia cards. This is a real thing, I've seen it first hand with friends.

adroc_thurston said:
That's the magic of technically independent shills like Digital Foundry.

Alex / Dictator is a massive propagandist for nvidia and it's honestly hard to believe he's not on the payroll, but to be honest I find Oliver Mackenzie and to a point even Richard and John to be fairer in their assessment.
In reality, I think it's just Alex who's a really poor fit for the team because the man has reduced himself to a talking advertisement.

adroc_thurston · May 2, 2024

ToTTenTranz said:
How low can AMD's marketshare on consumer dGPUs go before they bow out of that market, though?

0%.

ToTTenTranz said:
Because I don't remember it ever being as bad as the last couple years. At 10% in mid 2022 they were pretty close to irrelevancy.

They do it for kicks.

ToTTenTranz · May 2, 2024

Saylick said:
And die costs as well (allegedly). Just look at how much room that Tensor Core takes up. It's like the same size as the CUDA cores.

Where did that description come from? Back when we had Turing chips with and without RT and tensor cores (e.g. TU106 vs TU116), it was found that the tensor cores were making the SM ALUs only 14% bigger.
Surely the tensor cores are taking a much larger proportion of the block in Ada than they were in Turing, but I didn't expect those - and the RT cores - to make the SM ALUs almost triple in size.

Timorous · May 2, 2024

ToTTenTranz said:
How low can AMD's marketshare on consumer dGPUs go before they bow out of that market, though?

They make APUs and console stuff. At the very least they will be designing new architectures so quite a chunk of the cost is already sunk so at that point why not spread it across more products to reduce the overhead per unit.

ToTTenTranz · May 2, 2024

Timorous said:
They make APUs and console stuff. At the very least they will be designing new architectures so quite a chunk of the cost is already sunk so at that point why not spread it across more products to reduce the overhead per unit.

Because even if your architecture is done and the driver team already exists for the iGPUs and all they're doing on the dGPU team is adding modules to the simulator, fabbing each new chip costs a ton of money.
And if AMD's marketshare is so low they can't recoup that money, Lisa's fiduciary duty is to shut the whole thing down.

Thunder 57 · May 2, 2024

ToTTenTranz said:
How low can AMD's marketshare on consumer dGPUs go before they bow out of that market, though?
Because I don't remember it ever being as bad as the last couple years. At 10% in mid 2022 they were pretty close to irrelevancy.

I'm specifically talking about consumer dGPUs, because on iGPUs (PC and console SoCs) and server they seem healthy enough. People keep hoping for AMD to get stronger options so they can buy cheaper Nvidia cards. This is a real thing, I've seen it first hand with friends.

Alex / Dictator is a massive propagandist for nvidia and it's honestly hard to believe he's not on the payroll, but to be honest I find Oliver Mackenzie and to a point even Richard and John to be fairer in their assessment.
In reality, I think it's just Alex who's a really poor fit for the team because the man has reduced himself to a talking advertisement.

It's a very real thing, and sad too, when the 3050 outsold the far better performing RX 6600, like the GTX 1050 Ti did to the RX 580 before that.

When bozos are just doing that, you have to ask whether AMD should even try to stay in the desktop market. If I were AMD I'd be tired of the ingrates and be tempted to take my ball and go home.

adroc_thurston · May 2, 2024

ToTTenTranz said:
And if AMD's marketshare is so low they can't recoup that money, Lisa's fiduciary duty is to shut the whole thing down.

Lisa's duty is to win the most MSS at highest margins.

Saylick · May 2, 2024

ToTTenTranz said:
Where did that description come from? Back when we had Turing chips with and without RT and tensor cores (e.g. TU106 vs TU116), it was found that the tensor cores were making the SM ALUs only 14% bigger.
Surely the tensor cores are taking a much larger proportion of the block in Ada than they were in Turing, but I didn't expect those - and the RT cores - to make the SM ALUs almost triple in size.

It came from a Twitter poster who analyzes and annotates a bunch of die shots. I did not perform the annotation myself, hence why I said "allegedly".

https://twitter.com/x/status/1784611359608680563

jpiniero · May 2, 2024

adroc_thurston said:
Lisa's duty is to win the most MSS at highest margins.

They're not exactly doing a good job with it with the dGPUs.

gdansk · May 2, 2024

Thunder 57 said:
It's a very real thing, and sad too, when the 3050 outsold the far better performing RX 6600, like the GTX 1050 Ti did to the RX 580 before that.

When bozos are just doing that, you have to ask whether AMD should even try to stay in the desktop market. If I were AMD I'd be tired of the ingrates and be tempted to take my ball and go home.

They have no obligation to buy Radeon. Who would buy an RX 6600 instead of the slightly more expensive RTX 3050? Maybe it looks better on your fancy frametime charts and efficiency graphs but that's meaningless. RTX 3050 has CUDA, DLSS and RTX. Three killer features that make modern AI/gaming possible. That Nvidia is kind enough to sell the RTX 3050 for a measly $200 in 2024 is an act of charity.

adroc_thurston · May 2, 2024

jpiniero said:
They're not exactly doing a good job with it with the dGPUs.

RDNA3 is doing pretty well so no?
Lmao.

gdansk said:
RTX 3050 has CUDA, DLSS and RTX. Three killer features that make modern AI/gaming possible. That Nvidia is kind enough to sell the RTX 3050 for a measly $200 in 2024 is an act of charity.

Bingo, it's the shill™ factor.

gdansk · May 2, 2024

jpiniero said:
They're not exactly doing a good job with it with the dGPUs.

It's sad, I wonder how long Lisa will keep up this charity case. Intel sells 1/15th as many units at even lower ASPs, so they should have bailed from the dGPU market approximately three years ago.

cherullo · May 2, 2024

Saylick said:
Interesting. In my admittedly very limited research into understanding BVH structures better, I am of the opinion that BVH8 is inferior to BVH4 so I have to wonder why BVH8 is being pursued here. From what I understand, there is a sweet spot to selecting the branching factor and you get negative returns with higher and higher branching factors, meaning you have to do more ray-box intersection tests before you dig down through to the next layer. Maybe ray-box intersection units are cheaper from a silicon usage perspective over ray-triangle intersection units...

Yes, ray-box intersection is much cheaper than ray-triangle intersection.
Regardless of the specific implementation, the ray-box intersection only has to detect whether there is an intersection and the hitpoint's distance from the ray's origin.
On the other hand, the triangle-ray intersection must generate these, plus the hitpoint's barycentric coordinates and the surface normal.

Well, if the unit can support both BVH4 and BVH8 then maybe the game/driver can choose which to build. Maybe the unit can act as 2xBVH4, checking two independent rays simultaneously.
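To make the cost difference concrete, here is a rough software sketch of both tests. It is not how any RDNA or Nvidia unit is actually wired up; it assumes the textbook slab test for the box and Möller-Trumbore for the triangle, and every function name here (`ray_box`, `ray_triangle`, and the small vector helpers) is made up for illustration. Whether real hardware emits the normal itself or leaves it to the shader is also an assumption; it is computed here only to show the extra output.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_box(origin, inv_dir, box_min, box_max):
    """Slab test: returns (hit, t_near). Only subtracts, multiplies, min/max.

    inv_dir is the per-axis reciprocal of the ray direction, precomputed once
    per ray (use math.inf for a zero component).
    """
    t_near, t_far = 0.0, math.inf
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t0, t1 = (lo - o) * inv, (hi - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far, t_near

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore: returns (t, u, v, normal) or None on a miss.

    Needs two cross products, several dot products and a divide, and it
    produces the barycentrics (u, v) plus an unnormalized geometric normal.
    """
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    if t < 0.0:                 # hit is behind the ray origin
        return None
    return t, u, v, cross(e1, e2)
```

The box test returns two values and never divides; the triangle test does noticeably more arithmetic and returns more data, which is the gap described above.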

adroc_thurston · May 2, 2024

cherullo said:
Maybe the unit can act as 2xBVH4, checking two independent rays simultaneously.

Hence why two slots.
:3

Saylick · May 2, 2024

cherullo said:
Yes, ray-box intersection is much cheaper than ray-triangle intersection.
Regardless of the specific implementation, the ray-box intersection only has to detect whether there is an intersection and the hitpoint's distance from the ray's origin.
On the other hand, the triangle-ray intersection must generate these, plus the hitpoint's barycentric coordinates and the surface normal.

Well, if the unit can support both BVH4 and BVH8 then maybe the game/driver can choose which to build. Maybe the unit can act as 2xBVH4, checking two independent rays simultaneously.

adroc_thurston said:
Hence why two slots.
:3

The flexibility to use BVH4 or BVH8 almost seems analogous to how RDNA can operate in wave32 or wave64 mode. Also explains the doubling in intersection performance over RDNA 2/3, too.
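For the branching-factor trade-off, a back-of-the-envelope model helps. This is a heavily idealized sketch: it assumes a full, balanced tree and a ray that descends through exactly one child per level, which real traversal does not do. The function name `single_path_cost` is invented for this example.

```python
def single_path_cost(num_leaves, branching):
    """Return (levels visited, box tests) for one root-to-leaf descent.

    Assumes a full balanced tree where every visited node tests all of its
    `branching` child boxes and the ray continues into exactly one of them.
    """
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth, depth * branching

for b in (4, 8):
    levels, tests = single_path_cost(4096, b)
    print(f"BVH{b}: {levels} node visits, {tests} ray-box tests")
```

Under these assumptions, 4096 leaves cost 6 node visits and 24 box tests with BVH4, versus 4 node visits and 32 box tests with BVH8. So the wider tree does more ray-box work per ray but fetches fewer nodes, which would fit with box tests being cheap in silicon while each node fetch is a trip to memory.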

adroc_thurston · May 2, 2024

Saylick said:
Also explains the doubling in intersection performance over RDNA 2/3, too.

really not the selling point

soresu · May 2, 2024

jpiniero said:
What would scream cheap is if they used Samsung.

Samsung may be behind on pitch, but they are ahead on device type as TSMC are still using finFET while Sammy are starting up a 2nd gen nanosheet/GAAFET 3nm process node this year.

Haven't seen much data on how well the 1st gen process compares to TSMC 4nm or 3nm finFET as yet.
