Discussion RDNA4 + CDNA3 Architectures Thread

Page 98

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around 3Qs to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, there are a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

adroc_thurston

Diamond Member
Jul 2, 2023
3,647
5,269
96
Blackwell is going to smoke AMD and Intel.
No?
DC one is a bit over 2 times the perf at 2 times the Si and that's with DPFP MULs gutted everywhere.
Client will be even more dire.
Why would AMD use slow GDDR6 modules when faster GDDR6 is available?
Cost.
They're cheap mainstream dies.
Very capable, yes. But cheap and mainstream.
 
Reactions: Ghostsonplanets

Saylick

Diamond Member
Sep 10, 2012
3,605
8,075
136
No BVH8 in the code yet? The PS5 Pro has it.
There is no mention of BVH8 yet. But all BVH4 code is gone.
Interesting. In my admittedly very limited research into understanding BVH structures better, I am of the opinion that BVH8 is inferior to BVH4 so I have to wonder why BVH8 is being pursued here. From what I understand, there is a sweet spot to selecting the branching factor and you get negative returns with higher and higher branching factors, meaning you have to do more ray-box intersection tests before you dig down through to the next layer. Maybe ray-box intersection units are cheaper from a silicon usage perspective over ray-triangle intersection units...
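To put rough numbers on the branching factor trade-off, here is a minimal sketch. It assumes a perfectly balanced tree and that every child box of each visited node gets tested on a single root-to-leaf descent. The function name `descent_cost` and the one-million-leaf scene are made up for illustration, and real traversals visit more than one path.

```python
def descent_cost(num_leaves: int, branching: int) -> tuple[int, int]:
    """Return (node_visits, box_tests) for one root-to-leaf descent.

    Assumes a perfectly balanced tree in which every child box of each
    visited node is tested, i.e. a simplified worst case for one path.
    """
    depth, capacity = 0, 1
    while capacity < num_leaves:  # integer loop avoids float log rounding
        capacity *= branching
        depth += 1
    depth = max(depth, 1)
    return depth, depth * branching


if __name__ == "__main__":
    for b in (2, 4, 8, 16):
        visits, tests = descent_cost(1_000_000, b)
        print(f"BVH{b}: {visits} node visits, {tests} box tests")
```

Under these assumptions, a million leaves costs 10 node visits and 40 box tests with BVH4 versus 7 node visits and 56 box tests with BVH8: fewer dependent node fetches, but more ray-box tests per descent.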
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
Interesting. In my admittedly very limited research into understanding BVH structures better, I am of the opinion that BVH8 is inferior to BVH4 so I have to wonder why BVH8 is being pursued here. From what I understand, there is a sweet spot to selecting the branching factor and you get negative returns with higher and higher branching factors, meaning you have to do more ray-box intersection tests before you dig down through to the next layer. Maybe ray-box intersection units are cheaper from a silicon usage perspective over ray-triangle intersection units...
Without compression, BVH8 will spill over multiple cache lines and incur an access penalty, but with compression you can pack more information into a single cache line and basically improve intersection unit throughput.
I believe AMD will go for compression; it is cheaper to add logic gates than to take a trip to the MALL/next-level cache.

Another thing is that the intersection engine can still handle the uncompressed BVH4 structure as usual, which maintains compatibility with precompiled BVH structures, e.g. in consoles.
See this earlier post, which links to the patents.
The patent below could help you with box node compression:
View attachment 95606

This patent below contains handling both BVH4 and BVH8.
View attachment 95607

I think for prebuilt BVH trees the performance gain won't be much. For runtime-generated BVH trees it could be a big boost.
The cache and memory subsystem will be key.
For BVH4 it fetches two cache lines; for BVH8 it fetches a single cache line.
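A back-of-the-envelope sketch of the cache line argument above. Everything here is an assumption for illustration, not AMD's actual node format: a 128-byte cache line, FP32 min/max bounds plus a 4-byte pointer per child in the uncompressed case, and a hypothetical compressed layout with a shared FP32 origin, one 8-bit exponent per axis, and 8-bit quantized bounds per child.

```python
CACHE_LINE_BYTES = 128  # assumption: cache line size
FLOAT_BYTES = 4         # FP32
POINTER_BYTES = 4       # assumption: 32-bit child pointer


def cache_lines(n_bytes: int) -> int:
    """Number of cache lines needed to hold n_bytes (ceiling division)."""
    return -(-n_bytes // CACHE_LINE_BYTES)


def uncompressed_node_bytes(children: int) -> int:
    # Per child: min/max xyz as FP32 (6 floats) plus a child pointer.
    return children * (6 * FLOAT_BYTES + POINTER_BYTES)


def quantized_node_bytes(children: int) -> int:
    # Hypothetical layout: shared FP32 origin (3 floats), one 8-bit
    # exponent per axis, then per child 6 x 8-bit bounds plus a pointer.
    return 3 * FLOAT_BYTES + 3 + children * (6 + POINTER_BYTES)


if __name__ == "__main__":
    # Cache lines touched to test 8 child boxes in each scheme.
    print("2 x BVH4 nodes  :", 2 * cache_lines(uncompressed_node_bytes(4)))
    print("BVH8 raw        :", cache_lines(uncompressed_node_bytes(8)))
    print("BVH8 compressed :", cache_lines(quantized_node_bytes(8)))
```

With these made-up sizes, testing 8 boxes costs two cache lines either as two BVH4 nodes or as one uncompressed BVH8 node, while the quantized BVH8 node fits in one line. That is one way to read the two-lines-versus-one comparison.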



https://twitter.com/NIV_Anteru, who also presented the amazing work graphs demo recently, is the inventor of the patents below for generating a new kind of BVH tree.





It looks to me like this will only help runtime-generated trees.




There is also an updated way to perform intersection tests, using rotated boxes.
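For anyone unfamiliar with the idea, a rotated (oriented) box can be tested with the usual slab method by projecting the ray onto the box's own axes. This is a generic textbook sketch, not the method from the patent, and the function name `ray_hits_obb` and its parameters are made up for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ray_hits_obb(origin, direction, center, axes, half_extents):
    """Slab test of a ray against an oriented bounding box.

    axes: three orthonormal vectors (the box's local x, y, z in world space).
    half_extents: half the box size along each of those axes.
    Returns True if the ray (t >= 0) intersects the box.
    """
    rel = [o - c for o, c in zip(origin, center)]
    t_near, t_far = 0.0, float("inf")
    for axis, half in zip(axes, half_extents):
        o = dot(rel, axis)        # ray origin in the box's local frame
        d = dot(direction, axis)  # ray direction in the box's local frame
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: miss if it starts outside it.
            if abs(o) > half:
                return False
            continue
        t0, t1 = (-half - o) / d, (half - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

The appeal of a rotated box is that it can wrap thin diagonal geometry much more tightly than an axis-aligned box, at the cost of the extra projections per test.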


I doubt the PS5 will get all of these with its weak CPU (if the BVH generation is not done on the GPU).
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
It just looks very different from RDNA2 and 3 RT.
gfx10/11:
View attachment 98209
gfx12:

View attachment 98211
It looks like the per-thread reporting of intersection test results is gone. Likely the traversal is offloaded, but I am wondering what happens to the current precompiled shaders.

The few lines below are interesting: they cull boxes and triangles before the traversal/intersection tests are done.
If this is actually what I think it is, then it is indeed very interesting to say the least.
 

KompuKare

Golden Member
Jul 28, 2009
1,196
1,506
136
Headline writers!

Their actual article isn't that bad, as they mainly go on about console sales being down, which is the main driver of the revenue decline (see the AMD financial results thread), but the headline writers can't help the clickbait dGPU title. Not that Radeon sales are great, but even if RDNA4 launches soon it cannot hold up the whole division.

Some of the info from AMD's CFO does imply no PS5 Pro this year.
 
Reactions: Tlh97 and marees

soresu

Diamond Member
Dec 19, 2014
3,273
2,549
136
Reactions: marees

marees

Senior member
Apr 28, 2024
507
565
96
The solution to diminishing revenue is not to withdraw plans for new products.

That just means that the gap between them and the competition will grow even wider across all segments.
The 3 things that make me worried that could delay RDNA 4 are:

1) AMD is readying FSR 4 with AI upscaling & wants to launch RDNA 4 along with it
2) overstock of RDNA 3 (should not have happened if AMD had learnt lessons from RDNA 2 overstock debacle)
3) unsold RDNA 2 — for ex: the 6800 selling for $360 in NewEgg. (Should not be a major concern as most Navi 21 is sold out.)

On the pro side:
1) the 7900 GRE is being discounted to $510. Surely it is more profitable to sell Navi 48 for $550-$600 (unless there are tons of 7900 GRE / Navi 31 sitting around ??)
2) the 7700 XT has already been discounted to $350 a few times. That means AMD can sell both Navi 44 (=4060 Ti for $300) & Navi 48 (=7900 XT/4070 Ti Super for $550 to $600), even if Navi 32 remains in stock
 
Jul 27, 2020
20,419
14,087
146
The 3 things that make me worried that could delay RDNA 4 are:

1) AMD is readying FSR 4 with AI upscaling & wants to launch RDNA 4 along with it
2) overstock of RDNA 3 (should not have happened if AMD had learnt lessons from RDNA 2 overstock debacle)
3) unsold RDNA 2 — for ex: the 6800 selling for $360 in NewEgg. (Should not be a major concern as most Navi 21 is sold out.)

1) LOL. Yeah right. As if they released FSR3 on time...

2) Possible, especially if they figured the 7900 GRE would sell well in China and related markets.

3) So what if they are sitting on half a million RDNA2 chips? They are still decent chips and won't get obsolete any time soon. RTG folks are the masters of rebranding old /"not quite fit for purpose" stuff (6500 XT anyone?). Not unless some new DirectX version makes RDNA2 look like a really bad value proposition.
 

jpiniero

Lifer
Oct 1, 2010
15,290
5,804
136
Headline writers!

Their actual article isn't that bad, as they mainly go on about console sales being down, which is the main driver of the revenue decline (see the AMD financial results thread), but the headline writers can't help the clickbait dGPU title. Not that Radeon sales are great, but even if RDNA4 launches soon it cannot hold up the whole division.

Does kind of imply that they aren't expecting much from RDNA4 either.
 

soresu

Diamond Member
Dec 19, 2014
3,273
2,549
136
1) AMD is readying FSR 4 with AI upscaling & wants to launch RDNA 4 along with it
They may announce or preview v4 with the RDNA4 announcement, but I would not expect them to hold the RDNA4 announcement specifically for that given they have announced features with GPUs before and not delivered them on time.

HYPR-RX or something like it comes to mind.
 

linkgoron

Platinum Member
Mar 9, 2005
2,415
984
136
Does kind of imply that they aren't expecting much from RDNA4 either.
Why would they?

AMD cancelled their RDNA4 high-end because chiplets are a huge success, and Navi 48 will reportedly be at the ~4070ti(s) level, so it's not like anyone is waiting for it for new performance levels. Even if you want to go with AMD, you can already buy a 7900GRE today for that kind of performance. Radeons are not getting many design wins in laptops, and this probably won't change with RDNA4. RTX Ada will arrive relatively soon after RDNA4, so people will probably wait and see where that will fall as well. Really, there's not much going for RDNA4.
 
Reactions: marees

marees

Senior member
Apr 28, 2024
507
565
96
They may announce or preview v4 with the RDNA4 announcement, but I would not expect them to hold the RDNA4 announcement specifically for that given they have announced features with GPUs before and not delivered them on time.

HYPR-RX or something like it comes to mind.
I was thinking, if FSR 4 was ready, then AMD could upsell the navi 48 for $600

Otoh, if 18Gbps memory speed is true, then that looks pretty unambitious unless AMD expects board partner cards to be overclocked out of the gate 🤔
 