Discussion RDNA 5 / UDNA (CDNA Next) speculation


soresu

Diamond Member
Dec 19, 2014
3,850
3,232
136
Sure it is already supported. But if the GPU has enhanced dynamic execution abilities (reordering, out of order, dynamic allocation, dynamic wavefront sizes) it should get more powerful.
True, but that holds for (I hope) every type of programming model if the µArch is properly designed to be future-proof.
 

basix

Member
Oct 4, 2024
122
252
96
I find it interesting that MI450X is supposed to get its own LPDDR memory pool. I have speculated for a while now that this could be introduced with CDNA4 or CDNA5 and would bring a big benefit. 8x LPDDR packages would also fit on an OAM PCB; there is enough space for that.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,103
4,514
106
I find it interesting that MI450X is supposed to get its own LPDDR memory pool. I have speculated for a while now that this could be introduced with CDNA4 or CDNA5 and would bring a big benefit. 8x LPDDR packages would also fit on an OAM PCB; there is enough space for that.

Is that in addition to or as a replacement for HBM?
 

Kronos1996

Member
Dec 28, 2022
69
112
76
CDNA 5 ?

9.0.0 -> 9.0.6 -> 9.0.8 -> 9.0.10 -> 9.4.2 -> 9.5.0 -> 12.5?

This implies it’s built on the RDNA 4 ISA? That would be sooner than expected, I figured RDNA 5 at the earliest. They’ve been adding architecture features useful for AI and datacenter since RDNA 3. I figured that was their game plan, slowly expand the RDNA ISA until it’s ready. If they think it’s ready, who am I to argue?

With the AI race heating up, maybe they decided to speed-run things. A modern ISA should bring very nice PPA improvements, and RDNA’s cache design is world-leading.
 
Reactions: Gideon

adroc_thurston

Diamond Member
Jul 2, 2023
5,806
8,145
96
That would be sooner than expected, I figured RDNA 5 at the earliest. They’ve been adding architecture features useful for AI and datacenter since RDNA 3. I figured that was their game plan, slowly expand the RDNA ISA until it’s ready. If they think it’s ready, who am I to argue?
It's the opposite, they un-ghetto'd DC CUs.
 

Kronos1996

Member
Dec 28, 2022
69
112
76
It's the opposite, they un-ghetto'd DC CUs.
You’ll have to elaborate on that for me.

My understanding is that they’re driving to a unified modular CU and ISA. Then just insert additional IP as appropriate for the target market. With GFX 9 having so much legacy baggage, it would seem prudent to use RDNA as the basis. I can’t see AMD throwing out everything for a clean-sheet design again.
 

Kepler_L2

Senior member
Sep 6, 2020
832
3,369
136
You’ll have to elaborate on that for me.

My understanding is that they’re driving to a unified modular CU and ISA. Then just insert additional IP as appropriate for the target market. With GFX 9 having so much legacy baggage, it would seem prudent to use RDNA as the basis. I can’t see AMD throwing out everything for a clean-sheet design again.
Seems to be a "fatter" version of RDNA4 with more compute/matrix throughput. RDNA5/gfx13 might not have that much in common.
 

Kronos1996

Member
Dec 28, 2022
69
112
76
They aren't, but ISAs will converge to a point.

What is even legacy baggage here.
GCN had terrible PPA in later iterations which had knock-on effects for efficiency of course. IIRC the memory subsystem was also pretty atrocious and caused a lot of issues getting full theoretical performance. I was under the impression CDNA still had to work around these problems despite improvements. Thanks to chiplets they more or less brute-forced the PPA problem.

RDNA is the exact opposite. Navi 10 was 25% smaller than Vega 20 while delivering similar gaming performance (HBM still gave the older card an advantage.) That’s a pretty impressive increase in PPA and efficiency due to the new architecture. Then of course RDNA 2 introduced the full realization of the new memory subsystem. Between Infinity Cache and 3D cache, AMD’s cache design teams are probably the best in the world.
 

adroc_thurston

Diamond Member
Jul 2, 2023
5,806
8,145
96
GCN had terrible PPA in later iterations which had knock-on effects for efficiency of course
No it didn't, baby Vega in Renoir/Cezanne was really dang good.
The memory subsystem was also pretty atrocious and caused a lot of issues getting full theoretical performance
No it was alright, just tricky to scale.
I was under the impression CDNA still had to work around these problems despite improvements
No.
Thanks to chiplets they more or less brute-forced the PPA problem.
You do understand that MI100 and MI200 are monodie, don't you.
They were super basic products and really competent at their job.
Navi 10 was 25% smaller than Vega 20 while delivering similar gaming performance (HBM still gave the older card an advantage.)
Yeah but Vega20 wasn't a good config for gaming, a 48CU with higher clocks would be less area and would do the same there.
It was an HPC part, the first one since Hawaii.
Then of course RDNA 2 introduced the full realization of the new memory subsystem
It just added MALL.
RDNA1 was the one that introduced the new memory subsystem.
 
Reactions: Tlh97

reaperrr3

Member
May 31, 2024
103
317
96
RDNA is the exact opposite. Navi 10 was 25% smaller than Vega 20 while delivering similar gaming performance (HBM still gave the older card an advantage.) That’s a pretty impressive increase in PPA and efficiency due to the new architecture.
The PPA improvement of RDNA1 was actually quite a disappointment.

VII was only 330mm², even though it had an overkill (for gaming) 4096-bit HBM2 interface, half-rate FP64 (which made the CUs bigger than they needed to be for a gaming card) and 64 CUs, even though the Vega 56 and the Fury cards before it had clearly shown that GCN scales poorly from 56 to 64 CUs (and only moderately from 48 to 56; there were some tests for that, too).

Basically, if you took Vega20, removed half-rate FP64 support, cut the HBM interface in half (but kept L2 the same size and went with the fastest available HBM), reduced the CUs to 56 or maybe even 48 like adroc suggested and clocked the thing just ~150-200 MHz higher, you'd end up with a chip of similar size and similar gaming perf as N10, at least in the games back then.

N10 should've had 48 CUs and twice as much L2, then it would've been better (the 40 CUs only made up 81mm² of the chip, so that would've only increased N10's size by like 10%).
But the way they configured N10, its PPA was so-so for a new uArch using N7, not much better than a gaming-focused Vega2 config would've been.
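
For anyone who wants to sanity-check the area numbers in this post, here is a rough back-of-the-envelope sketch. The 81mm² for 40 CUs and the 330mm² for Vega 20 come from the post itself; the ~251mm² total die size for N10 is an assumption taken from commonly quoted figures, not from this thread. The area of the doubled L2 is not given anywhere here, so it is deliberately left out.

```python
# Figures from the post: 40 CUs take ~81 mm² on Navi 10; Vega 20 (VII) is ~330 mm².
N10_CU_AREA_MM2 = 81.0
N10_CUS = 40
VEGA20_DIE_MM2 = 330.0

# Assumption (not stated in the thread): Navi 10 total die area of ~251 mm².
N10_DIE_MM2 = 251.0


def extra_cu_area(target_cus: int) -> float:
    """Area added by going from 40 CUs to target_cus, assuming linear scaling per CU."""
    per_cu = N10_CU_AREA_MM2 / N10_CUS
    return (target_cus - N10_CUS) * per_cu


# Hypothetical 48 CU Navi 10, CUs only (the doubled L2 is not modelled).
extra = extra_cu_area(48)
growth = extra / N10_DIE_MM2

# How much smaller Navi 10 is than Vega 20 under the assumed die size.
size_gap = 1.0 - N10_DIE_MM2 / VEGA20_DIE_MM2

print(f"extra CU area: {extra:.1f} mm²")
print(f"die growth from CUs alone: {growth:.1%}")
print(f"N10 smaller than Vega 20 by: {size_gap:.1%}")
```

Under these assumptions the eight extra CUs alone come to roughly 6-7% more die area, which leaves a few percent of headroom for the larger L2 before hitting the "like 10%" estimate above.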
 
Reactions: Tlh97

marees

Golden Member
Apr 28, 2024
1,173
1,691
96
AMD path tracing

Performant Path Tracing: Two patent filings about next-level adaptive decoupled shading (texture-space shading) that could be very important for making real-time path tracing mainstream: one spatiotemporal (how things in the scene change over time) and another spatial (focusing on the current scene). Both work together to prioritize shading resources on the most important parts of the scene by reusing previous shading results and lowering the shading rate when possible. IDK how much this differs from ReSTIR PTGI, but it sounds more comprehensive and generalized in terms of boosting FPS.

 

adroc_thurston

Diamond Member
Jul 2, 2023
5,806
8,145
96
I was under the impression that beyond using the significantly less bugged Vega v2 that those iGPUs also used some components from RDNA1?
They didn't, it's basically a chopped-off Vega20 with 1/16-rate DP FP.
Also Vega had no "bugs" besides the new internal shader stages. The IP just sucked. Until it didn't!
Kepler seems to be claiming UDNA has 256 ALUs per CU
no such thing as UDNA.
Could be applicable only for CDNA 5 & not RDNA 5, I think
yea.
ALU spam doesn't help you in client, and it helps even less with RTRT.
AMD has not shared anything about UDNA so far (nor have there been any leaks).
Because it does not exist.
Client and DC shader cores live on completely separate tracks.
The only thing that's happening is un-ghetto-ing of DC parts into a modern ISA with all the party tricks RDNA gained so far.
 