Discussion RDNA4 + CDNA3 Architectures Thread

Page 86 - AnandTech community

DisEnchantment

Golden Member
Mar 3, 2017
1,615
5,869
136





With the GFX940 patches in full swing since first week of March, it is looking like MI300 is not far in the distant future!
Usually AMD takes around 3 quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe it's because the US Govt is starting to prepare the SW environment for El Capitan (maybe to avoid a slow bring-up situation like Frontier, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,364
2,855
136
If they are using 18Gbps memory for RDNA4, then it means it's good enough.
OK, it's a bit surprising, considering RDNA4 should have noticeably higher clocks and N44 even a few more CUs.

Honestly, I am more interested in the amount of VRAM than its speed.
Even if N48 is >25% faster than 7600XT, with only 8GB VRAM it will be a FLOP, unless the price is low or there is a clamshell version with 16GB VRAM for a decent price.
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
For a PPA connoisseur like myself I think it is kinda neato.
Gaming segment revenue needs an injection due to consoles sales dropping off a cliff, this does the trick and should increase margins overall, naturally this means a market share gain back to old historical levels to set the foundation for the big boy generation after.
Again, nobody's saying the arch is bad.
The PRODUCT is a completely unambitious, uninteresting thing.
We have waited throughout all RDNA 3 being a dog, an unrepaired, broken dog, only for AMD to quietly pretend it never happened and replace it with RDNA 3.5 in future APUs.
Then RDNA 4 comes out, fixes the power problems, adds a necessary BVH walker, and gets a solid arch going.
And the biggest die in the lineup is a 240mm² one. Even the 5700 xt was 251mm² and lasted from July 2019 till Oct 2020 where the 6900 xt replaced it.

So we've waited for a RDNA 3 fix that didn't come.
Followed by a generation whose most ambitious die is the size of a 5700 xt or 6600 xt and probably will be called "8800 xt" because shrinkflation and ambitions in name only are rife at AMD.
All to wait another year and a half for an actual ambitious gen. If it doesn't also get gimped or cancelled in some way.

The excuse for all of this is "market conditions" which doesn't seem to prevent NV from shipping 610mm² 4090s (even if they're very cut down) or to at least sell a 4080 with its 379mm² die.
The fact is that you can't be chasing the lowest cost, and make a really unambitious product, and go around saying that you're trying to make the best product, all the while the competition is just pumping fat monolithic dies and selling them at probably equal or worse margins (seems to me like there's a lot less cost cutting at Nvidia than AMD), and just sells.

N48 is probably going to wind up being in the same market position as the 6800 or 7800 xt, roughly same pricing and goals. And it will be, in every way, shrinkflation. Same VRAM buffer, same bus, same everything since RDNA 2, but you get a smaller and smaller die every time.
And all it'll do is compete with last gen AMD in Raster, worse than XTX even, and last gen Nvidia in Raytracing. Because just having some actual ambition and making even just a 350mm² die that would eat into 4090 performance territory and gulp 300W is just too much to ask.

I'm not even asking for a really big and expensive die here, just "not penny pinching to the point that you have become the shrinkflation company". They could have absolutely planned for a 300-350mm² die, even monolithic, that would have been $800 or $1000. This gen is just another "take your ticket and wait until we deliver in 2 years". Again.
 
Aug 4, 2023
176
373
96
Again, nobody's saying the arch is bad.
The PRODUCT is a completely unambitious, uninteresting thing.
We have waited throughout all RDNA 3 being a dog, an unrepaired, broken dog, only for AMD to quietly pretend it never happened and replace it with RDNA 3.5 in future APUs.
Then RDNA 4 comes out, fixes the power problems, adds a necessary BVH walker, and gets a solid arch going.
And the biggest die in the lineup is a 240mm² one. Even the 5700 xt was 251mm² and lasted from July 2019 till Oct 2020 where the 6900 xt replaced it.
It does what it needs to do.
So we've waited for a RDNA 3 fix that didn't come.
Followed by a generation whose most ambitious die is the size of a 5700 xt or 6600 xt and probably will be called "8800 xt" because shrinkflation and ambitions in name only are rife at AMD.
All to wait another year and a half for an actual ambitious gen. If it doesn't also get gimped or cancelled in some way.
Time to market is why all but these 2 got canned, gotta reset the execution machine.
The excuse for all of this is "market conditions" which doesn't seem to prevent NV from shipping 610mm² 4090s (even if they're very cut down) or to at least sell a 4080 with its 379mm² die.
The fact is that you can't be chasing the lowest cost, and make a really unambitious product, and go around saying that you're trying to make the best product, all the while the competition is just pumping fat monolithic dies and selling them at probably equal or worse margins (seems to me like there's a lot less cost cutting at Nvidia than AMD), and just sells.
This GPU generation was weak overall, next gen frankly won't be much different due to a lack of things worth upgrading for unless you buy the biggest NV part for the latest memeware.
N48 is probably going to wind up being in the same market position as the 6800 or 7800 xt, roughly same pricing and goals. And it will be, in every way, shrinkflation. Same VRAM buffer, same bus, same everything since RDNA 2, but you get a smaller and smaller die every time.
And all it'll do is compete with last gen AMD in Raster, worse than XTX even, and last gen Nvidia in Raytracing. Because just having some actual ambition and making even just a 350mm² die that would eat into 4090 performance territory and gulp 300W is just too much to ask.
Execution, time to market, capiche?
I'm not even asking for a really big and expensive die here, just "not penny pinching to the point that you have become the shrinkflation company". They could have absolutely planned for a 300-350mm² die, even monolithic, that would have been $800 or $1000. This gen is just another "take your ticket and wait until we deliver in 2 years". Again.
If a HD 4000 style mainstream perf/$ stepchange isn't good enough for you then so be it. Trinity takes a while to build, okay?
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
Watching Radeon is like being a Sonic the Hedgehog fan.

Good into failure into reboot.
Which turns good into failure into reboot.

Polaris to Vega to RDNA.
To RDNA 2 to RDNA 3 to RDNA 4.

Rinse and repeat.
When's the Sonic 2006 of the series that forces a hard reset is my only question.
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,140
1,089
136
I agree with AMD acting or being cheap. Trying to penny pinch and maintain margins is hurting their product line. It seems quite clear in 2024 that the original speculation about RDNA 3 being broken (having a partially functioning die) at release was correct. AMD should release an RDNA 3.5. I think AMD would charge hefty prices for late GPUs that should have been.

At least Intel admitted Alchemist was partly broken. Certain things they cannot fix because of hardware design issues. That is to be expected for a 1st generation GPU. Intel is doing software workarounds to make the GPUs function as well as they can with the existing hardware. I read that Intel would have everything fixed with Battlemage.

AMD has a problem with their graphics cards doing nothing but raster performance well. Nvidia and Intel cards do a lot more than AMD cards when it comes to encoding. Hopefully AMD can turn it around with RDNA 4.
 
Reactions: SmokSmog
Aug 4, 2023
176
373
96
Watching Radeon is like being a Sonic the Hedgehog fan.

Good into failure into reboot.
Which turns good into failure into reboot.

Polaris to Vega to RDNA.
To RDNA 2 to RDNA 3 to RDNA 4.

Rinse and repeat.
When's the Sonic 2006 of the series that forces a hard reset is my only question.
Look back to the ATi days, they actually won outright.
The cycle of so close so far can be broken now because AMD has all the money and resources needed to execute their plans.
I agree with AMD acting or being cheap. Trying to penny pinch and maintain margins is hurting their product line. It seems quite clear in 2024 that the original speculation about RDNA 3 being broken (having a partially functioning die) at release was correct. AMD should release an RDNA 3.5. I think AMD would charge hefty prices for late GPUs that should have been.
Forget about the past, this is the last time you are getting a partial gen.
At least Intel admitted Alchemist was partly broken. Certain things they cannot fix because of hardware design issues. That is to be expected for a 1st generation GPU. Intel is doing software workarounds to make the GPUs function as well as they can with the existing hardware. I read that Intel would have everything fixed with Battlemage.
Nothing about it is broken, just a bad design with terrible PPA. It is basically Vega.
AMD has a problem with their graphics cards doing nothing but raster performance well. Nvidia and Intel cards do a lot more than AMD cards when it comes to encoding. Hopefully AMD can turn it around with RDNA 4.
Non-raster primary workloads remain a small portion of the client dGPU market, that isn't going to change until 10th gen consoles frankly.
Uncore IP has already improved a lot since Xilinx arrived, all cases shall be covered in order of TAM.
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
Nothing about it is broken, just a bad design with terrible PPA. It is basically Vega.
Hold on, you think Vega was bad?
Didn't they keep it going for about 4 years in multiple APUs?
Non-raster primary workloads remain a small portion of the client dGPU market, that isn't going to change until 10th gen consoles frankly.
Agreed, but the PS5 Pro is precisely going to be the stepping stone into that future. Not that it'll change the market per se, but it's a warning shot for Raster being on the way out IMO, even if it'll take another 4-6 years before it really starts being phased out in favour of full RT.
 
Aug 4, 2023
176
373
96
All sorts of configs get tested before final production configs are decided, this was probably the limit of what they could do and decided that the memory wasn't an issue.
Hold on, you think Vega was bad?
Didn't they keep it going for about 4 years in multiple APUs?
Alchemist has a bunch of GCN type limitations regarding stuff like occupancy.
Vega was alright for APUs because they had no real comp, but today it is horrific.
Agreed, but the PS5 Pro is precisely going to be the stepping stone into that future. Not that it'll change the market per se, but it's a warning shot for Raster being on the way out IMO, even if it'll take another 4-6 years before it really starts being phased out in favour of full RT.
Well it is also Sony continuing along the path they established with checkerboarding and the like.
Sony kinda do their own thing, they don't have to obey the PC API lords.
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
It does what it needs to do.

Time to market is why all but these 2 got canned, gotta reset the execution machine.
So let me get the timeline straight with your story here.

In 2016 AMD was 8000 employees (basically got on an extreme survivor diet).
I'm going to assume 2000 or less than that was at RTG.

They made Vega as the holdover before they got into next gen RDNA 1/2/3 and beyond.
Vega wasn't very good.

RDNA 1 was good but small.
RDNA 2 was the first big one since before Vega (before Polaris even). And it was good too.
RDNA 3 was meant to be the ultra cost effective, small CU, many returns gen. It got screwed up.

Where in that storyline did they "need to reset the execution machine" exactly?
From my seat here, it's pretty much a straightforward growth from the starving out pre-Zen era to RDNA 1/2/3. I don't see a moment where some terrible mistake in the machine was injected. It's not like Intel that would be almost better off downsizing by half and would actually work better. Smol AMD went bigger, Smol RTG within it also being a slow growth.

So either you're saying that the "machine" was broken long before this, so we're actually jumping back to the Bulldozer/Islands era or even before that, and the execution machine was pretty much screwed since 2012-2014, or I don't see how the machine required sacrificing the full post-RDNA 3 era + half of the lineup for RDNA 4 because "time to market needed that much, capiche"?

I capiche all you want that the constraints are there, I don't see how you wind up with something so broken that it takes 2+ years worth of work to fix. Not unless the problems were extremely deep and probably older than post Zen 1. RDNA 3 had an "electrical problem" that should've been caught and wasn't, it doesn't sound like a 2 year long problem to fix. And if it was a bigger problem than that, I don't see how it could've been recent when the corpo had already downsized to next to nothing. Unless the management was just terrible at RTG since Vega all the way to RDNA 3's release and RDNA 1/2 were just lucky.

If a HD 4000 style mainstream perf/$ stepchange isn't good enough for you then so be it. Trinity takes a while to build, okay?
Again, not a price/perf problem. An ambition problem.
 
Reactions: Tlh97

leoneazzurro

Senior member
Jul 26, 2016
933
1,478
136
RDNA3, at least by what AMD itself let slip (and some other people, too), missed the clock/power targets substantially. Otherwise it would not have been a bad part in the PPA department. IF (and that is a big IF) Strix Halo would be able to run 16 Zen5 cores + 40 CU of RDNA 3.5 with a clock >>2 GHz in a 125W envelope, then that means that the idea was not bad, but the realization was.
 
Reactions: Tlh97
Aug 4, 2023
176
373
96
Where in that storyline did they "need to reset the execution machine" exactly?
RDNA3, and it is not like the whole thing needs to be gutted and rebuilt, just a reset of some things.
I capiche all you want that the constraints are there, I don't see how you wind up with something so broken that it takes 2+ years worth of work to fix.
It took about 1 year to complete the postmortem and make the necessary changes, RDNA3.5/4 were not affected by what went wrong with RDNA3, that was a higher level decision based on competition and the market.
Again, not a price/perf problem. An ambition problem.
MI300 is pretty ambitious, and the same paradigm will come to RDNA so buckle up.
Or they are terrified of what the ASPs are going to be and are penny pinching, esp when it's not as fast as the leakers want it to be.
Day 1 reviews will probably contain some memory overclocking results so the truth will come out. Assuming AMD doesn't lock it down even more.
RDNA3, at least by what AMD itself let slip (and some other people, too), missed the clock/power targets substantially. Otherwise it would not have been a bad part in the PPA department. IF (and that is a big IF) Strix Halo would be able to run 16 Zen5 cores + 40 CU of RDNA 3.5 with a clock >>2 GHz in a 125W envelope, then that means that the idea was not bad, but the realization was.
Good thing 3.5 was a thing before RDNA3 was known to be a dud.
Risk management paid off.
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
Or they are terrified of what the ASPs are going to be and are penny pinching, esp when it's not as fast as the leakers want it to be.
I don't see how that works, it's GDDR6. Is 21.5Gbps GDDR6 that much more expensive than 18Gbps GDDR6?
How many pennies can you even pinch there? 10 dollars per batch of 16GB?
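For scale, here is a rough back-of-envelope sketch of what the two speed grades would buy in raw bandwidth. Peak bandwidth is just per-pin data rate times bus width, divided by 8 to get bytes. The 256-bit bus is purely an assumption for illustration (nothing in this thread confirms the bus width), and `gddr_bandwidth_gbs` is a made-up helper name.

```python
def gddr_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate (Gbit/s) * bus width (bits) / 8."""
    return pin_rate_gbps * bus_width_bits / 8


# ASSUMPTION: 256-bit bus, chosen only for illustration, not a confirmed spec.
BUS_WIDTH_BITS = 256

slow = gddr_bandwidth_gbs(18.0, BUS_WIDTH_BITS)   # 18 Gbps GDDR6
fast = gddr_bandwidth_gbs(21.5, BUS_WIDTH_BITS)   # 21.5 Gbps GDDR6
uplift_pct = (fast / slow - 1) * 100

print(f"18 Gbps:   {slow:.0f} GB/s")
print(f"21.5 Gbps: {fast:.0f} GB/s")
print(f"uplift:    {uplift_pct:.1f}%")
```

Under that assumption the faster bin is worth roughly 19% more raw bandwidth. How much of that shows up as frame rate depends on the cache, which nobody here knows yet.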
 

leoneazzurro

Senior member
Jul 26, 2016
933
1,478
136
RDNA4 memory controller supports GDDR7 too. "Slow" GDDR6 is a choice (likely cost).
Quite probably cost/benefit analysis. If GDDR7 would have raised performance by 5% but the cost of the board would have gone up 10%, then it was probably deemed not worth it. Especially when many people can overclock. Also, this may leave space for some custom version with higher rated GDDR6 or even GDDR7 memory.
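To put numbers on that hypothetical, a minimal sketch: the 5% and 10% figures are the made-up ones from the example above, not real BOM data, and `perf_per_cost_change` is a hypothetical helper name.

```python
def perf_per_cost_change(perf_gain: float, cost_increase: float) -> float:
    """Relative change in performance per unit of board cost.

    Both inputs are fractions (0.05 means +5%).
    """
    return (1 + perf_gain) / (1 + cost_increase) - 1


# Hypothetical figures from the example: +5% performance, +10% board cost.
change = perf_per_cost_change(0.05, 0.10)
print(f"perf per cost changes by {change * 100:.1f}%")
```

With those inputs, performance per dollar of board cost drops by about 4.5%, which is the kind of trade that gets rejected on a cost-focused part.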
 
Aug 4, 2023
176
373
96
RDNA4 memory controller supports GDDR7 too. "Slow" GDDR6 is a choice (likely cost).
Probably a mix of cost and availability of more exotic bins (Samsung moment). GDDR7 support would've been intended for the dead parts.
Everything about this screams simple, affordable and available as soon as possible. If that means leaving a few % of perf on the table due to cheaping out on memory, so it shall be.
 

Mahboi

Senior member
Apr 4, 2024
337
566
91
RDNA3, and it is not like the whole thing needs to be gutted and rebuilt, just a reset of some things.
It took about 1 year to complete the postmortem and make the necessary changes,
Jayzuss Etch Craïst.
RDNA3.5/4 were not affected by what went wrong with RDNA3, that was a higher level decision based on competition and the market.
Wut? Are you seriously advancing that AMD looked at what Blackwell could be and thought "mmmmh yes, 240mm² die, that'll do"?
The market hasn't rejected $1000 cards either, they're just annoyed at the cost not being followed by an adequate growth in performance from either side. And on the NV side they plainly ruined the pricing because they wanted to redirect dies to AI.
MI300 is pretty ambitious, and the same paradigm will come to RDNA so buckle up.
Yes MI300 is a monster of engineering, no questioning that, but I'm a little doubtful about the reasoning around it.
It feels to me like a product where Nvidia created an AI market and giant bubble, and AMD's decision was to create a compute battlecruiser with as much as they could cram in it. It works for giant corpos with deep pockets. I just question how much of this paradigm can really be translated to client. Chiplets and advanced packaging have a cost in and of themselves and more importantly, you can cram 6 150W GCXs to a RX 9950 XTXTXTXTXTX, but you can't easily tell people to get the 1500W PSU, huge watercooling and fan noise that comes with it.
And AMD is already penny pinching with their cards now, I really doubt that they're going to become the ultra ambitious corpo that'll sell monster GPUs to clients. They have the skills, not the will. RDNA 4 is yet another example of it.
 
Aug 4, 2023
176
373
96
Wut? Are you seriously advancing that AMD looked at what Blackwell could be and thought "mmmmh yes, 240mm² die, that'll do"?
The market hasn't rejected $1000 cards either, they're just annoyed at the cost not being followed by an adequate growth in performance from either side. And on the NV side they plainly ruined the pricing because they wanted to redirect dies to AI.
No, the original plan was MI300 lite and down, something that has been shown in detail, but it would've been too late to market.
That lineup would've had the lower end parts priced higher than what they will be now due to the halo effect. RDNA4 now is a better lineup for 80%+ of consumers at the cost of AMD.
Yes MI300 is a monster of engineering, no questioning that, but I'm a little doubtful about the reasoning around it.
It feels to me like a product where Nvidia created an AI market and giant bubble, and AMD's decision was to create a compute battlecruiser with as much as they could cram in it. It works for giant corpos with deep pockets. I just question how much of this paradigm can really be translated to client. Chiplets and advanced packaging have a cost in and of themselves and more importantly, you can cram 6 150W GCXs to a RX 9950 XTXTXTXTXTX, but you can't easily tell people to get the 1500W PSU that comes with it.
And AMD is already penny pinching with their cards now, I really doubt that they're going to become the ultra ambitious corpo that'll sell monster GPUs to clients. They have the skills, not the will. RDNA 4 is yet another example of it.
MI300 is the first implementation of the Exascale Heterogeneous Processor, something AMD has described in detail for a decade.
This was something derived long before current market situations, 300A was the main part after all.
Spam ALUs, spam systolic arrays, spam SRAM, spam memory, spam SerDes and you win.
The more you can cram per unit area, you win.
Top RDNA5, as you are correct to assume, is not a justifiable product if it was solely for client. It is a GDDR accel card for DC as well, as bleeding edge HBM parts are only going to get more horrifically priced.
HBM btw was an AMD initiative, NV's explosive rise wouldn't really be possible without HBM, so yeah, maybe AMD can see the future after all.
Penny pinching is a choice, not necessary like it was in the past.

For the last time, Ms. Su likes her margins, and only having the best will yield said margins. And with the best, everyone will come on board, that is the ethos of AMD. The highest performance compute everywhere is the goal.
 

ToTTenTranz

Member
Feb 4, 2021
40
71
61
If they are using 18Gbps memory for RDNA4, then it means it's good enough.
OK, it's a bit surprising, considering RDNA4 should have noticeably higher clocks and N44 even a few more CUs.

Infinity Cache mixes things up with VRAM bandwidth. We don't know how much of it is in the chip nor how high it clocks, and that's making a world of difference.
One thing the 18Gbps spec tells us is once again another pointer that RDNA4 is going to be cost-focused.

If AMD sells a $500 card with RTX 4070 Ti raster+raytracing performance and 16GB VRAM in Q3 2024, they have a winner on their hands.


RDNA 3 was meant to be the ultra cost effective, small CU, many returns gen. It got screwed up.
I think RDNA3 was meant to actually compete in raw performance, but the voltage/clock curves in gaming workloads came up completely messed up.

My guess is AMD was counting on the N31 and N32 GCDs clocking >20% above what they ended up doing. That's why there were initial plans to put additional Infinity Cache via VCache over the MCDs, to get higher effective bandwidth through additional LLC to properly feed GCDs that were supposed to be much more demanding due to higher clocks.
 