Discussion Nvidia Blackwell in Q4-2024 ?


MoogleW

Member
May 1, 2022
57
28
61
That's right, but I am skeptical it can get there with 64 MB of L2 and only 896 GB/s of memory bandwidth. That makes it seem like the 5080 would be 15-20% faster than the 4080 Super.
Have the sizes of the L0, L1 or L2 caches been leaked? One architectural improvement that has long been speculated (by kopite7kimi for Lovelace, where it didn't pan out, and by redgamingtech for this gen) is some sort of new inter-SM and inter-TPC connection that alleviates some L2$ use. That sounds like DSMEM or an evolution of it, a feature that is currently exclusive to Hopper. If I understand it right, it is supposed to let the SMs in a cluster (a GPC for H100, a TPC for GB202) share the data in their private caches without a round trip to the L2 cache.

These sorts of architectural improvements, plus higher cache clocks and increased L1$ and/or L2$, would allow GB203 to match the full potential of AD102 or get close. And since the 4090 itself is not the full potential of AD102, it makes sense for the 5080 to be slightly faster than the 4090, in my opinion.
 

Tigerick

Senior member
Apr 1, 2022
679
559
106
AGF, a credible GPU leaker whom Kopite7kimi follows, has something to say about the upcoming Blackwell:

Yeah, no delay. The flagship RTX-5090 and 5080 GPUs are coming by the end of the year.
 
Reactions: Mopetar and psolord

Tigerick

Senior member
Apr 1, 2022
679
559
106
Here is some interesting news.





Yeah, people seem confused about Blackwell's memory configuration and interfaces. To me there is no confusion, because once you know how to calculate the memory bandwidth of GDDR7 and GDDR6X on different memory bus widths, you can see how and why NV will choose the memory for the upcoming Blackwell GPUs.

The amount of L2 cache also plays an important role in performance uplifts, as shown in my speculation. Also, some people are wondering why I put the RDNA 4/5 GPUs beside the Blackwell series. AGF mentioned the reason: Jensen has to know about the upcoming RDNA4/5 to make sure the Blackwell series is competitive, especially since we know the whole RDNA5 lineup uses GDDR7. On my front page, I listed the upcoming Blackwell GPUs as having 75% of the memory bandwidth of RDNA5. Yes, the rest of the lineup will use GDDR6X, not GDDR7, because GDDR6X's total memory bandwidth is higher than GDDR7's, as shown in the last table.
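The bandwidth arithmetic referred to in this post can be sketched roughly as follows: peak bandwidth is the bus width times the per-pin data rate, divided by 8 to convert bits to bytes. The function name and the list of configurations are illustrative. The 21 Gbps and 23 Gbps rates are those of the RTX 4090 and RTX 4080 Super; the 28 Gbps GDDR7 rate is an assumption, picked because it is consistent with the 896 GB/s figure quoted earlier in the thread for a 256-bit bus.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    Each pin of the bus moves `data_rate_gbps` gigabits per second, so the
    whole bus moves bus_width_bits * data_rate_gbps gigabits per second.
    Dividing by 8 converts bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


# (label, bus width in bits, per-pin data rate in Gbps)
# The GDDR7 rows use an assumed 28 Gbps rate, not a confirmed spec.
configs = [
    ("384-bit GDDR6X @ 21 Gbps (RTX 4090)", 384, 21),
    ("256-bit GDDR6X @ 23 Gbps (RTX 4080 Super)", 256, 23),
    ("256-bit GDDR7 @ 28 Gbps (assumed)", 256, 28),
    ("512-bit GDDR7 @ 28 Gbps (assumed)", 512, 28),
]

for label, width, rate in configs:
    print(f"{label}: {memory_bandwidth_gb_s(width, rate):.0f} GB/s")
```

Under these assumptions a 256-bit GDDR7 card would have less raw bandwidth than the 4090, which is why the thread keeps coming back to cache size.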
 

MoogleW

Member
May 1, 2022
57
28
61
AGF, a credible GPU leaker whom Kopite7kimi follows, has something to say about the upcoming Blackwell:

View attachment 95279

Yeah, no delay. The flagship RTX-5090 and 5080 GPUs are coming by the end of the year.
I find it hard to believe Nvidia is going to miss a two-year deadline when they need less time than that to bring something to market. They themselves boasted that their new AI chip-floorplanning tools allow them to port Ampere to a new node in one year. Furthermore, all of a sudden, when AMD really tries, they are able to get RDNA5 out in a year, but Nvidia can't bring out a planned GPU in two years?

Any delay is easier for me to digest as a willful delay rather than a technical issue. And AI won't be a reason to delay GPUs when RTX was meant for AI in the first place. Jensen said so himself when launching Turing in 2018.
 

jpiniero

Lifer
Oct 1, 2010
14,686
5,316
136
These sorts of architectural improvements, plus higher cache clocks and increased L1$ and/or L2$, would allow GB203 to match the full potential of AD102 or get close. And since the 4090 itself is not the full potential of AD102, it makes sense for the 5080 to be slightly faster than the 4090, in my opinion.

Cache in general is stupidly expensive on N3E.

I think what will end up happening is that the Super models will have 32-34 Gbps 3 GB chips. That will get GB203 to 4090ish performance.
 

Tigerick

Senior member
Apr 1, 2022
679
559
106


I am not sure whether he is referring to the full AD102 die or the RTX-4090. Based on my calculation, the RTX-5080 20GB will be close to the RTX-4090 in rasterization performance. Meanwhile, the RTX-5080 24GB should be faster than the RTX-4090 in rasterization and RT.
 

SteinFG

Senior member
Dec 29, 2021
458
520
106
With the new info, I want to put my prediction for the blackwell chips
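Name     | SMs        | Width   | Memory | Price | Rough perf.
GB202    | 192        | 512 bit |        |       |
5090     | 172        | 512 bit | 32GB   | $1800 | ~4090 + 50%
GB203    | 96         | 256 bit |        |       |
5080     | 96         | 256 bit | 16GB   | $1000 | ~4090
5070 Ti  | 78         | 256 bit | 16GB   | $800  | >4080 Super
GB205    | 60 (guess) | 192 bit |        |       |
5070     | 60         | 192 bit | 12GB   | $600  | ~4070 TiS
5060 Ti  | 44         | 192 bit | 12GB   | $450  | ~4070 S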
It's probably going to be ~$10 per SM. Would be happy if this turns out to be true, and sad if they don't reach performance levels
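As a quick sanity check of the "~$10 per SM" rule of thumb, the sketch below compares it against the predicted prices in the table above. The SM counts and prices are copied from that table; the function name and the 5% tolerance mentioned afterwards are illustrative choices, not anything from the post.

```python
PRICE_PER_SM = 10  # dollars per SM, the rule of thumb from the post

# (card, SM count, predicted price in dollars), copied from the table above
predictions = [
    ("5090", 172, 1800),
    ("5080", 96, 1000),
    ("5070 Ti", 78, 800),
    ("5070", 60, 600),
    ("5060 Ti", 44, 450),
]


def price_from_sms(sm_count: int, price_per_sm: int = PRICE_PER_SM) -> int:
    """Price implied by a flat per-SM rate."""
    return sm_count * price_per_sm


for card, sms, predicted in predictions:
    estimate = price_from_sms(sms)
    # Relative gap between the rule-of-thumb price and the predicted price
    gap = (predicted - estimate) / predicted
    print(f"{card}: {sms} SMs -> ${estimate} by rule, "
          f"${predicted} predicted ({gap:.1%} gap)")
```

Every predicted price in the table lands within 5% of the flat per-SM estimate, so the rule does describe the table, whatever its predictive value turns out to be.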

edited: the bottom of the table
 

jpiniero

Lifer
Oct 1, 2010
14,686
5,316
136
With the new info, I want to put my prediction for the blackwell chips
Name     | SMs        | Width   | Memory | Price | Rough perf.
GB202    | 192        | 512 bit |        |       |
5090     | 172        | 512 bit | 32GB   | $1800 | ~4090 + 50%
GB203    | 96         | 256 bit |        |       |
5080     | 96         | 256 bit | 16GB   | $1000 | ~4090
5070 Ti  | 78         | 256 bit | 16GB   | $800  | >4080 Super
GB205    | 60 (guess) | 192 bit |        |       |
5070     | 60         | 192 bit | 12GB   | $600  | ~4070 TiS
GB206    | 40 (guess) | 128 bit |        |       |
5060 Ti  | 38         | 128 bit | 16GB   | $420  | ~4070
GB207    | 30 (guess) | 128 bit |        |       |
5060     | 30         | 128 bit | 8GB    | $300  | ~4060 Ti
It's probably going to be ~$10 per SM, plus $40 on top for the 5060 Ti because of the doubled capacity. Would be happy if this turns out to be true, and sad if they don't reach performance levels.

The 512 bit product is more likely a 5090 Ti.
I kind of think the 5080 won't be the full GB203 but close. Probably closer to 90 SM.
I could see 60 SMs for GB205 but still think it'll be 48 (lol). Depending on the clock speed it could still be at roughly 4070 Ti S performance. People will be mad of course if it's $799 and only 12 GB. Maybe there will be no 5070 Ti, only just a 5070 and that will be $700.

Also assume no clamshell.
 

MoogleW

Member
May 1, 2022
57
28
61
With the new info, I want to put my prediction for the blackwell chips
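Name     | SMs        | Width   | Memory | Price | Rough perf.
GB202    | 192        | 512 bit |        |       |
5090     | 172        | 512 bit | 32GB   | $1800 | ~4090 + 50%
GB203    | 96         | 256 bit |        |       |
5080     | 96         | 256 bit | 16GB   | $1000 | ~4090
5070 Ti  | 78         | 256 bit | 16GB   | $800  | >4080 Super
GB205    | 60 (guess) | 192 bit |        |       |
5070     | 60         | 192 bit | 12GB   | $600  | ~4070 TiS
5060 Ti  | 44         | 192 bit | 12GB   | $450  | ~4070 S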
It's probably going to be ~$10 per SM. Would be happy if this turns out to be true, and sad if they don't reach performance levels

edited: the bottom of the table
All good except the RTX 5060 Ti, which should be on GB206.

These are my current guesses, assuming they keep the GPC:TPC:SM ratio constant:

GB207: 2GPC:8TPC:2SM = 32SM, RTX 5060 30SM, RTX 5050 24SM

GB206: 3GPC:8TPC:2SM = 48SM, RTX 5060 Ti 44-48SM

GB205: 5GPC:8TPC:2SM = 80SM, RTX 5070 Ti 76SM, RTX 5070 60SM

GB203: 6GPC:8TPC:2SM = 96SM, RTX 5080 84SM

GB202: 12GPC:8TPC:2SM = 192SM, RTX 5090 160-172SM

Uplift, I guess, is once again around 25% per SM through architecture and clocks, and 2X in RT as usual.
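The die sizes guessed above are just the product GPCs x TPCs per GPC x SMs per TPC. A small sketch of that arithmetic, using only the guessed configurations from this post (none of them are confirmed specs; the function and variable names are illustrative):

```python
def full_die_sms(gpcs: int, tpcs_per_gpc: int, sms_per_tpc: int) -> int:
    """SM count of a full die: GPCs x TPCs per GPC x SMs per TPC."""
    return gpcs * tpcs_per_gpc * sms_per_tpc


# die: (GPCs, TPCs per GPC, SMs per TPC), all guesses from the post above
guessed_dies = {
    "GB207": (2, 8, 2),
    "GB206": (3, 8, 2),
    "GB205": (5, 8, 2),
    "GB203": (6, 8, 2),
    "GB202": (12, 8, 2),
}

# (card, die, guessed enabled SMs); cards given as a range are left out
guessed_cards = [
    ("RTX 5050", "GB207", 24),
    ("RTX 5060", "GB207", 30),
    ("RTX 5070", "GB205", 60),
    ("RTX 5070 Ti", "GB205", 76),
    ("RTX 5080", "GB203", 84),
]

for die, config in guessed_dies.items():
    print(f"{die}: {full_die_sms(*config)} SMs on the full die")

for card, die, enabled in guessed_cards:
    total = full_die_sms(*guessed_dies[die])
    print(f"{card}: {enabled}/{total} SMs enabled ({enabled / total:.0%} of {die})")
```

The second loop shows how far each guessed card is cut down from its die, which is the number people argue about later in the thread.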
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
With the new info, I want to put my prediction for the blackwell chips
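Name     | SMs        | Width   | Memory | Price | Rough perf.
GB202    | 192        | 512 bit |        |       |
5090     | 172        | 512 bit | 32GB   | $1800 | ~4090 + 50%
GB203    | 96         | 256 bit |        |       |
5080     | 96         | 256 bit | 16GB   | $1000 | ~4090
5070 Ti  | 78         | 256 bit | 16GB   | $800  | >4080 Super
GB205    | 60 (guess) | 192 bit |        |       |
5070     | 60         | 192 bit | 12GB   | $600  | ~4070 TiS
5060 Ti  | 44         | 192 bit | 12GB   | $450  | ~4070 S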
It's probably going to be ~$10 per SM. Would be happy if this turns out to be true, and sad if they don't reach performance levels

edited: the bottom of the table
There is too large of a gap between 5080 and 5090.
 

SteinFG

Senior member
Dec 29, 2021
458
520
106
There is too large of a gap between 5080 and 5090.
Nvidia learned that people who spend over 1000 dollars will just buy the best GPU, and won't bother looking at the 1200 dollar option (even if they have similar perf/$). The 5080 will cost 999, while the 5090 will rise in price. That's how the big gap emerges. And it's not like AMD will answer it with anything. Top RDNA4 is rumored to be below the 4080 in performance (so probably on par with the 5070, or trading a few percent here and there).
 
Reactions: Tlh97

MoogleW

Member
May 1, 2022
57
28
61
ehhhh, definitely nope
AD104 has 5GPC:6TPC:2SM
I am saying 5GPC:8TPC:2SM

We know that the ratio of TPCs to GPCs has increased in both datacenter and gaming GPUs while the number of GPCs stays constant. GB202 has a 12GPC:8TPC:2SM ratio.

So 5*8*2=80SM makes sense to me
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,373
2,868
136
Nvidia learned that people who spend over 1000 dollars will just buy the best GPU, and won't bother looking at the 1200 dollar option (even if they have similar perf/$). The 5080 will cost 999, while the 5090 will rise in price. That's how the big gap emerges. And it's not like AMD will answer it with anything. Top RDNA4 is rumored to be below the 4080 in performance (so probably on par with the 5070, or trading a few percent here and there).
I disagree. Your reasoning about guys willing to spend >1000 dollars only buying the best GPU would be untrue the moment Nvidia released a 4090 Ti, because then the RTX 4090 wouldn't be the best but second best.
The same would apply to your table, where the 5090 is a cut-down version of GB202.
The RTX 4080 was just bad value compared to the RTX 4090 or RTX 4070 Ti, that's why it didn't sell well.
 
Reactions: Tlh97

SteinFG

Senior member
Dec 29, 2021
458
520
106
I disagree. Your reasoning about guys willing to spend >1000 dollars only buying the best GPU would be untrue the moment Nvidia released a 4090 Ti, because then the RTX 4090 wouldn't be the best but second best.
And a 4090 Ti most likely won't be released. The 5090 is less than 9 months away. Even if it did release, there's almost no one in the market for it. Most people willing to spend 1800+ dollars on a GPU already got a 4090 a year ago and/or are waiting for the 5090.
The same would apply to your table, where the 5090 is a cut-down version of GB202.
The full die will be a workstation-only card, like it is right now.
The RTX 4080 was just bad value compared to the RTX 4090
Not really, it has almost the same $/frame.
 

Tigerick

Senior member
Apr 1, 2022
679
559
106
Why is there no GB204 between GB203 and GB205?

The same reason why there is no AD105 in between AD104 and AD106.

Good hint, Jensen !!!
 

Aapje

Golden Member
Mar 21, 2022
1,434
1,954
106
not really, it has almost the same $/frame
That is poor value, since the flagship commands a premium and it also had 8 GB more.

Furthermore, the value of FPS is actually not linear, given that people have a minimum acceptable framerate. The 0-60 FPS range is worthless for most people, so a card that does 40 FPS in a game is not actually half as good as a card that does 80 FPS. It is much less good or even just worthless to them (as in, the owner would probably get rid of it and get an acceptable card).

If you subtract 60 FPS from the HUB results and calculate the cost per frame, you get $23.5 per frame for the 4080 and exactly $20 per frame for the 4090.

Now, of course this kind of calculation is not objective, since it depends on what settings and frame rates you consider acceptable, but it does illustrate that the real cost per frame that matters to you is going to be worse than a simple division of price by FPS suggests, and more so the weaker the card is.

This is why weaker cards have to actually be cheaper relative to their performance than more powerful cards, to offer the same value. And this is how GPU makers have always priced their cards, until recently.
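The "cost per frame above a minimum acceptable framerate" idea can be sketched as below. The prices and FPS figures in the example are made-up round numbers purely for illustration (they are not the HUB results), and the 60 FPS floor is the one used in this post; the function name is hypothetical.

```python
def cost_per_frame(price: float, avg_fps: float, fps_floor: float = 0.0) -> float:
    """Dollars per frame, counting only frames above `fps_floor`.

    With fps_floor=0 this is the usual price / FPS figure. A card that
    does not exceed the floor has no usable frames, so its cost per
    frame is treated as infinite.
    """
    usable_fps = avg_fps - fps_floor
    if usable_fps <= 0:
        return float("inf")
    return price / usable_fps


# Hypothetical cards: (name, price in dollars, average FPS)
cards = [
    ("strong card", 1600, 120),
    ("weak card", 800, 80),
]

for name, price, fps in cards:
    naive = cost_per_frame(price, fps)
    adjusted = cost_per_frame(price, fps, fps_floor=60)
    print(f"{name}: ${naive:.2f}/frame naive, ${adjusted:.2f}/frame above 60 FPS")
```

In this made-up example the weak card looks cheaper per frame on the naive figure, but once the first 60 FPS are discounted it becomes the more expensive one, which is the effect described above.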
 

Tigerick

Senior member
Apr 1, 2022
679
559
106


NV has decided to go with GDDR7 for the top 3 dies, so we should be seeing 512-bit GDDR7 after all:

  1. RTX-5080 20GB GDDR7 320-BIT 80 MB
  2. RTX-5080 24GB GDDR7 384-BIT 96 MB
  3. RTX-5090 32GB GDDR7 512-BIT 112 MB ?
Pending confirmation...
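The capacities in the list above follow from the bus width if you assume one memory chip per 32 bits of bus, 2 GB per chip, and no clamshell (two chips sharing a channel). Those are assumptions for this sketch, not confirmed configurations, and the names are illustrative. The L2 cache sizes in the list are not derived here.

```python
CHIP_INTERFACE_BITS = 32  # assumption: each memory chip has a 32-bit interface


def memory_capacity_gb(bus_width_bits: int, gb_per_chip: int = 2) -> int:
    """Total VRAM in GB, assuming one chip per 32-bit slice and no clamshell."""
    chips = bus_width_bits // CHIP_INTERFACE_BITS
    return chips * gb_per_chip


for bus_width in (320, 384, 512):
    print(f"{bus_width}-bit with 2 GB chips: {memory_capacity_gb(bus_width)} GB")

# The 3 GB chips mentioned earlier in the thread, on a 256-bit bus
print(f"256-bit with 3 GB chips: {memory_capacity_gb(256, gb_per_chip=3)} GB")
```

With 2 GB chips this reproduces the 20 GB, 24 GB and 32 GB figures for the 320-bit, 384-bit and 512-bit entries.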
 
Reactions: Mopetar