Discussion Modern GPU designs


biostud

Lifer
Feb 27, 2003
18,280
4,801
136
I'm no hardware engineer so it is just based on a layman's observations.

SLI/CF is now virtually dead, and we have returned to monolithic (and now chiplet) designs. SLI/CF gave us the possibility of increasing performance, usually by around 70-80% (if it worked), but at double the cost and power consumption. It also came with problems: microstutter, and not all games supported it.

With the 4090 and 7900 series presented, we see nearly triple the number of transistors compared to last generation, but raster performance is not even doubled. As I understand it, we run into the problem of the total power budget for the GPU, which cannot grow at the same rate as the number of transistors.

So what I'm wondering is why is the performance/transistor decreasing so drastically with each generation? It doesn't seem like an economical design. Wouldn't it be better (if possible) to make the die smaller and run with a higher frequency?

If most of the performance/watt increase comes from changes in process nodes, can we ever expect a better performance/watt improvement from gen. to gen. than the node allows for?
 

Hitman928

Diamond Member
Apr 15, 2012
5,392
8,278
136
Here's a fun part: nowadays a new "node" can also mean no shrinking. You can gain some density if you figure out how to avoid quantum tunneling and current leakage. Der8auer made a video about how manufacturers "lie" about node sizes (they don't lie; they ran out of Moore's Law-like scaling, improved manufacturing in many other ways, and don't know how to best define their advances without doing a bad PR job).
As to why those Navi chips perform differently, it's a more complex answer and frankly not determined by the process node, as it is the same. The thing is, a bigger chip that is clocked lower will not need as much voltage to achieve the same level of performance as a smaller die clocked higher. The reason for this is the rule of power usage (https://en.wikipedia.org/wiki/Processor_power_dissipation):

P = C × V² × f

Here P is power usage in watts, C is current in amps, V is voltage in volts and f is frequency in hertz. Another rule is that for each unit of frequency to stabilize you more or less need the square of the voltage. However, it's true that due to current leakage big chips use more amps while idle, and due to quantum leakage for their size they also leak more current, but unlike voltage, amperage scales linearly, while voltage is squared. So that's why a smaller chip can't crank clock speeds high enough to compensate for the reduction in die size, at least not at the same power consumption or long-term durability.

The ‘C’ in your quoted equation stands for capacitance, not current. The equation also describes dynamic power, which is different from the leakage you are referring to.
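To put rough numbers on the big-die-versus-small-die argument, here is a minimal sketch of the dynamic power relation P = C × V² × f, with C read as switched capacitance per the correction above. All values are made-up assumptions for illustration, not measurements of any real GPU: the "big" die is assumed to have twice the switched capacitance, half the clock, and a lower voltage than the "small" die, so both deliver roughly the same units × clock throughput. Leakage (static power) is deliberately left out.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic (switching) power: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


# Hypothetical operating points; capacitance is in arbitrary units.
# Small die: fewer units, pushed to a high clock that needs more voltage.
small = dynamic_power(capacitance=1.0, voltage=1.2, frequency=3.0e9)
# Big die: twice the units (assumed 2x capacitance) at half the clock,
# which is assumed to be stable at a lower voltage.
big = dynamic_power(capacitance=2.0, voltage=0.8, frequency=1.5e9)

ratio = big / small
print(f"big die uses {ratio:.2f}x the dynamic power of the small die")
# prints: big die uses 0.44x the dynamic power of the small die
```

The whole result hinges on the assumed voltages: the V² term is why dropping from 1.2 V to 0.8 V more than pays for doubling the capacitance. With a smaller voltage gap the advantage shrinks, and adding leakage would shift it further toward the small die.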
 

biostud

Lifer
Feb 27, 2003
18,280
4,801
136
For GPUs, generally yes, because they aren't exactly general purpose processors and their performance scales extremely well with core count, due to the nature of their tasks. And since we ignore reality, I suppose we can ignore the higher silicon failure rate of bigger dies and the higher activation voltage required for bigger chips, as well as the higher idle wattage that results from the higher amperage and leakage of bigger dies.


Yes, it's difficult to cool down high heat density chips; however, it's usually preferable to have higher heat density than a bigger die, because bigger dies draw more power at idle and may clock lower due to their inherently higher voltage requirements, which arise from having to maintain the same signal integrity across a bigger area. Which one is preferable depends on the particular design and needs.
And besides failure rates, another reason why a chiplet approach can be preferred?
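On the failure-rate point, a rough sketch of why smaller dies yield better, using the simple Poisson yield model Y = exp(-A × D0). The defect density and die sizes below are assumptions picked for illustration, not figures for any real process or product. The sketch also assumes chiplets are tested before packaging so that only good dies are combined, and it ignores packaging cost, interconnect overhead, and salvaging partially defective dies.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-(area_mm2 / 100.0) * defects_per_cm2)


def silicon_per_good_product(die_area_mm2, dies_per_product, defects_per_cm2):
    """Average wafer area (mm^2) spent per working product,
    assuming only known-good dies are packaged together."""
    good_fraction = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / good_fraction


D0 = 0.1  # assumed defect density in defects per cm^2 (hypothetical)

mono_yield = poisson_yield(600, D0)      # one hypothetical 600 mm^2 die
chiplet_yield = poisson_yield(100, D0)   # one hypothetical 100 mm^2 chiplet

mono_area = silicon_per_good_product(600, 1, D0)
chiplet_area = silicon_per_good_product(100, 6, D0)

print(f"monolithic yield: {mono_yield:.1%}, chiplet yield: {chiplet_yield:.1%}")
print(f"silicon per good product: {mono_area:.0f} mm^2 vs {chiplet_area:.0f} mm^2")
```

Under these assumptions the same total silicon area costs noticeably less wafer per working product when split into six pieces, because a single defect only kills one small die instead of the whole chip.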
 
Jul 27, 2020
16,802
10,743
106
And besides failure rates, another reason why a chiplet approach can be preferred?
Also a smaller chiplet may be easier to clock much higher than a big die. Unfortunately, AMD is having trouble doing that on the 7900XTX chiplets. My hunch is that the cache dies are generating considerable heat which is preventing the GCDs from clocking higher.
 

biostud

Lifer
Feb 27, 2003
18,280
4,801
136
Also a smaller chiplet may be easier to clock much higher than a big die. Unfortunately, AMD is having trouble doing that on the 7900XTX chiplets. My hunch is that the cache dies are generating considerable heat which is preventing the GCDs from clocking higher.
Or maybe the 3GHz has always been the Navi32; we need fresh rumors.
 
Aug 16, 2021
134
96
61
Also a smaller chiplet may be easier to clock much higher than a big die. Unfortunately, AMD is having trouble doing that on the 7900XTX chiplets. My hunch is that the cache dies are generating considerable heat which is preventing the GCDs from clocking higher.
Maybe, or they are just smirking at the burning nVidia cards and decided to do some PR.
 

alcoholbob

Diamond Member
May 24, 2005
6,271
323
126
We've run into a lot of CPU bottlenecks as well, as GPUs have been getting 30-50% faster every 2 years but CPUs have been increasing at less than half that speed. Even at 4K, you see barely any performance gain between the 4090 and the 3090 in many instances, because CPU load has become the bottleneck in big open world games.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
We've run into a lot of CPU bottlenecks as well, as GPUs have been getting 30-50% faster every 2 years but CPUs have been increasing at less than half that speed.

CPUs are by definition general purpose processors. You cannot improve something that does everything as well as something that's more purpose oriented.

Or maybe the 3GHz has always been the Navi32; we need fresh rumors.

I think the "leakers" are the problem not AMD.

What was it? 2.5x compute? 4GHz units? Better perf/unit all at 300W? Solves all the issues of the world while costing under $500?
 

biostud

Lifer
Feb 27, 2003
18,280
4,801
136
CPUs are by definition general purpose processors. You cannot improve something that does everything as well as something that's more purpose oriented.



I think the "leakers" are the problem not AMD.

What was it? 2.5x compute? 4GHz units? Better perf/unit all at 300W? Solves all the issues of the world while costing under $500?

The question is whether an SoC/console/Mx approach will at some point enter the PC market.
 
Jul 27, 2020
16,802
10,743
106
The CPU has to do the calculations for many of the decisions that shape the game world, with AI being one of the most important. The drivers also need to ensure that the shaders are translated into a form that will run best on the particular GPU architecture being used (so-called shader optimizations). Until these things are ready, the GPU will sit there waiting. So the CPU needs to be really fast and nothing else should be interrupting it.

I wish I could see how many times a second a critical game process is interrupted by context switching where the OS freezes that thread so some other stupid thread can run in its place while the game's thread is shifted to a different physical or virtual core. I would really prefer an option in Windows that says "DO NOT BOUNCE THIS PROCESS'S THREADS AROUND!".
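For what it's worth, processor affinity is the closest existing knob to that wish: it restricts which logical cores a process may run on (Task Manager exposes it as "Set affinity"). Below is a minimal sketch of setting it from code. The helper names `affinity_mask` and `pin_current_process` are made up for this example; the underlying calls are the Win32 `SetProcessAffinityMask` function (reached through `ctypes`) and `os.sched_setaffinity` on Linux. Note that affinity only limits where threads can be scheduled; it does not stop the OS from preempting them in favor of other threads on those same cores.

```python
import os
import sys


def affinity_mask(cores):
    """Build a bitmask with one bit set per logical core index."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def pin_current_process(cores):
    """Restrict the current process to the given logical cores."""
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcess.restype = wintypes.HANDLE
        # The mask parameter is a DWORD_PTR, i.e. a pointer-sized unsigned int.
        kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
        kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
        handle = kernel32.GetCurrentProcess()
        if not kernel32.SetProcessAffinityMask(handle, affinity_mask(cores)):
            raise ctypes.WinError(ctypes.get_last_error())
    elif hasattr(os, "sched_setaffinity"):
        # Linux: pid 0 means "the calling process".
        os.sched_setaffinity(0, set(cores))
    else:
        raise NotImplementedError("no affinity API known for this platform")


# Example (not executed here): keep this process on logical cores 0 and 1.
# pin_current_process([0, 1])
```

Whether pinning actually helps a game is a separate question; it can just as easily hurt if the chosen cores end up busy with something else.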
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I would really prefer an option in Windows that says "DO NOT BOUNCE THIS PROCESS'S THREADS AROUND!".

The thread bouncing dates from way back, before we had Turbo and many-core CPUs, because they found performance improved by 1-3% compared to having it all run on one core. Turbo changed that, but they didn't change that aspect of Windows, I guess.
 