greater than 100% scaling? really?

Borealis7

Platinum Member
Oct 19, 2006
2,901
205
106
hi all,

Often when I read CrossFire or SLI reviews I look at the scaling numbers, and sometimes I see they are above 100% and wonder whether that's even possible (assuming no overclocking was done in either test).
Logically, it isn't possible. How can 2 cards perform better than twice the max performance of a single card, especially when a multi-GPU setup has MORE computing overhead?

Are the reviewers wrong? Did they read the numbers incorrectly/drink heavily during the results compilation?

What can explain this? I doubt it's a CPU issue, since a multi-GPU setup should tax the CPU more. Or is it the complete opposite: a single card doesn't use enough CPU resources, which leaves the CPU with idle cycles? But then the single GPU is at 100% utilization, so that doesn't answer my question.

maybe someone can enlighten me?
 

MrK6

Diamond Member
Aug 9, 2004
4,458
4
81
Have a link? My first guess is benchmarker error. If not, the drivers can do funky things sometimes.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
Have a link? My first guess is benchmarker error. If not, the drivers can do funky things sometimes.

No, there are plenty of websites showing the new and improved CrossFire getting > 100% scaling in a limited number of situations/games. It's actually really amazing and has made me do a WTF double take several times.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.
 

WelshBloke

Lifer
Jan 12, 2005
32,582
10,757
136
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

I'm not convinced that's true at all. I can see that under some conditions SLI/CF could overcome some bottleneck and maybe get slightly over 100% scaling, but not what you're suggesting.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
I'm not convinced that's true at all. I can see that under some conditions SLI/CF could overcome some bottleneck and maybe get slightly over 100% scaling, but not what you're suggesting.
Keyword "Theoretical".
 

Daedalus685

Golden Member
Nov 12, 2009
1,386
1
0
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

I'm not sure I buy that at all. There are cases where scaling is not linear at all, and relieving a limiting factor may actually result in superlinear gains. But if everything is equally balanced and you double everything, you do not improve performance by 2^things, you improve it by 2. Doubling a single aspect only doubles performance if that aspect was an overwhelming bottleneck (like if you downclocked the memory by a factor of 50 and then doubled it from there).

You don't necessarily need double the memory throughput to use double the processing power. By doubling both you may switch the bottleneck: from waiting on memory and using only 9/10 of the shading power, to needing only 9/10 of the now-doubled throughput to use all of the now-doubled shading power. So there may be games that are held back by one card's memory (or another aspect) yet do not require twice that performance to flip the bottleneck, so you get >100% scaling because you are making better use of what is at hand. This is still exceptionally impressive, as CF/SLI overheads used to knock 20%+ off scaling, so you'd never have noticed if this occurred.

As for CrossFire/SLI, only the memory read throughput and processing power are doubled. Each card's frame buffer stores an exact mirror of the other's. So, akin to RAID 1, you can theoretically read from memory twice as fast (each GPU has its own pool to take what it needs from), but writing still takes exactly the same time, and the capacity does not increase, as whatever is written is now written twice.
 

-Slacker-

Golden Member
Feb 24, 2010
1,563
0
76
What if, during the sli/crossfire test, the individual cards just happened to perform slightly better than in the single card test? Benchmark results are almost never constant, sometimes you get more fps, sometimes less...
 

GaiaHunter

Diamond Member
Jul 13, 2008
3,695
387
126
This simply happens because benchmarks have margins of error (limited accuracy and precision), so in situations where the real-world scaling is close to 100%, they can produce these abnormal results.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
I'm not sure I buy that at all. There are cases where scaling is not linear at all, and relieving a limiting factor may actually result in superlinear gains. But if everything is equally balanced and you double everything, you do not improve performance by 2^things, you improve it by 2. Doubling a single aspect only doubles performance if that aspect was an overwhelming bottleneck (like if you downclocked the memory by a factor of 50 and then doubled it from there).

You don't necessarily need double the memory throughput to use double the processing power. By doubling both you may switch the bottleneck: from waiting on memory and using only 9/10 of the shading power, to needing only 9/10 of the now-doubled throughput to use all of the now-doubled shading power. So there may be games that are held back by one card's memory (or another aspect) yet do not require twice that performance to flip the bottleneck, so you get >100% scaling because you are making better use of what is at hand. This is still exceptionally impressive, as CF/SLI overheads used to knock 20%+ off scaling, so you'd never have noticed if this occurred.

As for CrossFire/SLI, only the memory read throughput and processing power are doubled. Each card's frame buffer stores an exact mirror of the other's. So, akin to RAID 1, you can theoretically read from memory twice as fast (each GPU has its own pool to take what it needs from), but writing still takes exactly the same time, and the capacity does not increase, as whatever is written is now written twice.
Nothing is simple, but it can be expressed in simple terms. That particular formula was only meant to serve as a bound. Theoretical max means there is no possible way that the performance can exceed 8 times, and apparently you agree. I should have stated one assumption: there is no other bottleneck, and the bottlenecks are dependent on each other. But that doesn't change the theoretical max, as that assumption itself is the best case.

The OP doesn't understand how, in the abstract, 1+1>2 in terms of performance, and believes it is impossible. However, if the formula is I/O scaling * RAM size scaling * GPU scaling, then it isn't hard to see why it is possible for the result to be > 2. Of course, again, the formula is abstract and you can argue why it can't be 8, and I won't disagree. My intention is to show that if doubling RAM raises performance to 150% and dual GPUs raise performance to 150%, then 2 cards can exceed 200% (1*1.5*1.5 = 225%), even though GPU scaling is nowhere near 200%.
 

Throckmorton

Lifer
Aug 23, 2007
16,829
3
0
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

Isn't that like saying if you put 2 engines in a car it can theoretically accelerate 8x as fast because it has 2x engines, 2x fuel lines, and 2x air intakes?
 

Triggaaar

Member
Sep 9, 2010
138
0
71
The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

Isn't that like saying if you put 2 engines in a car it can theoretically accelerate 8x as fast because it has 2x engines, 2x fuel lines, and 2x air intakes?
We're all against you, Seero. I totally understand that a theoretical limit can be far from reality, so no problem with your maths there, but I don't think multiplying different factors has anything to do with the theoretical limit. If there are, as in your example, 3 functions for a graphics card, doubling the speed of each should give you a total performance increase of 100%, not 800%. For example, if it took 2 seconds to send something through I/O, 2 seconds to process it, and 2 seconds to move data around the memory, the total time would be 6 seconds (assuming only one thing happens at a time, for ease of comparison). Doubling the speed of each process would give a time of 3 * 1 second = 3 seconds.

As for why scaling can be greater than 100%:
Could this be to do with memory bottlenecks? Different parts of a card (and machine) work at the same time, but if there is a memory bottleneck more work has to be done, so it takes more than twice as long with the slower setup.
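
The serial example above can be checked with a few lines of Python. This is only a sketch of that simplified model (one stage at a time, no overlap); the stage names and the `total_time` helper are made up for illustration, and the 2-second timings are the example figures from the post, not measurements.

```python
# Simplified serial model: each stage runs one after another, nothing overlaps.
# Stage names and timings are illustrative, taken from the 2-second example.
baseline = {"io": 2.0, "processing": 2.0, "memory": 2.0}  # seconds per stage


def total_time(stage_times, speedups):
    """Total time when each stage's time is divided by its speed-up factor."""
    return sum(t / speedups.get(name, 1.0) for name, t in stage_times.items())


single = total_time(baseline, {})  # 6.0 s
doubled = total_time(baseline, {"io": 2, "processing": 2, "memory": 2})  # 3.0 s

print(f"single card: {single} s, everything doubled: {doubled} s")
print(f"speed-up: {single / doubled}x")  # 2.0x, not 8x
```

In this model, doubling every stage halves the total time, so the overall gain is 2x no matter how many stages the work is split into.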
 

badb0y

Diamond Member
Feb 22, 2010
4,015
30
91
I can't wait for Antilles benchmarks and possibly the GTX 595 benchmarks, I hope they get the boat moving on the Quad-fire/Quad-SLI scaling too to make them better options for enthusiasts. Good times ahead.
 

NoQuarter

Golden Member
Jan 1, 2001
1,006
0
76
Margin of error in fps tests is something like 3% either way. You can run the same bench 3 times in a row and get different results every time. It's highly possible that one of the single-card runs was a low result and the SLI/CF run was a high result.
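
To put a rough number on that, here is a small sketch. The 60 fps baseline and the perfect 2.0x true scaling are assumptions picked for illustration; only the 3% margin comes from the post above.

```python
# Assumed numbers: a true single-card result of 60 fps and perfect 2.0x scaling.
# The 3% figure is the rough run-to-run margin quoted above.
true_single_fps = 60.0
true_dual_fps = 2.0 * true_single_fps
margin = 0.03

measured_single = true_single_fps * (1 - margin)  # unlucky low run
measured_dual = true_dual_fps * (1 + margin)      # lucky high run

# "Scaling" as reviews usually report it: extra performance from the second card.
apparent_scaling = (measured_dual / measured_single - 1) * 100
print(f"apparent scaling: {apparent_scaling:.1f}%")  # about 112.4%
```

So with a 3% swing in opposite directions, a setup that scales at exactly 100% could show up as roughly 112% in a review table.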
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
We're all against you, Seero. I totally understand that a theoretical limit can be far from reality, so no problem with your maths there, but I don't think multiplying different factors has anything to do with the theoretical limit. If there are, as in your example, 3 functions for a graphics card, doubling the speed of each should give you a total performance increase of 100%, not 800%. For example, if it took 2 seconds to send something through I/O, 2 seconds to process it, and 2 seconds to move data around the memory, the total time would be 6 seconds (assuming only one thing happens at a time, for ease of comparison). Doubling the speed of each process would give a time of 3 * 1 second = 3 seconds.

As for why scaling can be greater than 100%:
Could this be to do with memory bottlenecks? Different parts of a card (and machine) work at the same time, but if there is a memory bottleneck more work has to be done, so it takes more than twice as long with the slower setup.
The thing about a theoretical max is that there is no way of surpassing that number. You may believe that there exists a smaller theoretical max, but I believe there is no way you can get more than 8 times the performance. If we can get more information about the relationship between I/O, memory, and number of cores, then we can come up with a better theoretical max, but for now, I don't think any one of you can give me an example where it is possible to scale up more than 8 times.

If you believe that it is too big, then how much smaller should it be? 7 times? How do you get this number? 6 times? How do you get this number? "Duh, 1+1=2, and therefore 2 times!" Well, then either the benchmark numbers lie or were badly constructed or tested, or the formula is wrong.

I guess instead of disagreeing with each other, let's try to find common ground that we can agree upon. Do you agree that doubling the number of processors may increase performance? If so, do you agree that it can go up to 200%? If you agree about the increase but disagree about the number, what should the number be?

Do you agree that doubling the size of memory may increase performance? If so, do you agree that it can go up to 200%? If you agree about the increase but disagree about the number, what should the number be?

Do you agree that doubling the number of I/O channels may increase performance? If so, do you agree that it can go up to 200%? If you agree about the increase but disagree about the number, what should the number be?

Now I am not saying 1+1 = 8 (lol), or that dual cards = 800% performance (I am saying that it CANNOT get more than 800%). I said that, given that each of these variables increases performance independently, it is possible to exceed 200% performance. For example, let's say I/O doesn't scale (100%), the extra memory increases performance in some odd cases (say 145%), and the second GPU increases performance in some odd cases (say 145%); then the total performance can be 100%*145%*145% = 210.25% in some extremely odd cases, which is totally possible. No?
As a reminder, maximum(max) means the highest possible number you can possibly get, not the number that you will likely get.
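
For reference, the arithmetic of this multiplicative model can be written out as below. It is a sketch of the formula as stated in this post only; it takes the independence of the three factors as a given, which is exactly the assumption other posters in the thread dispute. The `combined_gain` name is made up for illustration.

```python
from math import prod


def combined_gain(*factors):
    """Multiply per-component scaling factors (1.0 means no change)."""
    return prod(factors)


# I/O does not scale, memory and GPU each give 145% in some odd case.
odd_case = combined_gain(1.00, 1.45, 1.45)
print(f"{odd_case * 100:.2f}%")  # 210.25%

# The claimed upper bound: every factor doubles.
upper_bound = combined_gain(2, 2, 2)
print(upper_bound)  # 8
```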
 

NoQuarter

Golden Member
Jan 1, 2001
1,006
0
76
If all 3 components (I/O, memory, GPU) equal 100% performance, you would have to double all 3 to double performance to 200%. If you double *just* I/O (and assuming it has equal weight) you would be at 133% performance, not 200%.
 

Throckmorton

Lifer
Aug 23, 2007
16,829
3
0
Seero I think your thinking is flawed. If 2x memory bandwidth means 2x performance, that means mem bandwidth was the bottleneck before. And it still is, otherwise the improvement would be less than 2x.

So if you do 2x bandwidth and get 2x performance, what happens when you then double GPU power? You can't physically get 2x performance again, because that would mean the GPU is bottlenecking the mem by half. But we just determined that the memory was the bottleneck before and still is.
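
A minimal way to see this argument is a "slowest resource wins" model. The sketch below assumes performance is simply the minimum of memory bandwidth and GPU power in made-up units; real cards are more complicated, so treat it as an illustration of the reasoning rather than a description of actual hardware.

```python
def throughput(mem_bandwidth, gpu_power):
    """Bottleneck model: performance is set by the slower of the two resources."""
    return min(mem_bandwidth, gpu_power)


# Hypothetical units chosen so that doubling memory bandwidth gives exactly 2x.
base = throughput(mem_bandwidth=1.0, gpu_power=2.0)       # 1.0, memory-bound
more_mem = throughput(mem_bandwidth=2.0, gpu_power=2.0)   # 2.0
more_both = throughput(mem_bandwidth=2.0, gpu_power=4.0)  # still 2.0

print(more_mem / base)   # 2.0
print(more_both / base)  # 2.0, the GPU doubling adds nothing on top
```

In this model, a full 2x from memory alone means the GPU already had 2x headroom, so doubling the GPU afterwards cannot deliver another 2x.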
 
Dec 30, 2004
12,553
2
76
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.
I'm not convinced that's true at all. I can see that under some conditions SLI/CF could overcome some bottleneck and maybe get slightly over 100% scaling, but not what you're suggesting.
Keyword "Theoretical".
Yeah, I'm saying I don't think there's an 8x theoretical improvement there.

You can't just multiply all the different factors together.

You don't have 2x the I/O or memory, because each card still has to hold the same data in memory that the other card is holding.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
Seero I think your thinking is flawed. If 2x memory bandwidth means 2x performance, that means mem bandwidth was the bottleneck before. And it still is, otherwise the improvement would be less than 2x.

So if you do 2x bandwidth and get 2x performance, what happens when you then double GPU power? You can't physically get 2x performance again, because that would mean the GPU is bottlenecking the mem by half. But we just determined that the memory was the bottleneck before and still is.

You don't have 2x the I/O or memory, because each card still has to hold the same data in memory that the other card is holding.
I do not disagree with you guys. I didn't say double the bandwidth, I said double the size; if anything, I disagree about the bandwidth being doubled. Yes, two 1 GB video cards != 2 GB of VRAM, but after loading the data, each card has its own memory space to work with, I believe. I also didn't say you will physically get 2x performance; I said that, in theory, it is possible.

Memory on multi-card setups is always difficult to understand and I am not claiming I know how it works, so I can't disagree. However, if the theoretical max scaling on memory alone isn't 200%, then what is it? 199%? 150%? 101%?

It is like me saying "the tallest human is no taller than 100 feet" and you saying "there is no way a human can be that tall," which doesn't contradict me. Unless you can contradict my claim with an example or a theory of a human taller than 100 feet, I am not wrong. However, you can say "No, the tallest human can't possibly be taller than 20 feet," and if there is no contradicting theory or example, then your theoretical max is more accurate than mine.

So if 200% is too much, what is your theoretical max on doubling the size/bandwidth of memory?

As to I/O, the cards sit in 2 PCIe slots with an SLI/CF bridge on top. Yes, since all PCIe slots share the same bus, it isn't scaling. However, maybe one day each PCIe slot will have its own bus. When it does, I/O will scale. But even then, my claim is that it won't exceed 200% (unless proven otherwise).

As to memory, I believe each card stores a set of data which is common and a set of data which isn't. The uncommon data may be split across the 2 cards, so it is possible that such data is processed in parallel. Here there are 2 cases: a) 1 GPU vs 2 GPUs, and b) larger memory space on a single GPU. I am saying that both a) and b) can, in theory, increase performance, with a theoretical max of 200%.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,250
136
It's a Jedi mind trick being pulled off by the AMD driver team....Must be some kind of cheat

Just noticed this must be the only AMD-related thread that Happy and Wreckage haven't posted in....What gives?
 

Triggaaar

Member
Sep 9, 2010
138
0
71
The thing about theoretical max is that there are no ways of surpassing that number.
Yes we all know that.
for now, I don't think any one of you can give me an example where it is possible to scale up more than 8 times.
Ok, theoretically you could have a situation where a game just wouldn't run with 1 card, but it would with 2, so your limit is suddenly infinity (a bit more than 8). But with cards that are capable of running a game properly on their own, I'm not sure off the top of my head why you would be able to more than double performance with two cards (will need to check benchmark results, and allow for margin of error).

If you believe that it is too big, then how much smaller should it be? 7 times? How do you get this number? 6 times? How do you get this number?
Basically 2 times, with the odd anomaly (which we need more input on).

I guess instead of disagreeing with each other, let's try to find common ground that we can agree upon.
By all means
Do you agree that doubling the number of processors may increase performance? If so, do you agree that it can go up to 200%? If you agree about the increase, but disagree about the number, what should the number be?
As Throck said, if the processor is the only thing holding the card back (wouldn't be the case, but hypothetically), then that could lead to an increase of performance of up to 100% (= go up to 200% in your terms).
Do you agree that doubling the size of memory may increase performance? If so, do you agree that it can go up to 200%? If you agree about the increase, but disagree about the number, what should the number be?
Same answer as above - but, you can't have a situation where the processor is the only thing holding the card back whilst at the same time the memory is holding the card back.

And the same argument goes for I/O.

performance should increase under some odd cases (say 145%), then the total performance gain can be 100%*145%*145% = 210.25% under some extremely odd cases, which is totally possible. No?
No
As a reminder, maximum(max) means the highest possible number you can possibly get, not the number that you will likely get.
Honestly, we get that.

If all 3 components (I/O, memory, GPU) equal 100% performance, you would have to double all 3 to double performance to 200%. If you double *just* I/O (and assuming it has equal weight) you would be at 133% performance, not 200%.
+1
This is as per my example in my first post:
"For example, if it took 2 seconds to send something through I/O, 2 seconds to process it, and 2 seconds to move data around the memory, the total time would be 6 seconds (assuming only one thing happens at a time, for ease of comparison)."
That's 2+2+2 = 6. Improving one element (processor) by 100% would not bring that down to 3 seconds; it would only improve the time related to processing, so it would give you 1+2+2 = 5 seconds.

The fact that we've split this into 3 sections (memory, processing, I/O) is arbitrary. There could be 10 little components that make up a piece of electronics, so using your method you'd say 2 pieces of equipment could (theoretically) improve performance by a factor of 2 to the power of 10 = 1024. So 2 cards would be up to 1024 times faster than 1 card. That's obviously beyond daft, but hopefully it explains the flaw in your calculation.
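
Both points can be put side by side in a short sketch. It assumes the same simplified serial model (stages run one after another, equal 2-second stages); `multiplied_claim` and `serial_speedup` are hypothetical helper names used only here.

```python
def multiplied_claim(n_components, factor=2.0):
    """Bound you get by multiplying one factor per component."""
    return factor ** n_components


def serial_speedup(stage_times, factor=2.0, improved=None):
    """Speed-up in the serial model when the stages listed in `improved`
    (by index) become `factor` times faster; default is all stages."""
    if improved is None:
        improved = range(len(stage_times))
    improved = set(improved)
    new_total = sum(
        t / factor if i in improved else t for i, t in enumerate(stage_times)
    )
    return sum(stage_times) / new_total


# Doubling only the processing stage: 2+2+2 = 6 s becomes 2+1+2 = 5 s.
print(serial_speedup([2.0, 2.0, 2.0], improved=[1]))  # 1.2

# Splitting the same work into more components changes the multiplied bound,
# but not the serial-model result when every component is doubled.
for n in (3, 10):
    print(n, multiplied_claim(n), serial_speedup([2.0] * n))
# 3 8.0 2.0
# 10 1024.0 2.0
```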
 

aka1nas

Diamond Member
Aug 30, 2001
4,335
1
0
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

Effective graphics memory is not doubled on current SLI or CrossFire platforms. Framebuffer data has to be mirrored across the cards (this is what is getting transferred across the bridge). You are also miscalculating performance, as others have noted.

What would be a more likely explanation for the superlinear increase is that with a single card some of these games were being bottlenecked to the point that other game-related processing was being held up (most likely due to poor programming).
 

Ben90

Platinum Member
Jun 14, 2009
2,866
3
0
The reviews are not wrong, and it is theoretically possible. The total processing power did not exceed 2x, but the processing time may decrease by more than half. The theoretical maximum of a dual-card setup is double I/O * double memory * double processing power = 8 times.

Just make sure you buy 8 times as much soapy water to professionally clean those cards with.
 

Seero

Golden Member
Nov 4, 2009
1,456
0
0
Ok, theoretically you could have a situation where a game just wouldn't run with 1 card, but it would with 2
I've never seen a case like that; I have seen lots of cases where 1 card works and 2 don't.
, so your limit is suddenly infinity (a bit more than 8). But with cards that are capable of running a game properly on their own, I'm not sure off the top of my head why you would be able to more than double performance with two cards (will need to check benchmark results, and allow for margin of error).
Again, you said you understand what a theoretical max is, but you keep coming back to "How do you get to that max?" instead of "Is there a way to get over that?" Again, it is like I said there is no way to get beyond 800% and you keep saying "I can't think of a way to get to 800%." If you think 8 is too large, then give me another formula and try to support it.


Basically 2 times, with the odd anomaly (which we need more input on).
It isn't hard to see that the theoretical max is much higher than 2 times. Again, we have benchmarks showing that in various cases SLI/CF scales better than 200%. Each of these benchmarks runs for several minutes and can be reproduced, so it isn't an off-the-norm thing. Of course you may argue that the method by which the statistic is computed may be iffy, but we have been using those benchmarks to test video cards for a while now.


As Throck said, if the processor is the only thing holding the card back (wouldn't be the case, but hypothetically), then that could lead to an increase of performance of up to 100% (= go up to 200% in your terms).
Let's say 1x I/O * 1x mem * 1x GPU = 100%. You agreed that, individually, I/O, mem, and GPU can, in theory, scale linearly (200%). So, assuming they are independent, can't we have a case where each of them scales by 133%, making the formula 1.33*1.33*1.33 = 235%?

Suppose I/O doesn't scale, but doubling the mem and GPU individually each scales performance by 145%. Why not?
The fact that we've split this into 3 sections (memory, processing, I/O) is arbitrary. There could be 10 little components that make up a piece of electronics, so using your method you'd say 2 pieces of equipment could (theoretically) improve performance by a factor of 2 to the power of 10 = 1024. So 2 cards would be up to 1024 times faster than 1 card. That's obviously beyond daft, but hopefully it explains the flaw in your calculation.
I picked 3 components because even you agreed that, individually, they can, in theory, scale performance linearly. The number of screws is doubled too, but doubling the number of screws does not scale performance. So if you can find a 4th independent factor that does scale performance and is changed by adding a new card, share it. If we knew enough, the formula would become very long and complicated, as some of the variables are dependent, some are independent, some scale in reverse, and there are many other types of dependencies.

Effective graphics memory is not doubled on current SLI or CrossFire platforms. Framebuffer data has to be mirrored across the cards (this is what is getting transferred across the bridge). You are also miscalculating performance, as others have noted.

What would be a more likely explanation for the superlinear increase is that with a single card some of these games were being bottlenecked to the point that other game-related processing was being held up (most likely due to poor programming).
We can introduce more variables to the table, or we can first deal with the ones already on the table. Yes, there is common data which has to be sent to all video cards, because in the initial phase it isn't known what is or isn't necessary. Let's say you and I both bought the same game. Each of us needs to install the game on our own storage, so if either storage doesn't have sufficient space, we can't proceed.

Now suppose the game installs fine for both of us and we start to play. The data generated by the game, even though it is the same game, can be different. To top it off, the remaining storage space does impact performance.

Again, I don't understand why it is so complicated. The OP stated that it is "logically impossible" to get more than 2x performance, and I say that, in theory, it can be much higher than that, and gave an example. I don't know the specifics within the video card and am definitely a layman in terms of SLI/CF, so I simply multiplied 3 independent factors which, you guys agreed, can in theory each scale performance linearly, and then stated that this is the theoretical max. That is all.

So far you guys are saying the number is too high, but aren't able to give me a lower theoretical max or a counterexample showing that it can indeed scale higher than my prediction.

Our goal is to see whether or not it is "logically possible" to go beyond 2, and you guys continuously and repeatedly come back and say there is no way it can reach 8.

Guys, that is actually what I am saying: it can't go beyond 8 times. You can only say that I am wrong if you argue that it can go beyond 8, because I said that can NOT be done. Giving more examples of how you can't get to 8 only further proves my point.


Just make sure you buy 8 times as much soapy water to professionally clean those cards with.
I'm sorry Ben, but what exactly is your input to this thread? If you want to make yourself useful, report yourself.
 