I understand it perfectly; you are applying it incorrectly. You remind me of my high school precalc teacher, who once tried to show me with a limit how, over an infinite duration, a 1 liter container will leak out 3 liters of water. When I pointed out that it fails a basic sanity check, she thought I was trying to claim that limits don't work (I wasn't; I was pointing out that she was using the limit incorrectly).
You can solve an equation correctly and still be wrong if you set it up incorrectly or even used the wrong equation. This is exactly what you are doing: you are simply setting up the equation all wrong, and this is why you are getting a theoretical maximum with 2 GPUs of 8x the performance of a single GPU instead of a little under 2x the performance of a single GPU.
Your perception of my post changed from "nonsense" to "high school precalc teacher" while the post didn't change. It appears that your understanding of the post has changed. Although it is still way off, you are on the right track.
First, I believe you reconstructed what your teacher was trying to tell you back then from your own understanding of the material. Since your understanding was limited (you were a student at the time), the reconstructed story may have absolutely nothing to do with what your teacher was actually trying to tell you. It can't be what your teacher was trying to say. I wasn't in your class and have no other information about it, but based upon what you said, unless you were in some handicap school, you wouldn't have been the only one who spotted the problem.
Questioning is good; denial is bad when it comes to learning. If you question me, then I will try to answer you. I may not be capable of giving you the correct answer, but eventually you will get your answer. This is called learning. Please do respect your teacher, because if it weren't for him/her, you would never ask the right question. However, it isn't your teacher's fault if you don't ask questions or simply stop listening.
Maybe if you try looking at the scenario where the benchmark shows > 2x performance, instead of digging through your memory for something to put me down with, you can actually end up gaining something from this thread.
Let P(t) represent the performance of 1 video card, let Q(t) represent the performance of 2 video cards, and let t represent any given point in time during the benchmark, where 0<=t<=N and N is the total duration of the benchmark.
You believe that Q(t) cannot possibly be bigger than 2*P(t), so 0<=Q(t)<=2*P(t) for every t. It follows that
Summation of Q(t) over all t from 0 to N <= summation of 2*P(t) over all t from 0 to N. Okay, but we now have a case where
Summation of Q(t) over all t from 0 to N > summation of 2*P(t) over all t from 0 to N, knowing that there exists some t where Q(t) <= P(t).
Yes, some may simply claim that the statistic/benchmark is wrong and call it a day, but others may instead challenge the assumption and accept that Q(t) > 2*P(t) is possible at some t. If that is the challenge, then what is an upper bound of Q(t) with respect to P(t)?
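To make the summation argument concrete, here is a rough sketch in Python. The sample values are invented for illustration and do not come from any real benchmark; the helper names are hypothetical. It only shows the logic: if the total for 2 cards exceeds twice the total for 1 card, then the pointwise bound Q(t) <= 2*P(t) has to fail at some t, even when Q(t) <= P(t) at another t.

```python
# Invented per-interval performance samples (not from a real benchmark).
P = [30, 30, 30, 30]   # P(t): one video card
Q = [25, 90, 80, 70]   # Q(t): two video cards; note Q(0) <= P(0)

def pointwise_bounded(q, p, k=2):
    """True if q[t] <= k * p[t] holds at every sample t."""
    return all(qt <= k * pt for qt, pt in zip(q, p))

def total_exceeds(q, p, k=2):
    """True if the sum of q is greater than k times the sum of p."""
    return sum(q) > k * sum(p)

exceeds = total_exceeds(Q, P)        # sum(Q) = 265 vs 2 * sum(P) = 240
bounded = pointwise_bounded(Q, P)    # does Q(t) <= 2*P(t) hold everywhere?
# Sample indices where the assumed bound Q(t) <= 2*P(t) is broken.
violations = [t for t, (qt, pt) in enumerate(zip(Q, P)) if qt > 2 * pt]

print(exceeds, bounded, violations)
```

If the pointwise bound held everywhere, `total_exceeds` could never return True, so a benchmark total above 2x means either the measurement is wrong or the bound is.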
You can re-read the thread to see how I came up with 8. If you believe that 8 is too big, then share your PoV on another theoretical maximum. So far, no one has stated that it is even possible to get close to that, which is a good sign, but it doesn't mean that 8 is indeed the max, as I could have missed some other independent factors which a) get increased by adding a video card and b) have a positive impact on performance.
Some argued that memory does not get doubled, as the data gets replicated to the memory of each video card. However, the data does not occupy all the memory on those video cards.
For example, let's say X is the amount of data sent to a video card, Y is the memory capacity of the video card, and C is the available (unused) memory. When there is one card, C1 = Y - X. When there are 2 cards, C2 = 2Y - 2X. Clearly C2 = 2*C1. Yes, there may be a W, an overhead which only occurs when there is more than one card, to make SLI/CF work. I can't say whether or not W has any relation to X, but theoretically speaking both W and X can be 0.
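The memory arithmetic above can be sketched as follows. The capacity and data sizes are made-up numbers, and subtracting W once from the combined free memory is only an assumption about how such an overhead would enter; the function name is hypothetical.

```python
def free_memory(capacity, data, cards=1, overhead=0):
    """Unused memory summed over `cards` cards.

    capacity: Y, memory per card
    data:     X, data replicated onto every card
    overhead: W, assumed here to be subtracted once from the total
              and to exist only in a multi-card setup
    """
    w = overhead if cards > 1 else 0
    return cards * (capacity - data) - w

Y, X = 1024, 600                                     # invented sizes in MB
c1 = free_memory(Y, X, cards=1)                      # Y - X
c2 = free_memory(Y, X, cards=2)                      # 2Y - 2X, with W = 0
c2_with_w = free_memory(Y, X, cards=2, overhead=100) # 2Y - 2X - W

print(c1, c2, c2_with_w)
```

With W = 0 the free memory exactly doubles; any W > 0 only pulls it below 2*C1, so 2x stays the ceiling for this factor.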
I am not saying 2xmem size = 2xperformance, I am saying that doubling memory does not scale more than 200% when it comes to performance theoretically.
Some claim that doubling each individual factor can only give 133% performance at best, not 200%. I didn't say that is wrong, because I really don't know enough about it. However, if the same rule applies to memory size and number of GPUs, then each can individually increase performance by up to 33%, which gives 133% * 133% * 133% = 235.26% in total. If 235.26% is indeed a theoretical maximum, meaning that no possible implementation can exceed it, then it is a better theoretical max than mine. However, I believe he simply got the number from 1/3 and didn't realize the number he is looking for is the cube root of 2. Maybe he believed that since those factors are dependent, they should be added instead of multiplied. I will let him explain.
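For anyone who wants to check the arithmetic, here is a small sketch. The variable names are only for illustration, and the cube root part is one reading of the remark above: if three multiplied factors together are supposed to give 2x, each one contributes the cube root of 2, roughly 1.26, rather than 1.33.

```python
import math

per_factor_claim = 1.33                 # "133% at best" per doubled factor
combined = per_factor_claim ** 3        # three doubled factors multiplied
combined_percent = round(combined * 100, 2)

# Per-factor scaling needed for three multiplied factors to total exactly 2x.
cube_root_of_2 = 2 ** (1 / 3)

print(f"three factors at 1.33x each: {combined_percent}%")
print(f"cube root of 2: {cube_root_of_2:.4f}")
print(f"check: {cube_root_of_2 ** 3:.4f}")
```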
My take on his comment is rather simple. Suppose i/o takes 1 cycle to send 1 unit of data, and the GPU takes 1 cycle to process data but requires 2 units of data to begin processing. Then clearly the GPU can only work at 50% efficiency, and doubling i/o will double the GPU's efficiency. More generally, suppose the GPU requires Y units of data to process; then its efficiency will be 1/Y, and doubling i/o will bring it up to 2/Y, i.e. scaled to 200%. I am not saying 200% is what we will get all the time; I am saying that, theoretically, it is the best it can get.
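The toy model above can be written out as a sketch. The function name is hypothetical, and capping efficiency at 100% is an added assumption (the GPU cannot be busy more than every cycle); it is what makes 200% an upper bound rather than a guaranteed result.

```python
def gpu_efficiency(units_needed, io_units_per_cycle=1.0):
    """Fraction of cycles the GPU spends processing in the toy model.

    The GPU does 1 cycle of work per `units_needed` units of data,
    and i/o delivers `io_units_per_cycle` units each cycle.
    Capped at 1.0 (assumption: the GPU cannot exceed 100% busy).
    """
    return min(1.0, io_units_per_cycle / units_needed)

def scaling_from_doubling_io(units_needed):
    """Efficiency with doubled i/o divided by efficiency with base i/o."""
    return gpu_efficiency(units_needed, 2.0) / gpu_efficiency(units_needed, 1.0)

for y in (1, 2, 8):
    print(y, gpu_efficiency(y), gpu_efficiency(y, 2.0), scaling_from_doubling_io(y))
```

When the GPU is starved for data (Y >= 2) doubling i/o gives the full 2x, and when it is already saturated (Y = 1) it gives nothing, so the scaling never goes above 200% in this model.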