Mar 3, 2017
If we can't come to an agreement, let's take the average of 5.7 and 5.85, which is 5.75! Is everybody happy?

Then the whole discussion sidetracked into a sub-discussion of whether 5.7 or 5.85 GHz was correct to use, even though it only leads to a 2.6% difference, which is too little to matter for the original topic being discussed, as I understand it.
Enough of these imaginary differences between imagined benchmark numbers - this sequence of Twitter posts is actually quite interesting:
[Attachments 97324-97329: six screenshots of the Twitter thread]
Then the whole discussion sidetracked into a sub-discussion of whether 5.7 or 5.85 GHz was correct to use, even though it only leads to a 2.6% difference, which is too little to matter for the original topic being discussed, as I understand it.
Holy image spam. Care to at least summarize what is being said?
I agree it's important for correctness' sake to use the right numbers. However, in this case I'm not sure which are the correct ones, and there seem to be differences of opinion about that. But it seems like everyone agrees that the difference is so small, at 2.6%, that it should not affect the conclusion much.

The correct numbers are clearly the measured ones from the review in question. It may not make much difference or change the conclusion, but if you have accurate data to use, then use it rather than fudging it.

But let's say we use the numbers you prefer; then what is the conclusion from the original topic, i.e. the relationship between the SPECint and Cinebench measurements?
Ok, let's go for that then. So using the numbers you said are the correct ones, what's the conclusion regarding the relationship between the SPECint and Cinebench measurements?

It is not really a difference of opinion. One is using the number measured by the reviewer for a single-threaded workload, the other is using fmax. Clearly the measured figure is the correct one to use when you are using that review's single-core SPECint score and normalising to get a relative IPC delta.

Just because it has been measured doesn't mean that it sustained that frequency through the benchmark run.

Conclusion is that if the PPC improvement in SPECint is, for argument's sake, 10%, it is very unlikely that the PPC improvement in Cinebench wouldn't be in the same ballpark.
First, my original reply on this subject was simply because one of the numbers looked suspiciously different from the generally accepted increase for SPECint. So I asked how it was calculated, which led to a whole series of deflecting and defensive posts and exposed a rather casual attitude toward data accuracy, which makes me not trust any of the numbers, tbh (and that is the more important point than exactly how much it changes the results). However, even if we take the numbers at face value, let's just look at how the two tests match up.
| Cinebench R23 | SPECint 2017
Zen+ -> Zen 2 | 19.3% | 16.8%
Zen 2 -> Zen 3 | 14.9% | 19.7%
Zen 3 -> Zen 4 | 5.1% | 5.6%
If we take the SPECint numbers as the baseline, here is the "error" margin in how well Cinebench represents IPC gain in SPECint.
Cinebench IPC increase relative to SPECint 2017 IPC increase
Zen+ -> Zen 2 | 14.9%
Zen 2 -> Zen 3 | -24.4%
Zen 3 -> Zen 4 | -8.93%
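These "error" margins are just the relative difference of each pair, with SPECint as the baseline; a quick Python check of the arithmetic (numbers straight from the thread):

```python
# Generational IPC gains from the thread: (Cinebench R23 %, SPECint 2017 %)
gains = {
    "Zen+ -> Zen 2": (19.3, 16.8),
    "Zen 2 -> Zen 3": (14.9, 19.7),
    "Zen 3 -> Zen 4": (5.1, 5.6),
}

# Cinebench gain relative to the SPECint gain (SPECint as baseline)
margins = {gen: (cb - spec) / spec * 100 for gen, (cb, spec) in gains.items()}

for gen, m in margins.items():
    print(f"{gen}: {m:+.2f}%")  # +14.88%, -24.37%, -8.93% (the table rounds to 14.9 / -24.4)
```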
If we assume that the 40% SPECint increase is true for Zen 5 and use this possible range of values, here are the predicted Cinebench IPC numbers.
Zen 5 Cinebench (ST) IPC prediction based upon previous Cinebench to SPECint IPC error calculation
Zen+ -> Zen 2 | 45.96%
Zen 2 -> Zen 3 | 30.24%
Zen 3 -> Zen 4 | 36.43%
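The predictions are just the assumed 40% SPECint gain scaled by each generation's rounded error margin from the previous table; a quick Python check:

```python
# Assumed Zen 5 SPECint IPC gain from the rumour under discussion
spec_zen5 = 40.0

# Rounded Cinebench-vs-SPECint error margins (%) from the previous table
errors = {"Zen+ -> Zen 2": 14.9, "Zen 2 -> Zen 3": -24.4, "Zen 3 -> Zen 4": -8.93}

# Predicted Cinebench gain = assumed SPECint gain scaled by the error margin
predictions = {gen: spec_zen5 * (1 + e / 100) for gen, e in errors.items()}

for gen, p in predictions.items():
    print(f"{gen}: {p:.2f}%")  # 45.96%, 30.24%, 36.43%
```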
So, it seems that given the presented numbers (which have questionable accuracy for at least one entry), Cinebench is not a great predictor of SPECint IPC increases. There will be some correlation, obviously, because you are improving the CPU, which will affect both results, but that is very different from saying one is an accurate predictor of the other.
Edit:
If you use the corrected Zen 3 -> Zen 4 SPECint 2017 number (7.8%), you get the following:
Cinebench IPC increase relative to SPECint 2017 IPC increase
Zen+ -> Zen 2 | 14.9%
Zen 2 -> Zen 3 | -24.4%
Zen 3 -> Zen 4 | -34.6%
Zen 5 Cinebench (ST) IPC prediction based upon previous Cinebench to SPECint IPC error calculation
Zen+ -> Zen 2 | 45.96%
Zen 2 -> Zen 3 | 30.24%
Zen 3 -> Zen 4 | 26.16%
So, as you can see, even though the absolute number only changes from 5.6% to 7.8%, the error margin actually grows significantly. So yes, getting the right number here does make a big difference in terms of the subject at hand. If the argument is that Anandtech's sample might not have sustained the 5.75 GHz frequency, then that only makes the error margin worse as the frequency will drop and the SPECint IPC number will go up, as will the error margin.
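On the frequency point: the IPC figures here are just score divided by clock, so a lower sustained clock inflates the computed IPC by exactly the clock ratio. A tiny Python sketch (the 5.7 and 5.85 GHz values are from the thread; the score is purely illustrative):

```python
score = 10.0  # illustrative single-thread score, not a real measurement

# The two disputed clocks (GHz) from the thread
ipc_at_fmax = score / 5.7        # computed IPC if we assume 5.7 GHz was sustained
ipc_at_measured = score / 5.85   # computed IPC if we assume 5.85 GHz was sustained

# Assuming the lower clock inflates the computed IPC by the clock ratio, ~2.6%
print(f"{ipc_at_fmax / ipc_at_measured - 1:.1%}")
```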
Edit: Fixed tables from Cinebench ST score to IPC prediction. Also, as I stated in reply to the original tweet, the scores shown in the tweet are literally just a random poster's speculation from a Chinese forum, so none of this actually matters in regards to Zen 5 performance.
Wow, somebody missed their basic statistics class in school. Instead of meandering with silly mafs you could have simply calculated the correlation coefficient.
Here, I did it for ya -
With 5.6% change it is 0.940594.
With 7.8% change, it is 0.926675.
Across the entire data set - for the three cases each for Intel and AMD.
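For reference, Pearson's r is easy to compute by hand. A plain-Python sketch using only the three AMD pairs quoted in the thread; note the 0.940594 figure above is stated to be over six pairs, including Intel deltas that aren't reproduced here, so the three-point value below differs:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient, no libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# The three AMD (Cinebench, SPECint) gain pairs quoted in the thread
cinebench = [19.3, 14.9, 5.1]
specint = [16.8, 19.7, 5.6]

print(round(pearson_r(cinebench, specint), 3))  # 0.876 for these three pairs alone
```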
Correlation coefficient is NOT used for prediction.

I know how to calculate the correlation coefficient, but that doesn't tell us what you seem to think it does. You can't use strong correlation to mean small variance between data sets (e.g., that the trend of one can be replaced by the value of the other).
To visualize this, take the following data sets:
[Attachment 97350: chart of the two perfectly correlated example data sets]
They have perfect correlation (coefficient = 1). However, one does not represent the growth of the other, and the data sets diverge significantly over their range (they just do so linearly, which is why the correlation coefficient is 1). What you are claiming cannot be supported by calculating the correlation coefficient, especially when the data set is ridiculously small. I thought showing the projected Cinebench IPC numbers based upon past generations would make that obvious, but I guess not. Maybe it becomes clearer if we look at already-known values.
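The point about perfect correlation despite diverging values is easy to demonstrate; a small Python sketch with made-up series (b = 3a + 10, purely illustrative):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# b is exactly 3*a + 10: the absolute gap between the series keeps growing,
# yet the correlation is perfect because the relationship is linear
a = [1, 2, 3, 4, 5]
b = [3 * x + 10 for x in a]   # [13, 16, 19, 22, 25]

print(round(pearson_r(a, b), 6))       # 1.0
print([y - x for x, y in zip(a, b)])   # [12, 14, 16, 18, 20] -- the gap keeps growing
```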
Let's take the first two generations calculated and pretend that Zen 3 hadn't been tested with SPECint yet. If Cinebench is a good representative of SPECint IPC increases, then we should be able to predict the Zen 2 -> Zen 3 SPECint increase from the Zen 2 -> Zen 3 Cinebench number, using the Zen+ -> Zen 2 relationship.
| Cinebench R23 | SPECint 2017
Zen+ -> Zen 2 | 19.3% | 16.8%
Zen 2 -> Zen 3 | 14.9% | 12.97% predicted / 19.7% actual
But the actual value of the Zen 2 -> Zen 3 IPC increase was 19.7%. That means the predicted value was off by 34.2% (in relative terms). That's a terrible error percentage; it was more than 1/3 off from the actual value.
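Checking that arithmetic in Python (numbers from the table above):

```python
# Zen+ -> Zen 2 ratio: SPECint gain per unit of Cinebench gain
ratio = 16.8 / 19.3

# Predict the Zen 2 -> Zen 3 SPECint gain from its Cinebench gain
predicted = 14.9 * ratio   # ~12.97
actual = 19.7

relative_error = (actual - predicted) / actual
print(f"predicted {predicted:.2f}%, off by {relative_error:.1%}")  # 12.97%, 34.2%
```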
Feel free to continue to believe how you want, but the numbers show that Cinebench is a terrible predictor of SPECint IPC.
This discussion has nothing to do with CPU architectures or benchmarks.

[...] has never observed a set of different computing algorithms running on a set of different microarchitectures.
(PS, actually even different implementations of one and the same microarchitecture show different performance scaling with power, thermals, memory subsystem etc. for different algorithms. PPS, envisioning a correlation between INT and FP takes the cake though.)
Correlation coefficient is NOT used for prediction.
Correlation coefficient is used to accept or reject the null hypothesis.
In this case, the p-value is roughly 0.025, which is less than 0.05 for 95% confidence interval.
So you have to reject the null hypothesis.
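For anyone wanting to check the hypothesis-test step: a standard way to test H0: rho = 0 for a Pearson r is the t transform t = r * sqrt((n - 2) / (1 - r^2)), compared against a t distribution with n - 2 degrees of freedom. A sketch assuming the quoted r = 0.940594 over n = 6 pairs (the six-pair data set itself isn't reproduced in the thread):

```python
from math import sqrt

# Quoted correlation; n = 6 assumes three AMD plus three Intel generation
# pairs (the Intel deltas themselves are not reproduced in the thread)
r = 0.940594
n = 6

# t statistic for H0: rho = 0, with n - 2 degrees of freedom
t = r * sqrt((n - 2) / (1 - r * r))
print(round(t, 2))  # ~5.54; look this up in a t table with 4 degrees of freedom
```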
Doesn't look too good for "40% core-for-core faster than Zen 4 in SPECint 2017" claims, if [Cinebench numbers are] true.
So use the individual scores, across each of SPECint and SPECfp, throw in the individual scores of Geekbench, add every benchmark that spits out a number where higher is better, get the data set to an acceptable size for the analysis, and you will still find that what I showed remains unchanged across CPU architectures.

Sure, but the entire analysis is in question due to the tiny sample size.
Feel free to continue to believe how you want, but the numbers show that Cinebench is a terrible predictor of SPECint IPC.
And the AVX instructions in that graph are mostly scalar additions and multiplications. Very few vector instructions.
Only 43 days left until the AMD keynote at Computex on June 3, and still no solid leaks. I wonder if we'll get any credible and precise performance numbers or pricing info before then, or if AMD will manage to keep everything secret until we hear it from Lisa Su directly. Any opinions, based on e.g. past track record?

My daughter's friend's uncle's nephew said Zen 5 > Arrow Lake by a good chunk

Solid, credible and precise. 🙃