Since you used the generic term 'AVX', I'm curious: does that include AVX-512, or are you just referencing the first implementation of AVX (Sandy Bridge)?
Not entirely.
lowp is not necessarily fp16 - iOS and Android don't guarantee fp16 - it can be anywhere between fp10 and fp16 depending on the hardware implementation.
The same goes for highp, which can be anywhere between fp24 and fp32 depending on the hardware.
Most compilers/drivers on mobile devices can...
I have a hard time following your logic here. What does 'Usually doubled compute power resulted in ~30% improvement' have to do with the precision of the fp calculations?
Are you implying that there's a difference of 30% between using fp16 and fp32 for a general pixel shader on the iPad...
That would be great, but it will never happen because there's no business case for providing such a benchmark. The amount of work associated with delivering highly hand-optimized code for each CPU family is just too much for any 3rd party. What should the benchmark test? Intel would love heavy...
And most developers don't even use SSE efficiently - 98% rely on compilers and couldn't care less - they won't even look at ISPC. I do agree that AVX (and specifically AVX-512) would have been great on all CPUs.
Then you're not able to benchmark different architectures against each other. I can understand that you accept the use of GPGPU resources on one platform and the CPU on another when benchmarking different architectures. I don't.
If you experience any changes in the behaviour, you would need to decompile and analyze the new code - you have no symbols or debug information. Once you've created a solution, you'll have to compile and test on all the other supported platforms - Android, OS X, Windows and Linux - before release.
How would that help? 'Just rewrite'? Are you aware of the resources needed for constant rewriting on multiple platforms?
You're uploading the next best thing to source code - you no longer have control over the specifics of your benchmark. There can be a large performance difference between...
Does this include bitcode delivery via Xcode? It's the default, but still optional for now - it will be a requirement in the near future. How will you prevent behind-the-scenes optimizations?
Which workloads use NEON/SSE SIMD instructions? We are talking about SIMD instructions, not scalar instructions, right?
So the Windows version of GB4 will use VS2015?
Thank you for participating.
According to your own test - the Sharpen filter, etc.
Will GB4 include any SIMD processing?
Since GB4 will use Apple's Xcode 7 for iOS, will the desktop version include the latest Intel compiler?