Blender on ARM


jhu

Lifer
Oct 10, 1999
11,918
9
81
Updated with Snapdragon 801. Had to underclock to 422 MHz because of throttling issues on my phone. Rather impressed that performance/clock is similar to Core 2! Not what I expected at all.
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
Updated with Snapdragon 801. Had to underclock to 422 MHz because of throttling issues on my phone. Rather impressed that performance/clock is similar to Core 2! Not what I expected at all.
It looks very wrong to me; there is no way the S801 could be 3x more efficient than the Cortex-A15 in the Exynos 5250, especially on FP. I guess it wasn't running at 422 MHz.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
It looks very wrong to me; there is no way the S801 could be 3x more efficient than the Cortex-A15 in the Exynos 5250, especially on FP. I guess it wasn't running at 422 MHz.

It is anomalous. Unfortunately the app (No Frills) says it's at 422 MHz. Is there a better way to check CPU frequency on phones?

Thing is, when I set the CPU frequency to 422 MHz, the phone UI is slow as molasses while Blender is running. When I set it back to normal, the UI response returns to normal while Blender is running. Also, the phone doesn't warm up at 422 MHz, whereas it does at normal frequencies.
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
It is anomalous. Unfortunately the app (No Frills) says it's at 422 MHz. Is there a better way to check CPU frequency on phones?
Getting the frequency under Android seems to be a pain. Did you try CPU-Z for Android?
 

wilds

Platinum Member
Oct 26, 2012
2,059
674
136
I've been using TinyCore to monitor CPU core 0 frequency in the system bar. Really accurate and updates/polls quickly.
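For a check that doesn't depend on an app: the Linux cpufreq driver reports the current clock through sysfs, and Android kernels normally expose the same files. Here is a minimal Python sketch, assuming you have a shell on the device with Python available and that `scaling_cur_freq` is readable there (on some phones it may need root). The helper name `read_cpu_mhz` is made up for illustration; reading the same file with `cat` gives the same number.

```python
from pathlib import Path

# Standard cpufreq sysfs location on Linux; Android kernels normally expose it too.
# Whether it is readable without root depends on the device (assumption).
CPUFREQ_FILE = "/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_cur_freq"


def read_cpu_mhz(core=0, template=CPUFREQ_FILE):
    """Return the current clock of one core in MHz (sysfs reports kHz)."""
    raw = Path(template.format(core=core)).read_text().strip()
    return int(raw) / 1000.0


if __name__ == "__main__":
    try:
        print(f"cpu0: {read_cpu_mhz(0):.1f} MHz")
    except OSError as err:
        # Missing file or no permission: report it instead of crashing.
        print(f"could not read cpufreq: {err}")
```

Polling this in a loop while the render runs would show whether the clock really stays where it was set.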
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Updated with OMAP 4470. Removed Snapdragon result because I can't tell what the actual speed of the processor is. I've narrowed it down to the following more plausible numbers (40057 samples/s single core only):

1.267 GHz - 31615 samples/s/GHz
1.498 GHz - 26740 samples/s/GHz
1.574 GHz - 25449 samples/s/GHz
1.728 GHz - 23181 samples/s/GHz
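The per-GHz figures above are just the 40057 samples/s result divided by each candidate clock. A short Python sketch of that arithmetic (the function name is made up; the numbers are the ones from the post):

```python
SAMPLES_PER_SEC = 40057  # single-core result quoted in the post

# Candidate clock speeds (GHz) listed in the post.
CANDIDATE_GHZ = [1.267, 1.498, 1.574, 1.728]


def per_ghz(samples_per_sec, ghz):
    """Throughput normalised by clock: samples/s/GHz."""
    return samples_per_sec / ghz


for ghz in CANDIDATE_GHZ:
    # The post's figures match these values with the fractional part dropped.
    print(f"{ghz:.3f} GHz - {int(per_ghz(SAMPLES_PER_SEC, ghz))} samples/s/GHz")
```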
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
The Exynos build was most likely compiled without any sort of feature set identification meaning VFP/NEON optimisations were left out. Every other processor on the list had some sort of SSE or AVX flag set when the binaries were being compiled.
VFP is used but NEON isn't. And in fact NEON is not very good for FP since it's not IEEE compliant and anyway can only be used for single precision. ARMv8 64-bit NEON fixes these issues at last.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
BTW, if anyone is running Ubuntu 14.04 on a Haswell Core i3 or Core i7, I'd like to get your results too.
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
BTW, if anyone is running Ubuntu 14.04 on a Haswell Core i3 or Core i7, I'd like to get your results too.
You don't need results on 4770k with no HT, no OC and Fedora 19 Blender 2.68a?

EDIT: Time 2:13.90
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
You don't need results on 4770k with no HT, no OC and Fedora 19 Blender 2.68a?

EDIT: Time 2:13.90

That seems wrong. An i7 4770k stock shouldn't be slower than a Core i5 4570 @ 3.4 GHz turbo (time 2 minutes 3.7 seconds). Unless you really are running Fedora 19 with Blender 2.68a...
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
That seems wrong. An i7 4770k stock shouldn't be slower than a Core i5 4570 @ 3.4 GHz turbo (time 2 minutes 3.7 seconds). Unless you really are running Fedora 19 with Blender 2.68a...
My post clearly states my configuration
 

Nothingness

Diamond Member
Jul 3, 2013
3,292
2,360
136
4C/4T 4770K

2.72b: 2:08.76
2.71: 2:01.09

EDIT: I guess this doesn't use AVX as my temps only went up to 45°C, while AVX-heavy programs tend to go higher than 65°C.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
4C/4T 4770K

2.72b: 2:08.76
2.71: 2:01.09

EDIT: I guess this doesn't use AVX as my temps only went up to 45°C, while AVX heavy programs tend to go higher than 65°.

Could you try it with HT enabled? That'd be more informative (already have a 4C/4T Haswell result).
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
With HT enabled 1:31.28 for 2.71.

Note my RAM is OC to 2400.

Thanks, much appreciated. It's about in line with what I expected.

By my calculations, a stock 5960X should do this in about 56 seconds!
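The times in this exchange are easier to compare once they're in seconds. Here is a small Python sketch, assuming the two Blender 2.71 runs (2:01.09 at 4C/4T and 1:31.28 with HT enabled) were done on the same 4770K; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(stamp):
    """Convert an 'm:ss.ff' render time into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


no_ht = to_seconds("2:01.09")    # 4C/4T, Blender 2.71
with_ht = to_seconds("1:31.28")  # HT enabled, Blender 2.71

print(f"no HT: {no_ht:.2f} s, HT: {with_ht:.2f} s")
print(f"speedup from HT: {no_ht / with_ht:.2f}x")
```

On those two numbers HT is worth roughly a third more throughput, though the RAM overclock mentioned above may account for part of it.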
 
Dec 30, 2004
12,553
2
76
That would be a different type of comparison. The Povray compilation via ICC and GCC is just to show which one is better on FX. Still, I haven't tried the Open64 compiler that AMD is supporting (I don't know why they don't just funnel that support into GCC and LLVM instead). I also haven't tested Intel's MKL libraries, since I'm more interested in rendering performance.

My mistake, I read your comment as a rebuttal to mine.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
4C/4T 4770K

2.72b: 2:08.76
2.71: 2:01.09

EDIT: I guess this doesn't use AVX as my temps only went up to 45°C, while AVX heavy programs tend to go higher than 65°.

You are correct. The official binary has these build flags:

-DWITH_FREESTYLE -pipe -fPIC -funsigned-char -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -fopenmp -DNDEBUG -O2 -msse -msse2 -DWITH_MOD_FLUID -DWITH_MOD_OCEANSIM -D__LITTLE_ENDIAN__ -DWITH_AUDASPACE -DWITH_AVI -DWITH_OPENNL -DHAVE_STDBOOL_H

I think there was some talk about adding AVX or AVX2 support eventually, maybe. You can always compile your own and see what happens. Thus far, I can't get my compiled binaries to run on Linux because they keep segfaulting.
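Before trying a custom build with extra instruction-set flags, it may help to confirm what the CPU itself reports. Here is a rough Python sketch that parses `/proc/cpuinfo` on Linux, where x86 kernels list features on `flags` lines and ARM kernels on `Features` lines. The function name `cpu_features` is made up, and this only shows what the processor supports, not what a given Blender binary actually uses.

```python
def cpu_features(cpuinfo_text):
    """Collect feature names from /proc/cpuinfo text.

    x86 kernels list them on "flags" lines, ARM kernels on "Features" lines.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            found.update(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            feats = cpu_features(f.read())
    except OSError:
        feats = set()  # not Linux, or the file is not readable
    for name in ("sse2", "avx", "avx2", "vfp", "neon"):
        print(f"{name}: {'yes' if name in feats else 'no'}")
```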
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Updated with custom-compiled binaries. These are faster than the precompiled binaries. Other speed improvements would probably come from also compiling the other libraries that Blender links against (particularly Python). And, of course, the major speed improvement would be using NVIDIA GPUs. I'd actually like someone to do that comparison.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Why not -march?

Refuses to compile with -march=core-avx2

It just stops at ~12% with an error.

What happens? Anyway I don't think it would bring a significant speedup.

Refuses to compile with NEON support (-mfpu=neon -funsafe-math-optimizations); Same as above: stops at ~12% with an error.

BTW, someone on Reddit figured out how to set and keep clock speeds on the Android devices (mainly by disabling /system/bin/mpdecision), so I've put up the Snapdragon 801 results again. Now working on the Snapdragon S4 Pro.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Updated with Snapdragon S4 Pro (APQ8064) results. Why would Qualcomm design processors slower than the stock ARM ones?
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Updated with Snapdragon S4 Pro (APQ8064) results. Why would Qualcomm design processors slower than the stock ARM ones?

It's not like Blender is very representative of common loads for an apps processor that's deployed almost entirely in mobile (phones and tablets). In the apps space here double precision FP is rarely used very heavily. I don't know what else Blender really stresses - I doubt it's just doubles or Saltwell wouldn't perform that well either - but I do know that it's an application that isn't popular for the platform in general.

That said, Krait 200 came in products about 10 months before Cortex-A15 did, so it's not like Qualcomm had it as an alternative. As far as Cortex-A9 goes, Krait 200 usually beats it, although not always, and especially when it's clocked nearer its peak frequencies rather than at the low frequency you have it clocked at. Krait 300 and 400 improve things a little further, but maybe not as much as Qualcomm would have hoped. The performance is really all over the place vs the competition. It does seem to have some pretty big glass jaws, like small L1 caches, a fairly high L1 dcache latency when the L0 cache is missed (and some loads will probably miss it pretty frequently), a very high L2 cache latency, and some weird decoding penalties - see here:

http://www.7-cpu.com/cpu/Krait.html

Now that's just looking at performance, whereas power efficiency and area are also huge factors. So it's hard to judge it on performance alone.

I think going with Cortex-A57 in their current flagship (810) is a way of conceding that their uarch has fallen too far behind. Not that adding 64-bit support is trivial, but if it came down to only that I think they could have managed it in time.
 