Which DC projects/subprojects use AVX-512?

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,639
14,629
136
I thought it was one of the Primegrid tasks and possibly more, but now I can't find it documented anywhere. Anything that supports it, I will run for the team and leave my medical for the competition.
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
The LLR2 application makes use of AVX-512. The following projects are built on LLR2:
  • All of the PrimeGrid LLR subprojects,
  • all SRBase subprojects except the GPU project "TF",
  • the LLR2 testing subproject at Private GFN Server.
The PRST application, which is similar to LLR2 and is run at Private GFN Server's subproject of the same name, supports AVX-512 too.
As does the genefer22 application, which is used by
  • PrimeGrid GFN-15…GFN-22, if they are run on CPUs instead of a GPU.¹
All of these projects are concerned with finding primes, or with proving/disproving conjectures connected with primes. So far I haven't heard of other active Distributed Computing projects whose applications benefit from AVX-512.

Perhaps Folding@Home's CPU-only FAHCore_a8 uses AVX-512, perhaps not. The older FAHCore_a7 most likely does not. Both are based on GROMACS, which offers AVX-512 support, but the GROMACS builds in the F@H cores might not have it. If you enable an F@H CPU slot on an AVX-512 capable computer (probably with at most 64 logical CPUs to suit FAHCore_a8's limitations, IIRC), the client log will probably show you which SIMD flavor is being used.

________
¹) edited: GFN-15 is based on genefer22 too, since April.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,639
14,629
136
Thanks Stefan!
 

mmonnin03

Senior member
Nov 7, 2006
218
221
116
I've been running the PRST tasks at Private GFN for WUProp hours. On my 7950X with 6 concurrent tasks they run 5-6 hours. With 32 concurrent tasks they slowed down quite a lot, to 12-13 hours each. With 6 tasks, filling the remaining threads with SPT also slowed down those 6 tasks. I'm guessing the caches were being filled. Can these still be run multithreaded (mt), to fill up the threads with fewer tasks?
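
For comparing the two setups by throughput rather than by run time per task, here is a rough sketch in Python. It assumes the midpoints of the ranges quoted above (5.5 h and 12.5 h) and ignores whatever else was running on the remaining threads in the 6x case; the function name is made up for illustration.

```python
def tasks_per_day(concurrent_tasks: int, hours_per_task: float) -> float:
    """Completed tasks per day when `concurrent_tasks` run side by side."""
    return concurrent_tasks * 24.0 / hours_per_task


# Midpoints of the quoted run times (an assumption, not a measurement).
few = tasks_per_day(6, 5.5)     # 6 concurrent tasks at 5-6 h each
many = tasks_per_day(32, 12.5)  # 32 concurrent tasks at 12-13 h each

print(f"6x:  {few:.1f} tasks/day")
print(f"32x: {many:.1f} tasks/day")
print(f"ratio 32x vs 6x: {many / few:.2f}")
```

Whether more concurrent single-threaded tasks or fewer multithreaded ones come out ahead would still have to be measured; this only puts the quoted numbers on a common scale.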

rebirther posted an update that PRST's results aren't matching those of LLR, so testing has stopped.
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
It's possible that the applications which we are talking about here could see another large speed increase going from AMD Zen 4 to Zen 5, per core and per clock. (Not sure if AVX2 applications, i.e. ones with 256-bit vectors, will be addressed by these core upgrades too. From how AMD handled such things in the past, I'd say yes, but who knows.)

But given that there will only be a minor update to the manufacturing node, such a speedup would also come at the cost of an almost proportional increase in power consumption.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
I got this message from Asteroids@home talking about an AVX512 update:


Asteroids@home: New AVX512 application released
We are very proud to announce our new set of optimized applications that will utilize AVX512 instruction set capable engines or, to be precise, those which support AVX512dq instructions!

These applications are built to support both Linux and Windows 64bit architecture OS. The development of this version was possible thanks to the great help provided by ahorek's team !

Unfortunately it turns out that BOINC client applications for Windows still do not report all processor options to the server correctly. This is because of a known bug, and even after a lot of discussion in BOINC's channels it's still there. The good news is that thanks to ahorek's team a bugfix was already accepted and merged into the BOINC repository, and the fix will be applied when client version 7.26.0 is released. Until then, in order to run the AVX512 application you might need to switch to the Anonymous platform.

We'd like to remind you that the BOINC server is capable of finding the best-performing application for every particular system, taking multiple factors into account, but it will start sending the right one only after a while. This means that even if your CPU supports AVX512dq instructions, it still might receive FMA or AVX tasks, and there is nothing to be concerned about. In such a case you might want to give the so-called Anonymous platform a try, where your client will explicitly request the AVX512 application.

Happy crunching and thank you for your support!
Asteroids@home's team
More info here - https://asteroidsathome.net/boinc/forum_thread.php?id=988
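
To see whether a Linux box reports AVX512DQ at all, something like the following Python sketch should do. It assumes an x86 Linux system where /proc/cpuinfo has a "flags" line containing "avx512f" and "avx512dq"; the helper names are made up for illustration. It only shows what the kernel reports for the CPU, not what the BOINC client passes on to the project server.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512dq(flags: set[str]) -> bool:
    """True if both the AVX-512 foundation and the DQ extension are listed."""
    return {"avx512f", "avx512dq"} <= flags


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("AVX512DQ reported:", has_avx512dq(cpu_flags(path.read_text())))
    else:
        print("No /proc/cpuinfo here (probably not Linux)")
```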
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Other posters in the CPU subforum have linked to it before, but anyway:
mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > Zen4's AVX512 Teardown
An analysis by the author of y-cruncher. One of the points to take home: Although the theoretical peak throughput of 256-bit AVX2 on Zen3/Zen4 and of AVX-512 on Zen4 is the same clock-for-clock, moving an application to AVX-512 on Zen4 can reduce bottlenecks in the CPU's frontend ( = utilize the execution units better), and also reduce the energy spent in the CPU's frontend ( = spend correspondingly more of the overall power budget in the execution units and elsewhere).
The Genefer application for CPUs has got a command line switch which toggles between different instruction sets:
-x <implementation> set a specific implementation (i32, sse2, sse4, avx, fma, 512)
I tried the default AVX512 and also the Zen-3-style FMA3 (-x fma) on EPYC 9554P @ 400W with genefer -n 20 -b 2615062. This workunit gets 34,066.53 credit.
FMA3:
Code:
tasks x threads, affinity |  avg. task duration   | tasks/day | points/day | power | efficiency
--------------------------+-----------------------+-----------+------------+-------+------------
8x8, ascending            |   5:54:07 =   21247 s |      32.5 |  1,108,218 | 475 W | 2,330 PPD/W
AVX512:
Code:
tasks x threads, affinity |  avg. task duration   | tasks/day | points/day | power | efficiency
--------------------------+-----------------------+-----------+------------+-------+------------
8x8, ascending            |   5:09:43 =   18583 s |      37.2 |  1,267,104 | 474 W | 2,670 PPD/W
So in this specific case, AVX512 gives +14.3 % throughput and +14.6 % power efficiency over AVX2 FMA3.
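
For anyone who wants to redo the arithmetic, here is a small Python sketch. It takes only the numbers from the two tables above (8 concurrent tasks, average task duration, power draw) plus the quoted credit per workunit; the variable and function names are my own. Since the average durations are rounded to whole seconds, the points/day come out slightly different from the table, but the percentages agree.

```python
CREDIT_PER_TASK = 34_066.53  # credit quoted above for this workunit
SECONDS_PER_DAY = 86_400


def points_per_day(concurrent_tasks: int, avg_task_seconds: float) -> float:
    """Credit per day with `concurrent_tasks` running side by side."""
    tasks_per_day = concurrent_tasks * SECONDS_PER_DAY / avg_task_seconds
    return tasks_per_day * CREDIT_PER_TASK


# name: (concurrent tasks, avg. task duration in s, power in W), from the tables
runs = {"FMA3": (8, 21247, 475), "AVX512": (8, 18583, 474)}

results = {}
for name, (tasks, seconds, watts) in runs.items():
    ppd = points_per_day(tasks, seconds)
    results[name] = (ppd, ppd / watts)
    print(f"{name:7s} {ppd:12,.0f} PPD  {ppd / watts:8,.0f} PPD/W")

throughput_gain = results["AVX512"][0] / results["FMA3"][0] - 1
efficiency_gain = results["AVX512"][1] / results["FMA3"][1] - 1
print(f"throughput {throughput_gain:+.1%}, efficiency {efficiency_gain:+.1%}")
```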

(Same workunit on RTX 4090 with Kaby Lake and Z270 PC: 1,771,460 PPD; 380 W; 4,700 PPD/W)
 