Anandtech vs Tom's Hardware Folding@Home Coronavirus Race thread


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,671
14,676
136
One of my five 2080 Tis constantly gets hung. I tried the new version and the latest Nvidia drivers; no help.

19:08:05:WARNING:WU00:FS01:FahCore returned: BAD_FRAME_CHECKSUM (112 = 0x70)
19:08:05:WARNING:WU00:FS01:Fatal error, dumping
19:08:05:WU00:FS01:Sending unit results: id:00 state:SEND error: DUMPED project:14436 run:251 clone:0 gen:8 core:0x22 unit:0x0000000a03854c135e9a7910c31dd511
19:08:05:WU00:FS01:Connecting to 3.133.76.19:8080
19:08:05:WU01:FS01:Connecting to 65.254.110.245:80
19:08:06:WU00:FS01:Server responded WORK_ACK (400)
19:08:06:WU00:FS01:Cleaning up
 

Mike_F

Junior Member
Jan 27, 2020
17
26
51
sudo nvidia-smi -pl ### [-id #]
Replace ### with the desired board power target in Watts.
If you want to set separate limits for each card, first find out the IDs with nvidia-smi -L, then use above command with the extra -id argument.


Yes. This does not persist through reboots, nor through driver unload/reload.

First of all, peace to you, StefanR5R, and thank you.
However, I get:
folder@Complicare:~$ sudo -n nvidia-smi -i 0 --pl=110
sudo: a password is required
folder@Complicare:~$ -n nvidia-smi -i 0 --pl=110
-n: command not found
folder@Complicare:~$ sudo nvidia-smi -pl 110 [-id 0]
[sudo] password for folder:
Invalid combination of input arguments. Please run 'nvidia-smi -h' for help.

folder@Complicare:~$ sudo nvidia-smi -pl 110 -id 0
Invalid combination of input arguments. Please run 'nvidia-smi -h' for help.

Who's arguing?! I'm conflict averse!
Seriously, what am I doing wrong?
(I'm a mechanic/hardware guy. I have no software training at all and virtually no experience in Linux. I only know enough to cause knowledgeable people headaches.)
 

StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
Sorry, I gave you a slightly wrong syntax. It's -i, not -id, for the identification of an individual card.

First, check which one of your two cards is the first one (number 0), and which is the second (number 1) with
nvidia-smi
or with
nvidia-smi -L
which is less verbose than the former.

Then, set the power limit of the first card with
sudo nvidia-smi -pl 123 -i 0
and of the second card with
sudo nvidia-smi -pl 123 -i 1
or of both cards at once, if you want them to have the same power limit, with
sudo nvidia-smi -pl 123
Of course, instead of 123, enter the wattage you want (but one within whatever limits the firmware allows).
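If you want those limits to come back automatically after a reboot, one option is a small boot-time script. This is only a sketch, assuming a Linux install where root's crontab is available; the script path and the wattages below are made up, so adjust them for your cards:

#!/bin/sh
# Hypothetical /usr/local/bin/gpu-power-limits.sh, run as root at boot
# Keep the driver initialized so the limits stick until the next driver unload
nvidia-smi -pm 1
# Re-apply the desired power limit per card (example wattages, change to suit)
nvidia-smi -i 0 -pl 110
nvidia-smi -i 1 -pl 110

Then an entry like
@reboot /usr/local/bin/gpu-power-limits.sh
in root's crontab (sudo crontab -e) re-applies the limits at every boot.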
 
Reactions: 1979Damian

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Aww cute!
Looks just like my friend's cat (Sophie), a ginger tabby (not sure if that's the right name), and she instantly jumps on my lap when I get there, lol.
 
Reactions: mopardude87

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
I finally rebooted by "accident" last night because we had issues with a fuse and it's not labeled one bit. Fun thing is, it picked right back up on the WU it had before shutdown, and it's still going. The second rig is down because my buddy is home watching his hate-mongering Fox News, so I guess if he leaves again I could put the 1060 back to work. Hmm, I could do the HD 630 trick, but with my luck he's gonna want to game, and getting games to run on a secondary GPU can be a chore. I don't even know if games are smart enough to run directly off the secondary. I usually run just one game, and it ran fine off the HD 630; it runs on half a potato.

No desktop lag when I had the HD 630 handling desktop duty; I only get the lag if I use the 1080 Ti by itself. Same lag on his rig, so hmm, maybe. I usually hate to tinker with other people's stuff, but this is for a good reason. Will have to bring it up later.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Well, I found out something interesting with my main rig & F@H yesterday; maybe it's already known to more experienced folders, but it's news to me.
I always thought we needed to keep a spare CPU core for GPU folding, otherwise its output would be significantly reduced.

But after having to enable all 12 threads for Rosetta to try to get WUs done before the deadline, I didn't see any real change in F@H output. Knowing that the PPD figures vary a lot anyway (not to mention running out of WUs), I thought I'd test it by looking at average GPU load & average GPU chip power draw instead (using GPU-Z).
Unfortunately this test is with 2 different WUs; I'm not sure how much difference that makes, but so far I've not been able to catch a WU at the right time to test both scenarios.
Anyway, over ~1.5 hrs with 1 CPU core free, av. GPU load was 87% & av. GPU power draw was 86 W.
With all 12 threads crunching Rosetta for ~1.2 hrs, av. GPU load was 86% & av. GPU power draw was 89 W.
So no real difference!
FYI, my AMD RX 580 is underclocked to 1275 MHz (a 5.5% drop from 1350) & undervolted to 1050 mV (an 8.7% drop from 1150). That's dropped GPU temps by ~8°C, allowed the fan to run much slower & quieter, & cut power draw by ~25 W!
So again, for my rig, it would seem I don't need to keep a CPU core spare. I would like to test this again with the same WU though.
Also, I do wonder whether some WUs would be more affected than others?

What have other people found with their rigs & different WUs?
 

Endgame124

Senior member
Feb 11, 2008
956
669
136
What have other people found with their rigs & different WUs?
Nvidia GPUs use CPU polling to see if the card needs the CPU to pull something out of memory (the CPU constantly asks the GPU if it needs anything).

AMD GPUs are event driven (when they need data, they notify the CPU). I have found that I can run 2 (really old) AMD cards without a notable impact on CPU usage.
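If anyone wants to check how much CPU the GPU core actually burns on their own box, here is a minimal sketch for Linux; it assumes the sysstat package is installed and that the GPU slot runs a FahCore_22 process (the core name may differ on your setup):

# Find the PID of the GPU folding core
pgrep -a FahCore
# Sample that process's CPU usage every 5 seconds (Ctrl-C to stop)
pidstat -p <PID> 5

A %CPU column sitting near 100 means a whole thread is busy feeding/polling the GPU; a few percent means the card mostly runs on its own.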
 

Endgame124

Senior member
Feb 11, 2008
956
669
136
Nice answer, @Endgame124. I was about to answer that Nvidia is more dependent on the CPU across all DC projects, but I didn't understand the 'why' of it all. Now I do. Thanks!
Also worth noting, Nvidia GPUs only use polling for OpenCL (Open Computing Language, the open GPU programming standard that works on both Nvidia and AMD). Since OpenCL is cross-platform, distributed computing clients prefer it because they only need to write the code once and it runs on both Nvidia and AMD cards. Also, as far as I understand it, the Linux scheduler is less resource-intensive for polling, which in turn is part of the reason why Linux is faster with Nvidia cards.

CUDA, Nvidia's proprietary GPU language, is event driven, so if a project used it your PC would have very low CPU usage when running GPU projects. However, it requires Nvidia cards, and (as I understand it) it requires licensing by the developing institution. Since CUDA is specific to Nvidia, it can use features of Nvidia GPUs not available in OpenCL, and all other things being equal CUDA is generally faster, but the drawbacks are big enough that I don't expect it to be widely implemented for DC usage.

Finally, F@H is working on a CUDA client which has now gotten extra attention from Nvidia due to Covid, so there will likely be a CUDA client soon for F@H. In this case, I would highly recommend Nvidia users switch to the CUDA client when it becomes available, as it will likely only use 1% or less of CPU to feed your GPUs.
 

StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
Also, as far as I understand it, the Linux scheduler is less resource-intensive for polling, which in turn is part of the reason why Linux is faster with Nvidia cards.
From what I read, the Windows driver stack causes frequent memory copies which don't happen on Linux. Supposedly this is an enabler for features like live recovery from driver crashes and live driver updates on Windows. In any case, it is evident that several GPGPU applications in the DC world, including FahCore_21 (I haven't checked _22 in this regard yet), involve far more memory transfers in their Windows port than in their Linux port on identical hardware. This is presumably the major reason why such applications, most notably FahCore, which needs host-bus bandwidth the most, perform better on Linux than on Windows.
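For anyone who wants to see this on their own machine, newer Nvidia drivers can report PCIe throughput from the command line. A rough sketch (the available columns depend on driver version and GPU):

# Sample PCIe RX/TX throughput (MB/s) once per second while folding; Ctrl-C to stop
nvidia-smi dmon -s t

Comparing the numbers for the same project on a Windows host and a Linux host with the same hardware should show the difference in host-bus traffic.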

CUDA is generally faster, but the drawbacks are big enough that I don't expect it to be widely implemented for DC usage.
There are several CUDA applications around in the BOINC world.

Finally, F@H is working on a CUDA client which has now gotten extra attention from Nvidia due to Covid, so there will likely be a CUDA client soon for F@H. In this case, I would highly recommend Nvidia users switch to the CUDA client when it becomes available,
I'd expect the respective FahCore to be downloaded automatically to capable hosts. Edit: of course we'll have to see how this is implemented once it goes public.

as it will likely only use 1% or less of CPU to feed your GPUs.
Hmm, extrapolating from what I have observed with various CUDA applications, I'd say this remains to be seen.
 

Endgame124

Senior member
Feb 11, 2008
956
669
136
There are several CUDA applications around in the BOINC world.
It's very difficult to tell from the project lists whether a project supports both CUDA and OpenCL, or just OpenCL, when it lists both AMD and Nvidia support. You are certainly more knowledgeable than I am on that, especially given the number of projects not listed on the BOINC project list.
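One crude way to check on Linux is to look at which GPU runtime the science application actually links against. This is only a sketch; it assumes the app binary is dynamically linked, and the path and file names below are placeholders that differ per project:

# From the BOINC data directory, inspect the GPU app binary
ldd projects/<project_url>/<gpu_app_binary> | grep -i -e libcuda -e libopencl
# libcuda.so in the output suggests a CUDA build, libOpenCL.so an OpenCL build

(Some apps load these libraries at run time instead, so an empty result isn't conclusive.)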

Hmm, extrapolating from what I have seen from various CUDA applications, I'd say this remains to be seen.
You're probably right, as implementation will mean everything. I'm only speaking from having moved from OpenCL to CUDA on some ML applications at work (note, I don't actually do ML programming, I just test the performance of other people's code and architecture). It's entirely possible I'm overly optimistic when it comes to CUDA's performance vs. OpenCL.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,218
3,647
136
www.google.com
Most DC projects are NOT known for having efficient, well-coded programs. But from the CUDA DC apps I have seen, the Nvidia GPU needs an entire thread to function. I am running a GPUGRID task to demonstrate (or prove myself wrong, perhaps). Pic will be added in 30 minutes or so. (Edit: done already.)

For OpenCL, Nvidia doesn't appear too bad on Milkyway, but Nvidia still requires substantially more CPU than AMD. For Einstein it is back to a full thread for Nvidia.

[Screenshots of CPU usage, by project:]
Milkyway
Einstein (RTX 2080Ti) (FGRPopenclTV-nvidia)
Einstein using an AMD R9-280X (credit pending)
GPUGRID
 

StefanR5R

Elite Member
Dec 10, 2016
5,633
8,107
136
Re OpenCL vs. CUDA usage at Nvidia-GPU enabled DC projects, I clicked through Orange Kid's list:
OpenCL —
  • Amicable Numbers
  • Collatz Conjecture
  • Einstein@Home
  • Folding@home
  • MilkyWay@home
  • distributed.net/ Moo! Wrapper
  • NumberFields@home (Windows application)
CUDA —
  • Asteroids@home
  • GPUGrid
  • distributed.net/ Moo! Wrapper again, with older but still active application versions
  • NumberFields@home (Linux application)
  • PrimeGrid PPS Sieve
  • SRBase
PrimeGrid AP27 Search and PrimeGrid Genefer are listed as "OCL CUDA". Which of the categories is this? %-) OpenCL, I guess.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Interesting stuff. Following up on my earlier tests, I managed to test across a single long WU (nearly 8 hrs!, project 14416, 189k est. credit!).
Running on my RX 580:
10 CPU threads on R@H: av. GPU load 95%, av. GPU chip power 90.5 W (runtime 2.4 hrs)
12 CPU threads on R@H: av. GPU load 94%, av. GPU chip power 90 W (runtime 2.5 hrs)
So a negligible difference in F@H output.
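For anyone who wants to log the same averages from the command line instead of GPU-Z, here is a sketch for Linux; it assumes an amdgpu-driven card at card0 (the index may differ on your system) or, for Nvidia users, a driver recent enough to support the query fields:

# AMD (amdgpu driver): GPU load in percent and power draw in microwatts
cat /sys/class/drm/card0/device/gpu_busy_percent
cat /sys/class/drm/card0/device/hwmon/hwmon*/power1_average
# Nvidia: log load and power every 5 seconds to a CSV file
nvidia-smi --query-gpu=utilization.gpu,power.draw --format=csv -l 5 > gpu_log.csv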
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,671
14,676
136
Well, I am now getting work most of the time, but I am having serious delays sending it back, and it's killing my PPD. 10 million instead of 18 million, just from sending delays????
 

TennesseeTony

Elite Member
Aug 2, 2003
4,218
3,647
136
www.google.com
Perhaps give GPUGRID a go on BOINC. It's medical research too. I mean, not forever, just until interest wanes in F@H (although I hope it doesn't wane too much).
 