milkyway_nbody bogus credits?

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
On a whim, I started running MilkyWay@Home's "N-Body Simulation" today.

On a dual-socket EPYC 7452 (2x 155 W PPT), I am running 4-threaded tasks, all SMT threads used, and CPU affinity of tasks set to be aligned with CCXs. Average core clocks are 2.6 GHz. Edit: for the time being, the host runs 50% MilkyWay and 50% Asteroids (two client instances with MW@H with socket affinities, one client instance with A@H without affinities).

On a single-socket EPYC 9554P (360 W PPT), I am running 8-threaded tasks, all SMT threads used, and CPU affinity of tasks set to be aligned with CCXs. Average core clocks are 3.6 GHz. A small part of the current workload on this host consists of Asteroids@home.

(I configured the thread count per task via app_config.xml; there is no setting for this on the MW@H web site. By default, the application starts either 16 threads or as many threads as there are logical CPUs which BOINC is allowed to use, whichever is less. The thread count refers to the number of computational worker threads; each task launches one additional thread, but it consumes only a sub-second amount of CPU time, i.e. it sleeps practically all the time.)
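For reference, this kind of per-task thread count override goes into an app_config.xml in the project directory. A minimal sketch, assuming the app name is milkyway_nbody with plan class mt (check client_state.xml for the actual names on your host):

```xml
<app_config>
   <app_version>
      <app_name>milkyway_nbody</app_name>
      <plan_class>mt</plan_class>
      <!-- tell the scheduler how many CPUs the task occupies -->
      <avg_ncpus>4</avg_ncpus>
      <!-- tell the app how many worker threads to start -->
      <cmdline>--nthreads 4</cmdline>
   </app_version>
</app_config>
```

The client picks this up after a restart or a "read config files" command.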

Results from client instance on socket 0 of the dual-7452:
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
4630.24 | 14132.19 | 493.01 | 3.05 | 0.106 | 0.035
4467.86 | 13834.82 | 494.6 | 3.10 | 0.111 | 0.036
4511.08 | 13898.48 | 585.07 | 3.08 | 0.130 | 0.042
4350.68 | 13644.57 | 487.95 | 3.14 | 0.112 | 0.036
4653.54 | 14272.68 | 475.13 | 3.07 | 0.102 | 0.033

Results from client instance on socket 1 of the dual-7452:
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
4427.66 | 14053.83 | 498.26 | 3.17 | 0.113 | 0.035
4615.23 | 14315.8 | 491.47 | 3.10 | 0.106 | 0.034
4392.79 | 13786.02 | 486.87 | 3.14 | 0.111 | 0.035
4372.72 | 13903.69 | 486.58 | 3.18 | 0.111 | 0.035
4662.52 | 14655.39 | 476.96 | 3.14 | 0.102 | 0.033

Results from the 9554P:
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
1749.13 | 11345.92 | 207.24 | 6.49 | 0.118 | 0.018
4988.38 | 35653.47 | 700.93 | 7.15 | 0.141 | 0.020
5045.94 | 36031.08 | 626.63 | 7.14 | 0.124 | 0.017
1536.01 | 10843.42 | 215.54 | 7.06 | 0.140 | 0.020
6499.84 | 46624.73 | 845.3 | 7.17 | 0.130 | 0.018
4866.2 | 34857.28 | 700.06 | 7.16 | 0.144 | 0.020
1533.58 | 10750.01 | 194.52 | 7.01 | 0.127 | 0.018
1488.21 | 10427.97 | 191.67 | 7.01 | 0.129 | 0.018
6778.63 | 48483.06 | 946.37 | 7.15 | 0.140 | 0.020
5048.96 | 36133.04 | 756.07 | 7.16 | 0.150 | 0.021
4931.97 | 35358.92 | 721.99 | 7.17 | 0.146 | 0.020
6762.51 | 48365.44 | 846.41 | 7.15 | 0.125 | 0.018
4966.59 | 35623.22 | 776.77 | 7.17 | 0.156 | 0.022
4900.25 | 35102.64 | 671.11 | 7.16 | 0.137 | 0.019
4909.32 | 35267.84 | 680.48 | 7.18 | 0.139 | 0.019
6688.28 | 47963.89 | 871.52 | 7.17 | 0.130 | 0.018
1517.63 | 10711.25 | 192.42 | 7.06 | 0.127 | 0.018

The first three columns are cut and pasted from the results tables on the MW@H web site.

The fourth column, CPU time per Run time, corresponds well with the average CPU utilization which I see in "top" or "htop". As you can see, scaling isn't very good even at this low thread count.
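The derived columns are simple ratios of those three values; a minimal sketch of the computation, using the first row of the socket-0 table:

```python
# Derived metrics for one result, from the three values reported
# on the MW@H web site (first row of the socket-0 table above).
run_time = 4630.24   # seconds, wall clock
cpu_time = 14132.19  # seconds, summed over all worker threads
credit = 493.01

cpu_per_run = cpu_time / run_time    # ~3.05 cores busy on average
credit_per_run = credit / run_time   # ~0.106
credit_per_cpu = credit / cpu_time   # ~0.035

print(f"{cpu_per_run:.2f}  {credit_per_run:.3f}  {credit_per_cpu:.3f}")
```

With 4 worker threads per task, an average utilization of only ~3 cores is exactly the mediocre scaling mentioned above.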

Now to the subject of this thread:
  • As you can see, the 3.6 GHz Zen 4 host gets only about half the credit per CPU time of the 2.6 GHz Zen 2 host. If anything, the Zen 4 host should get more credit per CPU time than the Zen 2 host, due to its faster cores.
  • Likewise, the Zen 4 host gets merely 1.23 times the credit per run time of the Zen 2 host, even though the Zen 4 host throws twice as many, and faster, CPUs at each task.
I'll switch the Zen 4 host to 4 threads per task too and wait to see where this goes.

Edit: in case 4-threaded tasks on the Zen 4 host don't work out, I will have to test without Asteroids@Home in the mix.
 
Last edited:

Kiska

Golden Member
Apr 4, 2012
1,017
290
136
Now to the subject of this thread:
  • As you can see, the 3.6 GHz Zen 4 host gets only about half the credit per CPU time of the 2.6 GHz Zen 2 host. If anything, the Zen 4 host should get more credit per CPU time than the Zen 2 host, due to its faster cores.
  • Likewise, the Zen 4 host gets merely 1.23 times the credit per run time of the Zen 2 host, even though the Zen 4 host throws twice as many, and faster, CPUs at each task.

I would think this is the effect of CreditScrewNew, but I could be wrong in assuming that is the cause. Also, if it is CreditNew, have you run the BOINC benchmark? CreditNew takes the benchmark results into consideration.
 
Reactions: StefanR5R

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
BTW, the main reason I looked at credits per second is that I wanted to figure out whether the number of tasks per socket and threads per task which I chose are OK-ish, since I don't have an N-Body benchmark like I do for LLR2 and Genefer.

[For a benchmark, I would like the application to report its progress percentage, so that I don't have to run it to full completion. The application reports progress to the BOINC client, but not to stderr (at least when run under BOINC; I haven't tried running it standalone yet, which might change its output).]

Here are results from almost a day later, now with a 4-core Haswell added (it doesn't seem to perform well compared to Zen 2 and newer, which I also saw when I looked at other users' hosts), with the Asteroids side load diminished on Epyc Rome, and with Epyc Genoa switched to 4-threaded tasks:
The Haswell is loaded with one 3-threaded N-Body task, one Asteroids task, a desktop GUI, and at times with Firefox gone rampant.
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
18848.28 | 48150.75 | 700.32 | 2.55 | 0.037 | 0.015
28967.08 | 71828.37 | 929.35 | 2.48 | 0.032 | 0.013
BOINC benchmark result known to the MW@H validator: 5.48 / 23.24 billion FP/INT ops/s
The dual-7452 is loaded with 30 4-threaded N-Body tasks (w/ affinity) and 8 Asteroids tasks (w/o affinity).
socket 0:
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
4420.16 | 16384.31 | 226.9 | 3.71 | 0.051 | 0.014
4504.43 | 16661.59 | 246.56 | 3.70 | 0.055 | 0.015
4436.29 | 16401.34 | 238.69 | 3.70 | 0.054 | 0.015
4597.74 | 16995.54 | 225.45 | 3.70 | 0.049 | 0.013
19043.64 | 71337.95 | 1006.79 | 3.75 | 0.053 | 0.014
18840.05 | 70656.27 | 926.77 | 3.75 | 0.049 | 0.013
19067.7 | 71348.53 | 1026.22 | 3.74 | 0.054 | 0.014
4370.07 | 16160.86 | 270.78 | 3.70 | 0.062 | 0.017
14084.18 | 52709.73 | 789.49 | 3.74 | 0.056 | 0.015
4275.55 | 15903.83 | 207.39 | 3.72 | 0.049 | 0.013
BOINC benchmark result known to the MW@H validator: 5.05 / 13.66 billion FP/INT ops/s

socket 1:
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
4471.59 | 16614.05 | 203.98 | 3.72 | 0.046 | 0.012
4315.8 | 15982.16 | 220.43 | 3.70 | 0.051 | 0.014
14200.75 | 53211.68 | 756.79 | 3.75 | 0.053 | 0.014
3039.32 | 11152.6 | 189.74 | 3.67 | 0.062 | 0.017
14287.59 | 53464.39 | 693.2 | 3.74 | 0.049 | 0.013
19394.93 | 72649.95 | 970.28 | 3.75 | 0.050 | 0.013
267.74 | 802.3 | 15.51 | 3.00 | 0.058 | 0.019
11441.35 | 40730.73 | 688.09 | 3.56 | 0.060 | 0.017
19492.5 | 72988.49 | 1006.38 | 3.74 | 0.052 | 0.014
19063.6 | 71460.87 | 1000.92 | 3.75 | 0.053 | 0.014
BOINC benchmark result known to the MW@H validator: 5.08 / 13.64 billion FP/INT ops/s
The 9554P is loaded with 31 4-threaded N-Body tasks (w/ affinity) and 3 Asteroids tasks (w/o affinity).
Run time (s) | CPU time (s) | Credit | CPU time / Run time | Credit / Run time | Credit / CPU time
2866.34 | 10560.05 | 257.32 | 3.68 | 0.090 | 0.024
1.01 | 0.02 | 0.16 | 0.02 | 0.158 | 8.000
9176.4 | 34201.24 | 720.81 | 3.73 | 0.079 | 0.021
8872.48 | 29967.97 | 714.69 | 3.38 | 0.081 | 0.024
9167.52 | 34161.78 | 676.97 | 3.73 | 0.074 | 0.020
9442.43 | 35138.96 | 731.79 | 3.72 | 0.078 | 0.021
2468.48 | 9137.94 | 190.44 | 3.70 | 0.077 | 0.021
2760.29 | 10219.83 | 226.69 | 3.70 | 0.082 | 0.022
9466.38 | 35170.91 | 855.54 | 3.72 | 0.090 | 0.024
9452.48 | 35133.66 | 774.39 | 3.72 | 0.082 | 0.022
9491.44 | 35264.34 | 762.17 | 3.72 | 0.080 | 0.022
9165.48 | 34126.04 | 707.19 | 3.72 | 0.077 | 0.021
12815.3 | 47786.69 | 1073.76 | 3.73 | 0.084 | 0.022
9529.45 | 35378.11 | 759.8 | 3.71 | 0.080 | 0.021
2837.27 | 10451.12 | 233.28 | 3.68 | 0.082 | 0.022
9393.2 | 34961.86 | 717.2 | 3.72 | 0.076 | 0.021
12789.17 | 47522.46 | 931.42 | 3.72 | 0.073 | 0.020
1.02 | 0 | 0.27 | 0.00 | 0.265 | ∞
12799.35 | 47639.35 | 1032.06 | 3.72 | 0.081 | 0.022
9473.44 | 35203.73 | 709.28 | 3.72 | 0.075 | 0.020
BOINC benchmark result known to the MW@H validator: 6.55 / 23.83 billion FP/INT ops/s
That's more like it!
Average credit per run time: 0.0532 (Rome), 0.0800 (Genoa, = 1.50x Rome)
Average credit per CPU time: 0.0145 (Rome), 0.0217 (Genoa, = 1.50x Rome)
I omitted the two extremely short tasks from Genoa's average.

The longest task so far took 20 CPU hours on Rome, which seems manageable. I will therefore try 2-threaded and 1-threaded tasks too and see where PPD lands with those.

Credit per run time went down a lot since yesterday, on Rome and on Genoa. So I suppose that's indeed CreditNew at work, slowly trying to converge to some sort of "proper" PPD.

Edit,
The current #1 host by RAC, a Threadripper 3990X (Zen 2, 64c/128t, 280 W default TDP, BOINC benchmark = 4.71/21.36 billion FP/INT ops/s) is running 16-threaded tasks and gets…
…8.83 average CPU time per Run time
…0.1590 average credit per Run time
…0.0180 average credit per CPU time = in between my current Rome and Genoa scoring.

The large variety and presumed unpredictability of workunit sizes, plus the somewhat differing thread counts per task between hosts (most run with 16t, but not all), should make it rather difficult for CreditNew to converge.
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Average credit per run time: 0.0532 (Rome), 0.0800 (Genoa, = 1.50x Rome)
Average credit per CPU time: 0.0145 (Rome), 0.0217 (Genoa, = 1.50x Rome)
Another day later:
Average credit per run time: 0.048 (Rome), 0.072 (Genoa), 0.160 (TR 3990X)
Average credit per CPU time: 0.0131 (Rome), 0.0195 (Genoa), 0.0193 (TR 3990X)
PPD from 128 threads: 132,000 (Rome), 197,000 (Genoa), 107,000 (TR 3990X)
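These PPD figures line up with credit per run time multiplied by the number of concurrently running tasks. A back-of-the-envelope sketch; the task counts are my assumption (128 threads / 4 threads per task = 32 tasks on Rome and Genoa, 128 / 16 = 8 on the 3990X):

```python
def ppd(credit_per_runtime_sec: float, concurrent_tasks: int) -> float:
    # Concurrent tasks accrue credit in parallel; 86,400 seconds per day.
    return credit_per_runtime_sec * concurrent_tasks * 86_400

print(round(ppd(0.048, 32)))  # Rome:  ~133,000
print(round(ppd(0.072, 32)))  # Genoa: ~199,000
print(round(ppd(0.160, 8)))   # 3990X: ~111,000
```

The sketch comes out slightly above the observed 132,000 / 197,000 / 107,000, plausibly because the hosts aren't crunching N-Body 100% of the time.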

As you can see, credit/time of my two hosts went down further since yesterday while CreditNew tries to make sense of them, whereas the TR 3990X's credit rate stayed about the same. Its credit/CPU time actually went up a little.

That's with still 4-threaded tasks on my two computers and 16-threaded tasks on the Threadripper.
PPT: 2x 155 W (Rome), 360 W (Genoa), 280 W ? (TR 3990X)
PPD/PPT: 426 (Rome), 547 (Genoa), 382 ? (TR 3990X)
Power draw at the wall: 330 W (Rome), 365 W (Genoa), unknown (TR 3990X)

I also found some other Epyc 9554 hosts further down in the top_hosts table. But they have too few results to make for a valid comparison.

At this point, I will leave it at that with these credit-based stats. Performance tuning on this basis would take far too long, as CreditNew would need several days of computation with constant host-side settings in order to converge. And even then, the fuzziness of CreditNew would be big enough to obscure the real performance optimum.

In other words, I will have to look into making a benchmark test with a fixed workunit. But that's something for another time; there are also Radioactivity and Genefer to take care of…
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
One more thing.
Power draw at the wall: 330 W (Rome), 365 W (Genoa)
…which means that's a rather light workload. You may have already guessed that from the low ratios of CPU time to run time which I reported. Another hint:
min/avg/max core clocks =
3.39/3.66/3.74 GHz (Genoa, f_base/f_max = 3.1/3.75 GHz),
2.65/2.74/2.85 GHz (Rome, f_base/f_max = 2.35/3.35 GHz).

Asteroids@home, in contrast, runs at
2.34/2.35/2.35 GHz (Rome, i.e. doesn't get past f_base); I don't have Genoa figures right now.
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
MilkyWay@Home had a general drop of granted credit during the week and other unusual developments, not sure why
The admins did… something. (post 76760)
At first, this caused an undesirably large number of new workunits to be generated and, AFAIU, validation stopped as a side effect. (post 76762)
Then there were a lot of validations with 0.00 credit. (post 76769)
The respective workunits still have one unsent or in-progress task each; some have an additional validation-inconclusive result. We'll see whether or not the credit gets corrected once results of the unsent/in-progress tasks come in. (post 76780)
 
Last edited:

Skillz

Senior member
Feb 14, 2014
940
962
136
I thought your credit dip in the last week or so was due to you moving on to something else. Didn't realize the project was having issues. Hopefully it gets sorted out.
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Due to the (politely put) unique way in which MilkyWay implements double-checking (a wingman task is generated only after a first result comes in, not together with the first task of a workunit), and because an excessive number of new workunits had been generated accidentally, it will likely take a long while until credit is granted again at MilkyWay. (New results are now piling up in the "inconclusive" state, known as "pending" at other projects.)

For the same reason, it will probably also take a long while until we see whether or not the 0.00 credit, which was assigned to one or more days' worth of results across all participants, will be overridden with non-zero credit. At least so far, it looks as if the scientists will be able to use these results.

From what I read, the current admin took over without prior BOINC server experience and presumably without his predecessor available for guidance. If so, then mishaps like this are pretty much inevitable.
 

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Two days ago or so, the admin
– removed many of the unneeded new ready-to-send tasks on the server,
– switched all 0.00-credit valid results back to "validation inconclusive".
Meaning, validations will get back to normal sometime soonish, and all results which accidentally weren't credited will be credited in the process.
 
Reactions: Ken g6 and Skillz

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
Now if milkyway@home can validate WUs it would be great
Here is a rough estimation.
  • Each valid result from mid January used to earn about 1,000 credits. At least, this is the order of magnitude of what I see among the current few valid results of the ~20 top hosts. There is quite some credit variation, though. Let's go with 1,100 credits/result on average. (source)
  • Before January 17, the server maintained a level of ~1,000 ready-to-send NBody tasks. Then the mishap with the huge number of new tasks happened. The admin removed many of them on January 24, such that there is now a level of ~690,000 ready-to-send. (source)
  • Before January 17, MilkyWay@Home gave out typically ~14 M credit per day globally, sometimes more. (source).
  • So I guess that MilkyWay@Home received on the order of 13,000 valid results per day before the mishap.
  • Let's optimistically assume that there is still the same amount of computer capacity active.
    Also let's assume that average workunit size stays the same as earlier in January, and that the fraction of successful returns remains the same.
  • If so, ~690,000 "*_0" tasks / ~13,000 results/day ≈ 53 days, i.e. 7…8 weeks, is what it takes until results have been returned for all of these "*_0" tasks.
  • From what I understand, only after this will the server start to assign "*_1" tasks to hosts. And obviously, the server needs to receive valid results from "*_1" tasks in order to validate the pile of earlier "*_0" results.
    Note, there is currently a fairly constant level of ~690,000 tasks ready to send. This is because for each "*_0" result returned, the server generates a "*_1" task. (This is true for success returns as well as error returns.) That is, the pile of ready-to-send tasks slowly shifts toward fewer _0 tasks and more _1 tasks. But the server still assigns all of those _0 tasks first, because they were queued earlier.
  • We need to count these 7…8 weeks from January 17 onward. (That's because that's the day from which only a few _1, _2, _3… tasks were left from before, with far more _0 tasks stuffed into the queue.) This means that hosts will begin to receive "*_1" tasks in mid March, maybe early March.
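The arithmetic behind those bullets, as a rough sketch (all inputs are the approximate figures given there):

```python
ready_to_send = 690_000     # "*_0" tasks queued after the Jan 24 cleanup
credit_per_day = 14e6       # typical daily credit before Jan 17
credit_per_result = 1_100   # rough average credit per valid result

results_per_day = credit_per_day / credit_per_result  # ~12,700/day
days_to_drain = ready_to_send / results_per_day       # ~54 days, i.e. 7-8 weeks
print(round(results_per_day), round(days_to_drain))
```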
Two questions:
– Is my math sound?
– How many active contributors will turn away during these maybe two months in which crediting is almost entirely deferred?

PS:
The recently biggest contributing "team" was Gridcoin. Arguably, its miners decide on their participation in projects differently than everyone else does. However, Gridcoin's computation has been less than 1/10 of the overall computation. So the question of whether or not folks will stick with MilkyWay during this time is going to be answered from the perspective of normal contributors.

PPS:
My own plan was to run almost exclusively MilkyWay outside of competitions for as long and as much as I can use computers to heat the apartment. So far I have no reason to deviate from this plan. That is, my amount of CPUs active at MilkyWay solely depends on outside temperatures, besides contests, presumably until well into spring.
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
5,591
8,013
136
It was brought to my attention that many (perhaps half) of the tasks take only ~1/10 the run time of normal tasks (and give ~1/10 the credit). That shifts my estimate to ~23,000 results/day at mid January, and completion of the current pile of _0 tasks would happen at about mid February at this rate.
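As a rough sanity check of the revised estimate: assuming half the results are ~1/10 size and earn ~1/10 the credit, ~14 M credit/day implies roughly 23,000 results/day and about a month to drain the queue:

```python
credit_per_day = 14e6            # typical daily credit before the mishap
full, small = 1_100, 110         # assumed credit per full-size / small result
avg_credit = (full + small) / 2  # half-and-half mix -> 605

results_per_day = credit_per_day / avg_credit  # ~23,100/day
days_to_drain = 690_000 / results_per_day      # ~30 days -> about mid February
print(round(results_per_day), round(days_to_drain))
```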
 