Interested in a Folding@Home Coronavirus Race?

Markfw · Mar 16, 2020

The default team is now doing 2 BILLION PPD! They were at 300 million.

Markfw · Mar 16, 2020

biodoc said:
Here's the link to FAH server status.

To me, that is irrelevant if I can't get any work. I am getting pissed. This is the worst in the 20 years I have been folding. Why add these new units if they can't create work for them?

StefanR5R · Mar 16, 2020

What use is it to complain? Just wait until they add resources, or until volunteered computer capacity goes down again.

It is an academic operation; they don't have infinite resources. Internet connection bandwidth for example. Their outbound and inbound traffic must be mindboggling.

Look at it this way: At this time, there is more donated computing capacity available to the researchers than they can technically make use of. This is good, not bad.

For the time being, the more technically savvy donors among us can look around for other projects which are not in such a unique situation. If you are specifically looking for COVID-19 related projects, then the only current alternative known to me is Rosetta@home, which unlike F@H is CPU-only (i.e., does not have a GPU application available). If you are specifically looking for a GPU-enabled project in the medical field, then GPUGrid is AFAIK the only alternative.

VirtualLarry · Mar 16, 2020

StefanR5R said:
Internet connection bandwidth for example. Their outbound and inbound traffic must be mindboggling.

That being said, my LAN right now is connected to my Internet Essentials connection, which until a few days ago, was only 15/2, but Comcast used the Covid-19 thing as an excuse and PR move to finally bump the connection speed up to 25/3 going forward (the minimum standard for "broadband", according to the US Federal Gov't, btw). I noticed that when my F@H program on my main PC was uploading a WU, it was tying up my entire 3.7Mbit/sec upload for a noticeable amount of time. Imagine the bandwidth that they must have, to receive so many WUs from so many people, so often! And downloading them too. (That does not seem to max out my 27Mbit/sec download speed.)

StefanR5R · Mar 16, 2020

biodoc said:
Here's the link to FAH server status.

And here is a folding forum thread which explains what the terms at the serverstats page mean.

If I remember correctly, there were only two, at times just one, work servers with openmm_22 work up during the weekend. (COVID-19 related GPU work is of the openmm_22 type.) And one of the two assign servers had a much lower assign rate than the other one, i.e. was struggling somehow or was bottlenecked by the work servers. So, the F@H admins apparently worked today to improve WU availability. Also, the current four work servers with openmm_22 jobs are all on different campuses; that way the internet bandwidth from and to them is probably maximized.

Markfw · Mar 16, 2020

VirtualLarry said:
That being said, my LAN right now is connected to my Internet Essentials connection, which until a few days ago, was only 15/2, but Comcast used the Covid-19 thing as an excuse and PR move to finally bump the connection speed up to 25/3 going forward (the minimum standard for "broadband", according to the US Federal Gov't, btw). I noticed that when my F@H program on my main PC was uploading a WU, it was tying up my entire 3.7Mbit/sec upload for a noticeable amount of time. Imagine the bandwidth that they must have, to receive so many WUs from so many people, so often! And downloading them too. (That does not seem to max out my 27Mbit/sec download speed.)

I have the option (without changing any hardware) to get 150/150 megabit. It's like $20 a month more (I am on fiber-optic, all the way to my house, Frontier). I also volunteered a 7551 EPYC server to help them out, and got no response.

Edit: I now have 5 of 12 working. When they hit 10-11 of my machines working, then I can chill.

StefanR5R · Mar 16, 2020

Each of their three top work servers has >100 TeraByte disk space (serverstats shows only the free space left), and is handling 10,800 work units per hour (3.0 work units per second). With 79 MB (?) per work unit, and 55 MB typically per result, that's 237 MB/s (1900 Mb/s) outbound traffic for new work sent and 165 MB/s (1320 Mb/s) inbound traffic for results received, just for one of their top work servers. On top of that comes the internal traffic e.g. for moving the results to further processing.
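The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the figures quoted in this post (including the uncertain 79 MB per work unit), the function name `traffic` is made up for illustration, and 1 MB = 8 Mb is assumed with no protocol overhead.

```python
# Rough traffic estimate for a single work server.
# Inputs are the figures quoted in the post; 1 MB = 8 Mb is assumed,
# and protocol overhead is ignored.

def traffic(work_units_per_hour, mb_per_item):
    """Return (MB/s, Mb/s) for an hourly item rate and a per-item size in MB."""
    items_per_second = work_units_per_hour / 3600
    mbytes_per_second = items_per_second * mb_per_item
    return mbytes_per_second, mbytes_per_second * 8

WU_PER_HOUR = 10_800  # work units per hour, from serverstats

out_mbytes, out_mbits = traffic(WU_PER_HOUR, 79)  # new work sent
in_mbytes, in_mbits = traffic(WU_PER_HOUR, 55)    # results received

print(f"outbound: {out_mbytes:.0f} MB/s ({out_mbits:.0f} Mb/s)")
print(f"inbound:  {in_mbytes:.0f} MB/s ({in_mbits:.0f} Mb/s)")
```

The outbound figure comes out at 1896 Mb/s before rounding, which is where the roughly 1900 Mb/s comes from.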

Markfw · Mar 16, 2020

StefanR5R said:
Each of their three top work servers has >100 TeraByte disk space (serverstats shows only the free space left), and is handling 10,800 work units per hour (3.0 work units per second). With 79 MB (?) per work unit, and 55 MB typically per result, that's 237 MB/s (1900 Mb/s) outbound traffic for new work sent and 165 MB/s (1320 Mb/s) inbound traffic for results received, just for one of their top work servers. On top of that comes the internal traffic e.g. for moving the results to further processing.

Well, that's cool, but as I said, regardless, why add new units when you can't process them? It's the horse before the cart (or is it the cart before the horse??) situation. You just make everyone upset. I spend like $800 a month on electricity, and $8,000-12,000 a year on hardware to help these causes. When they can't even use my help, why bother providing the service?

Anyway, I guess my electric bill will be way less this month, and I will forgo buying hardware until someone can use it.

borandi · Mar 16, 2020

If the cloud provider who put their GPU array up to the task offered to host the project servers as well (for free?), I'm sure it wouldn't be a problem. It's one thing to offer a GPU array to do work, it's another to host the project though.

amrnuke · Mar 16, 2020

StefanR5R said:
Each of their three top work servers has >100 TeraByte disk space (serverstats shows only the free space left), and is handling 10,800 work units per hour (3.0 work units per second). With 79 MB (?) per work unit, and 55 MB typically per result, that's 237 MB/s (1900 Mb/s) outbound traffic for new work sent and 165 MB/s (1320 Mb/s) inbound traffic for results received, just for one of their top work servers. On top of that comes the internal traffic e.g. for moving the results to further processing.

The big question then is the classic one in computing - how do we fix the bottleneck? Is there a good place to donate? I'd rather donate cash if my RX5700 can't do anything for them due to infrastructure issues.

To Hades with Coronavirus and cancer. And the fact that I can't make my GPU work on either makes me a little pissed off. TAKE MY MONEY!

Markfw · Mar 16, 2020

Not that it's going to happen, but I have offered my 3 EPYC servers to Stanford to assist. The first-level person referred me to his boss. No idea if this will happen or not.

StefanR5R · Mar 17, 2020

As far as I understand the serverstats, forum messages, and commonly reported error logs: The admins obviously worked on Sunday and on Monday to utilize their work servers optimally, and now the system appears to be bottlenecked at the network. Consider that they may have bandwidth quotas on their connections.

Markfw said:
but as I said, regardless, why add new units when you can't process them

There are only two ways that this (sudden inrush of new volunteers, plus occasional Folders switching F@H back on due to the news) could have been averted: If they hadn't decided to work on COVID-19 in the first place (I for one am glad they do this work), or if they had but had kept it a secret (that would have been absurd).

Markfw said:
You just make everyone upset.

Please don't be upset about the fact that there are suddenly so many people interested in helping.

As mentioned, there are still other Distributed Computing projects which can use help (but only one other working on COVID-19: Rosetta, and only one other medical project using GPUs: GPUGrid).

Ryan Smith · Mar 17, 2020

Thanks for the feedback, gang. We're officially a go. More details to come soon as we hammer out dates and such.

borandi · Mar 17, 2020

A source close to the issues says it's a storage bandwidth bottleneck now.

HighTechJoe · Mar 17, 2020

I'm one of those that spun things back up with the COVID-19 project news. I didn't get much in the way of work units over the weekend, but things picked up Monday morning and again this morning with both my GPUs getting work units.

It is frustrating not getting work units from a points standpoint (I should be gaining ~2.7m points per day), but I'm glad to see so many people getting involved and donating compute time.

Here is my production graph:

Regardless, looking forward to contributing to the team in the race!

StefanR5R · Mar 17, 2020

borandi said:
A source close to the issues says it's a storage bandwidth bottleneck now.

Thanks for the info; then my assumption about networking constraints was mistaken.

Markfw · Mar 17, 2020

borandi said:
A source close to the issues says it's a storage bandwidth bottleneck now.

Well, how do they fix that? I now have 1 of 12 video cards working... It's getting worse.

Markfw · Mar 17, 2020

OK, this is getting ridiculous. ZERO of 12 cards have work. I have rebooted almost all of my boxes to get the "latest" from Stanford, and a couple of days ago, that helped a few boxes get work. Now they all have the same error:

22:30:34:WU01:FS01:Connecting to 18.218.241.186:80
22:30:35:WU01:FS01:Assigned to work server 128.252.203.10
22:30:35:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:TU102 [GeForce RTX 2080 Ti Rev. A] M 13448 from 128.252.203.10
22:30:35:WU01:FS01:Connecting to 128.252.203.10:8080
22:30:51:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
22:41:40:WU01:FS01:Connecting to 65.254.110.245:8080
22:41:40:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
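For anyone watching several boxes at once, a small script can tally the errors and warnings in log lines like these. This is a rough sketch, not an official tool: the regular expression is inferred only from the lines shown above (time, optional ERROR/WARNING tag, WU id, slot id, message), the "INFO" label for untagged lines is my own assumption, and other log formats may not match.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this post only:
# HH:MM:SS:[ERROR:|WARNING:]WUnn:FSnn:message
LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>ERROR|WARNING):)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):(?P<msg>.*)$"
)

def parse(line):
    """Return a dict of fields for a matching line, or None otherwise."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Untagged lines are labeled "INFO" here; that label is an assumption.
    record["level"] = record["level"] or "INFO"
    return record

sample = """\
22:30:35:WU01:FS01:Connecting to 128.252.203.10:8080
22:30:51:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
22:41:40:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
"""

records = [r for r in map(parse, sample.splitlines()) if r]
counts = Counter(r["level"] for r in records)

for level, n in sorted(counts.items()):
    print(level, n)
```

Feeding it the full log of each machine would show at a glance whether the failures are mostly short responses from a work server or "No WUs available" from the assignment server.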

TennesseeTony · Mar 17, 2020

Tentative race start time is 10am EST-USA tomorrow.

Whoever did the custom stats tracking last time ( @Ken g6 ? ) can you have a script that soon??? :/

Markfw · Mar 17, 2020

TennesseeTony said:
Tentative race start time is 10am EST-USA tomorrow.

Whoever did the custom stats tracking last time ( @Ken g6 ? ) can you have a script that soon??? :/

Tony, I already posted, but how can we have a race when NO units are available? Zero points versus zero points? We need to get this delayed until they work out their problems.

VirtualLarry · Mar 17, 2020

Markfw said:
Tony, I already posted, but how can we have a race when NO units are available? Zero points versus zero points? We need to get this delayed until they work out their problems.

I tend to agree, if Stanford isn't sending out WU regularly to all participants, we might as well hold a lotto for the winner of this "race". :|

Bad timing, if nothing else, but I can kind of understand, these "media companies" want to cash in on the "trendiness" of the "CoronaVirus" issue, and get new members / pageviews / and yes, race participants. I would say that's a win/win all around, but really, ONLY if Stanford is performing "up to par".

Markfw · Mar 17, 2020

VirtualLarry said:
I tend to agree, if Stanford isn't sending out WU regularly to all participants, we might as well hold a lotto for the winner of this "race". :|

I mean seriously, I have 12 boxes, and NONE can get a unit? What are the odds that anyone with even 3 cards can get a unit? Maybe I will try to get a CPU unit going.

Edit: Tried CPU, 2 tries, NOTHING.

LOUISSSSS · Mar 17, 2020

I opened up the FAHControl Client Advanced Control on my desktop and seem to be having some "Connecting" problems. It stays at "Connecting" for a very long time. I remember that at some point today, without me doing anything, I got a GPU project and a CPU project here and there. But lots of time is spent idle. Is this normal?

Daishiki · Mar 17, 2020

Markfw said:
I mean seriously, I have 12 boxes, and NONE can get a unit? What are the odds that anyone with even 3 cards can get a unit? Maybe I will try to get a CPU unit going.

Edit: Tried CPU, 2 tries, NOTHING.

Yeah, I have three boxes standing by... at least I had some WUs yesterday.

Ken g6 · Mar 17, 2020

TennesseeTony said:
Tentative race start time is 10am EST-USA tomorrow.

Whoever did the custom stats tracking last time ( @Ken g6 ? ) can you have a script that soon??? :/

I just tried re-setting-up my script. The stats server isn't working.

stats not updating - Folding Forum

Markfw said:
we need to get this delayed until they work out their problems.

This.
