On a whim, I started running MilkyWay@Home's "N-Body Simulation" today.
On a dual-socket EPYC 7452 (2x 155 W PPT), I am running 4-threaded tasks, with all SMT threads in use and the CPU affinity of the tasks aligned with CCXs. Average core clocks are 2.6 GHz. Edit: for the time being, the host runs 50% MilkyWay and 50% Asteroids (two client instances with MW@H and socket affinities, one client instance with A@H without affinities).
On a single-socket EPYC 9554P (360 W PPT), I am running 8-threaded tasks, with all SMT threads in use and the CPU affinity of the tasks aligned with CCXs. Average core clocks are 3.6 GHz. A small part of the current workload on this host consists of Asteroids@home.
(I configured the thread count per task via app_config.xml; there is no setting for this on the MW@H web site. By default, the application starts either 16 threads or as many threads as the number of logical CPUs which BOINC is allowed to use, whichever is less. The thread count refers to the number of computational worker threads; each task launches one additional thread, but that one consumes only a sub-second amount of CPU time, i.e. it sleeps almost all the time.)
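For reference, a minimal sketch of such an app_config.xml, here limiting each task to 4 worker threads. The app_name "milkyway_nbody" and the plan class "mt" are assumptions on my part; verify both against client_state.xml on your own host.

```xml
<!-- Sketch only: app_name and plan_class must match the entries
     in client_state.xml for your host. -->
<app_config>
  <app_version>
    <app_name>milkyway_nbody</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
    <cmdline>--nthreads 4</cmdline>
  </app_version>
</app_config>
```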
Results from client instance on socket 0 of the dual-7452:
Run time (sec) | CPU time (sec) | Credit | CPU time per Run time | Credit per Run time | Credit per CPU time |
4630.24 | 14132.19 | 493.01 | 3.05 | 0.106 | 0.035 |
4467.86 | 13834.82 | 494.6 | 3.10 | 0.111 | 0.036 |
4511.08 | 13898.48 | 585.07 | 3.08 | 0.130 | 0.042 |
4350.68 | 13644.57 | 487.95 | 3.14 | 0.112 | 0.036 |
4653.54 | 14272.68 | 475.13 | 3.07 | 0.102 | 0.033 |
Results from client instance on socket 1 of the dual-7452:
Run time (sec) | CPU time (sec) | Credit | CPU time per Run time | Credit per Run time | Credit per CPU time |
4427.66 | 14053.83 | 498.26 | 3.17 | 0.113 | 0.035 |
4615.23 | 14315.8 | 491.47 | 3.10 | 0.106 | 0.034 |
4392.79 | 13786.02 | 486.87 | 3.14 | 0.111 | 0.035 |
4372.72 | 13903.69 | 486.58 | 3.18 | 0.111 | 0.035 |
4662.52 | 14655.39 | 476.96 | 3.14 | 0.102 | 0.033 |
Results from the 9554P:
Run time (sec) | CPU time (sec) | Credit | CPU time per Run time | Credit per Run time | Credit per CPU time |
1749.13 | 11345.92 | 207.24 | 6.49 | 0.118 | 0.018 |
4988.38 | 35653.47 | 700.93 | 7.15 | 0.141 | 0.020 |
5045.94 | 36031.08 | 626.63 | 7.14 | 0.124 | 0.017 |
1536.01 | 10843.42 | 215.54 | 7.06 | 0.140 | 0.020 |
6499.84 | 46624.73 | 845.3 | 7.17 | 0.130 | 0.018 |
4866.2 | 34857.28 | 700.06 | 7.16 | 0.144 | 0.020 |
1533.58 | 10750.01 | 194.52 | 7.01 | 0.127 | 0.018 |
1488.21 | 10427.97 | 191.67 | 7.01 | 0.129 | 0.018 |
6778.63 | 48483.06 | 946.37 | 7.15 | 0.140 | 0.020 |
5048.96 | 36133.04 | 756.07 | 7.16 | 0.150 | 0.021 |
4931.97 | 35358.92 | 721.99 | 7.17 | 0.146 | 0.020 |
6762.51 | 48365.44 | 846.41 | 7.15 | 0.125 | 0.018 |
4966.59 | 35623.22 | 776.77 | 7.17 | 0.156 | 0.022 |
4900.25 | 35102.64 | 671.11 | 7.16 | 0.137 | 0.019 |
4909.32 | 35267.84 | 680.48 | 7.18 | 0.139 | 0.019 |
6688.28 | 47963.89 | 871.52 | 7.17 | 0.130 | 0.018 |
1517.63 | 10711.25 | 192.42 | 7.06 | 0.127 | 0.018 |
The first three columns were copied from the results tables on the MW@H web site.
The fourth column, CPU time per Run time, corresponds well with the average CPU utilization which I see in "top" or "htop". As you can see, scaling isn't very good even at these low thread counts.
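For clarity, the three derived columns are simple ratios of the copied ones; a small Python sketch, using the first row of the socket-0 table:

```python
# Derived columns from one result row:
#   CPU time / Run time  (~ average CPU utilization of the task)
#   Credit / Run time    (throughput in credit per wall-clock second)
#   Credit / CPU time    (efficiency in credit per CPU second)

def derived_columns(run_time, cpu_time, credit):
    """Return (cpu_per_run, credit_per_run, credit_per_cpu)."""
    return (cpu_time / run_time, credit / run_time, credit / cpu_time)

# First row of the socket-0 table above:
cpu_per_run, credit_per_run, credit_per_cpu = derived_columns(4630.24, 14132.19, 493.01)
print(round(cpu_per_run, 2), round(credit_per_run, 3), round(credit_per_cpu, 3))
# -> 3.05 0.106 0.035
```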
Now to the subject of this thread:
- As you can see, the 3.6 GHz Zen 4 host gets only about half the credit per CPU time of the 2.6 GHz Zen 2 host. It should be the other way around: the Zen 4 host ought to earn more credit per CPU time, given its faster cores.
- Likewise, the Zen 4 host gets merely 1.23 times the credit per run time of the Zen 2 host, even though it dedicates twice as many, and faster, CPUs to each task.
Edit: in case 4-threaded tasks on the Zen 4 host don't work out, I will have to test without Asteroids@home in the mix.