What do you mean by "race to idle"? That it simply gets the job done quicker while using the same amount of electricity?
Ideally, if you have an "embarrassingly parallel" workload, you get more efficiency out of the system that manages to host more cores in one socket, if all other factors remain the same.
No matter which CPU you use in an AM4 system, you are always going to have some baseline power consumption from the platform itself. Then add to that the video card, RAM, storage subsystem, etc. So when you choose fewer cores, the CPU power consumption will be lower, but those savings will be offset somewhat by total system power draw, of which the CPU is only one part: the job takes longer, and the rest of the system keeps drawing power the whole time.
Furthermore, with any modern process, the higher-core-count CPU will probably run at a lower clock speed, where it sits closer to the ideal point on the voltage curve. The exception is a real discount quad or whatever that comes in at a low clock.
Bottom line is that if you compare the total joules used by the system to complete a given encoding job, you will find that something like the R7 1700 will be more efficient than any of the R3s.
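To put rough numbers on that, here is a minimal Python sketch. Every figure in it (platform baseline, CPU watts, run times) is a made-up placeholder, not a measurement of any real R3 or R7, and the halved run time assumes the job scales perfectly across cores. Swap in your own wall-meter readings.

```python
def job_energy_joules(platform_watts, cpu_watts, seconds):
    """Total energy at the wall for one job: (baseline + CPU) * run time."""
    return (platform_watts + cpu_watts) * seconds


# Hypothetical placeholder numbers, NOT measurements of real hardware.
PLATFORM_WATTS = 60.0  # board, video card, RAM, storage: same in both builds

# Assumed: the 8-core draws more at the CPU but finishes in half the time.
quad = job_energy_joules(PLATFORM_WATTS, cpu_watts=50.0, seconds=2000.0)
octo = job_energy_joules(PLATFORM_WATTS, cpu_watts=65.0, seconds=1000.0)

print(f"4-core build: {quad / 1000:.0f} kJ")
print(f"8-core build: {octo / 1000:.0f} kJ")
```

With these placeholder inputs the 8-core build draws more watts while running but uses fewer joules for the job, because the platform baseline is paid for half as long.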
This applies to simple jobs as well. Finishing quickly can work, but not if the percentage increase in power exceeds the percentage of time saved.
Correct. What we are looking at is average power draw vs. total time to completion, which yields a total number of joules of electrical energy used to complete the task. In the case of the R7 1700 vs. the 1800X, the 1800X at stock clocks suffers from sitting in a less-efficient part of the 14nm LPP voltage curve. You burn too much power to get that extra 400 MHz for it to be "worth it" from a raw efficiency perspective.
But if you start comparing R3s to the same R7 1700 . . . well, see above.
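For anyone who wants to check the break-even point themselves, here is a small Python sketch of the "average power times time to completion" comparison. The function names are made up for illustration, and the inputs are whatever average watts and seconds you measure at the wall.

```python
def energy_joules(avg_watts, seconds):
    """Energy used for a job: average power draw times time to completion."""
    return avg_watts * seconds


def max_power_ratio(time_ratio):
    """Largest power ratio (fast / slow) at which the faster setup still
    uses no more energy, given time_ratio = t_fast / t_slow."""
    if not 0 < time_ratio <= 1:
        raise ValueError("time_ratio must be in (0, 1]")
    return 1.0 / time_ratio


def faster_is_more_efficient(p_slow, t_slow, p_fast, t_fast):
    """True if the faster setup finishes the job on fewer joules."""
    return energy_joules(p_fast, t_fast) < energy_joules(p_slow, t_slow)
```

For example, `max_power_ratio(0.9)` is about 1.11, so a run that finishes in 90% of the time only pays off if average draw rises by less than roughly 11%.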