So the problem of "loading threads fully" applies to Zen 6 as well.
You miss the point, possibly intentionally.
If a workload scales well to 20-32 threads and then shows diminishing returns, Zen 6 will be faster, because up to 24 of those threads will be on a full core and up to 8 will be on an HT virtual core.
With NVL, 16 will be on a P core and up to 16 will be on an E core.
Plenty of workloads don't even scale much beyond 8 threads, but more cores means you can potentially run multiple instances of such workloads at the same time. With NVL you can run 2 before you start using E cores. With Zen 6 you can run 3 before you are onto HT threads.
There will be plenty of cases where Zen 6 is just better because its performance is more consistent as you load up more threads.
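
To put numbers on the thread counts above, here is a rough sketch in Python. It assumes the core counts used in this post (24 full cores on Zen 6, 16 P cores on NVL with one thread per P core) and a scheduler that fills full/P cores before spilling onto HT or E cores. `place` and `full_instances` are made-up helper names, and none of these figures are confirmed specs.

```python
# Slot counts are the figures assumed in this post, not confirmed specs.
PRIMARY_SLOTS = {"Zen 6": 24, "NVL": 16}  # full cores (Zen 6) / P cores (NVL)


def place(threads, primary):
    """Split a thread count into (on full/P cores, spilled to HT/E cores).

    Assumes the scheduler fills full/P cores first.
    """
    on_primary = min(threads, primary)
    return on_primary, threads - on_primary


def full_instances(primary, threads_per_instance=8):
    """How many instances fit entirely on full/P cores."""
    return primary // threads_per_instance


for chip, primary in PRIMARY_SLOTS.items():
    on_primary, spilled = place(32, primary)
    print(f"{chip}: {on_primary} on full/P cores, {spilled} spilled, "
          f"{full_instances(primary)} x 8-thread instances before spilling")
```

Under those assumptions, a 32-thread load puts 8 threads on HT with Zen 6 versus 16 on E cores with NVL, and the 8-thread case gives 3 instances versus 2.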