The iPad does not have the wealth of software that benefits from single-core performance. Arm-based Macs lost compatibility with gaming software, and the Arm-based Mac has been abandoned as a gaming platform.
And if you only play games where FPS is a major thing, then you have poor taste in games.
Well, the hype has been +40% 1T perf and a score of 4200 is what it’d take to reach that.
Core for core Zen5 is >40% faster than Zen4 in SPEC.
> the caveat is that Turing has 500W TDP Vs Genoa with 400W
We don't know the lineup of Turin (and Turin dense) as of yet. That is, which SKU with what core count and thread count will get which cTDP_low/default TDP/cTDP_high is not obvious so far. There has been a rumored table of SKUs recently but this seemed to include some mistakes.
> so with high core counts in an all core workload Turing probably clocks higher
Ignoring the increased core count per socket for a second, or assuming that same-core-count SKUs might get their cTDP_high upped to 500 W, the clock speed in fully parallel workloads is going to depend on how much "IPC" the given workload will be able to extract from the considerably widened cores.
please stop calling it Turing
32% IPC in Spec Int 2017 1T. A very specific claim and it annoys me when people take that specific claim and try and generalise to all workloads, it screams dishonest and disingenuous.
Kepler said:
> But the caveat is that Turing has 500W TDP Vs Genoa with 400W so with high core counts in an all core workload Turing probably clocks higher. On desktop I don't think there will be much of a clock speed difference going from Zen 4 to 5 so the mt uplift there will be lower.
> please stop calling it Turing
Turin no DLSS????? DOA
> But the caveat is that Turing has 500W TDP Vs Genoa with 400W so with high core counts in an all core workload Turing probably clocks higher.
Remember that despite industry leading IPC:Cac ratios of AMD cores, moar IPC is also moar power in the age of very dead Dennard scaling.
> So how well does "any kind of serious work" done on desktops and workstations, in IRL, scale past ≈eight threads?
Pretty much everything. If it doesn't scale that much, or MT isn't implemented for a specific algo/library and I don't want to bother rewriting it myself, I will simply use it in parallel for different datasets, or even use it while I do other kind of analyses. I work in biophysics/structural bioinformatics (academia), there's always different stuff to do/try. And the GPU isn't an option, since it's busy doing some other stuff most of the time, and it's far too time consuming to port stuff to it anyway (when it's not straight-up impossible).
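For anyone wondering what "use it in parallel for different datasets" looks like in practice, here is a minimal Python sketch. The `analyze` function and the toy datasets are hypothetical placeholders, not anything from the poster's actual workflow; the point is only that a single-threaded routine can fill many cores when each dataset gets its own worker.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def analyze(dataset):
    """Hypothetical stand-in for a single-threaded analysis step."""
    # Placeholder CPU-bound work: sum of squares over the dataset.
    return sum(x * x for x in dataset)


def run_batch(datasets, workers=4, executor_cls=ProcessPoolExecutor):
    """Run the serial `analyze` on several datasets at once, one task each.

    ProcessPoolExecutor sidesteps the GIL for pure-Python work.
    ThreadPoolExecutor is the lighter option when each task mostly
    waits on an external program or a library that releases the GIL.
    """
    with executor_cls(max_workers=workers) as pool:
        # map() preserves input order, so results line up with datasets.
        return list(pool.map(analyze, datasets))


if __name__ == "__main__":
    datasets = [range(1_000), range(2_000), range(3_000)]
    print(run_batch(datasets))
```

With this pattern the throughput scales with core count as long as there are at least as many datasets as workers and memory bandwidth does not become the limit.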
> If your application scales easily to high thread counts but workstation and server are too expensive to you, then buy a dozen of 2nd hand PCs and a cheap Ethernet switch and run the CPU intensive part of your well scaling application on the Ethernet cluster. This instantly gives this type of applications greatly more nT performance than any generational CPU update will ever do, and for little money to boot. (Or if you do streaming vector arithmetic, get your application ported to GPU, as @H433x0n pointed out.)
I already have access to a powerful cluster, but desktop work is desktop work. For stuff that runs in let's say 10 or 20 minutes having to deal with remote resources isn't that convenient. A workstation running it locally in 5-10 minutes would be much better.
> In other words, even though you, as end user with nT performance needs but without budget do not matter to AMD and to OEMs, you still get to reap much of the benefits of those CPU architecture updates which are driven by requirements of customer groups which do matter to AMD and OEMs. Nice!
Of course more ST is good! What I'm saying is that in the end, +30% MT and +20% ST would be better (for me and many other people) than the reverse, especially since the ST baseline is already fairly good and already allows fast prototyping.
Here's where I once again point out to people that the mobile gaming market is larger than the PC and console gaming markets COMBINED. Apple's mobile gaming market alone is much bigger than the entire PC (or console) gaming market.
So the Mac's "abandonment" as a gaming platform (how can it be abandoned when it never was one in the first place) is irrelevant. The iPad, and the slightly lower clocked version of M4's CPU that will be shipping in millions of iPhones four months from now, will have plenty of games for their users to choose from. Even if you turn your nose up at them and don't consider mobile gaming to be "real" gaming. Revenue doesn't lie: that's what most people consider to be gaming when voting with their wallets.
> please stop calling it Turing
Not as bad as RGT calling it "Cheering"
please stop calling it Turing
Remember that despite industry leading IPC:Cac ratios of AMD cores, moar IPC is also moar power in the age of very dead Dennard scaling.
> 500 W / 400 W = 1.25
> 128? cores / 96 cores = 1.33?
> 196? dense cores / 128 dense cores = 1.53?
> (These fractions don't reflect, though, that the socket power budget is divided between CCDs and IOD and fabric.)
Lots of server power is spent on I/O / uncore / memory, so if they have been able to continue to improve power in those areas it can mean more power per core than just a straight extrapolation.
500 W / 400 W = 1.25
128? cores / 96 cores = 1.33?
196? dense cores / 128 dense cores = 1.53?
(These fractions don't reflect though that the socket power budget is spread between CCDs and IOD and fabric.)
> I may be utterly wrong but I presume Keplers core for core comment was referring to a 96c Turin vs a 96c Genoa, would not really be core for core if core counts were not equalised.
Correct
> Turin no DLSS????? DOA
Oh noes first Starfield and now Turin? ahahahahsdouifhasdof sorry
Then it just means the inverse is probably true. Turin roughly matches Genoa in clock speeds but uses up the extra TDP budget to do that due to the IPC uplift. The 9950X having the same TDP as 7950X means it clocks lower in all core workloads. In ST workloads you don't max out the TDP budget so clock speeds are roughly comparable. Something along those lines.
> I may be utterly wrong but I presume Keplers core for core comment was referring to a 96c Turin vs a 96c Genoa, would not really be core for core if core counts were not equalised.
I got the core-for-core bit, but I had forgotten about the other mention which indicates that it is going to be possible to operate 96c Turin at 500 W socket power.
> Oh yeah? Which month?
April :-D
> 500 W / 400 W = 1.25
> 128? cores / 96 cores = 1.33?
> 196? dense cores / 128 dense cores = 1.53?
> (These fractions don't reflect though that the socket power budget is spread between CCDs and IOD and fabric. Edit: However, the new sIOD will apparently have more GMI links and faster IMCs.)
Successor of Bergamo (Turin Dense?) will have 192 cores not 196.
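For reference, the ratios are easy to re-run with 192 dense cores instead of 196. This is only a back-of-the-envelope sketch: the 500 W and core-count figures are the rumored ones from the thread, the function name is made up for illustration, and it deliberately ignores how the socket budget is split between CCDs, IOD and fabric.

```python
# Rumored / unconfirmed figures taken from the posts above.
GENOA_TDP_W = 400
TURIN_TDP_W = 500


def per_core_budget_ratio(old_cores, new_cores,
                          old_tdp=GENOA_TDP_W, new_tdp=TURIN_TDP_W):
    """Naive socket-watts-per-core of the new part relative to the old one.

    Treats the whole socket TDP as if it went to the cores, which
    overstates the absolute per-core power but is fine for a ratio
    if the uncore share stays roughly constant (an assumption).
    """
    return (new_tdp / new_cores) / (old_tdp / old_cores)


if __name__ == "__main__":
    cases = {
        "96c -> 128c (classic cores)": (96, 128),
        "128c -> 192c (dense cores)": (128, 192),
        "96c -> 96c (core for core)": (96, 96),
    }
    for label, (old, new) in cases.items():
        print(f"{label}: {per_core_budget_ratio(old, new):.2f}x per-core budget")
```

On these numbers the top SKUs would have roughly 0.94x (classic) and 0.83x (dense) of the old naive per-core budget, while a 96-core part allowed to run at 500 W would have 1.25x.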
> I already have access to a powerful cluster, but desktop work is desktop work. For stuff that runs in let's say 10 or 20 minutes having to deal with remote resources isn't that convenient. A workstation running it locally in 5-10 minutes would be much better.
> Of course more ST is good! What I'm saying is that in the end, +30% MT and +20% ST would be better (for me and many other people) than the reverse, especially since the ST baseline is already fairly good and already allows fast prototyping.
Does +30% nT gain actually translate into your workloads taking 1-(1/1.3) ~=23% lower time to complete?
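To put numbers on that, here is a small sketch of the arithmetic. The `scaled_fraction` parameter is my own addition (an Amdahl's-law style assumption that only part of the job benefits from the nT gain); nobody in the thread has stated what that fraction is for their workload.

```python
def runtime_after(minutes, gain, scaled_fraction=1.0):
    """Estimated runtime after a throughput gain.

    minutes          -- current runtime
    gain             -- relative throughput gain, e.g. 0.30 for +30%
    scaled_fraction  -- share of the runtime that actually benefits
                        (1.0 = the whole job; assumption, Amdahl-style)
    """
    untouched = 1.0 - scaled_fraction
    return minutes * (untouched + scaled_fraction / (1.0 + gain))


def time_saved_fraction(gain, scaled_fraction=1.0):
    """Fraction of the runtime saved, e.g. 1 - 1/1.3 for a +30% gain."""
    return 1.0 - runtime_after(1.0, gain, scaled_fraction)


if __name__ == "__main__":
    for job in (10, 20):
        print(f"{job} min job at +30% nT: {runtime_after(job, 0.30):.1f} min")
    print(f"saved if only half the job scales: {time_saved_fraction(0.30, 0.5):.1%}")
```

So a fully scaling 10 or 20 minute job drops to roughly 7.7 or 15.4 minutes at +30%, and the saving shrinks quickly if part of the run is serial.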
Now, if the "problem" with Zen 5 is SMT, that might be reasonable. 16 really fast cores might still be a good deal even if SMT does not add that much in some workloads.