Discussion Intel current and future Lakes & Rapids thread


dmens

Platinum Member
Mar 18, 2005
2,271
917
136
That's because you're assuming that Intel's Skylake and Cove cores are extremely well designed. But that's not the case. We already know of other CPU microarchitectures that achieved much higher PPC/PPA: Zen 3 and Apple M1.

For example, Zen 3 achieved more IPC than Sunny Cove while using a 256-entry reorder buffer and 4 ALUs (the same ALU count as Gracemont), and it's still smaller than Tiger Lake. TSMC 7nm and Intel 10SF aren't really the same, but they have similar density. This confirms that there is huge room to improve.

I'd bet that the Atom team has achieved substantially more than the Cove team did. Intel wasn't really using their transistors well. It's more like Intel Cove was wasting "cakes".

Um… OK? Intel big core teams are not good? Tell me something I don’t know.

On the other hand, the same team that designed Tremont also designed Gracemont and you can do the Tremont/Skylake comparison with real silicon like I said. What makes you think they suddenly figured out how to get ~2x the peak perf without spending any more area in a single iteration? Pro-tip: they didn’t.
 

diediealldie

Member
May 9, 2020
77
68
61
Um… OK? Intel big core teams are not good? Tell me something I don’t know.

On the other hand, the same team that designed Tremont also designed Gracemont and you can do the Tremont/Skylake comparison with real silicon like I said. What makes you think they suddenly figured out how to get ~2x the peak perf without spending any more area in a single iteration? Pro-tip: they didn’t.


In Lakefield, the Tremont and Sunny Cove performance-power curves intersect at 55% of maximum Sunny Cove performance (Sunny Cove has +20% IPC compared to Skylake, with better lithography).
Since Sunny Cove in Lakefield is manufactured on Intel 10nm, a hypothetical Skylake curve placed there would be well above Sunny Cove, since it's manufactured on 14nm, uses an older design, and the max boost clock is 3GHz.

We can also put a hypothetical Gracemont curve there; it'll be below Tremont, since it uses a better uArch and process (10ESF) than Tremont (plain 10nm rather than 10SF, and plain 10nm scales worse at high voltage). I don't think that's unrealistic. Tremont was already good enough, like 25% behind Skylake (and I'm pretty sure the max boost clock of Lakefield comes from Sunny Cove, so Tremont IPC must be quite high already).

So in short, Tremont was already good enough before, Intel called them 'Big-core level' and Tremont benchmark numbers are everywhere. And Gracemont should be better.
 
Reactions: Zucker2k

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Um… OK? Intel big core teams are not good? Tell me something I don’t know.

On the other hand, the same team that designed Tremont also designed Gracemont and you can do the Tremont/Skylake comparison with real silicon like I said. What makes you think they suddenly figured out how to get ~2x the peak perf without spending any more area in a single iteration? Pro-tip: they didn’t.

We have precisely nothing solid from Intel or other sources on estimates regarding the die areas of Golden Cove, Gracemont, or Tremont cores. Why are you so sure that Gracemont, with all its documented low-level changes, cannot possibly change in die area from Tremont?
 
Reactions: RanFodar

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,346
146
That's not the point. In the second graph, you can see the performance of a big core extend much higher up the performance axis. That is how the graph is supposed to look.

But in the first graph, the e-core actually achieves higher overall performance than Skylake at any point you draw a vertical line. It even shows the Atom having a higher peak performance than Skylake where the graph terminates. That is pure nonsense. You don't get performance out of thin air without spending silicon area. The real graph needs to be extended way further to show that the Skylake still has a higher performance envelope, even if it uses higher power to do so.

Alternatively, you can believe that graph as is and conclude that 8+8 can beat 16 zen cores. Oh wait, that is what some people on this thread have concluded. LOL.

What's the comparison point and in what workload? Because if we're talking the 6700K (AKA 4GHz boost clock) in the right workloads each Gracemont core absolutely can just straight out-perform it on a per-core basis.

EDIT: Intel don't seem to list what the Skylake comparison product is.

Also yeah in the right conditions ADL can beat a 5950X in MT. CB20 is one of those. That doesn't mean it will in every workload.
 

Abwx

Lifer
Apr 2, 2011
11,143
3,840
136
Also yeah in the right conditions ADL can beat a 5950X in MT. CB20 is one of those. That doesn't mean it will in every workload.

Is there a screenshot of the bench?

Because that would mean that a small core has way higher IPC than a Zen 3 core...
 

diediealldie

Member
May 9, 2020
77
68
61
The small print states these are pre-silicon projections. I would not trust these values given Intel's implementation performance over the last few years. Did this get measured in real silicon later?

As for the size of Tremonts, their size is ~5.14 mm^2 (cluster + L2), so we already know that Tremont is quite small. We could add an additional 20~25% size to cover the L3 cache.

A Look at Intel Lakefield: A 3D-Stacked Single-ISA Heterogeneous Penta-Core SoC – WikiChip Fuse

So now, let's look into benchmarks. We can find 10W Intel Tremont cores (N6005, 3.3GHz turbo Tremont). For example, if we pick up Cinebench R20 single-core then we can find its single-core performance is 294, which is slightly higher than the Cascade Lake 8253 (single-core 286, 3GHz turbo Skylake), without any AVX support. Cinebench multi says the same thing: Tremont (2GHz x 4, 10W) is comparable to the i3 8109U (3GHz x 2 + HT, 20~28W).

So, we found that Tremont IPC was somewhat between Haswell and Skylake without AVX, with a known size of about 6.5mm^2 per cluster (including shared L3). Now we have 10ESF (Intel 7), which has iterated twice (10nm -> 10SF -> 10ESF). There will be more transistors per square mm and a much better high-voltage profile (probably better than Tiger Lake).
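A rough way to sanity-check these figures is to divide each Cinebench R20 single-core score by its turbo clock and to apply the 20~25% L3 allowance to the 5.14 mm^2 cluster. The sketch below does only that arithmetic with the numbers quoted above. It assumes each chip actually holds its listed turbo clock during the run, and the function names are made up for illustration.

```python
# Figures quoted in the post: (CB R20 single-core score, turbo clock in GHz).
SCORES = {
    "Tremont N6005": (294, 3.3),
    "Cascade Lake 8253": (286, 3.0),
}


def points_per_ghz(score, clock_ghz):
    """Score per GHz: a crude per-clock proxy that assumes the turbo clock is held."""
    return score / clock_ghz


def cluster_with_l3(cluster_mm2, l3_overhead):
    """Cluster area plus a fractional allowance for the shared L3."""
    return cluster_mm2 * (1.0 + l3_overhead)


for name, (score, clock) in SCORES.items():
    print(f"{name}: {points_per_ghz(score, clock):.1f} pts/GHz")

low = cluster_with_l3(5.14, 0.20)
high = cluster_with_l3(5.14, 0.25)
print(f"Tremont cluster + L3: {low:.2f} to {high:.2f} mm^2")
```

On those inputs Tremont comes out at roughly 89 pts/GHz against roughly 95 for the Skylake-based part, and the cluster lands at about 6.2 to 6.4 mm^2. Points per GHz is only as good as the clock assumption, so treat it as a ballpark rather than a measured IPC.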

I'm actually quite surprised to see people being sarcastic here. Lakefield was already announced 2 years ago and chips and benchmarks are everywhere. Maybe Intel is investing in the Internet itself to cover up all their failures of Tremont just to mess with me? Interesting...
 

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,346
146
Is there a screenshot of the bench?

Because that would mean that a small core has way higher IPC than a Zen 3 core...

Not that I've seen unfortunately. And again, it will depend on the workload - Cinebench R20 is a best case scenario.

EDIT: Also yeah I wouldn't expect Gracemont to be higher IPC than Zen 3.
 
Reactions: saneandsad

jpiniero

Lifer
Oct 1, 2010
14,805
5,429
136
Because that would mean that a small core has way higher IPC than a Zen 3 core...

You mean the big core. The big core has higher IPC but the gap is probably more down to the big core MT clocks. And of course 250 W versus... like 140?
 

Abwx

Lifer
Apr 2, 2011
11,143
3,840
136
Not that I've seen unfortunately. And again, it will depend on the workload - Cinebench R20 is a best case scenario.

8 big cores run at 5GHz, which provides a 25% frequency advantage over 8 Zen cores, and then there are 8 small cores that run at 4GHz without SMT. You think that the big cores can compensate for such a discrepancy?

You mean the big core. The big core has higher IPC but the gap is probably more down to the big core MT clocks. And of course 250 W versus... like 140?

If Cinebench is a best case then no doubt it would have been used by Intel in their average, displayed here:



Yet, it is not...



For example, if we pick up Cinebench R20 single-core then we can find its single-core performance is 294

So, we found that Tremont IPC was somewhat between Haswell and Skylake

Haswell score at 3.3GHz is something like 310.

 
Reactions: lightmanek

coercitiv

Diamond Member
Jan 24, 2014
6,341
12,597
136
I'm actually quite surprised to see people being sarcastic here. Lakefield was already announced 2 years ago and chips and benchmarks are everywhere. Maybe Intel is investing in the Internet itself to cover up all their failures of Tremont just to mess with me? Interesting...

People are being sarcastic because this is turning into a hope and belief contest.

Forget whatever conclusions you drew about Gracemont and go back to the original graph that shows 40% better perf at iso power or 40% power at iso clocks. Now erase that "Skylake" name and try to mentally write the actual product gen that should be able to deliver the corresponding perf/power curve:
  • is it 6th gen with ~4Ghz clocks and ~5% better node efficiency over the base 14nm process (as delivered with Broadwell)?
  • is it 10th gen with ~5Ghz clocks and 20%+ better node efficiency over base 14nm?
The difference between vanilla Skylake and Comet Lake is staggering, we're talking 25%+ better clocks and an efficiency jump that nowadays Intel would designate with a new node name. Which of these two performance and efficiency points do we stack against Gracemont?

If we can't even establish a proper baseline for the claims (thanks in part to Intel's marketing spin), how can we have a proper & serious conversation about expected performance? The only thing we can do is draw a line for the minimum we should expect from Gracemont, which is 8% faster than Skylake @ 4Ghz while probably delivering ~2x performance per area at iso process and a very nice improvement in power consumption at very low clocks. (legacy of the arch pedigree)

That's it, everything else needs to be discussed at the very least with proper graphs featuring labeled axes and measurements of actual silicon.
 

Asterox

Golden Member
May 15, 2012
1,027
1,781
136
For Intel, hm Cinebench is not a good or "Real world test".


But hm, Cinebench also hates Intel if we compare power consumption numbers vs Blender.

Hm, Alder Lake 8 + 8 stock power consumption in Cinebench R20 .................


 
Reactions: lightmanek

jpiniero

Lifer
Oct 1, 2010
14,805
5,429
136
8 big cores run at 5GHz, which provides a 25% frequency advantage over 8 Zen cores, and then there are 8 small cores that run at 4GHz without SMT. You think that the big cores can compensate for such a discrepancy?

If the IPC gain between Cypress Cove and Golden Cove is enough, possibly. Uzzi's right that it's more likely an outlier. Have to be 20% or more.
 

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,346
146
8 big cores run at 5GHz, which provides a 25% frequency advantage over 8 Zen cores, and then there are 8 small cores that run at 4GHz without SMT. You think that the big cores can compensate for such a discrepancy?

Again, depends on the workload.

If Cinebench is a best case then no doubt it would have been used by Intel in their average, displayed here:



Yet, it is not...





Haswell score at 3.3GHz is something like 310.

Intel use the same benchmarks for every IPC figure they give. Same methodology, every time (kudos to them for it too).

Even for GLC, R20 is a best case scenario: ~25% uplift over Zen 3, almost 30% vs Rocket Lake. The 810 ST number is real as far as I know anyway.
 

Abwx

Lifer
Apr 2, 2011
11,143
3,840
136
Again, depends on the workload.


Intel use the same benchmarks for every IPC figure they give. Same methodology, every time (kudos to them for it too).

Even for GLC, R20 is a best case scenario: ~25% uplift over Zen 3, almost 30% vs Rocket Lake. The 810 ST number is real as far as I know anyway.

They have published the benches used; all those bars are due to the fact that they display all the subscores of a handful of benches, of which only 3 or so are relevant.
Based on overall scores and individual subcomponent scores on: SYSmark 25, CrossMark, PCMark 10, SPEC CPU 2017, WebXPRT 3, Geekbench 5


There's no Cinebench here. It seems the marketing trick of displaying a lot of bars did work: people assumed that these were the same benches as in the ICL/TGL slide...
 

repoman27

Senior member
Dec 17, 2018
370
519
136
I found it notable that pretty much every new disclosure regarding ADL platforms that Intel made during the Architecture Day presentation was a regression from previously leaked or anticipated specs. Resetting expectations, I suppose.

2+8+2 LP die probably only has two Thunderbolt 4 ports rather than four, which I suppose makes sense. AVX-512 is completely disabled 100% of the time despite the hardware being present on the big cores. LPDDR5 tops out at 5200 even though the LP5 controller in TGL (which has yet to be enabled anywhere) is allegedly LPDDR5-5400. The PCIe capabilities of the 600 series PCH were reversed from up to Gen4 x16 + Gen3 x12 to Gen4 x12 + Gen3 x16.

Ian made a point of clarifying that the memory controllers were the same across all of the ADL dies, which is nice, but mostly irrelevant. Each platform / package will still only expose the relevant interfaces. In other words, M is almost certainly LP4/5 only, S will be DDR4/5 only, and P will support all four memory technologies but only at 1DPC for DDR4/5. I guess the 8+8+1 HP die having LPDDR interfaces might point to them being available on S BGA (BGA 1964) platforms?

Obviously the most significant disclosure Intel has made, and emphasized during both the Architecture Day and Hot Chips presentations, is Thread Detector and the fact that it is only fully enabled on Windows 11 at this point. The dependency on Windows 11 for optimum performance will certainly affect the release schedule for Alder Lake parts. However, the last week of October has historically been a common release week for Microsoft, and aligns quite nicely with what most folks were anticipating for a launch date for ADL-S.

On the other hand, I have to admit that I can't see an ADL-P/M launch happening in Aug or any time prior to general availability of Windows 11. So I'm not sure when those will be announced, either alongside S at the end of October, or later in Q1 or Q2'22. I noticed that the TGL-UP3/H35 Refresh chips are actually based on an entirely new stepping (C0 vs. B1), so maybe Intel will expand on that later this month as a stopgap. Pretty crazy to do a new stepping so late in the release cycle for just the 4 or 5 SKUs we've seen thus far.
 

Abwx

Lifer
Apr 2, 2011
11,143
3,840
136
If the IPC gain between Cypress Cove and Golden Cove is enough, possibly. Uzzi's right that it's more likely an outlier. Have to be 20% or more.

Cypress Cove does 6000 pts; with 20% more that's 7200, and then something like 4100 pts would have to be produced by the small cores to get to the alleged 11300.

That would put the small cores at 512 pts each at 4GHz, so about 640 at 5GHz, which is a slightly better IPC than Cypress Cove's 623@5GHz...

That's not only more than is needed, but more than more...
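The arithmetic above can be reproduced in a few lines. This is only a sketch of the same back-of-envelope estimate: the function and parameter names are invented for illustration, and it assumes the score scales linearly with clock, which is the same simplification the post makes.

```python
def implied_small_core_score(total, big_baseline, big_gain, n_small,
                             small_ghz, ref_ghz):
    """Back out what each small core must score for the total to add up.

    total        -- alleged multi-thread score for the whole chip
    big_baseline -- multi-thread score of the previous big-core chip
    big_gain     -- assumed fractional uplift of the new big cores
    n_small      -- number of small cores
    small_ghz    -- clock the small cores run at
    ref_ghz      -- clock to rescale to (assumes linear scaling with clock)
    """
    big_total = big_baseline * (1.0 + big_gain)
    small_total = total - big_total
    per_core = small_total / n_small
    per_core_at_ref = per_core * ref_ghz / small_ghz
    return big_total, small_total, per_core, per_core_at_ref


big, small, per_core, scaled = implied_small_core_score(
    total=11300, big_baseline=6000, big_gain=0.20,
    n_small=8, small_ghz=4.0, ref_ghz=5.0)
print(f"big cores: {big:.0f}, small cores: {small:.0f}")
print(f"per small core: {per_core:.1f} at 4GHz, {scaled:.1f} rescaled to 5GHz")
```

With the figures from the post this gives about 512 pts per small core at 4GHz and about 640 when rescaled to 5GHz. Changing `big_gain` shows how sensitive the conclusion is to the assumed big-core uplift.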
 
Reactions: lightmanek

repoman27

Senior member
Dec 17, 2018
370
519
136
Also, here's an updated version of my compilation of Alder Lake info.

Alder Lake (ADL)

manufacturing process:
Intel 10nm Enhanced SuperFin (10+++ > 10++ > 10ESF > Intel 7)

dies:
2+8+2 LP = 2 P-cores + 8 E-cores + GT2 graphics + 2 Thunderbolt 4 ports (Intel Family 6, Model 154, Stepping 1?)
6+8+2 LP = 6 P-cores + 8 E-cores + GT2 graphics + 4 Thunderbolt 4 ports (Intel Family 6, Model 154, Stepping 0?)
6+0+1 HP = 6 P-cores + GT1 graphics (Intel Family 6, Model 151, Stepping 5?)
8+8+1 HP = 8 P-cores + 8 E-cores + GT1 graphics (Intel Family 6, Model 151, Stepping 2?)

CPU cores:
P-core = Golden Cove, Hyper-Threading supported, AVX-512 disabled
E-core = Gracemont, no Hyper-Threading, no AVX-512 support

graphics:
GT1 = 32EU Xe-LP Gen12.2
GT2 = 96EU Xe-LP Gen12.2

chipsets:
Alder Lake PCH = Alder Point (ADP), Intel 14nm
ADP-LP = 600 Series on-package PCH, OPI x8 @ 4 GT/s
ADP-H = 600 Series PCH (2-chip platform), DMI Gen4 x8, 28 mm x 25 mm

packages:
M = BGA 1781, 28.5 mm x 19 mm x 1.1 mm (Y > Type 4 > UP4 > M)
P = BGA 1744, 50 mm x 25 mm x 1.3 mm (U > Type 3 > UP3 / H35 > P)
S BGA = BGA 1964, ? (H > S BGA)
S = LGA 1700, 45 mm x 37.5 mm

memory interfaces:
M = LPDDR4X-4266 / LPDDR5-5200
P = LPDDR4X-4266 / LPDDR5-5200 / DDR4-3200 1DPC / DDR5-4800 1DPC
S = DDR4-3200 2DPC / DDR5-4000 2DPC / DDR5-4800 1DPC

PCI Express:
M = CPU Gen5 1x8 / Gen4 1x4?, PCH Gen3 up to 10 lanes
P = CPU Gen5 1x8 + Gen4 2x4, PCH Gen3 up to 12 lanes
S = CPU Gen5 1x16 / 2x8 + Gen4 1x4, PCH Gen4 up to 12 lanes + Gen3 up to 16 lanes

platforms:
M5 = 2+8+2 LP and TGP-LP? dies, M package
U9 = 2+8+2 LP and TGP-LP? dies, M package
U15 = 2+8+2 LP and ADP-LP dies, P package
U28 = 6+8+2 LP and ADP-LP dies, P package
H45 = 6+8+2 LP and ADP-LP dies, P package
H55 = 8+8+1 HP die, S BGA package
S35 = 6+0+1 HP or 8+8+1 HP die, S package
S65 = 6+0+1 HP or 8+8+1 HP die, S package
S80 = 6+0+1 HP or 8+8+1 HP die, S package
S125 = 8+8+1 HP die, S package

launch schedule:
ADL-M/P 2+8+2 (M5/U9/U15) press embargo ?
ADL-P 6+8+2 (U28) press embargo ?
ADL-S 8+8+1 Prod WW35'21-WW42'21, RTS WW43'21-WW50'21, press embargo Oct 25-31, 2021
ADL-S 6+0+1 Prod WW41'21-WW48'21, RTS WW49'21-WW04'22
ADL-P 6+8+2 (H45) press embargo Jan '22?
ADL-S 8+8+1 (H55) press embargo Apr '22?

sources:
sharkbay PTT BBS 2020-01-02
sharkbay PTT BBS 2020-03-02
sharkbay PTT BBS 2020-05-13
@JZWSVIC Zhihu 2020-07-12
sharkbay PTT BBS 2020-07-15
Li Tang Technology interposer list (site appears to be offline now)
Coelacanth's Dream Alder Lake
Intel Architecture Day 2020-08-13
Notebookcheck 2020-10-03
Intel CES 2021-01-11
HXL @9550pro Twitter 2021-03-06
VideoCardz 2021-03-11
VideoCardz 2021-03-20
188号 @momomo_us Twitter 2021-03-26
HXL @9550pro Twitter 2021-04-16
HXL @9550pro Twitter 2021-07-09
Intel Architecture Day 2021-08-19
 

jpiniero

Lifer
Oct 1, 2010
14,805
5,429
136
Cypress Cove does 6000 pts; with 20% more that's 7200, and then something like 4100 pts would have to be produced by the small cores to get to the alleged 11300.

For 11300 the gain would need to be more than 20% yeah. The small cores are most likely only adding like 3000.
 

dmens

Platinum Member
Mar 18, 2005
2,271
917
136

In Lakefield, the Tremont and Sunny Cove performance-power curves intersect at 55% of maximum Sunny Cove performance (Sunny Cove has +20% IPC compared to Skylake, with better lithography).
Since Sunny Cove in Lakefield is manufactured on Intel 10nm, a hypothetical Skylake curve placed there would be well above Sunny Cove, since it's manufactured on 14nm, uses an older design, and the max boost clock is 3GHz.

We can also put a hypothetical Gracemont curve there; it'll be below Tremont, since it uses a better uArch and process (10ESF) than Tremont (plain 10nm rather than 10SF, and plain 10nm scales worse at high voltage). I don't think that's unrealistic. Tremont was already good enough, like 25% behind Skylake (and I'm pretty sure the max boost clock of Lakefield comes from Sunny Cove, so Tremont IPC must be quite high already).

Yet more marketing spin. The Sunny Cove on Lakefield was so thermally crippled that it was practically useless. The Lakefield reviews showed this: the one big core hardly ever kicked in even during straight up benchmarking sessions. So, an Intel marketing person simply adds an artificial power ceiling on the big core and boom, the little core can achieve a much higher % of the big core's peak perf.

So in short, Tremont was already good enough before, Intel called them 'Big-core level' and Tremont benchmark numbers are everywhere. And Gracemont should be better.

LOL. They can call it whatever they want. The actual benchmarks speak for themselves. By the way, that graph you posted is from a "pre-silicon projection". Go look at the Lakefield reviews on ST perf from those Tremont cores. Or the Jasper Lake benchmarks where the Tremonts get all the voltage they want during boost. It ain't pretty.
 

Hulk

Diamond Member
Oct 9, 1999
4,351
2,213
136
An exercise in futility.
Skylake performance at "100" when Gracemont is "140."
Assume Skylake power at 100% to put Gracemont at "60."
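Taken at face value, those two normalized points translate into perf-per-watt ratios as sketched below. This reads the first line as an iso-power point and the second as an iso-performance point, which is an interpretation of the post rather than anything Intel published, and the helper name is made up.

```python
def perf_per_watt_ratio(perf_a, power_a, perf_b, power_b):
    """How many times better A's perf/W is than B's, in normalized units."""
    return (perf_a / power_a) / (perf_b / power_b)


# Iso power: Gracemont at 140 vs Skylake at 100, both drawing 100.
iso_power = perf_per_watt_ratio(140, 100, 100, 100)

# Iso performance: both at 100, Gracemont drawing 60 vs Skylake at 100.
iso_perf = perf_per_watt_ratio(100, 60, 100, 100)

print(f"perf/W advantage at iso power: {iso_power:.2f}x")
print(f"perf/W advantage at iso perf:  {iso_perf:.2f}x")
```

The two ratios differ (about 1.4x and 1.67x) because they sit at different places on the curve, so neither says anything about where the two curves end.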



 

repoman27

Senior member
Dec 17, 2018
370
519
136
The 6+0 is gone. Replaced with Rocket Lake Refresh. Dunno if it's LGA 1700.
There's only one Rocket Lake die, 8+1 HP, and it's 276 mm² on 14nm. SKUs based on it retail between $157-$539. The ADL 6+0+1 HP die is intended to replace Comet Lake Refresh (CML 6+2), which fills in the entry level desktop SKUs between $64-$192. Intel needs an HP die that can be used for lower-cost LGA 1700 platforms. The ADL 6+0+1 die had a definite slot on the production dashboard earlier this year, and its corresponding CPUID, 90574 (Model 151, Stepping 4), has shown up in benchmarks as recently as last week.
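For anyone matching CPUID strings from benchmark listings against family/model/stepping, the value is the EAX result of CPUID leaf 1, with the fields packed into fixed bit positions. The sketch below decodes and re-encodes that layout. The function names are hypothetical, the processor-type bits are ignored, and the encoder only handles families below 15.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf-1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    if base_family == 0xF:
        family += ext_family  # extended family only applies to family 15

    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4  # extended model only applies to families 6 and 15

    return family, model, stepping


def encode_cpuid_signature(family, model, stepping):
    """Inverse of the decoder for families below 15 (illustrative only)."""
    if family >= 0xF:
        raise ValueError("extended family encoding not handled in this sketch")
    ext_model = (model >> 4) if family == 0x6 else 0
    return (ext_model << 16) | (family << 8) | ((model & 0xF) << 4) | (stepping & 0xF)


sig = encode_cpuid_signature(family=6, model=151, stepping=4)
print(hex(sig), decode_cpuid_signature(sig))
print(decode_cpuid_signature(0x90574))
```

Under this layout, Family 6 / Model 151 / Stepping 4 packs to 0x90674, while 0x90574 decodes with a family field of 5, so one digit of the string quoted above may have been mistyped.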

Unfortunately, Intel has made it all too easy to not understand their product stack. To help with that, I compiled a spreadsheet of Intel's 2021 CPU offerings a little while back. It's a lot like the SKU tables that Intel provides at product launches, but I tried to normalize all of the columns to make it easier to sort and include all of the columns that I tend to care about. ARK is a handy tool, but lately I've been finding it really difficult to quickly compare all of the SKUs for a given platform. Intel's fractured launches combined with a bit of intentional obfuscation were making it ever harder to get a cohesive overview. I'll link to a Google Sheets version of my spreadsheet in case anyone else finds it useful. There are separate tabs / sheets in the workbook for products based on each die.

 

jpiniero

Lifer
Oct 1, 2010
14,805
5,429
136
There's only one Rocket Lake die, 8+1 HP, and it's 276 mm² on 14nm. SKUs based on it retail between $157-$539. The ADL 6+0+1 HP die is intended to replace Comet Lake Refresh (CML 6+2), which fills in the entry level desktop SKUs between $64-$192. Intel needs an HP die that can be used for lower-cost LGA 1700 platforms. The ADL 6+0+1 die had a definite slot on the production dashboard earlier this year, and its corresponding CPUID, 90574 (Model 151, Stepping 4), has shown up in benchmarks as recently as last week.

A 6+0 SKU (12400?) wouldn't surprise me but it would be from the 8+8 die. OEMs might actually be okay with i3 and below being on LGA 1200. Use up the defective 4C Rocket Lake or maybe there's a new cut.
 