Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
677
559
106






With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



Comparison of upcoming Intel's U-series CPU: Core Ultra 100U, Lunar Lake and Panther Lake

Model | Code Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores
Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4
? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8
? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12



Comparison of die size of Each Tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

 | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake
Platform | Mobile H/U only | Desktop only | Desktop & Mobile H/HX | Desktop only | Mobile U only | Mobile H
Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A
Date | Q4 2023 | Q1 2025 ? | Desktop Q4 2024, H/HX Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ?
Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E
LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ?
tCPU (mm²) | 66.48 | | | | |
tGPU (mm²) | 44.45 | | | | |
SoC (mm²) | 96.77 | | | | |
IOE (mm²) | 44.45 | | | | |
Total (mm²) | 252.15 | | | | |



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tomshardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

 

Attachments: PantherLake.png, LNL.png

coercitiv

Diamond Member
Jan 24, 2014
6,247
12,147
136
Yeah XMX has to come back, because MS's TOPs requirements are only getting larger as time goes on. It's a lot more area efficient (AKA cheaper to produce and for the end consumer later) to use XMX than it is to slap on an even _bigger_ NPU, even if the NPU would be more power efficient.
Never really bothered to get more in-depth with the subject, but my basic understanding is that ML-based tasks in personal computing will be split into two categories:
  • "Low" compute tasks where efficiency is important, such as video call background blur, noise reduction, recognition, translation, dictation, grammar & auto correct etc. The NPU should handle them, so it needs to be scaled to their scope and made as efficient as possible.
  • Heavy compute tasks using generative models (language, multimedia, science & engineering) where performance is important. These will leverage the GPU mostly, because this way the compute area can be used for both AI and graphics, which is a good compromise for a consumer chip.
 

moinmoin

Diamond Member
Jun 1, 2017
4,967
7,715
136
If the former is true, then having AVX-512 support in the die but fused off is a complete waste of expensive silicon and adds significantly to cost (25% is a ton of money). Having two separate designs makes a lot of sense considering LNC is new.
It would be especially silly considering the whole reason for the existence of E-cores is area efficiency. But with AVX-512 present but disabled, the P-cores are essentially artificially bloated without reason. And I have a hard time imagining that combining P- and E-cores, with all the hardware and software changes that necessitates, is cheaper than just optimizing P-cores.
 

Geddagod

Golden Member
Dec 28, 2021
1,159
1,033
106
I remember reading some articles that had mixed views about the AVX-512 die area during the Linus Torvalds AVX controversy. Many claimed that AVX-512 instructions take up significant die space (as much as 25% per core) due to its complex logic, while a few others claimed that AVX-512 support doesn't take up significant space in the total die area.

If the former is true, then having AVX-512 support in the die but fused off is a complete waste of expensive silicon and adds significantly to cost (25% is a ton of money). Having two separate designs makes a lot of sense considering LNC is new.
Where did you hear AVX-512 adds 25% to the core area?
Also, even if it is true (I doubt it is lmao), that's just the core. It's not adding 25% to the whole die, the impact there is gonna be way, waaaay smaller
 

SiliconFly

Golden Member
Mar 10, 2023
1,056
541
96
Where did you hear AVX-512 adds 25% to the core area?
Also, even if it is true (I doubt it is lmao)
You should. Like I said, AVX-512 % die area projections differ from die to die. Some are from trustworthy sources while others are plain speak, possibly just rumors or guesses (may or may not be accurate). You be your own judge, cos Intel doesn't publish exact figures.

Link, link & link.

...It's not adding 25% to the whole die, the impact there is gonna be way, waaaay smaller
Your accuracy amazes me! Assuming you can read correctly, I clearly mentioned their claims of up to 25% per core. I never said 25% of the whole die area. Even so, I'm sure it's definitely not waaaay smaller. Definitely not a single digit % number.
 
Jul 27, 2020
16,659
10,665
106
If we assume each P-core in Intel CPUs contains two AVX-512 units and the same core is used for server and consumer, with the latter having one unit disabled, that's a lot of die area being wasted in the name of segmentation.
 

Geddagod

Golden Member
Dec 28, 2021
1,159
1,033
106
You should. Like I said, AVX-512 % die area projections differ from die to die. Some are from trustworthy sources while others are plain speak, possibly just rumors or guesses (may or may not be accurate). You be your own judge, cos Intel doesn't publish exact figures.
You can literally just look at skylake client and then skylake server (which has AVX-512).
Assuming you can read correctly, I clearly mentioned their claims of up to 25% per core. I never said 25% of the whole die area.
Shouldn't have said this then
If the former is true, then having AVX-512 support in the die but fused off is a complete waste of expensive silicon and adds significantly to cost (25% is a ton of money)
You are talking about the die in one sentence, and then in parenthesis just mention 25% is a ton of money? It would be generous of me to assume you are talking about the core area tbh, though if it was some other people who typed that I would have just made that assumption lol
Then also, I'm sure it's definitely not waaaay smaller. Definitely not a single digit % number.
Using very optimistic calculations, at best it looks like AVX-512 is ~15% of a Skylake server core. A Skylake core looks to be ~2/3 of a Skylake "block". There's a lot of stuff on the CPU that's not just Skylake "blocks", but let's just ignore that. It very conceivably can be a single digit % number, even if the number is 25% as you think it is.
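The back-of-envelope above is easy to run through. All three fractions below are the rough guesses from this thread (plus one illustrative assumption for how much of the die the core blocks occupy), not measured figures:

```python
# Rough estimate: per-core AVX-512 area share -> whole-die area share.
# Every fraction here is a forum guess or illustrative assumption, not an Intel figure.

avx512_share_of_core = 0.15   # ~15% of a Skylake server core (optimistic reading)
core_share_of_block  = 2 / 3  # a core is ~2/3 of a Skylake "block" (core + L2 + fabric)
blocks_share_of_die  = 0.50   # assume core blocks are ~half the die (illustrative)

low = avx512_share_of_core * core_share_of_block * blocks_share_of_die
print(f"{low:.1%}")   # ~5% of the whole die

# Even taking the 25%-per-core claim at face value, the whole-die share
# still lands in the single digits under these assumptions:
high = 0.25 * core_share_of_block * blocks_share_of_die
print(f"{high:.1%}")  # ~8%
```

So the dispute largely dissolves once "per core" and "per die" are kept apart: a big per-core share shrinks twice on the way to the full die.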
 

Geddagod

Golden Member
Dec 28, 2021
1,159
1,033
106
Intel might have a marginal lead in client with their upcoming products.
Lol
But, AMD's upcoming cores appear to be better suited for data center than Intel's upcoming cores. Diamond Rapids may not match the Zen 5 series in overall server performance and/or efficiency.
DMR not matching Zen 5 would be kinda pathetic since DMR would be launching pretty much near Zen 6
don't think the transistors are in it to enable it, right ? While googling on the subject, I found this from tomshardware.com
That doesn't mean the transistors won't/can't be there....
AMD drops performance in exchange for area/cost for the Zen4c cores. This leaves Intel in a situation where they need a faster Atom or a more efficient Cove core, but they have neither.
Why?
Ya. Agree. Nothing special. But still gonna be light years ahead of Zen5 I presume.
"I presume"
LionCove+ will be something like RaptorCove compared to GoldenCove or RedwoodCove. Nothing more.
Uhh I think Bionc said something about changes to the L0 and L1, but I could be misremembering
 

dullard

Elite Member
May 21, 2001
25,111
3,480
126
You are talking about the die in one sentence, and then in parenthesis just mention 25% is a ton of money?
So many of the arguments here are simple misunderstandings like that. People here tend to change subject midsentence and not tell anyone about the change of subject. Or, my pet peeve, use a pronoun that doesn't refer to anything remotely close to the sentence the pronoun is in. The way I read his post, that paragraph was talking about dies, and thus a 25% cost would naturally be assumed to refer to the entire die cost.
 

Geddagod

Golden Member
Dec 28, 2021
1,159
1,033
106
So many of the arguments here are simple misunderstandings like that. People here tend to change subject midsentence and not tell anyone about the change of subject. Or, my pet peeve, use a pronoun that doesn't refer to anything remotely close to the sentence the pronoun is in. The way I read his post was that paragraph was talking about dies, and thus a 25% cost would naturally be assumed to be referring to the entire die cost.
Yup. Regardless, idk why he was so mad, other than the "I doubt it is lmao", nothing in my reply was thaaaat annoying. And that was referring to the core, not the total die.
But whatever, I don't use this site much anymore due to how many times it loads super slowly, or just doesn't load at all.
 

SiliconFly

Golden Member
Mar 10, 2023
1,056
541
96
Once done benchmarking, put the CPUs in water and measure their "density" using the Archimedes Principle
I know it sounds like a stretch, but I think it is kinda possible in theory to calculate the transistor density of a CPU using the Archimedes principle, if we know the volume of a transistor (and of other materials like packaging, etc). All we need is a very large container filled with water and a lot of CPUs to calculate the displacement. Then subtract and divide. Voila!
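For fun, the "subtract and divide" can be written out. Every number below is a made-up placeholder (in reality, per-transistor volume differences are far below anything water displacement could resolve):

```python
# Tongue-in-cheek Archimedes estimate: displaced volume of many CPUs,
# minus assumed packaging volume, divided by an assumed per-transistor volume.
# All values are invented placeholders purely to show the arithmetic.

n_cpus = 10_000
displaced_ml = 12_500.0          # total water displaced by all CPUs (mL)
packaging_ml_per_cpu = 1.24      # assumed non-die volume per CPU (mL)
transistor_volume_ml = 5e-16     # assumed volume of one transistor (mL)

die_ml_per_cpu = displaced_ml / n_cpus - packaging_ml_per_cpu  # the "subtract"
transistors_per_cpu = die_ml_per_cpu / transistor_volume_ml    # the "divide"
print(f"{transistors_per_cpu:.2e} transistors per CPU")
```

The catch, of course, is that the "subtract" step needs the packaging volume to far better precision than any bathtub experiment allows.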
 
Reactions: igor_kavinski

SiliconFly

Golden Member
Mar 10, 2023
1,056
541
96
Lmao Xino said ARL will be the same perf as RPL or maybe a single digit level improvement (prob referring to ST)
Well, generally speaking, there are 3 possibilities for ARL...

(1) There is a very high probability that ARL might have a clock regression of up to 10% to 15%. Or maybe not. Hard to say at this point. If a clock regression is there AND if LNC's IPC gains are only in the order of 20% to 25% (which sounds reasonable), then we may end up with ARL having only single digit level performance gains. Imho, that's not exactly a bad assessment.

(2) Next is, similar to MILD's claims, if LNC ends up having massive IPC gains in the order of 30% or 40%, and there isn't much clock regression, then ARL is gonna be awesome. The likelihood of this happening isn't that high if you ask me. But quite a possibility.

(3) Then there is a third but remote possibility that ARL might have a slight performance regression over RPL, cos RPL screams at a mind-numbing clock of 6.2 GHz. And if LNC's IPC gains aren't large enough, we may end up with something very similar to MTL, a slight performance regression. But the probability of something like this happening is pretty low. But still a possibility.
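The three scenarios all reduce to the same arithmetic, performance ≈ IPC × clock. A quick sketch with the percentages floated above (all of which are speculation, not leaks I'm endorsing):

```python
# Net ST performance change from an IPC gain combined with a clock change.
# perf_ratio = (1 + ipc_gain) * (1 + clock_change); all inputs are speculative.

def net_gain(ipc_gain: float, clock_change: float) -> float:
    return (1 + ipc_gain) * (1 + clock_change) - 1

# (1) 20-25% IPC gain eaten by a 10-15% clock regression -> single digit net gain
print(f"{net_gain(0.20, -0.10):.1%}")  # ~8%

# (2) massive 35% IPC gain with little clock regression -> a big net win
print(f"{net_gain(0.35, -0.02):.1%}")  # ~32%

# (3) modest IPC gain vs RPL's 6.2 GHz (say ARL tops out near 5.6 GHz)
print(f"{net_gain(0.08, 5.6 / 6.2 - 1):.1%}")  # slightly negative
```

Which is why a headline IPC number alone says little about generational performance until the clocks are known.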
 
Reactions: Tlh97 and hemedans

Ghostsonplanets

Senior member
Mar 1, 2024
384
658
96
Lmao Xino said ARL will be the same perf as RPL or maybe a single digit level improvement (prob referring to ST)
That would be really unfortunate for a brand new generation. Especially coming after the impressive Sunny and Golden Cove generations, which both had ~20% IPC increases over the previous uArch gen.

But, quite frankly, I'm more interested in how Lunar Lake will shape up than ARL. The return of low power x86 in the vein of Core M is much needed to fight against Apple M and X Elite. Intel ST performance is already very competitive, so a single digit uplift would still keep them in the fight. But figuring out high performance at low power with good efficiency is key, and hence why LNL is such an interesting prospect to me.
 
Reactions: Tlh97 and Thibsie

Geddagod

Golden Member
Dec 28, 2021
1,159
1,033
106
Oh wait their accounts have all been suspended when was this lmao
No it hasn't. It was today.
Then there is a third but remote possibility that ARL might have a slight performance regression over RPL, cos RPL screams at a mind-numbing clock of 6.2 GHz.
Xino claims ARL might get up to 5.6 GHz, but I doubt he is talking about the 14900KS; most people don't include the KS parts when comparing generations.
That would be really unfortunate for a brand new generation. Especially coming after the impressive Sunny and Golden Cove generations, which both had ~20% IPC increases over the previous uArch gen.
Might be closer to 10% than 20%, unfortunately. I agree though, after all the LNC hype...
But figuring out high performance at low power with good efficiency is key, and hence why LNL is such an interesting prospect to me.
Intel's low power optimization is just so cooked it's wild. Crossing my fingers for LNL (though I prob won't end up getting it anyway).
 

eek2121

Platinum Member
Aug 2, 2005
2,931
4,027
136
Never really bothered to get more in-depth with the subject, but my basic understanding is that ML-based tasks in personal computing will be split into two categories:
  • "Low" compute tasks where efficiency is important, such as video call background blur, noise reduction, recognition, translation, dictation, grammar & auto correct etc. The NPU should handle them, so it needs to be scaled to their scope and made as efficient as possible.
  • Heavy compute tasks using generative models (language, multimedia, science & engineering) where performance is important. These will leverage the GPU mostly, because this way the compute area can be used for both AI and graphics, which is a good compromise for a consumer chip.

Meh, both can be used for point #2. I am actually more curious if this becomes a move back to CMT. Put enough instructions and speed on the NPU and suddenly the CPU doesn’t need to have an FPU anymore. 🙃
 

SiliconFly

Golden Member
Mar 10, 2023
1,056
541
96
Wouldn't that break compatibility with existing software? If they somehow divert those instructions from CPU decoder to NPU, there would be a latency hit involved.
I think it's possible. Not very sure though. If the FP instructions are removed, the CPU will throw an exception when it doesn't recognize the instruction, and the OS has to catch it and divert it to the NPU. There's gonna be a lot of latency involved.

And if I'm right, too many programs use FP these days I think. Possibly even browsers, apps like MS Office, and lots of games too. Again, not sure though. If that's the case, then we're stuck with FP forever!
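The control flow being described is basically trap-and-emulate (how FPU-less systems historically ran FP code). A toy sketch, with made-up opcode names purely to show the dispatch and where the latency cost sits:

```python
# Toy model of trap-and-emulate dispatch for unsupported instructions.
# Opcode names are hypothetical; this shows the control flow, not real ISA behavior.

def cpu_execute(opcode: str, a, b):
    """Hardware path: only integer ops are implemented."""
    native = {"ADD": lambda x, y: x + y, "MUL": lambda x, y: x * y}
    if opcode not in native:
        raise NotImplementedError(opcode)  # stands in for a #UD invalid-opcode fault
    return native[opcode](a, b)

def npu_emulate(opcode: str, a, b):
    """OS handler forwards the trapped FP op to an emulation path."""
    emulated = {"FADD": lambda x, y: x + y, "FMUL": lambda x, y: x * y}
    return emulated[opcode](a, b)

def run(opcode: str, a, b):
    try:
        return cpu_execute(opcode, a, b)
    except NotImplementedError:
        # Every trap means a fault, a kernel entry, and a redispatch --
        # this round trip is exactly where the latency concern comes from.
        return npu_emulate(opcode, a, b)

print(run("ADD", 2, 3))       # 5, runs "natively"
print(run("FMUL", 2.0, 3.0))  # 6.0, via the trap path
```

In the real world this per-instruction trap overhead is why trap-and-emulate FP was always a last resort rather than a design goal.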
 

naukkis

Senior member
Jun 5, 2002
718
597
136
Meh, both can be used for point #2. I am actually more curious if this becomes a move back to CMT. Put enough instructions and speed on the NPU and suddenly the CPU doesn’t need to have an FPU anymore. 🙃

The NPU is the exact opposite of an FPU. Floating point numbers have a floating radix point, so the representable range can be huge, like from 2^-64 to 2^64, and calculations can be done between opposite extremes. The NPU instead relies on extremely short integers, like 4 and 8 bits - only 16 or 256 values. If we think of a normal integer (fixed point math) CPU as the middle point, the NPU sits at one extreme and the FPU at the other; they are absolutely not alternatives.
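The range gap is easy to put numbers on, comparing the FP64 a CPU's FPU handles against the INT8/INT4 formats NPUs favor:

```python
import sys

# FP64's floating radix point gives it an enormous dynamic range...
print(f"float64 max:        {sys.float_info.max:.3e}")  # ~1.798e+308
print(f"float64 min normal: {sys.float_info.min:.3e}")  # ~2.225e-308

# ...while NPU-style narrow integers express only a handful of distinct values.
int8_values = 2 ** 8   # 256 values, e.g. -128..127
int4_values = 2 ** 4   # 16 values, e.g. -8..7
print(f"INT8: {int8_values} values, INT4: {int4_values} values")
```

Hundreds of orders of magnitude versus a few hundred (or sixteen) discrete values: the two units really are built for opposite ends of the numeric spectrum.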
 

H433x0n

Senior member
Mar 15, 2023
915
993
96
An fmax of 5.6 GHz would be fine.

At this point they should be receiving ES2 for a launch in October. If the practical IPC gain is ~15%, I guess that makes sense once taking into account the 2-4% penalty from tile overhead.

Unfortunately, in typical leaker fashion, Xino wasn't very specific. Was this test at JEDEC-4800? Was it with the most recent stepping of the SoC tile? Are the IPC figures from mobile or desktop ARL? Which version of RPL is he talking about: 13900K or 14900KS 1T performance?

Expectations are pretty low, but if the IPC bump is <15% then they deserve to get clobbered.
 
Reactions: Tlh97