Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick · Senior member · Joined Apr 1, 2022

As Hot Chips 34 starts this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, Intel's first to use EUV lithography. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

| Model | Code Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Cores | LLC | GPU | Xe-cores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4 |
| ? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8 |
| ? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4? | ? | Arc | 12 |



Comparison of die sizes of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

| | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake |
|---|---|---|---|---|---|---|
| Platform | Mobile H/U only | Desktop only | Desktop & Mobile H/HX | Desktop only | Mobile U only | Mobile H |
| Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A |
| Date | Q4 2023 | Q1 2025 ? | Desktop Q4 2024, H/HX Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ? |
| Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E |
| LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ? |
| tCPU (mm²) | 66.48 | | | | | |
| tGPU (mm²) | 44.45 | | | | | |
| SoC (mm²) | 96.77 | | | | | |
| IOE (mm²) | 44.45 | | | | | |
| Total (mm²) | 252.15 | | | | | |



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

 

Attachments: PantherLake.png, LNL.png

naukkis · Senior member · Joined Jun 5, 2002

The ability to use different data types is theoretically possible with NPUs, but the main issues are memory, bandwidth, and power. If you only need 4 bits (and AI often needs only 4 or 8 bits), then using hardware set up for 512 bits is quite a waste. Using 512 bits when your application needs 4 bits will require 128x more memory, will have to move 128x more data around, and will have to process 128x more of that data, using much more power, all while only being able to run much smaller AI models due to those limits. So it isn't really efficient to use something set up for 512 bits with 4-bit data.

The reverse is true too. If you have an NPU optimized and designed for, say, 4-bit math, and need 16-bit data, then you need to transfer that data around in 4 chunks, which takes more time. Then you have memory to store only 1/4 of the data. It can work, but it just won't be as performant as you want.

The whole point of an NPU is optimized hardware for very short data types with simplified instructions. The FPU does very complex instructions with long data types, exactly the opposite optimization point from an NPU. And the FPU isn't actually a co-processor anymore; it's part of the ISA and cannot be removed without losing compatibility. And because it's part of the ISA, pretty much every program uses it by default whenever float or double variables are used, since they are faster on the FPU than on the integer units.
 

Hitman928 · Diamond Member · Joined Apr 15, 2012

Not sure if this is news, but Microsoft just announced "Surface AI PCs". They utilize Intel MTL and come in two models: the Surface Pro 10 for Business and the Surface Laptop 6 for Business.

 

eek2121 · Platinum Member · Joined Aug 2, 2005

> Whole NPU meaning is to make optimized hardware for very short datatypes with simplified instructions. FPU does very complex instructions with long datatypes being exactly opposite optimizing point to NPU. And FPU isn't actually co-processor anymore, it's a part of a ISA and it cannot removed without losing compatibility. And as it's a part of a ISA pretty much every program used it by default if float or double type variables are used as it's faster use them with FPU than with integer units.
While I do agree, technology does evolve over time and this is what I am getting at. I was thinking in terms of literal decades when I said that.

The NPU will likely take on more responsibility as time goes on. I will be shocked if x86 looks the same 20 years from now as it does today. New wafer costs keep going up, and having duplicate functionality on multiple parts of the package wastes die space.

I am terrible at predicting, but if I had to, I would say the lines between the NPU, CPU cores, and GPU cores are going to blur.

IIRC someone mentioned AMD has a patent for catching exceptions on missing instructions and redirecting the workload off-chip. It got brought up because of Intel's missing AVX-512. I could absolutely see something similar happening here.

Someone mentioned latency, but once the compilers are changed, there would be no performance penalty and performance may actually increase.

Me personally? I want socketed NPUs so competitors can play too, but of course we won’t get that. We are getting PCIe accelerators soon, however.

We are still in the very early days of AI. If you compare AI to the invention of the internet, we are roughly where the internet was in the 1970s.
 

adroc_thurston · Platinum Member · Joined Jul 2, 2023

> Speculation was about replacing FPU with NPU

did you just invent GPUs.
like dawg we already invented silly parallel SIMD crunchers. in 2005. In Xenos, from Xbox 360.

> The NPU will likely take on more responsibility as time goes on.

It does dumb matrix math.
are you daft

> I will be shocked if x86 looks the same 20 years from now as it does today. New wafer costs keep going up and having duplicate functionality on multiple parts of the package wastes die space.

THE FUTURE IS FUSION™

> but if I had to I would say the lines between the NPU, CPU cores, and GPU cores are going to blur.

They'll be more clear-cut than ever.
 

DavidC1 · Member · Joined Dec 29, 2023

> Technology wise the Atom team's work indeed has been more interesting to follow for quite some time now, like for a decade by now?
After falling out of the spotlight with Silvermont/Airmont, they've been working quietly behind the scenes.

They've been doing a consistent cadence of using new ideas in one generation and optimization/expansion the next.

Bonnell - 2 way in order
New Ideas: Silvermont - 2 way out of order, lowered pipeline stages
New Ideas: Goldmont - 3 way out of order(added OoOE FP) + 16KB predecode
Exp/Opt: Goldmont Plus - 3 way + wider backend, quadrupled(64KB) predecode

New Ideas: Tremont - Clustered, 2x3 way, 128KB predecode, greatly improved branch prediction
Exp/Opt: Gracemont - Improved clustered 2x3 way, so the throughput is effective 6-wide. Predecode cache replaced with OD-ILD that predecodes on the fly for better performance under more demanding workloads and higher area/power efficiency

New Ideas?: Skymont - Clustered 3x3 way + ??
Darkmont - 18A shrink
Exp/Opt: Arctic Wolf - Clustered 3x3 way with backend optimizations?

Gracemont is already superior to Golden Cove in the fetch department: it can fetch 2x32B (2x16B from the OD-ILD) to feed its two clustered decoders, while Golden Cove can only work with 1x32B for its 6-wide decoder.

Not only is SMT going to go away; eventually I suspect uop caches will too, as the primary purpose of the uop cache is to increase clocks while attempting to minimize the branch-misprediction penalties that come with increased pipeline stages.
 

Henry swagger · Senior member · Joined Feb 9, 2022

> After falling out of the spotlight with Silvermont/Airmont, they've been working quietly behind the scenes.
>
> They've been doing a consistent cadence of using new ideas in one generation and optimization/expansion the next.
>
> Bonnell - 2 way in order
> New Ideas: Silvermont - 2 way out of order, lowered pipeline stages
> New Ideas: Goldmont - 3 way out of order (added OoOE FP) + 16KB predecode
> Exp/Opt: Goldmont Plus - 3 way + wider backend, quadrupled (64KB) predecode
>
> New Ideas: Tremont - Clustered, 2x3 way, 128KB predecode, greatly improved branch prediction
> Exp/Opt: Gracemont - Improved clustered 2x3 way, so the throughput is effectively 6-wide. Predecode cache replaced with OD-ILD that predecodes on the fly for better performance under more demanding workloads and higher area/power efficiency
>
> New Ideas?: Skymont - Clustered 3x3 way + ??
> Darkmont - 18A shrink
> Exp/Opt: Arctic Wolf - Clustered 3x3 way with backend optimizations?
>
> Gracemont is already superior to Golden Cove in the fetch department, where it can fetch 2x32B (2x16B from the OD-ILD) to feed the two clustered decoders, while Golden Cove can only work with 1x32B for its 6-wide decoders.
>
> Not only SMT is going to go away, eventually I suspect uop caches will too, as the primary purpose of the uop cache is to increase clocks while attempting to minimize branch misprediction penalties that comes with increased pipeline stages.
Raichu said Skymont will be 8-wide, so 4x4 way, and is targeting Rocket Lake to Golden Cove IPC.
 