Info LPDDR6 @ Q3-2024: Mother of All CPU Upgrades

Tigerick · Feb 28, 2024

FlameTail said:
Dubious leaks.

NP, we should know more in the future. I have no doubt though

So, you still think LPDDR6 will become popular in 2027 even though there are reports of it coming at the end of 2025?

FlameTail · Feb 28, 2024

Tigerick said:
So, you still think LPDDR6 will become popular in 2027 even though there are reports of it coming at the end of 2025?

What I am saying is:

Yes, LPDDR6 will come in late 2025, with support from Snapdragon 8 Gen 5 and Dimensity 9500. It will see wide adoption in flagship Android phones of 2026.

However, it will not be the full-fat, speedy LPDDR6-12800. I believe that initially (late 2025 and all of 2026) it will be limited to a lower speed such as LPDDR5-11000.

Full speed LPDDR6-12800 won't be coming until 2027. That's what I think.

Tigerick · Feb 28, 2024

FlameTail said:
What I am saying is:

Yes, LPDDR6 will come in late 2025, with support from Snapdragon 8 Gen 5 and Dimensity 9500. It will see wide adoption in flagship Android phones of 2026.

However, it will not be the full-fat, speedy LPDDR6-12800. I believe that initially (late 2025 and all of 2026) it will be limited to a lower speed such as LPDDR5-11000.

Full speed LPDDR6-12800 won't be coming until 2027. That's what I think.

So when I showed you the slide about Q4 2025, you simply rejected the idea and said dubious leaks. And now you say you are expecting SD Gen 5 to come with LPDDR6. I really wish you would clear your mind before typing anything...

You should check Samsung's roadmap for when Samsung started manufacturing 3.2Gbps and 6.4Gbps, and compare the launch date of each memory.

Tigerick · Mar 2, 2024

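8G4 AI/DSP and some additional news - Galaxy Minor Gallery

The currently confirmed frequency of the 8G4 is 4 GHz or higher. Below is a Q/A from last December from another leaker (the first to leak sm8635 and sm7675, last October). Q. How is the power consumption? A. The final version has not been settled. However, currently...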

gall.dcinside.com

Here is the source of the leak about X Elite Gen 2. What is interesting is that the leaker said there are going to be 3 SoCs for mobile PCs. Based on my understanding, I have listed them in the table below:

Phone date	Phone SoC	CPU cores	Phone memory interface	Mobile PC date	Mobile PC SoC	Node	Mobile PC memory interface
Q4 2024	8 Gen 4	Phoenix 2+6	64-bit LPDDR5T-9600	2025	X G1 ?	N3E	64-bit LPDDR5T-9600
Q4 2025	8 Gen 5	Pegasus 2+6	64-bit LPDDR6-12800	2026 ?	X G2 ?	N3E	64-bit LPDDR6-12800
-	-	Pegasus 12P +++	-	2026 ?	X Elite G2 ?	N3P ?	128-bit LPDDR6-12800
Q4 2024	D 9400	X5+X4 4+4	64-bit LPDDR5T-9600	2025	?	N3E	64-bit LPDDR5T-9600
Q4 2025	D 9500	X6+X5 4+4 ?	64-bit LPDDR6-12800	2026 ?	?	N3P	64-bit LPDDR6-12800

Darkmont · Mar 7, 2024

igor_kavinski said:
DDR5 not going beyond 8800 MT/s is disappointing.

DDR6, whenever it gets released, is going to have terrible CAS latency compared to DDR5 and the same old crap will ensue where you have to wait a year or two for the newer RAM standard to get fast enough to leave the older standard behind. This, in my opinion, is the great RAM speed scam.

Why do they even release a new generation of memory when the previous more mature standard already beats it at its fastest available speeds? DDR5-4800 vs. DDR4-4000 was a sick joke.

Why don't these "brilliant" engineers actually try to bring the latencies down to an acceptable level before proudly touting their new technologies? It's actually not their fault. They would never release something half baked into the world if it were up to them. It's the profit craving executives/bean counters calling the shots at these big companies. The bane of every techie's existence.

/rant

This is a fundamental misunderstanding of how modern DDR works and is developed. The performance impact of CL and the other primary timings within the same bank, such as tRCD, tRP, and tRAS, can be minimized with enough parallelism at the bank, bank group, and rank level [Figure 1]. The linked research paper, which characterizes how different DRAM architectures perform, shows that in workloads where latency is paramount, DDR4 outperformed DDR3 even with a higher CL in ns [Figure 2]. Much of what determines latency in modern DRAM is no longer timing delays alone, but how concurrently you can service requests from the CPU [Figure 3]. The queuing delays are caused by resource conflicts at the bank and bank group level, where accesses must be serialized within a bank or bank group, which takes longer than going to a different bank or bank group. This also doesn't address the other ways that DDR5 hides latency: a wider prefetch, split 32-bit subchannels, and higher bandwidth, which itself lowers latency, because the closer you get to your DRAM's theoretical limits, the higher the latency gets as you work its resources harder [Figure 4]. https://user.eng.umd.edu/~blj/papers/memsys2018-dramsim.pdf
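To make the queuing argument concrete, here is a minimal toy sketch in Python. It is not the simulator from the linked paper: the function name, the fixed 30 ns bank occupancy, the 10 ns arrival spacing, and the uniformly random bank choice are all assumptions picked for illustration. Each bank serves one request at a time, so the only thing that changes between runs is how often requests collide on the same bank.

```python
import random

def average_latency_ns(num_banks, num_requests=10_000,
                       arrival_gap_ns=10.0, bank_busy_ns=30.0, seed=0):
    """Toy model of bank-level parallelism (illustrative numbers only).

    One request arrives every arrival_gap_ns and targets a random bank.
    A bank handles one request at a time, so a request that hits a busy
    bank has to wait until that bank is free again.
    """
    rng = random.Random(seed)
    bank_free_at = [0.0] * num_banks   # time at which each bank becomes idle
    total_latency = 0.0
    for i in range(num_requests):
        arrival = i * arrival_gap_ns
        bank = rng.randrange(num_banks)
        start = max(arrival, bank_free_at[bank])   # queue behind earlier work
        finish = start + bank_busy_ns              # same timings in every run
        bank_free_at[bank] = finish
        total_latency += finish - arrival
    return total_latency / num_requests

if __name__ == "__main__":
    for banks in (4, 16, 32):
        print(f"{banks:>2} banks: {average_latency_ns(banks):.1f} ns average")
```

The per-request service time is identical in every run, so any drop in the average comes purely from fewer bank conflicts. Real controllers also reorder requests and exploit open rows, which this sketch ignores.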

adroc_thurston · Mar 7, 2024

Darkmont said:
This is a fundamental misunderstanding of how modern DDR works and is developed. The performance impact of CL and the other primary timings within the same bank, such as tRCD, tRP, and tRAS, can be minimized with enough parallelism at the bank, bank group, and rank level [Figure 1]. The linked research paper, which characterizes how different DRAM architectures perform, shows that in workloads where latency is paramount, DDR4 outperformed DDR3 even with a higher CL in ns [Figure 2]. Much of what determines latency in modern DRAM is no longer timing delays alone, but how concurrently you can service requests from the CPU [Figure 3]. The queuing delays are caused by resource conflicts at the bank and bank group level, where accesses must be serialized within a bank or bank group, which takes longer than going to a different bank or bank group. This also doesn't address the other ways that DDR5 hides latency: a wider prefetch, split 32-bit subchannels, and higher bandwidth, which itself lowers latency, because the closer you get to your DRAM's theoretical limits, the higher the latency gets as you work its resources harder [Figure 4]. https://user.eng.umd.edu/~blj/papers/memsys2018-dramsim.pdf

real and feral my man.
real and feral.

Tuna-Fish · Mar 9, 2024

Darkmont said:
You'll find that in this research paper characterizing how different DRAM architectures perform, that in workloads where latency is paramount, DDR4 outperformed DDR3 even with a higher CL latency in ns [Figure 2]. Much of what determines latency in modern DRAM is no longer timing delays alone, but how concurrently you can service requests from the CPU

This is true, if there is sufficient parallelism in your workload that you can make use of those concurrent results. This is true of, for example, terminally OO java/C# business software that is generally both very thread-parallel and latency-bound.

It is not true of a lot of latency-bound software that doesn't have a lot of TLP, notably, games. For a lot of software, the measure that matters most is just how quickly will the memory return a response to a single request into a closed row, and for that early implementations of new memory standards are generally somewhat worse than the late high-end memory of the last standard.

Tigerick · Mar 11, 2024

Looks like LPDDR6 is coming sooner than I thought. Rumor has it that JEDEC is expected to finalize the specifications of next-gen LPDDR6 memory by the third quarter of 2024. Yeah, this year.

And the article mentioned that the upcoming 8 Gen 4 is going to use it. I think that is too rushed for shipment this year. But next year, all major OEMs are going to ship CPUs/APUs with LPDDR6 solutions.

Samsung and SK Hynix are the two major suppliers of LPDDR6. As I mentioned on the front page, Micron will be MIA....

Tigerick · Mar 11, 2024

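LPDDR6 standard finalization imminent... countdown to the on-device AI memory race

The latest standard for low-power RAM (LPDDR), used in smartphones and information technology (IT) devices, will be finalized in the third quarter of this year. It is 'LPDDR6', the first update in five years since the previous generation. With LPDDR6 taking its place as the core memory for 'on-device AI', which directly supports artificial intelligence (AI) computation on mobile devices...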

www.etnews.com

Here is the original Korean article; no memory speed is mentioned. Samsung has been testing 6.4 Gbps on its 1bnm tech node since last year, so hopefully we should see LPDDR6-12800 soon...it is going to be a game changer.

With a 64-bit memory bus, 12GB should be standard. And that's when Apple will upgrade base memory from 8GB to 12GB, thank god.

With a 128-bit memory bus, 24GB is the standard. However, with 50% extra memory bandwidth, OEMs have to put in many more transistors to feed the bandwidth. That's why the die area will get bigger even with a more advanced process. That's why AMD is splitting the x86 SoC into two chiplets...Hmm, if AMD is delaying Sarlak until next year, could Sarlak support both 256-bit LPDDR5x and 192-bit LPDDR6?

soresu · Mar 11, 2024

itsmydamnation said:
Because DRAM cells haven't improved in like 30 years

Currently DRAM has entered a significant scaling conundrum which can only be surmounted by a pretty significant change in structure to a capacitorless device design.

Such a design change will allow for drastically higher data retention times (>400 seconds), as well as open the door to multi layer 3D scaling (similar to 3D/V NAND flash) too.

But it's not going to happen overnight, sadly.

By the time it's ready for production some MRAM variant might entirely displace DRAM.

That being said, once it is ready for production we should see scaling on DRAM increase pretty fast again, at least capacity wise.

soresu · Mar 11, 2024

Tigerick said:
hopefully we should see LPDDR6-12800 soon...it is going to be a game changer

That's only about 1.33x faster than the LPDDR5T speeds announced by SK Hynix January 2023 and shipped the following November.

12800 is certainly an improvement, but hardly a "game changer" unless it is achieving that bandwidth at significantly better perf/watt than LPDDR5T.
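For anyone who wants to check the arithmetic, here is a quick sketch in Python. It uses only the bus widths and transfer rates already quoted in this thread, and it treats peak bandwidth as simply bus width times transfer rate, which is an assumption that ignores protocol overhead, refresh, and any LPDDR6-specific channel details. The function name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_mts):
    """Theoretical peak bandwidth in GB/s (decimal), overheads ignored."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000

# Configurations taken from the speculative table earlier in the thread.
configs = [
    ("64-bit LPDDR5T-9600", 64, 9600),
    ("64-bit LPDDR6-12800", 64, 12800),
    ("128-bit LPDDR6-12800", 128, 12800),
]

if __name__ == "__main__":
    for name, width, rate in configs:
        print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
    # Per-pin speedup of LPDDR6-12800 over LPDDR5T-9600.
    print(f"Speedup: {12800 / 9600:.2f}x")
```

At equal bus width the gain is just the ratio of the transfer rates, which is where the roughly 1.33x figure comes from.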

soresu · Mar 11, 2024

Tigerick said:
However, with 50% extra memory bandwidth, OEMs have to put in many more transistors to feed the bandwidth. That's why the die area will get bigger even with a more advanced process. That's why AMD is splitting the x86 SoC into two chiplets

I was under the impression it had more to do with data routing and associated physical pin arrangement.

ie you literally can't just keep shrinking the memory I/O because of the physical pin constraints.

Tigerick · Mar 11, 2024

soresu said:
That's only about 1.33x faster than the LPDDR5T speeds announced by SK Hynix January 2023 and shipped the following November.

12800 is certainly an improvement, but hardly a "game changer" unless it is achieving that bandwidth at significantly better perf/watt than LPDDR5T.

soresu said:
I was under the impression it had more to do with data routing and associated physical pin arrangement.

ie you literally can't just keep shrinking the memory I/O because of the physical pin constraints.

Clearly you have underestimated the impact of LPDDR6. I am mostly referring to the PC market: with a 64-bit LPDDR6 memory bus, OEMs could design a smaller SoC with the memory bandwidth of the previous 128-bit LPDDR5X. By removing half of the memory pin connections, OEMs like NV could design an SoC with Xbox Series S-level graphics performance and better battery life..

Doug S · Mar 11, 2024

Tigerick said:
Clearly you have underestimated the impact of LPDDR6. I am mostly referring to the PC market: with a 64-bit LPDDR6 memory bus, OEMs could design a smaller SoC with the memory bandwidth of the previous 128-bit LPDDR5X. By removing half of the memory pin connections, OEMs like NV could design an SoC with Xbox Series S-level graphics performance and better battery life..

Clearly you are OVERESTIMATING the impact of LPDDR6, given that you started a whole thread with crazy hyperbole like "the mother of all CPU upgrades" and believe everyone is going to be rushing to implement it as soon as possible cost be damned! Never heard of diminishing returns, I guess?

Most tasks are not limited by memory bandwidth with LPDDR5/5X/5T, meaning they would not benefit at all. Sure for stuff that is running up against that limit it will be a nice bump, but hardly the game changer you are hyping it to be.

Did you say the same thing about LPDDR5, given that it was a similar (if not larger since there was no LPDDR4T) bump but coming from a lower bound? If you didn't hype LPDDR5 this much, why do you think LPDDR6 will be more impactful than LPDDR5 was?

FlameTail · Mar 11, 2024

Doug S said:
Clearly you are OVERESTIMATING the impact of LPDDR6, given that you started a whole thread with crazy hyperbole like "the mother of all CPU upgrades" and believe everyone is going to be rushing to implement it as soon as possible cost be damned! Never heard of diminishing returns, I guess?

Did you say the same thing about LPDDR5, given that it was a similar (if not larger since there was no LPDDR4T) bump but coming from a lower bound? If you didn't hype LPDDR5 this much, why do you think LPDDR6 will be more impactful than LPDDR5 was?

Yeah this is something that I have been wondering about this thread as well.

Compute power increases with time, and it is natural that memory bandwidth increases too (if anything, memory bandwidth is not increasing as fast as compute is). So it's natural for new memory standards to come up over time. Now we have LPDDR5X, then will come LPDDR6, then LPDDR6X, then LPDDR7... the cycle continues.

dr1337 · Mar 11, 2024

Darkmont said:
This is a fundamental misunderstanding of how modern DDR works or is developed.

And you linked a study that simulates a comparison of DDR3 to DDR4 at nearly identical timings, but with the DDR4 sporting higher transfer rates...

Bit different of a case compared to what consumers saw with DDR5-4800 vs. DDR4-4000. DDR3 could get into 40ns range easily, DDR4 eventually got into 30ns ranges especially at 4000mhz, DDR5 is still at 50ns range at best and usually 60ns on average. In other terms DDR4 is actually about two generations behind on what was the average latency trend. Even DDR5-8000 doesn't crack the 50ns barrier despite its crazy bandwidth.

Latency in DRAM has objectively taken a step back, though it's true even 60ns (and even higher) is absolutely acceptable for 99% of tasks out there. With the latest generation between DDR4 and DDR5, there is objectively a gap in progression with relation to latency vs. bandwidth.
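As a reference point for the CAS part of this argument, the sketch below converts a CL value in clock cycles into nanoseconds. The formula is standard (the memory clock runs at half the MT/s rate), but the three CL values are assumed, illustrative kit timings and do not come from this thread. CAS latency is also only one component of the end-to-end figures in the 30 to 60 ns range quoted above.

```python
def cas_latency_ns(cl_cycles, transfer_rate_mts):
    """Convert CAS latency from clock cycles to nanoseconds.

    DDR transfers twice per clock, so the clock in MHz is MT/s divided by 2.
    """
    clock_mhz = transfer_rate_mts / 2
    return cl_cycles / clock_mhz * 1000

# Hypothetical example kits; the CL values are assumptions for illustration.
example_kits = [
    ("DDR4-4000 CL18", 18, 4000),
    ("DDR5-4800 CL40", 40, 4800),
    ("DDR5-8000 CL38", 38, 8000),
]

if __name__ == "__main__":
    for name, cl, rate in example_kits:
        print(f"{name}: {cas_latency_ns(cl, rate):.2f} ns")
```

The same cycle count costs fewer nanoseconds at a higher transfer rate, which is why a CL number alone says little when comparing across generations.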

Darkmont · Mar 11, 2024

Tuna-Fish said:
This is true, if there is sufficient parallelism in your workload that you can make use of those concurrent results. This is true of, for example, terminally OO java/C# business software that is generally both very thread-parallel and latency-bound.

It is not true of a lot of latency-bound software that doesn't have a lot of TLP, notably, games. For a lot of software, the measure that matters most is just how quickly will the memory return a response to a single request into a closed row, and for that early implementations of new memory standards are generally somewhat worse than the late high-end memory of the last standard.

1. A lot of thread parallelism isn't necessarily required for sufficient DRAM parallelism, though it helps a lot.
2. I don't know what games you're talking about, but recent games of the past 5 years+ have been making more and more use of threads when running game logic. It goes without saying that modern games are absolutely not 1T/ST Cinebench style applications.
3. Video Games are inherently unpredictable, random access workloads because of well, how you play them, and differing events that can take place in a run. I'd wager that most are also very memory hungry too, considering that they're in constant need of data for what's taking place.

The following are two quick profiles of Cinebench R23 vs Forza Horizon 4 taken by someone I know on their 13900K. No spec changes between the snapshots. Perfect? No, but I believe them to be good examples of my point that games are by their nature speculative and memory heavy (You can also see fewer threads used yet more DRAM related stalls). You can also see here with a 13900K that games, in CPU-bound scenarios of course, are willing and able to eat more bandwidth when provided: https://kingfaris.co.uk/blog/13900k-ram-oc-impact/intro

Darkmont · Mar 11, 2024

dr1337 said:
And you linked a study that simulates a comparison of DDR3 to DDR4 at nearly identical timings, but with the DDR4 sporting higher transfer rates...

Bit different of a case compared to what consumers saw with DDR5-4800 vs. DDR4-4000. DDR3 could get into 40ns range easily, DDR4 eventually got into 30ns ranges especially at 4000mhz, DDR5 is still at 50ns range at best and usually 60ns on average. In other terms DDR4 is actually about two generations behind on what was the average latency trend. Even DDR5-8000 doesn't crack the 50ns barrier despite its crazy bandwidth.

Latency in DRAM has objectively taken a step back, though it's true even 60ns (and even higher) is absolutely acceptable for 99% of tasks out there. With the latest generation between DDR4 and DDR5, there is objectively a gap in progression with relation to latency vs. bandwidth.

Firstly, I don't know how this bit on DDR4 vs DDR3 disproves my point. Secondly, I know for a fact that you're basing those latency numbers from Aida, a benchmark that pulls paltry bandwidth and measures idle, not loaded, sequential access times from DRAM, which is nothing like games that are inherently speculative and random access. In this DDR4 vs DDR5 study, yes, DDR5 at iso bandwidth to DDR4 has slightly higher sequential access times, but has lower random latency, something games take advantage of. I'm not saying DRAM latency hasn't taken a step back in a few ways, it has, but the bandwidth improvements have far outweighed the minor increase to sequential latencies. https://drive.google.com/file/d/1LelSBpNknT9wgqX3z2j6OUh2Md_3Vx06/view

Darkmont · Mar 11, 2024

I'd also like to apologize if I've come off as crass or condescending in any of my posts. Simply attempting to argue my point in good faith.

Tigerick · Mar 12, 2024

Clearly, some people don't understand the effect of using a 64-bit memory bus with the same bandwidth. Here is a candidate for next-gen 64-bit LPDDR6, the Steam Deck 2:

In November 2023, Valve Product Designer Lawrence Yang told Bloomberg that the Steam Deck 2 will feature a "next generation" power upgrade but won't be available for two or three years. Yang also told Axios, "In the next two or three years, we're confident that something will be what we would consider appropriate for a proper Steam Deck 2 device."

The only way to get a power upgrade within the same TDP is to move to a newer process AND use 64-bit LPDDR6. By halving the memory pin connections, and thus saving power, Valve and AMD would be able to double the FP32 throughput within the same power target. Based on the timing, we should expect Steam Deck 2 in early 2026.

ATM, NV is preparing to launch an ARM SoC with a similar memory interface; I am expecting around 4TF, which is similar to XSS. There is a rumor about NV entering the handheld console market, and the above SoC would be a perfect choice.

I haven't mentioned the more powerful memory interfaces yet. TLDR, there is a lot of speculation, but I will keep updating this page...so stay tuned.

adroc_thurston · Mar 12, 2024

Tigerick said:
The only way to get a power upgrade within the same TDP is to move to a newer process AND use 64-bit LPDDR6. By halving the memory pin connections, and thus saving power, Valve and AMD would be able to double the FP32 throughput within the same power target. Based on the timing, we should expect Steam Deck 2 in early 2026.

All premium FF AMD SoCs are 128b.
Don't.

Tigerick · Mar 12, 2024

adroc_thurston said:
All premium FF AMD SoCs are 128b.
Don't.

Too early for u, sorry...

FlameTail · Mar 15, 2024

What are DQs?

soresu · Mar 15, 2024

FlameTail said:
What are DQs?

My initial thought was data query.

igor_kavinski · Mar 16, 2024

soresu said:
My initial thought was data query.

It's some serious engineering sheee-it. I tried to understand it but then decided that my brain is too fragile to expend on stuff I have no use for.
