Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Jul 27, 2020
16,824
10,781
106
Has Intel made a breakthrough? Because all I've seen regarding this new approach is criticism.
If the iGPU is blocked from using the L4 cache, the only reason for its existence may be to hide increased RAM access latency. Also, there was something about the L4 cache contents surviving between reboots AND shutdowns (HBM3??). THAT could be a killer feature if done right with something like 512MB L4.
 

Saylick

Diamond Member
Sep 10, 2012
3,217
6,586
136
My thinking goes, there will be levels within the L3, going from lowest latency to higher latency. MFU data in the lowest rungs and lesser used data ends up in the higher/slower rungs of the ladder.
Hm, isn't that like dynamically adjusting the levels of associativity in the cache?
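
A toy sketch of the "ladder" idea quoted above, in Python, just to make the placement rule concrete. The function name `assign_rungs` and its parameters are made up for illustration, and sorting everything at once is an assumption for simplicity; real hardware would promote and demote lines incrementally, and nothing here reflects a confirmed Zen 5 design.

```python
def assign_rungs(access_counts, rung_capacities):
    """Place the most frequently used keys in the fastest rungs.

    access_counts   -- dict mapping a key (e.g. a cache line tag) to its use count
    rung_capacities -- list of rung sizes, fastest (lowest latency) rung first
    Returns one list of keys per rung; keys that do not fit are left out.
    """
    # Most frequently used first; ties broken by key so the result is deterministic.
    ranked = sorted(access_counts, key=lambda k: (-access_counts[k], k))
    rungs, start = [], 0
    for capacity in rung_capacities:
        rungs.append(ranked[start:start + capacity])
        start += capacity
    return rungs


counts = {"a": 9, "b": 1, "c": 5, "d": 3}
rungs = assign_rungs(counts, [1, 2])  # one fast slot, two slower slots
print(rungs)
```

With these counts the hottest key lands in the fast rung, the next two in the slower rung, and the coldest key does not fit at all.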
 

Mopetar

Diamond Member
Jan 31, 2011
7,941
6,242
136
More like he predicted either SLC (see Apple) or L4 cache.

Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.

Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.

Although it was huge (128 MB) even by today's standards, having it off die made the hit time too high for it to be useful in most cases. But it was actually a distinct and separate fourth level of cache.
 

DrMrLordX

Lifer
Apr 27, 2000
21,709
10,984
136
Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.


It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.

Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.


Broadwell also had that. Crystalwell. Sufficiently-fast DDR4 basically killed it.
 
Reactions: Tlh97 and Ajay

A///

Diamond Member
Feb 24, 2017
4,352
3,154
136
If the iGPU is blocked from using the L4 cache, the only reason for its existence may be to hide increased RAM access latency. Also, there was something about the L4 cache contents surviving between reboots AND shutdowns (HBM3??). THAT could be a killer feature if done right with something like 512MB L4.
latency from the tiles? perhaps. that was my theory a while back, but that was before i learned of the igpu being blocked according to rumors, neither of which has been confirmed by intel. how does hbm3 handle persistent state changes without wearing out?
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,804
3,268
136
It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.




Broadwell also had that. Crystalwell. Sufficiently-fast DDR4 basically killed it.
except a lot of games that scale well with v-cache also scaled well with crystalwell. it's just stuff that doesn't get benchmarked, like late-game simulation time etc.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136

SteinFG

Senior member
Dec 29, 2021
458
521
106
Might someone give a short explainer as to what that "ladder" cache is supposed to be? I mean, the current L3 is already shared on a bidirectional ring.
So might they be adding further links in order to decrease latency and increase bandwidth?
My guess is it's some sort of analogue of intel's mesh interconnect. Made to scale past 8 cores in the future. Quick thought:
 
Jul 27, 2020
16,824
10,781
106
how does hbm3 handle persistent state changes without wearing out?
HBM3 isn't persistent so maybe it will retain the data as long as the mobo receives current from the PSU. As soon as the AC mains switch is turned off, the data will be lost. In a laptop, the battery should be able to keep the HBM3 powered for days or even weeks.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,154
136
HBM3 isn't persistent so maybe it will retain the data as long as the mobo receives current from the PSU. As soon as the AC mains switch is turned off, the data will be lost. In a laptop, the battery should be able to keep the HBM3 powered for days or even weeks.
if the state changes aren't persistent are they static? Is it saving and loading specific core data? Your post made it seem as if the l4 $ would be a new form of hibernate where the data would be stored and kept alive by trickle electricity. if the data is static and written to by the host os to store core files without reading them off slower nvme then i can see your post making sense. or is it writing what's in the ram to it for next time bypassing hibernate to nvme ssd?

old macs on system 7 used to have a shutdown mode for saving your open windows as they were back when the finder had a fixed panel at the bottom. as it stands hibernate is not friendly long term to ssd tech if you utilise it multiple times a day depending on what you've got in memory.
 
Jul 27, 2020
16,824
10,781
106
Is it saving and loading specific core data?
Don't know what approach Intel has taken but maybe a caching algo could be designed to retain data in the L4 that is almost always required, even if only for 30 seconds, like during the boot sequence to make the OS boot faster or to make heavy applications like Adobe Photoshop load faster. That's how the hybrid SSHDs used their 8GB of NAND cache. The algo could assign a score to data, and the data that is requested after every reboot would get more precedence in staying in the cache than something that is required once or twice a day.
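
A minimal sketch of that scoring idea in Python, purely as an illustration. The class name `BootAwareCache`, the `reboot()` hook, and the choice of score (the number of distinct boot sessions in which a key was requested) are all assumptions of mine; nothing is known about what Intel actually does, and this is not how any SSHD firmware is confirmed to work.

```python
from collections import defaultdict


class BootAwareCache:
    """Toy cache that favors data requested in many separate boot sessions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.boot_id = 0
        self.resident = set()                 # keys currently held in the cache
        self.boots_seen = defaultdict(set)    # key -> boot ids in which it was requested

    def reboot(self):
        """Start a new boot session; cached contents are assumed to survive."""
        self.boot_id += 1

    def score(self, key):
        # Hypothetical score: in how many distinct boots was this key requested?
        return len(self.boots_seen[key])

    def access(self, key):
        """Record a request for key. Returns True on a cache hit."""
        self.boots_seen[key].add(self.boot_id)
        if key in self.resident:
            return True
        if len(self.resident) >= self.capacity:
            victim = min(self.resident, key=self.score)
            if self.score(victim) >= self.score(key):
                return False  # newcomer does not outrank anything resident
            self.resident.remove(victim)
        self.resident.add(key)
        return False


cache = BootAwareCache(capacity=1)
cache.access("movie")    # requested once, fills the only slot
cache.access("kernel")   # same score as "movie", so it is not admitted yet
cache.reboot()
cache.access("kernel")   # seen in two boots now, displaces "movie"
cache.reboot()
print(cache.access("kernel"))  # hit: the boot-time data stayed resident
```

Something that is needed on every boot steadily gains score and pushes out data that was only requested in a single session, which is the precedence described above.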
 

Ajay

Lifer
Jan 8, 2001
15,636
7,966
136
Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.

Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.

Although it was huge (128 MB) even by today's standards, having it off die made the hit time too high for it to be useful in most cases. But it was actually a distinct and separate fourth level of cache.
I thought, functionally, Apple's SLC acted a bit more like a memory-side cache? I’d have to find a technical analysis. I don’t think, in this scenario, it would be a victim cache - it could be used to store more speculative fetches, or as an MRU. Kind of over my skis here since I haven’t read an architectural breakdown.
 
Jul 27, 2020
16,824
10,781
106
they were good doorstops.
Bad experience?

I had a Seagate SSHD in my PS3. Made loading a bit faster. It also helps prolong the HDD life for typical users coz they will run just a few applications/games whose data can be cached inside 8GB so the heads don't have to move as often. I think there are cheap laptops that ship with Toshiba SSHDs. Their user experience would be a lot worse with plain HDDs. There are third world countries where a 500GB SSHD would make better sense to buy than a 256GB SSD. Try telling a person there that SSD will be wayyyy faster. "But only 256! I want ma PeeCee with 500 JeeBees. I got so many EmPee3ssssss and moviezzzz".
 

soresu

Platinum Member
Dec 19, 2014
2,722
1,921
136
Bad experience?

I had a Seagate SSHD in my PS3. Made loading a bit faster. It also helps prolong the HDD life for typical users coz they will run just a few applications/games whose data can be cached inside 8GB so the heads don't have to move as often. I think there are cheap laptops that ship with Toshiba SSHDs. Their user experience would be a lot worse with plain HDDs. There are third world countries where a 500GB SSHD would make better sense to buy than a 256GB SSD. Try telling a person there that SSD will be wayyyy faster. "But only 256! I want ma PeeCee with 500 JeeBees. I got so many EmPee3ssssss and moviezzzz".
The problem is that they ended up costing more engineering effort than either pure SSD or HDD, with minimal benefit and more failure points.

HDD manufacturers probably decided it was better to concentrate R&D money on multi actuator and EAMR technologies.

I don't know if they are still bothering with bit patterning or 3D magnetic recording as future strategies for 100+ TB HDDs.
 
Reactions: A///

Doug S

Platinum Member
Feb 8, 2020
2,321
3,682
136
It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.

Not only that, but the CPU cores do not have direct access to DRAM at all, everything goes through the SLC. That is probably the biggest difference between Apple Silicon and Intel & AMD CPUs.

Dunno why people are so confidently wrong about stuff like Apple's System Level Cache when there is plenty of information out there that will correct their misconceptions. And, you know, the information you get from the name itself.
 
Jul 27, 2020
16,824
10,781
106
I have one in my sys. Prob owned 5 or so over the years. Never had a problem.
I've heard when they DO fail coz of flash writes exhausted, the drive stops working even if the HDD function is fine. Seems the firmware is not designed to bypass the flash. Tries to access flash, fails and then goes into a loop of retries or something. At least, that's what I can guess from reports of Seagate SSHD failures.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,154
136
The problem is that they ended up costing more engineering effort than either pure SSD or HDD, with minimal benefit and more failure points.

HDD manufacturers probably decided it was better to concentrate R&D money on multi actuator and EAMR technologies.

I don't know if they are still bothering with bit patterning or 3D magnetic recording as future strategies for 100+ TB HDDs.
on failure points: if the flash portion died, the drive was useless when they first came out. the early drives, including apple's twist with the fusions, were terrible. I don't remember anyone liking those drives.
 
Reactions: Tlh97 and soresu