Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

igor_kavinski · May 3, 2023

A/// said:
has intel made a breakthrough because all I've seen regarding this new approach is criticisms.

If the iGPU is blocked from using the L4 cache, the only reason for its existence may be to hide increased RAM access latency. Also, there was something about the L4 cache contents surviving between reboots AND shutdowns (HBM3??). THAT could be a killer feature if done right with something like 512MB L4.

igor_kavinski · May 3, 2023

BorisTheBlade82 said:
Might someone give a short explainer as to what that "ladder" cache is supposed to be.

My thinking goes, there will be levels within the L3, going from lowest latency to higher latency. MFU data in the lowest rungs and lesser used data ends up in the higher/slower rungs of the ladder.
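To make the guess above concrete, here is a toy sketch of a rung-based cache. It is not based on any confirmed AMD design. The class name `LadderCache`, the rung sizes, and the promote-one-rung-on-hit rule are all assumptions for illustration, and promotion on hit only approximates "most frequently used".

```python
class LadderCache:
    """Toy model: rung 0 is the fastest, the last rung is the slowest."""

    def __init__(self, rung_sizes):
        self.sizes = list(rung_sizes)            # capacity of each rung
        self.rungs = [[] for _ in rung_sizes]    # front of each list = newest

    def _find(self, key):
        for level, rung in enumerate(self.rungs):
            if key in rung:
                return level
        return None

    def _insert(self, level, key):
        # Place key on the given rung. Any overflow is demoted one rung
        # down, and falls out of the cache after the last rung.
        while key is not None and level < len(self.rungs):
            rung = self.rungs[level]
            rung.insert(0, key)
            key = rung.pop() if len(rung) > self.sizes[level] else None
            level += 1

    def access(self, key):
        """Return the rung the key was found on, or None on a miss."""
        level = self._find(key)
        if level is None:
            # New data enters on the slowest rung.
            self._insert(len(self.rungs) - 1, key)
            return None
        # A hit moves the line one rung closer to the fast end.
        self.rungs[level].remove(key)
        self._insert(max(level - 1, 0), key)
        return level
```

With `LadderCache([1, 2])`, a line that keeps getting hit climbs to rung 0, and whatever it displaces drops back to the slower rung instead of being evicted outright.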

Saylick · May 3, 2023

igor_kavinski said:
My thinking goes, there will be levels within the L3, going from lowest latency to higher latency. MFU data in the lowest rungs and lesser used data ends up in the higher/slower rungs of the ladder.

Hm, isn't that like dynamically adjusting the levels of associativity in the cache?

igor_kavinski · May 3, 2023

Saylick said:
Hm, isn't that like dynamically adjusting the levels of associativity in the cache?

Been done before?

Saylick · May 3, 2023

igor_kavinski said:
Been done before?

I have no idea if it has.

Mopetar · May 3, 2023

DrMrLordX said:
More like he predicted either slc (see apple) or l4 cache.

Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.

Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.

Although it was huge (128 MB) even by today's standards, having it off die made the hit time too high for it to be useful in most cases. But it was actually a distinct and separate fourth level of cache.

DrMrLordX · May 3, 2023

Mopetar said:
Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.

It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.

Mopetar said:
Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.

Broadwell also had that. Crystalwell. Sufficiently-fast DDR4 basically killed it.

A/// · May 3, 2023

igor_kavinski said:
If the iGPU is blocked from using the L4 cache, the only reason for its existence may be to hide increased RAM access latency. Also, there was something about the L4 cache contents surviving between reboots AND shutdowns (HBM3??). THAT could be a killer feature if done right with something like 512MB L4.

latency from the tiles? perhaps that was my theory a while back but that was before i learned of the igpu being blocked according to rumors, neither of which have been confirmed by intel. how does hbm3 handle persistent state changes for not wearing out?

itsmydamnation · May 3, 2023

DrMrLordX said:
It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.

Broadwell also had that. Crystalwell. Sufficiently-fast DDR4 basically killed it.

except a lot of games that scale well with vcache also scaled well with crystalwell. it's just stuff that doesn't get benchmarked, like late game simulation time etc

Exist50 · May 3, 2023

Should probably move this L4 cache discussion to another thread. Though at the risk of being a hypocrite:

igor_kavinski said:
If the iGPU is blocked from using the L4 cache

As far as I'm aware, the GPU was supposed to be the main beneficiary of ADM.

soresu · May 3, 2023

itsmydamnation said:
except a lot of games that scale well with vcache also scaled well with crystalwell. it's just stuff that doesn't get benchmarked, like late game simulation time etc

So you think they brought back L4 to combat AMD's V$ SKUs?

BorisTheBlade82 · May 4, 2023

soresu said:
So you think they brought back L4 to combat AMD's V$ SKUs?

I answered your question in the MTL thread:

Page 67 - Discussion - Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

forums.anandtech.com

SteinFG · May 4, 2023

BorisTheBlade82 said:
Might someone give a short explainer as to what that "ladder" cache is supposed to be. I mean, the current L3 is already shared on a bidirectional ring.
So might they be adding further links in order to decrease latency and increase bandwidth?

My guess is it's some sort of analogue of intel's mesh interconnect. Made to scale past 8 cores in the future. Quick thought:

igor_kavinski · May 4, 2023

A/// said:
how does hbm3 handle persistent state changes for not wearing out?

HBM3 isn't persistent so maybe it will retain the data as long as the mobo receives current from the PSU. As soon as the AC mains switch is turned off, the data will be lost. In a laptop, the battery should be able to keep the HBM3 powered for days or even weeks.

A/// · May 4, 2023

igor_kavinski said:
HBM3 isn't persistent so maybe it will retain the data as long as the mobo receives current from the PSU. As soon as the AC mains switch is turned off, the data will be lost. In a laptop, the battery should be able to keep the HBM3 powered for days or even weeks.

if the state changes aren't persistent are they static? Is it saving and loading specific core data? Your post made it seem as if the l4 $ would be a new form of hibernate where the data would be stored and kept alive by trickle electricity. if the data is static and written to by the host os to store core files without reading them off slower nvme then i can see your post making sense. or is it writing what's in the ram to it for next time bypassing hibernate to nvme ssd?

old macs on system 7 used to have a shutdown mode for saving your open windows as they were back when the finder had a fixed panel at the bottom. as it stands hibernate is not friendly long term to ssd tech if you utilise it multiple times a day depending on what you've got in memory.

igor_kavinski · May 4, 2023

A/// said:
Is it saving and loading specific core data?

Don't know what approach Intel has taken but maybe a caching algo could be designed to retain data in the L4 that is almost always required, even if for 30 seconds, like during the boot sequence to make the OS boot faster or to make heavy applications like Adobe Photoshop load faster. That's how the hybrid SSHD's used their 8GB of NAND cache. The algo could assign a score to data and the data that is requested after every reboot would get more precedence in staying in the cache than something that is required once or twice a day.
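A rough sketch of that scoring idea, purely illustrative: nothing here reflects what Intel actually does, and the class name `BootAwareCache`, the `boot_weight` parameter, and the score formula are all made up for the example. Data seen in many boot sessions outranks data that was only touched a few times in one session.

```python
from collections import defaultdict


class BootAwareCache:
    """Toy retention policy for a cache whose contents survive reboots."""

    def __init__(self, capacity, boot_weight=10.0):
        self.capacity = capacity             # max number of blocks kept
        self.boot_weight = boot_weight       # hypothetical weight per boot session
        self.boot_id = 0
        self.boots_seen = defaultdict(set)   # block -> boot ids that touched it
        self.hits = defaultdict(int)         # block -> total accesses
        self.resident = set()                # blocks currently cached

    def reboot(self):
        # Contents are kept; only the session counter advances.
        self.boot_id += 1

    def score(self, block):
        # Blocks requested in many boot sessions get the most precedence.
        return self.boot_weight * len(self.boots_seen[block]) + self.hits[block]

    def access(self, block):
        """Record an access; return True if the block was already cached."""
        hit = block in self.resident
        self.boots_seen[block].add(self.boot_id)
        self.hits[block] += 1
        if not hit:
            self.resident.add(block)
            if len(self.resident) > self.capacity:
                # Evict the lowest-scoring block (possibly the newcomer).
                victim = min(self.resident, key=self.score)
                self.resident.discard(victim)
        return hit


cache = BootAwareCache(capacity=2)
for _ in range(3):                 # the same boot file is read on every boot
    cache.access("kernel")
    cache.reboot()
cache.access("photoshop")
cache.access("photoshop")
cache.access("movie")              # one-off read loses out to the other two
```

After this run the cache holds `kernel` and `photoshop`; the one-off `movie` read is evicted immediately because it has the lowest score.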

eek2121 · May 4, 2023

A reminder to the new folks and the uninformed here. RGT and MLID are not valid sources of information. Quite often, they state the exact opposite of what eventually ends up getting released.

Ajay · May 4, 2023

Mopetar said:
Apple's SLC is an L3 cache that can also be accessed by the GPU cores, but it's not really an extra level of cache any more than v-cache is.

Haswell is something that actually had an L4 cache in the form of some on-package eDRAM that sat between main system memory and the on-die L3 cache for certain models.

Although it was huge (128 MB) even by today's standards, having it off die made the hit time too high for it to be useful in most cases. But it was actually a distinct and separate fourth level of cache.

I thought, functionally, Apple's SLC acted a bit more like a memory-side cache? I’d have to find a technical analysis. I don’t think, in this scenario, it would be a victim cache - it could be used to store more speculative fetches, or as an MRU. Kind of over my skis here since I haven’t read an architectural breakdown.

A/// · May 4, 2023

igor_kavinski said:
hybrid SSHD's

they were good doorstops.

igor_kavinski · May 4, 2023

A/// said:
they were good doorstops.

Bad experience?

I had a Seagate SSHD in my PS3. Made loading a bit faster. It also helps prolong the HDD life for typical users coz they will run just a few applications/games whose data can be cached inside 8GB so the heads don't have to move as often. I think there are cheap laptops that ship with Toshiba SSHDs. Their user experience would be a lot worse with plain HDDs. There are third world countries where a 500GB SSHD would make better sense to buy than a 256GB SSD. Try telling a person there that SSD will be wayyyy faster. "But only 256! I want ma PeeCee with 500 JeeBees. I got so many EmPee3ssssss and moviezzzz".

soresu · May 4, 2023

igor_kavinski said:
Bad experience?

I had a Seagate SSHD in my PS3. Made loading a bit faster. It also helps prolong the HDD life for typical users coz they will run just a few applications/games whose data can be cached inside 8GB so the heads don't have to move as often. I think there are cheap laptops that ship with Toshiba SSHDs. Their user experience would be a lot worse with plain HDDs. There are third world countries where a 500GB SSHD would make better sense to buy than a 256GB SSD. Try telling a person there that SSD will be wayyyy faster. "But only 256! I want ma PeeCee with 500 JeeBees. I got so many EmPee3ssssss and moviezzzz".

The problem is that they ended up costing more engineering effort than either pure SSD or HDD, with minimal benefit and more failure points.

HDD manufacturers probably decided it was better to concentrate R&D money on multi actuator and EAMR technologies.

I don't know if they are still bothering with bit patterning or 3D magnetic recording as future strategies for 100+ TB HDDs.

Doug S · May 4, 2023

DrMrLordX said:
It can be accessed by anything on the entire SoC, which makes it different than a traditional L3 cache. I think the idea here is that the cache Anand was predicting would be shared across multiple devices.

Not only that, but the CPU cores do not have direct access to DRAM at all, everything goes through the SLC. That is probably the biggest difference between Apple Silicon and Intel & AMD CPUs.

Dunno why people are so confidently wrong about stuff like Apple's System Level Cache when there is plenty of information out there that will correct their misconceptions. And, you know, the information you get from the name itself.

Schmide · May 4, 2023

A/// said:
they were good doorstops.

I have one in my sys. Prob owned 5 or so over the years. Never had a problem. Now that I think about it, never had one fail. They all exist in some gifted system or another. I wonder if the extra cache helped Frodo share the load.

igor_kavinski · May 4, 2023

Schmide said:
I have one in my sys. Prob owned 5 or so over the years. Never had a problem.

I've heard when they DO fail coz of flash writes exhausted, the drive stops working even if the HDD function is fine. Seems the firmware is not designed to bypass the flash. Tries to access flash, fails and then goes into a loop of retries or something. At least, that's what I can guess from reports of Seagate SSHD failures.

A/// · May 5, 2023

soresu said:
The problem is that they ended up costing more engineering effort than either pure SSD or HDD, with minimal benefit and more failure points.

HDD manufacturers probably decided it was better to concentrate R&D money on multi actuator and EAMR technologies.

I don't know if they are still bothering with bit patterning or 3D magnetic recording as future strategies for 100+ TB HDDs.

on failure points: if the flash portion died, the drive was useless, at least when they first came out. the early drives, including apple's twist with the fusions, were terrible. I don't remember anyone liking those drives.
