Discussion Qualcomm Snapdragon Thread


Nothingness

Platinum Member
Jul 3, 2013
2,434
764
136
I remind people that one of the reasons Andrei left this forum (not AT) was constant unfounded criticisms by some here. OTOH I guess he could not tell us much anyway due to where he works now.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,619
5,875
136
Unless I missed something, it's the dispatch width that is 14 uops. That's not the same as a 14-wide instruction decoder.
Dispatch width is extremely wide. Since there is no uop cache, those uops should come entirely from instruction decode. Compare that with 6 for Z4 or 8 for Z5, though it's not an apples-to-apples comparison.
Same mispredict penalty as Z4
Same L1 latency as Z4
Lower latency for vector (NEON), but again not apples-to-apples vs x86 vector
ROB is surprisingly conservative for a 2024 core.

Similar number of LS and ALU ports compared to Z5, although Z5/Z4 have two additional FP ports. Store and load queue depths are comparable to Z4.
Not sure how branch addresses are calculated; Z5, for instance, has 4 AGUs for those

It looks OK, nothing in particular stands out. If it is mostly an efficiency play then we will see soon.
 

Nothingness

Platinum Member
Jul 3, 2013
2,434
764
136
Dispatch width is extremely wide. Since there is no uop cache, those uops should come entirely from instruction decode. Compare that with 6 for Z4 or 8 for Z5, though it's not an apples-to-apples comparison.
That can be quite complex: uops are stored in queues between decoder and dispatch, and a decoder could also emit two uops per cycle. For instance, a mem operation with writeback can be split into two uops, one going into the ALU queue(s) for the writeback of the base register while the other goes into the load/store queue(s). So I'm afraid nothing can be guessed about decoder width at this point (I agree it's surely wide, but it's unlikely to be 14-wide).

I've often been wrong, so I won't rule out being wrong again

Same mispredict penalty as Z4
Same L1 latency as Z4
Lower latency for vector (NEON), but again not apples-to-apples vs x86 vector
ROB is surprisingly conservative for a 2024 core.

Similar number of LS and ALU ports compared to Z5, although Z5/Z4 have two additional FP ports. Store and load queue depths are comparable to Z4.
Not sure how branch addresses are calculated; Z5, for instance, has 4 AGUs for those

It looks OK, nothing in particular stands out. If it is mostly an efficiency play then we will see soon.
I agree with you on all these points. Can't wait to see the reverse engineering of the uarch details by talented hackers! And benchmarks.

PS - A writeback operation in AArch64 is, for instance, ldr x0, [x1], #8, which loads from the address in x1 and then adds 8 (the size of x0 in bytes) to x1.
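
To make the decode vs. dispatch point concrete, here is a toy Python sketch. It is only an illustration, not anything taken from vendor documentation: the cracking rule (a load or store with base-register writeback becomes one load/store uop plus one ALU uop) and the names crack and uops_per_group are assumptions made up for this example.

```python
import re

# Hypothetical cracking rule (an assumption for illustration):
# an access with base-register writeback, such as "ldr x0, [x1], #8",
# becomes one load/store uop plus one ALU uop for the writeback.
POST_INDEX = re.compile(r"\[\s*\w+\s*\]\s*,\s*#-?\d+")  # [x1], #8
PRE_INDEX = re.compile(r"\[[^\]]*\]!")                   # [x1, #8]!


def crack(instr: str) -> list[str]:
    """Return the uop classes for one instruction in this toy model."""
    mnemonic = instr.split()[0].lower()
    if mnemonic.startswith(("ldr", "str")):
        uops = ["LS"]
        if POST_INDEX.search(instr) or PRE_INDEX.search(instr):
            uops.append("ALU")  # base-register writeback
        return uops
    return ["ALU"]  # everything else is treated as a single ALU uop


def uops_per_group(instrs: list[str]) -> int:
    """Count the uops produced by one group of decoded instructions."""
    return sum(len(crack(i)) for i in instrs)


group = [
    "ldr x0, [x1], #8",
    "ldr x2, [x3], #8",
    "add x4, x0, x2",
    "str x4, [x5, #16]!",
]
print(len(group), "instructions ->", uops_per_group(group), "uops")
```

In this model four decoded instructions turn into seven uops, which is why a dispatch width quoted in uops says little on its own about how many instructions the decoder handles per cycle.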
 

SarahKerrigan

Senior member
Oct 12, 2014
379
548
136
That can be quite complex: uops are stored in queues between decoder and dispatch, and a decoder could also emit two uops per cycle. For instance, a mem operation with writeback can be split into two uops, one going into the ALU queue(s) for the writeback of the base register while the other goes into the load/store queue(s). So I'm afraid nothing can be guessed about decoder width at this point (I agree it's surely wide, but it's unlikely to be 14-wide).

I've often been wrong, so I won't rule out being wrong again


I agree with you on all these points. Can't wait to see the reverse engineering of the uarch details by talented hackers! And benchmarks.

PS - A writeback operation in AArch64 is, for instance, ldr x0, [x1], #8, which loads from the address in x1 and then adds 8 (the size of x0 in bytes) to x1.

There's also fusion to consider. As you know, "width" is kind of a fuzzy concept, especially with aggressively OoO machines where the number of uops executing in a given cycle can greatly exceed the machine's sustained whole-pipe width.

Note that Neoverse V2, which is emphatically an 8-wide core, is listed as 16-wide in its LLVM machine model. Vendors tend to set the high-level machine-model variables in their own unique ways, often based on quantitative analysis of codegen rather than on the uarch manual.
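
The point about uops executing in a given cycle exceeding the sustained width can be sketched with a toy queue model in Python. Everything here is an assumption for illustration: the function name simulate, the unbounded scheduler queue, and the numbers (8 uops per cycle from the front end, 14 issue ports, a 3-cycle stall) are made up and are not claims about any real core.

```python
def simulate(frontend_width: int, issue_ports: int,
             stall_cycles: int, total_cycles: int) -> list[int]:
    """Toy model: uops enter a scheduler queue at frontend_width per cycle.

    Up to issue_ports uops leave the queue each cycle, except during the
    first stall_cycles cycles, when nothing issues (think of a cache miss).
    The queue is unbounded, which a real machine's would not be.
    """
    queued = 0
    issued_per_cycle = []
    for cycle in range(total_cycles):
        queued += frontend_width
        if cycle < stall_cycles:
            issued = 0
        else:
            issued = min(queued, issue_ports)
        queued -= issued
        issued_per_cycle.append(issued)
    return issued_per_cycle


trace = simulate(frontend_width=8, issue_ports=14,
                 stall_cycles=3, total_cycles=12)
print("issued per cycle:", trace)
print("peak issue in one cycle:", max(trace))
print("average per cycle:", sum(trace) / len(trace))
```

With these made-up numbers the back end issues 14 uops per cycle while it drains the backlog, yet the average over the run is still the 8 per cycle the front end can sustain, so both figures could be quoted as a "width".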
 

soresu

Platinum Member
Dec 19, 2014
2,692
1,898
136
Went back to that ARM rumor site and found something odd under Cortex X6:


Implication seems to be a new core IP segment between X and A7xx starting with this 'Alto'.

Not sure if this is just a bad translation or not.
 

SpudLobby

Senior member
May 18, 2022
621
372
96
Went back to that ARM rumor site and found something odd under Cortex X6:

View attachment 98541
Implication seems to be a new core IP segment between X and A7xx starting with this 'Alto'.

Not sure if this is just a bad translation or not.
If ELP = the A5x, isn’t that in between A5x and A7x? Or is ELP “extra large perf”, not “extra low power”?
 