Richland & Kabini rumours

Status
Not open for further replies.

Haserath

Senior member
Sep 12, 2010
793
1
81
Kaveri is alive but it's launching ~H1 2014 instead of H1 2013. Richland is (maybe 28nm) Piledriver with updated GPU (GCN?) launching in place of Kaveri. Both Kaveri and Richland should be FM2 compatible (2 placeholders on AMD's slide for FM2 successors to Trinity).

My guess is that AMD delayed Kaveri to beef up the GPU portion of the chip in response to Haswell's projected GPU performance. The CPU side can't be redesigned significantly in ~6-8 months (it can, but not enough to affect performance in any significant way). That's why I think the reason for the delay is GPU-related and not CPU-related.

That would be a dumb move. Push it back a year and you push competition from Haswell to Broadwell.

I bet there is another reason for the delay. If it was going smoothly, they would release Kaveri then plan the next chip with whatever they needed.
 

Abwx

Lifer
Apr 2, 2011
11,819
4,743
136
My guess is that AMD delayed Kaveri to beef up the GPU portion of the chip in response to Haswell's projected GPU performance.

That would be a dumb move. Push it back a year and you push competition from Haswell to Broadwell.

Both propositions can be compatible: if Richland has GCN and is better GPU-wise than HW, then it makes sense to push Kaveri back as a Broadwell competitor...
 

Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
If AMD wants to remain relevant, they're going to have to do some good product design and take the reins from the OEMs, somewhat like what Intel is planning to do.

Now is the time to leverage the Radeon-brand memory and SSDs, especially the SSDs. Partner with Asus and bring out something beautiful in 13", not an ultrabook but something with a 1600x900 screen, a good Kabini APU, 2x4 GB DDR3-2000 1.35v, and a 128 GB SSD. This should be around $700 US.

It needs to be absolutely gorgeous and solid as a rock, something like my U46E but more modern-looking. All metal, not a single rough edge or clashing color. No glossiness, except on the LCD if you must.

And ENFORCE this. You can call it something catchy too, like Gestalt or Synergy. The point being, show that 1) you own the brand and 2) The whole is more than the sum of its parts. Make people associate "AMD" with "these awesome little all-purpose machines that go stupid fast [because of the SSD]."
 

Haserath

Senior member
Sep 12, 2010
793
1
81
So anyone want to answer this? There's a simple name for it too, not a long-winded technical explanation.

Power, power efficiency, and speed.

Clover trail fits into phones. Core fits into tablet+.

Kabini can't reach phones and doesn't have the brawn to match core, even at the same power.

It's a difficult spot to be in, and they can't afford to make Kabini bigger. They still only have 128-bit data width so AVX is half speed...
---
Wait a sec... SSE is 128-bit, right? Brazos only had 64-bit width. Wouldn't that make Kabini 4x faster in many things even without IPC and clocks?
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
Power, power efficiency, and speed.

Wait a sec... SSE is 128-bit, right? Brazos only had 64-bit width. Wouldn't that make Kabini 4x faster in many things even without IPC and clocks?
Yes, Jaguar will have ~2x more FLOPS throughput per core versus Bobcat. This would make any SSE workload much faster on Kabini, basically on the level of K10 (per core), if not faster. What they refer to as the ">15% IPC" increase on the slides is the common integer IPC increase (a net effect of all the changes in the core).

So, summed up: 2x more FP throughput, 2x more cores, and possibly a higher clock in the QC Kabini variant vs. DC Bobcat. End result: 4x more performance in some workloads (they have to be very heavy on FP and very well threaded). In other stuff the integer speed would determine the performance difference, and at best this would be maximum Turbo on Kabini (2.2GHz?) with +15% IPC versus a 1.7GHz Bobcat: 2.2 x 1.15 / 1.7 = 1.49. Note that Kabini may clock "just" 10% higher than the 1.7GHz Bobcat, even in Turbo mode, so the integer speedup would end up being 1.1 x 1.15 = 1.265, or about 26%. Still not bad.
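For anyone who wants to play with these numbers, here is a minimal Python sketch of the same back-of-envelope arithmetic. It assumes performance scales linearly with clock, IPC, FP width and core count, which real workloads rarely do. The function name `relative_speedup` and its parameters are made up for illustration, and the clock figures are the guesses from this post, not confirmed specs.

```python
def relative_speedup(clock_ratio, ipc_ratio=1.0, fp_width_ratio=1.0, core_ratio=1.0):
    """Idealised speedup: the product of independent scaling factors."""
    return clock_ratio * ipc_ratio * fp_width_ratio * core_ratio


# Integer code, optimistic clock guess: 2.2 GHz Turbo vs 1.7 GHz, +15% IPC.
optimistic_int = relative_speedup(2.2 / 1.7, ipc_ratio=1.15)

# Integer code, conservative clock guess: only 10% higher clock, +15% IPC.
conservative_int = relative_speedup(1.10, ipc_ratio=1.15)

# Best-case FP: 2x FP throughput per core and 2x cores at equal clock.
best_case_fp = relative_speedup(1.0, fp_width_ratio=2.0, core_ratio=2.0)

print(f"optimistic integer speedup:   {optimistic_int:.3f}")
print(f"conservative integer speedup: {conservative_int:.3f}")
print(f"best-case FP speedup:         {best_case_fp:.1f}")
```

Changing the clock guesses shows how sensitive the integer estimate is to the one number nobody knows yet.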
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Informal, you are underestimating marketing again. Within that 15%, the SSE gain is already counted.

And 10% clock means something around 1.9-2GHz. I don't see any mention of turbo on Jaguar.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
Informal, you are underestimating marketing again. Within that 15%, the SSE gain is already counted.

And 10% clock means something around 1.9-2GHz. I don't see any mention of turbo on Jaguar.

SSE resources in the core doubled. But sure, they have probably included the gain from some SSE-heavy workloads to beef up the "average" number. That's why the > sign is there, since if you run Povray you will probably see much more than a 15% per-core, per-clock increase versus Bobcat. Think Yonah->Merom. If you run C11.5 you probably won't see more than 10% though, as we have seen with K8->K10.

Yes, clock is the biggest unknown. I'm pretty sure Turbo is there since it's there on some Brazos models. The C-60 is a 1GHz base, 1.33GHz Turbo Core chip. So Turbo is a very possible option for the frequency-tuned Jaguar (they said they increased the pipeline length for this sole purpose). On top of this you have 28nm bulk versus 40nm bulk, which should bring at least some advantage.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I don't think they directly picked heavy SSE workloads. But they sure did pick average workloads, including media, because that's what you do when creating these numbers. And since SSE is universal and commonly used today, it's very hard not to have code that uses SSE.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
It's true that SSE is very common, but some applications use SSE more heavily than others (like Povray). In any case the average figure is mostly inflated by those workloads, but as the architect stated in the video, most of the "IPC" improvement comes from the notably improved L/S system, which now supports 128-bit loads and stores (64-bit in Bobcat), features a more advanced OoO memory pipeline, advanced l-t-s forwarding, etc. Memory instructions are roughly 50% of the x86 instruction mix in modern workloads (according to some statistics at least; MOV alone is 35%).
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
Yes, that's true. But the memory pipeline improvement is the largest "contributor" to the IPC improvement, as the architect stated when he covered that part of the presentation. Without that change the SSE hardware would be underutilized, plus it benefits pure integer code too, as it handles loads/stores for both the int and FP portions of the core (even though they have separate schedulers for int and FP, like K10 and Bulldozer).
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Yes, that's true. But the memory pipeline improvement is the largest "contributor" to the IPC improvement, as the architect stated when he covered that part of the presentation. Without that change the SSE hardware would be underutilized, plus it benefits pure integer code too, as it handles loads/stores for both the int and FP portions of the core (even though they have separate schedulers for int and FP, like K10 and Bulldozer).

The entire memory part is only done for the SSE part. Same as with Haswell. Outside AVX, the 256bit for Haswell means nothing.
 

Haserath

Senior member
Sep 12, 2010
793
1
81
Informal, you are underestimating marketing again. Within that 15%, the SSE gain is already counted.

And 10% clock means something around 1.9-2GHz. I don't see any mention of turbo on Jaguar.

The L2 cache boost alone should do that... Then they have those core improvements.

This chip could be decent even compared to core, if its price is right.
 

NTMBK

Lifer
Nov 14, 2011
10,409
5,673
136
Seriously guys, arguing about/getting excited over performance improvements based on marketing slides and hearsay is just dumb. Wait for the benches.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Power, power efficiency, and speed.

Clover trail fits into phones. Core fits into tablet+.

Yes, you got the idea! The term is called "Connected Standby/S0iX". Because it lacks that (the slide only mentions enhanced C6), it won't be competitive in battery life at all. That makes Kabini dead in the water for tablets. TDP won't matter, and even if it did, 3W+ is at the very high end of the spectrum.

The idle power of pre-Clover Trail x86 devices is astoundingly high. I can tell you my almost pocketable Viliv S5, idle with the screen off, only achieves 10 hours of battery life. Since it has a 24WHr battery, that indicates an idle power drain of 2.4W.

Of course, implementation isn't easy at all. But when you get it working, you'll get more than an order of magnitude decrease in idle power! The reason Intel calls it "S0iX" is that it has the wake-up time of the S0 state with the power use of the S3 state.
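To put the idle numbers in perspective, here is a small Python sketch of the arithmetic. It assumes a constant power draw and that the whole rated battery capacity is usable, both of which are simplifications. The function names are made up for illustration, and the 10x factor is just the "order of magnitude" figure from above, not a measured S0iX result.

```python
def idle_power_w(battery_wh, standby_hours):
    """Average idle draw implied by a battery capacity and a standby time."""
    return battery_wh / standby_hours


def standby_hours(battery_wh, idle_w):
    """Standby time a battery would last at a given constant idle draw."""
    return battery_wh / idle_w


# Viliv S5 figures from the post: 24 Wh battery, 10 hours screen-off idle.
measured_idle_w = idle_power_w(24, 10)

# Hypothetical: the same battery with idle power cut by a factor of ten.
improved_hours = standby_hours(24, measured_idle_w / 10)

print(f"implied idle draw: {measured_idle_w:.2f} W")
print(f"standby at one tenth of that draw: about {improved_hours:.0f} hours")
```

Under those assumptions a tenfold cut in idle power turns 10 hours of standby into roughly 100 hours on the same battery.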

About SSE: Kabini uses 2 cycles to achieve 256-bit AVX, just like how for the longest time SSE2 was done using 2 64-bit cycles.
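The "2 cycles" point can be sketched as a simple width ratio. This is a deliberately simplified Python model that only counts how many passes a wide operation needs through a narrower datapath; it ignores scheduling, latency and load/store limits. The function name `passes_per_op` is made up for illustration.

```python
def passes_per_op(vector_bits, datapath_bits):
    """Number of passes needed to push one vector op through the datapath."""
    # Ceiling division using integers only.
    return -(-vector_bits // datapath_bits)


# 256-bit AVX on a 128-bit datapath, as described for Kabini.
print("256-bit op on 128-bit datapath:", passes_per_op(256, 128))

# 128-bit SSE2 on a 64-bit datapath, as on older cores.
print("128-bit op on 64-bit datapath: ", passes_per_op(128, 64))

# 128-bit SSE on a 128-bit datapath needs a single pass.
print("128-bit op on 128-bit datapath:", passes_per_op(128, 128))
```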
 

Haserath

Senior member
Sep 12, 2010
793
1
81
Yes, you got the idea! The term is called "Connected Standby/S0iX". Because it lacks that (the slide only mentions enhanced C6), it won't be competitive in battery life at all. That makes Kabini dead in the water for tablets. TDP won't matter, and even if it did, 3W+ is at the very high end of the spectrum.

About SSE: Kabini uses 2 cycles to achieve 256-bit AVX, just like how for the longest time SSE2 was done using 2 64-bit cycles.
I refer to TDP/power, because that's a target needed for form factor. No matter how efficient core is, if it can't reach 1-2W for a phone, it can't reach phones.

Kabini definitely doesn't sound like it stands much of a chance when it has ARM tablets at the bottom and -perhaps- Intel tablets at the top for price, too much competition for them to win.

And AVX isn't as relevant as SSE. We'll probably see expansion down the road if AVX is consumer widespread and AMD is around.

AMD really needs to work on power efficiency tactics, like Intel, on Kabini. The next core shouldn't feature any improvement besides reducing wasted power. That is what limits them after all!
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
256-bit AVX over 128-bit AVX shows almost no performance increase in most instances, even on current Intel hardware, which does it in one cycle. You're only saving a single instruction while requiring more data to be local, and then you need more OoO resources to keep the ALUs utilized. Having only 256-bit load and 128-bit store also massively hurts 256-bit throughput.

Haswell will be different with 512-bit load / 256-bit store, but if you think Intel wants to compete against Kabini (sub-100 mm sq on dirt-cheap 28nm) with ULV parts, you have rocks in your head.

Kabini will sell great and I'm willing to put actual money on it. Anyone else want to step up?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
I refer to TDP/power, because that's a target needed for form factor. No matter how efficient core is, if it can't reach 1-2W for a phone, it can't reach phones.

At 3-4W, they can do it. The chip in PS Vita is said to have a TDP of 4-5W.

The problem is having active standby and the advanced power management required to meet that standard. That's a HUGE power management advantage that separates true next-generation x86 devices from current ones.

AMD claims with 60 nits brightness and 30WHr battery, Hondo gets 6 hours battery with video and 8 hours with browsing.

Intel claims with 200 nits brightness, and the same 30WHr battery, Clover Trail can do 10 hours video and 12+ hours browsing.
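The two vendor claims above can be turned into implied average platform power with a quick Python sketch. It assumes constant draw and full use of the 30 Wh capacity, and it takes both vendors' figures at face value; the dictionary layout and the function name `average_draw_w` are made up for illustration. Note the brightness settings differ (60 vs 200 nits), so the comparison is, if anything, tilted in Hondo's favour.

```python
BATTERY_WH = 30  # same battery capacity quoted in both claims

# Claimed runtimes in hours, as quoted above (Clover Trail browsing is "12+").
claimed_hours = {
    "Hondo": {"video": 6, "browsing": 8},
    "Clover Trail": {"video": 10, "browsing": 12},
}


def average_draw_w(battery_wh, hours):
    """Average platform power implied by a battery capacity and a runtime."""
    return battery_wh / hours


# Implied average power per platform and workload.
draw = {
    chip: {task: average_draw_w(BATTERY_WH, hours) for task, hours in tasks.items()}
    for chip, tasks in claimed_hours.items()
}

for chip, tasks in draw.items():
    for task, watts in tasks.items():
        print(f"{chip:12s} {task:9s} {watts:.2f} W")
```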
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
256bit AVX over 128bit AVX shows almost no performance increase in most instances even on current intel hardware which do it in one cycle,

Not true. The gain is nowhere near the theoretical 2x, but it's nevertheless significant.

The problem is that FP is actually not that important for consumer markets. On Xeon E5s, they say AVX can account for a 20-30% gain, with a few HPC workloads reaching 60%. Things like video transcoding are integer, for example.

Perhaps though AMD has server in mind with AVX.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
Yep, and you still can't actually do anything useful on Clover Trail. A few things to remember: Kabini is two gens away from Hondo. TSMC 28nm is comparatively way better than TSMC 40nm was/is. Hondo was a time-to-market part that didn't do anything above and beyond Zacate.

So I wouldn't go making assumptions and extrapolations about anything to do with Kabini; we simply don't know. In the video I have seen from Hot Chips, they specifically mention that the entire core has undergone power consumption improvements compared to Bobcat.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
Not true. The gain is nowhere near the theoretical 2x, but it's nevertheless significant.

The problem is that FP is actually not that important for consumer markets. On Xeon E5s, they say AVX can account for a 20-30% gain, with a few HPC workloads reaching 60%. Things like video transcoding are integer, for example.

Perhaps though AMD has server in mind with AVX.

Reread what I said: the core has 256/128 L/S. Having 256-bit vs. 128-bit AVX only saves you an instruction because you still need 2 cycles to complete the L/S. AVX itself can and does offer serious benefits; H.264 encode is a perfect example.

Edit: by "the core" I mean SB/IB.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Just because Clover Trail isn't successful doesn't change the fact that Kabini doesn't have what's necessary to get into real tablets (not the clunky old ones).

Battery life will improve, but probably not much. You are talking 15-30 mins.
 