Richland & Kabini rumours

Status
Not open for further replies.

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Having 256-bit vs 128-bit AVX only saves you an instruction, because you still need 2 cycles to complete the load/store.

Doesn't matter. Not having double the load/store bandwidth just prevents you from achieving the maximum gain in the real world.

Think about it. Trinity's iGPU beats Ivy Bridge because it has enormously more FLOPs, despite having the same bandwidth, and despite Trinity looking significantly constrained by bandwidth.

Putting it another way, if you doubled Sandy Bridge iGPU's bandwidth, it would still gain some performance even though it doesn't look bandwidth bound, since not all applications and code behave the same.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
Just because Clover Trail isn't successful, doesn't change that Kabini does have what's necessary to get into real Tablets(not the clunky old ones).

Battery life will improve, but probably not much. You are talking 15-30 mins.


See, that's a completely biased, baseless opinion. It might very well be true, but at this stage you have nothing to base it on. A broken clock is right two times a day.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
See, that's a completely biased, baseless opinion. It might very well be true, but at this stage you have nothing to base it on. A broken clock is right two times a day.

It is true, based on AMD's own presentations: http://forums.anandtech.com/showthread.php?t=2292118

They talk about "Enhanced core C6 power gating". They didn't talk about Connected Standby or their version of "S0iX" at Hot Chips either.

On Clover Trail, S0 and S3 don't exist, because S0iX replaces both: it has the speed of the former and the power efficiency of the latter.
 

Abwx

Lifer
Apr 2, 2011
11,819
4,744
136
Battery life will improve, but probably not much. You are talking 15-30 mins.

AMD claims with 60 nits brightness and 30WHr battery, Hondo gets 6 hours battery with video and 8 hours with browsing.

Intel claims with 200 nits brightness, and the same 30WHr battery, Clover Trail can do 10 hours video and 12+ hours browsing.


So you're saying that a simple die shrink of Hondo from 40 to 28nm would bring only 5-10% lower consumption/better battery life.....

Frankly, on this one you are completely off target....
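
To put rough numbers on this, here is a quick sketch using only the figures quoted above (30WHr battery, 6 hours of video on Hondo). The function names are made up for illustration, and it assumes the whole platform draws constant power over the run, which is a simplification.

```python
def average_power_w(battery_wh, runtime_h):
    """Average platform power implied by a battery capacity and a runtime."""
    return battery_wh / runtime_h


def runtime_gain_pct(runtime_h, extra_minutes):
    """Extra runtime expressed as a percentage of the original runtime."""
    return 100.0 * (extra_minutes / 60.0) / runtime_h


battery_wh = 30.0     # same battery in both vendor claims
hondo_video_h = 6.0   # AMD's video playback claim for Hondo

print(f"Hondo average draw during video: "
      f"{average_power_w(battery_wh, hondo_video_h):.1f} W")
for minutes in (15, 30):
    gain = runtime_gain_pct(hondo_video_h, minutes)
    print(f"+{minutes} min on a {hondo_video_h:.0f} h baseline = {gain:.1f}% longer")
```

That works out to about 5 W average for Hondo during video, and 15-30 minutes is roughly a 4-8% gain on a 6-hour baseline. By the same arithmetic, Clover Trail's claimed 10 hours on the same battery implies about 3 W.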
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
Doesn't matter. Not having double the load/store bandwidth just prevents you from achieving the maximum gain in the real world.

Think about it. Trinity's iGPU beats Ivy Bridge because it has enormously more FLOPs, despite having the same bandwidth, and despite Trinity looking significantly constrained by bandwidth.

Putting it another way, if you doubled Sandy Bridge iGPU's bandwidth, it would still gain some performance even though it doesn't look bandwidth bound, since not all applications and code behave the same.


That's not even close to the same thing, because VLIW4 isn't bottlenecked between the ALUs and the memory subsystem. 256-bit AVX is. Again, all you are saving going from 128-bit to 256-bit is an instruction: you still need all the same data, you still need to load and you still need to store. You then require 8 floats going through the same operation, which depending on instruction mix isn't very achievable. In fact, if you go read Beyond3D you will find quite a few game devs have gone into a lot of detail about how it's much easier to vectorize 4 floats than it is 8.
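
A toy model of the argument, for anyone who wants to play with it. This is a sketch under loud assumptions: one vector op issued per cycle, every byte loaded once and stored once, and a fixed load/store width in bytes per cycle. None of those numbers describe a real core, and `stream_cycles` is a made-up name.

```python
import math


def stream_cycles(n_bytes, vector_bytes, ls_bytes_per_cycle, ops_per_vector=1):
    """Cycles to process n_bytes, limited by either issue or load/store.

    The number of vector ops shrinks with wider vectors, but the bytes
    that must be moved do not, so memory cycles are width-independent.
    """
    n_vectors = math.ceil(n_bytes / vector_bytes)
    issue_cycles = n_vectors * ops_per_vector          # assumed 1 op per cycle
    memory_cycles = 2 * n_bytes / ls_bytes_per_cycle   # load once, store once
    return max(issue_cycles, memory_cycles)


N = 4096          # bytes of data, arbitrary
LS_WIDTH = 16     # hypothetical 128-bit load/store path

for ops in (1, 4):
    c128 = stream_cycles(N, 16, LS_WIDTH, ops)
    c256 = stream_cycles(N, 32, LS_WIDTH, ops)
    print(f"{ops} op(s) per vector: 128-bit {c128:.0f} cycles, 256-bit {c256:.0f} cycles")
```

With one op per loaded vector, both widths land on the same cycle count because the load/store path is the limit. With four ops per vector the 256-bit version halves the time, which is the case where the wider unit pays off.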
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
So you're saying that a simple die shrink of Hondo from 40 to 28nm would bring only 5-10% lower consumption/better battery life.....

You guys won't get it, because you are so entrenched in the PC enthusiast mindset. It's about idle power, which I have explained dozens of times before.

The mobile Core 2 Duo had an idle C6 state power of a mere 300mW.

Ah, I think I'm wasting energy here.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
That's not even close to the same thing, because VLIW4 isn't bottlenecked between the ALUs and the memory subsystem. 256-bit AVX is.

The general idea is the same. That doesn't mean the gain from 128-bit to 256-bit = zero.

Even if an application is so-called "primarily" bound by something, you'd still gain somewhat by enhancing something else. Nothing is 100% bound by one metric.
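
This is basically Amdahl's law. A minimal sketch, with made-up fractions since nobody in this thread has measured how much time real code spends limited by vector width:

```python
def amdahl_speedup(fraction_improved, factor):
    """Overall speedup when `fraction_improved` of run time gets `factor` times faster."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / factor)


# Hypothetical shares of run time that benefit from doubling vector width.
for frac in (0.1, 0.3, 0.5):
    print(f"{frac:.0%} of time sped up 2x -> {amdahl_speedup(frac, 2.0):.2f}x overall")
```

The gain is small when the improved part is a small share of run time, but it only reaches exactly zero when that share is zero.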
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
You guys won't get it, because you are so entrenched in the PC enthusiast mindset. It's about idle power, which I have explained dozens of times before.

The mobile Core 2 Duo had an idle C6 state power of a mere 300mW.

Ah, I think I'm wasting energy here.


No, see, the thing is you are making assumptions and we are calling you out on it. You then justify your position by falling back on those assumptions. See the conundrum?
 

Haserath

Senior member
Sep 12, 2010
793
1
81
At 3-4W, they can do it. The chip in PS Vita is said to have a TDP of 4-5W.

The problem is having active standby and the advanced power management required to meet that standard. That's a HUGE power management advantage that separates true next generation x86 devices from current ones.

AMD claims with 60 nits brightness and 30WHr battery, Hondo gets 6 hours battery with video and 8 hours with browsing.

Intel claims with 200 nits brightness, and the same 30WHr battery, Clover Trail can do 10 hours video and 12+ hours browsing.

I wish they had a standard measurement for screen brightness that accounted for screen size. Assuming the same size screen, that is definitely significant, but I also wonder what performance is like between them?

So you're saying that a simple die shrink of Hondo from 40 to 28nm would bring only 5-10% lower consumption/better battery life.....

Frankly, on this one you are completely off target....

Actually... I would bet it would be something like that. 10% on idle/low load and a bit more for the heavy ones.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
but I also wonder what performance is like between them?

Clover Trail probably ends up even faster on the CPU side. It's a 1.8GHz one vs a 1GHz one. The GPU is of course better on Hondo.

AMD does not state the screen size, but on Intel it's 10.6 inches.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
You guys won't get it, because you are so entrenched in the PC enthusiast mindset. It's about idle power, which I have explained dozens of times before.

The mobile Core 2 Duo had an idle C6 state power of a mere 300mW.

Ah, I think I'm wasting energy here.

28nm HKMG still has significantly lower leakage than 40nm.
 

Abwx

Lifer
Apr 2, 2011
11,819
4,744
136
28nm HKMG still has significantly lower leakage than 40nm.

You beat me to it. I was about to say the same, and also that the supply voltage will be lowered and the power-saving management will be better than on the Brazos platform.
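
For a rough feel of why supply voltage matters so much: switching (dynamic) power scales roughly with C * V^2 * f. The sketch below uses placeholder voltages, not Hondo or Temash specs, keeps capacitance fixed, and ignores leakage entirely.

```python
def dynamic_power_ratio(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of new to old switching power, assuming P ~ C * V^2 * f with C unchanged."""
    return (v_new ** 2 * f_new) / (v_old ** 2 * f_old)


# Placeholder values: a 10% supply voltage drop at the same clock.
ratio = dynamic_power_ratio(v_new=0.9, v_old=1.0)
print(f"Switching power after the drop: {ratio:.0%} of before")
```

Under those assumptions a 10% voltage drop alone cuts switching power by about 19%, before counting any leakage reduction from the process.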
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0

Haserath

Senior member
Sep 12, 2010
793
1
81
Is that not the point of Swift?

And it's 15%. Video playback has 20% improved battery life.

Kind of blows that 15-30 minute claim out of the water.

I'd say the point of Swift is to make an architecture with more speed (Bulldozerize, but with IPC too), but I wouldn't be surprised if they could get a bit better efficiency out of it. Apple is good at that.

I also accidentally mixed up numbers when adding there. I did 8.6=9 thanks to 60 minutes in an hour...D:
http://www.anandtech.com/show/5789/the-ipad-24-review-32nm-a5-tested/2
The iPad 2->2.4 actually shows it better: 17%-30% depending on the workload. This was a good jump thanks to the move from a regular LP process to HKMG.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
The general idea is the same. That doesn't mean the gain from 128-bit to 256-bit = zero.

Even if an application is so-called "primarily" bound by something, you'd still gain somewhat by enhancing something else. Nothing is 100% bound by one metric.

The thing to remember here is that the only time 256-bit AVX would beat 128-bit AVX is when you are throughput constrained. SB/IB can decode more instructions per cycle than needed for peak AVX performance.

So why does an SB/IB with twice the peak AVX throughput of an 8-core Bulldozer have very comparable AVX performance? I bet you that when Haswell comes out, 256-bit AVX (not AVX2) will have much greater performance per clock than on SB/IB, or than AVX-128 on Haswell.



edit: whoops..... i realized i cant count.........

edit2: wait i can.... . BOORAR!
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,035
3,811
136
Figured I would just add the notes I took while watching the Hot Chips presentation.

IPC
1% IPC in core fetch unit
1% IPC in ALU (divide unit)
5% IPC in load/store / OoOE
L2 stream prefetcher: "significant IPC improvement", runs ahead of core prefetcher unit


micro uarch: 10% freq over Bobcat
there is more frequency in the process
Jaguar 10% bigger than Bobcat at the same node

L1s are write-back to the L2
Banked L2s with very high throughput interbank
explicitly stated fully inclusive L2 for snooping, i.e. more than one CU is possible
L2 master routing interface for compute unit
L2D is at 1/2 clock
24 read+write memory transactions in flight


each core has an independent CC6 state, no need to flush L2 like Bobcat, significantly improves CC6 latency. Lot of work done to improve latency of both C6 and CC6.

clock gating efficiency up across the board, especially at idle
lots of work done to improve clock gating

15% IPC is per thread with each thread using 512KB of L2, greater IPC if more of the L2 is used for a thread
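
Putting the headline numbers from these notes together: gains like these multiply rather than add. The sketch below assumes the 15% IPC and 10% frequency both apply fully to the same workload, and that the itemized IPC gains compound with each other. The presentation did not promise either of those, and `compound` is just a helper name I made up.

```python
from functools import reduce


def compound(gains):
    """Multiply fractional gains together, e.g. [0.15, 0.10] gives about 0.265."""
    return reduce(lambda acc, g: acc * (1.0 + g), gains, 1.0) - 1.0


itemized_ipc = compound([0.01, 0.01, 0.05])   # fetch, divide unit, load/store
per_thread = compound([0.15, 0.10])           # total IPC claim x frequency claim

print(f"Itemized IPC gains compound to {itemized_ipc:.1%}")
print(f"15% IPC with 10% frequency gives {per_thread:.1%} per thread")
```

The itemized entries only account for about 7% of the 15%, so under this reading the rest would have to come from the L2 prefetcher and whatever else was not broken out.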
 

Gideon

Platinum Member
Nov 27, 2007
2,012
4,986
136
Just because Clover Trail isn't successful, doesn't change that Kabini does have what's necessary to get into real Tablets(not the clunky old ones).

Battery life will improve, but probably not much. You are talking 15-30 mins.

But aren't there additional benefits from a first SoC design with an on-chip south bridge, or FCH (Fusion Controller Hub) as they call it? South bridge power draw should be quite significant, at least judging from this article:

http://www.anandtech.com/show/5937/amd-reveals-brazos-20-apus-and-fch
Finally, AMD lists the FCH idle power as 750mW, down from 950mW on the previous A50M FCH.

That's only idle power draw and it should be included in the Temash 3.x W TDP (unlike Hondo)


P.S. Does anyone know what process node the FCH in Hondo is built on? I doubt it's 40nm.
 

Gideon

Platinum Member
Nov 27, 2007
2,012
4,986
136
And 10% clock means something around 1.9-2GHz. I don't see any mention of turbo on Jaguar.

IMHO designing a 4-core CPU with integrated graphics but no turbo would be downright idiotic in 2013, especially with a low power budget. Though that doesn't rule out the possibility, as we are dealing with AMD here after all.

Hopefully they also have a shared power plane, so that the CPU can tap into the GPU's TDP when the latter is idle. As power delivery and distribution isn't just a core feature, there is a possibility that they didn't want to talk about it during Hot Chips. Anyway, hopefully we'll find out more at CES.
 