Donanimhaber FX8150 Video Review/Benchmarks!

Status
Not open for further replies.

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Agree. I am in the same boat, as I am about ready to update my rendering farm again. I have waited to see how BD performs before making a decision. If BD doesn't keep up, then the decision I had already made, to go with 2600Ks, will be what I do.

Although to be honest, I generally don't overclock my rendering machines. As with server workloads, I prefer lower or stock clocks and stability. The main workstation, yes, I will overclock, because doing test renders is often quicker on one machine than sending them over the network and waiting to inspect the image; it depends on how heavy the scene is.

But keeping stable overclocks on several machines at once is not something I like to do. Stability is key for me, as I sometimes sell rendering time to another small local studio if they are behind. I hate having to chase down corrupt frames, especially when the work isn't mine.

The machines render faster when overclocked for sure, but I have had times in the past when they went to send their frames back across the network and something went wrong. I don't know if the overclock caused the NICs to have problems or what. But I have never noticed it when running everything at stock.

To be honest though, that was a long time ago when I first encountered that, but since then I have always gone stock with the rendering machines in the farm. It may have been something inherent in those first machines many years ago and the way I had it set up. But ever since then I have just gone stock when upgrading.


I agree on not O/C'ing for a render farm. Heat, power, etc. are important. I'm just talking about a workstation though. I just do it as a hobby. I make game models and the most time-consuming task for me is baking textures. I'd like to speed that up as much as possible for as little cash as possible. BD has the potential to be the best bang/$. At least I'm hoping it does. Hard to say with the benches they've been showing though. Keeping my fingers crossed. :\
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
What does this actually mean, if accurate?

x264 encoding
1080p
BD 4.85 12:33
2500K 3.8 18:56
PII X4 3.7 21:30

720p
BD 4.85 8:40
2500K 3.8 11:01
PII X4 3.7 15:15

Would have been nice to have BD results at 3.7 to 3.8 GHz as well.


Not familiar with the bench, but it looks like how long it took to complete the task. For example, BD clocked @ 4.85GHz took 12min/33sec, and the 2500K @ 3.8GHz took 18min/56sec.

Assuming perfect scaling and adjusting for the difference in clocks, BD looks to be ~20% faster @1080p than the 2500K. That would make the difference between it and the 2600K negligible, figuring HT would speed up the 2600K by ~20% compared to the 2500K as well. It would come down to price and whether BD was fast enough in lower-thread workloads. I think it will be fine in single thread, although admittedly slower. So, if it's appreciably cheaper, which it should be, then BD would be better overall, IMO.
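
The clock adjustment described above can be worked through as a rough sketch. It assumes encode time scales inversely with clock speed (the "perfect scaling" assumption from the post), and the helper names `to_seconds` and `clock_normalized_speedup` are made up for illustration. The times and clocks are the ones quoted in the thread.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'min:sec' string such as '12:33' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def clock_normalized_speedup(time_a: str, clock_a: float,
                             time_b: str, clock_b: float) -> float:
    """Throughput of chip A relative to chip B if A ran at B's clock.

    Assumption: runtime is inversely proportional to clock speed,
    which real chips only approximate.
    """
    # Slowing A from clock_a down to clock_b stretches its runtime.
    time_a_at_b_clock = to_seconds(time_a) * clock_a / clock_b
    return to_seconds(time_b) / time_a_at_b_clock


# BD @ 4.85 GHz vs 2500K @ 3.8 GHz, numbers as posted in the thread
speedup_1080p = clock_normalized_speedup("12:33", 4.85, "18:56", 3.8)
speedup_720p = clock_normalized_speedup("8:40", 4.85, "11:01", 3.8)

print(f"1080p: {speedup_1080p:.2f}x")  # 1080p: 1.18x
print(f"720p: {speedup_720p:.2f}x")    # 720p: 1.00x
```

Under that assumption the 1080p result works out to roughly 18% in BD's favor, in line with the ~20% estimate, while the 720p result comes out about even once clocks are equalized.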
 

sequoia464

Senior member
Feb 12, 2003
870
0
71
I found this interesting ...


"Actually, we already have such an issue known for Bulldozer, and NO bench-marked system has the patch installed!

The shared L1 cache is causing cross invalidations across threads so that the prefetch data is incorrect in too many cases and data must be fetched again. The fix is a "simple" memory alignment and (possible)tagging system in the kernel of Windows/Linux.

I reviewed the code for the Linux patch and was astonished by just how little I know of the Linux kernel... lol! In any event, it could easily cost 10% in terms of single threaded performance, possibly more than double that in multi-threaded loads on the same module due to the increased contention and randomness of accesses.

Not sure if ordained reviewers have been given access to the MS patch, but I'd imagine (and hope) so! Last I saw, the Linux kernel patch was still being worked on by AMD (publicly) and Linus was showing some distaste for the method used to address the issue. One comment questioned the performance cost but had received no replies... but you don't go re-working kernel memory mapping for anything less than 5-10%... just not worth it!"

I saw this at extreme systems .. post 323 here ..http://www.xtremesystems.org/forums...nally-tested&p=4969164&viewfull=1#post4969164
 

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
I found this interesting ...


"Actually, we already have such an issue known for Bulldozer, and NO bench-marked system has the patch installed!

The shared L1 cache is causing cross invalidations across threads so that the prefetch data is incorrect in too many cases and data must be fetched again. The fix is a "simple" memory alignment and (possible)tagging system in the kernel of Windows/Linux.

I reviewed the code for the Linux patch and was astonished by just how little I know of the Linux kernel... lol! In any event, it could easily cost 10% in terms of single threaded performance, possibly more than double that in multi-threaded loads on the same module due to the increased contention and randomness of accesses.

Not sure if ordained reviewers have been given access to the MS patch, but I'd imagine (and hope) so! Last I saw, the Linux kernel patch was still being worked on by AMD (publicly) and Linus was showing some distaste for the method used to address the issue. One comment questioned the performance cost but had received no replies... but you don't go re-working kernel memory mapping for anything less than 5-10%... just not worth it!"

I saw this at extreme systems .. post 323 here ..http://www.xtremesystems.org/forums...nally-tested&p=4969164&viewfull=1#post4969164

Quite interesting..
 

Despoiler

Golden Member
Nov 10, 2007
1,967
772
136
I found this interesting ...


"Actually, we already have such an issue known for Bulldozer, and NO bench-marked system has the patch installed!

The shared L1 cache is causing cross invalidations across threads so that the prefetch data is incorrect in too many cases and data must be fetched again. The fix is a "simple" memory alignment and (possible)tagging system in the kernel of Windows/Linux.

I reviewed the code for the Linux patch and was astonished by just how little I know of the Linux kernel... lol! In any event, it could easily cost 10% in terms of single threaded performance, possibly more than double that in multi-threaded loads on the same module due to the increased contention and randomness of accesses.

Not sure if ordained reviewers have been given access to the MS patch, but I'd imagine (and hope) so! Last I saw, the Linux kernel patch was still being worked on by AMD (publicly) and Linus was showing some distaste for the method used to address the issue. One comment questioned the performance cost but had received no replies... but you don't go re-working kernel memory mapping for anything less than 5-10%... just not worth it!"

I saw this at extreme systems .. post 323 here ..http://www.xtremesystems.org/forums...nally-tested&p=4969164&viewfull=1#post4969164

Would this help explain the absolutely horrendous cache performance we have seen in the leaks?
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Would this help explain the absolutely horrendous cache performance we have seen in the leaks?

I made the post... and have come to the understanding that it doesn't apply as often as I had expected (at least AMD is saying it only causes a 3% long-term effect, though up to my numbers in the short term ("microbenchmarking")).

The cache numbers are actually by design for the L2: 50% greater latency, 25-30% better read performance, and worse write performance (mediated by an L2 Write Coalescing Cache). The L1 details were sparse, and the L3 details were completely missing for all intents and purposes as it pertains to performance.

I would have anticipated, however, that L1 performance would have been greatly increased and that this was the explanation for the tiny 16KB L1D and the sharing of the L1I on a module - faster memory takes more die space in most circumstances.

In all, the patch may have a 5% effect on some benchmarks, less on most, and a few will see more than 5% as they'll be torturing the processor to actually show the design compromise/errata.

All in all, however, Bulldozer should have higher IPC than Phenom II. It was designed to have higher IPC, and JF-AMD has stood by that even in the most recent posts I've seen (even after the server shipments). Something is strange with the results, and the results do indicate L1 cache contention issues are to blame (been in the game a while... funny what you can pick out from benchmarks sometimes).

I am pleased to see the widely differing scores in all the leaks, though. It makes me believe the systems aren't being properly tweaked and we are seeing a poor default setting coming to light... or something sinister... but few would benefit from that...

--The loon
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
All in all, however, Bulldozer should have higher IPC than Phenom II. It was designed to have higher IPC, and JF-AMD has stood by that even in the most recent posts I've seen (even after the server shipments). Something is strange with the results, and the results do indicate L1 cache contention issues are to blame (been in the game a while... funny what you can pick out from benchmarks sometimes).

I haven't seen a single leaked benchmark where BD beats PhII clock for clock. Not one.

I personally don't think they can - they cut out a third of its decoding and execution units, what do they expect?

So I think JF-AMD, who got his information from the engineers, was badly misled on this one.
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
http://www.youtube.com/watch?v=CqTU4wVvZL0

I find this funny. I know it's a cheap shot and akin to the "Hitler finds out" videos, but nonetheless well executed.


Especially the points about Mr. John Fruehe.


In retrospect, you have to wonder how bad things are in a CPU Company's organisation when the... SERVER MARKETING CHIEF, has to come to geeky gamer nerd forums to defend a DESKTOP product.

Maybe that should have been the first red light LONG ago?
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
http://www.youtube.com/watch?v=CqTU4wVvZL0

I find this funny. I know it's a cheap shot and akin to the "Hitler finds out" videos, but nonetheless well executed.


Especially the points about Mr. John Fruehe.


In retrospect, you have to wonder how bad things are in a CPU Company's organisation when the... SERVER MARKETING CHIEF, has to come to geeky gamer nerd forums to defend a DESKTOP product.

Maybe that should have been the first red light LONG ago?

JF has been making the rounds with the forums for a long time. He says he does it on his own time and not as an AMD employee. Of course, he's always an AMD employee, but I think he does it on his own more out of passion. There are a number of people in the industry who post here and elsewhere.
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
JF has been making the rounds with the forums for a long time. He says he does it on his own time and not as an AMD employee. Of course, he's always an AMD employee, but I think he does it on his own more out of passion. There are a number of people in the industry who post here and elsewhere.


Of course.


But if I had a passion for my job @ Intel or AMD,

I wouldn't even remotely start spouting information on the product's features and performance when it most likely wasn't going to be impressive.


When the first fake (were they fake, or not?) OBR benchmarks appeared, JF would have KNOWN by then if they were close to internal performance and possible retail performance of BD.

And if so, he should have just stated them false and told everyone to wait patiently.

Not go "MOAR IPC! MOAR PERFORMANCE MOAR CORES MOAR CLOCK!" - spreading hope on something he, as a non-engineer and even a non-division employee, had hardly any true knowledge of.

(Forgive my 4chan rhetoric.)
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,803
1,286
136
Integer performance 2/3rds (or was it 1/3? too lazy to check) of K12
FP performance 1/4th of Sandy Bridge

Recipe for BAD NEWS!!!

Terrace215 is luling so hard now
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Of course.


But if I had a passion for my job @ Intel or AMD,

I wouldn't even remotely start spouting information on the product's features and performance when it most likely wasn't going to be impressive.


When the first fake (were they fake, or not?) OBR benchmarks appeared, JF would have KNOWN by then if they were close to internal performance and possible retail performance of BD.

And if so, he should have just stated them false and told everyone to wait patiently.

Not go "MOAR IPC! MOAR PERFORMANCE MOAR CORES MOAR CLOCK!" - spreading hope on something he, as a non-engineer and even a non-division employee, had hardly any true knowledge of.

(Forgive my 4chan rhetoric.)

I'm pretty sure JF was just repeating what he was told. Even if he knew the leaks were accurate for the samples in the field, he made it seem like the final chips would be improved. I'll give him the benefit of the doubt and think that he was assured they would be and he went with it. Unfortunately, it's not looking that way now, though. As far as performance goes, we'll find out soon enough. If it does turn out that what JF was reporting was wrong, I'll be curious to see what he says.
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
I'm pretty sure JF was just repeating what he was told. Even if he knew the leaks were accurate for the samples in the field, he made it seem like the final chips would be improved. I'll give him the benefit of the doubt and think that he was assured they would be and he went with it. Unfortunately, it's not looking that way now, though. As far as performance goes, we'll find out soon enough. If it does turn out that what JF was reporting was wrong, I'll be curious to see what he says.

I don't think he intentionally lied - I think he was just repeating what his engineers told him. Which, if you think about it, sounds like an impossible goal.

I mean, their goal was basically to build a core that used about 80% of the transistor budget of a Phenom II core and simultaneously had higher IPC. As soon as we saw that the pipeline length had been increased, I think it should have been an obvious sign that it would be VERY difficult to achieve even the same IPC.

Maybe that's why they did it - they bargained on higher clockspeeds enabling them to make up the deficit that a smaller die had cost them in IPC. Looking at how that went for Prescott, it seems history is due to repeat itself.
 

zlejedi

Senior member
Mar 23, 2009
303
0
0
Maybe that's why they did it - they bargained on higher clockspeeds enabling them to make up the deficit that a smaller die had cost them in IPC. Looking at how that went for Prescott, it seems history is due to repeat itself.

Which would be absolutely hilarious if AMD, who had their biggest moment of triumph thanks to sticking with a short pipeline, went with a long-pipeline architecture the same way Intel did back then.

At least Intel could be excused, as no one at that time expected physics to block CPU clock speed increases so early, but AMD should have been smarter and learned from its opponent's mistakes.

Now I think a more realistic scenario is that AMD assumed a much higher adoption rate of multi-threaded code, so they figured ST performance would be more or less irrelevant, as only legacy apps would run 1-2 threads in 2009, and for those even the crippled speed of 1 AMD core would be enough.
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
I think if they did their power management really, really well, and they didn't have this cache problem, they could be on to a winner of an idea.

I mean, if the Turbo Core thing worked well, and by well I mean better than it currently does, it would allow their performance in cases where not all cores were in use to be really good by scaling up pretty high. Imagine if BD could scale up to, say, 5GHz instead of 4.2. Then perhaps their gamble would have paid off, because even if the IPC was less than amazing, the sheer clockspeed would deliver good performance (so long as power usage was under control).

However, that doesn't appear to be the case. Turbo Core doesn't go nearly high enough to offset the crappy IPC. And I also think it's going to increase power consumption quite a lot.
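
To put rough numbers on the clock-versus-IPC trade-off in this post, here is a back-of-the-envelope sketch. It assumes throughput is simply IPC times clock, which ignores memory stalls and power limits. The function names are made up for illustration, and the 0.85 IPC ratio is a hypothetical figure, not a measured one; only the 4.2 and 5 GHz clocks come from the post.

```python
def relative_throughput(ipc_ratio: float, clock_ghz: float,
                        ref_clock_ghz: float) -> float:
    """Throughput relative to a reference chip with IPC 1.0.

    Assumption: throughput = IPC x clock, nothing else matters.
    """
    return ipc_ratio * clock_ghz / ref_clock_ghz


def break_even_clock(ipc_ratio: float, ref_clock_ghz: float) -> float:
    """Clock needed to match the reference despite a lower IPC."""
    return ref_clock_ghz / ipc_ratio


# Same IPC, turbo at 5.0 GHz instead of 4.2 GHz (clocks from the post)
turbo_gain = relative_throughput(1.0, 5.0, 4.2)
print(f"5.0 vs 4.2 GHz: {turbo_gain:.2f}x")  # 5.0 vs 4.2 GHz: 1.19x

# Hypothetical: a core with 85% of the reference IPC
needed = break_even_clock(0.85, 4.2)
print(f"Break-even clock: {needed:.2f} GHz")  # Break-even clock: 4.94 GHz
```

By this crude model, a 5GHz turbo would buy about 19% over 4.2GHz, which is enough to cover an IPC deficit of up to roughly 16%.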
 

formulav8

Diamond Member
Sep 18, 2000
7,004
522
126
More info on the patch...


This patch provides performance tuning for the "Bulldozer" CPU. With its
shared instruction cache there is a chance of generating an excessive
number of cache cross-invalidates when running specific workloads on the
cores of a compute module.

This excessive amount of cross-invalidations can be observed if cache
lines backed by shared physical memory alias in bits [14:12] of their
virtual addresses, as those bits are used for the index generation.

This patch addresses the issue by zeroing out the slice [14:12] of
the file mapping's virtual address at generation time, thus forcing
those bits the same for all mappings of a single shared library across
processes and, in doing so, avoids instruction cache aliases.

It also adds the kernel command line option
"unalias_va_addr=(32|64|off)" with which virtual address unaliasing
can be enabled for 32-bit or 64-bit x86 individually, or be completely
disabled.

This change leaves virtual region address allocation on other families
and/or vendors unaffected.
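
The bit manipulation the patch description refers to can be illustrated with a small sketch. This is not the kernel code; it only shows what "zeroing out the slice [14:12]" of a virtual address means. The function names and the two example addresses are hypothetical.

```python
# Bits 14, 13 and 12 of a virtual address: the slice the patch text
# says is used for instruction cache index generation.
ALIAS_MASK = 0b111 << 12  # 0x7000


def index_bits(vaddr: int) -> int:
    """Return the value of bits [14:12] of a virtual address."""
    return (vaddr & ALIAS_MASK) >> 12


def unalias(vaddr: int) -> int:
    """Clear bits [14:12], as the patch description says is done
    for file mappings at generation time."""
    return vaddr & ~ALIAS_MASK


# Hypothetical addresses at which two processes map the same library
map_a = 0x7F3A_1234_5000
map_b = 0x7F3A_9876_2000

# Before: the slices differ, so the same physical lines can alias.
print(index_bits(map_a), index_bits(map_b))  # 5 2

# After: both mappings share the same (zero) slice.
print(index_bits(unalias(map_a)), index_bits(unalias(map_b)))  # 0 0
```

A real allocator also has to make sure the adjusted address still falls inside a free region, which this sketch leaves out.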
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Seems like FX will be the "Linux CPU" at least until Windows consumer programs release versions with Bulldozer optimizations.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
522
126
Actually, there is supposed to be some Windows 7 scheduling patch being tested or something. But I can't find any definite answer.

In fact, it seems that with the changes made in Win8 to how it does its thread scheduling and such, BD can be up to 10% or so faster than on previous Windows versions.

 

videoclone

Golden Member
Jun 5, 2003
1,465
0
0
Hmmmm, the top Bulldozer can't even beat the old Phenom II X6 1100T.

I would say it's a very big FAIL of a CPU. They really should have just shrunk the Phenom II X6 1100T to 32nm and used the power savings to bump up the core clock within the same TDP.

A Phenom II X6 1100T at, say, 4.2GHz would have been competitive with Intel's current lineup, much more so than this Bulldozer POS that can't even compete with the Phenom II X6 1100T.
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Guys there is some new data coming out that is showing the BD to be much more powerful than some of these other leaks of data.

The only thing *fail* could wind up being all this belly-aching and complaining about something that does not exist.
 