frostedflakes
I've mentioned this before, but I think BD performance is going to vary wildly because of the resources shared between the two cores in each module. Cinebench is mostly floating point, for example, and programs not compiled with FMA4, XOP, and other Bulldozer optimizations will only have four FPUs to work with on a four-module BD. With optimizations, though, hopefully things will perform better, since I think each 256-bit FPU will then be able to accept two 128-bit floating-point instructions. At least this is how I understand it.
Integer workloads should perform great without any optimizations, though, since each core in a module has its own dedicated integer resources.
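To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only encodes my understanding as described above (one shared 256-bit FPU per module, two integer cores per module, and optimized code being able to issue two 128-bit FP instructions per FPU). The class and method names are made up for illustration, and none of this is confirmed hardware behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BulldozerModel:
    """Hypothetical resource count for a Bulldozer chip (assumed figures)."""
    modules: int = 4            # four-module part, as in the post
    cores_per_module: int = 2   # each core has dedicated integer resources
    fpu_width_bits: int = 256   # one shared FPU per module (assumption)

    def integer_units(self) -> int:
        # Integer resources are not shared, so they scale with core count.
        return self.modules * self.cores_per_module

    def fp_slots(self, optimized: bool, op_width_bits: int = 128) -> int:
        # Assumption from the post: unoptimized code sees one FP slot per
        # module; optimized code can fill the 256-bit FPU with two 128-bit ops.
        per_fpu = self.fpu_width_bits // op_width_bits if optimized else 1
        return self.modules * per_fpu


chip = BulldozerModel()
print("integer units:", chip.integer_units())
print("FP slots (unoptimized):", chip.fp_slots(optimized=False))
print("FP slots (optimized):", chip.fp_slots(optimized=True))
```

If the assumptions hold, the gap between the two FP figures is the reason I expect floating-point results to swing so much depending on how the software was compiled, while integer throughput stays the same either way.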
Am really anxious for a full review from AnandTech that can explain the architecture in detail and put it all into perspective.