AVX2 + TSX

Denithor · May 30, 2012

Any chance AMD will be able to implement these in the next generation or two?

If not, Intel's lead is likely to become completely ridiculous. To the point you'd almost have to be retarded to buy an AMD chip.

Olikan · May 30, 2012

i think so... if intel don't patent troll them

amd already supports fma-3, F16C and XOP*

XOP, is like a baby brother of avx-2....

Rackmountsales · May 30, 2012

I think it will take several years before even 50% of the market has AVX2-capable chips, especially with most current chips being more than capable of normal internet browsing/word processing well into the future, and people seeing no need to replace them. Only hardware enthusiasts like us move through hardware at a fast rate, and even then, many of us are still on first-generation Core i7s.

Edrick · May 30, 2012

Olikan said:
i think so... if intel don't patent troll them

amd already supports fma-3, F16C and XOP*

XOP, is like a baby brother of avx-2....

AMD supports FMA4, not FMA3 yet. FMA3 is what Haswell and Piledriver will use. BD has FMA4. Minor detail I know, but the only one that will be used by developers is the one that both CPUs support (FMA3).

BenchPress · May 30, 2012

Olikan said:
i think so... if intel don't patent troll them

For AVX2 the IP isn't exactly new. It comes straight from GPU technology and AMD has plenty of that. The only things that I imagine are patented are Intel's use of two 128-bit SIMD execution stacks, and their one-cacheline-per-cycle gather implementation. AMD needs neither verbatim to implement AVX2.

TSX is a different story, although AMD's ASF experiments might help them a bit.

XOP, is like a baby brother of avx-2....

Only in the sense that 3DNow! is a baby brother of SSE: worthless. Some of the XOP instructions are only 128-bit, while AVX2 is all about using the SPMD programming model using 256-bit vectors. Also, the number one feature of AVX2 is the gather instructions, which replace a sequence of 18 legacy instructions! Without it you can't do SPMD in many cases.
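To make the gather point concrete, here is a rough sketch in plain Python of what a gather does per lane. It is only a model of the semantics, not real intrinsics or assembly; the function names, the 16-entry `table`, and the 8-lane `indices` are made up for illustration. The masked variant assumes that lanes whose mask bit is clear keep the old destination value and never touch memory.

```python
def gather_lane_by_lane(table, indices):
    """Without a gather instruction: each lane needs its own index
    extract, scalar load, and insert back into the vector."""
    out = []
    for idx in indices:          # one iteration per SIMD lane
        out.append(table[idx])   # independent load per lane
    return out


def masked_gather(table, indices, mask, dest):
    """Model of a masked gather: active lanes load table[index],
    inactive lanes keep the value already in dest and are not loaded."""
    return [table[i] if m else d for i, m, d in zip(indices, mask, dest)]


# Hypothetical data: 8 lanes, as in a 256-bit vector of 32-bit elements.
table = [float(i * i) for i in range(16)]
indices = [3, 0, 15, 7, 7, 1, 12, 4]
lanes = gather_lane_by_lane(table, indices)
```

In an SPMD-style loop every lane computes its own index, so this kind of per-lane table lookup shows up whenever the original scalar code did an indexed read.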

Anyway, one light in the darkness is that AMD offered full AVX support and FMA4 only a few months after Intel. So I have good hopes that they'll support AVX2 not too long after Haswell. At that point XOP will quickly be a faint memory, like 3DNow!. Hopefully they'll remove support for both to save on transistors and avoid ISA fragmentation.

Olikan · May 30, 2012

BenchPress said:
Only in the sense that 3DNow! is a baby brother of SSE: worthless. Some of the XOP instructions are only 128-bit, while AVX2 is all about using the SPMD programming model using 256-bit vectors. Also, the number one feature of AVX2 is the gather instructions, which replace a sequence of 18 legacy instructions! Without it you can't do SPMD in many cases.

yep...that's why i said baby brother :biggrin:

lamedude · May 31, 2012

Intel and AMD have a cross licensing agreement so patents are a non issue.
Trinity has FMA3 so they beat Intel to that one and I assume AVX 1.1 on Trinity/Piledriver slides are the "post 32nm processor instruction extensions" introduced with IB so AMD wasn't far behind getting those. AVX2 on Steamroller looks like a safe bet.

Dresdenboy · May 31, 2012

BenchPress said:
TSX is a different story, although AMD's ASF experiments might help them a bit.

I think so. This paper contains some nice details:
http://www.cs.washington.edu/homes/djg/papers/asf_micro2010.pdf
BTW:

In this paper we develop an out-of-order hardware design to implement ASF on a future AMD processor and evaluate it with an in-house simulator.

P.S.: Did anyone notice that the Wikipedia BD arch diagram is a colored version of my diagram, made when I was still speculating about the architecture? So it still contains FADD and FMAC subunits.

pelov · May 31, 2012

Edrick said:
AMD supports FMA4, not FMA3 yet. FMA3 is what Haswell and Piledriver will use. BD has FMA4. Minor detail I know, but the only one that will be used by developers is the one that both CPUs support (FMA3).

Trinity has FMA3 support, though that's still technically a paper launch (at least here in the US).

I wouldn't worry about AMD not adopting AVX2. They're on a 2-chips-a-year cadence atm (not including Brazos) with a new APU + Server/desktop being released every year through 2013. There isn't much set in stone after that, though.

That's ofc assuming they want/need to. Considering they added AVX, FMA3 and FMA4 chances are they probably will.

Arachnotronic · May 31, 2012

AMD seems to be ahead of Intel on the ISA side. I wouldn't worry about it either.

blckgrffn · May 31, 2012

Intel17 said:
AMD seems to be ahead of Intel on the ISA side. I wouldn't worry about it either.

Which is interesting, as the market generally only uses whatever ISAs Intel provides.

It will be an even more interesting turn of events, I think, if consoles adopt AMD chips as speculated and some features like FMA-4 become widely used (perhaps unlikely, but I am just talking about an intriguing what-if scenario) and we see some performance synergy on AMD PC chips and games ported from consoles due to the ISA differentiation.

pelov · May 31, 2012

I don't think that'll happen, at least not with FMA4 > FMA3. The only reason AMD is adopting FMA3 is because Intel isn't using FMA4 so they've got to do what chipzilla does. You can't exactly dictate ISAs with a measly market share

The use of AMD chips/GPUs in consoles will be more interesting in their battle versus nVidia and not their x86 battle with Intel. Seeing all 3 major consoles utilizing AMD hardware and developers developing games specifically around AMD hardware is going to kick nVidia in the nuts. It'll be like TWIMTBP x 10 but on the red side

lamedude · May 31, 2012

FMA4 is Intel's fault. Intel was going to use FMA4 then changed their mind "which we found we could not accommodate without unacceptable risk to our product schedules". Like 3DNow! I wouldn't be surprised if AMD drops FMA4.

Denithor · May 31, 2012

What did nV do to piss off all the console makers? Or is it more that they are all planning to put in a single APU to handle both CPU + GPU duties? In which case it makes perfect sense, as nV doesn't have anything suitable.

blckgrffn · May 31, 2012

Denithor said:
What did nV do to piss off all the console makers? Or is it more that they are all planning to put in a single APU to handle both CPU + GPU duties? In which case it makes perfect sense, as nV doesn't have anything suitable.

AMD is probably more willing to create custom designs, etc. compared to nvidia or Intel. Both of those parties pretty much were dicks to MS in the days of xbox gen 1. Not sure of the current state of affairs of Sony and nvidia.

They certainly have the incentive to make that happen, whereas it appears nvidia is banking on ARM+Mobile for their longer term future.

exar333 · May 31, 2012

Denithor said:
What did nV do to piss off all the console makers? Or is it more that they are all planning to put in a single APU to handle both CPU + GPU duties? In which case it makes perfect sense, as nV doesn't have anything suitable.

Cost is too high.

pelov · May 31, 2012

Both cost and nVidia being difficult to work with.

What looks to be a "done deal" at this point is that AMD will be the GPU choice on all three next generation consoles. Yes, all the big guns in the console world, Nintendo, Microsoft, and Sony, are looking very much to be part of Team AMD for GPU. That is correct, NVIDIA, "NO SOUP FOR YOU!" But NVIDIA already knew this, now you do too.

There are going to be game spaces that NVIDIA does succeed in beyond add in cards and that will likely be in the handheld device realm but we do not see much NVIDIA green under our TV sets. NVIDIA was planning to have very much underwritten its GPU business with Tegra and Tegra 2 revenues by now, but that is moving much slower than the upper brass at NVIDIA wishes. Tegra 2 penetration has been sluggish to say the least.

AMD has always been easier to work with than NVIDIA on the console front. Well that may not be exactly true, but Microsoft did not spend months in arbitration with NVIDIA over Xbox 1 GPU and MCP costs back in 2002 and 2003. I always felt as though that bridge was burned.

http://www.hardocp.com/article/2011/07/07/e3_rumors_on_next_generation_console_hardware
http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2003/08/15/BU190365.DTL

They weren't just asking too much but also unwilling to negotiate on the prices.

Phynaz · May 31, 2012

lamedude said:
FMA4 is Intel's fault. Intel was going to use FMA4 then changed their mind "which we found we could not accommodate without unacceptable risk to our product schedules". Like 3DNow! I wouldn't be surprised if AMD drops FMA4.

Intel's fault? hmmmmm.....

kernelc · Jun 1, 2012

pelov said:
I don't think that'll happen, at least not with FMA4 > FMA3. The only reason AMD is adopting FMA3 is because Intel isn't using FMA4 so they've got to do what chipzilla does. You can't exactly dictate ISAs with a measly market share

Yes, I agree. And it is a shame: FMA4 is a non-destructive implementation, so it is somewhat more flexible than FMA3.

Regards.

Cerb · Jun 1, 2012

kernelc said:
Yes, I agree. And it is a shame: FMA4 is a non-destructive implementation, so it is somewhat more flexible than FMA3.

Regards.

X86 already relies on moves for everything (IA32 has no true GPRs, x86-64 only has 8, and the old SFRs are still used, so op+move and move+op is everywhere), and they've been optimized to no end by Intel and AMD. For the small amounts of code that would be benefited by FMA4, over FMA3, the extra code for FMA3 will be very small, and any performance penalty likely completely removed or hidden by the time it gets executed.

The same is true for scalar: 2-operand ISAs use only a few more instructions than 3-operand, usually well under 5% (often <1%, but it's going to vary), for the rare case where a destructive operation is suboptimal. It's important to remember that most in-register data is backed by memory, and naturally short-lived.
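A small sketch of the destructive vs. non-destructive difference, as a toy register model in Python. Everything here is illustrative: the register names, the `regs` dict, and the returned instruction counts are assumptions for the example, not measurements. It also does not model the single rounding of a real fused multiply-add; it only shows which register gets overwritten and where the extra move comes from.

```python
def fma4(regs, dst, a, b, c):
    """Non-destructive (4-operand) form: dst is separate, so a, b and c
    all survive. Returns the number of instructions issued."""
    regs[dst] = regs[a] * regs[b] + regs[c]
    return 1


def fma3(regs, acc, a, b):
    """Destructive (3-operand) form: the result overwrites one source,
    here the addend register acc."""
    regs[acc] = regs[a] * regs[b] + regs[acc]
    return 1


def fma3_preserving(regs, dst, a, b, c):
    """3-operand form when all three inputs must stay live:
    copy the addend first, then run the destructive FMA on the copy."""
    regs[dst] = regs[c]                 # the extra register-to-register move
    return 1 + fma3(regs, dst, a, b)


def fresh_regs():
    # Hypothetical register file with small values so the math is exact.
    return {"x0": 2.0, "x1": 3.0, "x2": 4.0, "x3": 0.0}
```

So the cost of the 3-operand form is one move, and only in the cases where the overwritten input is still needed afterwards. When the addend is an accumulator that gets updated anyway, the destructive form needs no extra instruction at all.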

kernelc · Jun 1, 2012

Cerb said:
X86 already relies on moves for everything (IA32 has no true GPRs, x86-64 only has 8, and the old SFRs are still used, so op+move and move+op is everywhere), and they've been optimized to no end by Intel and AMD. For the small amounts of code that would be benefited by FMA4, over FMA3, the extra code for FMA3 will be very small, and any performance penalty likely completely removed or hidden by the time it gets executed.

The same is true for scalar: 2-operand ISAs use only a few more instructions than 3-operand, usually well under 5% (often <1%, but it's going to vary), for the rare case where a destructive operation is suboptimal. It's important to remember that most in-register data is backed by memory, and naturally short-lived.

It is true, but the ones needing FMA are anyway a small niche, as we are basically speaking of the HPC guys.

Based on what I have read about HPC workloads, it seems that non-destructive operations were something to be desired.

I understand that FMA4 can result in a slightly more complex forwarding network, but in some situations it seems worth the investment.

Regards.

Edrick · Jun 1, 2012

kernelc said:
It is true, but the ones needing FMA are anyway a small niche, as we are basically speaking of the HPC guys.

Huh? Just about every game uses FMA (since GPUs have supported FMA for quite a few generations now). Now game developers can offload some of the FMA tasks to the CPUs.

Edrick · Jun 1, 2012

kernelc said:
Yes, I agree. And it is a shame: FMA4 is a non-destructive implementation, so it is somewhat more flexible than FMA3.

Regards.

I believe (my opinion) that Intel will add FMA4 support sometime in the near future (after Haswell).

Why do I think that? Simple. At the end of the day Intel needs to sell chips. And having a new instruction set (FMA4) will help promote it to the market. So from a marketing standpoint, it sort of makes sense why they didn't release FMA3 and FMA4 together.

bronxzv · Jun 1, 2012

Edrick said:
I believe (my opinion) that Intel will add FMA4 support sometime in the near future (after Haswell).

Why would they do such a strange thing? FMA4 has no significant advantage in practice vs. FMA3. What would the impact be? A 0.1% speedup?

bronxzv · Jun 1, 2012

kernelc said:
It is true, but the ones needing FMA are anyway a small niche, as we are basically speaking of the HPC guys.

A lot more than HPC: for example, 3D modeling and rendering, sound processing, game engines.
