NostaSeronx
More akin to what nV does, really.
But yes, the SIMDs are completely different.
Nvidia is really weird about it:
- Compiler-based instruction.r__-reuse
- 4 operand reuse caches, one for each of the 4 register banks, each with 4 operand slots.
AMD's way is:
- Scheduler-based, no new instructions
- Each ALU has a source buffer which can hold 6 source operands and a VDST which can hold 8 destination operands.
- SIMD16 thus has 96 source register reuses and 128 destination register reuses.
- Destination registers can be reused as source registers for dependent operations.
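For anyone checking the SIMD16 totals above, here is a quick sketch of the arithmetic. It assumes "SIMD16" means 16 ALU lanes and takes the per-ALU figures (6 source, 8 destination) straight from the list; the function and constant names are made up for illustration and don't come from any AMD documentation.

```python
# Per-ALU buffer capacities as quoted in the post (not verified against AMD docs).
SOURCE_OPERANDS_PER_ALU = 6   # source buffer entries per ALU
DEST_OPERANDS_PER_ALU = 8     # VDST entries per ALU
LANES_PER_SIMD = 16           # assumption: SIMD16 = 16 ALU lanes


def simd_reuse_capacity(lanes=LANES_PER_SIMD,
                        src_per_alu=SOURCE_OPERANDS_PER_ALU,
                        dst_per_alu=DEST_OPERANDS_PER_ALU):
    """Return total source/destination operand slots across one SIMD."""
    return {
        "source": lanes * src_per_alu,
        "destination": lanes * dst_per_alu,
    }


capacity = simd_reuse_capacity()
print(capacity)  # {'source': 96, 'destination': 128}
```

So the 96 and 128 are just the per-ALU buffer sizes multiplied across the 16 lanes.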
Which could explain why a 40 CU (66AF:F1) / 36 CU (66AF:F0) part is as big as the Radeon VII die.