Thief Mantle slides -
http://i.imgur.com/nt5217p.png
Some interesting stuff regarding smoothness.
Interesting, indeed.
So from that link, the key points, in my opinion, are:
- Mantle still uses an abstraction layer, but it is thinner than the one used in DX. This thinner layer, combined with the extra functionality afforded to developers, lets them utilize the CPU and GPU more efficiently, and that is where the performance gains come from.
- Mantle was built to take advantage of certain key features of the GCN architecture, but any architecture which supports a minimum feature set will be able to run Mantle. Again, the point is that Mantle is not vendor-specific. These features aren't unique to AMD, so if nVidia builds an architecture with features similar to GCN's, it will be capable of running Mantle. However, because next-gen consoles are GCN-based, it makes sense that Mantle caters to the GCN architecture and thus supports GCN out of the box.
- A lot of the optimization effort falls on the developer, but this is a good thing because we want developers back in the driver's seat. More developer control means better optimizations, which means better games. This extra "burden" isn't much different from the optimization process developers already use to extract as much performance as possible out of consoles, so they are very familiar with it. Note: Mantle is NOT used on the consoles, but the optimization process is similar, so transitioning to Mantle from consoles feels natural.
- As a result, you get better CPU and GPU utilization, i.e. better performance from the exact same hardware. Without Mantle, it was not uncommon for only 50% of the GPU to be utilized.
- Native DX ports will not get much better performance because the render pipeline is unchanged. To extract better performance, the render pipeline must be rewritten to be more efficient rather than "emulating" DX; Mantle allows the developer to do exactly this.
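
To put rough numbers on the utilization point, here is a toy model in Python. It is only a sketch of the reasoning, not anything from the slides: the function name, the draw-call count, and the per-call costs are all made up for illustration. It assumes the CPU submits draw calls while the GPU renders in parallel, so the slower side sets the frame time.

```python
def frame_stats(draw_calls, cpu_us_per_call, gpu_ms_per_frame):
    """Toy model: CPU submission and GPU rendering overlap; the slower one wins."""
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0   # CPU time spent submitting draws
    frame_ms = max(cpu_ms, gpu_ms_per_frame)         # the bottleneck sets the frame time
    gpu_utilization = gpu_ms_per_frame / frame_ms    # fraction of the frame the GPU is busy
    return frame_ms, gpu_utilization

# Hypothetical numbers: same hardware and same scene, only the per-call CPU cost changes.
thick_ms, thick_util = frame_stats(draw_calls=5000, cpu_us_per_call=4.0, gpu_ms_per_frame=10.0)
thin_ms, thin_util = frame_stats(draw_calls=5000, cpu_us_per_call=1.0, gpu_ms_per_frame=10.0)

print(f"thicker layer: {thick_ms:.1f} ms/frame, GPU busy {thick_util:.0%}")
print(f"thinner layer: {thin_ms:.1f} ms/frame, GPU busy {thin_util:.0%}")
```

With these made-up inputs the thicker layer is CPU-bound and the GPU sits idle half of each frame, while the thinner layer lets the same GPU run flat out. Real workloads are messier, but it shows why cutting per-call CPU overhead can raise performance without touching the hardware.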