Edge, let me be blunt. Your support for SM3.0 seems unreasonably heated. A lot of ppl in this thread have already been through this excitement with R300 and PS2.0, and that was an extended let-down up to very recently. We're a bit more cautious now when it comes to new features. They may sound great, but we're more interested in playing games that demonstrate a card's features than reading about said features.
UltraShadow2 is not "just" stencil shadows. Play Splinter Cell on a NV3X and then a R3XX and you'll see the difference.
I assume you're referring to buffer vs. projector shadows, as R420 seems to be faster than NV40 in SC at the same settings. nV's IQ advantage has less to do with "UltraShadow" than "Xbox."
A few thoughts on your ATi/nV post:
Yes, it would seem nV made a relatively small leap from PS2.0a to PS3.0 functionality, potentially to their advantage. By the same token, ATi made a relatively small leap from PS1.4 to PS2.0, to great advantage with R300. PS2.x -> PS3.x is, by all accounts, a smaller leap than PS1.x -> PS2.x (though it's certainly still an important one), so I'm not sure ATi will have too tough a time adjusting. Just as you consider NV30 -> NV35 -> NV40 three revisions of PS3.0-level functionality, so I would consider R200 -> R300 -> R350 -> R420 four revisions of PS2.0-level functionality. But that's just talk; what ultimately matters most to consumers (and the developers who desire them) is speedy features available on a large number of cards, not idealism (witness PS1.1 vs. PS1.4).
Is NV40 IEEE32, or is it just "FP32?" Because the two aren't the same, AFAIK, and I don't think NV40 is fully IEEE32-compliant. FP32 hasn't seemed to help nV at all in the IQ dept., anyway, as it's still occasionally noticeably uglier than ATi in both Far Cry and Lock On. (This may still be due to FP16/NV3x, though, so I suppose I'll have to suspend judgement for a month or two, to give devs a chance to retarget for NV40.)
Truth of the matter is and moral of the story, ATI got stuck with their pants down somehow by deciding to not take the time or money to develop SM3 class hardware sooner, rather than later.
I'm not convinced this is the case. ATi is working on Xbox2 tech as we speak ("R500"), and I'm pretty sure that's more advanced than R420. ATi may well be developing hardware more advanced than SM3.0, just not mass-producing it.
Its not as easy as people think, they cant just throw on 32bit FP precision... the rest of the SM3.0 requirements and run to the bank with the performance crown. It wont happen.
Well, ATi's DX9 vertex shaders have been FP32 all along, and they're still faster than anything nV offers. So ATi does seem to be able to run FP32 at speed, even if they haven't chosen to integrate it into their pixel shaders yet. I think ATi may have stuck with FP24 in their PSs because they had already laid them out in 130nm low-k with RV350. ATi obviously opted to hold back this gen, and only games can prove their decision right or wrong. I was one of many who thought R300's great PS2.0 performance would mean lots of nice PS2.0 games making NV30 owners a bit jealous. 18 months later, and we've only got a handful of PS2.0 games that run better on ATi. Will it be the same case with SM3.0? I think it will, and mainly because there's now a generationally larger base of SM2.0 cards out there than SM3.0, and devs won't want to ignore that larger potential audience.
I agree, though, that drivers for SM3.0+ hardware will get more complicated, more like CPUs. OTOH, CPUs have been down this road before, so it's not like either ATi or nV are entirely reinventing the wheel with conditionals and loops.
The rest of the market has to catchup to NV before moving on. And the rest of the market is not even at 32bit FP, they are very likely to not have but half of the performance when they do either.
Why do you keep assuming implementing FP32 in hardware automatically means slower performance? AFAIK, it doesn't. FP32 mainly requires more transistors than FP24, for both the logic and the registers/cache. NV3x choked in FP32 because of lack of registers, and NV40 still shows greater performance in FP16 because of its lower register usage. ATi used their transistor budget more wisely than nV with R300, IMO, and they may do well betting on the same horse again (shades of GF3->GF3Ti->GF4).
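To put rough numbers on the precision and register side of this, here's a quick Python sketch. It's only an illustration, not a model of any real chip: I'm assuming the commonly cited layouts (FP16 = s10e5, FP24 = s16e7, FP32 = s23e8), the function names are made up, and it ignores exponent range, denormals, and special values entirely.

```python
import math

# Assumed explicit mantissa bits per format (FP16 = s10e5, FP24 = s16e7, FP32 = s23e8).
MANTISSA_BITS = {"FP16": 10, "FP24": 16, "FP32": 23}
# Total bits per component, used only for the storage comparison below.
TOTAL_BITS = {"FP16": 16, "FP24": 24, "FP32": 32}


def quantize(x, mantissa_bits):
    """Round x to the nearest value representable with the given mantissa width.

    Hypothetical helper: exponent range, denormals, and NaN/Inf are ignored.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (mantissa_bits + 1)   # +1 accounts for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


def relative_error(x, mantissa_bits):
    """Relative rounding error of x at the given mantissa width."""
    return abs(quantize(x, mantissa_bits) - x) / abs(x)


def register_bits(fmt, components=4):
    """Storage for one 4-component (e.g. RGBA) register in the given format."""
    return TOTAL_BITS[fmt] * components


if __name__ == "__main__":
    value = 1.0 / 3.0
    for fmt in ("FP16", "FP24", "FP32"):
        err = relative_error(value, MANTISSA_BITS[fmt])
        print(f"{fmt}: rel. error {err:.3e}, {register_bits(fmt)} bits per vec4 register")
```

The point is just that a 4-component FP32 register takes twice the storage of an FP16 one, so a fixed-size register file holds half as many live values at full precision. That's a register budget issue, not proof that FP32 math is inherently slower.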
Also, the NV40 core DOES have the power to perform displacement mapping, and all the nifty features of SM3.0.. dont believe me? Well, I could elaborate.
I'd love to see real examples of NV40's features. Would you mind elaborating?
I'm still curious why you hold all sites other than AT in such low regard.
BTW, I'm not trying to "shut you up," as Matthias said, but, frankly, I can understand his frustration. A lot of what you're saying strikes me as less than reasonable or informed, as I've tried to say in my responses. I'm just trying to understand your POV.