RTX 3070 Ti (GA104, 392.5mm2) -> RTX 4080 (AD103, 378.6mm2) is 80% faster at 4K (TPU).
RTX 3070 Ti (GA104, 392.5mm2) -> RTX 4070 Ti (AD104, 294.5mm2) is 43% faster at 4K (TPU).
RTX 3060 (GA106, 276mm2) -> RTX 4070 Ti (AD104, 294.5mm2) is 135% faster at 4K (TPU).
RTX 3060 (GA106, 276mm2) -> RTX 4060 Ti (AD106, 190mm2) is 41% faster at 4K (TPU).
Why you would compare AD103 to GA104, or AD104 to GA106, is beyond me.
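For what it's worth, here is a rough sketch of the four comparisons above normalised by die area. It only uses the die sizes and 4K uplifts from the list; the function name is made up for illustration, and it ignores that some of these cards ship with parts of the die disabled.

```python
# Each entry: (old card, old die mm2, new card, new die mm2, 4K uplift as a fraction).
# All figures are the ones quoted in the list above.
PAIRS = [
    ("RTX 3070 Ti", 392.5, "RTX 4080", 378.6, 0.80),
    ("RTX 3070 Ti", 392.5, "RTX 4070 Ti", 294.5, 0.43),
    ("RTX 3060", 276.0, "RTX 4070 Ti", 294.5, 1.35),
    ("RTX 3060", 276.0, "RTX 4060 Ti", 190.0, 0.41),
]


def perf_per_area_gain(old_area, new_area, uplift):
    """Performance per mm2 of the new die relative to the old die (hypothetical helper)."""
    perf_ratio = 1.0 + uplift
    area_ratio = new_area / old_area
    return perf_ratio / area_ratio


for old, old_area, new, new_area, uplift in PAIRS:
    gain = perf_per_area_gain(old_area, new_area, uplift)
    print(f"{old} -> {new}: area x{new_area / old_area:.2f}, perf per mm2 x{gain:.2f}")
```

On these numbers every pair lands at roughly 1.9x to 2.2x the performance per mm2, whichever dies you choose to line up.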
Ada uses a vastly superior process that packs 2.64x as many transistors into a slightly smaller area (AD103 vs GA104), but it also costs a lot more to make. I wouldn't be surprised if the price per wafer was ~3-3.5x what they paid at Samsung.
There is no way they would price Ada like the previous generation, so no $600 for AD103.
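To put rough numbers on that, here is a back-of-the-envelope sketch. The 3-3.5x wafer price is only my guess from above, and the model is an assumption: dies per wafer scale with 1/area, with yield and edge loss ignored. The function name is made up for illustration.

```python
GA104_AREA_MM2 = 392.5
AD103_AREA_MM2 = 378.6
TRANSISTOR_RATIO = 2.64  # AD103 vs GA104 transistor count, as quoted above


def cost_ratios(wafer_price_ratio):
    """Return (cost per die, cost per transistor) for AD103 relative to GA104.

    Assumption: dies per wafer are proportional to 1/area, so yield
    differences and wafer edge loss are not modelled.
    """
    die_cost = wafer_price_ratio * (AD103_AREA_MM2 / GA104_AREA_MM2)
    transistor_cost = die_cost / TRANSISTOR_RATIO
    return die_cost, transistor_cost


for wafer_price_ratio in (3.0, 3.5):
    die_cost, transistor_cost = cost_ratios(wafer_price_ratio)
    print(f"wafer x{wafer_price_ratio:.1f}: die x{die_cost:.2f}, "
          f"per transistor x{transistor_cost:.2f}")
```

Under those assumptions the AD103 die itself would cost roughly 2.9x to 3.4x as much as a GA104 die, even though it is slightly smaller.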
I have to correct some things you wrote.
RX 7900 XTX is 49% faster than RX 6900 XT at 4K (TPU).
RTX 3090 Ti (GA102, 628.4mm2) -> RTX 4090 (AD102, 608.5mm2) is 45% faster at 4K (TPU).
Even if Nvidia released the full AD102, it would add ~15% more performance in my opinion, which works out to 67% over the RTX 3090 Ti or 42% over the 7900 XTX.
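The percentages compound rather than add, which is where the 67% comes from. A quick sketch of the arithmetic, where the ~15% for a full AD102 is only my estimate and the helper name is made up:

```python
def compound(*uplifts):
    """Combine successive relative uplifts (given as fractions) into one total uplift."""
    total = 1.0
    for uplift in uplifts:
        total *= 1.0 + uplift
    return total - 1.0


# RTX 3090 Ti -> RTX 4090 (+45%), then RTX 4090 -> full AD102 (+15%, my estimate)
full_ad102_vs_3090ti = compound(0.45, 0.15)

# Working backwards: a 42% lead for the full AD102 over the 7900 XTX
# implies this lead for the RTX 4090 over the 7900 XTX.
implied_4090_vs_7900xtx = 1.42 / 1.15 - 1.0

print(f"full AD102 vs RTX 3090 Ti: +{full_ad102_vs_3090ti:.1%}")
print(f"implied RTX 4090 vs 7900 XTX: +{implied_4090_vs_7900xtx:.1%}")
```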
The N31 GCD is only 300mm2. They could have increased the size to 400mm2, which would result in a GCD with 144 CUs: 9216 SPs, 576 TMUs and 288 ROPs. That is 50% more than what N31 has. I think this would have been enough to fight against the full AD102, at least in raster.
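As a sanity check on those specs, here is a small sketch that works backwards from the proposed 144 CU configuration. The baseline is derived purely from the "50% more" figure above rather than looked up, and the variable names are made up for illustration.

```python
PROPOSED = {"CU": 144, "SP": 9216, "TMU": 576, "ROP": 288}
UNIT_SCALE = 1.5             # "50% more than what N31 has"
AREA_SCALE = 400.0 / 300.0   # proposed GCD area vs the ~300mm2 N31 GCD

# Baseline implied by dividing the proposed counts by 1.5
baseline = {name: round(count / UNIT_SCALE) for name, count in PROPOSED.items()}

# Units per CU in the proposed configuration
per_cu = {name: count / PROPOSED["CU"] for name, count in PROPOSED.items() if name != "CU"}

# How much denser the units would have to be packed if everything scaled uniformly
density_scale = UNIT_SCALE / AREA_SCALE

print("implied baseline:", baseline)
print("per CU:", per_cu)
print(f"units x{UNIT_SCALE:.2f} in area x{AREA_SCALE:.2f} -> density x{density_scale:.3f}")
```

Note that 50% more units in 33% more area means about 12.5% more units per mm2, assuming the whole GCD scaled uniformly, which it would not in practice.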
AMD underestimated Nvidia's willingness to make such a big chip on a new process, but that doesn't mean they are incompetent at developing GPUs.
Just mirroring the general opinion on the AnandTech forums that the current iteration of the 4060, which is based on AD106, should really be an RTX 4050, that the RTX 4080 should use AD102, and that AD103 should be used for the RTX 4070. These arguments, as you have more or less shown, rest on die size and memory bus width.
Maybe these people are exaggerating to stoke the feeling that Nvidia is ripping them off, but if Nvidia were more aggressive or AMD more competitive, we could see these larger dies being used.
I think a fully enabled AD102 has more in the tank, because if we look at how much of the chip is disabled, it's quite a bit more than the gap between the RTX 3090 and 3090 Ti, which had a 12% difference in performance. The most important part, however, is how much L2 cache was cut. The RTX 4090 only has 72MB of the full 96MB of L2 cache enabled; considering how much more performance AD102 has than GA102 with largely the same bandwidth, this is likely quite the bottleneck. AD106 has 32MB of L2 cache, so something with as much power as AD102 needs more than a measly 72MB.
Also, with the current design parameters of the Navi 3x series, I think what you're more likely to see for a 400mm2 GCD is a jump in specs akin to Navi 32 to Navi 31: basically a 33% increase in specs vs the current Navi 31. What you're suggesting is asymmetrical to the current design of the Navi 3 series, which makes the engineering even more difficult. Packing it too tightly would also limit clocks and just add to the problems AMD already has. You would also need a couple more MCDs, making this a monstrous 700mm2 die, and if AMD's current problems with Navi 31 regarding clocks and power still existed (likely magnified with a larger die), you would likely just get something equal to the current RTX 4090, or maybe even lose to it, while Nvidia releases a fully enabled AD102.