Is that typically the case? I had to make some NAND/NOR layouts in the VLSI course I'm in, and for similarly sized gates the NOR was a good bit faster (like 150 ps fall times for the NOR and about 200-300 ps for the NAND). Just curious, as I'm still learning.
edit: Never mind. In a NOR you have two NMOS in parallel for the pull-down, so it's effectively twice as wide when both NMOS are on. To get symmetric rise/fall times, the pull-up (two PMOS in series) needs to be pretty large. For the NAND you have two PMOS in parallel, which right off the bat helps balance the rise/fall times. It seems like the NAND could stay closer to minimum gate sizes while still being symmetric. All of this also accounts for why the NOR has decent fall times (vs. the NAND when all gate sizes are minimum).
I just woke up, but is this half-way correct?
We're all still learning in this field I think. It's a very typical case. Also your reasoning seems fine, but as I understand it:
So, normally you don't size everything at minimum, since that doesn't give you the fastest gate structure. When you talk about speed, you can't just look at fall times (although ymmv, sometimes you just need one direction to be fast); you have to look at the worst case, which for the NOR is going to be the rise time. PMOS stacks are really slow, significantly slower than NMOS stacks. That is really the kicker, from my understanding. The usual explanation is the difference in hole vs. electron mobility: holes are less mobile, so a PMOS needs more width than an NMOS to get the same drive.
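To put rough numbers on the worst-case argument, here is a minimal sketch of a first-order switch-resistance model for all-minimum-size gates. It assumes every transistor has unit width, a unit NMOS has resistance `r_n`, and a unit PMOS is `mu` times more resistive. The value `mu = 2` is only a textbook-style assumption, not a number for any real process, and the function name is made up for illustration.

```python
def worst_case_resistance(gate, mu=2.0, r_n=1.0):
    """Worst-case pull-up/pull-down resistance of a minimum-size 2-input gate.

    Assumptions (not from any real process): every transistor is unit width,
    a unit NMOS has resistance r_n, and a unit PMOS has resistance mu * r_n.
    Delay in each direction is taken to be proportional to the resistance.
    """
    r_p = mu * r_n
    if gate == "nand2":
        # Fall: two NMOS in series. Rise (worst case): only one PMOS is on.
        return {"fall": 2 * r_n, "rise": r_p}
    if gate == "nor2":
        # Fall (worst case): only one NMOS is on. Rise: two PMOS in series.
        return {"fall": r_n, "rise": 2 * r_p}
    raise ValueError(f"unknown gate: {gate}")


for gate in ("nand2", "nor2"):
    r = worst_case_resistance(gate)
    print(gate, r, "worst:", max(r.values()))
```

Under these assumptions the minimum-size NOR falls faster than the NAND (1 vs. 2 units), which lines up with the fall times in the question, but its rise is twice as slow as anything the NAND does (4 vs. 2 units), so the NOR loses on worst case.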
Another point: the classic sizing for a 2-input NAND or NOR doubles the widths of the transistors you're stacking (the NMOS for the NAND, the PMOS for the NOR). Since the PMOS are already the wider devices, doubling the NMOS widths instead of the PMOS widths nets you a smaller increase in total input capacitance relative to the nominal fast inverter sizing.
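The input-capacitance point can be sketched the same way. This assumes a reference inverter with NMOS width 1 and PMOS width `mu` (again `mu = 2` is just an assumed mobility ratio), gate capacitance proportional to transistor width, and each gate sized so its worst-case drive matches that inverter. The function name and the returned fields are hypothetical, for illustration only.

```python
from fractions import Fraction


def sized_gate(gate, mu=2):
    """Widths and per-input capacitance of a 2-input gate sized to match
    the drive of a reference inverter (NMOS width 1, PMOS width mu).

    Assumptions: gate capacitance is proportional to transistor width, and
    stacked (series) transistors are doubled in width to compensate.
    """
    mu = Fraction(mu)
    inverter_cin = 1 + mu
    if gate == "nand2":
        wn, wp = Fraction(2), mu      # series NMOS doubled, parallel PMOS unchanged
    elif gate == "nor2":
        wn, wp = Fraction(1), 2 * mu  # parallel NMOS unchanged, series PMOS doubled
    else:
        raise ValueError(f"unknown gate: {gate}")
    cin = wn + wp                     # each input drives one NMOS and one PMOS
    return {"wn": wn, "wp": wp, "cin": cin, "cin_vs_inverter": cin / inverter_cin}


for gate in ("nand2", "nor2"):
    g = sized_gate(gate)
    print(gate, "Cin per input:", g["cin"], "ratio to inverter:", g["cin_vs_inverter"])
```

With `mu = 2` this gives an input capacitance of 4 for the NAND and 5 for the NOR against 3 for the inverter, i.e. ratios of 4/3 and 5/3. Those ratios are what the logical effort method calls the logical effort of each gate.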