Originally posted by: oconnect
Most people in the world use some sort of MS Windows product on their computer. As we all know, a requirement to run Windows is a computer based on the x86 architecture. I've heard many people mention architectures that are far superior to x86. We've reached the point that if you're not running x86 you don't have much of a variety of applications you can run, and thus there is less you can do on your computer.
Windows NT, up through Windows 2000 (beta 2?) ran on Digital's Alpha CPUs. But I get your point.
Originally posted by: michaelpatrick33
x86 is a kludged twenty-year-old piece of crap and yet, and yet still it lives, and now AMD has the gall to expand to 64-bit and double the registers and widen the registers and clean out some of the old x86 kludge. Damn them!
You would prefer losing all of your current applications, having to buy everything again, or using old applications with slow, less-reliable emulation? In 64-bit mode, x86 is relatively sane.
To answer the original poster's question - yes, I believe x86 hampers performance. Take a look at the K7 and K8 pipelines - for K7, stages 2, 3 and 4 would not be required in an instruction set with fixed-length instructions (e.g. MIPS). For very advanced MIPS designs, you'd still probably need some equivalent of stages 5/6, and all architectures need stage 1. This affects performance when you mispredict a branch and have to stop all the work you're doing and start over. A longer pipeline means more wasted work, and a longer delay until you've finished executing the first correct instruction.
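To put rough numbers on the misprediction cost, here is a back-of-the-envelope sketch. The function name and every figure in it (branch fraction, mispredict rate, stage counts) are made up for illustration; they are not measurements of K7, K8 or any MIPS part. It only assumes that each mispredicted branch costs about one cycle per pipeline stage that has to be refilled.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, refill_stages):
    """Average cycles per instruction, including branch-mispredict refills.

    Simplified model: every mispredicted branch throws away roughly
    `refill_stages` cycles of work before the first correct instruction
    finishes executing.
    """
    return base_cpi + branch_fraction * mispredict_rate * refill_stages


# Hypothetical workload: 20% branches, 10% of them mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.10

# Hypothetical pipelines: one with three extra decode stages, one without.
long_pipe = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, refill_stages=10)
short_pipe = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, refill_stages=7)

print(f"with extra decode stages:    {long_pipe:.2f} cycles/instruction")
print(f"without extra decode stages: {short_pipe:.2f} cycles/instruction")
```

With these made-up inputs the gap is only a few percent, but it grows with the mispredict rate and with every stage you add in front of execution.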
Also, the extra x86 decode logic consumes chip area (increasing cost), and requires additional power.
One interesting design is a Trace Cache (found currently only in P4s, I think), which saves the work of decoding instructions once they've been decoded the first time. With a trace cache, after you've executed a set of instructions, they are saved in a decoded form for use in the future, so the next time around, instructions effectively only have to go through the pipeline starting at stage 5 or 6, rather than 1. This is advantageous because instructions are still fetched from memory as variable-length and densely packed (so they're generally only a couple of bytes long), meaning a given amount of memory bandwidth can supply many more x86 instructions (1-15 bytes, but the vast majority are in the very small size range) per microsecond than the same memory could supply MIPS instructions (all are 4 bytes). Once the instructions make it to the trace cache, both the x86 CPU with trace cache and the MIPS CPU without would probably need similar pipeline lengths.
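A toy model of the "decode once, reuse afterwards" idea, for anyone who wants to see it in code. This is far simpler than a real trace cache (which stores sequences of decoded operations along the executed path, and I don't know the details of how the P4 organizes it); the class name, the fake "uop" tuple and the loop are all hypothetical, and the example just keys decoded results by instruction address.

```python
class DecodedCache:
    """Toy cache of already-decoded instructions, keyed by address."""

    def __init__(self):
        self.entries = {}
        self.decode_count = 0  # how many times the slow decode path ran

    def _decode(self, address, raw_bytes):
        # Stand-in for the expensive variable-length decode stages.
        self.decode_count += 1
        return ("uop", address, len(raw_bytes))

    def fetch(self, address, raw_bytes):
        # Hit: skip decode entirely. Miss: decode once and remember it.
        if address not in self.entries:
            self.entries[address] = self._decode(address, raw_bytes)
        return self.entries[address]


# A made-up loop body of three short instructions (2, 3 and 2 bytes).
loop_body = [
    (0x1000, b"\x89\xd8"),
    (0x1002, b"\x83\xc0\x01"),
    (0x1005, b"\x75\xf9"),
]

cache = DecodedCache()
for _ in range(100):  # run the loop 100 times
    for address, raw in loop_body:
        cache.fetch(address, raw)

print("instructions executed:", 100 * len(loop_body))
print("times the decoder ran:", cache.decode_count)
```

The decoder only runs on the first trip through the loop; every later iteration is served from the decoded copies, which is the part of the pipeline a trace cache lets you skip.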
This doesn't mean the backend of the x86 CPU is going to become as easy to design as the MIPS CPU. In x86, for example, you need hardware that can multiply 8 bits * 8 bits with a 16-bit result, 16b*16b with a 16b result, 16b*16b with a 32b result, 32b*32b with a 32b result, and 32b*32b with a 64-bit result. In MIPS32, I believe you just need 32b*32b with a 64-bit result. This means you need to wrap a lot more logic around the x86 multiplier than you do around the MIPS one. That means extra delays, extra power usage, etc.
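Here is a rough sketch of what that "extra logic" looks like if you model it in software. It assumes one shared 32b*32b -> 64b multiplier with operand masking in front and result selection behind it; that is my assumption for illustration, not a description of how any real x86 core is built. All function names are hypothetical, and only unsigned multiplies are modelled (signed variants would need sign handling on top).

```python
MASK8, MASK16, MASK32 = 0xFF, 0xFFFF, 0xFFFFFFFF


def core_mul(a, b):
    """The one multiplier both designs need: 32b * 32b -> 64b, unsigned."""
    return (a & MASK32) * (b & MASK32)


def mul8(a, b):
    """8b * 8b -> 16b result."""
    return core_mul(a & MASK8, b & MASK8) & MASK16


def mul16_narrow(a, b):
    """16b * 16b -> 16b result (upper half of the product discarded)."""
    return core_mul(a & MASK16, b & MASK16) & MASK16


def mul16_wide(a, b):
    """16b * 16b -> 32b result, returned as (high 16 bits, low 16 bits)."""
    product = core_mul(a & MASK16, b & MASK16)
    return (product >> 16) & MASK16, product & MASK16


def mul32_narrow(a, b):
    """32b * 32b -> 32b result (upper half of the product discarded)."""
    return core_mul(a, b) & MASK32


def mul32_wide(a, b):
    """32b * 32b -> 64b result, returned as (high 32 bits, low 32 bits)."""
    product = core_mul(a, b)
    return (product >> 32) & MASK32, product & MASK32


print(hex(mul8(0xFF, 0xFF)))
print([hex(x) for x in mul16_wide(0xFFFF, 0xFFFF)])
print([hex(x) for x in mul32_wide(0xFFFFFFFF, 2)])
```

In this model a MIPS32-style design only needs `core_mul` and `mul32_wide`; the other four wrappers are the masking and result-steering that x86 adds around the same multiplier.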