Originally posted by: Gatecrusher
Ok so the 64 bit processors are now a common thing..
now my question is: are there any 128 bit processors out there.. Can there be? will there?
I guess I'd like to know how the 32bit differs from the 64bit one.. does it mean it simply crunches more data per cycle (64 bits of it ??) Or what does the 64 bits actually mean.
-Please enlighten me if u can.
Edit: (First you should understand that we will not be using the new 64-bit part of these CPUs until we run Windows Vista 64.)
No. 64-bit (in itself) doesn't really have much to do with performance in terms of computing speed. It's much, much more important than that!
(I've cut and pasted a lot of material I have written before. Hopefully this comes out alright)
Here's my FAQ for you:
Q: Why would we need a 64-bit processor?
A:
In order to be able to run 64-bit software.
Q: Why would we need to run 64-bit software?
A:
Because 32-bit applications and OSes are at the end of the line. They basically can't evolve much further from where they are today.
Q: Do you need a 64-bit OS to run 64-bit applications?
A:
Yes. Due to the way hardware is handled these days, there won't be the option of launching a 64-bit app in DOS mode, as was the case with some 32-bit applications. It has to run under an OS.
Nor is it realistic to imagine some kind of practical 'emulator' running 64-bit apps under a 32-bit OS, for much the same reason that you can't transport a 747 jumbo jet in the trunk of a car.
Q: Are there many 64-bit apps around?
A:
No, not yet, but there can't really be, until there are enough 64-bit PCs out there on the market, ready to run them. Intel introduced 32-bit mode with the '386 a long time ago. But it really wasn't used much as 32-bit until Win95 and Linux came along much later. And there wouldn't have been any Win95 or Linux without 32-bit cpus out there, ready to run them.
Q: Is there any 64-bit OS available?
A:
Yes, there's WindowsXP64. And there's also 64-bit Linux of course. But I expect 64-bit will really take off with Windows Vista.
Q: Why are we using 64-bit processors now then?
A:
For right now, primarily as a fast 32-bit cpu, with a 32-bit OS. Same as the '386 and '486 in the old days, which were mostly used as fast 16-bit cpus until Windows95 came along.
Q: Is 64-bit about processing larger/wider chunks of data at time, thus increasing performance?
A:
No.
The AMD K8's internals are universally generalized to at least 64bit width.
But data widths will generally be pretty much the same as in 32-bit cpus today.
Character is 8bit or 16bit (unicode).
Integer still defaults to 32 bit, even if integer registers are now 64bit wide and can handle 64 bits.
Fp is still 32bit and 64bit (double precision).
Vectors also remain, for now, 64 and 128bit.
Now, on the other hand: The instructions' address fields, used to refer to the location of data, are 64bit instead of 32bit. This is the difference!
Software pointers are 64 bits long, instead of 32 bits. To handle pointer arithmetic efficiently, integer registers are 64 bits wide. Of course, if you need to handle very long integer fields, like in encryption, the 64-bit integer registers and operations are going to enhance performance. But really, they are there mostly to handle pointers.
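If you want to see the "same data widths, wider pointers" point on your own machine, here is a small sketch. It assumes you have a Python interpreter around, and it only reports what the C compiler behind that particular interpreter build uses, so a 32-bit build on a 64-bit CPU will still report 32-bit pointers.

```python
import struct

# Sizes in bytes of common C types, as laid out by the C compiler that built
# this Python interpreter. "P" is a void* pointer (native mode only).
sizes = {
    "int": struct.calcsize("i"),
    "float": struct.calcsize("f"),
    "double": struct.calcsize("d"),
    "pointer": struct.calcsize("P"),
}

for name, size in sizes.items():
    print(f"{name:8s} {size * 8:3d} bits")
```

Between a 32-bit and a 64-bit build, the line you would expect to change is the pointer one; int, float and double normally stay where they were.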
Q: But 64-bit will be faster, right?
A:
Yes, somewhat faster.
Q: But why then?
A:
Because new addressing needs a new ISA, new binaries. Since we have to define a new instruction set anyway, might as well make it a little bit more modern, clever and rational than Intel's old '86 ISA. So we've got more flexible registers, more of them, and instructions making use of them. In 64-bit mode that is. That's one reason why it will be faster. Another reason is that memory mapping is simpler and more streamlined.
Some early ports hint at 10%-50% improvements, but this is really up to the compiler's ability to optimize for the additional registers.
Q: Is that why 64-bit cpus run 32-bit code faster?
A:
No. 32-bit applications run on AMD'86-64 processors just like on 32-bit processors, with all the same limitations. The reason AMD's 64-bit cpus are so fast, even in 32-bit mode, is that they're a new generation of processor technology, the K8 core, that is more efficient and higher performing. Again it's analogous to the old '386 and '486, which were faster than the '286 in 16-bit mode, for exactly the same reason. The K8 core has an ipc of 0.62 at SpecINT, the older Barton K7 core an ipc of 0.47.
Q: Will we need 64-bit applications then, if it's only "somewhat" faster?
A:
- Oh, big YES. Definitely!
32 bits can only address 4GB. (without a terrible segmented memory model no one wants)
There's a 2GB memory limit imposed on 32-bit Windows apps! This is not just a limit on ram, it's the absolute limit of the virtual memory model available to an app. So it's quite serious. Remember, that a virtual memory map is highly fragmented.
For all the bullshit from Intel and the likes of tomshardware, pretending 32-64 bits is no issue, the harsh truth is that PC computing will go nowhere on 32 bits. We absolutely need 64-bit to move on.
Consider the old 640KB limit of 8086. This is where we are again today.
And it's not just the application. The hardware, the OS and its various objects also need to be mapped into other sections of the 32-bit 4GB space. An increasingly advanced OS also needs more mapping space to the app.
Q: But the PC managed to get by with Windows16 for many years, while Mac and others were 32-bit. Won't it be the same this time?
A:
No. The 16-32 bit migration and 32-64 bit migration are two completely different aspects of addressing technology. 16-32 bits were about moving from a segmented addressing model to a flat linear 32-bit memory model. While 16-bit computing was contorted, slow and buggy, it could well perform the type of tasks that were feasible on the amounts of RAM that were affordable in those days. We're talking about 2 - 16MB.
This is not the case now. Both 32 and 64 bit are linear addressing. This time it's about the size of the memory model. And while you could have a 16-bit Photoshop on Windows3.11, even while a 32-bit version performed better on a Mac, that won't be the case with 32/64 bit. There is no chance of a 32-bit app being able to perform the kinds of tasks that 64-bit computing will bring.
Q: When are we ready to migrate to 64-bits?
A:
- Well, we still need the apps to make use of it, ok? But in general terms, like yesterday. We're late. Windows Vista 64-bit is immediately needed.
Q: But why would we need all that memory. Isn't it just for bloatware?
A:
No. Explained in detail it goes like this:
The software and forms of use of the PC, that are available at any given time, are defined by memory needed for models/objects being handled.
Thus, simple writing, editing, calculating could be done on a 16-64KB CP/M text only display 8-bit PC.
There were lots of people around at the time who figured 128KB and 8 bits were enough for all PC use forever. But that is just because they couldn't imagine using computers as WYSIWYG publishing tools. Or editing pictures. (-"Photographs in a computer? - Ho, it would need like a MEGABYTE of ram for graphics!")
Creating/editing music and sound. Rendering realistic images. Editing video. Solid modeling. Physics simulations. Voxel handling/displaying (Adam & Eve, CAT-scan etc).
All these new uses for the computer came about as the necessary amounts of memory became available.
So it's ultimately the price of RAM and harddrives that dictates what we will use a computer for.
So, take a good look at that limit of about 1.5-1.8GB (practical, remember fragmentation) and compare it to prices on RAM and capacity of harddrives. And compare it to how much memory (Including swap. It's important to understand that it's about the size of the virtual map. Not just ram!) you yourself tend to use today. And by all means, how much recent games require. Then, looking backwards for historical guidance, try to figure out where we will be in 12-18 months.
Q: So what's the memory limit for 64-bit code then?
A:
Well, for x86-64, the virtual space to rumble around in, is fully 16 ExaBytes.
But "only" 4 PetaBytes in that space can be mapped, so that's the limit.
However, that is for AMD'86-64 as such, not for the current K8's.
Current K8's have 256 TeraBytes of space to rumble around in, but "only" a total of 1 TeraByte can be mapped, so that's the limit of virtual memory. But again, the space is much bigger, which should be a help for many things, including fragmentation.
Even further limiting is WindowsXP64's addressing scheme, which I understand will give you 'only' 16 TeraBytes of virtual space, and initially map to only 16GB. (Actually it seems it will only map to 4GB for Intel's current EM64T processors, while Windows64 will eventually be able to map to fully 1 TeraByte for AMD86-64 processors, and hopefully future Intel CPUs.)
One ExaByte is 1024 PetaBytes.
One PetaByte is 1024 TeraBytes.
One TeraByte is of course 1024 GigaBytes, which you might already be familiar with.
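For anyone who wants to check the arithmetic behind those limits, here is a quick sketch. The function name `space_size` is just something made up for this example; all it does is turn "N address bits" into a readable size using the 1024-based units listed above.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def space_size(bits):
    """Return the size of a 2**bits byte address space as a readable string."""
    size = 2 ** bits
    unit = 0
    # Divide by 1024 until the number is small or we run out of unit names.
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"

# 64 = full x86-64 virtual space, 52 = architectural physical limit,
# 48 and 40 = virtual and physical widths quoted for current K8's.
for bits in (64, 52, 48, 40, 32):
    print(bits, "bits ->", space_size(bits))
```

This gives 16 EB, 4 PB, 256 TB, 1 TB and 4 GB respectively, matching the figures in the text.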
More immediately, current implementations of iAMD'86-64 processors, both AMD and Intel, are of course more limited in physical address space. In the case of AMD, the most constrictive component is the integrated memory controller (currently 16GB). Opterons can use other Opterons' memory controllers over HT links to access 128GB. Intel implementations, too, might have some issues beyond 4GB (so far). But the important thing is that the software memory model is not limited. It will have enough addresses.
Q: So we're not likely to see 128-bit or 96-bit addressing soon then?
A:
No, if it happens at all in our lifetime, we're probably going to be in a state where we don't care much anymore anyway.
We are most certainly, though, going to see increased widths of data paths and vector processing. This is the width that is intuitively most often mistaken for being the "bit issue". Take for instance the marketing of game consoles. Currently, on the PC, on our "32-bit" cpus, that width is 128 bits. But rest assured that this will increase to 256, 512, 1024... on "64-bit" cpus.
More on how it works.
CPU Instructions
Machine code instructions do very simple things. They manipulate bit fields. A single bit field can be inverted, rotated, shifted or set to some explicit value. Two bit fields can be AND-ed, OR-ed, ADD-ed, IMUL-ed, compared etc...
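To make "manipulating bit fields" a bit more concrete, here is a toy sketch of a few of those operations on a 32-bit field. This is not how a CPU is built, of course; it just mimics the results. The function names are invented for the example, and the mask is needed only because Python integers have no fixed width.

```python
MASK32 = 0xFFFFFFFF  # keep every result inside a 32-bit field

def invert(x):
    """Flip every bit of a 32-bit field."""
    return ~x & MASK32

def shift_left(x, n):
    """Shift left by n; bits pushed past bit 31 are lost."""
    return (x << n) & MASK32

def rotate_left(x, n):
    """Rotate left by n; bits pushed past bit 31 re-enter at bit 0."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def add(x, y):
    """Add two 32-bit fields, wrapping around on overflow."""
    return (x + y) & MASK32

print(hex(invert(0)))                    # all 32 bits set
print(hex(rotate_left(0x80000001, 1)))   # the top bit wraps around to bit 0
print(hex(add(0xFFFFFFFF, 1)))           # overflow wraps to zero
print(bin(0b1100 & 0b1010))              # AND of two bit fields
```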
For all this to work, the processor must be able to know WHAT instruction to execute next. It must be able to refer to the instruction. And each instruction must also be able to refer to the bit fields it is going to manipulate. All this, from the software's point of view, is done by something called a virtual address. This is simply a number, explicitly stated or derived in some fashion.
The functional model for this is that all program instructions and all data exist in a numbered space. This is the virtual space. For each 8 bits of digital data, each byte, there is a number. Instructions that want to refer to some data in this space do so by the number corresponding to the "location" of the first byte of the involved data.
"16-bit" software is made up out of machine code instructions that use a 16 bit long number to refer to "location" of instructions and data.
"32-bit" software is made up out of (different) machine code instructions that use a 32 bit long number to refer to "location" of instructions and data.
"64-bit" software is made up out of (different) machine code instructions that use a 64 bit long number to refer to "location" of instructions and data.
A 16-bit cpu is a cpu that can only run 16-bit software.
A 32-bit cpu is a cpu that can run 32-bit software.
A 64-bit cpu is a cpu that can run 64-bit software.
For the sake of software simplicity, efficiency and reliability, it's extremely beneficial to have a very large flat virtual space. The flat space in a 16-bit software model is only 64KB. The flat space for 32-bit software is 4GB, and finally the flat space for 64-bit software is 16EB (1 ExaByte = 1024 PetaByte, 1PB = 1024TB, 1TB = 1024GB).
There are two reasons for this. First of all, if the program and its data (and its API calls) cannot be held within a single flat addressing space, the software must have several flat spaces, and must itself explicitly know and keep track of which address space, or segment, contains every single piece of data involved. The software application must then also make sure that the correct segment is always set in the CPU's segment register. This is contorted and inefficient, to say the very least.
But that was how 16-bit software worked. Remember Windows3.11? The big revolution of 32-bit software and Win95/NT was that the segmented software model was dispensed with. Our current 32-bit software is in fact based on the concept of a separate single flat virtual space, for each and every process running under the OS.
But 32-bit software only goes so far. The virtual space becomes fragmented! This is very important to remember. We cannot use it all. Ideally, we should have a much, much larger flat virtual space than we can ever use (and that is exactly one of those things 64-bit offers). Whenever the software figures to set up another large block of data, there should be no problem of finding free "numbers" for referring to that data. It's also important to remember that the OS' services and resources also must be mapped in, as well as any "shared" resources.
Today, the limit where a 32-bit application risks no longer working is somewhere around 1½ GB, used for code and data.
There are only two solutions to this. Either we go back to a segmented software model, with all its problems and limitations, or we go to a 64-bit software model. The choice is extremely clear. And actually already made for us by both MS, with Windows64, and Linux. No one intends to go 32-bit segmented. Oracle did, for lack of better options, but that's not the future.
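The fragmentation problem is easy to underestimate, so here is a toy illustration. The layout below is completely made up (four blocks scattered through a 2GB user space, standing in for DLLs, heaps and so on), and `largest_free_gap` is a helper invented for this sketch. The point is only the gap between "total free" and "largest contiguous free".

```python
def largest_free_gap(space_size, used_blocks):
    """Return the largest contiguous free range, given (start, length) blocks in use."""
    largest = 0
    cursor = 0  # end of the highest used block seen so far
    for start, length in sorted(used_blocks):
        largest = max(largest, start - cursor)
        cursor = max(cursor, start + length)
    return max(largest, space_size - cursor)

MB = 1024 * 1024
GB = 1024 * MB

# Hypothetical layout: (start address, length) of blocks already mapped.
used = [(0, 64 * MB), (400 * MB, 32 * MB), (900 * MB, 16 * MB), (1500 * MB, 48 * MB)]

total_free = 2 * GB - sum(length for _, length in used)
biggest = largest_free_gap(2 * GB, used)

print("total free:  ", total_free // MB, "MB")
print("largest gap: ", biggest // MB, "MB")
```

With this layout there are 1888 MB of free addresses, yet the largest single block the app can set up is 584 MB. In a 64-bit space the same scattered blocks would barely dent the room available.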
Accessing ram:
Physical ram then works this way: Every "number", i.e. every virtual address that is in actual use by the software (or any software currently running), is associated with a unique current location in either physical ram or the swap area. This work is handled by the CPU's memory manager and the OS. This physical memory doesn't become fragmented, thanks to a paging mechanism.
This paging mechanism can handle more than 4GB of physical ram, even for 32-bit software (segmented), for instance with a 36-bit physical address (64GB).
So this often stated 4GB limitation of 32-bit CPUs is simply not true!
What is true, however, is that our current 32-bit software format is much more limited than that. Large software tools run aground somewhere after ~1½GB, like maybe 1.7GB. So the also often stated assumption that we're fine on 32-bit CPUs, as long as we don't need 4GB, is unfortunately likewise untrue.
So this 64-bit is simply yet another migration to a new CPU. A migration that is necessary because the old one isn't good enough any longer. We have already made that type of migration two times before, on what is known as "x86".
The four x86 families are: (as identified by some examples):
1. 8088, 8086, NEC V20
2. 80286 aka '286
3. '386, 486, Pentium, K5, P-II, K6, K6-2, P-III, K6-3, Athlon, P4, Duron, AthlonXP, P4B/C/E/A, 500-P4, Sempron, PentiumM.
4. Opteron, Athlon64, Sempron64, 501-P4, 600-P4 (P4F), Athlon X2, 800-PentiumD.
The old 32-bit architecture ('386, '486, Pentiums, K6, Athlon/XP) has three different user modes (or four depending upon how you count), representing three (8086, '286, 32-bit) different CPU personalities in one CPU:
Real mode = original 8086. [8086]
protected mode = supporting both older 16-bit protected mode ['286] and 32 bit computing. [32-bit]
Virtual real mode (virtual mode) = submode of 'protected 32-bit mode', emulates an original 8086 inside the protected mode. [8086]
This is known as 'IA32', but is basically the '386. A consequence of the long '86 PC legacy.
The various extensions since (FPU, MMX, SSE, SSE2) may add registers and instructions, but they aren't as fundamental a change as going 64-bit (or going 32-bit from 16-bit).
The x86-64 CPUs now have five (or seven, depending on how you count) modes, representing four (8086, '286, 32-bit, 64-bit) main CPU personalities in one CPU:
Legacy mode/real mode = original 8086. [8086].
Legacy mode/protected mode = protected 16/32 bit code. ['286] & [32-bit].
Legacy mode/virtual mode = emulating 8086 inside protected addressing. [8086].
Long mode/compatibility mode = emulating 16/32 bit protected modes inside a 64-bit space. ['286] & [32-bit].
Long mode/64-bit = Our brave new world! 64-bit computing [64-bit]. A 64-bit virtual address space. This last mode also includes double the number of registers, and 64-bit integer GP registers. These things are inherent in x86-64.
*****REAL MODE*****
The first mode - 'real mode' was the addressing mode of the 8086.
The 8086 ISA has instructions that refer to data in memory with a 16-bit address. 16 bits can only represent 65,536 (64K) numbers, which is not enough to specify every byte in memory. So the instruction's 16 bits (the offset) are combined with the content of a 16-bit segment register to form a 20-bit address (binary 0000 is appended to the end of the segment value, and the instruction's address is added).
This address goes out directly on the physical addressbus. The software generating the instruction's 16-bit address must also set the segment register, and is fully responsible for the outgoing 20-bit addresses being correct.
(This is a horrible way to build software)
Because the software's own idea of addresses, the software's memory model, is exactly the same as the physical memory, this is called "real mode".
And because only 20-bit addresses can be formed in this mode of addressing, only 1MB can be addressed. 2^20 = 1,048,576 = 1024K = 1M. Classic DOS, '86 DOS, takes this memory environment for granted.
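The real mode address calculation described above fits in a couple of lines. This is a sketch, with `real_mode_address` being a name made up for the example. The wrap at 1MB reflects an 8086 with its 20 address lines; later CPUs behave differently at that boundary.

```python
def real_mode_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit physical address."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    # Shifting left by 4 appends four binary zeros (one hex digit) to the segment.
    return ((segment << 4) + offset) & 0xFFFFF  # wrap at 1 MB, as on an 8086

print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0x1234, 0x5678)))  # 0x179b8
print(hex(real_mode_address(0x1235, 0x5668)))  # 0x179b8 again
```

Note the last two lines: different segment:offset pairs land on the same physical byte. That overlap is part of what makes this model so awkward for software.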
*****16 BIT PROTECTED MODE*****
The second mode, 16-bit protected mode, was introduced with the '286.
Again, the ISA uses instructions with 16-bit long addresses. Again the software itself is responsible for setting the segment register. (Code only ever refers to one segment of memory at a time, and the software must switch this segment through the segment register.)
But this time the software sets a 13-bit selector in the old 16-bit segment register. - And it is _NOT_ combined with the 16-bit address to form the physical address.
Instead, there are two new 40-bit registers in the cpu, the 'global' and 'local' 'descriptor table registers'. These are not set by the application software. The OS sets these, with privileged instructions.
These registers point to a 'descriptor table' in memory. (or rather cache)
The OS can set up one 'local descriptor table' for each application. When the OS intends to run an application, it first loads the address of the application's descriptor table into the 'local' register.
The table contains 'segment descriptors'. These are 8-byte data structs. The OS sets up these segment descriptors as it allocates memory for the application.
So the following happens when an instruction refers to some data with a 16-bit address: The selector, in the segment register, selects a segment descriptor in the descriptor table that the 'local' register points to. This segment descriptor gives a 24-bit address for the base of a segment of up to 64KB. It is then used together with the 16-bit address to form a physical 24-bit address. This is good for addressing 16MB of memory.
So the _software_ address _is_, as before, the segment register and the 16-bit address. But this time, that address is mapped to something completely different.
Since each application can have its own mapping (local table), - this way -, different applications' use of memory _can_ be kept separate and protected from each other. Thus: "protected mode".
This '16-bit protected mode' is what 'Windows16' software uses.
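Here is a simplified sketch of that lookup. It leaves out access rights and privilege checks entirely, and the class and function names are invented for the example. One detail the text above skips: besides the 13-bit index, the 16-bit selector carries one bit choosing between the global and local table, and two privilege bits, which this sketch ignores.

```python
class SegmentDescriptor:
    """Toy stand-in for an 8-byte descriptor: just base and limit."""
    def __init__(self, base, limit):
        self.base = base    # 24-bit physical base address of the segment
        self.limit = limit  # highest valid offset inside the segment

def translate_286(selector, offset, gdt, ldt):
    """Map a (selector, 16-bit offset) pair to a 24-bit physical address."""
    index = selector >> 3            # upper 13 bits pick the table entry
    use_local = (selector >> 2) & 1  # table indicator: 0 = global, 1 = local
    table = ldt if use_local else gdt
    descriptor = table[index]
    if offset > descriptor.limit:
        raise MemoryError("offset outside segment: protection fault")
    return (descriptor.base + offset) & 0xFFFFFF

# Hypothetical tables set up by the "OS" for one application.
gdt = {1: SegmentDescriptor(base=0x010000, limit=0xFFFF)}
ldt = {1: SegmentDescriptor(base=0x123000, limit=0x0FFF)}

selector = (1 << 3) | 0b100  # entry 1, local table
print(hex(translate_286(selector, 0x0010, gdt, ldt)))  # 0x123010
```

The application only ever sees the selector and the offset. Where that lands physically, and whether the access is allowed at all, is decided by tables it cannot touch.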
*****32-BIT PROTECTED MODE*****
Both "real mode" and "16-bit protected mode" are pretty close to software HELL, and thus quite uncomfortable. The '386 introduced a 32-bit ISA, and two new addressing modes. The most important is the '32-bit protected mode'. This is important for two reasons: It offers greater memory space. And most important of all - it can be used to implement a flat, linear software memory model!
It looks a lot like 16-bit protected mode, but new instructions now use 32-bit addresses, and the segment descriptors now hold a 32-bit base address. (The segment registers themselves still hold 16-bit selectors.)
Typically, an application is linear, has one 4GB large segment, and the application _never_ changes its segment register. So the software model address is in practice only the 32 bits included directly in the instruction. - How convenient! No segment selection to manage.
Used this way, as Win 9x and NT do, the segment register doesn't come into play.
As before, that 32-bit virtual address is mapped to a physical address through tables that the OS sets up. But in this case we are using the first 20 bits of the 32-bit address as indexes into page tables, to find the 32-bit address of a 4KB *page* (pages don't overlap) rather than a segment. The last 12 bits give the address within this 4KB page.
It's possible for the OS to cut off the software completely from accessing any hardware address. And it can shuffle around and store those 4K pages anywhere it wants, including the harddrive.
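The 20 + 12 bit split can be sketched like this. On the '386 the upper 20 bits are themselves used as two 10-bit indexes (page directory, then page table). The dictionary standing in for the page tables and the function names are simplifications made up for this example, not the real in-memory format.

```python
PAGE_SIZE = 4096  # 4 KB pages

def split_linear_address(addr):
    """Split a 32-bit address into (directory index, table index, offset)."""
    offset = addr & 0xFFF               # low 12 bits: byte within the page
    table_index = (addr >> 12) & 0x3FF  # next 10 bits: entry in a page table
    dir_index = (addr >> 22) & 0x3FF    # top 10 bits: entry in the page directory
    return dir_index, table_index, offset

def translate(addr, page_tables):
    """Look up addr in a toy mapping {(dir_index, table_index): physical page base}."""
    dir_index, table_index, offset = split_linear_address(addr)
    frame = page_tables.get((dir_index, table_index))
    if frame is None:
        raise LookupError("page not present: the OS would handle a page fault here")
    return frame + offset

# Hypothetical mapping: one virtual page mapped to a physical page at 0x7F000.
tables = {(1, 3): 0x7F000}

print(split_linear_address(0x00403ABC))     # (1, 3, 2748), and 2748 is 0xABC
print(hex(translate(0x00403ABC, tables)))   # 0x7fabc
```

The "page not present" case is where the OS gets its chance to fetch the page back from the harddrive, or to kill the app for touching something it shouldn't.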
With the help of a lot of clever complexity, descriptors and table registers are compatible with 16-bit protected mode, and allow an OS to run both types of modes, side by side, under a single memory management.
However, old 'real mode' cannot run together with either protected mode. In real mode, the software must have direct access to the lowest MB memory, and also goes out and stomps all over it.
*****VIRTUAL REAL MODE*****
For that reason, the '386 also introduced 'virtual real mode'. This is a mode that fools an old application into believing that it is running in real mode. In reality, it runs in protected mode, and that 1MB *physical* ram is mapped to wherever the OS wants it.
This is how the Windows OSes handle DOS inside windows and concurrently with other apps. This is a sort of 8086 emulator in hardware, inside the cpu.
*****LEGACY MODE*****
Under a 32-bit OS, current 64-bit CPUs run exactly like the old 32-bit CPUs, from '386 to Athlon & P4, and thus have all the above modes. This is called "legacy mode" in iAMD86-64.
*****LONG MODE/COMPATIBILITY MODE*****
In long mode, running under a 64-bit OS, the CPU is an entirely new CPU. It is NOT compatible with any old interrupts or system/hardware level instructions! This is why no 32-bit driver will work.
It is however still able to execute old 32-bit (and 16-bit) application instructions in long mode. It does this by first generating an address by way of combining with the old segment registers. It does not however use any 32-bit mapping scheme. Instead it treats this combined address as a 64-bit address, and sends it to the new 64-bit memory mapping. So it's a bit like a hardware emulation layer.
Windows 64 will not bother with supporting 16-bit applications. Which is why I put 16-bit in parentheses.
*****LONG MODE/64-BIT*****
Not much to it. Software uses 64-bit addressing instructions. Ultimately, 4PB worth of addresses residing inside this 64-bit space will be mapped to a 52-bit physical space. Currently though, only the lower 48 bits will be used. From this 48-bit (256TB) space a total of 1TB can be mapped to a 40-bit physical space. Software must however use all 64 bits for addressing purposes, to preserve compatibility with future processors and software.
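The way x86-64 enforces "use all 64 bits" is the so-called canonical address rule: on a CPU implementing 48 virtual address bits, bits 48 to 63 must all be copies of bit 47, otherwise the access faults. Here is a sketch of that check. The function name and the `implemented_bits` parameter are made up for the example; 48 is the value quoted above for current CPUs.

```python
def is_canonical(addr, implemented_bits=48):
    """True if all bits above the implemented range copy the top implemented bit."""
    # Keep the top implemented bit plus everything above it.
    upper = addr >> (implemented_bits - 1)
    all_ones = (1 << (64 - implemented_bits + 1)) - 1
    # Canonical means those bits are either all zeros or all ones.
    return upper == 0 or upper == all_ones

print(is_canonical(0x00007FFFFFFFFFFF))  # True: top of the lower half
print(is_canonical(0xFFFF800000000000))  # True: bottom of the upper half
print(is_canonical(0x0000800000000000))  # False: bit 47 set, upper bits clear
```

So software can't stash its own flags in the "unused" upper bits of a pointer and expect the CPU to ignore them, which is exactly what keeps the door open for wider implementations later.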