There are a few major types of clusters (in my own mind, anyway).
You have high-availability style clusters.
For example, you have 2 machines. One machine is a copy of the other machine. If the first computer fails, the second one kicks in and takes its place automatically.
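The failover logic boils down to a heartbeat check: keep asking the primary if it's alive, and promote the standby when it stops answering. A minimal Python sketch with hypothetical names (real HA stacks like Heartbeat/Pacemaker also handle fencing, shared state, and a lot more):

```python
# Active/standby failover in miniature. A real monitor would send
# heartbeats over the network; here the node status is simulated.

def is_alive(node):
    # Stand-in for a real heartbeat/ping check.
    return node["up"]

def pick_active(primary, standby):
    """Return whichever node should be serving traffic right now."""
    if is_alive(primary):
        return primary
    # Primary is down: the standby kicks in and takes its place.
    return standby

primary = {"name": "node1", "up": False}  # simulate a crash
standby = {"name": "node2", "up": True}

active = pick_active(primary, standby)
print(active["name"])  # → node2
```

The point is just the decision rule; everything hard in real HA (detecting a hang vs. a crash, making sure the old primary really stops) lives outside this sketch.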
Then you have server clusters..
You have 2 machines that load balance over a network, providing network services at a speed and availability that a single PC can't match.
Also used in big databases and stuff like that.
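The simplest flavor of that load balancing is plain round-robin: hand each incoming request to the next machine in the list. A toy sketch (real balancers also do health checks and weighting):

```python
# Bare-bones round-robin dispatch between two machines.
from itertools import cycle

def round_robin(servers, n_requests):
    """Assign n_requests to servers in rotating order."""
    it = cycle(servers)
    return [next(it) for _ in range(n_requests)]

print(round_robin(["machine1", "machine2"], 4))
# → ['machine1', 'machine2', 'machine1', 'machine2']
```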
Then you have combinations of the above. (For example, Google uses Linux with in-house software for this, spread across lots of computers in lots of nodes... I forget the numbers.)
You have 2 pairs of machines. Each machine pair is a node. One node operates in tandem and provides high-speed network stuff, while the other pair waits on standby for the first pair to go down (if they crash or need updating or something), then they kick in...
That stuff is a lot like RAIDing hard drives. You have copies of machines working together to prevent failure and do things quickly.
Then you have computational clusters.
The famous type is a Linux Beowulf-style cluster. You have a bunch of computers, all with the same or similar hardware, that split a single computational task across the whole bunch and all go at it at once.
Generally used for scientific or mathematical stuff, like astronomy or computational chemistry. Also used in DNA sequencing and stuff like that.
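The core Beowulf idea is just splitting one big computation into chunks that run in parallel and combining the results. A real cluster would use something like MPI across machines; this single-machine Python sketch with `multiprocessing` shows the same split-and-combine pattern:

```python
# Split a sum-of-squares computation into chunks and run the chunks
# in parallel, then combine. On a Beowulf cluster each chunk would
# go to a different machine; here each goes to a worker process.
from multiprocessing import Pool

def partial_sum(chunk):
    lo, hi = chunk
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    step = n // workers
    # Last chunk absorbs the remainder so the ranges cover 0..n.
    chunks = [(i * step, n if i == workers - 1 else (i + 1) * step)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

The in-house scientific codes mentioned below follow this shape too: the domain math lives in the per-chunk function, and the cluster plumbing just scatters and gathers.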
For the actual computation, the software is usually written in-house by whoever uses the stuff. Since it's mostly just math stuff, they write the programs to do a certain task and that's about that. A program that an astronomer wrote would probably only be useful to other astronomers, and would probably have to be adapted to another computer setup.
There are development software and libraries and stuff for building Beowulf apps, and the scientist or whoever would probably be using commercial or open-source stuff to run the databases and software, but the actual processing would be something they wrote.
At least that's how I understand it. I am DEFINITELY NO EXPERT at Beowulf stuff.
Check out something like
www.beowulf.org
Now the easy stuff that I understand is called a Mosix cluster (well, for the end user). A Mosix cluster is used to make a cluster of computers look like a single pseudo-SMP machine.
Nothing will ever get processed faster than the fastest single machine can do it, but you can have a bunch of different stuff running on one machine, and the threads and forked processes will migrate to other machines.
A single Mozilla session won't be helped, but a kernel compile along with a bunch of desktop software, a bunch of Mozilla sessions, and a bunch of Quake games will all be moved around from computer to computer to load balance across all the processors.
Or if you have something like multithreaded database software or a web server, you can use a Mosix cluster to make everything run much faster and be more responsive.
Mosix was the original name for this stuff and was under an open-source license, but the guys who developed it decided to try to make more money off of it and released the next version under a very restrictive license (sounds familiar?). A fork quickly sprouted from that, openMosix was created, and development pretty much leaped from there and quickly surpassed (from what I understand) the closed-source Mosix stuff.
See here
That is where you can find easy-to-use bootable CDs to quickly build a cluster out of spare computers... they don't even have to be the same speed or anything; it will effectively load balance a 2000 MHz machine with a 200 MHz machine. It just assigns different metrics based on the speed of the machine and the network stuff. (I.e., 90% of the stuff will get put on the 2000 MHz machine and only move to the 200 MHz machine when it can process a thread faster there.)
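A toy version of that metric idea: each node advertises a speed and a queue of pending work, and a new job goes to whichever node would finish it soonest. These names are illustrative, not the real openMosix scheduler:

```python
# Metric-based placement: send each job to the node that would
# finish it soonest given its speed and current queue of work.

def pick_node(nodes, job_cost):
    """nodes: {name: {"mhz": speed, "queued": work units queued}}"""
    def finish_time(n):
        return (n["queued"] + job_cost) / n["mhz"]
    return min(nodes, key=lambda name: finish_time(nodes[name]))

nodes = {
    "fast": {"mhz": 2000, "queued": 0},
    "slow": {"mhz": 200,  "queued": 0},
}

# Place twelve equal jobs; the vast majority land on the 2000 MHz
# box, and the 200 MHz box only gets one once the fast box's queue
# is long enough that the slow box would finish sooner.
for _ in range(12):
    chosen = pick_node(nodes, job_cost=100)
    nodes[chosen]["queued"] += 100
```

Running the loop puts roughly 90% of the work on the fast node, which matches the ratio described above.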
People even use this for stuff like major LAN parties, for gigantic high-speed Quake servers for example.
You can combine everything together and have many different styles of clusters in a single cluster. You can do a Beowulf/Mosix cluster at the same time, or even do a high-availability network-Mosix-Beowulf database cluster if you want.
The best OS for this sort of stuff right now is Linux. MS has done network clustering stuff since NT, and the third-fastest known computer cluster in the world runs OS X ("known" because, for example, I bet the NSA has some cool Linux stuff going on behind closed doors). And I am sure that there are Solaris, *BSD, and other cluster projects out there.
edit:
The one catch (that I am aware of) for Mosix stuff is that software compiled to use specific CPU instructions has to run on similar CPUs. For example, a 3D app using MMX instructions will fail on a 486, since a 486 doesn't have any MMX stuff, and you couldn't run a PPC program on an x86 computer.
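That migration constraint boils down to a containment check: a process can only move to a node whose CPU supports every instruction-set feature its binary uses. A sketch (feature names illustrative; a real check would read CPU flags, e.g. from /proc/cpuinfo on Linux):

```python
# A binary can run on a node only if the node's CPU features are a
# superset of what the binary was compiled to use.

def can_migrate(binary_needs, node_features):
    return binary_needs <= node_features  # set containment

pentium_mmx = {"x86", "mmx"}
plain_486   = {"x86"}

assert can_migrate({"x86", "mmx"}, pentium_mmx)    # MMX app on Pentium: OK
assert not can_migrate({"x86", "mmx"}, plain_486)  # MMX app fails on a 486
assert not can_migrate({"ppc"}, pentium_mmx)       # PPC binary on x86: no
```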