AMD and float

f95toli

Golden Member
Nov 21, 2002
1,547
0
0
I have a couple of questions regarding AMD processors (Duron, Athlon, etc.) and floating point calculations. I know that CPUs from AMD used to be faster than Intel when handling float (the datatype); if I remember correctly, it was because AMD needed fewer clock cycles to handle 32-bit floats or something.
This is supposed to be the reason why people who want to do time-consuming calculations tend to buy AMD. I know for example that most Beowulf-type clusters use AMD.

My two questions are:
1) Is AMD still faster or is it just that AMD processors are cheaper?
2) If AMD is still faster: Why?




 

rIpTOr

Member
Oct 9, 2000
105
0
0
The current Athlon XP CPUs can do more work per clock cycle. The fastest AMD CPU you can buy is clocked @ 2250MHz. Intel CPUs have a long pipeline and need high clock speeds to get the same amount of work done. This is why a 2250MHz Athlon can be as fast as a 2800MHz Pentium 4.

AMD CPUs are generally cheaper because they aren't in the same position (financially) as Intel.
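
The 2250MHz versus 2800MHz comparison above can be put into numbers. This is only a rough sketch, assuming overall performance is simply work per clock times clock frequency, which ignores memory, cache, and workload effects. The function name is made up for illustration.

```python
def relative_work_per_clock(clock_a_mhz: float, clock_b_mhz: float) -> float:
    """How much more work per clock CPU A must do to match CPU B overall.

    Assumes performance = work_per_clock * clock, so equal performance
    implies work_per_clock_a / work_per_clock_b = clock_b / clock_a.
    """
    return clock_b_mhz / clock_a_mhz


athlon_mhz = 2250.0  # clock quoted in the post
p4_mhz = 2800.0      # clock quoted in the post

ratio = relative_work_per_clock(athlon_mhz, p4_mhz)
print(f"Athlon must do about {ratio:.2f}x the work per clock of the P4")
```

Under that simple model, "as fast as" means the Athlon does roughly a quarter more work per clock; it says nothing about which kind of work (integer or floating point) makes up the difference.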
 

f95toli

Golden Member
Nov 21, 2002
1,547
0
0
Originally posted by: rIpTOr
The current Athlon XP CPUs can do more work per clock cycle. The fastest AMD CPU you can buy is clocked @ 2250MHz. Intel CPUs have a long pipeline and need high clock speeds to get the same amount of work done.

But what does "fast" mean in this case? I thought it had more to do with the way AMD handled float (as opposed to int)? Note that I am not talking about performance in 3D games here (which require low-precision math); what I meant was performance in things like scientific computing, where you need high-precision numbers (many significant digits, to put it simply). Performance when it comes to handling integers is not really relevant in this case.

If it only had to do with the pipeline, shouldn't that affect the speed for integers as well?
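
To make "many significant digits" concrete: a 32-bit float carries about 7 decimal digits and a 64-bit double about 16. Below is a small sketch of that difference, using Python's `struct` module to round a value to 32 bits. It illustrates precision only and says nothing about how fast either CPU handles either type; the helper name is made up.

```python
import struct
import sys


def round_to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.1                      # stored as a 64-bit double by Python
as_single = round_to_float32(value)

# Relative error introduced purely by storing the value in 32 bits.
rel_err_single = abs(as_single - value) / value

print(f"double: {value:.17g}")
print(f"single: {as_single:.17g}")
print(f"relative error of single: {rel_err_single:.2e}")
print(f"double machine epsilon:   {sys.float_info.epsilon:.2e}")
```

Every operation in a long calculation can add an error of about this size, which is why scientific codes usually work in double precision.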





 

KF

Golden Member
Dec 3, 1999
1,371
0
0
The Athlons do floating point faster than a P4 on a per-clock basis. At some clock speed a P4 will be fast enough to beat a slower-clocked Athlon. I don't know the ratio. The Athlon devotes a lot of resources to floating point in comparison to the P4. The advantage in some games of Athlons labeled with a model number similar to a given P4's clock speed is often attributed to a game's heavy use of FP. OTOH, in games that are thoroughly optimized for SSE2 instructions, the P4 generally outperforms similar-model Athlons.

Even programs that are primarily floating point use other instructions as well, so the long pipeline liabilities can have some effect on the program execution speed. If optimized for a P4, there should be no great penalty.

It was a conscious choice with the P4's design to give up per-clock performance in favor of a higher attainable clock speed. For a while, P4s had a hard time ramping up the clock fast enough to keep pace with the Athlon design. Now the fastest P4s are pulling away a little from the fastest Athlons. Naturally Intel is asking a (large) premium for the fastest obtainable. What else? Do the fastest P4s outperform the fastest Athlons on FP? I don't know.

The manufacturers as well as the pundits make all kinds of claims for why one processor will be faster than another, but generally all processors have design measures to mitigate the effects of possible slowdowns, and likewise there are unavoidable stalls due to the limitations of resources available in particular situations. Some compromise on resources is necessary to keep the chip cost down. In the end, you cannot tell what will happen without an accurate simulation, or actually running real programs. The Athlon just has more resources available on average than the P4 for floating point. The FP execution units on an Athlon are more complete, more independent, and have fewer special limitations. In general, the resources which the Athlon has to prevent stalls and interlocks are gross overkill. The P4, OTOH, is very adeptly balanced and minimized. Intel depends on its market dominance (80% ?) to persuade programmers to program around its processor's pitfalls. AMD cannot adopt this type of strategy, obviously. The net result is that it is a lot easier to hand-optimize the key FP speed loops for an Athlon, so the Athlon in general will easily outperform a P4.

There is nothing inherently cheaper about the Athlon's chip design, at least not to the naked eye. On the contrary, one would guess that the P4 has an advantage on chip cost. It appears to me that AMD gets high switching speed by the traditional method of high current. AMD has been mostly a step behind in applying leading-edge chip processes to its CPUs, but that technology is also somewhat cheaper. Intel seems to use an unusual method to get chip speed up: lower temperatures. (Switching speed increases with decreasing temperature.) To keep in the performance range of Athlons, this requires next-generation chip processes (which require lower operating potentials), but Intel has been able to do this. Intel seems to be spreading its transistors over a larger chip than one might expect, to get heat density down, and therefore lower temperatures. Still, to beat the Athlon by using higher clock speeds, Intel has pushed the P4's clock to the point it draws similar currents. (Current is proportional to clock speed.) Now Intel has the P4 doing multiple threads concurrently to boost the instructions per cycle, by using otherwise unemployed CPU resources. But using more resources also means drawing more current, and with that, a higher temperature.
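
The relation between clock, voltage, and current mentioned above is commonly approximated for CMOS logic as dynamic power P = a * C * V^2 * f, so the current I = P / V grows linearly with clock at a fixed voltage. Here is a sketch of that textbook approximation. The activity factor, capacitance, and voltages are made-up illustrative values, not measurements of any Athlon or P4, and leakage is ignored.

```python
def dynamic_power_watts(activity: float, capacitance_f: float,
                        voltage_v: float, clock_hz: float) -> float:
    """Textbook CMOS switching power: P = activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * clock_hz


def supply_current_amps(power_w: float, voltage_v: float) -> float:
    """Current drawn at a given supply voltage: I = P / V."""
    return power_w / voltage_v


# Hypothetical values chosen only to show the scaling.
ACTIVITY = 0.2        # fraction of the capacitance switched per cycle
CAPACITANCE = 80e-9   # farads, total switchable capacitance

base = dynamic_power_watts(ACTIVITY, CAPACITANCE, 1.5, 2.0e9)
faster = dynamic_power_watts(ACTIVITY, CAPACITANCE, 1.5, 3.0e9)
higher_v = dynamic_power_watts(ACTIVITY, CAPACITANCE, 1.65, 2.0e9)

print(f"clock x1.5 at same voltage -> power x{faster / base:.2f}")
print(f"voltage x1.1 at same clock -> power x{higher_v / base:.2f}")
print(f"current at base point: {supply_current_amps(base, 1.5):.1f} A")
```

In this model power scales linearly with clock but with the square of voltage, which is why a lower operating voltage from a newer process matters so much.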

 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81


AMD CPUs are generally cheaper because they aren't in the same position (financially) as Intel.

Actually they're cheaper because you're not paying for a name, and because it's cheaper to manufacture an 80x80 core than a 120x120 core... I believe that's the size of the cores now... 80mm being the AMD's and 120mm being the P4's
 

dannybin1742

Platinum Member
Jan 16, 2002
2,335
0
0
hmmmm
in my xray lab we only use p4 machines, and we buy new ones every year. contrary to what is believed about how much work is done in a clock cycle, a lot of scientific work i do is based off clock speed


we do have some old indigo 2 sgis we use though that we keep running. our xray crystallography machine is run and data is processed by a dual p3 rig
 

f95toli

Golden Member
Nov 21, 2002
1,547
0
0
Originally posted by: dannybin1742
hmmmm in my xray lab we only use p4 machines, and we buy new ones every year. contrary to what is believed about how much work is done in a clock cycle, a lot of scientific work i do is based off clock speed. we do have some old indigo 2 sgis we use though that we keep running. our xray crystallography machine is run and data is processed by a dual p3 rig

I agree. But most calculations still require fast floating point calculations. We use dual Athlon MP (Tyan, running Linux) for smaller jobs. We have access to a few supercomputers (Cray, Sunfire, IBM etc) but then you have to schedule your run. The reason why I asked is that I have seen quite a few new clusters popping up here and there; it makes sense for certain types of jobs which do not require much traffic between different CPUs. It is also very cheap: you can buy several "normal" PCs for the price of one Sun workstation.
I was still a bit surprised when I realised that most clusters use Athlon instead of P4. I guess the better float handling is the reason (the price difference between AMD and Intel is too small to explain this; the high-speed network cards and the memory you need in a good cluster are much more expensive than the CPU anyway).

Also, why do you need a dual P3 to process XRD-data?

/Tobias



 

Vape

Junior Member
Dec 18, 2002
15
0
0
Good info KF. Nice to see someone that actually will explain the topic rather than a one sentence general statement... actually I feel the urge to be one of those...


AMD is faster than Intel due to the design of the chips, and in apps that are not optimized for either the Athlon or the Pentium 4 there is no advantage from SSE2 and stuff. I think I just confused myself there:

AMD is designed more efficiently (regarding clock speeds compared to Intel) but they run hotter (which means inefficient...?)

On proggies that are not optimized for either chip, the AMD will win with floats.

When you start doing comparisons of chips on proggies like Flask and other DivX encoding, SSE2 is used and it "appears" that AMD could not process that task faster to save themselves. But really, the encoding is using SSE2.


No point comparing two tyres on the road when one tyre is using road surface that matches and goes in between the tread...
 

dannybin1742

Platinum Member
Jan 16, 2002
2,335
0
0
the dual P3 rig controls the entire xray machine, and we have a cluster of 6 older sgi machines, 3 indigo2 workstations, and now 2 dell boxes clocked at 2.533 GHz. we mostly use Xtal under linux; with the speed of the cpu we can do fast Fourier transforms (2fo-2fc).


the dual p3 that handles the xray machine runs some type of proprietary software, but what it does is create a diffraction cloud as a .shelx, .phs, or a .pdb file, from the collected diffraction plates. this file can then be viewed by Xtal in any Unix OS, such as irix, or redhat which is what we use. from there we can do structure refinement. we've found it's cheaper to upgrade computers every year than pay for service plans, so this year i think we are going to upgrade to 3.06 with HT.


apparently on the older boxes (pre-gigahertz era), it was really hard to do FFT with the structure; the computer was constantly playing catch-up.
 

dannybin1742

Platinum Member
Jan 16, 2002
2,335
0
0
trust me, i'd like to build our lab a nice dual XP MP server, but it's just easier for us to buy dell machines, and install linux on them once we get them
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
AMD is designed more efficiently (regarding clock speeds compared to Intel) but they run hotter (which means inefficient...?)


Why do so many people misunderstand this?
They run hotter than Intel because they use a higher voltage, and because the core is so much smaller. The P4 core is now 131 square mm... which is the area it has to dissipate heat through. The AMD T-Bred Revision B is 84 square mm to dissipate heat. That's a bit under two-thirds of the area that Intel CPUs have.
Is this a handicap? Not really... I don't know of anyone who has problems cooling their AMD CPUs... sure they might need a higher quality heatsink/fan, but I don't know about any of you, but I wouldn't throw some piece of trash on my CPU that's "good enough" anyway. Plus... the size of AMD's cores allows them to produce more cores per silicon wafer... the more cores per wafer, the lower the manufacturing cost.
The moral of the story... the fact that in general, AMD CPUs run hotter isn't a disadvantage... it's all a result of using a small core, which is more of an advantage to me than a disadvantage since AMD's top of the line CPU sells for under $400, and can hold its own against Intel's top of the line CPU which costs $700.
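
The cores-per-wafer argument can be estimated with a common first-order formula. This sketch reads the two figures quoted above (131 for the P4, 84 for the T-Bred B) as die areas in square mm rather than edge lengths, which is an assumption, and it also assumes a 200 mm wafer. Yield, scribe lines, and edge exclusion are ignored, so treat the counts as rough.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate of whole dies that fit on a round wafer.

    Wafer area divided by die area, minus a correction for the partial
    dies lost around the circular edge.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


WAFER_MM = 200.0         # assumed wafer diameter
P4_AREA_MM2 = 131.0      # figure from the post, read as an area
ATHLON_AREA_MM2 = 84.0   # figure from the post, read as an area

p4_dies = gross_dies_per_wafer(WAFER_MM, P4_AREA_MM2)
athlon_dies = gross_dies_per_wafer(WAFER_MM, ATHLON_AREA_MM2)
area_ratio = ATHLON_AREA_MM2 / P4_AREA_MM2

print(f"area ratio (Athlon / P4): {area_ratio:.2f}")
print(f"estimated dies per wafer: P4 {p4_dies}, Athlon {athlon_dies}")
```

The smaller die gives noticeably more candidates per wafer, and the same area ratio is what concentrates a similar amount of heat into less silicon.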
 