Intel Xeon Question

crazymonkeyzero

Senior member
Feb 25, 2012
363
0
0
Hi guys. I am planning to build a single-CPU server system to run highly threaded computational work, and I am wondering whether I should spend extra on the new Xeon E5-2620 (2.0 GHz hex core, socket 2011) or stick to my original plan of purchasing the Xeon E3-1270 (3.4 GHz quad core, socket 1155). Will the extra 250 dollars for the CPU and 2011 mobo combo be worth the performance increase, if any, or should I stick with the Sandy Bridge CPU? From what I hear, the 1270 typically performs on par with the i7 2600K (minus the ability to OC), but there are not many reviews of the E5-2620. Thanks in advance for any advice on the matter!

Here are the links to the CPUs from Newegg, where I will be buying.

http://www.newegg.com/Product/Produc...82E16819115081

http://www.newegg.com/Product/Produc...82E16819117269
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
Depends heavily on what work you'll be doing.
And how that work is programmed.

8 threads vs 12, but a 1.4 GHz clock penalty is fairly steep.

On the other hand, double the L3 cache will help, again if your work is programmed and optimized properly.

You should share your workload, else it's impossible to figure out which would gain the most.


In short, though:

Thread limited: 1270
Unlimited threads: 2620
 

gplnpsb

Member
Sep 4, 2011
25
0
0
Even in heavily threaded tasks, I wouldn't bet on the E5-2620 significantly outperforming the E3-1270, if at all. The deficit in clock speed is simply that severe.

With an engineering sample of the E5-2650 (clocked at the same 2.00GHz as the 2620, but with 8 cores), I got around 8.35 in Cinebench R11.5. I would expect the 2620 to get around 75% of that score, perhaps a little more depending on turbo multipliers. Thus the E5-2620 might get around 6.3 in the same heavily threaded benchmark. The 2600K gets around 6.8. The E3-1270 would presumably score similarly.
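The estimate above is simple proportional scaling. As a rough sketch, assuming the Cinebench score scales linearly with core count at a fixed clock (an assumption that ignores turbo multipliers and cache differences), the arithmetic looks like this:

```python
# Rough estimate of a 6-core score from an 8-core result at the same clock.
# Assumption: Cinebench R11.5 score scales linearly with core count.
measured_score_8c = 8.35   # E5-2650 engineering sample at 2.00 GHz (from the post)
cores_measured = 8
cores_target = 6           # E5-2620 core count

estimated_score_6c = measured_score_8c * cores_target / cores_measured
reference_2600k = 6.8      # approximate 2600K score quoted in the post

print(f"Estimated E5-2620 score: {estimated_score_6c:.2f}")
print(f"Ratio to 2600K: {estimated_score_6c / reference_2600k:.2f}")
```

The result is only as good as the linear-scaling assumption; a real E5-2620 could land slightly higher or lower.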

There are benefits to the E5-2620 and the LGA2011 platform - they would provide more memory bandwidth - which might, and I emphasize might, impact your computational scores. The platform also features more PCI-E bandwidth, which could be useful for RAID controllers or GPU compute.

Please also note that I only quote one benchmark. I haven't done more in depth testing. Depending on your application, LGA 2011 may be worthwhile, but with the information provided, I doubt it would be.

If it were me, I wouldn't bother with the E5-2620. If you could find one, an E5-1620 (3.6GHz quad core at $300), or E5-1650 (3.2GHz Hex core at $600) might make the jump to LGA 2011 worthwhile. Unfortunately those processors are only sold as OEM tray versions at the moment and don't appear to be in stock. I suggest you go with LGA 1155. I also suggest you consider the E3-1240 (3.3GHz). It has a 3% clockspeed deficit relative to the E3-1270, and is 20% cheaper (266 at Newegg). The E3-1230 at $240 is also worth strong consideration.
 

GammaLaser

Member
May 31, 2011
173
0
0
Are there any mobos that support dual E5-1240 Xeons?

And fyi actually there are dual 2011 mobos out there ....

Actually I can't seem to find such a thing as a Xeon E5-1240. I originally assumed it was an LGA1155 processor, none of which support dual sockets. Some LGA2011 CPUs do of course support dual sockets (and I believe parts with quad socket support are planned).
 

gplnpsb

Member
Sep 4, 2011
25
0
0
Actually I can't seem to find such a thing as a Xeon E5-1240. I originally assumed it was an LGA1155 processor, none of which support dual sockets. Some LGA2011 CPUs do of course support dual sockets (and I believe parts with quad socket support are planned).

Indeed, there is only the Xeon E3-1240, which is an LGA1155 Quad Core part at 3.30GHz.

The Xeon E5-1600 series of LGA2011 processors only supports single socket configurations. The E5-2600 series supports dual socket configurations, and the upcoming E5-4600 series will support quad socket configurations.

crazymonkeyzero, if power consumption is important to you, you may also want to consider waiting a few weeks for the Ivy Bridge based E3-1200 V2 series processors. The E3-1240 V2 is a 3.40GHz part that should be around 265 dollars. It's manufactured on the 22nm process, and the TDP is supposed to be 69W, vs 80W for the E3-1270.
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
Indeed, there is only the Xeon E3-1240, which is an LGA1155 Quad Core part at 3.30GHz.

The Xeon E5-1600 series of LGA2011 processors only supports single socket configurations. The E5-2600 series supports dual socket configurations, and the upcoming E5-4600 series will support quad socket configurations.

crazymonkeyzero, if power consumption is important to you, you may also want to consider waiting a few weeks for the Ivy Bridge based E3-1200 V2 series processors. The E3-1240 V2 is a 3.40GHz part that should be around 265 dollars. It's manufactured on the 22nm process, and the TDP is supposed to be 69W, vs 80W for the E3-1270.

So you would advise him to get the E3 3.4 GHz part without knowing his workload or workload type?

Why not just pick a random cpu?


A large percentage of workloads are threaded, yet never push ANY modern core past 20% utilization.

Therefore 4 extra threads would give him a corresponding gain, depending on the workload type.



Please don't listen to people who want to compare Cinebench scores for server workloads.
It's just so wrong.


There are many situations where a larger cache and more threads will benefit you more than raw clock speed.
Keep this in mind.

Monkey:
Please explain your workload - then it's a lot easier to say which path to choose for optimal performance.
Then you can always make your own judgement on whether the gain is worth the price to you.
 

gplnpsb

Member
Sep 4, 2011
25
0
0
So you would advise him to get the E3 3.4 GHz part without knowing his workload or workload type?

Why not just pick a random cpu?


A large percentage of workloads are threaded, yet never push ANY modern core past 20% utilization.

Therefore 4 extra threads would give him a corresponding gain, depending on the workload type.



Please don't listen to people who want to compare Cinebench scores for server workloads.
It's just so wrong.


There are many situations where a larger cache and more threads will benefit you more than raw clock speed.
Keep this in mind.

Monkey:
Please explain your workload - then it's a lot easier to say which path to choose for optimal performance.
Then you can always make your own judgement on whether the gain is worth the price to you.

While you have a point, there is no need to be rude about it. I remind you that in my first reply I stated this: "Please also note that I only quote one benchmark. I haven't done more in depth testing. Depending on your application, LGA 2011 may be worthwhile, but with the information provided, I doubt it would be."

He stated that his workload is heavily threaded computational work. That doesn't sound like a server workload that doesn't tax cores beyond 20%, though that is entirely possible. To me that suggests something that heavily taxes a modern CPU core, which is why I suggested the higher clocked E3. I acknowledge that I could be entirely wrong here, and you could be correct.

I agree that he needs to provide more information about his workload for a truly accurate recommendation to be made. I also agree that Cinebench is not a reliable measure of server workloads. It is simply a measure of rendering power - how the CPU will perform when its cores are heavily taxed.

MisterMac - I didn't just pick a random CPU. I provided my recommendations with disclaimers, and with references to higher-clocked single-socket six-core Xeons like the E5-1650. I acknowledge that I ought to have emphasized those more, but there was no need to go stating things like "Why not just pick a random cpu?" or "It's just so wrong."
 

crazymonkeyzero

Senior member
Feb 25, 2012
363
0
0
So you would advise him to get the E3 3.4 GHz part without knowing his workload or workload type?

Monkey:
Please explain your workload - then it's a lot easier to say which path to choose for optimal performance.
Then you can always make your own judgement on whether the gain is worth the price to you.


I am using this computer to run software called Gaussian (http://gaussian.com/). It is used for chemistry computations on molecules and is highly threaded, utilizing up to 48gb of memory and, I believe, up to 24 cores/threads. In the past the computers typically run 24/7 for 1-3 months at a time (depending on work load and number of jobs) doing calculations, so I think the CPU will be pushed. We currently have dual-socket dual-core Xeons and dual dual-core Opterons in the lab, but they are aging (about 8 years old now), and I believe current single-CPU solutions can easily beat them in speed.
 

gplnpsb

Member
Sep 4, 2011
25
0
0
I am using this computer to run software called Gaussian (http://gaussian.com/). It is used for chemistry computations on molecules and is highly threaded, utilizing up to 48gb of memory and, I believe, up to 24 cores/threads. In the past the computers typically run 24/7 for 1-3 months at a time (depending on work load and number of jobs) doing calculations, so I think the CPU will be pushed. We currently have dual-socket dual-core Xeons and dual dual-core Opterons in the lab, but they are aging (about 8 years old now), and I believe current single-CPU solutions can easily beat them in speed.

This document from the NIH suggests that performance in 64-bit Gaussian09 for Windows going from 1->2 processing cores increases by about 1.8x, and from 2->4 cores it increases by another 1.8, but from 4->8 cores it increases by only about 1.5x. The test wasn't run with 6 cores.
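Multiplying out those per-doubling gains gives the cumulative speedup and the parallel efficiency at each core count. This is a minimal sketch, assuming the three approximate ratios summarized above simply compound; the numbers are read off the post, not re-measured:

```python
# Per-doubling speedups as summarized in the post (Gaussian09, 64-bit Windows).
# Key: core count reached by that doubling; value: gain over the previous step.
step_speedups = {2: 1.8, 4: 1.8, 8: 1.5}

cumulative = 1.0
results = {}
for cores, step in step_speedups.items():
    cumulative *= step
    # Store (speedup over 1 core, efficiency = speedup / cores).
    results[cores] = (cumulative, cumulative / cores)

for cores, (speedup, efficiency) in results.items():
    print(f"{cores} cores: {speedup:.2f}x speedup, {efficiency:.0%} parallel efficiency")
```

Under these figures efficiency falls from about 90% on 2 cores to about 61% on 8, which is the "imperfect scaling beyond 4 cores" at issue in this thread. Since 6 cores were not tested, any 6-core figure would be an interpolation.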

This message posted on ccl.net suggests that Gaussian also sees significant gains from increases in clockspeed (perhaps around 0.9X scaling with the increase). I wish I could find a more reputable source to corroborate that assertion, but I could not. From what I've seen about Gaussian, and from my own undergrad lab experience in x-ray crystallography, I don't see why you shouldn't see significant gains with clockspeed in Gaussian09.

I suggest you ask around to other labs using the same software, and see if they get full CPU utilization on modern machines. If they do, you will probably be better off with a higher clocked processor with fewer cores, like the Xeon E3s. If they see lower cpu utilization that is well distributed across all cores, MisterMac would be correct in suggesting that the extra threads with the E5-2620 might be worth your while.

Based on a cursory review of the literature on Gaussian, I don't think that going with the E5-2620 would be worth the increase in platform cost, and I believe you might even see a decrease in performance given the reduced clockspeed and imperfect scaling beyond 4 cores. I do emphasize that you need to confer with others in your field to confirm that suspicion.
 

crazymonkeyzero

Senior member
Feb 25, 2012
363
0
0
Well, I guess I should have mentioned this earlier, but I am going to be running Gaussian on Linux, as the Windows version is limited to utilizing only 4gb of memory, and the computer I am planning to build will have 32gb. So the main question here is whether Gaussian favors more cores or higher clock speeds. :/
Our current CPUs run at 1.8-2 GHz, which was considered "fast" in their day. The E5 has 6 cores plus Hyper-Threading, and Turbo Boost enables 2.5 GHz. The E3-1270, although it has only 4 cores with HT, has a huge clock speed advantage at 3.4 GHz, and with turbo I believe it goes up to about 3.8 GHz.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Crazy, the Xeon E5 would be worth it if your particular program benefitted from high I/O and memory bandwidth of the platform over E3. However it doesn't seem that's the case: http://software.intel.com/en-us/art...tel-xeon-processor-5500-series-based-servers/

If you don't need the benefits E5 gives you(memory bandwidth, better I/O, dual socket, reliability), go for the E3.

If I was given a choice between 50% more cores or 50% more clocks, I'd take the latter any day. There's no way the former would beat the latter. In this case, we're talking about 70% clock speed difference vs 50% core difference, that's no contest!
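The percentages in that comparison can be checked with a quick calculation. This sketch uses only the base clocks and core counts quoted in the thread; multiplying cores by GHz is a crude best-case proxy that assumes perfect scaling across cores and ignores turbo, cache, and memory bandwidth:

```python
# Nominal specs quoted in the thread (base clocks, turbo ignored).
cpus = {
    "E3-1270": {"cores": 4, "ghz": 3.4},
    "E5-2620": {"cores": 6, "ghz": 2.0},
}

e3, e5 = cpus["E3-1270"], cpus["E5-2620"]
clock_advantage = e3["ghz"] / e5["ghz"] - 1      # E3 over E5
core_advantage = e5["cores"] / e3["cores"] - 1   # E5 over E3

# Best case for the E5: every extra core contributes fully.
aggregate = {name: c["cores"] * c["ghz"] for name, c in cpus.items()}

print(f"E3 clock advantage: {clock_advantage:.0%}")
print(f"E5 core advantage:  {core_advantage:.0%}")
for name, core_ghz in aggregate.items():
    print(f"{name}: {core_ghz:.1f} core-GHz")
```

Even with perfect core scaling assumed, the E3-1270 comes out ahead on this proxy (13.6 vs 12.0 core-GHz), and real scaling is usually worse than perfect.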
 

tynopik

Diamond Member
Aug 10, 2004
5,245
500
126
first of all i'm going to assume you're talking xeon because you want ECC memory

otherwise, you're looking at the wrong processors

the computer I am planning to build will have 32gb

E3 Xeons only support 4 memory slots of UNbuffered memory, so 32GB would require four 8GB unbuffered ECC dimms, which are hella expensive

utilizing up to 48gb of memory

And if you ever want to go to more than 32GB you'll definitely have to go with the E5 series

and if you can find it, the E5-1620 offers even higher performance (3.6GHz)
 

tynopik

Diamond Member
Aug 10, 2004
5,245
500
126
In the past the computers typically run 24/7 for 1-3 months at a time (depending on work load and number of jobs) doing calculations

seriously, if you're spending that much time on calculations and your time is worth ANYTHING at all, go for BOTH cores AND clockspeed

a pair of E5-2687W processors (8-cores at 3.1ghz) would do nicely. Even if the percentage scaling across more cores isn't great, the absolute time difference is still going to be significant
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
CrazyMonkey:

I have to wonder how you're able to do this work with only the budget for building a workstation-type machine.

As tynopik suggests, there are some mighty good deals on dual-socket 1U servers all around with some of the higher-clocked E5s.
TBH I'm not finding much on how Gaussian scales with more threads/cores.
It's not very well documented for a science-based application.

I'll sell my left nut, however, if Gaussian isn't written with vectorization/step comparisons in mind - so the E5's cache WILL benefit it highly.

If you look at the link gplnpsb shared.
(Shame on you for looking at a memory requirement benchmark and comparing that to actual thread scaling - seriously.)
What it looks like is more threads = less memory requirement.
What it probably is, however, is the extra amount of L3 cache on the higher "grade" 8-core processors - and also faster cache.
Step calculations can be done faster inside the cache before having to be stored in RAM - thus less access to and storage in memory is needed.


What I'd do, if I were you, is:

Note your budget.
Get the most threads/cache and the highest clock within your budget.

That unfortunately means the E5 route.
For balance I'd suggest the E5-2660.

You get 20 MB of cache for any heavy or light workload and still benefit from a decent clock speed.


Of course, if your budget allows it... a 2687W is a monster beast.
Particularly in a dual-socket 1U box.
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
Oh, also:

The E5-1600 series and the E3s do not support dual processors.

That's something I'd say doesn't favour your workload.


Get a tower/1U box with a 2-socket motherboard and 32 gigs of RAM, or 16, or whatever.
Plug in a 2660.
Down the road, buy some more RAM and plug in a second 2660.

Save system cost and keep single-system scalability.
 

gplnpsb

Member
Sep 4, 2011
25
0
0
CrazyMonkey:

...

If you look at the link gplnpsb shared.
(Shame on you for looking at a memory requirement benchmark and comparing that to actual thread scaling - seriously.)
What it looks like is more threads = less memory requirement.
What it probably is, however, is the extra amount of L3 cache on the higher "grade" 8-core processors - and also faster cache.
Step calculations can be done faster inside the cache before having to be stored in RAM - thus less access to and storage in memory is needed.

I'm not sure how you got that more threads = less memory requirement. The graph shows that with the least amount of allocated memory (32 MB), the completion time penalty becomes progressively more severe with a larger number of threads. The data is also a valid comparison for thread scaling - note that the caption clearly states that different memory allocations were tested with different numbers of processors. It basically shows that beyond a few hundred megabytes of memory allocation, there is little effect on completion time. The rest of the observable effect results from the number of processors used.

What I'd do, if I were you, is:

Note your budget.
Get the most threads/cache and the highest clock within your budget.

That unfortunately means the E5 route.
For balance I'd suggest the E5-2660.

You get 20 MB of cache for any heavy or light workload and still benefit from a decent clock speed.


Of course, if your budget allows it... a 2687W is a monster beast.
Particularly in a dual-socket 1U box.
+1
 

blckgrffn

Diamond Member
May 1, 2003
9,651
4,229
136
www.teamjuchems.com
Seriously, if this runs on a dual socket, 1.8 ghz Xeon and you have x many dollars to replace them, my vote is to scale out instead of up with the least expensive platform that nets you the ECC ram capacity required.

At the very least, consider that you might get 10 lesser systems versus three greater systems - how much do you value compute density? It sounds like you run multiple computations per box anyway (which I understand to mean are independent jobs running concurrently?), so scaling out should be feasible.

You should perhaps think of the aggregate throughput of the workstations you acquire in addition to the per-machine throughput - and its effect on your budget - that is my main point
 

tynopik

Diamond Member
Aug 10, 2004
5,245
500
126
At the very least, consider that you might get 10 lesser systems versus three greater systems - how much do you value compute density? It sounds like you run multiple computations per box anyway (which I understand to mean are independent jobs running concurrently?), so scaling out should be feasible.

You should perhaps think of the aggregate throughput of the workstations you acquire in addition to the per-machine throughput - and its effect on your budget - that is my main point

excellent point

how many simultaneous jobs are you running? perhaps a bunch of E3-1230 xeons might be the way to go
 

tynopik

Diamond Member
Aug 10, 2004
5,245
500
126
This document from the NIH suggests that performance in 64-bit Gaussian09 for Windows going from 1->2 processing cores increases by about 1.8x, and from 2->4 cores it increases by another 1.8, but from 4->8 cores it increases by only about 1.5x.

It also mentions this:

Gaussian uses several scratch files in the course of its computation. These include the checkpoint file (*.chk), the read-write file (*.rwf), the two-electron integral file (*.int), and the two-electron integral derivative file (*.d2e). These files can become extremely large, and because the program is accessing them constantly, I/O speed is a factor in performance.

I wouldn't be surprised if these tests were run on a mechanical drive and the 8 CPUs were saturating the I/O.

Purely speculation on my part, but it would be interesting to see how much of a difference an SSD would make in:
1. speed in general
2. scaling to more CPUs
 

crazymonkeyzero

Senior member
Feb 25, 2012
363
0
0
Well, my budget, or rather the lab's budget allocated for the workstation, is about 2000 dollars give or take, so any 8-core Xeon is out of the question, right? I have not even considered dual sockets either (if anyone thinks this would work within my budget, please suggest specific CPUs and a mobo). The system I have planned so far is as follows.

Intel Xeon 1270 $240
http://www.newegg.com/Product/Produc...82E16819115081
Asus P8B WS mobo $220
http://www.newegg.com/Product/Produc...82E16813131725
Kingston 4x8gb DDR3 1333 memory $270
http://www.newegg.com/Product/Produc...82E16820139077
Seasonic X Series 560W Gold PSU $130
http://www.newegg.com/Product/Produc...82E16817151098
Crucial M4 SSD 256 gb (used for gaussian/linux) $300
http://www.newegg.com/Product/Produc...82E16820148443
Western Digital 500 gb HDD ( used for windows stuff) $100
http://www.newegg.com/Product/Produc...82E16820148443
Noctua 92mm cooler $60
http://www.newegg.com/Product/Produc...82E16835608016
Antec P280 case $120
http://www.newegg.com/Product/Produc...82E16811129179
Windows 7 64 pro $140
Generic DVD drive $30
Bunch of high quality case fans (noctua) $100

So the total is about $1950, with tax and shipping.
Pretty much I'm hoping this computer is at least 3 times faster than the 8-year-old ones, which run dual-socket dual-core Xeons and Opterons at 1.8-2 GHz with no Hyper-Threading.
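For anyone checking the budget, the listed prices can be tallied as below. This is only a sketch of the arithmetic: the part names and prices are copied from the list above, and the tax-and-shipping figure is inferred from the quoted total rather than stated anywhere.

```python
# Planned parts and prices as listed in the post (USD, before tax and shipping).
parts = {
    "Intel Xeon E3-1270": 240,
    "Asus P8B WS motherboard": 220,
    "Kingston 4x8GB DDR3-1333": 270,
    "Seasonic X Series 560W PSU": 130,
    "Crucial M4 256GB SSD": 300,
    "Western Digital 500GB HDD": 100,
    "Noctua 92mm cooler": 60,
    "Antec P280 case": 120,
    "Windows 7 64 Pro": 140,
    "Generic DVD drive": 30,
    "Noctua case fans": 100,
}

subtotal = sum(parts.values())
quoted_total = 1950   # "with tax and shipping", per the post
budget = 2000         # approximate lab budget, per the post

implied_extras = quoted_total - subtotal   # inferred tax + shipping
headroom = budget - quoted_total

print(f"Parts subtotal: ${subtotal}")
print(f"Implied tax and shipping: ${implied_extras}")
print(f"Headroom under budget: ${headroom}")
```

The SSD, Windows license, and case fans together account for $540 of the subtotal, so those are the lines with the most room to trade for CPU or platform budget.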
 