Testing for programmers

Conficio · Nov 19, 2010

Anand et. al,
while I love all your tests, I'm generally missing some tests that show the performance characteristics of various components (CPU, motherboards, chipsets, HDD, SSD, etc.) for programming purposes.

Gaming and entertainment is nice but I'd really like to see more coverage of tests that pertain to IDE usage and compile as well as program execution with logging enabled, (small web app and some native app), automated test execution, etc.

If you don't want to go that way, is there any resource you'd recommend where to find such testing?

ivan2 · Nov 19, 2010

programming is all arithmetic operations, i believe office mark can indicate the performance.

Cogman · Nov 19, 2010

The biggest determiner of compiler performance is small file access time, which is almost completely HDD dependent. If you want to see a huge boost in compiling speeds, look at the difference between spindle drives and SSDs. The closest comparison to what happens in a compiler will probably be light compression programs (IE, zipping a bunch of files). The cpu does play a role, but not too big of one.

That is the hard thing about compiling, no other application really compares to it. If you want random benchmarks, then I would suggest http://www.phoronix.com/ they generally do some pretty strange benchmarks that most sites don't do. ( I know they have done compiling benchmarks, I just don't know where.)

Train · Nov 19, 2010

Tom's harware used to use compiling of the linux kernal as a regular benchmark. I'm not sure if they still do, I havent checked their site in years.

Markbnj · Nov 19, 2010

It's not really relevant, imo, other than storage access times which Cogman mentioned. 98% of the rest of what programming tools do is wait for something to happen.

Conficio · Nov 22, 2010

Markbnj said:
98% of the rest of what programming tools do is wait for something to happen.

If that would be the case I would have not asked.

In my experience waiting for input is true for web browsers too and still they are part of Anand's test suite, aren't they?

In my daily programming I'm the one waiting for the system to respond. Respond to syntax previews, code completion, doc requests, automated tests, web server restarts and auto builds.

Conficio · Nov 22, 2010

Cogman said:
The biggest determiner of compiler performance is small file access time, which is almost completely HDD dependent. If you want to see a huge boost in compiling speeds, look at the difference between spindle drives and SSDs. The closest comparison to what happens in a compiler will probably be light compression programs (IE, zipping a bunch of files). The cpu does play a role, but not too big of one.

I suspect too that small writes for edits and compile of smaller classes, play a major role. However there is also a lot of larger writes going on, for compile, compression and a lot of reads for loading into programs when starting the app for testing.

Cogman said:
That is the hard thing about compiling, no other application really compares to it. If you want random benchmarks, then I would suggest http://www.phoronix.com/ they generally do some pretty strange benchmarks that most sites don't do. ( I know they have done compiling benchmarks, I just don't know where.)

Thanks, they still have some SQLLite, Postgres, and PHP compile benchmarks.

Markbnj · Nov 22, 2010

Conficio said:
If that would be the case I would have not asked.

In my experience waiting for input is true for web browsers too and still they are part of Anand's test suite, aren't they?

In my daily programming I'm the one waiting for the system to respond. Respond to syntax previews, code completion, doc requests, automated tests, web server restarts and auto builds.

I would assume Anand includes web browsers due to the rendering pass, not their input handling.

It may seem to you that you're waiting on your programming tools to finish their work, and in some situations you certainly are (compilation and linking, writing resources, etc). But these are peaky tasks that are almost totally i/o bound, and they just don't make good benchmarks for any particular component. There are better ways to test cpu, memory, video, and disk, so why add programming tools to the test suite? The purpose of the tests is not to inform us what platform will run our tools the fastest.

Conficio · Nov 29, 2010

Markbnj said:
It may seem to you that you're waiting on your programming tools to finish their work, and in some situations you certainly are (compilation and linking, writing resources, etc). But these are peaky tasks that are almost totally i/o bound, and they just don't make good benchmarks for any particular component. There are better ways to test cpu, memory, video, and disk, so why add programming tools to the test suite? The purpose of the tests is not to inform us what platform will run our tools the fastest.

I see your point, but in reality that is what I do with my system the most, programming.

First, if autocomplete in the editor/IDE is slow than that is not so much a peaky task (in the seconds range) but rather an occurance that is annoying [on a quad core 8GB system].

Second if those peaky tasks can shave off a few seconds that is something to consider if you do them hundreds of times a day.

While such things make not a good benchmark for any particular *component*, they do make a great benchmark for a *system* (the combination of components). For example in some of the hard drive (SSD) reviews (as we are talking a lot of I/O here) it is mentioned that the Intel ICH does seem to have better SATA througput than many other solutions. That is a valid datapoint.

And for lack of tests I have no idea if there is any improvement gained in busy IDEs from a better graphics card, even if I don't look at much 3D and suspect no shading goes on. But then all the (to me useless) eyecandy in modern OSes / Windowing systems might just require this in order to perform decent.

Thanks for really understanding why I want such benchmarks as part of the system tests (MoBos, Hard disks, laptops, etc.). I guess you just don't agree that it is so important.

Markbnj · Nov 29, 2010

Conficio said:
I see your point, but in reality that is what I do with my system the most, programming.

First, if autocomplete in the editor/IDE is slow than that is not so much a peaky task (in the seconds range) but rather an occurance that is annoying [on a quad core 8GB system].

Second if those peaky tasks can shave off a few seconds that is something to consider if you do them hundreds of times a day.

While such things make not a good benchmark for any particular *component*, they do make a great benchmark for a *system* (the combination of components). For example in some of the hard drive (SSD) reviews (as we are talking a lot of I/O here) it is mentioned that the Intel ICH does seem to have better SATA througput than many other solutions. That is a valid datapoint.

And for lack of tests I have no idea if there is any improvement gained in busy IDEs from a better graphics card, even if I don't look at much 3D and suspect no shading goes on. But then all the (to me useless) eyecandy in modern OSes / Windowing systems might just require this in order to perform decent.

Thanks for really understanding why I want such benchmarks as part of the system tests (MoBos, Hard disks, laptops, etc.). I guess you just don't agree that it is so important.

I understand why they are important to you, but if you turn it around and look at it from the point of view of the benchmark designers and their goals, there are just better ways to go about it.

If you're regularly getting autocompletion delays in the seconds then you have something else wrong. I would try reinstalling Visual Studio, or whatever IDE you use.

Cogman · Nov 29, 2010

Markbnj said:
I understand why they are important to you, but if you turn it around and look at it from the point of view of the benchmark designers and their goals, there are just better ways to go about it.

If you're regularly getting autocompletion delays in the seconds then you have something else wrong. I would try reinstalling Visual Studio, or whatever IDE you use.

Even auto completion delays are going to be more the result of slower HD speed (Usually the result of having a cache miss). CPU speed may play a role, but not a major one. (not with newer CPUs).

The issue is that programming tools don't stress a particular component. That is what hardware reviews are all about, stressing particular components and getting reliable repeatable results that can be used to determine components speed.

If you look at most hard drive reviews, they don't review things like game FPS (and they shouldn't) or decompression speed. Those things depend on other factors besides the hard drive. In fact, the hard drive is a particularly easy to measure component, there are only 4 factors that matter, sequential read speed, sequential write speed, random read speed, and random write speed. Those are things that are very easy to measure very accurately. They translate directly to real world performance as well.

CPUs and GPUs, on the other hand, have no such simple measurements, hence the reason that you will frequently see a whole salvo of software used to test these components and not some standard synthetic test suite. These parts can have vastly different results all depending on how the software was programmed to use them.

Software development tools are almost completely I/O dependent. So the only test you would use them for is to see how fast the hard drive is. The problem is, there isn't really a good program that you can say "Compile this to see how good the hard drive is!". The thing is, you could have your software stored sequentially or it could be distributed throughout the hard drive. From a reviewers standpoint, it is nearly impossible to change this. So from hardware to hardware, setup to setup, you are going to have vastly differing speeds.

Not only that, but there are vast variances in things like "Where is the hard drive head when the compiler starts?" or "Was some file already compiled which gives a speed up?". The repeatability of the tests is staggeringly bad, and worse with larger projects. All for what? Something that a specialty program can already give precise and repeatable results for.

The next problem comes from "What sort of tests could be performed?". You mentioned intelisense. How, exactly, would you perform a test for that? Just about the only thing that would be measurable is the compile time. What would a reviewer be able to say beyond "It felt pretty snappy."

You say you want full system testing, but do you really? Do you really want anand to go through every permutation of hardware combinations available and test how the software suite works on them? It is impossible to do, there are at least 10,000! combinations (the ! is a factorial).

If you ever run a review site, go ahead and test compile times. But we are already telling you, they are going to be closely linked to hard drive speed while being varied enough to give you no conclusions about anything.

Any_Name_Does · Nov 29, 2010

Cogman said:
Even auto completion delays are going to be more the result of slower HD speed (Usually the result of having a cache miss). CPU speed may play a role, but not a major one. (not with newer CPUs).

The issue is that programming tools don't stress a particular component. That is what hardware reviews are all about, stressing particular components and getting reliable repeatable results that can be used to determine components speed.

If you look at most hard drive reviews, they don't review things like game FPS (and they shouldn't) or decompression speed. Those things depend on other factors besides the hard drive. In fact, the hard drive is a particularly easy to measure component, there are only 4 factors that matter, sequential read speed, sequential write speed, random read speed, and random write speed. Those are things that are very easy to measure very accurately. They translate directly to real world performance as well.

CPUs and GPUs, on the other hand, have no such simple measurements, hence the reason that you will frequently see a whole salvo of software used to test these components and not some standard synthetic test suite. These parts can have vastly different results all depending on how the software was programmed to use them.

Software development tools are almost completely I/O dependent. So the only test you would use them for is to see how fast the hard drive is. The problem is, there isn't really a good program that you can say "Compile this to see how good the hard drive is!". The thing is, you could have your software stored sequentially or it could be distributed throughout the hard drive. From a reviewers standpoint, it is nearly impossible to change this. So from hardware to hardware, setup to setup, you are going to have vastly differing speeds.

Not only that, but there are vast variances in things like "Where is the hard drive head when the compiler starts?" or "Was some file already compiled which gives a speed up?". The repeatability of the tests is staggeringly bad, and worse with larger projects. All for what? Something that a specialty program can already give precise and repeatable results for.

The next problem comes from "What sort of tests could be performed?". You mentioned intelisense. How, exactly, would you perform a test for that? Just about the only thing that would be measurable is the compile time. What would a reviewer be able to say beyond "It felt pretty snappy."

You say you want full system testing, but do you really? Do you really want anand to go through every permutation of hardware combinations available and test how the software suite works on them? It is impossible to do, there are at least 10,000! combinations (the ! is a factorial).

If you ever run a review site, go ahead and test compile times. But we are already telling you, they are going to be closely linked to hard drive speed while being varied enough to give you no conclusions about anything.

Well said,described,explained.

degibson · Nov 30, 2010

Cogman said:
Even auto completion delays are going to be more the result of slower HD speed (Usually the result of having a cache miss). CPU speed may play a role, but not a major one. (not with newer CPUs).

Time to monkey-wrench the machinery.

HDs are slow, yes, but HDs are cached. If you're actually hitting the disk after the first compile, then you're operating on more source than the entire linux kernel (ballpark 10M LOC * 80 char/line). The real problem is more likely to be the file system, not the HD specifically.

Cogman · Nov 30, 2010

degibson said:
Time to monkey-wrench the machinery.

HDs are slow, yes, but HDs are cached. If you're actually hitting the disk after the first compile, then you're operating on more source than the entire linux kernel (ballpark 10M LOC * 80 char/line). The real problem is more likely to be the file system, not the HD specifically.

Not really. Cache (generally) works on principles of spacial locality, in other words, recent data that is near other data is what gets loaded into the cache. The is pretty much how it must be for hard drive caches. It will generally grab one chunk of data here and store it in cache, and another chunk there and store it in cache.

Now, if the user is getting data that is very spread out and small throughout the cache, the hard drive has no choice but to dump the cache it has and go get new data. That nearly perfectly describes what auto completion technology does (though it can be made much more efficient by creating a library of references and loading all those into ram, which is what visual studios does to some extent). It is going through lots of small files looking for matching references, an arduous task.

Couple that with the fact that HD caches are generally pretty small, and you have a nightmare if the user ever goes out and runs another program besides the one he is working on in visual studios. (or one that isn't superfetched.)

I'm not saying that hard drive caching doesn't help, it does tremendously (that is why they have it). But ultimately the hard drive cache itself is pretty slow compared to flash drives, and provides little benefit to random read access. and minimal benefit to sequential read access. If the data you are accessing is over 8 mb, and not in a sequence on the hard disk, you can almost bet that you are getting hard drive cache misses. Not so unlikely when you think of the prospect of dealing with every library and reference in the .Net architecture. (Again, this is why they try to load this stuff into the ram at startup).

CPU speed can have some effect on auto sensing, but I would suspect that hard drive speed still plays a fair to major role (especially for libraries and function the user writes vs built in libraries.)

Markbnj · Nov 30, 2010

All interesting, but... auto-completion times in the seconds? Must be a huge project. My system is far from leading edge (E8500, WD SATAIII drive, 8 GB ram), and I don't ever see delays of that magnitude.

Just for fun I loaded up a solution I'm currently working on (approx. 25 projects, 7400 files) and messed around with auto-completion intentionally typing in names from distant namespaces that had not previously appeared in the document I was editing. The response was instant, with very little disk activity or CPU used. This would tend to support the idea that VS is caching the symbols, and in fact it may have to do that anyway for debugging purposes.

So either the OP has some other issues, or he's working on one heck of a large solution, and might want to consider breaking it up.

Cogman · Nov 30, 2010

Markbnj said:
All interesting, but... auto-completion times in the seconds? Must be a huge project. My system is far from leading edge (E8500, WD SATAIII drive, 8 GB ram), and I don't ever see delays of that magnitude.

Just for fun I loaded up a solution I'm currently working on (approx. 25 projects, 7400 files) and messed around with auto-completion intentionally typing in names from distant namespaces that had not previously appeared in the document I was editing. The response was instant, with very little disk activity or CPU used. This would tend to support the idea that VS is caching the symbols, and in fact it may have to do that anyway for debugging purposes.

So either the OP has some other issues, or he's working on one heck of a large solution, and might want to consider breaking it up.

Or another problem, he isn't using visual studios. Visual studios auto completion is very well programmed, so it doesn't surprise me that it works without hickups for you. This can't be said for every IDE out there (*cough* eclipse)

It really wouldn't surprise me if VS was loading all references into ram on a project load, in fact, it would make sense (really, how much space can it take? Even though they probably use some sort of hash table, I'll bet that memory consumption due to reference data is like 20 mb max, even for a large project.)

More, my point in the above note was that if auto completion is done by looking at the hard drive at all, the hard drive will most likely be the bottleneck, especially for large project.

Markbnj · Nov 30, 2010

Cogman said:
More, my point in the above note was that if auto completion is done by looking at the hard drive at all, the hard drive will most likely be the bottleneck, especially for large project.

Agreed, and in fact disk storage is the choke point for most things these days, at least those that aren't bound on network i/o. When flash drives come down to commodity pricing levels that will be it for spinning platters.

Cogman · Nov 30, 2010

Markbnj said:
Agreed, and in fact disk storage is the choke point for most things these days, at least those that aren't bound on network i/o. When flash drives come down to commodity pricing levels that will be it for spinning platters.

Yep, the biggest issue that flash drives have right now is their price. I would love to get one, but can't justify the cost/benefits ratio. It is almost inevitable that they will eventually be cheaper/more dense than hard drive as they have much larger room for capacity growth for their power budget.

degibson · Nov 30, 2010

Cogman said:
Not really. Cache (generally) works on principles of spacial locality... (snip)

Caches work for spatial or temporal locality. Compile the same thing more than once, after the first time, it hits in the cache.

Couple that with the fact that HD caches are generally pretty small, and you have a nightmare if the user ever goes out and runs another program besides the one he is working on in visual studios. (or one that isn't superfetched.)

On-disk caches are tiny. But modern file systems use empty RAM as a cache of disk.

Conficio · Dec 1, 2010

Hmm, I'm surprised that for all the clear answers of the kind "dev environments need not platform testing" we have such a variety of opinions what actually drives the performance (or lack thereof).

The autocomplete/syntax help was (only one) an example and yes my projects are large (>8500 java files, ~>8500 class files in jars and source zips) and I'm using Eclipse. And yes there is quite some code generation going on as well.

If I look at a typical dev cycle for some server side app, you are working with edit, compile, restart server (if changes don't permit hot loading), running a unit test, may be debugging, rinse and repeat.

I can envision some automated performance tests of IDEs and a typical dev cycle based on some of the functional testing tools, such as Visual Studio Functional UI testing http://blogs.msdn.com/b/dannawi/archive/2009/05/06/visual-studio-2010-function-ui-testing.aspx or SWTBot http://www.eclipse.org/swtbot/

If such tools would be run against a fairly large project and with no pause, the results would give some real world indication of what system does perform better than others.

It might turn out that this is not a benchmark that needs to be run with every system, but lets find out instead of speculate based on (well resoned) assumptions.

Thanks to all that replied, I found the discussion alone very interesting.

Testing for programmers

Junior Member

Diamond Member

Lifer

Lifer

Elite Member <br>Moderator Emeritus

Junior Member

Junior Member

Elite Member <br>Moderator Emeritus

Junior Member

Elite Member <br>Moderator Emeritus

Lifer

Member

Golden Member

Lifer

Elite Member <br>Moderator Emeritus

Lifer

Elite Member <br>Moderator Emeritus

Lifer

Golden Member

Junior Member