Benchmark comparison test. David H. Ahl.
For years, the national speed limit has been 55 mph and drag racing has been outlawed on public roads. Yet two key measures in automobile road tests are the 0 to 60 time and the standing-start quarter-mile elapsed time. In a sense, both measures are unrealistic because they do not reflect the way a car is actually used, and no one would think of buying a new car based on these measures alone. But they are valuable as comparative measures of one aspect of the performance of different cars.
Likewise, the benchmark program presented here is not representative of the way computers are actually used; it measures only a few aspects of performance, and no one should buy a computer based solely on the results of these measures. Yet the results provide some interesting comparative data.
The Benchmark Program
The program is just six lines long on computers which permit multiple statements on one line. On computers permitting only one statement per line, such as the TI 99/4A, the program is still only 14 lines long. Thus, it is quick and easy to use.
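For reference, here is a minimal sketch of the six-line version, reconstructed from the description that follows; the exact published listing may differ in detail, and the RND argument should be adjusted to your machine's dialect (see below):

10 REM BENCHMARK TEST (RECONSTRUCTION)
20 FOR N=1 TO 100: A=N
30 FOR I=1 TO 10: A=SQR(A): R=R+RND(1): NEXT I
40 FOR I=1 TO 10: A=A*A: R=R+RND(1): NEXT I
50 S=S+A: NEXT N
60 PRINT "ACCURACY ";ABS(1010-S/5);" RANDOM ";ABS(1000-R)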
Before entering the program, it is important to know how the random number function works on the computer to be tested. The RND function used must produce different numbers between 0 and 1. On some computers, RND(0) produces different numbers and RND(1) always produces the same one; on other computers, these functions are reversed. Some computers require only RND with no argument at all. In the benchmark program, be sure to use the function that produces different numbers or you will get erroneous results.
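A one-line check settles the question on dialects that accept both argument forms (drop the argument on machines that allow none); whichever column prints changing values is the form to use:

10 FOR I=1 TO 5: PRINT RND(0), RND(1): NEXT I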
What It Measures
The program measures three things, the most important of which is execution speed in Basic. This speed measure is the time required to execute a main loop 100 times (lines 20 to 50) with two ten-iteration loops inside (lines 30 and 40).
The first inside loop (line 30) takes the square root of the value of the outside loop index (N) ten times. In other words, if the program is executing the 49th outside loop, the first square root (of 49) is 7, the second (of 7) is 2.6457513, the third (of 2.6457513) is 1.6265765, and so on. These successive values are stored in A.
The second inside loop (line 40) then takes the final value of A from the first loop and squares it ten times. If the computer were absolutely accurate, the final result would equal the starting value of the index. For instance, in the above example, the final value should be 49.
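To watch the round trip on your own machine, a short fragment (separate from the benchmark itself) prints the successive roots and squares for the N=49 example, along with the resulting error:

10 A=49
20 FOR I=1 TO 10: A=SQR(A): PRINT A: NEXT I
30 FOR I=1 TO 10: A=A*A: PRINT A: NEXT I
40 PRINT "ROUND-TRIP ERROR ";ABS(49-A)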
In line 50, all the values of A from 1 to 100 are summed and stored in S. The integers 1 to 100 when added together equal 5050, so S divided by 5 should equal 1010. Line 60 prints out how much S/5 differs from the correct total of 1010. This is a measure of accuracy.
One might argue that this is not a very good measure of accuracy, since one value of A might be high and another low, and the errors would cancel out. Indeed, this is what happens to some extent. However, checking four computers with a more sophisticated (and much longer) measure of accuracy gave very similar results to this much simpler approach.
The last thing measured, the quality of the random number generator, is the least reliable. In this case, the program simply adds the values of 2000 random numbers, keeping the running total in R. The values should be randomly distributed between 0 and 1. More simply, half of the values should be between 0 and 0.5 and the other half between 0.5 and 1. Thus the total of 2000 numbers should be 2000 × 0.5, or 1000. Line 60 prints out how much R differs from this theoretical value of 1000.
Since random numbers should be truly random, one cannot say that 2000 of them should be perfectly distributed around 0.5. Indeed, if a random number generator simply produced 0.1 and 0.9 alternately, it would give a perfect result in this test, yet it would hardly be producing random numbers.
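A contrived fragment makes the point (hypothetical, and written for Microsoft-style Basics with IF...THEN...ELSE): alternating 0.1 and 0.9 sums to almost exactly 1000 and so earns a perfect score on this measure:

10 R=0
20 FOR I=1 TO 2000
30 IF I=INT(I/2)*2 THEN R=R+0.9 ELSE R=R+0.1
40 NEXT I
50 PRINT ABS(1000-R)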
On the other hand, the majority of programs that use random numbers do not use 500 numbers, much less 2000. Thus, from a purely pragmatic point of view, it is desirable that random numbers be uniformly distributed right from the start. Hence, this measure has some value, although it must be taken with a grain of salt.
Reading the Results
Naturally, the faster the execution time, the better. Times around two minutes were average. Anything under one minute is excellent while times over four minutes are quite slow.
The measure of accuracy should be as close to zero as possible. In the chart, all the exponential values (2.018 E-07, for example) have been converted to decimals. A value of 0.001 is about the norm. Anything larger than that (0.18, for example) is poor, while smaller values (0.00000021, for example) are very good.
The measure of randomness should also be as close to zero as possible. Anything under 10 is quite good. Values between 10 and 20 are acceptable. Values over 20 are not as good, but the random number generator may still be producing acceptably random numbers. In the chart, the numbers have been rounded to one decimal place.
As with nearly everything, there are tradeoffs. Some computers are fast, but not especially accurate. Others are accurate but slow. Those that rank high on several facets tend to be more expensive than those that do not. Some of the slower units do all their calculations in double precision and can gain a bit of speed by specifying single precision variables; of course, this speed improvement will be at the expense of accuracy.
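On Microsoft-style Basics that support type declaration statements (availability varies by machine), one line placed ahead of the program makes the choice explicit:

5 DEFSNG A-Z: REM SINGLE PRECISION: FASTER, LESS ACCURATE (DEFDBL A-Z FOR THE REVERSE)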
Actual Results
The fastest computer tested to date is the Olivetti M20, a 16-bit computer using Microsoft-like Basic with PCOS, a proprietary operating system. However, its accuracy is nothing to brag about.
Nevertheless, the M20 is about twice as fast as the IBM PC which has virtually the same accuracy. In the same price and speed range, the Computer Devices DOT is marginally faster than the IBM PC, but has twice the accuracy.
The Vectrex tested was a prototype add-on to the GCE Vectrex video game unit; production units may not be exactly the same. The Laser 2001, an Apple work-alike from Hong Kong, was also a prototype.
Fastest computer in the low price category, and also one of the most accurate, is the Panasonic JR200. At $300, this is a remarkable performer.
The Aquarius and NEC 8201 were the least accurate of the computers tested. It is interesting to compare the virtually identical TRS-80 Model 100 to the NEC 8201; accuracy of the Model 100 is considerably better because all calculations are done with double-precision numbers, but speed suffers badly.
Most of the 6502-based machines complete the timed loop in roughly the same time and produce identical accuracy figures. This includes the Apple, Vic, and Commodore 64. Likewise, the Z80- and 6800-based machines are quite similar within families.
The calculator-like machines, the TI CC-40 and Casio FP-200, are among the slowest. However, the TI more than makes up for its slow speed with its excellent accuracy, whereas the Casio does not.
Slowest of all the machines tested is the Atari 800 (and identical 400). On the other hand, Atari Basic has some features not found in some other Basic interpreters which may partially make up for its leisurely performance.
Additional Entries
In updates of this chart, we would like to include as many computers as possible. We would be pleased to receive benchmark results from readers who have machines that we have not listed. We would be especially interested to include the results of the test on minicomputers and mainframes. Be sure to use an accurate stopwatch for timing. Alternatively, if your computer has a real-time clock, as the NEC 8201 does, it can be used.
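On dialects that provide a TIME$ function (the Model 100 and NEC 8201 do), two extra lines bracketing the benchmark will capture the start and end times without a stopwatch:

15 PRINT "START ";TIME$
70 PRINT "END ";TIME$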