[Future Technology Research Index] [SGI Tech/Advice Index] [Nintendo64 Tech Info Index]


[WhatsNew] [P.I.] [Indigo] [Indy] [O2] [Indigo2] [Crimson] [Challenge] [Onyx] [Octane] [Origin] [Onyx2]

Ian's SGI Depot: FOR SALE! SGI Systems, Parts, Spares and Upgrades

(check my current auctions!)

250MHz R10000 Performance Comparison
Between O2, Octane and Origin2000

Last Change: 11/Aug/1998

SPEC's Introduction to SPEC95

SPECfp95 Analysis

SPECint95 Analysis


(Note: the 2D bar graphs shown here for the various SPEC95 tests have been drawn to the same scale)
(the graphs are also to the same scale as those given on other R10000 comparison pages)

250MHz R10000 SPECfp95 Performance Comparison

The study given here is similar to the 195MHz R10000 discussion. Since the same concepts apply, I won't repeat the background details; please see the 195 page for the detailed observations, architectural discussions, elucidation of issues relating to cache access and memory latency, etc.

As before, there is a 3D Inventor model of the data available; screenshots of this are included below. You can download the 3D model (822 bytes gzipped) if you wish: load the file into SceneViewer or ivview and switch into Orthographic mode (i.e. no perspective). Rotate the object 30 degrees horizontally and then 30 degrees vertically (use the Roty and Rotx thumbwheels) - that'll give you the standard isometric view. I actually found that slightly smaller angles (15 or 20 degrees) make things a little clearer, so feel free to experiment. Note that newer versions of popular browsers may be able to load and show the object directly, although such browsers may not offer Orthographic viewing.

All source data for this analysis came from www.specbench.org.

Given below is a comparison table of the various R10000/250 SPECfp95 test results. Faster systems are leftmost in this table (in the Inventor graph, they're placed at the back). After the table and 3D graphs is a short-cut index to the original results pages for the various systems.

Key:

O2000 = Origin2000
      System:   O2000    Octane    O2
      L2:        4MB      1MB      1MB

      tomcatv    34.6     29.4     10.2
      swim       50.0     46.3     14.4
      su2cor     15.6     11.2     5.40
      hydro2d    16.6     11.4     3.26
      mgrid      23.5     18.5     7.26
      applu      14.4     13.2     6.49
      turb3d     19.4     16.9     11.1
      apsi       21.1     16.0     11.6
      fpppp      37.8     37.1     37.2
      wave5      33.7     27.4     12.8

SPECfp95 Comparison Table for MIPS R10000 250MHz

[Left Isometric View] [Right Isometric View]

(click on the images above to download larger versions of the views shown)

[Test Suite Description | O2000 | Octane | O2]
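As a rough aid to reading the SPECfp95 table above, the following Python sketch divides each Origin2000 and Octane figure by the corresponding O2 figure. The numbers are copied from the table; the names `SPECFP95` and `ratios_vs_o2` are purely illustrative and not part of any SPEC tool.

```python
# SPECfp95 results for R10000/250, copied from the table above.
# Tuple order: (Origin2000, Octane, O2).
SPECFP95 = {
    "tomcatv": (34.6, 29.4, 10.2),
    "swim":    (50.0, 46.3, 14.4),
    "su2cor":  (15.6, 11.2, 5.40),
    "hydro2d": (16.6, 11.4, 3.26),
    "mgrid":   (23.5, 18.5, 7.26),
    "applu":   (14.4, 13.2, 6.49),
    "turb3d":  (19.4, 16.9, 11.1),
    "apsi":    (21.1, 16.0, 11.6),
    "fpppp":   (37.8, 37.1, 37.2),
    "wave5":   (33.7, 27.4, 12.8),
}


def ratios_vs_o2(results):
    """Return {test: (origin_ratio, octane_ratio)}, each relative to O2."""
    return {
        test: (o2000 / o2, octane / o2)
        for test, (o2000, octane, o2) in results.items()
    }


if __name__ == "__main__":
    for test, (r_o2000, r_octane) in ratios_vs_o2(SPECFP95).items():
        print(f"{test:8s}  O2000/O2 = {r_o2000:4.2f}   Octane/O2 = {r_octane:4.2f}")
```

On these figures fpppp is essentially identical across all three systems (ratios close to 1), while hydro2d shows the widest gap between O2 and the other two machines.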


Next, a separate 2D comparison graph for each of the ten SPECfp95 tests:

tomcatv:

tomcatv comparison graph

swim:

swim comparison graph

su2cor:

su2cor comparison graph

hydro2d:

hydro2d comparison graph

mgrid:

mgrid comparison graph

applu:

applu comparison graph

turb3d:

turb3d comparison graph

apsi:

apsi comparison graph

fpppp:

fpppp comparison graph

wave5:

wave5 comparison graph

Observations

These are easier to spot from the graphs, which is why I made them in the first place:


250MHz R10000 SPECint95 Performance Comparison

As usual, you can download a 3D performance graph (gzipped) if you wish: load the file into SceneViewer or ivview and switch into Orthographic mode (i.e. no perspective), etc.

The rationale and method for this examination were the same as for SPECfp95. Thus, given below is a comparison table of the various R10000/250 SPECint95 test results. After the table and 3D graphs is a short-cut index to the original results pages for the various systems.

Key:

O2000 = Origin2000
      System:   O2000    Octane    O2
      L2:        4MB      1MB      1MB

      go         14.9     14.1     13.9
      m88ksim    14.2     14.1     14.5
      gcc        13.5     12.5     10.7
      compress   15.0     13.9     12.0
      li         12.3     11.9     11.9
      ijpeg      12.9     12.6     11.5
      perl       16.7     16.4     15.7
      vortex     19.5     13.8     9.74

SPECint95 Comparison Table for MIPS R10000 250MHz

[Left Isometric View] [Right Isometric View]

(click on the images above to download larger versions of the views shown)

[Test Suite Description | O2000 | Octane | O2]


Next, a separate 2D comparison graph for each of the eight SPECint95 tests:

go:

go comparison graph

m88ksim:

m88ksim comparison graph

gcc:

gcc comparison graph

compress:

compress comparison graph

li:

li comparison graph

ijpeg:

ijpeg comparison graph

perl:

perl comparison graph

vortex:

vortex comparison graph

As with R10K/195, the results show a different variance compared to the SPECfp95 results given above. The important observations are discussed on the 195 page. Of more interest here with respect to R10K/250 are the O2 results and the vortex data for the three systems.

Prior to the release of R10K/250 for O2, I'd said it would be interesting to see how R10K/250 O2 performed compared to Octane and Origin, given the good int results O2 shows for R10K/195. From the figures, it's clear that R10K/250 O2 does very well, even beating both Origin2000 and Octane for m88ksim (although the differences there are well within typical margins of error, given the nature of compiler optimisation). Naturally, as the CPU becomes faster overall, the better memory latency of the Origin design begins to show through, with Octane starting to edge ahead of O2 for gcc, compress, perl, etc. (remember that Octane uses the Origin architecture).

Here is a comparison table of the differences between Octane and O2, for R10K/195 and R10K/250 (I'm comparing O2 to Octane because they have the same L2 size). The figures denote how much faster Octane is than O2 for each test, as a percentage of the O2 result:

          R10K/195        R10K/250
Test    %Difference     %Difference

go          3.64            1.44
m88ksim     1.80           -2.76
gcc         12.0            16.8
compress    6.60            15.8
li          1.81            0.00
ijpeg       8.02            9.57
perl        0.00            4.46
vortex      36.6            41.7

For those tests which show a significant difference, one would expect a general increase in difference levels when moving from R10K/195 to R10K/250 (this clearly applies to gcc, compress, ijpeg and vortex). Other tests are well within margins of error. To be sure though, I need SPEC95 data for R10K/225, which isn't available yet for O2 or Octane (the CPU is, but not the test results).
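The R10K/250 column in the difference table can be reproduced from the SPECint95 table earlier on this page. The sketch below assumes the column is (Octane - O2) / O2 x 100; that formula is inferred from the figures rather than stated in the text, and the names used are illustrative only. The R10K/195 source data is on the 195 page, so that column is not recomputed here.

```python
# SPECint95 results for R10000/250, copied from the table on this page.
# Tuple order: (Octane, O2).
SPECINT95_250 = {
    "go":       (14.1, 13.9),
    "m88ksim":  (14.1, 14.5),
    "gcc":      (12.5, 10.7),
    "compress": (13.9, 12.0),
    "li":       (11.9, 11.9),
    "ijpeg":    (12.6, 11.5),
    "perl":     (16.4, 15.7),
    "vortex":   (13.8, 9.74),
}

# R10K/250 column as printed in the difference table.
PUBLISHED_250 = {
    "go": 1.44, "m88ksim": -2.76, "gcc": 16.8, "compress": 15.8,
    "li": 0.00, "ijpeg": 9.57, "perl": 4.46, "vortex": 41.7,
}


def percent_faster(octane, o2):
    """Percentage by which Octane exceeds O2 (assumed formula)."""
    return (octane - o2) / o2 * 100.0


if __name__ == "__main__":
    for test, (octane, o2) in SPECINT95_250.items():
        computed = percent_faster(octane, o2)
        print(f"{test:9s} computed {computed:6.2f}   published {PUBLISHED_250[test]:6.2f}")
```

A negative value, as for m88ksim, simply means O2 scored higher than Octane on that test.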


All this analysing is fine and fair enough, but John's comments on the 195 page about the nature of these tests, namely that cache misses aren't occurring with most of the tests because the data sets are small, do pose a question: if only vortex is using a non-trivial data set, just how relevant is SPECint95 anyway? That's a difficult question to answer. As the reader, you'd have to ask, "How big is my data set? Does the CPU keep having to access main RAM, jumping across a wide memory space? Is memory latency important to my task?"

If your data set is small and cache misses don't happen much, then you wouldn't see much benefit from using Origin or Octane over O2. I can imagine the image processing of NTSC movie frames would come into this category (each frame would fit into a 1MB L2). Ironically, PAL frames would not fit into a 1MB L2 cache (1.26MB per frame compared to 0.90MB per frame for NTSC).
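The frame-size arithmetic can be sketched as follows. The text does not say which resolutions or pixel format were used, so the values here are assumptions: square-pixel frame sizes of 646x486 (NTSC) and 768x576 (PAL), 24-bit RGB at 3 bytes per pixel, and 1MB taken as 2^20 bytes. Under those assumptions the results come out close to the per-frame figures quoted above.

```python
MIB = 1024 * 1024          # 1MB taken as 2**20 bytes (assumption)
BYTES_PER_PIXEL = 3        # assumption: 24-bit RGB, no alpha channel

# Assumed square-pixel frame sizes (width, height); not given in the text.
FRAME_SIZES = {
    "NTSC": (646, 486),
    "PAL":  (768, 576),
}


def frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


def fits_in_l2(standard, l2_bytes=MIB):
    """True if a single frame of the given standard is no larger than the L2."""
    width, height = FRAME_SIZES[standard]
    return frame_bytes(width, height) <= l2_bytes


if __name__ == "__main__":
    for name, (w, h) in FRAME_SIZES.items():
        size = frame_bytes(w, h)
        print(f"{name}: {size} bytes = {size / MIB:.3f}MB, fits in 1MB L2: {fits_in_l2(name)}")
```

This only compares the raw frame size with the cache size; in practice the code and any working buffers compete for the same L2, so a frame that just fits on paper may still cause misses.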

Possible tip: if you're running int processing jobs on Origins, Octanes and O2s, try swapping the jobs around. You might get better performance for some of the jobs because they may benefit from Origin's larger L2, or the better memory latency and outstanding cache miss support of Origin/Octane, etc. Meanwhile, a task like m88ksim which doesn't seem to benefit from these extra features would run just as well if it were moved from an Origin/Octane to an O2. Thus, one could increase the performance of some tasks without making the remaining tasks any slower than they originally were. An extreme example would be if one had an m88ksim-type task running on an Origin (call it task X) and a vortex-type task running on an O2 (task Y) - swapping the tasks over would give the same performance for task X, but task Y would speed up by a significant margin.
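The task X / task Y example can be put into numbers using the SPECint95 table as a stand-in. This is only a sketch: it assumes a real job behaves exactly like the m88ksim or vortex test, which is a strong simplification, and the names `SCORES` and `swap_gain` are illustrative.

```python
# SPECint95 R10000/250 scores from the table on this page, used here
# as a proxy for how an "m88ksim-type" or "vortex-type" job would run.
SCORES = {
    "m88ksim": {"Origin2000": 14.2, "O2": 14.5},
    "vortex":  {"Origin2000": 19.5, "O2": 9.74},
}


def swap_gain(task, old_host, new_host):
    """Relative speed of the task on new_host compared with old_host."""
    return SCORES[task][new_host] / SCORES[task][old_host]


if __name__ == "__main__":
    # Task X: m88ksim-type job moved from Origin2000 to O2.
    x = swap_gain("m88ksim", "Origin2000", "O2")
    # Task Y: vortex-type job moved from O2 to Origin2000.
    y = swap_gain("vortex", "O2", "Origin2000")
    print(f"task X runs at {x:.2f}x its old speed")
    print(f"task Y runs at {y:.2f}x its old speed")
```

On these proxy figures task X is essentially unchanged by the move, while task Y roughly doubles in speed.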

