By Fabiha Hannan on 11 March 2014

[edited/corrected version]

I looked at the miss and stall rates of the benchmark programs when varying the number of sets (called NumSet) and ran on the GTX480 machine. In particular, I looked at a Unicore (or one core cluster) with a 64 byte block size and 4-way set associativity. The NumSet was varied as 2, 4, 8, 16, and 60.

Here is a table of the miss and stall rate data:

datatable

Here is a bar graph for the original miss rates. Note that there is basically no variability between BFS and STO.

missgraph

Here is a zoomed in version so that we can look at the lower miss rates:

missgraphzoom

Here is a bar graph for the original stall rates.

missgraph

Here is a zoomed in version so that we can look at the lower stall rates:

missgraphzoom

The earlier benchmarks have significantly lower miss and stall rates than the later ones. Little difference is seen in comparison. The benchmark STO is hardly affected by this and RAY has a much higher stall rate (indicating more reservation fails than misses) than its miss rates, and is relatively very affected.

category: tags:


blog comments powered by Disqus