While reading many research papers comparing parallel implementations of algorithms on different machines/architectures, I have noticed that the performance comparison is almost always reported in GFlop/s rather than the actual wall-clock time of the run in seconds. I am curious why this convention is used.
My only guess is that, since every vendor advertises its device as having a certain peak flop count per second, such papers report performance in GFlop/s to show how much of that "potential" the particular application at hand actually achieves.
Is this correct?
Also, suppose the performance of an $m \times n$ matrix times $n \times 1$ vector multiply has been stated as 4 GFlop/s. Is it reasonable to obtain the wall-clock time in seconds from the following formula?
$$\frac{m(2n-1)}{4 \times 10^9} \hspace{3mm} \text{seconds},$$ where $m(2n-1)$ is the number of floating-point operations in the matrix-vector multiplication.
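For concreteness, here is a minimal sketch (in Python) of the estimate I have in mind; the values of `m`, `n`, and the 4 GFlop/s rate are just placeholders:

```python
# Estimate wall-clock time from a reported GFlop/s rate for a
# dense matrix-vector multiply y = A x, with A of size m x n.
# m, n, and gflops below are arbitrary placeholder values.

m, n = 10_000, 10_000      # matrix dimensions (placeholders)
gflops = 4.0               # reported performance in GFlop/s

flops = m * (2 * n - 1)    # n multiplies and (n - 1) adds per row, m rows
seconds = flops / (gflops * 1e9)

print(f"{flops:.3e} flops at {gflops} GFlop/s -> ~{seconds:.4f} s")
```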