Why does Table take so much longer than its constituent computations?

Question

I have a 10x5xn list, which I'll call data,

data = RandomInteger[{-500, 500}, {10, 5, n}];

and a 2x5 list

coeffs={{0., -0.951057, -0.587785, 0.587785, 0.951057},
        {1., 0.309017, -0.809017, -0.809017, 0.309017}}

For each 5xn nestedList inside data, I want to map it using coeffs.nestedList, so that my overall code would look like

newData=Table[coeffs.nestedList,{nestedList,data}]

However, for large n, we start to see a huge difference in the time it takes to run the above line of code and the time it takes to run the constituent computations.

componentTime = Total@Table[First@RepeatedTiming[coeffs.nestedList], {nestedList, data}];
tableTime = First@RepeatedTiming[Table[coeffs.nestedList, {nestedList, data}]];

Consider the difference between componentTime and tableTime as n increases:

I would expect tableTime to be slightly longer than componentTime, but not by that much. Why is there such a large discrepancy between the summed time it takes for the components in Table to execute and the time it takes for the table itself to be constructed? Is there a way to minimize this discrepancy, or a way to formulate the mapping so that it avoids it altogether?

Michael E2 · Accepted Answer · 2022-08-22T19:57:31.010

I think it has to do with how often data is copied out of the packed array data. If we unpack data to level 1, the timings are equal.

n = 1 * 10^7;
data = Identity /@ RandomInteger[{-500, 500}, {10, 5, n}];
componentTime = 
  Total@Table[
    First@RepeatedTiming[coeffs . nestedList], {nestedList, data}];
tableTime = 
  First@RepeatedTiming[Table[coeffs . nestedList, {nestedList, data}]];
{componentTime, tableTime}
(*  {0.594324, 0.590828}  *)

A packed array is basically a C language array plus metadata (see below for more). An unpacked array is a list of pointers to lists or packed subarrays, so its elements can be referenced instead of being copied. Having RepeatedTiming outside of Table means the subarrays represented by nestedList are copied out of data on every repetition, whereas when RepeatedTiming is inside Table, each subarray is copied once and reused.
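As a rough check of the packed versus unpacked distinction described above, the following sketch inspects the array with Developer`PackedArrayQ. The small value of n and the name unpacked are chosen here purely for illustration and are not from the original post. Under the explanation above, one would expect the first test to report a packed array, the second an unpacked one, and the third that every element is still a packed subarray.

```mathematica
(* Small stand-in for the question's data; n = 1000 is an arbitrary illustrative choice *)
n = 1000;
data = RandomInteger[{-500, 500}, {10, 5, n}];

(* Is the full 10 x 5 x n array stored as a single packed array? *)
Developer`PackedArrayQ[data]

(* Mapping Identity over the 10 top-level elements, as in the answer, to unpack level 1 *)
unpacked = Identity /@ data;

(* Is the outer list still packed after the mapping? *)
Developer`PackedArrayQ[unpacked]

(* Is each of the 10 elements a packed 5 x n array in its own right? *)
Developer`PackedArrayQ /@ unpacked
```

If the outer list is unpacked while its elements remain packed, Table can hand each 5 x n subarray to Dot by reference rather than extracting a fresh copy from the larger packed block.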

References:

edited Aug 22 '22 at 19:57

answered Aug 22 '22 at 17:30

Michael E2


Hi, could you explain what is unpacking the data? Or is there any reference? Thanks! – H. Zhou Aug 22 '22 at 19:23
@H.Zhou See updated answer. :) – Michael E2 Aug 22 '22 at 19:57
Regarding your edit: since having RepeatedTiming outside of Table copies the data each time, does this mean that the extra time in tableTime is just an artifact of RepeatedTiming and not time that one would actually have to worry about when running the code normally? Or is there an actual performance benefit to unpacking the data in the code? – az123p Aug 22 '22 at 20:37
@az123p Yes, I think the extra time is an artifact, or more accurately, the faster time is an artifact. The table has 10 data reads plus 10 dot products. The data reads are not repeated when RepeatedTiming is inside Table; only the dot products are repeated. – Michael E2 Aug 22 '22 at 20:48
I see! Thanks for the help. – az123p Aug 22 '22 at 20:53
@az123p You're welcome. Thanks for the accept. :) – Michael E2 Aug 22 '22 at 20:56
