
I am considering identities involving t[a, b, c, d, ...], where the number of indices is fixed. t is invariant under cyclic permutations of its indices, so that t[3, 4, 1, 2] is equal to t[1, 2, 3, 4].

When the number of indices is $k=4$, all distinct elements are generated by

basis = (# /. {List -> t}) & /@ Permutations[Range[4]];
basis = basis /. {t[a___, 1, b___] -> t[1, b, a]} // Union

Here comes the output:

{t[1, 2, 3, 4], t[1, 2, 4, 3], t[1, 3, 2, 4], t[1, 3, 4, 2], t[1, 4, 2, 3], t[1, 4, 3, 2]}
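
Since each cyclic class contains exactly one representative that starts with 1, the same list can also be built by fixing 1 in the first slot and permuting only the remaining indices, which avoids generating all $k!$ permutations first. A minimal sketch, assuming the canonical form with 1 in front is the one wanted (the helper name cyclicBasis is hypothetical, not from the post):

```mathematica
(* cyclicBasis is a hypothetical helper name used only for illustration *)
(* Fix index 1 in front and permute the remaining indices 2..k *)
cyclicBasis[k_Integer /; k >= 2] := (t[1, ##] &) @@@ Permutations[Range[2, k]]

cyclicBasis[4]
(* {t[1, 2, 3, 4], t[1, 2, 4, 3], t[1, 3, 2, 4],
    t[1, 3, 4, 2], t[1, 4, 2, 3], t[1, 4, 3, 2]} *)

Length[cyclicBasis[10]]  (* 9! = 362880 *)
```

For $k=4$ this reproduces the output above in the same order, because Permutations returns the index lists in the same order in which Union sorts them here.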

I want to convert expressions like t[1, 2, 3, 4] + t[1, 3, 2, 4] - t[1, 4, 3, 2] into a coefficient matrix to do some linear algebra. I tried the following code:

identity = {t[1, 2, 3, 4] + t[1, 3, 2, 4] - t[1, 4, 3, 2], 
  t[1, 2, 4, 3] + t[1, 3, 2, 4], 
  t[1, 3, 4, 2] - t[1, 2, 3, 4] - t[1, 4, 2, 3]};

coeffmatrix = Coefficient[identity, #] & /@ basis // Transpose

The output is

{{1, 0, 1, 0, 0, -1}, {0, 1, 1, 0, 0, 0}, {-1, 0, 0, 1, -1, 0}}.

Efficiency does not matter for this small example. However, as I increase the number of indices and identities, computing coeffmatrix becomes very slow and consumes a huge amount of memory. In the real case, t has 10 indices, so there are $9! = 362880$ basis elements and coeffmatrix is approximately $362880 \times 362880$.

Here is my question: for certain reasons, the coefficients are always restricted to {-1, 0, 1}. Could this fact be used to improve performance? Could anyone suggest a more efficient approach?

Joonho Kim
  • Would using a SparseArray help? – Tobias Hagge Apr 10 '13 at 00:58
  • @TobiasHagge I am not familiar with SparseArray. What is the benefit to use SparseArray? – Joonho Kim Apr 10 '13 at 01:06
  • More efficient storage and faster computations on matrices for which most of the coefficients are zero. – Tobias Hagge Apr 10 '13 at 04:56
  • @TobiasHagge Is it possible to calculate the matrix rank directly from SparseArray? – Joonho Kim Apr 10 '13 at 05:47
  • I haven't used sparse arrays much, but my understanding is that most of Mathematica's linear algebra functions are implemented to work with them transparently. CoefficientArrays, by the way, produces a SparseArray, so if you want to test performance you can compute the rank using the matrices produced by both your algorithm and Mr. Wizard's, and see which is faster. – Tobias Hagge Apr 10 '13 at 15:29
  • If you try to do something with a sparse array that Mathematica can't handle in sparse form, it will convert the matrix to a non-sparse form before proceeding. In that case sparse arrays are slower than non-sparse ones. – Tobias Hagge Apr 10 '13 at 15:30
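
To make the comments above concrete, here is a small sketch that stores the example coefficient matrix as a SparseArray and passes it straight to MatrixRank. It only shows that the call accepts a sparse matrix; whether the rank computation stays sparse internally for a large matrix is not established here, so timing and memory should be checked on the real data.

```mathematica
(* The 3 x 6 coefficient matrix from the question, stored sparsely *)
m = SparseArray[{{1, 0, 1, 0, 0, -1}, {0, 1, 1, 0, 0, 0}, {-1, 0, 0, 1, -1, 0}}];

MatrixRank[m]  (* 3 *)

(* Compare the storage of the sparse and dense representations *)
ByteCount /@ {m, Normal[m]}
```

For a matrix this small the sparse form has no storage advantage; the benefit appears only when the matrix is large and most entries are zero.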

2 Answers


Is this faster?

CoefficientArrays[identity, basis][[2]] // MatrixForm

$\left( \begin{array}{cccccc} 1 & 0 & 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & -1 & 0 \end{array} \right) $
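
As a brief illustration of why part [[2]] is taken: for expressions that are linear in the given variables, CoefficientArrays returns a list of two arrays, the constant terms first and the linear coefficients second, both as SparseArray objects. The sketch below repeats the definitions from the question so that it runs on its own; the names const and lin are chosen here for illustration.

```mathematica
(* Definitions repeated from the question (k = 4) *)
basis = (t[1, ##] &) @@@ Permutations[Range[2, 4]];
identity = {t[1, 2, 3, 4] + t[1, 3, 2, 4] - t[1, 4, 3, 2],
   t[1, 2, 4, 3] + t[1, 3, 2, 4],
   t[1, 3, 4, 2] - t[1, 2, 3, 4] - t[1, 4, 2, 3]};

(* const: constant terms; lin: matrix of linear coefficients *)
{const, lin} = CoefficientArrays[identity, basis];

Normal[const]  (* {0, 0, 0} *)
Head[lin]      (* SparseArray *)
Normal[lin]
(* {{1, 0, 1, 0, 0, -1}, {0, 1, 1, 0, 0, 0}, {-1, 0, 0, 1, -1, 0}} *)
```

Keeping lin as a SparseArray, rather than applying Normal, is what preserves the memory advantage for large problems.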


In response to Jens' elegant answer, it should be noted that CoefficientArrays is, as one would hope, better optimized for this task.

basis = (# /. {List -> t}) & /@ Permutations[Range[8]];
basis = basis /. {t[a___, 1, b___] -> t[1, b, a]} // Union;

size = {5000, 30};
identity = Total[RandomInteger[{-1, 1}, size]*RandomChoice[basis, size], {2}];

(r1 = CoefficientArrays[identity, basis][[2]];) // RepeatedTiming // First
(r2 = D[identity, {basis}];)                    // RepeatedTiming // First

r1 == r2
0.0517

0.43

True

In this example the difference in memory consumption is far more significant:

ByteCount /@ {r1, r2}
Divide @@ % // N

{1639856, 608080968}

0.00269677
Mr.Wizard

To convert a list of linear expressions to a matrix containing the coefficients, the following is easier to write than CoefficientArrays, but seems to be a little slower:

D[identity, {basis}]

$$\left( \begin{array}{cccccc} 1 & 0 & 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & -1 & 0 \\ \end{array} \right)$$

What I did here is to use the fact that for a linear map the matrix of coefficients is identical to the Jacobian. The latter is what I calculate.
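
Spelled out, with $x_j$ denoting the $j$-th element of basis and $f_i$ the $i$-th expression in identity, linearity means

$$f_i = \sum_j A_{ij}\, x_j \quad\Longrightarrow\quad \frac{\partial f_i}{\partial x_j} = A_{ij},$$

so differentiating each expression with respect to each basis element returns the coefficient matrix $A$. For the first expression of the example, $f_1 = x_1 + x_3 - x_6$, this gives the row $(1, 0, 1, 0, 0, -1)$.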

Jens
  • This is beautiful but in my testing it is slower than CoefficientArrays. I guess you just noticed that too. It also produces a dense array that could take up quite a bit of memory in some cases. – Mr.Wizard May 28 '15 at 00:51
  • @Mr.Wizard Yes, my initial timing included the duration of the keystrokes... – Jens May 28 '15 at 00:53
  • A metric I use myself if you haven't guessed. :D – Mr.Wizard May 28 '15 at 00:55
  • You already have my vote but I thought it worthwhile to note the performance caveats so I added an example to my answer. If you feel it is an unfair example please let me know; I chose it without much consideration. – Mr.Wizard May 28 '15 at 01:03