1

I have a matrix with thousands of rows and want the submatrix consisting of those rows of the original matrix that have, e.g., a negative element in column 3. How can I do that?

Mr.Wizard
user7541

2 Answers

7

I think this question is reasonably a duplicate of: How to find rows that have maximum value? (or several similar questions) and I will delete this answer if it is closed as such. Nevertheless, again for reference:

SeedRandom[1];
a = RandomInteger[{-3, 5}, {20, 5}]

Pick[a, Negative @ a[[All, 3]]]
{{-2, 1, -3, 4, -3}, {-2, 0, -1, -2, 3}, {1, 0, -3, -2, 0}, {2, 0, -3, 0, -1},
 {0, 2, -2, 2, -1}, {0, -2, -3, 1, 1}, {-2, 2, -1, 4, 5}, {0, -1, -2, -2, 3},
 {2, 3, -3, 4, -2}, {0, -2, -1, 2, 5}, {1, 4, -3, 4, 4}}

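In versions 8 and later it should be faster to use UnitStep, due to packed array optimizations. UnitStep gives 0 for negative numbers and 1 otherwise, so picking on the pattern 0 selects the rows with a negative third element: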

Pick[a, UnitStep @ a[[All, 3]], 0]
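
To see why the third argument 0 selects the negative rows, here is a minimal sketch on a small hand-made matrix. The name m is only for illustration and is not part of the original answer. Note that a row whose third element is exactly 0 is not selected, because UnitStep[0] is 1.

```mathematica
(* Illustrative 4 x 3 matrix; rows 1 and 4 have a negative third element *)
m = {{1, 2, -3}, {4, 5, 6}, {7, 8, 0}, {9, 1, -2}};

(* UnitStep is 0 for negative entries and 1 for zero or positive entries *)
UnitStep @ m[[All, 3]]
(* {0, 1, 1, 0} *)

(* Pick keeps the rows whose selector entry matches the pattern 0 *)
Pick[m, UnitStep @ m[[All, 3]], 0]
(* {{1, 2, -3}, {9, 1, -2}} *)
```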

In version 7 optimal speed may be had with:

a[[SparseArray[BitXor[UnitStep @ a[[All, 3]], 1]]["AdjacencyLists"]]]
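
The one-liner above can be unpacked step by step. This is a sketch on the same kind of small illustrative matrix; the names m, mask and pos are assumptions made for readability. As far as I know, "AdjacencyLists" is an undocumented SparseArray property, which for a vector returns the positions of its nonzero elements.

```mathematica
m = {{1, 2, -3}, {4, 5, 6}, {7, 8, 0}, {9, 1, -2}};

(* Flip the UnitStep result so that negative entries become 1 *)
mask = BitXor[UnitStep @ m[[All, 3]], 1]
(* {1, 0, 0, 1} *)

(* Positions of the nonzero entries, i.e. the rows to keep *)
pos = SparseArray[mask]["AdjacencyLists"]
(* {1, 4} *)

(* Extract those rows with Part *)
m[[pos]]
(* {{1, 2, -3}, {9, 1, -2}} *)
```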

Timings compared to other methods proposed, in version 7:

a = RandomInteger[{-3, 5}, {1500000, 5}];

Pick[a, Negative @ a[[All, 3]]] // Timing // First

a[[SparseArray[BitXor[UnitStep @ a[[All, 3]], 1]]["AdjacencyLists"]]] // Timing // First

(col3 = a[[All, 3]]; rowsToGet = Flatten[Position[col3, _?((# < 0) &)]]; a[[rowsToGet]]) //
   Timing // First

Select[a, #[[3]] < 0 &] // Timing // First

0.234

0.031

0.905

1.17
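
Before comparing speeds it is worth confirming that the methods agree. The following is a small sanity check rather than part of the original timings; the names r1 through r4 and the reduced matrix size are my own choices.

```mathematica
SeedRandom[1];
a = RandomInteger[{-3, 5}, {1000, 5}];

r1 = Pick[a, Negative @ a[[All, 3]]];
r2 = Pick[a, UnitStep @ a[[All, 3]], 0];
r3 = a[[SparseArray[BitXor[UnitStep @ a[[All, 3]], 1]]["AdjacencyLists"]]];
r4 = Select[a, #[[3]] < 0 &];

(* All four should contain the same rows in the same order *)
r1 === r2 === r3 === r4
```

If the methods are equivalent, the last line evaluates to True.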

Mr.Wizard
  • I'm curious, what computer do you run? Pick takes almost a second on my i7 (Win7 and Mma 9). Can I tweak my system at all towards the performance of yours? Another, slower method (by a factor of 3.5 in my timing) is to Scan with levelspec {-2} and Sow the rows with a negative third element. – BoLe May 20 '13 at 14:07
  • @BoLe I use an i5-2500K running at a maximum (single core) of 4.6 GHz. I know there have been changes to Pick since version 7 but I thought they would make things faster rather than slower. – Mr.Wizard May 20 '13 at 14:50
  • @BoLe I added two new methods optimized for speed. Please try both and let me know how they perform. I expect Pick[a, UnitStep @ a[[All, 3]], 0] to be best on v9. – Mr.Wizard May 20 '13 at 14:58
  • @Mr.Wizard I get (on v9.0.0) {1.896415, 0.146656, 3.255081, 3.815207} for your 4 tested values and 0.051483 for Pick[a, UnitStep @ a[[All, 3]]] – Jacob Akkerboom May 20 '13 at 16:57
  • @Mr.Wizard It must be the clock speed and a newer model, I guess; mine is a bit old already (2009) and runs at 2.93 GHz. My times are {0.952, 0.0780, 1.84, 2.04}, which give approximately the same ratios as yours, somewhere around 10 : 1 : 25 : 30. P.S. Pick with UnitStep is the fastest here, at 0.047. – BoLe May 20 '13 at 17:51
  • @BoLe I have to remember not to write code I don't test as I usually make mistakes; please try the corrected version: Pick[a, UnitStep @ a[[All, 3]], 0] and tell me how it performs. Thanks. – Mr.Wizard May 21 '13 at 05:35
  • @Jacob You too, if interested. – Mr.Wizard May 21 '13 at 05:35
  • @Jacob should have been Pick[..., 0] - I just can't get it right tonight. I'm going to do something else. – Mr.Wizard May 21 '13 at 08:23
  • 0.113797 then :). Take it easy(?) – Jacob Akkerboom May 21 '13 at 08:29
  • @Mr.Wizard 0.093601 – BoLe May 21 '13 at 09:39
4
m  = RandomInteger[{-1, 1}, {10, 10}];
m1 = Select[m, #[[3]] < 0 &];

Show it:

ArrayPlot[#, ColorRules -> {1 -> Red, 0 -> Blue, -1 -> Yellow, _ -> Gray}] & /@ {m, m1}


Dr. belisarius