I often get a data table with column names and row names. For example:

Gene_Name   ID         CHR   Sample_1    Sample_2    Sample_3   ...
Itm2a       NM_00134   chrX  0.00        1.23        2.45       ...
Fam109a     NM_02342   chr7  1.44        4.44        2.14       ...
...         ...        ...   ...         ...         ...        ...

I'd like to divide it into 4 blocks, namely the data block (with all the numeric data reads), column names, row names and corner names. Currently I usually do something like this:

rw = Import["table.csv"];
colnames = rw[[1 ;; 1, 4 ;;]];
rownames = rw[[2 ;;, 1 ;; 3]];
cornernames = rw[[1 ;; 1, 1 ;; 3]];
data = rw[[2 ;;, 4 ;;]];

which is quite clumsy and doesn't feel very "Mathematica". Can anyone do better?

xzhu
  • If you do this often, the best approach might be to define a function in a personal package that does it, then use that function. – Szabolcs Jul 12 '14 at 03:05
  • Part is fast in Mathematica, so there's nothing wrong with using it. If you're looking for something more compact, you can get away with just one Part: {cols, rows, corners, data} = Import["table.csv"][[{span 1, span 2, span 3, span 4}]] – C. E. Jul 12 '14 at 03:12
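
One way to make the "single Part" idea from the comment above concrete is to apply Part to a list of span pairs. This is only a sketch: the sample table rw is made up to mirror the layout in the question (in practice it would come from Import), and the split after row 1 and column 3 is hard-coded.

```mathematica
(* Made-up sample table: 1 header row, 3 annotation columns *)
rw = {{"Gene_Name", "ID", "CHR", "Sample_1", "Sample_2"},
      {"Itm2a", "NM_00134", "chrX", 0.00, 1.23},
      {"Fam109a", "NM_02342", "chr7", 1.44, 4.44}};

(* Each sublist holds the row span and the column span of one block;
   @@@ feeds each pair to Part as its two arguments *)
{cornernames, colnames, rownames, data} =
  (rw[[##]] &) @@@ {{;; 1, ;; 3}, {;; 1, 4 ;;}, {2 ;;, ;; 3}, {2 ;;, 4 ;;}};
```

The four results have the same shape as the four separate Part calls in the question, so cornernames and colnames are still one-row matrices rather than flat lists.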

2 Answers

(rw = Join[
      Prepend[
       Array[StringForm["`1`(`2`,`3`)", "row", Sequence @@ (ToString /@ {##})] &, {9, 3}], 
       Array[StringForm["`1`(`2`)", "corner", Sequence @@ (ToString /@ {##})] &, {3}]], 
     Prepend[
       Array[StringForm["`1`(`2`,`3`)", "data",Sequence @@ (ToString /@ {##})] &, {9, 7}], 
       Array[StringForm["`1`(`2`)", "col", Sequence @@ (ToString /@ {##})] &, {7}]], 2]) 
  // MatrixForm

Using an undocumented form of Extract (credit: @rasher):

{corners, columns, rows, data} = 
       Rest@Extract[rw, {{}, {1, ;; 3}, {1, 4 ;;}, {2 ;;, ;; 3}, {2 ;;, 4 ;;}}];

Row[MatrixForm /@ {corners, columns, rows, data}, Spacer[5]]

kglr

How does this work for your matrix?

colNames = Rest@First@rw
rowNames = Rest[First /@ rw]
cornerName = First@First@rw
data = Rest[Rest /@ rw]

It's a bunch of variations on First, Rest, and Map.

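EDIT: I misunderstood the question. Here's a more general matrix splitter, which divides the table into four parts. The split falls after row number rowSplit and after column number colSplit, so the header block is the first rowSplit rows and the row annotations are the first colSplit columns.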

rowSplit = 1;
colSplit = 3;
cornerNames = Take[rw, rowSplit, colSplit];
data = Drop[rw, rowSplit, colSplit];
rowNames = Drop[Take[#, colSplit] & /@ rw, rowSplit];
colNames = Take[Drop[#, colSplit] & /@ rw, rowSplit];
phosgene
  • Thanks. But this is essentially the same approach except that I have 3 columns of row names (should have called it "row annotations"). I'm hoping there's a way to do it like this: – xzhu Jul 12 '14 at 02:36
  • MatrixDivide[mat, {1,3}] and it gives me a depth-4 list with blocks of matrix. Kind of like the reverse of ArrayFlatten. – xzhu Jul 12 '14 at 02:37
  • Ah, I misunderstood. I will try to cook something up with Partition for the case where more than just the first column and first row contain 'name' data. – phosgene Jul 12 '14 at 02:39
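
A minimal sketch of the MatrixDivide[mat, {1,3}] idea from the comments above, built on plain Part rather than Partition. The name matrixDivide is hypothetical (it is not a built-in), the sample table is made up, and the function assumes a rectangular table with at least one row and one column on each side of the split.

```mathematica
(* Hypothetical helper, not a built-in: split mat after row r and column c
   into a 2x2 list of blocks {{corner, column names}, {row names, data}} *)
matrixDivide[mat_?MatrixQ, {r_Integer, c_Integer}] /;
   0 < r < Length[mat] && 0 < c < Length[First[mat]] :=
  {{mat[[;; r, ;; c]], mat[[;; r, c + 1 ;;]]},
   {mat[[r + 1 ;;, ;; c]], mat[[r + 1 ;;, c + 1 ;;]]}}

(* Made-up sample table: 1 header row, 3 annotation columns *)
rw = {{"Gene_Name", "ID", "CHR", "Sample_1", "Sample_2"},
      {"Itm2a", "NM_00134", "chrX", 0.00, 1.23},
      {"Fam109a", "NM_02342", "chr7", 1.44, 4.44}};

{{cornernames, colnames}, {rownames, data}} = matrixDivide[rw, {1, 3}];

(* ArrayFlatten should reassemble the blocks into the original table *)
ArrayFlatten[matrixDivide[rw, {1, 3}]] === rw
```

Because the result is a 2x2 block matrix, it acts as the reverse of ArrayFlatten in the sense asked for. A ragged table (rows of unequal length, which can happen with imported CSV files) will not match the MatrixQ pattern, so the call stays unevaluated instead of failing partway through.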