
I have two lists, L1 and L2, each with a key and some data. Let us say the key is a person's name, a string, and the data follows. (To respond to rasher's query:) Let us also assume the lists are sorted by key:

L1 = {
       {"Joseph O'Rourke", data1, data2, ... },
       ...
     }

I would like to "align" the two lists in the following sense. If L1 has a name A that is not in L2, then L2 is padded to include a "blank" record for A. And vice versa: If L2 has a name B that is not in L1, then L1 is modified to include a "blank" record for B. Then I will have two lists that "align":

L1' = { {A,...}, {0,...}, {C,...},  {D,...}, ...}
L2' = { {A,...}, {B,...}, {0,...},  {D,...}, ...}

where maybe 0 == {}. With the lists aligned in this fashion, I could make a two-column table (one column per list) that would directly compare one list against the other. My question is:

What is a clean method for accepting L1 and L2 as input, and returning L1' and L2' as output, with the latter two lists aligned as above?

I can accomplish this via tedious list For-loops, but I suspect the cognoscenti :-) will offer more concise and efficient methods. Thanks for your help!

Joseph O'Rourke

3 Answers


If the lists are long (several hundred elements or more), Alternatives will become slow. Here's another way that will be faster on longer lists:

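align[list1_, list2_] := Module[{base, replace1, replace2},
  (* sorted union of all keys from both lists *)
  base = Union[First /@ list1, First /@ list2];
  (* one lookup definition per record, keyed by name *)
  With[{arg = First[#]}, replace1[arg] = #] & /@ list1;
  replace1[_] = {0, {}};  (* blank record for keys missing from list1 *)
  With[{arg = First[#]}, replace2[arg] = #] & /@ list2;
  replace2[_] = {0, {}};  (* blank record for keys missing from list2 *)
  {replace1 /@ base, replace2 /@ base}
  ]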

Example:

data = {"that", "natural", "cowards", "delay", "country", "himself", 
   "my", "will", "cast", "office", "native", "is", "awry", "s", "ay"};

l1 = {#, ToCharacterCode[#]} & /@ Sort@RandomSample[data, 10]
l2 = {#, {StringLength[#]}} & /@ Sort@RandomSample[data, 10]
(*
  {{"ay", {97, 121}}, {"cast", {99, 97, 115, 116}},
   {"cowards", {99, 111, 119, 97, 114, 100, 115}}, {"delay", {100, 101, 108, 97, 121}},
   {"himself", {104, 105, 109, 115, 101, 108, 102}}, {"is", {105, 115}},
   {"my", {109, 121}}, {"native", {110, 97, 116, 105, 118, 101}},
   {"that", {116, 104, 97, 116}}, {"will", {119, 105, 108, 108}}}

  {{"cast", {4}}, {"cowards", {7}}, {"delay", {5}}, {"is", {2}}, {"my", {2}},
   {"native", {6}},{"natural", {7}}, {"s", {1}}, {"that", {4}}, {"will", {4}}} 
*)

{newl1, newl2} = align[l1, l2];
Grid[Transpose@{newl1, newl2}]

Mathematica graphics
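
The same idea (build a lookup table per list, then query it with every key in the union) can also be sketched with an Association, assuming version 10 or later. This is only an illustrative variant, not the code timed above; alignAssoc and pad are hypothetical names introduced here, and fill defaults to the same blank record {0, {}}.

```mathematica
(* alignAssoc is a hypothetical name; fill defaults to the blank record {0, {}} *)
alignAssoc[list1_, list2_, fill_: {0, {}}] :=
 Module[{keys, pad},
  (* sorted union of the keys of both lists *)
  keys = Union[First /@ list1, First /@ list2];
  (* map each key to its full record, then look up every key of the union *)
  pad = Function[list,
    Lookup[AssociationThread[First /@ list -> list], keys, fill]];
  {pad[list1], pad[list2]}]

alignAssoc[{{1, 7}, {3, 7}, {5, 2}, {8, 7}}, {{3, 1}, {6, 6}, {8, 7}, {9, 3}}]
(* {{{1, 7}, {3, 7}, {5, 2}, {0, {}}, {8, 7}, {0, {}}},
    {{0, {}}, {3, 1}, {0, {}}, {6, 6}, {8, 7}, {9, 3}}} *)
```

Because fill is an optional argument, a different blank record (for example {}) can be passed as a third argument.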

Michael E2
  • +1, and we both over-thought it (within apparent constraints of data in OP) - see my update. – ciao Jun 02 '14 at 05:29

I'm late to the party but I like this kind of problem so I'm going to answer anyway.

I propose this:

f1[a_List, b_List, fill_: {0, {}}] :=
  With[{all = a ⋃ b},
    Replace[
      Join[all, #] ~GatherBy~ First,
      {{_} -> fill, {__, x_} :> x},
      1
    ] & /@ {a, b}
  ]

Test:

a = {{1, 7}, {3, 7}, {5, 2}, {8, 7}};
b = {{3, 1}, {6, 6}, {8, 7}, {9, 3}};

f1[a, b] // Grid

$ \begin{array}{cccccc} \{1,7\} & \{3,7\} & \{5,2\} & \{0,\{\}\} & \{8,7\} & \{0,\{\}\} \\ \{0,\{\}\} & \{3,1\} & \{0,\{\}\} & \{6,6\} & \{8,7\} & \{9,3\} \end{array} $

Note that this sample includes keys with both identical ({8, 7}) and divergent ({3, 7}, {3, 1}) data.

Mr.Wizard
  • +1, clean and pretty fast. And just elegant, as always. – ciao Jun 02 '14 at 08:31
  • Nice! Could you (or someone) please explain ~GatherBy~, especially the tildes? Thanks! – Joseph O'Rourke Jun 02 '14 at 10:35
  • @JosephO'Rourke x ~ f ~ y is equivalent to f[x,y]. It's called "Infix notation" – Dr. belisarius Jun 02 '14 at 12:10
  • @rasher Thanks. Since you like (at least some aspect of) my style, would you mind me refactoring the code in your answer to reduce duplication? Not only would that make it shorter, but IMO it would be clearer too, as the differences from one line to the next would be directly apparent as arguments. – Mr.Wizard Jun 02 '14 at 20:48
  • @Joseph Yes, it's infix notation, and I've been confusing people with it since at least 2011. :o) I find nested brackets very hard to read and I often use infix notation to reduce their use. I also like the way that code reads left-to-right when binary operations are strung together. – Mr.Wizard Jun 02 '14 at 20:51
  • @Mr.Wizard: Feel free - honestly, consider this as carte blanche forward to prettify any of my stuff... I seldom aim for beauty here. PS - might want to test Union vs the DeleteDuplicates for the complement - did not re-test for speed, but one may be better than other (and vice versa) depending on key structure... – ciao Jun 02 '14 at 21:36
  • Could someone explain the pattern line, {{_} -> fill, {__, x_} :> x}, ? – Joseph O'Rourke Jun 03 '14 at 00:14
  • @Joseph After gathering the elements: a list with a single occurrence, which is matched by the pattern {_}, represents a missing key in the list under operation and needs to be replaced with the fill expression (provided as an optional argument with a default value). A list with multiple occurrences, matched by the pattern {__, x_}, represents a present key, and we want the last element gathered (matched by x_) because it came from the list under operation. If this is still not clear, or you are having trouble understanding the patterns themselves, let me know and I'll try again. – Mr.Wizard Jun 03 '14 at 01:29
  • Thank you, that is quite clear! – Joseph O'Rourke Jun 03 '14 at 10:24
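
The explanation in the comments above can be traced step by step. The following is a small sketch using the test lists from this answer; the names all and gathered are introduced here only for illustration, and Union[a, b] is written out in place of a ⋃ b.

```mathematica
a = {{1, 7}, {3, 7}, {5, 2}, {8, 7}};
b = {{3, 1}, {6, 6}, {8, 7}, {9, 3}};
all = Union[a, b];   (* same as a ⋃ b *)

(* infix form: x ~f~ y is f[x, y] *)
(Join[all, a] ~GatherBy~ First) === GatherBy[Join[all, a], First]
(* True *)

(* group the records by key; records from a come last in each group *)
gathered = GatherBy[Join[all, a], First]
(* {{{1, 7}, {1, 7}}, {{3, 1}, {3, 7}, {3, 7}}, {{5, 2}, {5, 2}},
    {{6, 6}}, {{8, 7}, {8, 7}}, {{9, 3}}} *)

(* a single-element group means the key is absent from a: use the fill;
   otherwise keep the last element, which came from a *)
Replace[gathered, {{_} -> {0, {}}, {__, x_} :> x}, 1]
(* {{1, 7}, {3, 7}, {5, 2}, {0, {}}, {8, 7}, {0, {}}} *)
```

The groups for keys 6 and 9 contain only the copy contributed by all, so they match {_} and are replaced by the fill; every other group ends with a record from a.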

TemporalData + ResamplingMethod

ClearAll[timeAlign]
timeAlign = Module[{td = TemporalData[{##}, ResamplingMethod -> {"Constant", {}}], times},
  times = Union @@ td["TimeList"];
  Thread[{times, #}] & /@ Through @ td["PathFunctions"] @ times]&;

Example: using Mr.Wizard's example lists

a = {{1, 7}, {3, 7}, {5, 2}, {8, 7}};
b = {{3, 1}, {6, 6}, {8, 7}, {9, 3}};

timeAlign[a, b]

{{{1, 7}, {3, 7}, {5, 2}, {6, {}}, {8, 7}, {9, {}}},
{{1, {}}, {3, 1}, {5, {}}, {6, 6}, {8, 7}, {9, 3}}}

Association + KeyUnion

KeyValueMap[List]/@KeyUnion[Association/@(Rule @@@ # & /@ {a, b})]/. _Missing -> {}

{{{1, 7}, {3, 7}, {5, 2}, {8, 7}, {6, {}}, {9, {}}},
{{1, {}}, {3, 1}, {5, {}}, {8,7}, {6, 6}, {9, 3}}}
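
In the output above, the keys appear in the order KeyUnion produces (the first list's keys, followed by the keys it lacked), so 6 and 9 come after 8. If key order matters, one possible adjustment, offered as a sketch rather than as part of the original answer, is to apply KeySort to each padded association before converting back to lists:

```mathematica
a = {{1, 7}, {3, 7}, {5, 2}, {8, 7}};
b = {{3, 1}, {6, 6}, {8, 7}, {9, 3}};

(* KeySort reorders each padded association by key before it is turned back into a list *)
KeyValueMap[List] /@ KeySort /@ KeyUnion[Association /@ (Rule @@@ # & /@ {a, b})] /. _Missing -> {}

(* {{{1, 7}, {3, 7}, {5, 2}, {6, {}}, {8, 7}, {9, {}}},
    {{1, {}}, {3, 1}, {5, {}}, {6, 6}, {8, 7}, {9, 3}}} *)
```

This matches the ordering of the TemporalData result for the same lists.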

kglr