
I have a series of frequencies of values, e.g. 5 times 1, 10 times 2 and 5 times 3, as in

list={{1,5},{2,10},{3,5}}

and I would like to convert this to long notation as in

list2={1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3}

What is the most elegant way to do this in Mathematica?

Tom Wenseleers

3 Answers


I added timings. Of the versions below, Join @@ MapThread[ConstantArray, Thread[list]] (In[7]) is the fastest at about 15.5 seconds. I am sure there are faster versions. If speed is important, you can parallelize or come up with a compiled solution using Compile.

In[1]:= list = RandomInteger[{3, 12}, {10^7, 2}];

In[2]:= list // Developer`PackedArrayQ
Out[2]= True

In[3]:= Table[#1, {#2}] & @@@ list // Flatten; // AbsoluteTiming
Out[3]= {22.015290, Null}

In[4]:= Join @@ (Table[#1, {#2}] & @@@ list); // AbsoluteTiming
Out[4]= {18.528328, Null}

In[13]:= Join @@ ConstantArray @@@ list; // AbsoluteTiming
Out[13]= {18.261945, Null}

In[5]:= ConstantArray[#1, #2] & @@@ list // Flatten; // AbsoluteTiming
Out[5]= {43.177745, Null}

In[6]:= NestList[# &, #1, #2 - 1] & @@@ list // Flatten; // AbsoluteTiming
Out[6]= {30.278883, Null}

In[7]:= Join @@MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming
Out[7]= {15.465663, Null}

In[8]:= Flatten@ MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming
Out[8]= {40.184748, Null}

In[9]:= Join @@ MapThread[Table[#1, {#2}] &, Thread[list]]; // AbsoluteTiming
Out[9]= {18.716637, Null}

In[3]:= Inner[ConstantArray, Sequence @@ Transpose@list, Join]; // AbsoluteTiming
Out[3]= {16.525300, Null}
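
As a sanity check on the small list from the question, two of the faster variants above can be run directly. This is only a sketch to show the output shape, not a benchmark; the name small is introduced here for illustration.

```mathematica
(* small is a hypothetical name for the question's example data *)
small = {{1, 5}, {2, 10}, {3, 5}};

(* Thread on a matrix acts like Transpose: {{1, 2, 3}, {5, 10, 5}} *)
Join @@ MapThread[ConstantArray, Thread[small]]
(* {1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3} *)

(* same result without the transpose step *)
Join @@ ConstantArray @@@ small
```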
Vitaliy Kaurov

Internal`RepetitionFromMultiplicity

list = {{1, 5}, {2, 10}, {3, 5}};
Internal`RepetitionFromMultiplicity @ list

{1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3}

This is faster than the fastest method in @Vitaliy's post.

list = RandomInteger[{3, 12}, {10^7, 2}];
res1 = Join @@ MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming // First

8.5771

res2 = Internal`RepetitionFromMultiplicity[list]; // AbsoluteTiming // First

6.6958

res1 == res2

True

kglr

kglr made a very interesting find with Internal`RepetitionFromMultiplicity. However, it returns an unpacked array, which suggests that it is not as efficient as it could be.
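
Whether a result is packed can be checked with Developer`PackedArrayQ, the same test applied to the input in the first answer. A minimal sketch, assuming Internal`RepetitionFromMultiplicity is available in your version (it is undocumented and may change); the names res and packed are introduced here, and the result of the first check is not shown because it may depend on the version.

```mathematica
list = RandomInteger[{3, 12}, {10^5, 2}];
res = Internal`RepetitionFromMultiplicity[list];

(* is the result packed as returned? *)
Developer`PackedArrayQ[res]

(* packing afterwards changes the storage, not the contents *)
packed = Developer`ToPackedArray[res];
Developer`PackedArrayQ[packed]
packed == res
```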

Here is an attempt to produce a compiled version that also allows for parallelization:

getRepetitionFromMultiplicity = 
  Compile[{{list, _Integer, 2}, {start, _Integer}, {stop, _Integer}},
   Block[{a, x, y, c = 0},
    a = Table[0, {i, 1, Total[list[[start ;; stop, 2]]]}];
    Do[
     x = Compile`GetElement[list, i, 1];
     y = Compile`GetElement[list, i, 2];
     Do[c++; a[[c]] = x, {j, 1, y}],
     {i, start, stop}
     ];
    a
    ],
   CompilationTarget -> "C",
   RuntimeAttributes -> {Listable},
   Parallelization -> True,
   RuntimeOptions -> "Speed"
   ];

repetitionFromMultiplicity[list_?MatrixQ, jobs_: 1000] := 
 Module[{len, starts, stops},
  If[jobs <= Length[list],
   len = Floor[Length[list]/jobs];
   starts = len Range[0, jobs - 1] + 1;
   stops = len Range[1, jobs];
   stops[[-1]] = Length[list];
   Join @@ getRepetitionFromMultiplicity[list, starts, stops]
   ,
   getRepetitionFromMultiplicity[list, 1, Length[list]]
   ]
  ]
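
To see how repetitionFromMultiplicity splits the rows into jobs, the index arithmetic can be run on its own. A small sketch with hypothetical values n = 10 rows and jobs = 3; the other variable names mirror those in the function above.

```mathematica
n = 10; jobs = 3;                      (* illustrative values, not from the answer *)
len = Floor[n/jobs];                   (* 3 rows per chunk *)
starts = len Range[0, jobs - 1] + 1;   (* {1, 4, 7} *)
stops = len Range[1, jobs];            (* {3, 6, 9} *)
stops[[-1]] = n;                       (* last chunk absorbs the remainder *)
Transpose[{starts, stops}]
(* {{1, 3}, {4, 6}, {7, 10}} *)
```

Each {start, stop} pair is handled by one call of the Listable compiled function, and Join @@ concatenates the pieces in order.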

These are the timings (on a quad core machine):

list = RandomInteger[{3, 12}, {10^7 + 1, 2}];
res2 = Internal`RepetitionFromMultiplicity[list]; // AbsoluteTiming // First
res3 = repetitionFromMultiplicity[list]; // AbsoluteTiming // First
Developer`ToPackedArray@res2 == res3

4.85631

0.586881

True

corey979
Henrik Schumacher