Convert frequency counts to long notation in Mathematica

Question

I have a series of frequencies of values, e.g. 5 times 1, 10 times 2 and 5 times 3, as in

list={{1,5},{2,10},{3,5}}

and I would like to convert this to long notation as in

list2={1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3}

What is the most elegant way to do this in Mathematica?

Vitaliy Kaurov · Accepted Answer · 2014-04-12T12:24:39.223

11

I added timings. The fastest here is Join @@ MapThread[ConstantArray, Thread[list]] at about 15.5 s. I am sure there are faster versions. If speed is important, you can parallelize or come up with a compiled solution.

In[1]:= list = RandomInteger[{3, 12}, {10^7, 2}];

In[2]:= list // Developer`PackedArrayQ
Out[2]= True

In[3]:= Table[#1, {#2}] & @@@ list // Flatten; // AbsoluteTiming
Out[3]= {22.015290, Null}

In[4]:= Join @@ (Table[#1, {#2}] & @@@ list); // AbsoluteTiming
Out[4]= {18.528328, Null}

In[13]:= Join @@ ConstantArray @@@ list; // AbsoluteTiming
Out[13]= {18.261945, Null}

In[5]:= ConstantArray[#1, #2] & @@@ list // Flatten; // AbsoluteTiming
Out[5]= {43.177745, Null}

In[6]:= NestList[# &, #1, #2 - 1] & @@@ list // Flatten; // AbsoluteTiming
Out[6]= {30.278883, Null}

In[7]:= Join @@ MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming
Out[7]= {15.465663, Null}

In[8]:= Flatten @ MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming
Out[8]= {40.184748, Null}

In[9]:= Join @@ MapThread[Table[#1, {#2}] &, Thread[list]]; // AbsoluteTiming
Out[9]= {18.716637, Null}

In[3]:= Inner[ConstantArray, Sequence @@ Transpose@list, Join]; // AbsoluteTiming
Out[3]= {16.525300, Null}
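
For reference, here is a quick sanity check of two of the timed variants on the small list from the question. This is only a sketch; the names small, viaApply and viaMapThread are illustrative and not part of the original session.

```mathematica
(* value/count pairs from the question *)
small = {{1, 5}, {2, 10}, {3, 5}};

(* ConstantArray @@@ small builds {ConstantArray[1, 5], ...}; Join @@ concatenates the pieces *)
viaApply = Join @@ ConstantArray @@@ small;

(* Thread[small] transposes the pairs into {values, counts} for MapThread *)
viaMapThread = Join @@ MapThread[ConstantArray, Thread[small]];

viaApply === viaMapThread
viaApply
(* {1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3} *)
```

Both variants expand each pair separately and concatenate once at the end with Join, which in the timings above is consistently cheaper than Flatten.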

edited Apr 12 '14 at 12:24

answered Apr 12 '14 at 10:10

Thx millions for the many possible solutions!! – Tom Wenseleers Apr 12 '14 at 10:25
@TomWenseleers I added some timings ;-) – Vitaliy Kaurov Apr 12 '14 at 10:30
Great - many thx! – Tom Wenseleers Apr 12 '14 at 10:39
I am getting better timings with Join @@ ConstantArray @@@ list: about 50 percent of the timings for ConstantArray[#1, #2] & @@@ list // Flatten and about as good as Join @@MapThread[ConstantArray, Thread[list]]. (+1) – kglr Apr 12 '14 at 11:00
... Similarly for Inner[ConstantArray, Sequence @@ Transpose@list, Join] :) – kglr Apr 12 '14 at 11:56
@kguler thanks, i added those - still 15 sec. on my machine are leading. Mac OSX. – Vitaliy Kaurov Apr 12 '14 at 12:25

score 9 · Answer 2 · edited Jun 16 '20 at 09:23

Internal`RepetitionFromMultiplicity

list = {{1, 5}, {2, 10}, {3, 5}};
Internal`RepetitionFromMultiplicity @ list

{1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3}

In the timing below, this is faster than the fastest method in @Vitaliy's post (about 6.7 s versus 8.6 s).

list = RandomInteger[{3, 12}, {10^7, 2}];
res1 = Join @@ MapThread[ConstantArray, Thread[list]]; // AbsoluteTiming // First

8.5771

res2 = Internal`RepetitionFromMultiplicity[list]; // AbsoluteTiming // First

6.6958

res1 == res2

True

answered Jan 14 '18 at 09:08

kglr

Where did you dig that one out? – Henrik Schumacher Jan 14 '18 at 09:42
@HenrikSchumacher, some time back, searching (I think) for *Repeated*, I mistyped ?? *`*Repet* and this was one in a short list of results. As the name suggests what it does and the syntax, my first or second guess worked:) – kglr Jan 14 '18 at 10:17
Great intuition! =D – Henrik Schumacher Jan 14 '18 at 10:20

score 5 · Answer 3 · edited Jan 14 '18 at 09:53

kglr made a very interesting find with Internal`RepetitionFromMultiplicity. However, Internal`RepetitionFromMultiplicity produces unpacked arrays and that tells me that it is not as efficient as it could be.
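
The packing claim can be checked directly. The following is a minimal sketch, assuming the undocumented Internal`RepetitionFromMultiplicity behaves as described in this thread; the result of the first PackedArrayQ test may differ between versions, so no output is shown for it. The names small, res and packed are illustrative.

```mathematica
small = {{1, 5}, {2, 10}, {3, 5}};
res = Internal`RepetitionFromMultiplicity[small];

(* reported as False in this answer; may depend on the Mathematica version *)
Developer`PackedArrayQ[res]

(* packing after the fact gives the same values in packed storage *)
packed = Developer`ToPackedArray[res];
Developer`PackedArrayQ[packed]
packed == res
```

Packing afterwards costs an extra pass over the data, which is one reason a function that writes straight into a packed array can come out ahead.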

Here is an attempt to produce a compiled version that also allows for parallelization:

getRepetitionFromMultiplicity = 
  Compile[{{list, _Integer, 2}, {start, _Integer}, {stop, _Integer}},
   Block[{a, x, y, c = 0},
    (* preallocate the output for rows start..stop *)
    a = Table[0, {i, 1, Total[list[[start ;; stop, 2]]]}];
    Do[
     x = Compile`GetElement[list, i, 1]; (* the value *)
     y = Compile`GetElement[list, i, 2]; (* its count *)
     (* write x into the next y slots; j avoids shadowing the outer i *)
     Do[c++; a[[c]] = x, {j, 1, y}],
     {i, start, stop}
     ];
    a
    ],
   CompilationTarget -> "C",
   RuntimeAttributes -> {Listable},
   Parallelization -> True,
   RuntimeOptions -> "Speed"
   ];

repetitionFromMultiplicity[list_?MatrixQ, jobs_: 1000] := 
 Module[{len, starts, stops},
  If[jobs <= Length[list],
   len = Floor[Length[list]/jobs];
   starts = len Range[0, jobs - 1] + 1;
   stops = len Range[1, jobs];
   stops[[-1]] = Length[list];
   Join @@ getRepetitionFromMultiplicity[list, starts, stops]
   ,
   getRepetitionFromMultiplicity[list, 1, Length[list]]
   ]
  ]
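
To make the chunking in repetitionFromMultiplicity concrete, here is the same index arithmetic for a hypothetical input of 10 rows split into 3 jobs (n stands in for Length[list]; both numbers are chosen only for illustration).

```mathematica
n = 10; jobs = 3;                     (* hypothetical sizes *)
len = Floor[n/jobs];                  (* 3 rows per job *)
starts = len Range[0, jobs - 1] + 1;  (* {1, 4, 7} *)
stops = len Range[1, jobs];           (* {3, 6, 9} *)
stops[[-1]] = n;                      (* last job takes the remainder: {3, 6, 10} *)
Transpose[{starts, stops}]
(* {{1, 3}, {4, 6}, {7, 10}} *)
```

Because the compiled function is Listable, passing the starts and stops lists threads it over these ranges, and with Parallelization -> True the chunks can run in parallel before Join concatenates them.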

These are the timings (on a quad core machine):

list = RandomInteger[{3, 12}, {10^7 + 1, 2}];
res2 = Internal`RepetitionFromMultiplicity[list]; // AbsoluteTiming // First
res3 = repetitionFromMultiplicity[list]; // AbsoluteTiming // First
Developer`ToPackedArray@res2 == res3

4.85631

0.586881

True
