Say I have a list of original data (for now generating random list as follows):
r[i_] := RandomReal[{0.1, 1}];          (* one real in [0.1, 1] *)
R[i_] := RandomReal[1, 5];              (* a list of five reals in [0, 1] *)
data = Table[{r[j], R[j]}, {j, 10^5}];  (* 10^5 pairs {scalar, 5-vector} *)
Now, I want to divide the original data into three categories according to the first entry of each pair:
(i) category #1: first entry less than 0.2;
(ii) category #2: first entry greater than 0.2 but less than 0.5;
(iii) category #3: first entry greater than 0.5 but less than 0.9.
So I append them to three different lists as:
choose[ii_] := data[[ii]];
new1 = {}; new2 = {}; new3 = {};
Do[
  output = choose[ii];
  If[output[[1]] < 0.2, AppendTo[new1, output]];
  If[0.2 < output[[1]] < 0.5, AppendTo[new2, output]];
  If[0.5 < output[[1]] < 0.9, AppendTo[new3, output]],
  {ii, 1, Length@data}]; // AbsoluteTiming
This takes a huge amount of time:
{62.3863, Null}
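The bottleneck here is AppendTo: every call copies the entire list built so far, so the loop is quadratic in the data size. Below is a minimal sketch of a linear-time variant of the same loop, swapping in Sow/Reap (a standard collection technique, not from the original post) in place of AppendTo. Note that the Which branches assign the boundary values 0.2 and 0.5 to the next category, unlike the strict inequalities above; adjust if that matters.

{new1, new2, new3} = Catenate /@ Last[Reap[
     Do[With[{x = data[[ii]]},
        Which[
         x[[1]] < 0.2, Sow[x, 1],   (* category 1 *)
         x[[1]] < 0.5, Sow[x, 2],   (* category 2: 0.2 <= first entry < 0.5 here *)
         x[[1]] < 0.9, Sow[x, 3]]], (* category 3: 0.5 <= first entry < 0.9 here *)
      {ii, Length@data}],
     {1, 2, 3}]]; // AbsoluteTiming

Reap with the tag list {1, 2, 3} returns the sown lists grouped per tag; Catenate flattens each group (and leaves an empty category as {}).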
My practical data set is so large that this method is of no use. I tried ParallelDo, but it seems not to collect anything.
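As for why ParallelDo appears to collect nothing: each subkernel works on its own copy of new1, new2, new3, and AppendTo mutates that local copy, so the main kernel never sees the results. A sketch of the standard workaround, shown for category 1 only; be warned that the per-write synchronization usually makes this slower than the serial loop, so it only illustrates the behaviour:

SetSharedVariable[new1, new2, new3]; (* every read/write now goes through the main kernel *)
new1 = {}; new2 = {}; new3 = {};
ParallelDo[
  If[data[[ii, 1]] < 0.2, AppendTo[new1, data[[ii]]]],
  {ii, Length@data}]; (* new1 is now populated, unlike with unshared variables *)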
My question is: How can I parallelize the above code and make it super fast?
Thank you in advance :))
Comments:

Select? {new1, new2, new3} = {Select[data, #[[1]] < 0.2 &], Select[data, 0.2 < #[[1]] < 0.5 &], Select[data, 0.5 < #[[1]] < 0.9 &]}; You may want to use less-than-or-equal in some of those selector functions to capture all possible values. – MarcoB Dec 13 '21 at 21:54

You can use GroupBy/GatherBy to avoid the multiple Select too, like this: demux[y_] := With[{x = y[[1]]}, Which[x < 0.2, 1, 0.2 < x < 0.5, 2, 0.5 < x < 0.9, 3, True, Missing[]]]; assoc = GroupBy[data, demux], and then it's just new1 = assoc[1]; new2 = assoc[2], etc. – flinty Dec 13 '21 at 22:51
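Putting the two comments together, a runnable sketch against the data generated above; both approaches avoid the repeated copying that AppendTo causes, so they should be orders of magnitude faster than the original loop:

(* MarcoB's suggestion: one Select per category *)
{new1, new2, new3} = {
    Select[data, #[[1]] < 0.2 &],
    Select[data, 0.2 < #[[1]] < 0.5 &],
    Select[data, 0.5 < #[[1]] < 0.9 &]};

(* flinty's suggestion: a single pass with GroupBy; elements with first
   entry >= 0.9 (or exactly on a boundary) end up under the key Missing[] *)
demux[y_] := With[{x = y[[1]]},
    Which[x < 0.2, 1, 0.2 < x < 0.5, 2, 0.5 < x < 0.9, 3, True, Missing[]]];
assoc = GroupBy[data, demux];
new1 = assoc[1]; new2 = assoc[2]; new3 = assoc[3];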