
I am working with spectral data (a 2D list) where an offset (a 1D list with one value per point) must be subtracted from each individual spectrum. I am looking for feedback on improvements over my current method.

    offset = {53.0617, 52.5185, 53.2469, 52.8025, 53.2716, 53.284, 53.2716, 53.6049, 53.5062, 53.642};

    data = {{-0.0617284, -1.51852, -2.24691, -1.80247, -2.2716, -0.283951, -2.2716, -4.60494, -2.50617, -2.64198}, {0.938272, -1.51852, -0.246914, 0.197531, -0.271605, 0.716049, 1.7284, 0.395062, -0.506173, 1.35802}, {-0.0617284, 0.481481, -0.246914, 0.197531, -0.271605, -0.283951, -1.2716, -2.60494, -2.50617, -0.641975}, {-4.06173, -3.51852, -2.24691, -1.80247, -2.2716, -2.28395, -2.2716, -2.60494, -2.50617, -0.641975}, {-0.0617284, -1.51852, -2.24691, -1.80247, -2.2716, -2.28395, -2.2716, -2.60494, -2.50617, -4.64198}, {-2.06173, -1.51852, -0.246914, 0.197531, -2.2716, -2.28395, -1.2716, -0.604938, -0.506173, -2.64198}, {-4.06173, -3.51852, -2.24691, -3.80247, -4.2716, -3.28395, -2.2716, -3.60494, -4.50617, -4.64198}, {-3.06173, -3.51852, -2.24691, -3.80247, -3.2716, -2.28395, -2.2716, -3.60494, -4.50617, -2.64198}, {-2.06173, -3.51852, -2.24691, -3.80247, -3.2716, -2.28395, -2.2716, -2.60494, -2.50617, -4.64198}, {-1.06173, -3.51852, -2.24691, -1.80247, -4.2716, -3.28395, -1.2716, -0.604938, -0.506173, -0.641975}};

I currently use the following Mathematica snippet for processing:

    data = (# - offset) & /@ data;

Is there a better use of Thread, Map, etc. that I should consider? The data sets typically contain thousands of spectra, each 1000-2000 points long, so the 2D list holds several million values.

OpticsMan

3 Answers

{n, m} = {10^4, 10^4};
offset = RandomReal[1, n];

data = RandomReal[1, {m, n}];

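(* Compiled element-wise subtraction for one spectrum; Listable threads it over the rows of data. *)
(* CompilationTarget -> "C" requires a C compiler to be installed. *)
cf = Compile[{{v, _Real, 1}, {offset, _Real, 1}}, 
   Table[v[[i]] - offset[[i]], {i, Length[v]}],
   RuntimeAttributes -> {Listable}, CompilationTarget -> "C", 
   RuntimeOptions -> "Speed"];

(* r1: original Map; r2: explicit offset matrix; r3: Outer; r4: compiled function *)
r1 = (# - offset) & /@ data; // RepeatedTiming
r2 = Plus[data, ConstantArray[-offset, m]]; // RepeatedTiming
r3 = ArrayReshape[Outer[Plus, Developer`ToPackedArray@{-offset}, data, 1], 
      {m, n}]; // RepeatedTiming
r4 = cf[data, offset]; // RepeatedTiming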

r1 == r2 == r3 == r4

Output

{1.08, Null}

{0.557, Null}

{0.233, Null}

{0.20, Null}

True
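The timings above use arrays from RandomReal, which are already packed. Real spectra read from files may not be packed, for instance when a list mixes integers and reals, and vectorized arithmetic is generally slower on unpacked lists. A minimal sketch of a check, assuming a small made-up list `raw` in place of real imported data (the names `raw` and `packed` are illustrative only):

```mathematica
(* Hypothetical stand-in for imported spectra: mixes integers and reals *)
raw = {{1, 2.5, 3.}, {4., 5., 6}};

(* Test whether the list is stored as a packed array *)
Developer`PackedArrayQ[raw]

(* Convert everything to machine reals, then pack *)
packed = Developer`ToPackedArray[N[raw]];
Developer`PackedArrayQ[packed]
```

If the second check reports that the array is packed, the comparisons above should carry over to the real data more directly.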

chyanog

A slightly faster method uses KroneckerProduct to create a suitable matrix of offsets. Some data:

{n, m} = {10^4, 10^4};
offset = RandomReal[100, n];
data = RandomReal[100, {m, n}];

Your method:

r1 = (#-offset)& /@ data; //AbsoluteTiming

{1.80568, Null}

Using KroneckerProduct:

r2 = data + KroneckerProduct[ConstantArray[-1., m], offset]; //AbsoluteTiming

{0.830738, Null}

Check:

r1 == r2

True
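To see what the offset matrix looks like, here is a small sketch with made-up numbers (the names `offsetSmall`, `dataSmall` and `offsetMatrix` are illustrative and not part of the timings above). For two vectors, KroneckerProduct returns their outer product, so every row of the result is `-offsetSmall`:

```mathematica
(* Hypothetical example: 3 spectra of 4 points each *)
offsetSmall = {1., 2., 3., 4.};
dataSmall = {{10., 20., 30., 40.}, {11., 21., 31., 41.}, {12., 22., 32., 42.}};

(* 3 x 4 matrix whose rows are all -offsetSmall *)
offsetMatrix = KroneckerProduct[ConstantArray[-1., Length[dataSmall]], offsetSmall];
Dimensions[offsetMatrix]

(* One matrix addition subtracts the offset from every spectrum *)
dataSmall + offsetMatrix == ((# - offsetSmall) & /@ dataSmall)
```

The final comparison should agree with the Map version. The approach trades memory for speed, since it builds a full m-by-n offset matrix.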

Carl Woll

The Map version is already efficient compared with MapThread, which is roughly seven times slower in this test.

data2 = Flatten[ConstantArray[data, 100000], 1];
First[Timing[data3 = (# - offset) & /@ data2;]]

0.384383

First[Timing[
  data4 = MapThread[
     Plus, {data2, -ConstantArray[offset, Length[data2]]}];]]

2.80672

data3 == data4

True

Chris Degnen