
I have a large compositional dataset that contains both non-zero and zero values. Here is a sample:

  data = {{22054., 70.62, 0.37, 14.21, 2.89, 0.6, 2.05, 4.18, 4.04}, 
         {22055., 67.84, 0.52, 14.32, 3.77, 0.91, 3., 2.72, 4.62}, 
         {22581., 62.79, 0.62, 13.79, 7.27, 0.46, 2.92, 1.21, 7.97}, 
         {27601., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {27602., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {27603., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {28681., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {22757., 61.06, 0.77, 16.62, 5.72, 1.66, 4.66, 2.76, 3.46}, 
         {22803., 64.55, 3.01, 16.3, 0.48, 0.09, 0.05, 0.25, 14.29}};

I'm trying to convert the values from weight percent to mol percent. I wish to use the following function to generate a new array of the same length (at level 1) as the original:

     mw = {60.084, 79.866, 101.961, 71.844, 40.304, 56.077, 61.98, 94.2};

     molPct[oxw_, mw_] :=
        Module[{oxcomp, divmw},
        oxcomp = Drop[oxw, None, 1];
        divmw = Transpose[Transpose[oxcomp]/mw];
        (divmw/ Total[divmw, {2}])*100
        ];

     dataMolPct = molPct[data, mw]

Unfortunately, this function fails because of the all-zero rows in the array: their row total is zero, so the normalization tries to divide by zero. The code works fine when I delete the rows containing zero values.

I tried the following in an attempt to ignore the 'zero' rows:

     molPct[oxw_ /; oxw > 0, mw_] :=
        Module[{oxcomp, divmw, oxmol},
        oxcomp = Drop[oxw, None, 1];
        divmw = Transpose[Transpose[oxcomp]/mw];
        (divmw/ Total[divmw, {2}])*100
        ];

...no luck

As I said, it is important that I end up with an array of the same size, since I will join the new data onto the original array.

It would also be nice to know how to do something similar to exclude negative numbers.

Any suggestions?

geordie
  • Would something like: With[{tot = Total@#}, If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ (#/mw & /@ Rest /@ data) produce the output you desire? – Pinguin Dirk Mar 18 '13 at 07:27
  • @PinguinDirk, i'm not sure... what would it replace in the code? – geordie Mar 18 '13 at 07:34
  • like: molPct[data_,mw_]:=With[{tot = Total@#}, If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ (#/mw & /@ Rest /@ data) - I am just checking if I understood the problem correctly. If it works & you like it, I will post a longer answer – Pinguin Dirk Mar 18 '13 at 07:36
  • Yes, it seems to work... (not that i have a deep understanding of why?...). i'm both impressed and baffled that you don't need to specify the level.. Thanks. – geordie Mar 18 '13 at 07:49

2 Answers


As discussed in the comments:

Based on your function, you could for example write something like:

molPct[oxw_, mw_] :=
  Module[{oxcomp, divmw},
    oxcomp = Drop[oxw, None, 1];
    divmw = Transpose[Transpose[oxcomp]/mw];
    (* normalize each row; all-zero rows are returned as zeros *)
    With[{tot = Total@#},
      If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ divmw];

Note that I only changed the last part, where I use Map (shorthand: /@) to apply the With expression to each row of your divmw. I suggest you read the documentation on Map; it is a very powerful and useful tool. Intuitively, it goes through divmw row by row: for each row it first calculates the total tot and then evaluates the If, which returns either a constant array of 0s (when the total is zero) or the row divided by its total, times 100.
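To see what the mapped function does to each row, here is a minimal sketch on a made-up two-row matrix. The name normalizeRow and the sample values are only for illustration; they are not part of the original code.

```mathematica
(* hypothetical helper: the same With/If body as above, just given a name *)
normalizeRow = With[{tot = Total@#},
    If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] &;

(* made-up input: one ordinary row and one all-zero row *)
normalizeRow /@ {{1., 3.}, {0., 0.}}
(* {{25., 75.}, {0, 0}} *)
```

Note that the all-zero row comes back as integer zeros, because ConstantArray[0, ...] is given an integer. If you want the result to stay a purely real array, ConstantArray[0., Length@#] should do that.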

As also noted in the comments, one might use the following function to get to the same result:

molPct2[data_, mw_] := 
  With[{tot = Total@#}, 
  If[tot == 0, ConstantArray[0, Length@#], 
  100 #/tot]] & /@ (#/mw & /@ Rest /@ data)

I personally find that easier to write and read, and it is supposedly faster, though I didn't test that. There are probably many other (smarter) ways to write it.

Maybe a bit of explanation is appropriate here:

The first part (the With[...] & function) is the same as the one I mapped in molPct above. So what about the rest? It creates what you named divmw. How?

(#/mw & /@ Rest /@ data)

It starts with data. We Map the function Rest over it, which drops the first element of each row (this is what you used Drop for; see the documentation). Then we Map the function (#/mw)& over the result to divide each row by mw. The documentation is a good source of explanation on how that mapping actually works (I am terrible at explaining these things).
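The two mapping steps can be tried one at a time. This is a small sketch with made-up numbers; sample and w are hypothetical names standing in for data and mw, and f, a, b are assumed to have no values in the session.

```mathematica
(* made-up sample: first column is an ID, the rest are values *)
sample = {{101, 2., 9.}, {102, 4., 3.}};
w = {2., 3.};  (* hypothetical per-column divisors, standing in for mw *)

(* step 1: drop the first element of every row *)
Rest /@ sample
(* {{2., 9.}, {4., 3.}} *)

(* step 2: divide every remaining row elementwise by w *)
#/w & /@ Rest /@ sample
(* {{1., 3.}, {2., 1.}} *)

(* Prefix (@) applies f to the whole list, Map (/@) applies it to each element *)
f @ {a, b}
(* f[{a, b}] *)
f /@ {a, b}
(* {f[a], f[b]} *)
```

Inside the mapped function, # is therefore one whole row each time, which is why Total@# and Length@# see the entire row while #/w divides it element by element.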

Pinguin Dirk
  • on the contrary, i think your explanation is pretty clear. most of the time I find the documentation center is only useful once I know what i'm doing :-) thanks again! – geordie Mar 18 '13 at 08:18
  • actually, i'm a little confused about the behavior of # in this function. Total@# and Length@# seem to be sampling the entire row (less the first column), whereas #/tot and #/mw seem to be looking at atoms. is this a correct assessment? if so, where does the switch in directive occur? you might have guessed by now that i'm fairly new to this... – geordie Mar 18 '13 at 10:44
  • # is Slot, and "is used to represent arguments or formal parameters in pure functions of the form body& or Function[body]." (see documentation) - so Total@# is missing something. Also note the use of "@" (Prefix) and "/@" (Map). Try f@{a, b} versus f/@{a, b}, you see a difference. So the function #/mw& I used was mapped on the "matrix" (list of list) and thus "applied" to each row - do I make sense? :) – Pinguin Dirk Mar 18 '13 at 11:08
  • see also: http://mathematica.stackexchange.com/questions/19035/ – Pinguin Dirk Mar 18 '13 at 11:11

Edit: copied code was invalid; fixed!

For speed you might try something like this:

fn[data_, mw_] :=
 With[{x = (Rest[data\[Transpose]]/mw)\[Transpose]},
   With[{t = Total[x, {2}]}, 100 x t^(1 - 2 Sign@t)]
 ]

The double-Transpose you started with is usually one of the fastest methods. It also looks better in a Notebook than it does here, because \[Transpose] displays as a superscript symbol. The rest is handled numerically (Sign etc.), which should be faster on packed arrays of Reals.
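The exponent 1 - 2 Sign@t is what avoids the division by zero: it is -1 where the total is positive (so t^-1 is the reciprocal) and 1 where the total is zero (so 0^1 is just 0). A small sketch with made-up totals and rows:

```mathematica
t = {4., 0., 2.};   (* made-up row totals *)

1 - 2 Sign[t]
(* {-1, 1, -1} *)

t^(1 - 2 Sign[t])   (* reciprocal for positive totals, 0 for zero totals *)
(* {0.25, 0., 0.5} *)

x = {{1., 3.}, {0., 0.}, {1., 1.}};   (* made-up rows whose totals are t *)
100 x t^(1 - 2 Sign[t])
(* {{25., 75.}, {0., 0.}, {50., 50.}} *)
```

This assumes the totals are never negative. For a negative total Sign gives -1, the exponent becomes 3, and the result is no longer a percentage.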

Mr.Wizard