How to only work on sublists with non-zero (or positive) values

Question

I have a large compositional dataset that contains both zero and non-zero values. Here is a sample:

  data = {{22054., 70.62, 0.37, 14.21, 2.89, 0.6, 2.05, 4.18, 4.04}, 
         {22055., 67.84, 0.52, 14.32, 3.77, 0.91, 3., 2.72, 4.62}, 
         {22581., 62.79, 0.62, 13.79, 7.27, 0.46, 2.92, 1.21, 7.97}, 
         {27601., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {27602., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {27603., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {28681., 0., 0., 0., 0., 0., 0., 0., 0.}, 
         {22757., 61.06, 0.77, 16.62, 5.72, 1.66, 4.66, 2.76, 3.46}, 
         {22803., 64.55, 3.01, 16.3, 0.48, 0.09, 0.05, 0.25, 14.29}};

I'm trying to convert the values from weight percent to mol percent. I want to use the following function to generate a new array with the same length (at level 1) as the original:

     mw = {60.084, 79.866, 101.961, 71.844, 40.304, 56.077, 61.98, 94.2};

     molPct[oxw_, mw_] :=
        Module[{oxcomp, divmw},
        oxcomp = Drop[oxw, None, 1];
        divmw = Transpose[Transpose[oxcomp]/mw];
        (divmw/ Total[divmw, {2}])*100
        ];

     dataMolPct = molPct[data, mw]

Unfortunately, this function fails because of the all-zero rows in the array: their row total is zero, so the normalization tries to divide by zero. The code works fine when I delete the rows containing zero values.

I tried the following in an attempt to ignore the 'zero' rows:

     molPct[oxw_ /; oxw > 0, mw_] :=
        Module[{oxcomp, divmw, oxmol},
        oxcomp = Drop[oxw, None, 1];
        divmw = Transpose[Transpose[oxcomp]/mw];
        (divmw/ Total[divmw, {2}])*100
        ];

...no luck

As I said, it is important that I end up with an array of the same size, as I will join the new data onto the original array.

It would also be nice to know how to do something similar to exclude negative numbers.

Any suggestions?

Would something like: With[{tot = Total@#}, If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ (#/mw & /@ Rest /@ data) produce the output you desire? — Pinguin Dirk, Mar 18 '13 at 07:27
@PinguinDirk, i'm not sure... what would it replace in the code? — geordie, Mar 18 '13 at 07:34
like: molPct[data_,mw_]:=With[{tot = Total@#}, If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ (#/mw & /@ Rest /@ data) - I am just checking if I understood the problem correctly. If it works & you like it, I will post a longer answer — Pinguin Dirk, Mar 18 '13 at 07:36
Yes, it seems to work... (not that i have a deep understanding of why?...). i'm both impressed and baffled that you don't need to specify the level.. Thanks. — geordie, Mar 18 '13 at 07:49

score 5 · Accepted Answer · answered Mar 18 '13 at 08:03

As discussed in the comments:

Based on your function, you could for example write something like:

molPct[oxw_, mw_] :=
  Module[{oxcomp, divmw},
    oxcomp = Drop[oxw, None, 1];
    divmw = Transpose[Transpose[oxcomp]/mw];
    With[{tot = Total@#},
      If[tot == 0, ConstantArray[0, Length@#], 100 #/tot]] & /@ divmw
  ];

Note that I only changed the last part, where I use Map (shorthand: /@) to map the With expression over your divmw. I suggest reading the documentation on Map; it is a very powerful and useful tool. Intuitively, it goes row by row over divmw: for each row it first calculates the total tot and then evaluates the If, which returns either a constant array of zeros or the row divided by its total and multiplied by 100.
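As a minimal sketch of that row-by-row behaviour, here is the same pure function applied to a made-up matrix standing in for divmw. The names toy and normalizeRow are illustrative only and do not appear in the question.

```mathematica
(* Toy matrix standing in for divmw; the second row is all zeros. *)
toy = {{1., 3.}, {0., 0.}, {2., 2.}};

(* The pure function receives one whole row as # each time it is called. *)
normalizeRow = (With[{tot = Total[#]},
     If[tot == 0, ConstantArray[0, Length[#]], 100 #/tot]] &);

normalizeRow /@ toy
(* {{25., 75.}, {0, 0}, {50., 50.}} *)
```

The all-zero row comes back as a row of zeros of the same length, so the result keeps the dimensions of the input.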

As also noted in the comments, one might use the following function to get to the same result:

molPct2[data_, mw_] := 
  With[{tot = Total@#}, 
  If[tot == 0, ConstantArray[0, Length@#], 
  100 #/tot]] & /@ (#/mw & /@ Rest /@ data)

I personally find that easier to write (and read), and it is supposedly faster, although I didn't test it. I guess there are many other (smarter) ways to write it.

Maybe a bit of explanation is appropriate here:

The first part (the With expression) is the same as in the code I used above in molPct. So what about the rest? It creates what you named divmw. How?

(#/mw & /@ Rest /@ data)

It starts with data. We Map the function Rest on that (what you used Drop for, see documentation). Then we Map the function (#/mw)& on it, to divide by mw. I guess the documentation is a perfect source of explanation on how that mapping actually works (I am terrible at explaining these things).
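A small sketch of the two mapping steps may help, along with the difference between Prefix (@) and Map (/@). The values below are made up, and toyData and toyMw are illustrative names rather than anything from the question.

```mathematica
toyData = {{101., 10., 20.}, {102., 0., 0.}};  (* first column is a sample ID *)
toyMw = {2., 4.};                              (* made-up weights *)

(* Step 1: Rest is applied to each row, dropping the ID column. *)
Rest /@ toyData
(* {{10., 20.}, {0., 0.}} *)

(* Step 2: each remaining row is divided element-wise by toyMw. *)
#/toyMw & /@ (Rest /@ toyData)
(* {{5., 5.}, {0., 0.}} *)

(* Prefix applies a function to the whole expression; Map applies it to each element. *)
Total @ {{1, 2}, {3, 4}}    (* {4, 6} *)
Total /@ {{1, 2}, {3, 4}}   (* {3, 7} *)
```

In both mapping steps the slot # stands for one complete row, so #/toyMw divides a whole row by the weight vector at once.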

on the contrary, i think you explanation is pretty clear. most of the time I find the document center is only useful once I know what i'm doing :-) thanks again! — geordie, Mar 18 '13 at 08:18
actually, i'm a little confused about the behavior of # in this function. Total@# and Length@# seem to be sampling the entire row (less the first column), whereas #/tot and #/mw seem to be looking at atoms. is this a correct assessment? if so, where does the switch in directive occur? you might have guessed by now that i'm fairly new to this... — geordie, Mar 18 '13 at 10:44
# is Slot, and "is used to represent arguments or formal parameters in pure functions of the form body& or Function[body]." (see documentation) - so Total@# is missing something. Also note the use of "@" (Prefix) and "/@" (Map). Try f@{a, b} versus f/@{a, b}, you see a difference. So the function #/mw& I used was mapped on the "matrix" (list of list) and thus "applied" to each row - do I make sense? :) — Pinguin Dirk, Mar 18 '13 at 11:08
see also: http://mathematica.stackexchange.com/questions/19035/ — Pinguin Dirk, Mar 18 '13 at 11:11

score 3 · Answer 2 · Mr.Wizard · 2013-03-18T10:39:44.723

Edit: copied code was invalid; fixed!

For speed you might try something like this:

fn[data_, mw_] :=
 With[{x = (Rest[data\[Transpose]]/mw)\[Transpose]},
   With[{t = Total[x, {2}]}, 100 x t^(1 - 2 Sign@t)]
 ]

The double-Transpose you started with is usually one of the fastest methods. It also looks better in a Notebook than it does here. The rest is handled numerically (Sign etc.) which should be faster on packed arrays of Reals.
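To see how the Sign exponent sidesteps the division by zero, here is a small sketch on made-up numbers. The names tot and xToy are illustrative and not part of the code above.

```mathematica
tot = {4., 0., 2.};   (* made-up row totals; the second one is zero *)

1 - 2 Sign[tot]
(* {-1, 1, -1} *)

tot^(1 - 2 Sign[tot])
(* {0.25, 0., 0.5} *)

(* Times threads the vector over the rows of the matrix, one factor per row. *)
xToy = {{1., 3.}, {0., 0.}, {1., 1.}};
100 xToy tot^(1 - 2 Sign[tot])
(* {{25., 75.}, {0., 0.}, {50., 50.}} *)
```

For a zero total the exponent is 1, so the factor is 0.^1 = 0. rather than 1/0. For a positive total the exponent is -1, which gives the reciprocal. A negative total would get exponent 3, so the trick as written only covers totals that are zero or positive.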

edited Mar 18 '13 at 10:39

answered Mar 18 '13 at 09:32

Mr.Wizard


it's a very elegant answer. thanks! i really like the use of Sign as a power to vet the data. is there an easy way to modify the implementation so that it also ignores negative numbers? – geordie Mar 18 '13 at 10:35
@geordie Negative numbers in the totals you mean? – Mr.Wizard Mar 18 '13 at 10:38
@geordie hm... I had invalid code in my answer; sorry! – Mr.Wizard Mar 18 '13 at 10:40
yes, in the totals but also more generally within a row (it would be nice to flag such instances but i wouldn't alter the values). perhaps this is another question.... – geordie Mar 18 '13 at 10:52
@geordie okay, let me think about that – Mr.Wizard Mar 18 '13 at 10:54
@geordie as a simple patch for this method you could use 100 x Clip[t^(1 - 2 Sign@t), {0, \[Infinity]}] but that's not too elegant. I'm still thinking about the best way to handle/filter negative values in general. – Mr.Wizard Mar 18 '13 at 11:27
this approach causes negative to become zero which is not really ideal. i'm aiming to preserve them unmodified in the new array. – geordie Mar 18 '13 at 11:50
let us continue this discussion in chat – geordie Mar 18 '13 at 12:10
