As I have advocated in another thread, usually the best way (by far) to speed up a calculation is first to improve the algorithm. Then--if it remains necessary--you can focus on platform-specific methods to speed up execution.
Here, newMat and weights are viewed as lists of vectors to be processed in parallel and (according to a comment to the question) sortMat is really a sequence $(y_0, y_0 + dy, y_0 + 2 dy, \ldots, y_0 + (n-1) dy)$. For each $y$ in this sequence, the algorithm seeks the sum of weights associated with elements $x$ of newMat less than or equal to $y$.
[To find those elements, we change the units of measurement of x and sortMat so that the new sortMat is the integral sequence $(1, 2, \ldots, n)$ and we sort x (and sort the weights in parallel with x to keep their association intact). After these preliminaries, the integer part of any entry in x tells us precisely how many elements of sortMat are less than it. Thus--work a few simple examples to check--the differences in this sorted, re-expressed x tell us how many times each cumulative weight ought to appear in the output. The larger sortMat is relative to x, the more computation this approach saves.]
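To make the bracketed claim concrete, here is a tiny made-up illustration (the grid and the values xs are invented for this example): a five-point grid with $y_0 = 1$, $dy = 1$, $n = 5$, so sortMat would be {1, 2, 3, 4, 5}, and three x values with unit weights.
xs  = {2.3, 0.7, 4.0};                           (* made-up sample values *)
x0s = Min[5, Max[0, Ceiling[(# - 1)/1]]] & /@ xs
(* {2, 0, 3}: how many grid points lie strictly below each entry of xs *)
Differences[{0}~Join~Sort[x0s]~Join~{5}]
(* {0, 2, 1, 2}: how many times each cumulative weight appears in the output *)
With unit weights the cumulative weights are {0, 1, 2, 3}, so the output is {1, 1, 2, 3, 3}, which matches the brute-force sums over this little grid.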
An efficient approach would capitalize on a preliminary sorting of newMat (which requires sorting the corresponding vector weights in parallel). Do this with Ordering. This reduces the amount of computation to the length of each row of newMat rather than the length of sortMat and it reduces a triple nested loop to a double nested loop (which is the minimum possible, given that the output is a rank 2 array).
ClearAll[process];
process[x_, w_, {y0_, dy_, ny_}] := Module[{x0, w0, v, i, f},
  (* for each entry of x, count the grid points y0, y0 + dy, ... strictly below it, clipped to [0, ny] *)
  x0 = Min[ny, Max[0, Ceiling[(# - y0)/dy]]] & /@ x;
  (* pad the sorted counts so their differences give the run length of each cumulative weight *)
  x0 = {0}~Join~Sort[x0]~Join~{ny};
  (* cumulative weights in order of increasing x, with a leading 0 for "no x below yet" *)
  w0 = {0}~Join~Accumulate[w[[Ordering[x]]]];
  (* repeat each cumulative weight according to its run length *)
  f := Function[{v, i}, Flatten[ConstantArray[#[[1]], #[[2]]] & /@ ({v, i}\[Transpose])]];
  f[w0, Differences[x0]]
  ];
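As a quick sanity check on a single made-up row (the same three x values as above, with weights chosen so each cumulative sum is easy to recognize), process can be compared with the direct definition:
xr = {2.3, 0.7, 4.0}; wr = {1., 10., 100.};
process[xr, wr, {1, 1, 5}]                          (* {10., 10., 11., 111., 111.} *)
Table[Total[wr Boole[Thread[xr <= y]]], {y, 1, 5}]  (* the same values, computed directly *)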
Let's check the timing and the correctness of this solution. Because it's much faster, let's make newMat have ten times as many rows as before:
bNum = 200;
nNum = 100;
weights = Table[RandomReal[{-10, 10}], {b, 1, bNum}, {n, 1, nNum}];
newMat = Table[RandomReal[{-10, 10}], {b, 1, bNum}, {n, 1, nNum}];
sortMat = Table[i, {i, -10, 10, 0.1}];
Now:
tab = Table[
    Sum[weights[[k, i]]*If[newMat[[k, i]] <= sortMat[[j]], 1., 0.], {i, 1, nNum}],
    {k, 1, bNum}, {j, 1, Length[sortMat]}]; // AbsoluteTiming
{8.2744732, Null}
and (describing sortMat with the list $(y_0, dy, n)$ = $(-10, 0.1, 201)$)
tab0 = MapThread[process[#1, #2, {-10, 0.1, 201}] &, {newMat, weights}]; // AbsoluteTiming
{0.0920053, Null}
It's 89 times faster, yet produces the same result to within floating point error:
Plus @@ (Abs[Chop[#]] & /@ (Flatten[tab] - Flatten[tab0])) == 0.0
True
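Equivalently (this is just an alternative check, not part of the original code), one can look directly at the largest absolute discrepancy, which should be at the level of machine rounding error:
Max[Abs[Flatten[tab] - Flatten[tab0]]]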
Now, if you wish to speed up the improved algorithm with a compilation, go ahead.
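In case it is useful, here is one way such a compiled version might look. This is only a sketch, not tested code: processC is a name introduced here, it re-implements the same sorted-merge idea with an explicit loop instead of the ConstantArray trick, and it assumes Ordering, Ceiling, and the other functions used compile cleanly (any that do not would simply fall back to ordinary evaluation).
processC = Compile[{{x, _Real, 1}, {w, _Real, 1}, {y0, _Real}, {dy, _Real}, {ny, _Integer}},
  Module[{ord = Ordering[x], cw = 0., j = 1, pos = 0, i = 0, out = Table[0., {ny}]},
    Do[
      i = ord[[k]];
      (* grid points y0, y0 + dy, ... strictly below x[[i]], clipped to [0, ny] *)
      pos = Min[ny, Max[0, Ceiling[(x[[i]] - y0)/dy]]];
      (* output positions up to pos see only the weights accumulated so far *)
      While[j <= pos, out[[j]] = cw; j++];
      cw += w[[i]],
      {k, Length[ord]}];
    (* any remaining positions see the total weight *)
    While[j <= ny, out[[j]] = cw; j++];
    out],
  RuntimeOptions -> "Speed"];
Because compiled functions thread over extra array dimensions, processC[newMat, weights, -10., 0.1, 201] should reproduce tab0 row by row.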
Edit
The comments and a further chat session (linked in the comments) uncovered some implicit assumptions in the original solution and led to an improvement that more closely matches the original. The two algorithms are not completely identical when applied to machine-precision numbers, though, due to differences in how the values in sortMat are computed and compared to the values in newMat. Testing indicates they do agree except when elements of newMat appear to equal elements of sortMat: an equality test might or might not return True in such cases.