
In short

I need NumericalCalculus`ND extended to mixed partial derivatives. This can be done by nesting ND calls, but care has to be taken to evaluate the underlying ND only when the parameters become numeric. There are examples in the answers to "Numerical partial derivative" and "Numeric calculation of Hessian" (make sure to note the discussion of efficiency). The challenge is to build such patterns with an arbitrary number of parameters.

Long story

I am forced to take the derivatives of my 200k-leaf-count expressions numerically. I am aware of NumericalCalculus`ND, which works great but does not allow mixed derivatives. There are some examples of how to do this efficiently in this and this answer. I would like to play around with a greater number of parameters and use a Scale option for each of them, so I would like to have a nice top-level function.

An example of what I would like it to be capable of:

nd[Sin[x Conjugate[y]] Cos[u v] /. {u -> u0, v -> v0}, {x, x0}, {y, y0}, Scale -> {xScale, yScale}]

Defining such a function with a variable number of parameters, and understanding the syntax in some of the examples, is beyond my capabilities. Could someone provide such a function? It is a pity that Mathematica does not ship with one.

Update to the long story: Both of the related answers I linked discuss how the underlying ND should be called only when all parameters are finally numeric. This avoids a large number of ND calls with partially numeric expressions. I would like the method to have the same feature, not just nest the ND calls.
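
For concreteness, here is roughly the two-variable pattern from those answers (only a sketch, with real variables and placeholder names fn, dfn, d2fn); what I am after is a single function that builds this for an arbitrary number of parameters:

Needs["NumericalCalculus`"]

(* fn stands in for my large expression; it evaluates only for
   explicit numbers, so ND is never called with symbolic arguments *)
fn[x_?NumericQ, y_?NumericQ] := Sin[x y];

(* first derivative in x, guarded in the same way *)
dfn[x_?NumericQ, y_?NumericQ] := ND[fn[t, y], t, x];

(* mixed second derivative: the outer ND varies y and calls dfn
   only with numeric values, so the inner ND sees only numbers *)
d2fn[x0_?NumericQ, y0_?NumericQ] := ND[dfn[x0, t], t, y0];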

Update 2: As I did not consider it relevant, I did not mention that I have complex variables. In this case the finite difference method is ill-defined, so a simple reimplementation of ND would not do.

Johu

2 Answers


Initial answer

I have a similar situation (though I numerically compute only first derivatives) and have found that the most reliable and efficient solution is not to rely on ND but to implement numerical derivatives yourself, using the definition

$$f'(a)=\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}$$

Such an approach requires, as a first step, finding the optimal $h$ for each of the parameters, i.e. the one that gives maximum precision. It cannot be too small, because then the precision of the difference $f(a+h)-f(a)$ vanishes. In my case, defining $h$ as

$$h=\frac{\log_2(|a|)}{2^n}$$

reduces the problem to finding the optimal value of $n$, which is the same for all the parameters. For this purpose I compute numerical approximations of all partial derivatives with $n$ running from 15 to 60 (I compute with high WorkingPrecision) and see where the precision of these approximations becomes sufficient or maximal for all the partial derivatives.
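
As an illustration of the scan (just a sketch; the toy function fun, the point a0 and the chosen range of n are made up for this example):

(* compute forward differences with h from the formula above and watch
   where the approximations stabilize and where precision degrades *)
fun[x_] := Exp[Sin[x]];
a0 = SetPrecision[3/2, 50];                 (* high working precision *)
fd[n_] := With[{step = Log2[Abs[a0]]/2^n},
   (fun[a0 + step] - fun[a0])/step];
TableForm[Table[{n, N[fd[n], 10], Precision[fd[n]]}, {n, 15, 60, 5}]]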

UPDATE: efficient use of ND

Thanks to acl's great answer, it is now clear that ND with Terms -> 1 uses exactly the same finite difference formula:

Needs["NumericalCalculus`"]
ND[f[u], u, x, Terms -> 1, Scale -> h, WorkingPrecision -> Infinity]

(-f[x] + f[h + x])/h
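
As a quick numeric cross-check (the concrete function g and the step value here are made up for illustration):

(* ND with Terms -> 1 agrees with the hand-written forward difference *)
g[x_] := Exp[x];
With[{h1 = 10.^-5},
 {ND[g[u], u, 1., Terms -> 1, Scale -> h1], (g[1. + h1] - g[1.])/h1}]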

But one should be careful here: increasing the number of terms leads to division of the step $h$ by powers of 2 in successive higher order terms:

ND[f[u], u, x, Terms -> 2, Scale -> h, WorkingPrecision -> Infinity] // Simplify

-((3 f[x] - 4 f[h/2 + x] + f[h + x])/h)

ND[f[u], u, x, Terms -> 3, Scale -> h, WorkingPrecision -> Infinity] // Simplify

(-21 f[x] + 32 f[h/4 + x] - 12 f[h/2 + x] + f[h + x])/(3 h)

This means that to get the best possible approximation with ND, one should estimate the step $h$ by the method described in the initial section of this answer, multiply it by $2^{t-1}$ (where $t$ is the value of the Terms option of ND), and use the result as the value of the Scale option.

These considerations can be compiled into a function with more intuitive and consistent behavior:

Options[myND] = {Terms -> 1, WorkingPrecision -> MachinePrecision};
(* here h is the desired smallest finite-difference step; Scale is rescaled
   so that ND actually samples with exactly that step for the given Terms *)
myND[expr_, x_, x0_, h_, opts : OptionsPattern[]] :=
 ND[expr, x, x0, Scale -> h*2^(OptionValue[Terms] - 1), opts]
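
A hypothetical usage example (the test expression Sin[u], the expansion point 12/10 and the step 10^-4 are made up for illustration): with Terms -> 2 the smallest step ND actually samples with is still the supplied 10^-4, because myND rescales Scale by 2^(Terms - 1).

myND[Sin[u], u, 12/10, 10^-4, Terms -> 2, WorkingPrecision -> 30]

N[Cos[12/10], 30]  (* exact derivative, for comparison *)
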
Alexey Popkov
  • This is exactly why I want to use ND: it uses more advanced methods, which I hope are more stable than a plain first-order finite difference. When I use the correct Scale, stability can be improved by increasing Terms. – Johu May 28 '14 at 11:55
  • Also, I need a method which works with complex variables. As I asked for reuse of ND, I did not consider this point relevant before. – Johu May 28 '14 at 12:03
  • @Johu How do you estimate the correct Scale for usage with ND? This parameter is critical for getting correct derivatives but I have not found any information on this topic in the Documentation. – Alexey Popkov May 28 '14 at 14:20
  • You probably know the general considerations of truncation vs. round-off errors. So I make a guess from foreknowledge about my function and check that the result is insensitive (the relevant change is small) to the Scale parameter (around the correct value). I guess the real error can only be estimated when you can also take the analytic derivative? – Johu May 28 '14 at 15:21
  • @Johu I do not take the analytic derivative. I just consider how derivatives depend on the parameter $n$. For relatively small $n$ this dependence is smooth and must asymptotically converge to the true value, but when $n$ becomes large the finite-precision errors introduce random noise which quickly becomes large with increasing $n$. It is easy to see where random noise becomes larger than the variation of a derivative with increasing $n$. So I choose an optimal $n$ at which random noise is an order of magnitude smaller than variation of derivatives with $n$. – Alexey Popkov May 28 '14 at 15:39
  • @Johu Do you think that Scale basically has the same meaning as $h$ in the finite difference formula? – Alexey Popkov May 28 '14 at 15:46
  • Yes, this is what I assumed. Sounds like another support ticket to Wolfram to ask for an improvement (this time in the documentation).

    We agree on how to estimate the h/Scale error. I just did not care about the optimal value. If the error was small compared to the value, that was good enough ("result is insensitive to Scale").

    – Johu May 28 '14 at 16:10
  • @Johu It was also my first thought, but Wikipedia correctly says that the typical choice of $h$ for double-precision arithmetic is of order $\sqrt{\varepsilon}x$, where $\varepsilon$ is the machine epsilon. It is not clear how this is related to the default value 1 of the Scale option. – Alexey Popkov May 28 '14 at 16:27
  • @Johu Documentation for ND and Scale says that Scale is the "size at which variations are expected" and that "the setting for Scale specifies the initial step size to use during Richardson extrapolation." I am not sure how to interpret these statements. – Alexey Popkov May 28 '14 at 16:27
  • @Johu and Alexei, belisarius and I played with this some time ago, see if you find that useful. – acl May 28 '14 at 17:50
  • @acl Big thanks! It clarifies the meaning of Scale. So @Johu is right: it means exactly the same as $h$ in the finite difference formula (with Terms -> 1). But it is important to know that it is divided by 2 in successive higher order terms: ND[f[u], u, x, Terms -> 2, Scale -> .1]. – Alexey Popkov May 28 '14 at 18:14
  • I am puzzled now. Is the documentation just wrong? It has nothing to do with "Richardson's extrapolation"? – Johu May 28 '14 at 19:18
  • @Johu It seems that ND uses forward (and backward for negative Scale) difference formulas obtained from Richardson extrapolation, see terminology here and the general formula here. – Alexey Popkov May 28 '14 at 20:00
  • @Johu and Alexey, forgot to mention that the code for ND is in AddOns/Packages/NumericalCalculus/NLimit.m so you can inspect it. – acl May 28 '14 at 23:08

If I understand correctly, you're trying to figure out how to generate expressions like

ND[ND[ND[g[x, y, z], x, 1], y, 1], z, 1]

(taken from the first link you provided). If so, I think what you want is something along these lines:

ClearAll[nd];
SetAttributes[nd, HoldFirst];
Options[nd] = {Scale -> Automatic};
(* NDFunc is an inert placeholder head, used so that the generated
   nested structure stays visible; for actual evaluation it would be
   replaced by NumericalCalculus`ND *)
nd[func_, params__List, OptionsPattern[]] :=
  With[
   {vars = {params}[[All, 1]], inits = {params}[[All, 2]]},
   Fold[
    NDFunc[#1, Sequence @@ MapAt[Scale -> # &, #2, -1]] &,
    Apply[func, vars],
    Transpose@{
      vars,
      inits,
      OptionValue[Scale] /. 
       Automatic :> ConstantArray[1, Length@vars]}]];

So:

nd[f, {x, x0}, {y, y0}, {z, z0}]

NDFunc[NDFunc[NDFunc[f[x, y, z], x, x0, Scale -> 1], y, y0, Scale -> 1], z, z0, Scale -> 1]

nd[f, {x, x0}, {y, y0}, {z, z0}, Scale -> {xscale, yscale, zscale}]

NDFunc[NDFunc[NDFunc[f[x, y, z], x, x0, Scale -> xscale], y, y0, Scale -> yscale], z, z0, Scale -> zscale]

mfvonh
  • Thank you, but this is only halfway there. See the discussion in either link on avoiding repeated ND calls before all parameters are numeric. I will also add this clarification to the question. – Johu May 27 '14 at 08:43