I want to write some code to simulate a damped oscillator that is just as fast as code written using Numba's @njit decorator. I've written the Mathematica code below, and mine is 20-40x slower than the Python code written by YouTuber Jack of Some.
Here is the code from Jack of Some's video on speeding up Python code with Numba; I've changed it slightly so it runs in a single Jupyter cell:
import numpy as np
from numba import jit, njit, types, vectorize

@njit
def friction_fn(v, vt):
    if v > vt:
        return - v * 3
    else:
        return - vt * 3 * np.sign(v)

@njit
def simulate_spring_mass_funky_damper(x0, T=10, dt=0.0001, vt=1.0):
    times = np.arange(0, T, dt)
    positions = np.zeros_like(times)

    v = 0
    a = 0
    x = x0
    positions[0] = x0/x0

    for ii in range(len(times)):
        if ii == 0:
            continue
        t = times[ii]
        a = friction_fn(v, vt) - 100*x
        v = v + a*dt
        x = x + v*dt
        positions[ii] = x/x0

    return times, positions

_ = simulate_spring_mass_funky_damper(0.1)
%time _ = simulate_spring_mass_funky_damper(0.1)
The output is
CPU times: user 1.38 ms, sys: 337 µs, total: 1.72 ms
Wall time: 1.72 ms
vs my Mathematica code
ClearAll[friction, simulateSpring, jacksSpring];

friction = Compile[{{v, _Real}, {vt, _Real}},
   If[v > vt,
    -v*3.0,
    -vt*3.0*Sign[v]]
   ];

simulateSpring =
  Compile[{{x0, _Real}, {t, _Real}, {dt, _Real}, {vt, _Real}},
   Module[{τ, times, positions, v = 0.0, a = 0.0, x = x0},
    τ = t;
    times = Range[0.0, t, dt];
    positions = ConstantArray[0.0, Length@times];
    positions[[1]] = x0/x0;

    Do[
     τ = times[[i]];
     a = friction[v, vt] - 100*x;
     v = v + a*dt;
     x = x + v*dt;
     positions[[i]] = x/x0;
     ,
     {i, 2, Length@times}];

    {times, positions}
    ]
   ];

jacksSpring[x_] := simulateSpring[x, 10.0, 0.0001, 1.0];

Print["CPU time: ", Timing[jacksSpring[0.1]][[1]]*1000, " ms"]
from which we have
CPU time: 27.703 ms
Comments:

AbsoluteTiming that you want, not Timing. – C. E. Jul 04 '20 at 20:57

With SetOptions[Compile, CompilationTarget -> "C", RuntimeOptions -> "Speed"] I get AbsoluteTiming down to about 3 ms, on Windows. – Rolf Mertig Jul 04 '20 at 21:13

2.639 ms on my machine. I'm still interested if anyone has any ideas to beat the Python attempt, especially if they can rewrite the code in a more idiomatic fashion. – Diffycue Jul 04 '20 at 21:19

FunctionCompile. Though probably it will be slower than Compile. – Rolf Mertig Jul 04 '20 at 21:25

By the way, thank you for FeynCalc – it really helped me out in my QFT courses. – Diffycue Jul 04 '20 at 21:40

FunctionCompile, and the results were so close as to not be worth it on my machine (~6.1 ms with Compile and the options above, ~6.2 ms with FunctionCompile). I had to wrap Range in a KernelFunction as the built-in Native`Range does not support Real arguments, but since that's only called once I would be surprised if that were the bottleneck. Is there some way to replace the Do with some matrix maths? – Carl Lange Jul 04 '20 at 22:38
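For concreteness, here is a minimal sketch of what the comments suggest trying (I don't claim it's optimal): set the Compile options globally, re-evaluate the definitions above so they pick up those options, and measure wall-clock time with AbsoluteTiming rather than Timing.

SetOptions[Compile, CompilationTarget -> "C", RuntimeOptions -> "Speed"];

(* re-evaluate friction, simulateSpring and jacksSpring from above here,
   so they are compiled to C with the "Speed" runtime options *)

Print["wall time: ", First[AbsoluteTiming[jacksSpring[0.1]]]*1000, " ms"]

This is essentially how I got the ~2.6 ms figure mentioned in the comments, which is closer to, but still behind, the Numba timing.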