1

I'm trying to do a little Monte-Carlo, and the code is working great, but slows down when I try to increase the number of iterations, any suggestions on how I can parallelize this? Every way I've tried actually takes longer to run than the non-parallelized version.

n = 500
errors = {};
time = 60 60; (*seconds*)
velocity = 3.352;(*meters per second, about 8 minute/mile*)
gpserror = 4;(*meters*)
angles = {};
rands = {};
diss = {};
cf2 = Compile[{{gpserrx, _Real}, {gpserry, _Real}, {gpserr2x, _Real}, \
{gpserr2y, _Real}, {velocity, _Real}}, 
   Sqrt[(gpserrx - gpserr2x - velocity)^2 + (gpserry - gpserr2y)^2]];
For[i = 1, i <= n, i++, {
  gpspos = {0, 0},
  gpsdistance = 0,
  angle = RandomReal[{0, \[Pi]}, time + 1],
  rand = 
   RandomVariate[
    NormalDistribution[0, gpserror/(Sqrt[2] InverseErf[19/20])], 
    time + 1],
  rands = 
   Append[rands, 
    Table[rand[[k]]*{Cos[angle[[k]]], Sin[angle[[k]]]}, {k, time}]],
  DistributeDefinitions[rand, angle, cf2],
  gpsdistance =
   Sum[cf2[rand[[j]]*Cos[angle[[j]]], rand[[j]]*Sin[angle[[j]]], 
     rand[[j + 1]]*Cos[angle[[j + 1]]], 
     rand[[j + 1]]*Sin[angle[[j + 1]]], velocity], {j, 1, time}],
  diss = Append[diss, gpsdistance],
  errors = Append[errors, gpsdistance - velocity time]
  }]
xzczd
  • 65,995
  • 9
  • 163
  • 468
Daniel
  • 115
  • 6
  • Could you explain what your code does, and where the bottleneck is? – MarcoB Dec 12 '23 at 02:27
  • It's a Monte-Carlo exploring the affects of GPS errors. Each GPS measurement has errors, and I'm investigating how using GPS to measure distance adds up over the course of a run/bike-ride. The slowest step in a simulation appears to be at calculating the variable gpsdistance. But the slowest step really doesn't matter to me, I'd like to run simulations in parallel. – Daniel Dec 12 '23 at 02:33
  • Thank you, that helps. Can you be more explicit about what the steps in your code do? – MarcoB Dec 12 '23 at 02:39
  • Everything before the For loop just gives variables for all simulations to use. GPSpos and GPSdistance initialize the simulation. Angle and rand define the set of errors of each measurement in one simulation, in terms of an angle and magnitude. – Daniel Dec 12 '23 at 02:55
  • rand assumes a normal distribution centered about 0, this isn't totally accurate but close enough for now while I work on the code for the proper function to calculate quickly. Rands then turns rand and angle into a set of all the errors in terms of cartesian coordinates. gpsdistance then calculates the sum of all the distances between each steps, this uses pythagorean thereom on each gps measurement error, as well as the predefined constant velocity. – Daniel Dec 12 '23 at 02:55
  • diss compiles the distance from all simulations into one list, errors also compiles the total error from each simulation into one list(the error between GPS calculated distance, and the true distance being distance*time) – Daniel Dec 12 '23 at 02:55
  • Oh GPSerror=4 defines the standard deviation on the normal distribution such that 95% of the errors will fall within 4 meters of the true value. – Daniel Dec 12 '23 at 02:58
  • One immediate thing that may offer big improvement is using Reap and Sow rather than Append – evanb Dec 13 '23 at 08:36

1 Answers1

5

…You're simply in wrong way. The first thing should do is to write code in Mathematica style. I suggest reading the following as a first step:

10 Tips for Writing Fast Mathematica Code

If you insist on parallelization, at least first read this post:

Why won't Parallelize speed up my code?

The following is my solution without compiling or parallelization, which is about 50 times faster than yours:

(n = 500;
  time = 60 60; 
  velocity = 3.352;
  gpserror = 4;
  gpsdistance = 0;
  angle = RandomReal[{0, π}, {time + 1, n}];
  rand = RandomVariate[
    NormalDistribution[0, gpserror/(Sqrt[2] InverseErf[19/20])], {time + 1, n}];
  gpserrx = rand Cos[angle];
  gpserry = rand Sin[angle];
  rands = Transpose[{gpserrx, gpserry}, 1 <-> 3];
  diss = Sqrt[(Most@gpserrx - Rest@gpserrx - velocity)^2 + (Most@gpserry - 
         Rest@gpserry)^2] // Total;
  errors = diss - velocity  time;) // AbsoluteTiming
(* {0.164374, Null} *)

Please read it carefully and think about why I'm modifying the code in this manner. If you still feel confused, feel free to continue to ask in the comment.


Update

As pointed out by Henrik in comment below, a better definition for diss is

diss = Total@Sqrt[(Differences[gpserrx] + velocity)^2 + Differences[gpserry]^2];
xzczd
  • 65,995
  • 9
  • 163
  • 468
  • 1
    (+1) diss = Total[Sqrt[(Differences[gpserrx] + velocity)^2 + (Differences[gpserry] + velocity)^2]]; might be a bit more idiomatic and faster. (Tiny bit, though.) – Henrik Schumacher Dec 13 '23 at 12:17
  • @Henrik It should be diss = Total[Sqrt[(Differences[gpserrx] + velocity)^2 + (Differences[gpserry])^2]]; :) . Then, this is suprising, I thought Differences will be slower, but actually for n = 10^4, the RepeatedTiming is 3.07619(Most&Rest) v.s. 2.43373 (Differences) on my machine! – xzczd Dec 13 '23 at 12:57
  • Thanks, I wasn't aware RandomVariate was able to take in Tensors of rank 2, or I would've known how to solve this from the start. I still run out of memory and having to pagefile with this, but was able to get around it by having your code run with n= the highest number my RAM can support, put into a Do loop so that I can get as many runs as I want to get the number of iterations I desire. – Daniel Dec 15 '23 at 03:21