32

The Gillespie SSA is a Monte Carlo stochastic simulation algorithm to find the trajectory of a dynamic system described by a reaction (or interaction) network, e.g. chemical reactions or ecological problems. It was introduced by Dan Gillespie in 1977 (see paper here). It is used in case of small molecular numbers (or species abundance) where numerical integration of the related differential equation system is not appropriate due to hard stochastic effects (i.e. the death of a single individual might make a large impact on the population).

Can you do it with Mathematica? Is it general enough? Is it fast?

István Zachar
  • 47,032
  • 20
  • 143
  • 291

1 Answers1

42

Yes you can. Below is a fairly general, Mathematica-compiled, fast and robust version.

Examples

1. Michaelis-Menten kinetics

Michaelis-Menten kinetics for enzyme-directed substrate conversion. The enzyme (e) converts the susbtrate (s) through an enzyme-substrate complex (c) to the product (p). For comparison, I've included the deterministic ODE system solved by NDSolve.

ClearAll[e, s, c, p, t];
reactions = {e + s -> c, c -> e + s, c -> e + p};
vars = {e, s, c, p};
rates = {1.1, .1, .8};
init = <|e -> 100, s -> 100, c -> 0, p -> 0|>;

det = NDSolveValue[{
    e'[t] == .9 c[t] - 1.1 e[t] s[t], 
    s'[t] == .1 c[t] - 1.1 e[t] s[t], 
    c'[t] == 1.1 e[t] s[t] - .9 c[t],
    p'[t] == .8 c[t],
    e[0] == 100, s[0] == 100, c[0] == 0, p[0] == 0}, vars, {t, 0, 10}];
sto = GillespieSSA[reactions, init, rates, {0, 10}];
op = {PlotStyle -> Thick, PlotTheme -> "Scientific"};
Row@{Plot[Evaluate@Through@det@t, {t, 0, 10}, Evaluate@op,
          PlotLabel -> "deterministic ODE"], Spacer@10, 
     Plot[Evaluate@Through@sto@t, {t, 0, 10}, Evaluate@op,
          PlotLabel -> "stochastic SSA"]}

Mathematica graphics

2. Lotka-Volterra predator-prey dynamics

Lotka-Volterra dynamics. I omit the conversion to continuous differential equations from now on, leaving it to the "educated reader". x -> Null indicates that the species is removed without producing waste material (at least one that is tracked). Similarly, Null -> x indicates a zero-order reaction where x is generated spontaneously (or is entering from the external environment).

ClearAll[y, x];
reactions = {y -> 2 y, y + x -> 2 x, x -> Null};
vars = {y, x};
rates = {1, .005, .6};
init = <|y -> 50, x -> 100|>;

sto = GillespieSSA[reactions, init, rates, {0, 100}]

Mathematica graphics

3. Circadian cycle

ClearAll[a, p, i];
reactions = {a -> 2 a, a -> a + p, p -> i, a + i -> i, i -> Null};
vars = {a, p, i};
init = <|a -> 100, p -> 100, i -> 100|>;
rates = {1, .08, .6, .01, .4};

sto = GillespieSSA[reactions, init, rates, {0, 400}];

Mathematica graphics

4. Oregonator

The Oregonator is a model of the Belousov-Zhabotinsky reactions. Here I used the resolution argument to speed up evaluation (only every 100th step is stored).

ClearAll[a, b, c];
reactions = {b -> a, a + b -> Null, a -> 2 a + c, 2 a -> Null, c -> b};
vars = {a, b, c};
init = <|a -> 1, b -> 2, c -> 3|>
rates = {2, .1, 104, .016, 26};

sto = GillespieSSA[reactions, init, rates, {0, 4, 100}];

Mathematica graphics

Usage

The function accepts the following arguments:

 GillespieSSA[
  {r1, r2, ...}, (* list of elementary reaction steps as Rules *)
  <|y1 -> y1[0], y2 -> y2[0], ...|>, (* variables with initial values at t==0 *)
  {c1, c2, ...}, (* reaction rate constants, for each reaction *)
  <|y1 -> f1, y2 -> f2, ...|>, (* linear in/outflux for each variable; 
                      can be left out, in which case 0 will be used*)
  {mint, maxt, res} (* {start time, end time, step resolution}; 
                      resolution can be omitted, defaulting to 1 *)
 ]

It returns an InterpolatingFunction (with InterpolationOrder -> 0) as each variable's solution function, to comply with the result produced by NDSolve. Initial values are taken to be the values at t == mint. The maximal allowed step size (10^7) is hardcoded in iterations, change it for your needs.

Note, that the Gillespie method is stochastic: it will convert continuous reaction rates into discrete-valued reaction propensities. It cannot accept reactions where stoichiometric factors are not integer numbers. Also, it gives a different realization every time it is run (if the random generator is not reseeded) due to its stochastic nature. Moreover, at small molecular/species amounts, stochastic effects could cause extinction and produce different results as in the deterministic, continuous case (NDSolve).

Code

ClearAll[GillespieSSA];
GillespieSSA[res : {__Rule}, in_Association,
   rateconst_?VectorQ, influx_Association: <||>,
   {mint_?NumberQ, maxt_?NumberQ, dstep_Integer: 1}] := Module[
   {vars, reactant, product, balance, propensities, initialValues,
    fluxRates, symRates, rep, stepList, iterations = 10^7, step,
    compiled, times, rest},

   (* Pre-generating a list is much faster than iteratively calling one-by-one. *)
   stepList = N@RandomVariate[ExponentialDistribution@1, iterations + 1];

   {vars, initialValues} = {Keys@in, Values@in};
   {reactant, product} = 
    Outer[Coefficient[#1, #2] &, #, vars, 1] & /@ 
     Transpose@(List @@@ res);
   balance = product - reactant;
   propensities = 
    Inner[Binomial[#2, #1] &, reactant, vars, Times]*
     PadRight[rateconst, Length@res, 1];
   fluxRates = If[influx === <||>, 0 & /@ vars, vars /. influx];

   Block[{count},
    rep = Thread[vars -> Table[Indexed[count, i], {i, Length@vars}]];
    symRates = propensities /. rep;
    compiled = ReleaseHold[
      Hold@Compile[{
          {init, _Integer, 1}, {flux, _Integer, 1}, {bal, _Integer, 2},
          {dtList, _Real, 1}, {min, _Real}, {max, _Real},
          {iter, _Integer}, {resol, _Integer}},
         Module[{
           count = init, rates, i = 1, c = 1, t = min, dt, ff, f, range,
           r, fReal = 0. & /@ flux, rateSum, data = Internal`Bag[]},
          rates = "SymbolicRates";
          rateSum = N@Total@rates;
          range = Range@Length@rates;
          Internal`StuffBag[data, Internal`Bag[Join[{t}, N@count]]];

          While[Total@count > 0. && rateSum > 0. && t <= max && i <= iter,
           i++;
           dt = dtList[[i]]/rateSum;
           t = t + dt;
           r = RandomChoice[rates -> range];
           (* Fractional part is carried over to minimize undersampling error *)
           ff = (flux*dt) + fReal;
           f = IntegerPart@ff;
           fReal = ff - f;
           (* `count` is maintained as an integer not to loose precision. *)
           count = Max[0, #] & /@ (count + bal[[r]] + f);
           If[Mod[i, resol] == 0, c++;
            Internal`StuffBag[data, Internal`Bag[Join[{t}, N@count]]];];
           rates = "SymbolicRates";
           rateSum = N@Total@rates;
           ];
          If[t < max && i < iter, c++;
           Internal`StuffBag[data, Internal`Bag[Join[{max}, N@count]]]];
          Table[Internal`BagPart[Internal`BagPart[data, j], All], {j, c}]
          ],
         Parallelization -> True , RuntimeAttributes -> Listable, 
         RuntimeOptions -> "Speed", 
         CompilationOptions -> {"InlineExternalDefinitions" -> True, 
           "InlineCompiledFunctions" -> True}
         ] /. "SymbolicRates" -> symRates];

    {times, rest} = {First@#, Rest@#} &@Transpose@compiled[
        Round@initialValues, N@fluxRates, Round@balance,
        N@stepList, N@mint, N@maxt, Round@iterations, dstep];
    Interpolation[Transpose@{times, #}, InterpolationOrder -> 0] & /@ rest
    ]];
István Zachar
  • 47,032
  • 20
  • 143
  • 291
  • 5
    I am very curious about GillespieSSA[] runs faster than a corresponding purely-C-implementation.:) – xyz Jul 01 '16 at 08:02
  • I upvoted, but I think your choice of system is rather boring compared to Lotka-Volterra or the Oregonator. – J. M.'s missing motivation Jul 01 '16 at 11:38
  • @ShutaoTANG Concerning the C code: my C implementation is certainly not the fastest as I'm not an expert C user, but for my simple benchmarks, the compiled Mathematica version is faster by some 10%. WIll investigate the issue. – István Zachar Jul 01 '16 at 11:59
  • Thanks for your reply. In my mind, the C/C++ are all the high performance programming language, while for the Mathematica, although it is elegant, its performance is much lower than low-level language sometimes. – xyz Jul 01 '16 at 13:01
  • @J.M. Please see added examples. There are of course an infinitely many interesting systems, please feel free to add/recommend some! – István Zachar Jul 02 '16 at 08:39
  • 2
    @ShutaoTANG It turned out that the C code, though qualitatively similar, does not exactly give comparable results to the Mathematica code. The latter is still faster on my system, but for now I removed the incriminated line from the intro not to raise suspicion : ) Also, I should mention that the C code was actually generated, compiled and called by Mathematica (which might explain some discrepancies in behaviour), but even with subtracting the overhead it was slower. This will require more time to investigate. – István Zachar Jul 02 '16 at 08:44
  • 1
    For reference, you might want to explain the special role of Null in your examples. In any event: very nicely done! – J. M.'s missing motivation Jul 02 '16 at 12:25
  • @J.M. Thanks for the pointer, I've just included that! – István Zachar Jul 02 '16 at 16:46
  • 3
    Super cool function! Minor observation: due to the way InterpolationOrder->0 works discussed here, the value of the InterpolatingFunction seems displaced by one step. That is, the resulting InterpolatingFunction sto takes on the next value before the transition actually happens. For plotting purposes, ListLinePlot[sto, InterpolationOrder -> 0] avoids this issue. – Chris K Oct 19 '17 at 00:57
  • 1
    @ChrisK Thanks! Actually I knew about the issue, but still decided to enforce zero interpolation, not to misleadingly suggest that the output is a continuous function (either in time and in values). Perhaps a TemporalData (or plain list) output should have been used, but those are less known than InterpolatingFunction. (Plus I really wanted it to be the discrete equivalent of a continuous ODE model solved by NDSolve). But your caveat is very important! – István Zachar Oct 19 '17 at 15:56
  • @IstvánZachar I have used your GilespieSSA function to simulate a chemical reaction in my PhD thesis. How should I reference it to give you credit? – Delphine Aug 21 '18 at 16:16
  • 1
    Thanks for using my code and for the credit @Delphine. Congratulations on your finished thesis! This code is published here as is, and no journal publication exists otherwise. Citing should be straightforward using the link under the share button. There is a meta thread here about citing work from Mathematica.SE, and incidentally (and mistakenly) my name is used as example :) If you insist on citing something official, take a pick from my Google Scholar page that fits your field. – István Zachar Aug 21 '18 at 18:35
  • @ChrisK To get the values expected, you can change "Interpolation[Transpose@{times, #}, InterpolationOrder -> 0]" to "Interpolation[ Transpose@{Join[times, {times[[Length[times]]] + 10^-4}], Join[{#[[1]]}, #]}, InterpolationOrder -> 0]" (10^-4 can be smaller, it just needs to be tiny compared to the time scale). Alternatively, the same section could be changed to "StepFunction[Transpose@{times, #},Right]" using the StepFunction as defined by CarlWoll here but it returned some unexpected diagonal steps that I couldn't understand. – Delphine Aug 28 '18 at 16:39