Is this the most effective and efficient programming style for prototyping (a ray tracer) in Mathematica?

Question

I am trying out Mathematica as a prototyping tool. As an initial exercise I have put together a brute force ray tracer, with a view to using Mathematica's built in probability distribution functions to test out some fancy shaders. (My starting point was the F# ray tracer available http://code.msdn.microsoft.com/ParExtSamples.)

As this is a first attempt at using Mathematica, other than as a fancy calculator, I would welcome some guidance or critique on whether this coding style is going to be effective. (You will detect from the code that I am accustomed to an object-oriented paradigm.)

My own observations, in no particular order, are:

It's an enjoyable way to work, as a lot of the busy type definitions and additional syntax required by other languages is not needed here; the code is quite compact and the intention fairly readable (IMHO).
I have built the ray tracer from scratch, without using the built-in graphics primitives as that would defeat the learning purpose of the exercise and I would probably not be able to roll my own shaders. As a simple example, addition is not defined for RGBColor, for example.
Probably as a consequence, the tracer is very slow, even if I use all the 4 kernels that are available to me. I dare say that the existing code could be speeded up considerably by removing all the type / pattern matching, but the code would be much harder to follow, I think, and certainly much harder to debug. My approach seems to mean that I cannot take advantage of Compile optimization, so far as I can see. If I wanted a fast ray tracer I would write it in C++, but I wonder whether there are any easy optimizations that I have missed (and that don't involve mangling what I have to the extent that the prototyping benefits of Mathematica -- easy refactoring -- would be lost). However, I am surprised that even with 4 kernels, the cpu of this 4-core (8-core, if you count hyperthreading) never maxes out. For example, is it worth changing some of the := to =? Things would be even slower if I aliased the results by averaging 4 adjacent traces, for example. Real time ray tracing seems out of reach.
Ray tracing results in a lot of "corner cases" that are dealt with naturally with IEEE maths. Unfortunately Mathematica does produce +/-Infinity for +/-1/0, for example, so some of the code should really be extended to treat those cases properly.
It would be great if Mathematica had more built-in vector algebra so that I could write the equations defining the objects and the rays involved and getting Mathematica to calculate the ray intersection points that are at the heart of the ray tracer. As things stand, Reduce and Solve have not helped me to find better intersection algorithms, producing either nothing at all or something large an unintelligible, depending on how I posed the problem.

Anyway, here is what you get after 110s from raytrace[400, 400, basescene, 6]:

Here is what you get after 110s from <code>raytrace[400, 400, basescene, 6]</code>

... and here is the code:

    (* Colour helpers *)
    black = {0., 0., 0.};
    darkgrey = {.25, .25, .25};
    grey = {.5, .5, .5};
    white = {1., 1., 1.};
    background = black;
    defaultcolor = black;

    brightness[{r_, g_, b_}] = Mean[{r, g, b}]; 

    scale[k_, c: {r_, g_, b_}] = k * c;

    zero = {0.,0.,0.};

    (* Mainly for reference; pattern matching normally used instead *)

    ray /: start[ray[s_, d_]] = s;
    ray /: dir[ray[s_, d_]] = d;

    camera /: pos[camera[p_, l_]] = p;
    camera /: lookat[camera[p_, l_]] = l;
    camera /: forward[camera[p_, l_]] = Normalize[l - p];
    camera /: down[camera[p_, l_]] = {0., -1., 0.};
    camera /: right[c : camera[p_, l_]] := 1.5 * Normalize[Cross[forward[c], down[c]]];
    camera /: up[c : camera[p_, l_]] := 1.5 * Normalize[Cross[forward[c], right[c]]];


    light /: pos[light[p_, c_]] = p;
    light /: color[light[p_, c_]] = c;

    scene /: things[scene[t_, l_, c_]] = t;
    scene /: lights[scene[t_, l_, c_]] = l;
    scene /: camera[scene[t_, l_, c_]] = c;

    surface /: diffuse[surface[d_, s_, re_, ro_]] = d;
    surface /: specular[surface[d_, s_, re_, ro_]] = s;
    surface /: reflect[surface[d_, s_, re_, ro_]] = re;
    surface /: roughness[surface[d_, s_, re_, ro_]] = ro;

    intersection /: thing[intersection[t_, r_, d_]] = t;
    intersection /: ray[intersection[t_, r_, d_]] = r;
    intersection /: dist[intersection[t_, r_, d_]] = d;
    miss = intersection[nothing, ray[zero, zero], Infinity];

    sceneobject /: surface[sceneobject[s_, i_, n_]] = s;
    sceneobject /: intersect[sceneobject[s_, i_, n_]] = i;
    sceneobject /: normal[sceneobject[s_, i_, n_]] = n;

    sphere /: center[sphere[c_, r_, s_]] = c;
    sphere /: radius[sphere[c_, r_, s_]] = r;
    sphere /: surface[sphere[c_, r_, s_]] = s;
    normal[sphere[center_, _, _], pos_] = Normalize[pos - center];

    plane /: normal[plane[n_, o_, s_]] = n;
    plane /: offset[plane[n_, o_, s_]] = o;
    plane /: surface[plane[n_, o_, s_]] = s;
    normal[plane[n_, _, _], _] = n;

    (* Axis-aligned bounding box *)
    (* TODO: not yet used; integrate into tracer *)
    box /: lowerb[box[l_, u_]] := l;
    box /: upperb[box[l_, u_]] := u;

    extendby[box[l_, u_], pt_] :=
        box[MapThread[Min, {l, pt}], MapThread[Max, {u, pt}]];
    size[box[l_, u_]] :=
        u - l;
    majoraxis[b : box[l_, u_]] :=
        Ordering[size[b], -1]; 


    (* TODO: This does not work for cases where dir has 0 compnent as Mathematic returns ComplexInfinity,
       not +/-Infinity for +/-1/0 *)
    intersectboxQ[b : box[l_, u_], r : ray[start_, dir_]] :=
        Module[ {tl = (l - start) / dir, tu = (u - start) / dir, tmin, tmax},

            (* Swap u[i] and l[i] if dir[i] < 0 to avoid 
               erroneous result because 0 == -0 *)
            tmin = Max[MapThread[Min, {tu, tl}]];
            tmax = Min[MapThread[Max, {tu, tl}]];
            Not[tmax < 0 && tmin > tmax];
        (* Use Not to cover some Infinity comparisons *)

        (*  Which[
            tmax < 0, False, (* Intersection at t = tmax, but it's behind us *)
            tmin > tmax, False, (* No intersection *)
            True, True (* Interesection at t = tmin *)
            ]*)
        ];



    intersect[s : sphere[center_, radius_, _], r : ray[start_, dir_], i : intersection[_, _, currentdist_]] :=
        Module[ {eo = center-start, v, dist, disc},
            v = eo.dir;
            dist = If[ v < 0.,
                       0.,
                       disc = radius * radius - (eo.eo - v * v);
                       If[ disc < 0.,
                           0.,
                           v - Sqrt[disc]
                       ]
                   ];
            If[ dist == 0. || dist > currentdist,
                i,
                intersection[s, r, dist]
            ]
        ];


    intersect[p : plane[norm_, offset_, _], r : ray[start_, dir_], i : intersection[_, _, currentdist_]] :=
        Module[ {denom =  norm . dir, candidatedist},
            If[ denom >= 0.,
                i,
                candidatedist = (norm . start + offset) / (-denom);
                If[ candidatedist > currentdist,
                    i,
                    intersection[p, r, candidatedist]
                ]
            ]
        ];


    testray[ray_, scene_] :=
        dist[Fold[intersect[#2, ray, #1]&, miss, things[scene]]];


    traceray[ray_, scene_, depth_, maxdepth_] :=
        shade[Fold[intersect[#2, ray, #1]&, miss, things[scene]], scene, depth, maxdepth];


    shade[miss, _, _, _] :=
        background;     
    shade[intersection[thing_, ray[start_, dir_], dist_], scene_, depth_, maxdepth_] :=
        Module[ {pos = dist * dir + start, n, reflectdir, naturalcolor, reflectedcolor},
            n = normal[thing, pos];
            reflectdir = dir - 2. * n . dir * n;
            naturalcolor = defaultcolor + getnaturalcolor[thing, pos, n, reflectdir, scene];
            reflectedcolor = If[ depth >= maxdepth,
                                 grey,
                                 getreflectioncolor[thing, pos + (0.001*reflectdir), n, 
                                     reflectdir, scene, depth, maxdepth]
                             ];
            naturalcolor + reflectedcolor
        ];

    getreflectioncolor[thing_, pos_, n_, rd_, scene_, depth_, maxdepth_] :=
        reflect[surface[thing]][pos] *
        traceray[ray[pos, rd], scene, depth + 1, maxdepth];

    getnaturalcolor[thing_, pos_, n_, rd_, scene_] :=
        Module[ {addlight, normraydir = Normalize[rd], howrough = roughness[surface[thing]]},
            SetAttributes[addlight, Listable];
            addlight[light[p_, c_]] :=
                Module[ {ldis = p - pos, livec, neatisect, isinshadow, illum, lcolor, spec, scolor},
                    livec = Normalize[ldis];
                    neatisect = testray[ray[pos, livec], scene];
                    isinshadow = neatisect <= Norm[ldis];
                    If[ isinshadow,
                        defaultcolor,
                        illum = livec . n;
                        lcolor = If[ illum > 0.,
                                     illum * c,
                                     defaultcolor
                                 ];
                        spec = livec . normraydir;
                        scolor = If[ spec > 0.,
                                     (spec ^ howrough) * c,
                                     defaultcolor
                                 ];
                        diffuse[surface[thing]][pos] * lcolor + specular[surface[thing]][pos] * scolor
                    ]
                ];
            defaultcolor + Total[addlight[lights[scene]]]
        ];



     raytrace[screenwidth_ : 64, screenheight_ : 64, scene_ : basescene, maxdepth_ : 1] :=
         Module[ {getpoint},
             getpoint[x_, y_, camera_] :=
                 With[ {recenterx = (x - screenwidth / 2.) / (2. * screenwidth), 
                 recentery = -(y - screenheight / 2.) / (2. * screenheight)},
                     Normalize[forward[camera] + recenterx * right[camera] + recentery * up[camera]]
                 ];
             Image[ParallelArray[traceray[ray[pos[camera[scene]], getpoint[#2-1, #1-1, camera[scene]]], 
                 scene, 0, maxdepth]&, {screenheight, screenwidth}]] // AbsoluteTiming
         ];


     (* Harness *)

    (* surface diffuse, specular reflect, roughness *)
    uniformsurface[diffuse_, specular_, reflect_, roughness_] = surface[diffuse&, specular&, reflect&, roughness];

    shiny = uniformsurface[white, grey, .7, 250.];
    matteshiny = uniformsurface[white, darkgrey, .7, 250.];

    checkerboard =
        surface[
        If[ OddQ[Floor[#[[3]]] + Floor[#[[1]]]],
            white,
            black
        ]&, 
        white &, 
        If[ OddQ[Floor[#[[3]]] + Floor[#[[1]]]],
            .1,
            .7
        ]&, 
        150.];



     basescene = scene[{  
         sphere[{0., 1., -.25}, 1., shiny], 
              sphere[{-.5, 1.3, 1.5}, 0.5, matteshiny],
         plane[{0., 1., 0.}, 0., checkerboard]},
          {light[{-2., 2.5, 0.}, {.5,.45,.41}], light[{2.,4.5,2.},{.99,.95,.8}]},
          camera[{2.75, 2.0, 3.75}, {-.6, .5, 0.}]];

To illustrate where I am going with this, here is an example of how one could generate a mesh for a more complex object (a polysphere):

    polyspherepoints[rad_Real, divs_Integer] :=
        With[ {u = -(Pi/2.), v = -Pi, 
          du = Pi/divs, dv = (2.*Pi)/divs},
            rad * 
             Flatten[Table[{Cos[du*i + u]*Cos[dv*j + v], Sin[du*i + u], 
                Cos[du*i + u]*Sin[dv*j + v]}, {j, 0, divs}, {i, 0, 
            divs}], 1]
        ];

    (* Put the polygon vertices in the right order *)
    ordervertices[{{a_, b_}, {c_, d_}}] :=
        {a, b, d, c};

    orderverticestotriangeles[{{a_, b_}, {c_, d_}}] :=
    {{a, b, d}, {a, c, d}}

    (* Generate a list of (polyspherepoint) vertice numbers, 
       partition them cyclically, and then into quads, and associate them
       with Polygons *)

    polyspheremeshtriangles[rad_Real, divs_Integer] :=
        Normal @ GraphicsComplex[polyspherepoints[rad, divs], 
          Map[Polygon, 
           Map[orderverticestotriangeles,
             Partition[Partition[Range[(divs+1)^2], divs+1], {2, 2}, 1, 1], {2}], 1]]; 

    polyspheremeshtriangles[rad_Real, divs_Integer] :=
        Normal @ GraphicsComplex[polyspherepoints[rad, divs], 
          Map[Polygon, 
           Map[orderverticestotriangeles,
             Partition[Partition[Range[(divs+1)^2], divs+1], {2, 2}, 1, 1], {2}], 1]];

(It would have been satisfying to use some of the geometric transform functions built into Mathematica to generate the vertices, but life was too short.)

And here is what Graphics3D @ polyspheremeshtriangles[1., 8] generates:

PolySphere, of trianges

If you use parallelization and not all CPU cores are used at 100% that usually indicates that either there's too much communication overhead, or something "goes wrong" when the results are sent back to the main kernel. I tried to compile a set of reasons why parallelization may not be as efficient as it should be. — Szabolcs, May 27 '14 at 22:26
Can you clarify what sort of verctor algebra you're looking for? Can you give an example of a computation you're trying to simplify? — Szabolcs, May 27 '14 at 22:27
Yes, unfortunately there tends to be a tradeoff between readability (or ease of programming) and performance in Mathematica. One of my pet peeves is that vectorized code is difficult to write and read (without creating a framework to simplify things), even though vectorization can yield amazing performance improvements. — Szabolcs, May 27 '14 at 22:30
Parallelism: Mathematica is communicating by TCP/IP rather than shared memory is one source of inefficienvy, but a more likely issue is that the kernel is not multi-threaded and so only 4 threads run at a time, rather than the full 8 that could, on a hyperthreading cpu. PS: I would love to be able to run the kernels on my "big" Windows pc while editing the notebooks on my macbook air, but I can't find a way of achieving that without having 2 Mathematica licenses. — jrp, May 27 '14 at 22:34
Using MMA for ray tracing is sort of like putting a trailer-hitch on a Ferrari, but +1 for interesting code & result. BTW - if on a CUDA capable machine, I'd venture that's a route to higher performance. — ciao, May 27 '14 at 22:35
Mathematica does use shared memory to communicate with subkernels (parallel kernels) if you're running on a single computer (unless you explicitly instruct it to use TCP/IP, which is not trivial to do, so I don't think you did that). If you are using version 9 or earlier, it will launch 8 subkernels by default on a 4-core machine that supports hyperthreading. You can control how many kernels will be launched manually: LaunchKernels[n] will launch an additional n kernels. — Szabolcs, May 27 '14 at 22:37
Vector Algebra: if you have a ray (r(t) = origin + t * direction), find whether and, if so where, it intersects an axis-aligned )bounding) box defined by two of its diagonal opposing vertices. (There is some code made up by me that attempts to do that.) — jrp, May 27 '14 at 22:39
Multi-core: my 9.0.1 (Home) license seems to be limited to 4 kernels, I think. I'm using Wolfram Workbench and see some traces that suggest that TCP/IP is being used for communicating with the kernels. I have not got around to figuring out how to change that. — jrp, May 27 '14 at 22:42
@rasher Since you brought up the car analogy, I can't resist noting that MMA looks like a Porsche, is priced like a Ferrari but actually performs more like a mid-range VW and crashes like a Chevy :D — rm -rf, May 27 '14 at 22:44
I don't think that it's like a Ferrari and Trailer. The main issue is, I suspect, that there is a lot of overhead in having to do all that run-time binding / pattern matching. I know that I could do things in CUDA but as I explained, if I was after performance I would write in C++ and SIMD with CUDA, but it would not allow me to prototype things so quickly. — jrp, May 27 '14 at 22:48
@jrp You can list the existing links (MathLink connections) using Links[]. If you don't see any strings that look like portNumber@machineName or portNumber@machineIP then all links are shared memory links (screenshot). — Szabolcs, May 27 '14 at 22:49
Thanks. Yes, it looks like I am using shared memory after all. — jrp, May 27 '14 at 22:53
Your code is a bit overwhelming and Mathematica performance is notoriously difficult to predict, but generally using a lot of pattern matching is detrimental to performance (there are surprising exceptions to that). The Workbench has a useful profiler. Have you tried using it? Here's some discussion of what's good and bad for performance, generally. — Szabolcs, May 27 '14 at 22:53
Perhaps you're interested in the Pure language. It is based on term-rewriting just like Mathematica, but its pattern matching seems to be (considerably?) more limited. It uses LLVM and is supposed to have decent performance. I've played with it only very briefly so I might be wrong about some things I said above. — Szabolcs, May 27 '14 at 22:57
Thanks. I looked at the Profiler, but could not see anything untoward. (That said, I found the results a bit hard to interpret.) I'll have a look at Pure: although it may be good for prototyping, it won't give me access to the fancy probability distribution functions that I am hoping to get out of Mathematica. If I was going that route, I'd probably stick to F#, although it isn't cross-platform. — jrp, May 27 '14 at 23:11
You can try [Experimental`NumericalFunction](http://mathematica.stackexchange.com/a/44890/280) for speeding up your code. — Alexey Popkov, May 31 '14 at 16:53
Unrelated tip: you are using = in places where you should use := to avoid possible problems if you aren't on a clean kernel. But it won't change the speed — Rojo, Jun 06 '14 at 02:07
Thanks. I have always struggled with finding a good explanation of delayed versus immediate assignment, particularly when I have no global variables. Pointers welcome. — jrp, Jun 07 '14 at 10:27
Thanks for the Experimental'NumericalFunction pointer. Other potentially relevant, but undocumented, packages are http://mathematica.stackexchange.com/questions/41496/graphicsmeshfindintersections-fails-to-detect-intersections and http://mathematica.stackexchange.com/questions/39184/any-ideas-on-how-to-use-the-region-context Does anyone know whether any of this functionality is likely to make documented status? — jrp, Jun 07 '14 at 10:34
@jrp could we make an identical cross-post on Wolfram Community? The content is simply great by itself and is well worth publishing as an "idea". We can see what feedback we can get there. Post editor is the same - so simple copy paste will work. — Vitaliy Kaurov, Jun 17 '15 at 23:42
The ray tracer code was removed from the Windows site. Can you put a copy in Gthub. I am very interested to review this code and how was applied to WM. In fact, I have been looking for Source and Phase functions from Radiative Transfer Equations . This looks a great options for applications looking to model the Phase function term. — Jose Enrique Calderon, Jun 27 '17 at 19:21

score 1 · Answer 1 · answered Jul 05 '17 at 02:34

1

It's dangerous if you don't know what the argument is, but replacing sin, cos, sqrt with their Taylor series approximations when the argument is small may help (or writing your own Compile Taylor series versions). Another thing which comes to mind but may not be helpful in your case is Fast Inverse Square Root

answered Jul 05 '17 at 02:34

Shane Di Dona

111
3

1

You could probably test the Taylor series idea before posting. And have you seen Internal`ReciprocalSqrt? – Michael E2 Jul 05 '17 at 04:33

Is this the most effective and efficient programming style for prototyping (a ray tracer) in Mathematica?

1 Answers1

Linked