
My Mathematica front end occasionally freezes for reasons I cannot reproduce. This seems mostly, although not exclusively, associated with long (> 1 hour) computations. If I click on the Mathematica icon in the OS X dock, or alt-tab to it, none of the notebooks become viewable. The OS X Activity Monitor correctly identifies the front end process "Mathematica" as "Not responding".

In the past if this happened I was forced to kill the front end process, which wiped out any unsaved notebooks. However, I have recently figured out that I can usually preserve the front end, and the notebook data, by just killing the WolframKernel tasks instead. I generally have multiple kernels running, which are indistinguishable within the activity monitor, and I basically have to kill them randomly one at a time until the front end starts responding again. I still lose the kernel, but I no longer lose the front end and the notebook data.
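One possible way to make the kernels distinguishable in Activity Monitor is to record each kernel's process ID ahead of time and match it against the PID column. The following is a minimal sketch under that assumption, not something tested against this particular freeze; it relies only on the built-in `$ProcessID`, and the message text is arbitrary.

```mathematica
(* Evaluate once in each notebook before starting a long computation. *)
(* $ProcessID is the operating-system process ID of the kernel evaluating this cell. *)
Print["This notebook's kernel has PID ", $ProcessID];
```

With the PID noted in each notebook, the WolframKernel entry to kill can be chosen deliberately rather than at random.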

What's going on? Isn't the point of having a separate front end process that it can survive crashes and freezes of the kernels?

I run version 10.3.00 on OSX 10.11.2.

Update: I'm still having issues with this. Recently, the kernel computations stopped in the sense that the CPU usage went almost to zero. However, the grey circle "running" indicator in the dock still shows, and the front end remains unresponsive. Killing the problem kernel resurrected the front end and, interestingly, caused the other kernels to resume computation (i.e., their CPU usage went back to 100%). The system memory utilization was low throughout.

Update 2: I can now almost always identify the problem kernel as the one that has low CPU usage (~10%) while the other WolframKernel processes have negligible usage (0%-2%). Interestingly, it has recently happened twice that the problem kernel was not running a computation, but just idling (i.e., it was associated with a notebook that was not doing anything). Once it was force quit, the other kernels resumed (~97% CPU). Also, I have upgraded to 10.4.1.0 with no noticeable change.

Update 3: I have now had this issue with dynamic objects disabled, so that doesn't appear to be the root cause. Interestingly, I have twice now "caught" it in the middle of the freezing process, in this sense: The front end suddenly stops responding (spinning beach ball), but at least some of the background computations remain at max CPU. This continues for a while but, one by one, the kernels fall to idle, abruptly dropping from max to min CPU over a short time (1-5 seconds) and then staying there. All of the kernels seem to come to idle within a couple minutes of the front end freezing. Then, if I kill the problem kernel, the remaining kernels resume their computations as before.

Jess Riedel
  • Related: http://mathematica.stackexchange.com/questions/74073/how-to-kill-the-kernel-process-from-the-command-line-without-killing-the-fronten?rq=1 and http://stackoverflow.com/questions/8717211/understanding-kernel-frontend-communication-why-does-my-front-end-freeze (I am not using remote kernels.) – Jess Riedel May 24 '16 at 19:18
  • Are you running out of memory? – bbgodfrey May 25 '16 at 04:33
  • I don't think so? I've never seen the "memory pressure" graph get above 70% or so, but my impression is that this is complicated by how the OS uses the swap space and so on. The WolframKernel processes I kill never have more than ~400 MB of uncompressed ram out of 16GB on my machine (as opposed to a hog like chrome at 2.9 GB) but I'll keep an eye out next time this happens. – Jess Riedel May 25 '16 at 11:29
  • I have had the same experience. I cannot provide an answer from knowledge, but I assume that the Front End is waiting for the Kernel to do something, and that killing the Kernel causes the Front End to break the connection and stop waiting. – Mr.Wizard Jun 01 '16 at 23:04
  • Do you use Dynamic elements in your Notebook? – Alexey Popkov Jun 02 '16 at 06:05
  • Alexey, yes. Now that you mention it, that is a likely suspect. I have a handful of Dynamic objects that basically show the status of the computation, e.g., a progress bar. It would be very inconvenient to completely remove them, though, because I wouldn't be able to distinguish a computation that would finish in 2 hours from one that would take a month and needs to be killed. Are there any safer methods I could use to display the status of the computation without allowing the kernel to break the front end? Some sort of raw message passing with MathLink? – Jess Riedel Jun 02 '16 at 10:50
  • In the old days we simply used Print for monitoring the progress of the computation. Currently I'm doing all my actual work with version 8.0.4 due to its stability and higher speed compared with the latest version 10.4.1. In my Notebook I only ever have one Dynamic object at a time, and this object is as simple as Dynamic[Grid[{<header>, <list of variables>}]]. With this setup I have NEVER experienced the problem you have described. But with complicated Dynamic, and especially with the latest version, I do experience it. – Alexey Popkov Jun 02 '16 at 13:33
  • Another possible source of such FrontEnd freezing is that your code contains something like Rasterize, Export, etc.: any function which requires the FrontEnd. Such functions usually freeze the FrontEnd during evaluation, and (perhaps unexpectedly, but quite logically) killing the Kernel unfreezes the FrontEnd in such situations. – Alexey Popkov Jun 02 '16 at 13:47
  • It may be worth mentioning that I always use PrintTemporary for creating Dynamic objects: it auto-deletes the Dynamic element when evaluation of the current Cell is finished. – Alexey Popkov Jun 02 '16 at 13:57
  • Dynamic stuff (which is sometimes created inadvertently; even a simple Histogram has it!) causes the front end to periodically request the kernel to interrupt what it is doing for a while and attend to another evaluation. Things can indeed go wrong with this and the front end can freeze. Sometimes I cannot identify the reason for the hang at all, but sometimes it's because the kernel is doing something that cannot be interrupted at all. One example is LibraryLink stuff you write yourself (and do not explicitly make interruptible). – Szabolcs Jun 02 '16 at 14:18
  • Do update to 10.3.1 at least. It should be free if you have 10.3.0. – Szabolcs Jun 02 '16 at 14:19
  • All good suggestions. Unfortunately it's not feasible to test them all one-by-one since it's unreproducible, but I will implement them and then try to report back if it continues, or in a few weeks if the problem goes away. – Jess Riedel Jun 02 '16 at 14:42
  • The front end has its own service kernel and I wonder if these events are crashes/freezes in that kernel rather than the front end process itself. – Simon Woods Jun 08 '16 at 18:15
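
The monitoring approaches mentioned in the comments above (plain Print, and a single simple Dynamic wrapped in PrintTemporary) might look roughly like the following. This is a minimal sketch, not code from any commenter: the loop, the variable names `n` and `i`, and the `Pause` call standing in for real work are all assumptions made for illustration.

```mathematica
(* Hypothetical stand-in for a long computation: n steps, each a short Pause. *)
n = 50;

(* Option 1: plain Print, with no Dynamic involved. Report every 10th step. *)
Do[
  Pause[0.05];  (* placeholder for real work *)
  If[Mod[k, 10] == 0, Print["finished step ", k, " of ", n]],
  {k, n}
];

(* Option 2: one simple Dynamic inside PrintTemporary. *)
(* The temporary cell is removed when evaluation of the input cell finishes. *)
i = 0;
PrintTemporary[ProgressIndicator[Dynamic[i/n]]];
Do[
  Pause[0.05];  (* placeholder for real work *)
  i = k,
  {k, n}
];
```

Neither option is known to prevent the freeze described in the question. They only reduce or confine the Dynamic content that the front end has to keep updating.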

0 Answers