There are Mathematica packages that must create temporary files to function. If we are implementing such a package ourselves, how can we ensure that the temporary files will get cleaned up when the kernel exits?


Is this really a practical problem? Yes, both MATLink and MaTeX need to do it. Neither is able to do a full cleanup at the moment. Standard packages do it too, e.g. CCompilerDriver`.

Is it really possible to do it? Yes. Compile creates shared libraries when compiling to C code. These do get cleaned up on exit. How is this implemented?

In[1]:= cf = Compile[{{x}}, 2 x, CompilationTarget -> "C"];

In[2]:= FileNames[
 FileNameJoin[{$UserBaseDirectory, "ApplicationData", 
   "CCompilerDriver", "BuildFolder", "*", "*"}]]

Out[2]= {"/Users/szhorvat/Library/Mathematica/ApplicationData/\
CCompilerDriver/BuildFolder/hawkeye-8727/compiledFunction0.dylib"}

In[3]:= Quit

In[1]:= FileNames[
 FileNameJoin[{$UserBaseDirectory, "ApplicationData", 
   "CCompilerDriver", "BuildFolder", "*", "*"}]]

Out[1]= {}

We can use $Epilog for this! By default, $Epilog is defined to evaluate << end`. This file does not exist by default (it is not found by FindFile). One possible solution is to create this file and add the cleanup code to it when the package loads. But it is not clear how to do this in a robust way, without conflicting with other packages that might also modify the same file.
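A variation that avoids touching the end` file is to chain the cleanup onto whatever definition $Epilog already has. The following is only a minimal sketch of that idea: the names $myTempFiles, registerTempFile and cleanupTempFiles are hypothetical, and it assumes that the current definition can be read through OwnValues. It does not protect against another package resetting $Epilog later, nor against ways of quitting that never evaluate $Epilog.

```mathematica
(* Hypothetical package-level registry of temporary files *)
$myTempFiles = {};

(* Record a file for later deletion and return its path *)
registerTempFile[file_String] := (AppendTo[$myTempFiles, file]; file)

(* Delete every registered file that still exists, then reset the registry *)
cleanupTempFiles[] := (
  Quiet[DeleteFile /@ Select[$myTempFiles, FileExistsQ]];
  $myTempFiles = {};
)

(* Capture the current definition of $Epilog unevaluated, then chain onto it.
   If $Epilog has no definition, fall back to doing nothing extra. *)
With[{old = Replace[OwnValues[$Epilog],
     {{_ :> rhs_} :> Hold[rhs], _ -> Hold[Null]}]},
  $Epilog := (cleanupTempFiles[]; ReleaseHold[old])
]
```

The package would call registerTempFile on each file it creates; cleanupTempFiles can also be called manually at any time.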

The cleanup mechanism for shared libraries created by Compile does not seem to use this. Thus I am still hoping to find a better and more robust solution. What does Compile use?


One way to try to find out what happens when exiting the kernel is to evaluate On[], followed by Quit[]. This is best done in a terminal session, which does not need to deal with front-end interaction. This trick does not reveal any evaluations other than the one triggered by $Epilog. The default value of $Epilog in Mathematica 10 is If[FindFile["end`"] =!= $Failed, Get["end`"]].
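For reference, the inspection described above could look roughly like this in a terminal kernel. Using OwnValues to read the definition is an assumption of this sketch; it is meant to show the stored definition without evaluating it.

```mathematica
(* Inspect the definition of $Epilog without evaluating it *)
OwnValues[$Epilog]

(* Switch on tracing for all symbols, then quit and watch what is evaluated *)
On[];
Quit[]
```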

Szabolcs
  • Silly question: what about cleaning at start? Less silly question: isn't there any way to deal with this on OS level, like attaching a 'listener' to the process that will do this when the process is terminated? – Kuba Jan 19 '16 at 10:46
  • @Kuba To do it at the OS level is a good idea! It can be a separate process communicating through MathLink, or even simpler: a LibraryLink library which does the cleanup on "uninitialization". I would still like to know how the Compile stuff achieves it though. – Szabolcs Jan 19 '16 at 10:55
  • @Kuba I realize I checked the value of $Epilog incorrectly. It does have a value, it will do <<end` if such a file exists. It is not what cleans it up though. What I tried now was to evaluate On[] before exit (in a terminal, no front end!) to see what gets evaluated. I'm trying to sort through that ... – Szabolcs Jan 19 '16 at 11:12
  • I am interested in an answer to this, and in particular your solution by including cleanup code in WolframLibrary_uninitialize. Could you post a sample code that cleans up a few known files? – QuantumDot Jul 08 '17 at 05:09
  • @QuantumDot I'm still travelling for a week, can you remind me after that please? – Szabolcs Jul 12 '17 at 05:20
  • Hi @Szabolcs Would you kindly let me know what you did to clean up temporary files upon kernel exit? – QuantumDot Jul 26 '17 at 17:03
  • @QuantumDot I posted an example. – Szabolcs Jul 27 '17 at 07:47
  • @Szabolcs Your example is gone. Where did it go? – QuantumDot Aug 01 '17 at 15:06
  • @QuantumDot I am sorry, I did not see this message. I deleted it because of the concerns that John Fultz brought up in his comments and because I discovered that the method is not robust: whether it works when quitting the kernel through the Evaluation menu depends on uncontrollable things such as timing. I thought that you had enough reputation to still read it, even if it is deleted. If you don't, here's a pastebin, but do read about the caveats in the comments. – Szabolcs Aug 08 '17 at 19:10

1 Answer

I have found $Epilog utterly unreliable for this. Not all ways of quitting Mathematica evaluate $Epilog. And of course, there is the problem that a different package may (re)set it.

I have no idea how Compile implements this, but I have two suggestions:

  1. Place all your temporary files in a special folder that you empty on load. Be careful how you handle parallel kernels that may load multiple versions of your package at the same or interleaving times.

  2. Write a small MathLink program that you Install on startup and that has a function AddFileForDeletion, which adds the path of a file to an internal list (in the C/C++ world). The program then iterates over this list and deletes the files after losing the MathLink connection.

Suggestion 2 is the most reliable way I've found to do temporary file cleanup.
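Suggestion 1 could be sketched along the following lines. This is only an illustration under stated assumptions: the folder name "MyPackage" and the symbols $baseDir, $sessionDir, $maxAgeSeconds and staleQ are made up for the example. Each kernel gets its own subfolder named after $ProcessID, so that parallel kernels do not empty each other's folders, and leftovers are recognized purely by age, which is a heuristic (a kernel that runs longer than the threshold without touching its folder could lose its files).

```mathematica
(* Illustrative names: "MyPackage", $baseDir, $sessionDir, $maxAgeSeconds are assumptions *)
$baseDir = FileNameJoin[{$TemporaryDirectory, "MyPackage"}];
$sessionDir = FileNameJoin[{$baseDir, "kernel-" <> ToString[$ProcessID]}];
$maxAgeSeconds = 24*60*60;

(* A directory is stale if it is not ours and was last modified long ago *)
staleQ[dir_String] :=
  dir =!= $sessionDir &&
   AbsoluteTime[] - AbsoluteTime[FileDate[dir]] > $maxAgeSeconds

(* On load: remove leftovers of earlier sessions, then create our own folder *)
Scan[
  Quiet[DeleteDirectory[#, DeleteContents -> True]] &,
  Select[FileNames["kernel-*", $baseDir], DirectoryQ[#] && staleQ[#] &]
];
If[! DirectoryQ[$sessionDir], CreateDirectory[$sessionDir]];
```

All temporary files would then be written into $sessionDir, so that a later session can sweep them up even if this kernel is killed without running any exit code.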

Malte Lenz
  • Thanks! I was also thinking of doing something like 2., but instead of a MathLink program it is also possible to use a LibraryLink library (and put the cleanup code in WolframLibrary_uninitialize). Would it at all help if I sent an email to support and suggested to add a reliable builtin cleanup mechanism? – Szabolcs Apr 21 '16 at 07:40
  • You are probably right about LibraryLink. The reason I jumped to MathLink is that my application already happened to have a MathLink program attached, so it was a comparatively small task to add a new API function for the book-keeping of files. Customer requests are taken into account when prioritizing features, so it sure couldn't hurt :) – Malte Lenz Apr 21 '16 at 07:42
  • Yes, it's pretty much the same thing. The problem with both is that if the package doesn't already have a MathLink or LibraryLink component then it is just too much trouble to add one and compile for all platforms ... With LibraryLink it is easier to automate compilation, but a working C compiler is still needed and won't typically be present on Windows. – Szabolcs Apr 21 '16 at 07:45
  • I certainly agree, it's ridiculous to have to go to such lengths for something as simple as this. – Malte Lenz Apr 21 '16 at 08:45
  • I think it's unlikely the LibraryLink proposal would work. The FE kills the kernel by sending MLTerminateMessage, which triggers a signal handler, which promptly exits. The advantage of this is that it always works promptly, no matter how hard the kernel might be hung. The disadvantage is that signal handlers are very limiting. You can't even call malloc or free in one, and definitely file system calls are right out. I would be surprised if the signal handler calls LibraryLink uninitialize functions. That having been said, I'm not an expert in the kernel functionality here. – John Fultz Jul 09 '17 at 05:28
  • That having been said, it's not a ridiculous suggestion that the FE ought to try to get the kernel to quit on its own, and only terminate it if it fails to do so after a brief timeout. That would enable end.m, WolframLibrary_uninitialize, etc., to run much more frequently...but there would still be no guarantees. – John Fultz Jul 09 '17 at 05:39
  • @JohnFultz I have only seen your comment now. I tried this, and at least with M11.1.1 on OS X, the library uninitialize function does get called even when using the GUI to quit (instead of calling Quit[]). However, it is no longer possible to call back to the kernel from the uninitialize function, which makes the method almost useless. – Szabolcs Jul 27 '17 at 07:35
  • @JohnFultz I posted an answer below to show this. Do you think that this works only accidentally (e.g. may succeed or fail depending on some race condition), or is it robust? – Szabolcs Jul 27 '17 at 08:08
  • @Szabolcs I've not looked at the implementation for LibraryLink on this. I'm surprised by your results (but not surprised that you're running into limitations), though, and I'm very confident about my statements regarding signal handling and MLTerminateMessage. In fact, I recently fixed a bug in WSTP/MathLink for 11.2 which could cause a zombie kernel because MathLink was...ahem...calling malloc inside of a signal handler. Under Linux, the MathLink signal handler was getting invoked sometimes inside of the kernel calling malloc, and it hung hard. – John Fultz Jul 31 '17 at 18:00
  • @JohnFultz I am not sufficiently familiar with signal handlers (I just looked up why malloc is not allowed in them), but to put it simply: You are saying that using library "uninitialization" this way is just a bad idea, regardless of whether it appeared to work in my simple test. Is this correct? In fact, I just tried to add a pause of 2 seconds at the beginning of the "uninitialization". That causes the rest of the function not to get executed. (So you are right: it does not really work.) – Szabolcs Aug 01 '17 at 13:04
  • @Szabolcs yes, that is what I'm saying. Here's a useful web page with all kinds of stuff that anybody writing signal handlers would need to know: http://en.cppreference.com/w/cpp/utility/program/signal I agree that these behaviors are really not ideal...and I think they could be done better. I'll see if I can push on that a bit. Incidentally, YMMV on Windows...I think the Windows kernel may not rely on signal handlers, but instead on threads. A threaded solution would still cause problems with not being able to call the evaluator, but you could at least do other useful things. – John Fultz Aug 08 '17 at 18:21