
I've written my incredibly complex, incredibly elegant analysis function, that works great on small test data. But when I run it on my real (bigger) data set it keeps running out of memory. It turns out that the analysis function does not free memory, but I can't imagine why. It takes a large number of points, but returns only several scalar values.

[Figure: memory usage over time]

Every time I run this it takes up about 500 MB of memory. (here is another example).

What is the best way to debug memory problems?

I've read the memory management tutorial, turned off caching and verified I have no lingering variables in my contexts and of course I have set $HistoryLength to zero.

Also, running Reverse@Sort[{ByteCount[Symbol[#]], #} & /@ Names["`*"]] shows no symbols with unexpectedly large memory usage, just the data:

{{191816648, "alldata"}, {28184, "before"}, {28184, "after"},
 {24096, "compiledSelectBin"}, {15344, "AppendLeftRight"}, {8840, "compiledSelectBinFunc"}...}

EDIT One can use this code to track memory consumption:

DynamicModule[{pm = {}},
 Dynamic@Refresh[pm = Append[pm, MemoryInUse[]]; 
   If[Length[pm] > 120, pm = Drop[pm, 1]]; 
   ListPlot[pm/1024/1024, AxesLabel -> {"Time [s]", "Memory [MB]"}, 
    PlotRange -> {0, All}], UpdateInterval -> 1, 
   TrackedSymbols :> {}]]

I think I finally have a minimum example. Here it is. Unzip to a folder and evaluate the two cells in LeakP.nb. If you evaluate the second cell multiple times you can watch the memory consumption grow. Could somebody (on win7 64 bit mma 8) confirm this?

EDIT 1

I really hope I have nailed it down. Here is a self contained example:

$HistoryLength = 0;
data = RandomReal[{-1, 1}, {10, 100000, 2}];
data = Developer`ToPackedArray[#] & /@ data;
data = Flatten[data, 1];
Dimensions[data]
HistogramList[data, 30, Automatic];
ClearAll[data]; ClearSystemCache[];

EDIT 2

This is fixed in Mathematica 9.0.0.

Ajasja
  • The question you ask is impossible to answer in any non-speculative way without seeing the code. Having said that, and assuming for a second that you did follow the typical suggestions, have you tried explicitly setting all variables in your Module/Block to Null as soon as they are no longer used? –  May 29 '12 at 10:28
  • @rubenko No, not yet. Also, I'm not asking where my problem is; I'm asking how to start tracking it down. (Aha, the question in the image is rhetorical :) This is a largish piece of code (about 600 lines) and I'm at a loss where to begin... – Ajasja May 29 '12 at 10:32
  • I don't have an answer to your question, but in a comparable situation I found that using Share[] in the routine reduced the memory footprint considerably. See for example here – Markus Roellig May 29 '12 at 10:33
  • @Ajasja, you need to simplify your code and monitor memory while doing that. –  May 29 '12 at 10:36
  • @ruebenko Do you mean just printing MemoryInUse[] in various places? – Ajasja May 29 '12 at 10:51
  • @Ajasja, yes, and MaxMemoryUsed[]. Perhaps something like mem := {MemoryInUse[]/1024.^2, MaxMemoryUsed[]/1024.^2} –  May 29 '12 at 11:01
  • @rubenko Oh, I just remembered, I'm passing large arrays to compiled functions (With CompilationTarget -> C). Are there any known memory leaks when doing that? – Ajasja May 29 '12 at 11:29
  • @Ajasja, not that I know about. –  May 29 '12 at 11:58
  • Perhaps it's time to try constructing a minimal working example and sharing it. If the mistake is indeed in your code (and it's not a bug in some Mathematica function you're using), it would be very worthwhile for all of us to learn about it. (If you post a working example, please note that some of us only have 2 GB of memory, so scale it down a bit :) – Szabolcs May 29 '12 at 12:03
  • Note that ByteCount[sym] will only show the memory used by the OwnValues for the symbol. I introduced the symbolMemoryUsage function in my answer specifically to address other global properties as well. – Leonid Shifrin May 29 '12 at 13:23
  • @Ajasja, with the edit: you could try the following: add the dummy line: thisDoesNothing; before and after the call to compiledSelectBin and see if the mem consumption goes down. –  May 29 '12 at 13:44
  • @rubenko No change in memory consumption. – Ajasja May 30 '12 at 08:28
  • Did you make progress on this since your last edit? – Szabolcs May 30 '12 at 08:39
  • I think @acl gave a very sound suggestion: try to refactor your code so that it is built from really small functions and uses as little state as possible. It should then be much easier to locate the place where the leak is happening. – Leonid Shifrin May 30 '12 at 10:12
  • @Szabolcs Are you on win7? I uploaded an example of the leak... – Ajasja May 30 '12 at 11:02
  • @Ajasja Nope, WinXP and 32 bit, but I'll try. – Szabolcs May 30 '12 at 11:04
  • @Szabolcs It seems that just running steps = Import["steps.mx"]; ClearAll[steps] is enough. Am I missing something obvious? – Ajasja May 30 '12 at 11:14
  • @Ajasja You removed the edit just before I could try it, is it not the cause of the problem after all? – Szabolcs May 30 '12 at 14:02
  • @Szabolcs No, to my shame I forgot to add $HistoryLength = 0; to the sample (I have it set in my package files...) – Ajasja May 30 '12 at 14:05
  • @Ajasja Don't worry, I just managed to do the same myself :-) – Szabolcs May 30 '12 at 14:07
  • @ruebenko I managed to put together a minimal example that leaks memory. Do you perhaps know why? – Ajasja May 30 '12 at 15:19
  • @Ajasja, I don't have access to Windows right now. You assume that the leak can be seen with the QucikDensityHistogram. If so, try to comment out your ListContourPlot and see whether it still leaks. If it does, comment out the previous line, and so on, until you find the line that causes the leak. If you could reduce that to a notebook, that would be good. –  May 30 '12 at 15:44
  • @ruebenko I narrowed it down to HistogramList. Please see edit. – Ajasja May 30 '12 at 16:03
  • I can confirm the leak in 32-bit WinXP. – Szabolcs May 30 '12 at 16:20
  • @Ajasja, I forwarded this to a developer but have not heard back yet. –  May 30 '12 at 20:23
  • @rubenko Do you think I need to file a bug report, or will it be fixed in the next version? – Ajasja May 31 '12 at 08:18
  • @Ajasja, thanks for tracking it down. The respective developers are looking and this will hopefully be resolved soon in the development version. So no need for a bug report. –  May 31 '12 at 10:16
  • @Ajasja have you seen this? – rcollyer Mar 14 '13 at 20:11
  • I recently found a memory leak associated with using two native functions, Area and Triangle, together. I found it by eliminating code from the function until the error no longer occurred, eventually isolating it to the one offending line, Area[Triangle[#]]&/@coordinates on a coordinates list. A similar dissection of your function until you find the offending line may work. – Ghersic Dec 29 '19 at 12:06

2 Answers


Preamble

It is hard to say what exactly is causing this without seeing the code, but, assuming that there are no memory leaks in the built-in functions you are using, I am aware of only a few possible causes of memory leaks in Mathematica. Since almost everything is immutable, the leaks must be associated with some symbols for which definitions are accumulated but not cleared.

I will show here one rather obscure case of leaking local Module variables, which happens when the variable is referenced by some object or symbol external to its scope. In such cases, the variables are not garbage-collected even after the symbols referencing them get Remove-d, provided they have been assigned DownValues, SubValues or UpValues (OwnValues are OK).

One subtle case with a memory leak

MemoryInUse[]
 17350016
$HistoryLength = 0;

Module[{g}, Module[{f}, g[x_] := f[x]; Do[f[i] = Range[i], {i, 5000}]; ]; g[1]]

 {1}
MemoryInUse[]
 72351376

One way to ensure that this does not happen is to insert Clear[f] at the end of the outer Module, storing the result in a separate variable and returning it afterwards. There are more advanced ways to prevent such things as well. I may elaborate on those at some later time.
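As a minimal sketch of that idea (my reading of it, not necessarily the exact form the answer had in mind), the result can be computed while f is still in scope, after which Clear[f] drops the accumulated definitions. The names result and out are hypothetical and not part of the original example, and Clear[f] is placed inside the Module that declares f because that is where f is visible.

```mathematica
$HistoryLength = 0;

out = Module[{g, result},
   Module[{f},
    g[x_] := f[x];
    Do[f[i] = Range[i], {i, 5000}];
    result = g[1];  (* evaluate while the definitions of f still exist *)
    Clear[f]        (* drop the accumulated DownValues of f *)
    ];
   result           (* return the stored value, not an expression involving f *)
   ];
out
```

Comparing MemoryInUse[] before and after this version with the leaking version above is one way to check whether it helps on a given Mathematica version.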

Memory leaks associated with UI-building

One common cause of memory leaks that is often overlooked is local symbols making their way into UI elements. The problem is that UI elements are Mathematica expressions which reference those symbols, and therefore the symbols are not garbage-collected.

Here is an example I borrowed from this thread

memModule[] := 
  Module[{data, memBefore, mu}, 
     mu := Grid[{{"Memory in use: ", MemoryInUse[]/(2^30.), "GB"}}]; 
     memBefore = mu; 
     data = RandomReal[1, {300000, 20}]; 
     DynamicModule[{d1}, 
        d1 := data[[1]]; 
        Panel[Grid[{{memBefore}, {mu}}]] 
       , UnsavedVariables -> {dl} 
    ] 
 ]; 

Now, every time it is executed, more memory is leaked:

memModule[] 
memModule[] 
memModule[] 

Please see my answer in the linked thread for one way out, in this particular case. Generally, this is something to watch out for.

Monitoring symbols

So, one good place to start is to call

Names["Global`*"]
 {"f", "f$", "f$119", "g", "i", "x", "x$"}

or whatever main context you are using (or other contexts, if you create symbols there), and watch for symbols with high memory usage. In this particular case, the culprit is f$119.

Here are some utility functions which may help with monitoring symbols:

Clear[$globalProperties];
$globalProperties =
    {OwnValues, DownValues, SubValues, UpValues, NValues, 
     FormatValues, Options, DefaultValues, Attributes, Messages};

ClearAll[getDefinitions];
SetAttributes[getDefinitions, HoldAllComplete];
getDefinitions[s_Symbol] :=
  Flatten@Through[
    Map[
      Function[prop,
        (* Unevaluated needed here just for Options, which is not holding *)
        Function[sym, prop[Unevaluated @ sym], HoldAll]
      ],
      $globalProperties
    ][Unevaluated[s]]
  ];

ClearAll[symbolMemoryUsage];
symbolMemoryUsage[sname_String] :=
  ToExpression[sname, InputForm,
    Function[s, ByteCount[getDefinitions[s]], HoldAllComplete]
  ];

ClearAll[heavySymbols];
heavySymbols[context_, sizeLim_: 10^6] :=
  Pick[#, UnitStep[# - sizeLim] &@Map[symbolMemoryUsage, #], 1] &@
    Names[context <> "*"];

For example, calling

heavySymbols["Global`"]

returns

 {f$119}
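To illustrate why these utilities inspect DownValues and the other global properties instead of calling ByteCount on the symbol itself, here is a small self-contained sketch. The symbol h and the variables symbolBytes and definitionBytes are hypothetical names chosen for this illustration only.

```mathematica
ClearAll[h];
h[i_Integer] := h[i] = Range[i];   (* memoized values are stored as DownValues of h *)
Do[h[i], {i, 1000}];

symbolBytes = ByteCount[h];                  (* h has no OwnValues, so only the bare symbol is measured *)
definitionBytes = ByteCount[DownValues[h]];  (* measures the stored definitions themselves *)
{symbolBytes, definitionBytes}
```

The second number should be much larger than the first, which is the situation a plain ByteCount[Symbol[name]] scan over Names would miss.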
Gustavo Delfino
Leonid Shifrin
  • +1 Thank you for the example. Now I understand the comment you made here. Unfortunately, there appear to be no heavy symbols. heavySymbols[""] returns {"alldata", "GeoProjectionData"}, which are the data I loaded at the beginning and an always-present built-in. – Ajasja May 29 '12 at 11:55
  • @Ajasja Well, then I am at a loss. Do you create symbols in other contexts? Or, do you return any expression involving local variables, or embed it into UI, for example? UI-s are another cause of memory leaks, see e.g. my answer in this thread – Leonid Shifrin May 29 '12 at 12:00
  • @Ajasja I also had a situation with the kernel eating up gigabytes of memory, linearly increasing in time, even though I never kept more than 1GB of data in symbols. I had also checked, and no symbols came anywhere near the memory consumed. In the end I rewrote my program in a completely different way and now it doesn't eat memory. I have no clue what it was. – acl May 29 '12 at 12:13
  • @LeonidShifrin I don't create symbols in any other context. What I'm doing a lot is assigning to the same Module variable multiple times. For example data=rawdata; data=GetDiffrences[data]; data=Mean[data]... Return[data] But I believe this should be safe. I'm putting memory print statements all over my code as we speak... – Ajasja May 29 '12 at 12:14
  • @acl Brr... that would be a worst case scenario! I spent two weeks on this code and now I can't get a single result... – Ajasja May 29 '12 at 12:16
  • @Ajasja There does seem to be an unstoppable (inaccessible) memory sink somewhere deep in Mathematica. I've run into it with my own application. The only reliable way around it seems to be re-architecture (cf acl above). In my case, the sink was somewhere under a huge inner product. Leonid eliminated that for me, i.e. http://forums.wolfram.com/mathgroup/archive/2010/Jan/msg00829.html. – Vince May 29 '12 at 12:25
  • @Vince Hi Vince, nice to see that you are here! (I mean, I did not realize before that you are you :)). – Leonid Shifrin May 29 '12 at 13:04
  • @Leonid Thanks Leonid. It's a sporadic presence, and usually rushed. While I'm off-topic, let me say how impressed I am with what has been created here on SE for MMA. A lot of impressive dedication and substantial contribution. – Vince May 29 '12 at 13:49
  • @Vince Thanks Vince! Lots of great folks accumulated here, from Stack Overflow, Mathgroup, WRI, and elsewhere. I think what makes this place stand out and compare favorably to, say, Mathgroup (with all due respect for the latter), is the level of collaboration (by which I also mean a healthy competition, which is often very beneficial for everybody), and the sense of community as a whole, rather than a set of isolated experts. Hope to see a lot of your answers appearing here! – Leonid Shifrin May 29 '12 at 14:19
  • @Ajasja Buried in the link I included above, you can see that my problematic memory profile resembles yours. Not sure if all memory profiles of MMA functions look this way (if not flat). memory profile. (How to display an image in a Comment?) – Vince May 29 '12 at 14:30
  • @Vince: working set is probably not a reliable indicator of actual memory usage. A program that repeatedly allocates and frees memory can result in a highly inflated working set metric, because in a system under a light memory load, the memory manager will just put freed pages on the standby list without actually freeing them in case they're allocated again soon afterward. Just because the working set or virtual size is large, it doesn't necessarily mean that the process is using all that memory. – Oleksandr R. May 30 '12 at 01:59
  • @OleksandrR. I was trying to find the total RAM footprint of jobs which I was running remotely on other users' Windows XP desktops (to maintain their performance). perfmon's "Process > Working Set" seemed to approximate that footprint best. Also, supposedly, MMA tries very hard to recycle memory it has already taken from the system. I didn't see that in my memory profiles. By your clarification, it seems that recycling would keep the Working Set down, since it would eliminate many system memory allocations. Again, that was not apparent in the profiles. But maybe I'm not following. – Vince May 30 '12 at 03:13
  • @Vince: in that case, working set seems like a reasonable metric, considering that a pessimistic estimate is probably better for this application. It's not very useful for assessing possible memory leaks though. Personally I'd tend not to put much faith in claims like "Mathematica tries to..." unless you know what they really mean in practice. Mathematica doesn't have a great deal of control over the OS's memory management policies, which vary by system load, OS version, and many other unpredictable factors. – Oleksandr R. May 30 '12 at 03:25
  • @OleksandrR. The bit about "recycling" was something I got from WRI's tech support. Voodoo to me. – Vince May 30 '12 at 03:28
  • @Vince I got the memory profile from pm = {} Dynamic@Refresh[pm = Append[pm, MemoryInUse[]]; ListPlot[pm/1024/1024, AxesLabel -> {"Time [s]", "Memory [MB]"}, PlotRange -> {0, All}], UpdateInterval -> 1, TrackedSymbols :> {}] so it should be reliable. – Ajasja May 30 '12 at 08:30
  • @Ajasja No doubt it's at least a reliable relative measure. I needed a system-level metric. That is, how well math.exe would play with others in the system process list. Working Set was as close as I could get. (MemoryInUse[] always under-reported.) I'll mention again that, though the scale is different, our memory profiles have very similar shapes. I find that intriguing. But perhaps common, as I also obliquely commented above somewhere. – Vince May 30 '12 at 12:14
  • @Vince I think such a shape could be common. Some memory is allocated (that will not get released) and then some further processing allocates even more memory, but then frees it. BTW, you can't embed images in comments, you can only link to them (to the best of my knowledge). – Ajasja May 30 '12 at 14:02
  • @Ajasja The sticking point is in your first parenthetical remark. My inner products, though large, were very simple operations and did not leave residual temporaries (at the user level). Yet the size of the final product was much smaller than indicated in the memory profile. Though please take these observations carefully; it's been a long time since I've run this diagnostic. – Vince May 30 '12 at 14:55
  • When I run heavySymbols["Global`", 10^6] after some code, some functions from my code start evaluating. Could some HoldAll be missing? Or any idea what might be causing this? – Kvothe Jun 18 '21 at 08:45
  • Also, how does this work with different Wolfram Kernels? I can see, in Gnome-Monitor, a high memory usage (3.2GB) in 6 different kernels (I called 6 kernels in a ParallelMap so that part makes sense). Apparently these are still alive after the parallel computation has already finished. In addition I see the main Kernel, which has a memory usage that is close to the value returned by MemoryInUse[] (~1GB). So do these debugging tools only show the usage in the main kernel? It seems my problem isn't there. – Kvothe Jun 18 '21 at 08:52
  • Ah, profiling memory usage in subkernels is simple: you just need to use ParallelEvaluate, so in this case for example ParallelEvaluate[heavySymbols["Global`", 10^6]] – Kvothe Jun 18 '21 at 09:26
  • @Kvothe There was indeed an evaluation leak coming from Options not holding its argument, which would show up for symbols with OwnValues, such as e.g. a := Print["***"]; calling getDefinitions on a would then lead to printing. I have made an edit that should fix it, check it out. Regarding parallel, it looks like you already found an answer. – Leonid Shifrin Jun 18 '21 at 09:48
  • Thanks, indeed your update fixed the evaluation leak. – Kvothe Jun 19 '21 at 17:16
  • @Kvothe Excellent. Glad it was helpful. – Leonid Shifrin Jun 20 '21 at 12:04

This is a memory leak in HistogramList:

You can reclaim the memory by evaluating

Remove["*`*modelData$*"]
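A rough way to check whether this workaround helps on a particular installation is to record MemoryInUse[] around the call, reusing the shape of the self-contained example from the question. This is only a sketch: the variable names (before, afterLeak, leaked, afterRemove) are hypothetical, the data set is scaled down, and since the question reports the leak as fixed in Mathematica 9.0.0, later versions may show no leaked symbols and no meaningful difference.

```mathematica
$HistoryLength = 0;
before = MemoryInUse[];

data = RandomReal[{-1, 1}, {100000, 2}];
HistogramList[data, 30, Automatic];
ClearAll[data]; ClearSystemCache[];
afterLeak = MemoryInUse[];

(* list the candidate symbols first, so Remove is only called when something matches *)
leaked = Names["*`*modelData$*"];
If[leaked =!= {}, Remove["*`*modelData$*"]];
afterRemove = MemoryInUse[];

(* memory growth in MB before and after the cleanup *)
{afterLeak - before, afterRemove - before}/1024.^2
```

Inspecting leaked before removing anything also shows which contexts the leftover symbols live in.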
Brett Champion