I have one improvement to suggest. Instead of finding the minimums with MapThread, use a compiled version that threads itself over the lists:
mappedmin = Compile[{{x, _Real, 1}},
Min[x],
RuntimeAttributes -> {Listable}, Parallelization -> True];
We can compare using this with the OP's and with Daniel Lichtblau's improvement.
fmm = Compile[{{n, _Integer}},
Module[{Fk = Table[RandomReal[{0, 1}], {5}, {n}],
k0 = Table[RandomReal[{0, 1}], {n}], TT = 0.1, ksum, i = 0},
While[TT > 0,
ksum = Total@Fk;
k0 = mappedmin[Transpose@{k0, RandomReal[{0, 1}, n] ksum}];
TT = Round[Total@k0, 0.01]; i++;
Fk = Delete[Fk, {1}];
AppendTo[Fk, k0];]; i],
CompilationOptions -> {"InlineExternalDefinitions" -> True}];
f[10^6] // AbsoluteTiming
fmm[10^6] // AbsoluteTiming
(*
{30.814108, 193}
{18.011360, 193}
*)
If Parallelization is turned off, it runs about 1+ sec. slower.
Here is an implementation of Daniel's suggestion, similar to rasher's f2c, without and with mappedmin. (I also got rid of the intermediate variable ksum, since that was a source of one of the CopyTensor calls, and used vectorized multiplication.)
dl = Compile[{{n, _Integer}},
Module[{Fk = RandomReal[{0, 1}, {5, n}],
k0 = RandomReal[{0, 1}, n], TT = 0.1, i = 0},
While[TT > 0,
i++;
k0 = MapThread[Min[#1, #2 RandomReal[{0, 1}]] &, {k0, Total@Fk}];
TT = Round[Total@k0, 0.01];
Fk[[Mod[i, 5, 1]]] = k0;
]; i]
];
dlmm = Compile[{{n, _Integer}},
Module[{Fk = RandomReal[{0, 1}, {5, n}],
k0 = RandomReal[{0, 1}, n], TT = 0.1, i = 0},
While[TT > 0,
i++;
k0 = mappedmin[Transpose@{k0, RandomReal[{0, 1}, n] Total@Fk}];
TT = Round[Total@k0, 0.01];
Fk[[Mod[i, 5, 1]]] = k0;
]; i],
CompilationOptions -> {"InlineExternalDefinitions" -> True}];
dl[10^6] // AbsoluteTiming
dlmm[10^6] // AbsoluteTiming
(*
{25.698229, 193}
{13.146244, 193}
*)
Again mappedmin saves about 12 sec. (40% of the timing of the OP's f).
If we compare mappedmin with MapThread, we see it takes about 55% of the time that a compiled MapThread takes, which does not completely account for the 12 sec. saving above:
mapthreadmin = Compile[{{x1, _Real, 1}, {x2, _Real, 1}},
MapThread[Min, {x1, x2}]
];
data = RandomReal[1, {2, 10^6}];
Do[mapthreadmin[data[[1]], data[[2]]], {18}] // AbsoluteTiming // First
Do[mappedmin[Transpose@data], {18}] // AbsoluteTiming // First
(*
1.010342
0.549180
)*
Fk[[1]] = k0without the deletion, if there is no order dependency (I cannot tell for certain although i suspect it has no such dependency). If there is a dependency on the ordering then you could maintain a "current" index and iterate from there modulo the length e.g. if length is 6 and current is 4 one would iterate over elements 4,5,6,1,2,3 in that order. Standard trick. – Daniel Lichtblau Jun 12 '14 at 20:50AppendToin my original code (http://mathematica.stackexchange.com/questions/50599/garbage-collection-with-nminimize-and-performance). For some, order is not important and for others it is important. For the former, if I doFk[[1]]=k0, then I will be only replacing that one row every subsequent iteration. The 4 other rows stay the same all the time. However, I need that as a new row is added at the bottom, top row is deleted. Latter case not clear to me. In the above code,Fkis always 5X10. How do you get 6 elements and in that order. Please elaborate. – brama Jun 12 '14 at 21:40i. I think usingFk[[Mod[i, 5] + 1]] = k0;instead ofFk = Delete[Fk, {1}]; AppendTo[Fk, k0];would do the trick. – jkuczm Jun 12 '14 at 22:25RotateLeftand then set the last element (previously the first) to the new value. Or the other way around, if you like. Dan's suggestion is more efficient if you don't mind keeping track of where you're supposed to start from. – Oleksandr R. Jun 13 '14 at 00:05RotateLeftis awesome. +1 – brama Jun 13 '14 at 15:35copyTensorremaining. Do you have any thoughts on how I can get rid of even that!! – brama Jun 13 '14 at 16:05