The Subsets function takes an optional third argument, a standard sequence specification, which lets you take subsets "in chunks".
For example, the following code returns the three 5-combinations at positions 90000 to 90002 out of all roughly 8.25 trillion 5-combinations of a 1000-element set:
Subsets[Range[1000], {5}, {90000, 90002}]
(* {{1, 2, 3, 98, 845}, {1, 2, 3, 98, 846}, {1, 2, 3, 98, 847}} *)
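As a small sanity check of the numbers above, the sketch below computes the count with Binomial and, on a set small enough to enumerate fully, compares the third-argument form with taking a Part span of the full list. It uses only documented functions; the 5-element example set is my own choice for illustration.

```wolfram
(* Number of 5-combinations of a 1000-element set. *)
Binomial[1000, 5]
(* 8250291250200 *)

(* On a small set, positions 4 to 6 requested directly... *)
Subsets[Range[5], {3}, {4, 6}]
(* {{1, 3, 4}, {1, 3, 5}, {1, 4, 5}} *)

(* ...agree with slicing the fully generated list. *)
Subsets[Range[5], {3}, {4, 6}] === Subsets[Range[5], {3}][[4 ;; 6]]
(* True *)
```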
Lazy subsets
Using the undocumented Streaming` module, introduced in v10.1, you can implement a lazy list of subsets, i.e. an object that generally behaves like an ordinary list but never stores the whole list in memory. Instead, it generates subsets on demand, in "chunks" of the desired length.
Here is a very simple version, based on LazyTuples.
Needs["Streaming`"]
ClearAll[lazySubsets]
lazySubsets[list_, nspec_:All, chunkSize:_Integer?Positive:100000] :=
    Module[{ctr = 0, active = False},
        (* Test whether given arguments are valid for Subsets. *)
        Check[
            Quiet[Subsets[list, nspec, {1}], Subsets::take],
            Return[$Failed, Module]
        ];
        LazyListCreate[
            IteratorCreate[
                ListIterator,
                (active = True) &,
                With[
                    {taken =
                        Quiet[
                            Subsets[list, nspec, {ctr + 1, ctr + chunkSize}],
                            Subsets::take
                        ]
                    }
                    ,
                    ctr += Length[taken];
                    taken
                ] &,
                TrueQ[active] &,
                Remove[active, ctr] &
            ]
            ,
            chunkSize
        ]
    ]
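The chunk-producing step at the heart of lazySubsets can also be seen in isolation, without Streaming`. The following is only a minimal sketch and not part of Streaming`: makeChunkGenerator is a hypothetical name, and I restrict it to k-subsets so that the total count is known from Binomial and the requested range can be clamped, instead of relying on Quiet.

```wolfram
ClearAll[makeChunkGenerator]
(* Hypothetical helper: returns a zero-argument function that yields
   the next chunk of k-subsets on each call, and {} once exhausted. *)
makeChunkGenerator[list_List, k_Integer?NonNegative, chunkSize_Integer?Positive] :=
    Module[{ctr = 0, total = Binomial[Length[list], k]},
        Function[
            If[ctr >= total,
                {},
                With[
                    {taken =
                        Subsets[list, {k}, {ctr + 1, Min[ctr + chunkSize, total]}]
                    },
                    (* Remember how many subsets have been handed out so far. *)
                    ctr += Length[taken];
                    taken
                ]
            ]
        ]
    ]

next = makeChunkGenerator[Range[5], 3, 4];
next[]
(* {{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 3, 4}} *)
next[]
(* {{1, 3, 5}, {1, 4, 5}, {2, 3, 4}, {2, 3, 5}} *)
next[]
(* {{2, 4, 5}, {3, 4, 5}} *)
next[]
(* {} *)
```

The counter lives in the closure, much as ctr does in lazySubsets, so only one chunk is held in memory at a time.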
Example of usage of lazySubsets:
subs = lazySubsets[Range[5], {3}]
(* « LazyList[{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, ...] » *)
You can iterate over subs as if it were an ordinary list:
Scan[Print, subs]
(* {1,2,3}
{1,2,4}
{1,2,5}
{1,3,4}
{1,3,5}
{1,4,5}
{2,3,4}
{2,3,5}
{2,4,5}
{3,4,5} *)
You can Map a function and get another lazy list:
f /@ subs
(* « LazyList[f[{1, 2, 3}], f[{1, 2, 4}], f[{1, 2, 5}], f[{1, 3, 4}], f[{1, 3, 5}], ...] » *)
You can get its Length or a specific Part:
subs // Length
(* 10 *)
subs[[5]]
(* {1, 3, 5} *)
The memory required to use this lazy list depends on the given chunkSize.
Applying a function to an ordinary list of all 8 388 608 subsets of a 23-element set requires over a gigabyte of memory to store the whole list:
Scan[Identity, Subsets[Range[23]]] // MaxMemoryUsed
(* 1 194 005 632 *)
Applying a function to a lazy list that takes 10^5 subsets per chunk takes much more time, but uses only about 55 megabytes:
Scan[Identity, lazySubsets[Range[23], All, 10^5]] // MaxMemoryUsed
(* 55 351 288 *)
Taking 10^4 subsets per chunk uses only about seven megabytes of memory:
Scan[Identity, lazySubsets[Range[23], All, 10^4]] // MaxMemoryUsed
(* 7 154 944 *)
Clean up the cache after playing with Streaming`:
Scan[LazyListDestroy, LazyLists[]]
Scanning subsets chunks using only documented functions
If you want to apply a function to all k-combinations, taken in chunks, something like the following function can be useful (this version takes some inspiration from belisarius's comment and is a bit more robust than my previous version):
ClearAll[scanSubsetsChunks]
scanSubsetsChunks[f_, data_, nspec_:All, chunkLength_Integer?Positive] :=
    Module[
        {
            i = chunkLength + 1,
            getChunk =
                Quiet[
                    Subsets[data, nspec, {#, # + chunkLength - 1}],
                    Subsets::take
                ]&,
            chunk
        },
        chunk = Check[getChunk[1], Return[$Failed, Module]];
        While[chunk =!= {},
            f[chunk];
            chunk = getChunk[i];
            i += chunkLength;
        ]
    ]
scanSubsetsChunks[Print, Range[5], {3}, 3]
(* {{1,2,3},{1,2,4},{1,2,5}}
{{1,3,4},{1,3,5},{1,4,5}}
{{2,3,4},{2,3,5},{2,4,5}}
{{3,4,5}} *)
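Scanning discards whatever f returns. If you want a result accumulated across chunks instead, a fold-style variant is one option. This is only a sketch under assumptions: foldKSubsetsChunks is a hypothetical name, it handles k-subsets only, and it clamps each range with Binomial rather than quieting Subsets::take.

```wolfram
ClearAll[foldKSubsetsChunks]
(* Hypothetical helper: combine chunks of k-subsets with g[acc, chunk],
   starting from init, holding only one chunk in memory at a time. *)
foldKSubsetsChunks[g_, init_, data_List, k_Integer?NonNegative,
        chunkLength_Integer?Positive] :=
    Module[{acc = init, total = Binomial[Length[data], k]},
        Do[
            acc = g[
                acc,
                Subsets[data, {k}, {start, Min[start + chunkLength - 1, total]}]
            ],
            {start, 1, total, chunkLength}
        ];
        acc
    ]

(* Count the 3-subsets of {1, ..., 5} whose elements have an even sum,
   processing 4 subsets per chunk. *)
foldKSubsetsChunks[
    #1 + Count[#2, s_ /; EvenQ[Total[s]]] &,
    0, Range[5], 3, 4
]
(* 6 *)
```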
NextKSubset[] in the old Combinatorica` package. For background on the algorithm used, see Nijenhuis and Wilf (which is the reference Skiena based his implementation on). – J. M.'s missing motivation Jun 17 '15 at 13:13