5

I have a list, L, of this kind ($n\in \mathbb{N}$):

L=Sort[RandomSample[Subsets[Range[n],{2}],RandomInteger[{1,Binomial[n,2]}]]]

S contains any k-sized subset with no duplicating integers:

S=Select[Subsets[L,{k}],Length[Gather[Flatten[#]]]===2k&] (*k>1*)
(*What I need is Length[S]*)

But when $n$ and $k$ are too big, I think there must be a simpler way to get the length than generating S. Which obtainable properties of L determines Length[S]?

Coolwater
  • 20,257
  • 3
  • 35
  • 64
  • Can you give an example of L and S for small n and k? For some reason your code isn't working for me – Dr. belisarius Nov 13 '13 at 11:33
  • n=13; k=2; L={{1, 12}, {2, 8}, {2, 11}, {3, 4}, {8, 10}, {9, 11}, {10, 11}} Length[S]=15 I edited the missing parantese for Length in S assignment – Coolwater Nov 13 '13 at 11:39

1 Answers1

3

I propose two different solutions with comparable speed.

Some input set

n = 20;
SeedRandom[0];
L = Sort[RandomSample[Subsets[Range[n], {2}], 
    RandomInteger[{1, Binomial[n, 2]}]]];
Length[L]

168

The main idea: let's recursively look for all possible non-duplicated subsets and count them. It is relatively fast and it does not require a lot of memory.

Both solutions:

ClearAll[f];
f[set_List, k_] := f[set, k] =
   Sum[f[DeleteCases[
      set[[i + 1 ;;]], {set[[i, 1]], _} | {set[[i, 2]], _} | {_, 
        set[[i, 1]]} | {_, set[[i, 2]]}], k - 1], {i, 
     Length[set] - 1}];
f[set_List, 1] := Length[set];

f[set_SparseArray, k_, j_: 0] := f[set, k, j] =
   Sum[f[Drop[
      Drop[set[[p[[1]] + 1 ;;]], {p[[2]] - p[[1]] - j}, {p[[1]] + j}],
       None, {p[[2]] - 1}], k - 1, j + p[[1]] - 1], {p, 
     set["NonzeroPositions"]}];
f[set_SparseArray, 1, _] := Length@set["NonzeroValues"];

The first takes L as input. The second takes the sparse array

sp = SparseArray[L -> ConstantArray[1, Length[L]], {Max[L] + 1, Max[L] + 1}];
sp // ArrayPlot

enter image description here

k = 3;
Length@Select[Subsets[L, {k}], Length[Gather[Flatten[#]]] === 2 k &] // AbsoluteTiming

{6.013846, 400127}

f[L, 3] // AbsoluteTiming

{0.578520, 400127}

f[sp, 3] // AbsoluteTiming

{0.545830, 400127}

For k = 4 the difference is much more noticeable: 3 sec vs 10 min and 2 GB RAM.

Some notes:

  1. I use memoization.

  2. In the second solution f drops rows and columns of the sparse array so original indexing shifts after each drop. I use additional variable j to deal with it.

  3. Drop failed to drop the row of the sparse array if it has only 1 row. So I pad input array with additional empty row and column by Max[L] + 1.

ybeltukov
  • 43,673
  • 5
  • 108
  • 212