
My question consists of two parts.

  1. How can I create a random undirected connected graph such that the probability of the event "the degree of a vertex equals k" is $\frac{k^{-\gamma }}{\zeta (\gamma )},\,k=1,2,\dots,\,\gamma >1$? Of course, I looked at RandomGraph, but didn't find the answer. My attempt, RandomGraph[ZipfDistribution[2]], does not work.
  2. Additionally, let the probability of the event "vertices of degrees $k_1$ and $k_2,\, k_1\ge k_2,\,k_2\ge 1$, are connected by an edge" be equal to $$\frac {k_1^{-\gamma}k_2^{-\epsilon}} {\sum _{k_2=1}^{\infty } k_2^{-\epsilon } \zeta (\gamma ,k_2)},\,\gamma > 1, \epsilon >1,\,\gamma \neq \epsilon.$$

The question is motivated by the article Lazaros K. Gallos, Chaoming Song, and Hernán A. Makse, Phys. Rev. Lett. 100, 248701 (2008).

J. M.'s missing motivation
user64494

1 Answer


What does it mean to sample random graphs with certain properties? If the property applies to individual graphs, then "random" would mean that we assign the same probability to each graph that satisfies the property (and exclude the rest). Sampling from such a distribution is usually a difficult problem. It is literally a research-level problem for each constraint you come up with. But if you're lucky, someone has already solved that problem for your specific constraint.

The first constraint you mention refers to the degree distribution. Look up the Chung–Lu model for this. In IGraph/M, IGStaticFitnessGame and IGStaticPowerLawGame implement a variant of this. In this model, you can set the expected degree of each vertex. The actual degree of that vertex in a single sample may vary significantly. It is only the average taken across many samples that will match.
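As a rough sketch of why only the average matches, assume (as I understand the igraph documentation of the static fitness model) that each of the $m$ edges picks both of its endpoints independently with probability proportional to the vertex fitness $f_i$, with self-loops and multi-edges allowed. The symbols $f_i$, $m$ and $d_i$ are notation introduced here. Then the expected degree of vertex $i$ is

$$\mathbb{E}[\deg i] = 2m\,\frac{f_i}{\sum_j f_j},$$

so choosing $f_i = d_i$ and $m = \frac{1}{2}\sum_j d_j$ gives $\mathbb{E}[\deg i] = d_i$. The degree in any single sample still fluctuates around this value.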

Another thing you can do is to sample from the set of graphs which have a specific degree sequence, i.e. the degree of each vertex is fixed. First, sample the degrees from an arbitrary distribution. Then check that a graph with these degrees exists (this is called graphicality, see IGGraphicalQ). Then use IGDegreeSequenceGame to sample graphs with that degree sequence. Do read the IGDegreeSequenceGame documentation and choose the appropriate method! Not all implemented methods sample uniformly; the default one doesn't. The good method choices are the configuration model, which is only usable for small or very sparse graphs (otherwise it's too slow), or the Viger–Latapy method, which samples only connected graphs (in case you need that). Yet another alternative is to first create a single graph using IGRealizeDegreeSequence, then "shuffle its edges around" using IGRewire. This also leads to uniform sampling in the limit of a large number of rewiring steps.

The built-in DegreeGraphDistribution should also be able to sample graphs with given degrees, but its behaviour is fishy and I was never able to extract a satisfactory response from Wolfram about what is going on: the developers simply refuse to respond. That's a big red flag for me, and I always avoid this function for this reason. You just don't know what you're getting.


As for your second question, if you want to control the probability of connection between vertices of certain degrees, the keyword is "joint degree matrix". I do not have a ready-made program for you, but you can look at https://doi.org/10.1137/130929874 and https://doi.org/10.1145/2133803.2330086
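This is not a sampler, but measuring the empirical joint degree counts of a given graph takes only built-in functions, which is useful for checking whatever sampler you end up with. A minimal sketch follows; the helper name jointDegreeCounts is made up for illustration, and each degree pair is ordered as {k1, k2} with k1 >= k2 to follow the convention in the question.

```mathematica
(* Count edges by the degrees of their endpoints, ordered so that k1 >= k2. *)
(* Intended for simple undirected graphs; jointDegreeCounts is a hypothetical helper name. *)
jointDegreeCounts[g_Graph] :=
 Module[{deg = AssociationThread[VertexList[g], VertexDegree[g]]},
  Counts[ReverseSort[{deg[First[#]], deg[Last[#]]}] & /@ EdgeList[g]]
  ]

(* Path graph on 4 vertices: degrees are 1, 2, 2, 1 *)
jointDegreeCounts[Graph[{1 <-> 2, 2 <-> 3, 3 <-> 4}]]
(* <|{2, 1} -> 2, {2, 2} -> 1|> *)
```

Dividing the counts by EdgeCount[g] turns them into empirical frequencies that can be compared with a target distribution.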


Examples

Sample the degrees from a Zipf distribution:

SeedRandom[15]
degrees = ReverseSort@RandomVariate[ZipfDistribution[1.2], 100]
(* {96, 75, 14, 11, 6, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 2, 2, \
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1} *)
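A note on the exponent: ZipfDistribution[rho] has probability mass k^-(rho + 1)/Zeta[rho + 1], so the exponent $\gamma$ in the question corresponds to rho = $\gamma$ - 1. ZipfDistribution[1.2] above therefore means $\gamma = 2.2$. A quick check with built-ins only (the value 2.2 is just an example):

```mathematica
gamma = 2.2;                         (* example value of the exponent in the question *)
dist = ZipfDistribution[gamma - 1];  (* PDF is k^-(rho + 1)/Zeta[rho + 1] with rho = gamma - 1 *)

(* Largest deviation from k^-gamma/Zeta[gamma] over k = 1..10; should be zero up to rounding *)
Max@Abs@Table[PDF[dist, k] - k^-gamma/Zeta[gamma], {k, 1, 10}]
```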

There may be no simple graph (i.e. no self-loops or multi-edges) having these degrees. IGGraphicalQ tests this:

IGGraphicalQ[degrees]
(* False *)
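For intuition, the same test can be sketched with built-ins via the Erdős–Gallai theorem: a non-increasing sequence $d_1 \ge \dots \ge d_n$ of non-negative integers is graphical exactly when its sum is even and $\sum_{i \le k} d_i \le k(k-1) + \sum_{i>k} \min(d_i, k)$ holds for every $k$. The helper name erdosGallaiQ is made up for illustration, and this is not necessarily how IGGraphicalQ is implemented.

```mathematica
(* Erdős–Gallai graphicality test; erdosGallaiQ is a hypothetical helper name *)
erdosGallaiQ[degs_List] :=
 Module[{d = ReverseSort[degs], n = Length[degs]},
  EvenQ[Total[d]] &&
   AllTrue[Range[n],
    Function[k,
     (* sum of the k largest degrees vs. the most edges they can absorb *)
     Total[Take[d, k]] <= k (k - 1) + Total[Min[#, k] & /@ Drop[d, k]]]]
  ]

erdosGallaiQ[{2, 2, 2}]     (* True: the triangle *)
erdosGallaiQ[{3, 3, 1, 1}]  (* False: the k = 2 inequality fails, 6 > 2 + 2 *)
```

For the sequence sampled above, the inequality already fails at k = 2: the two largest degrees sum to 96 + 75 = 171, while the right-hand side is 2 + (37*2 + 61) = 137.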

But we can still use them as input to the Chung–Lu model, as this model only produces these degrees on average. Here's the average of each vertex's degree over many sampled graphs:

Mean@N@Table[
   VertexDegree@
    IGStaticFitnessGame[Total[degrees]/2, degrees, 
      MultiEdges -> True,
      SelfLoops -> True],
   {1000}
   ]


(* {96.031, 75.155, 13.977, 10.881, 6.018, 4.047, 3.862, \
3.999, 4.042, 2.959, 2.857, 2.957, 2.986, 2.955, 3.02, 3.011, 1.974, \
1.998, 2.077, 2.012, 1.979, 2.005, 1.961, 2.066, 1.935, 2.003, 2.031, \
2.009, 1.934, 1.97, 1.956, 2.014, 2.018, 1.917, 2.069, 1.957, 1.991, \
2.064, 2.009, 0.987, 1.002, 1.007, 0.996, 1.068, 1.028, 1.036, 0.99, \
0.953, 1.019, 0.992, 1.043, 1.027, 0.975, 1.004, 0.983, 1.068, 0.992, \
1.044, 0.994, 0.979, 1.006, 1.024, 1.027, 0.968, 0.974, 0.925, 1.07, \
1.009, 0.999, 0.976, 1.018, 1.045, 1.04, 1.014, 0.976, 1.031, 0.995, \
0.979, 0.968, 0.97, 0.991, 1.017, 0.996, 0.973, 0.984, 0.988, 1.04, \
0.999, 1.016, 0.993, 0.99, 1.001, 1.031, 1.024, 0.956, 1.069, 1.005, \
1.026, 0.981, 1.013} *)

In a single sample, the degrees will not match perfectly:

VertexDegree@
 IGStaticFitnessGame[Total[degrees]/2, degrees, MultiEdges -> True, 
  SelfLoops -> True]
(* {83, 74, 16, 9, 3, 4, 1, 5, 7, 6, 2, 2, 5, 4, 5, 3, 0, 2, \
0, 3, 1, 4, 1, 3, 1, 2, 2, 1, 4, 5, 1, 1, 7, 2, 2, 1, 2, 2, 3, 0, 1, \
3, 1, 3, 1, 1, 3, 1, 0, 0, 0, 1, 0, 0, 2, 1, 2, 1, 1, 3, 1, 0, 0, 0, \
2, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 4, 0, 1, 1, 0, 0, 1, 0, 2, 2, 2, 4, \
0, 2, 1, 3, 2, 2, 3, 2, 0, 2, 2, 0, 0} *)

If we disable self-loops and multi-edges, then the degrees won't match well even on average. This is not that surprising, since this degree sequence wasn't graphical.

Mean@N@Table[
   VertexDegree@IGStaticFitnessGame[Total[degrees]/2, degrees],
   {1000}
   ]
(* {51.954, 44.692, 13.035, 10.82, 6.927, 5.169, 5.095, 5.172, \
5.218, 4.202, 4.167, 4.183, 4.119, 4.094, 4.076, 4.063, 2.971, 3.043, \
3.133, 2.941, 3.033, 2.961, 3.018, 2.945, 2.993, 2.951, 3.006, 2.974, \
3.064, 2.907, 3.017, 2.956, 3.046, 3.043, 3.021, 3.03, 3.024, 2.987, \
2.968, 1.663, 1.681, 1.654, 1.656, 1.643, 1.606, 1.662, 1.604, 1.653, \
1.646, 1.679, 1.673, 1.579, 1.683, 1.702, 1.655, 1.583, 1.6, 1.624, \
1.537, 1.643, 1.662, 1.638, 1.603, 1.69, 1.689, 1.7, 1.609, 1.591, \
1.646, 1.557, 1.559, 1.65, 1.616, 1.661, 1.648, 1.622, 1.643, 1.592, \
1.688, 1.608, 1.644, 1.66, 1.63, 1.646, 1.614, 1.655, 1.622, 1.665, \
1.618, 1.69, 1.602, 1.703, 1.629, 1.635, 1.669, 1.669, 1.627, 1.592, \
1.628, 1.686} *)

Sample a graphical degree sequence from a Zipf distribution:

degrees = 
  ReverseSort@
   IGTryUntil[IGGraphicalQ]@RandomVariate[ZipfDistribution[1.6], 100];

Now we can create a simple graph having exactly these degrees:

IGDegreeSequenceGame[degrees]

(Image: plot of the sampled graph)

VertexDegree[%] == degrees
(* True *)

Note that with the default method, the sampling won't be uniform. To get uniform sampling, use

IGDegreeSequenceGame[degrees, Method -> "ConfigurationModelSimple"]

This will be unusably slow if the exponent of the Zipf distribution is too low. Method -> "VigerLatapy" is much faster and samples approximately uniformly, but it samples only connected graphs. A connected graph on n vertices needs at least n - 1 edges, so for a graphical degree sequence with no zero degrees it is sufficient to check that the number of edges, Total[degrees]/2, is at least Length[degrees] - 1.

potenticallyConnectedQ[degrees_] := 
 Total[degrees]/2 >= Length[degrees] - 1

This condition does not hold for the degree sequence sampled above, so no connected graph has these degrees.

But we can use yet another approximately uniform sampling method: create one graph with the given degrees and rewire its edges with degree-preserving edge swaps.

IGRewire[
 IGRealizeDegreeSequence[degrees],
 1000
]

To obtain several samples, use something similar to:

NestList[IGRewire[#, 1000] &, IGRealizeDegreeSequence[degrees], 10]

Here I used 1000 rewiring trials. If we use too few, then subsequent samples will not be statistically independent.


Update:

If you only want connected graphs, use IGDegreeSequenceGame[..., Method -> "VigerLatapy"]:

degrees = 
  IGTryUntil[potenticallyConnectedQ[#] && IGGraphicalQ[#] &]@
   RandomVariate[ZipfDistribution[1.6], 100];

IGDegreeSequenceGame[degrees, Method -> "VigerLatapy"]

(Image: plot of the sampled connected graph)

Szabolcs
  • Thank you for your solid answer. First, I don't understand your " Note that individual graphs sampled from this model do not necessarily follow the degree distribution of interest. Only averaging over many samples will give you the distribution you put in. In other words, this is equivalent to an average constraint model". Second, can you kindly present a code (and its result) which answers the first question? TIA – user64494 Mar 16 '20 at 09:30
  • @user64494 That phrasing was very bad and I changed it. As for the example, I'll come back in the evening. – Szabolcs Mar 16 '20 at 10:06
  • I'm really sorry, I'll do this tomorrow morning @user64494. Too tired ... – Szabolcs Mar 16 '20 at 21:01
  • @user64494 I showed some examples. I also found a mistake in the documentation of IGStaticPowerLawGame. It should say that the fitness vector is $f_i = i^{-\alpha}$ where $\alpha = 1/(\text{exponent} - 1)$. This will be corrected in the next release. Let me know if you have questions. – Szabolcs Mar 17 '20 at 11:55
  • Many thanks from me to you for your work! You are as good as your word and a real Mathematica expert. In order to accept your answer, can you add a plot of a graph with 30 vertexes (of course, without loops and multi-edges) which answers my question 1? TIA. – user64494 Mar 17 '20 at 14:24
  • I have never worked with IGraph/M so I requested the plot. +1 at the moment. – user64494 Mar 17 '20 at 15:31
  • Do I correctly see that the plotted graph is not connected? – user64494 Mar 17 '20 at 16:53
  • Yes, it's not connected. I missed the bit that you wanted connected graphs. For that, use the Viger-Latapy method. – Szabolcs Mar 17 '20 at 17:05
  • Accepted. How about question 2? – user64494 Mar 17 '20 at 17:15