2

I am trying to get my head around the evaluation process. One of the general ideas is that whenever an expression is replaced by another, evaluation restarts on the new expression. I can't imagine that this is the case for RandomReal. Suppose I ask

RandomReal[{0.,1.},{10^6}]

After replacing this normal expression with the DownValue of RandomReal, a very large numerical vector, would the kernel really attempt to evaluate every part of this vector just to see that it can't, and then return the expression? I looked at the attributes of RandomReal to see if there was something suggestive, but there is nothing.

So the question is: does the kernel really try to evaluate the output of RandomReal (of which I am skeptical), and if it doesn't, how can I mimic this for my own functions?
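
A minimal way to poke at this, reduced to a small size so the output is readable (just a sketch, not a definitive test):

Attributes[RandomReal]            (* just {Protected}: no Hold* attributes *)
Trace[RandomReal[{0., 1.}, {5}]]  (* the five reals never show up as separate evaluation steps *)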

Felipe
  • It generates a packed array. My bet is that this is created and checked off as "valid", and the only evaluation is to see if it's "valid". (Maybe I've forgotten the proper term for "valid", but the gist is that it's a special data structure; a quick packed-array check is sketched after these comments.) – Michael E2 Mar 16 '24 at 20:21
  • Can you please be more specific about what you really want to mimic with your functions? What problem are you encountering? Give us your code and describe the problem :) – Domen Mar 16 '24 at 20:23
  • See for instance https://mathematica.stackexchange.com/questions/198378/custom-atomic-expressions-modern-tutorial/198381#198381 – Michael E2 Mar 16 '24 at 20:23
  • @Domen there is no code, I am reading about it and this question occurred to me. The gist is: an expression is replaced by another, and then the system attempts term rewriting again on the new expression. As I said in the text, I cannot believe that this is done for the output of RandomReal; there must be something that tells the kernel not to evaluate the result. I just want to know what it is, to use it in my own functions if I see fit. – Felipe Mar 16 '24 at 20:27
  • @MichaelE2 thanks I will read it. – Felipe Mar 16 '24 at 20:28
  • TraceScan[# &, RandomReal[1, 5], _, Print@*Rule] shows you the steps of evaluation (there are only five). The expression on the left of each Rule is the expression being evaluated; the expression on the right is the value that resulted. Note that the elements of the List are not evaluated. Note also that the array computed by RandomReal is probably computed in a library function, outside the standard evaluation loop. Now that does not mean that the elements of the list were not scanned to see if they needed evaluation. With packed arrays, they probably are not (my guess). – Goofy Mar 16 '24 at 22:35
  • @Goofy [1/2] Very nice! I am not really experienced with TraceScan, but from the output it seems that my intuition was more or less correct, maybe. Thinking about the evaluation process, one would start at the head and then the arguments of the normal expression. There are no OwnValues for "RandomReal", "1", or "5", so the system would just return them. This is displayed in the first 3 lines of the output of your code. Line 4 is weird: the result of the DownValue of RandomReal evaluates to itself before being associated with RandomReal[1, 5] (line 5). – Felipe Mar 16 '24 at 23:06
  • @Goofy [2/2] Maybe this is because it is done by the external library. But the main point is that the vector goes in as a whole. If the TraceScan output is self-consistent, then in the case where the elements of the vector were checked one by one and evaluated to themselves, one would expect five lines similar to the first 3, one for each element. – Felipe Mar 16 '24 at 23:09
  • I do not understand the question. – azerbajdzan Mar 17 '24 at 12:01
  • @azerbajdzan which part of it did you not understand? – Felipe Mar 17 '24 at 15:26
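
Following up on Michael E2's packed-array remark above, here is a quick check one can run (a sketch; Developer`PackedArrayQ is the documented test for packed arrays):

Developer`PackedArrayQ[RandomReal[{0., 1.}, {10^6}]]  (* True: one compact array of machine reals *)
Developer`PackedArrayQ[ConstantArray[a, 10]]          (* False: a symbolic list is an ordinary expression *)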

2 Answers


Ignoring some special cases, the system behaves as if the entire expression is reexamined at every point in the calculation. That is, after each replacement, the entire expression is checked, and the next appropriate replacement is performed, until no more rules apply. The system tries to be smart about which parts of the expression it needs to recheck at each point, but this is mostly invisible to the user.

The thing to note for your example is that after the initial application of the downvalue of RandomReal, the expression is just a huge list of numbers: there is no RandomReal anymore. This means that, conceptually, the system will from this point on only check at each step whether there are any rules to be applied to a list of real numbers (and there aren't), so the list will stay untouched.
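
A rough illustration of this (a sketch, not part of the original answer): once the result is a plain list of machine reals, asking for it again is essentially free, because there is nothing left for any rule to rewrite.

list = RandomReal[{0., 1.}, {10^6}];
First@AbsoluteTiming[list;]   (* essentially just the cost of looking the stored value up *)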

Lukas Lang
  • I am sorry, but what do you mean by "the system will from this point on only check at each step whether there are any rules to be applied to a list of real numbers"? – Felipe Mar 16 '24 at 21:59
  • Rereading your question, I think I misunderstood it slightly... Maybe this answers it more directly: You can think of the system re-evaluating every expression after every replacement. And no, the system doesn't actually do this: As you note, it would be very inefficient to try and re-evaluate a list of numbers at every step, because obviously nothing will happen. This optimization of what to actually re-check for evaluation is done fully automatically (your functions also benefit from this), and you can't really affect this. That being said, there is Update (a sketch of where it matters follows these comments), but I never had any use for it. – Lukas Lang Mar 16 '24 at 23:05
  • Thanks. Yeah, that was the question: how does this optimization work? I think this should be documented; it is a part of the evaluation process. Mathematica has nice syntax and structure, but this lack of a clear explanation of how things work makes it very frustrating. – Felipe Mar 16 '24 at 23:44
  • 2
  • It would be interesting to understand why this lack of explanation makes Mathematica frustrating for you. The essential point for most people is that it just works as documented and as efficiently as possible. Understanding it in total detail (in particular the evaluation process) would be extremely demanding, and in most cases unnecessary. Similarly, pilots of a Boeing 747 typically do not understand in detail how the engines of the aircraft are constructed and how they work, yet they still successfully transport people around the world. – Michael Weyrauch Mar 17 '24 at 11:01
  • @MichaelWeyrauch As you said, "most people", I am just not one of those people. I am like those people that, if you want them to learn math, you put a little history in between; but what I need is a general guiding principle (looking back, I think that is why I went into physics and not engineering lol). Just learning what functions Mathematica has is, for me, like "learning" calculus by looking at a table of integrals. I loved that my calculus professor made me prove a bunch of theorems and went deep into the nasty stuff lol. – Felipe Mar 17 '24 at 13:24
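
Regarding the Update mentioned above, the standard kind of situation it exists for looks roughly like this (a sketch based on the documented use of Update; exact behavior may vary): a stored result whose condition depends on a global flag that does not appear in the expression itself.

flag = False;
f[x_] := x^2 /; flag;
vals = {f[1], f[2]}   (* {f[1], f[2]}: the condition fails *)
flag = True;
vals                  (* typically still {f[1], f[2]}: nothing about f itself has changed *)
Update[f];            (* declare that hidden changes affecting f have been made *)
vals                  (* now re-evaluates to {1, 4} *)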

I think I found out how it works in some old literature, in particular David Wagner's book, which references David Withoff's tutorial. Withoff states that the system keeps track of the time of last evaluation of normal expressions, and mentions that normal expressions are not evaluated again unless a part of them changes. I think that the creation of a normal expression itself counts as the "first evaluation". I did some tests with ConstantArray. The question was asked with RandomReal, but the point was really about unnecessary evaluation of large normal expressions, and the output of RandomReal is a packed array (I discovered that later). I wanted something whose output would be a large ordinary normal expression. The code is:

(* Time the construction of symbolic constant arrays of increasing length;
   the trailing ; keeps the huge arrays themselves out of the results. *)
lengths = Table[5*i*10^3, {i, 1, 10^5, 1000}];
times = First[Timing[ConstantArray[a, #];]] & /@ lengths;
test = Transpose[{lengths, times}];   (* {length, time} pairs *)
ListPlot[test, AxesLabel -> {"array length", "time (s)"}]

I calculated a lot of constant arrays with an undefined symbol "a" (ensuring that the result is an ordinary normal expression and that re-evaluating it would be a waste of time), and then I plotted the time to calculate them against their lengths.

(Plot: construction time against array length.)

The scaling seems linear; if the kernel were attempting to evaluate the arrays, I would expect quadratic scaling (the horizontal axis is the array length). Naturally, I am assuming that the process of creating arrays with identical components scales linearly; if someone knows this to be false, please let me know. To me this is evidence that the kernel does not try to evaluate the normal expression again once it gets it, and I think the reason is related to Withoff's statement. I am conjecturing that the creation of the normal expression itself already counts as an "evaluation".
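
One quick way to quantify this, assuming test holds the {length, time} pairs built above (a sketch, not part of the original run):

Fit[test, {1, x, x^2}, x]   (* a negligible x^2 coefficient supports the linear reading *)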

Michael E2 suggested that the reason could be related to a "Valid" flag, associated with System`Private`SetValid. It is hard to say for certain that this is not the case, because there is no documentation on it. However, I tried applying System`Private`ValidQ to RandomReal, ConstantArray, and their outputs, and it gives False, so this explanation seems unlikely to me.
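
For reference, the checks referred to above look like this (System`Private`ValidQ is undocumented, so the results are only suggestive; the tests above reported False in both cases):

System`Private`ValidQ[RandomReal[{0., 1.}, {10}]]
System`Private`ValidQ[ConstantArray[a, 10]]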

Goofy gave a nice application of TraceScan to investigate the question, and things get weird here. Applying it to a simple ConstantArray[a, 10] suggests that the kernel does attempt to re-evaluate the output of ConstantArray, because there are 10 rules of the type a -> a, apart from a List -> List. However, it is hard for me to believe that re-evaluating vectors with about 10^6 elements or more would not introduce quadratic behavior in the plot. So maybe TraceScan somehow affects what happens; I am not sure, because I am not very familiar with it.
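
The call in question follows the pattern from Goofy's comment (reproduced here as a sketch):

TraceScan[# &, ConstantArray[a, 10], _, Print@*Rule]   (* prints one Rule per evaluation step *)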

Felipe