Is it possible to have an expression x such that its elements are only evaluated when they are individually accessed, for example when evaluating statements like x[[1]] or f/@x? The basic idea is that x is a list of Get commands, each of which loads a large expression (gigabytes), so that x can be terabytes in size yet it remains fairly manipulable within Mathematica.

What I thought would work was something like:

x={Unevaluated[Get[...]],Unevaluated[Get[...]],...}

The problem with that, however, is that Unevaluated doesn't get stripped when I need it to be, i.e. when calling something like f[x[[1]]], because of Mathematica's evaluator semantics. The only workable alternative is to use Hold instead of Unevaluated, but that requires me to call ReleaseHold manually every time, which is ugly. I was hoping for something entirely transparent.
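For reference, the Hold/ReleaseHold approach mentioned above can be sketched as follows. This is only a minimal illustration: `load` and `loadCount` are hypothetical stand-ins for the real Get[...] calls, and `lazyPart` and `lazyMap` are hypothetical helper names that simply hide the ReleaseHold call.

```mathematica
(* load and loadCount are hypothetical stand-ins for the real Get[...] calls *)
loadCount = 0;
load[i_Integer] := (loadCount++; Range[i]);

(* Hold keeps every element unevaluated, so building x loads nothing *)
x = {Hold[load[1]], Hold[load[2]], Hold[load[3]]};

(* hypothetical helpers that hide the ReleaseHold call *)
lazyPart[list_List, i_Integer] := ReleaseHold[list[[i]]];
lazyMap[f_, list_List] := Map[f[ReleaseHold[#]] &, list];

second = lazyPart[x, 2];       (* evaluates only load[2] *)
countAfterPart = loadCount;    (* one load so far *)

lengths = lazyMap[Length, x];  (* evaluates each element as it is visited *)
```

Nothing is cached here, so every access loads the element again; that may or may not be acceptable for gigabyte-sized files.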

Mohammed AlQuraishi
  • The answer is yes. Basically, something similar is done in this answer. It can be combined with this great answer on lazy lists, to make the lazy nature of data loading more explicit, but this may not be necessary. To rephrase, the first linked answer currently lacks a generic stream interface and a centralized memory manager for streams - both can be written rather easily on top of it though. – Leonid Shifrin Jul 02 '13 at 22:17
  • The other limitation of the first linked answer is that it only deals with specific expressions - lists with large sub-lists. Again, this can be generalized to arbitrary expressions and their parts, and I actually plan to do so, but this has not been done yet. – Leonid Shifrin Jul 02 '13 at 22:19
  • Not sure whether to consider this one a duplicate of the one on file-backed lists, let's see what others think. – Leonid Shifrin Jul 02 '13 at 22:21
  • Thanks @LeonidShifrin for the pointers. Am I wrong in concluding that the solutions you mentioned require that I overload Part and related functions, and thus are not general in the sense that I have to know a priori which functions will operate on the expression? If that is the case then it doesn't help me, as the purpose of my question was to have a solution that would work transparently. I may have misunderstood your solutions however. – Mohammed AlQuraishi Jul 02 '13 at 22:42
  • Yes, you will have to know the functions. Or, support all core functions in Mathematica, which is doable but a lot of work. Basically, there is no magic spell here - for example, in order to fully support packed arrays when they were introduced, a huge amount of work was required to overload all important functions in Mathematica on this (then new) data structure. A lazy stream is no different - since it is not yet offered as a part of the language, one would need to define the functions one wants to work on it oneself. In Java, one would have to implement an interface, etc. – Leonid Shifrin Jul 02 '13 at 22:47
  • What I would do would be to try supporting the core functions (there will likely be fewer of them than one might think), and see whether or not that would cover most of the use cases of interest to you. You can always have a fall-back case where the lazy stream is converted to a normal list (expression) for functions which have not been overloaded explicitly. Note however that many functions will also need a different algorithm when overloaded on lazy streams, to take advantage of them (e.g. Sort, etc). Making lazy streams an integral and transparent part of the language is no small task. – Leonid Shifrin Jul 02 '13 at 22:51
  • As a simple alternative, you can use Hold or HoldComplete as your container, instead of List: x = Hold[Get[...],...,Get[...]]. Perhaps this is as transparent as it gets, but I am not really sure what this buys you that cannot be totally covered by overloading functions such as Part. – Leonid Shifrin Jul 02 '13 at 22:55
  • Thanks for the suggestions. For now I will just use the Hold/ReleaseHold combo, as it does the job with the least amount of work/disruption. What I was hoping for is something like Unevaluated, except that it gets stripped whenever it is an argument to a symbol with DownValues, regardless of what step in the evaluation it is. I think that would have satisfied my need. – Mohammed AlQuraishi Jul 02 '13 at 23:41
  • No problem. Whatever works for you. – Leonid Shifrin Jul 03 '13 at 10:44
  • @Mohammed Would you please give a precise example of your use of Hold/ReleaseHold? – Mr.Wizard Jul 03 '13 at 20:37
  • Sure. I use x={Hold[Get[...]],Hold[Get[...]],...} and then when mapping a function or extracting a part I just make sure to wrap it with ReleaseHold. Nothing fancy.

    Coincidentally, I noticed that I can almost solve the problem using Wrapper /: h_[a___,Wrapper[g_],b___] /; h=!=List := h[a,g,b] which has slightly nicer stripping behavior than Unevaluated. Alas, it doesn't really work: it works when, say, mapping a function, but not when extracting a part, because Part takes the whole list as an argument and so doesn't strip Wrapper.

    – Mohammed AlQuraishi Jul 04 '13 at 01:30
  • Your Wrapper is something I was thinking of myself. When you say it fails with Part do you mean when extracting a sub-part of the Wrapped object? – Mr.Wizard Jul 04 '13 at 07:03
  • Yes that's right. The behavior I'm looking for is that Wrapper gets stripped (and thus whatever is inside it gets evaluated, presumably a Get expression) whenever it is extracted as a single element. Having said that, for all the use cases I can think of, it actually does what I want. For example f[{a, Wrapper[b], c}[[2]]] returns f[b], and x = {a, Wrapper[b], c}[[2]] assigns b to x, so maybe I'm being picky. The only time it doesn't work is with {a, Wrapper[b], c}[[2]], which returns Wrapper[b] instead of just b. – Mohammed AlQuraishi Jul 04 '13 at 17:35
  • (Wrapper[x_] /; StackInhibit[Quiet@Stack[][[-2]] =!= List] := x; _Wrapper /; Update[] := Null UpValue by DownValue) – Rojo Jul 05 '13 at 03:21
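The Wrapper idea discussed in the comments can be tried out with the sketch below. The UpValue is the one quoted in the thread; `load`, `data`, `viaFunction`, and `y` are hypothetical names used only for illustration, and the HoldAll attribute is an added assumption (the thread does not state it) so that the wrapped expression is not evaluated when the list is built.

```mathematica
ClearAll[Wrapper, load, f, p, q];
load[i_Integer] := Range[i];   (* hypothetical stand-in for Get[...] *)

(* assumption, not in the thread: hold the argument so nothing loads on construction *)
SetAttributes[Wrapper, HoldAll];

(* the UpValue quoted in the thread: strip Wrapper under any head except List *)
Wrapper /: h_[a___, Wrapper[g_], b___] /; h =!= List := h[a, g, b];

data = {p, Wrapper[load[2]], q};

viaFunction = f[data[[2]]];  (* Wrapper becomes an argument of f, so it is stripped *)
y = data[[2]];               (* Wrapper becomes an argument of Set, so it is stripped *)

data[[2]]                    (* bare Part: Wrapper stays, the case noted in the thread *)
```

Because the rule fires for every head other than List, it also fires inside heads such as Hold or Head, so this should be treated as an experiment rather than a robust lazy container.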
