By "low-complexity list" I mean a list like the following:
lcl = {17, 17, 17, 17, 17,
21, 21, 21, 21, 21, 21, 21, 21,
11, 11, 11, 11, 11, 11, 11}
It contains 20 elements, but only 3 distinct values, and furthermore, these values occur in homogeneous blocks.
More formally, one can represent any list¹ with a "run-length encoding". For example:
rle[list_] := Transpose[{First[#], Length[#]} & /@ Split[list]]
...or, if you prefer, this slightly fancier version:
rle[list_, equal_:SameQ, representative_:First] :=
Transpose[{representative[#], Length[#]} & /@ Split[list, equal]]
For the list lcl shown earlier, rle[lcl] is
{{17, 21, 11}, {5, 8, 7}}
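To illustrate what the optional arguments of the fancier version are for, here is a small sketch with a tolerance-based equality test. The sample data and the 0.1 tolerance are made up for illustration. Note that Split only compares adjacent elements, so a tolerance test like this one can "chain" along a slowly drifting run.

```mathematica
(* Same definition as the "fancier" rle above, repeated so this block runs on its own. *)
rle[list_, equal_ : SameQ, representative_ : First] :=
  Transpose[{representative[#], Length[#]} & /@ Split[list, equal]]

(* Hypothetical sample data: two runs of nearly equal values. *)
noisy = {1, 1.05, 3, 3.02};

(* Adjacent elements closer than 0.1 are treated as equal;
   each run is represented by its first element (the default). *)
rle[noisy, Abs[#1 - #2] < 0.1 &]
(* {{1, 3}, {2, 2}} *)

(* Same runs, but each run is represented by its mean instead. *)
rle[noisy, Abs[#1 - #2] < 0.1 &, Mean]
```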
For the sake of this discussion, let's define the "compressibility" of a list, list, as the ratio
Length[list]/Length[Flatten[rle[list]]]
Low-complexity lists are therefore those for which this compressibility measure is substantially greater than 1.
The compressibility of the list lcl shown above is (only) 20/6 ≈ 3.33. In contrast, I need to work with lists whose compressibilities lie in the range $10^4$ to $10^5$.
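As a concrete sketch of this measure, the ratio can be wrapped in a function. The name compressibility and the larger test list big are assumptions made for this example only; big simply scales the run lengths of lcl by 10^4.

```mathematica
(* rle as defined above, repeated so this block runs on its own. *)
rle[list_] := Transpose[{First[#], Length[#]} & /@ Split[list]]

(* Hypothetical helper: ratio of list length to the size of its run-length encoding. *)
compressibility[list_] := Length[list]/Length[Flatten[rle[list]]]

lcl = {17, 17, 17, 17, 17,
   21, 21, 21, 21, 21, 21, 21, 21,
   11, 11, 11, 11, 11, 11, 11};

compressibility[lcl]
(* 10/3 *)

(* Made-up list with the same three runs, each 10^4 times longer. *)
big = Join[ConstantArray[17, 50000], ConstantArray[21, 80000],
   ConstantArray[11, 70000]];

compressibility[big]
(* 100000/3, i.e. roughly 3.3*10^4 *)
```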
Q: Does Mathematica provide any support (data structure, functions) for representing such "low-complexity lists" more compactly, while still being able to use them as lists?
If x is a numeric low-complexity list, as here defined, Differences[x] is sparse (even though x isn't). For example, applying Differences to the example above yields
{0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, -10, 0, 0, 0, 0, 0, 0}
That said, I can't think of a way to take advantage of Mathematica's excellent SparseArray in this context.
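For what it's worth, here is a rough sketch of how the sparse differences could in principle be used to recover single elements of a numeric list. It is only an illustration of the observation, not a claim that this is a good design: elementAt is a hypothetical helper, and the lookup amounts to the same bookkeeping of change positions as a run-length encoding.

```mathematica
lcl = {17, 17, 17, 17, 17,
   21, 21, 21, 21, 21, 21, 21, 21,
   11, 11, 11, 11, 11, 11, 11};

(* Store the differences sparsely; only the two jumps are nonzero. *)
d = SparseArray[Differences[lcl]];

(* ArrayRules lists the nonzero entries followed by the default rule, which Most drops. *)
rules = Most[ArrayRules[d]];
pos = rules[[All, 1, 1]];   (* positions of the jumps: {5, 13} *)
jumps = rules[[All, 2]];    (* sizes of the jumps: {4, -10} *)

(* Hypothetical helper: element i is the first element plus every jump
   located at a position k <= i - 1. Only meaningful for numeric lists. *)
elementAt[i_Integer] :=
  First[lcl] + Total[Pick[jumps, UnitStep[i - 1 - pos], 1]]

elementAt /@ Range[Length[lcl]] === lcl
(* True *)
```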
EDIT:
Just to be clear, I'm asking whether in the vast Mathematica universe there already exists support for this sort of beast. I just don't want to reinvent wheels, especially for something like this, which would be far better implemented at a lower level than "user-space".
By far, the operation I'm most interested in is indexing (i.e. Part), especially complex forms of indexing, such as indexing with a list of indices, or with ranges (e.g. 1000;;1999).
For simple indexing, I could base a home-grown solution on this sort of thing
FirstPosition[..., i_ /; # < i] &
...where ... is a placeholder for an internal data structure (part of the low-complexity list representation) that holds the positions of the elements that are different from their predecessors.
I'm also interested in Map and friends (Table, Scan, etc.), but much less.
The very beginning of the implementation I alluded to before could be as follows (but providing full support for Part alone would require considerably more code, I think).
lcList[l_List] := Module[
{
values
, counts
, pivots
},
(* the rle function is defined earlier in this post *)
{values, counts} = rle[l]
; pivots = 1 + Accumulate @ counts
; lcList[<|"values" -> values,
"pivots" -> pivots,
"ceiling" -> Last @ pivots|>]
]
lcList /: Length[lcList[a_Association]] := a["ceiling"] - 1
lcList /: Part[lcList[a_Association], i_Integer] :=
a["values"][[First@FirstPosition[a["pivots"], j_ /; i < j]]] /;
0 < i < a["ceiling"]
Then, with the list lcl shown at the top of this post,
rleLcl = lcList[lcl]
rleLcl[[#]] & /@ Range[Length[rleLcl]]
{17, 17, 17, 17, 17, 21, 21, 21, 21, 21, 21, 21, 21, 11, 11, 11, 11, 11, 11, 11}
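Along the same lines, here is a minimal sketch of how Part might be extended to lists of indices and to simple spans. The helper runIndex is an assumption introduced for this example. Negative indices, All, steps in spans, and multi-level parts are not handled, and the span case simply expands the range, so it is not efficient for very long spans.

```mathematica
(* Definitions repeated from above so this block runs on its own. *)
rle[list_] := Transpose[{First[#], Length[#]} & /@ Split[list]]

lcList[l_List] := Module[{values, counts, pivots},
  {values, counts} = rle[l];
  pivots = 1 + Accumulate[counts];
  lcList[<|"values" -> values, "pivots" -> pivots,
    "ceiling" -> Last[pivots]|>]]

(* Hypothetical helper: position i falls in run number
   1 + (number of pivots that are <= i). *)
runIndex[a_Association, i_Integer] :=
  1 + Total[UnitStep[i - a["pivots"]]]

(* Part with a list of positive, in-range indices. *)
lcList /: Part[lcList[a_Association], idx : {__Integer}] :=
  a["values"][[runIndex[a, #] & /@ idx]] /;
   AllTrue[idx, 0 < # < a["ceiling"] &]

(* Part with a span m ;; n, delegated to the index-list case. *)
lcList /: Part[lcList[a_Association], Span[m_Integer, n_Integer]] :=
  Part[lcList[a], Range[m, n]] /; 0 < m <= n < a["ceiling"]

lcl = {17, 17, 17, 17, 17,
   21, 21, 21, 21, 21, 21, 21, 21,
   11, 11, 11, 11, 11, 11, 11};
rleLcl = lcList[lcl];

rleLcl[[{1, 6, 20}]]
(* {17, 21, 11} *)

rleLcl[[4 ;; 7]]
(* {17, 17, 21, 21} *)
```

Each lookup here scans all pivots, so for lists with many runs a binary search over the pivots would be the natural next step.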
¹ Well, not any list, but rather, any list of elements such that, for any two of them, a and b, we have an adequate equality predicate equal[a, b].
lcList seems like a good start. Make sure your arrays are, and stay, packed. Searching for appropriate pivots can be compiled. If you need to access parts of the same list many times, then maybe storing pivots as some kind of search tree would be justifiable. – jkuczm May 17 '17 at 19:38
{First[#], Length[#]} & /@ Split[lcl]. – kjo May 24 '17 at 12:23