Consider the following simplified example (on Mathematica 12):
ds = Dataset@{<|"hell" -> "o"|>, <|"hell" -> "no"|>};
Map[ds&,ds]
This produces the following errors:
During evaluation of In[2]:= MapAt::partw: Part {All,2} of <|hell->o|> does not exist.
During evaluation of In[2]:= MapAt::partw: Part {All,2} of <|hell->no|> does not exist.
During evaluation of In[2]:= MapAt::partw: Part {All,2} of <|ID-><someid>|> does not exist.
During evaluation of In[2]:= General::stop: Further output of MapAt::partw will be suppressed during this calculation.
Out[2]= $Failed
(I edited the dataset ID out of the output.)
Evaluating the second cell a second time produces the expected result (pasted as InputForm):
Dataset[{Dataset[{<|"hell" -> "o"|>, <|"hell" -> "no"|>}, TypeSystem`Vector[TypeSystem`Struct[{"hell"}, {TypeSystem`Atom[String]}], 2],
<|"ID" -> <ID1>|>], Dataset[{<|"hell" -> "o"|>, <|"hell" -> "no"|>}, TypeSystem`Vector[TypeSystem`Struct[{"hell"}, {TypeSystem`Atom[String]}],
2], <|"ID" -> <ID1>|>]}, TypeSystem`Vector[TypeSystem`AnyType, 2],
<|"Origin" -> HoldComplete[Map, testds2 & , Dataset`DatasetHandle[<ID1>]], "ID" -> <ID2>|>]
If this seems like a contrived example, my current use case is mapping a function that takes the entire dataset and a single row and produces a string.
Observations
- The error appears to manifest when the function being mapped accesses some dataset. For example, the following does not produce an error:
Map[Echo,ds]
- Using Query[All, ds&] instead of Map produces similar messages (also only on the first run), but it does produce the expected result. Moreover, the Query form and the Map form do not appear to affect one another: running Query twice does not prime the Map form, which still produces messages and $Failed on its first run.
- I thought there might be an issue with referencing the dataset while mapping over it, but the following also fails. Note that the messages in this case refer to the ds2 Dataset, i.e. the one in the function (corrected from a previous version where the opposite was claimed):
ds = Dataset@{<|"hell" -> "o"|>, <|"hell" -> "no"|>};
ds2 = Dataset@{<|"hell" -> "yes"|>, <|"hell" -> "no"|>};
Map[ds2&,ds]
- Making any change to the function makes the errors reappear (observation from a comment by @lukas-lang), even something as trivial as (1; ds)&. Once a function has been cached, mapping it over other datasets works fine.
- If the dataset does not appear directly in the function argument, the problem is avoided (from a comment by @lukas-lang).
Workarounds
- If the mapped-over dataset is wrapped with Normal, the map works as expected.
- Making the dataset not appear lexically in the function expression, e.g. by using DownValues: temp[] := ds; Map[temp[]&, ds] works (from a comment by @lukas-lang). Edit: Correction: a simple SetDelayed won't work, only DownValues.
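For reference, the two workarounds can be sketched as follows. This is only a minimal sketch: the names viaNormal and viaSymbol are hypothetical, myfunc follows the definition suggested in the comments, and I have not confirmed that the second form is message-free on every version.

```wolfram
(* Same example dataset as above *)
ds = Dataset@{<|"hell" -> "o"|>, <|"hell" -> "no"|>};

(* Workaround 1: map over the plain list of associations instead of
   the Dataset. Note the result is an ordinary List, not a Dataset.
   viaNormal is a hypothetical name used only for illustration. *)
viaNormal = Map[ds &, Normal[ds]];

(* Workaround 2: hide the dataset behind a symbol with a pattern-based
   definition (a down-value), so ds does not appear lexically in the
   mapped function. Reported to work in the comments. *)
myfunc[x_] := ds;
viaSymbol = Map[myfunc, ds];
```

The first form sidesteps Dataset's own Map handling entirely, at the cost of losing the Dataset wrapper on the result; the second keeps the result a Dataset.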
Prior art
I found an SO question that seems to describe a related issue (in particular, it produces similar messages and shows similar nondeterminism). However, I think that question was posed in a manner that obscures the true issue, and the proposed solution does not appear applicable here.

Dataset - every time you change the function in any way, the error appears for one evaluation, indicating that the result is being cached. Another workaround is to hide the structure of the function by moving the definition into a symbol, e.g. myfunc[x_]:=ds followed by Map[myfunc,ds] works without issues (the reason is that the type deduction system does not peek into the down-values of symbols) – Lukas Lang Oct 25 '19 at 21:11
Dataset Head as well as other heads accepted by Query. Although I am beginning to be somewhat skeptical as to whether I should continue using Dataset in the first place... – Shwouchk Oct 25 '19 at 21:28