14

I am looking for a way to shuffle the elements of an expl3 sequence variable. The following code which initialises and outputs such a sequence variable could serve as a starting point:

\documentclass{article}
\usepackage{expl3}

\ExplSyntaxOn

%initialize ordered seq
\seq_gset_from_clist:Nn\g_my_seq{0,1,2,3,4,5,6,7,8,9}
%shuffle \g_my_seq
% ? ? ?

\begin{document}
\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}
\end{document}
AlexG
  • Probably we need to add support for this using an internal call to the random number generator (so will be an issue in XeTeX). – Joseph Wright Apr 09 '18 at 11:42
  • here I've provided an answer which shuffles a sequence (I'm too lazy right now to extract that code...) – Skillmon Apr 09 '18 at 12:23
  • See https://tex.stackexchange.com/a/224559/4427 – egreg Apr 09 '18 at 12:38
  • Thanks, @egreg. +1 for your answer. I should have searched for "shuffle [expl3]" alone, but I added "sequence". – AlexG Apr 09 '18 at 12:53
  • For those who have not yet noticed, \seq_[g]shuffle:N is part of expl3 as of version 2018-04-29 (available in TeXLive-2018). And it is incredibly fast. Thank you @Bruno! – AlexG May 02 '18 at 08:00

5 Answers

10

I'd use Lua. It's much more readable.

\documentclass{article}
\usepackage{expl3}
\usepackage{luacode}

\begin{luacode*}
function shuffle(list)
   for i = #list,2,-1 do
      local j = math.random(i)
      list[i], list[j] = list[j], list[i]
   end
   tex.sprint(table.concat(list,","))
end
\end{luacode*}

\ExplSyntaxOn

\cs_generate_variant:Nn \seq_gset_from_clist:Nn { Nf }

\cs_new_protected:Npn \seq_shuffle_inplace:N #1
{
  \seq_gset_from_clist:Nf #1 { \lua_now_x:n { shuffle({ [[ \seq_use:Nn #1 { ]] , [[ } ]] }) } }
}

\seq_gset_from_clist:Nf \g_my_seq { 0,1,2,3,4,5,6,7,8,9 }

\seq_shuffle_inplace:N \g_my_seq

\begin{document}

\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}

\end{document}
Henri Menke
  • It should be \seq_gset_from_clist:Nf rather than :Nn, right? – Manuel Apr 10 '18 at 07:37
  • @Manuel Indeed. Otherwise the resulting seq will have the shuffled list in a single item. – Henri Menke Apr 10 '18 at 10:04
  • Nice! This is the Durstenfeld/Knuth version of FY with in-place shuffling. – AlexG Apr 10 '18 at 10:25
  • @HenriMenke : I am not a Lua expert, how can we define a command \seq_shuffle:N that uses the Lua code internally and which operates on a sequence variable? – AlexG Apr 11 '18 at 08:12
  • @AlexG See updated answer. This version should now also work with non-numerical data. – Henri Menke Apr 11 '18 at 08:34
  • By far the fastest solution presented here: user 0m0.645s for shuffling the sequence 0, 1, ..., 999 on my PC. – AlexG Apr 11 '18 at 08:49
  • @AlexG Now try again with LuaJITTeX ;-) – Henri Menke Apr 11 '18 at 09:10
  • luajitlatex ? /bin/bash: luajitlatex: command not found – AlexG Apr 11 '18 at 09:19
  • @AlexG It's not available by default. You have to edit texmf-dist/web2c/fmtutil.cnf and uncomment the line starting with luajitlatex. Then, build the format using fmtutil-sys -byfmt=luajitlatex and then you can use luajittex -fmt=luajitlatex whatever.tex. – Henri Menke Apr 11 '18 at 09:21
  • @AlexG But don't bother. The Lua end is so fast in this case that you do not see any speedup by switching to JIT, unlike here https://tex.stackexchange.com/a/372669 – Henri Menke Apr 11 '18 at 09:25
  • If I replace the 1 by \error I get an error in expansion of \lua_now_x:n. Is there a way to apply this method with items not compatible with expansion during the shuffling? –  Apr 23 '18 at 08:40
  • @jfbu It's hard (if not even impossible) to pass unexpandable material down to Lua and back to TeX. – Henri Menke Apr 23 '18 at 10:27
  • @HenriMenke By using \robustify\error? – Manuel Apr 23 '18 at 12:30
  • I agree Lua is more readable there, but it seems it ends up being slower in this case; it is probably too expensive to go from TeX to Lua and back for such a (somewhat) simple task. – Bruno Le Floch Apr 30 '18 at 04:10
9

I literally just added \seq_shuffle:N and \seq_gshuffle:N to expl3 (development version: https://github.com/latex3/latex3/). From what I can tell it's about 4 times faster than other solutions here except perhaps the LuaTeX one.

\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\seq_gset_from_clist:Nn\g_my_seq{0,1,2,3,4,5,6,7,8,9}
\seq_gshuffle:N \g_my_seq
\begin{document}
\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}
\end{document}
  • Just to note that this will be in the next expl3 release, likely tomorrow or Tuesday – Joseph Wright Apr 29 '18 at 21:38
  • Just out of curiosity, did using \pdftex_uniformdeviate:D bring a real speed improvement? –  Apr 29 '18 at 21:53
  • @jfbu IIRC using \pdftex_uniformdeviate:D brought a 10x improvement, but that's because our \int_rand:nn uses the overly general \int_div_truncate:nn under the hood rather than reimplementing the wheel. – Bruno Le Floch Apr 30 '18 at 04:09
  • wow, 10x. Indeed by using it too I obtained 2.5x--3x improvement in the code which was at top of my answer before you announced your own unbeatable way. And for my adaptation of your method to macros, not toks, the impact is 4x. I thus believe your code is about 2.5x or 2x faster than the one at top of my answer but I have not tested the exact thing for lack of the \seq_gset_from_inline_x:Nnn. –  Apr 30 '18 at 09:15
8

Wooden version. No optimization. No “expert” writing this code.

\seq_new:N \l_alexg_origin_seq
\seq_new:N \l_alexg_destiny_seq
\int_new:N \l_alexg_random_int
\int_new:N \l_alexg_current_int
\cs_new_protected:Npn \seq_shuffle:N #1
 {
  \seq_set_eq:NN \l_alexg_origin_seq #1
  \seq_clear:N \l_alexg_destiny_seq
  \prg_replicate:nn { \seq_count:N #1 }
   {
    \int_set:Nn \l_alexg_random_int { \int_rand:nn { 1 } { \seq_count:N \l_alexg_origin_seq } }
    \int_zero:N \l_alexg_current_int
    \seq_clear:N \l_tmpa_seq
    \seq_map_inline:Nn \l_alexg_origin_seq
     {
      \int_incr:N \l_alexg_current_int
      \int_compare:nNnTF { \l_alexg_current_int } = { \l_alexg_random_int }
       { \seq_put_right:Nn \l_alexg_destiny_seq { ##1 } }
       { \seq_put_right:Nn \l_tmpa_seq { ##1 } }
     }
    \seq_set_eq:NN \l_alexg_origin_seq \l_tmpa_seq
   }
  \seq_set_eq:NN #1 \l_alexg_destiny_seq
 }
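
As an illustration only, the wooden version above amounts to repeatedly drawing a random item from what is left of the original sequence and appending it to the result. Here is a minimal Python sketch of that idea; the function name `shuffle_by_extraction` and the use of Python's `random` module in place of \int_rand:nn are assumptions made for this sketch, not part of the expl3 code.

```python
import random


def shuffle_by_extraction(items, rng=None):
    """Return a shuffled copy of `items` by repeated random extraction.

    Hypothetical helper mirroring the "origin"/"destiny" sequences:
    one random item is moved from origin to destiny until origin is empty.
    """
    rng = rng or random.Random()
    origin = list(items)   # working copy, the caller's list is untouched
    destiny = []
    while origin:
        k = rng.randrange(len(origin))   # position of the item to extract
        destiny.append(origin.pop(k))    # move it to the end of the result
    return destiny
```

Each extraction walks or shifts the remaining items, so the cost grows quadratically with the length, which is consistent with this version being the slow one.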

Complete expl3ification of @jfbu's answer.

\cs_new_protected:Npn \seq_shuffle_inplace:N #1
 {
  \int_zero:N \l_tmpa_int
  \seq_map_inline:Nn #1
   { 
    \int_incr:N \l_tmpa_int 
    \tl_set:cn { l_jfbu_shuffle_ \int_use:N \l_tmpa_int _tl } { ##1 } 
   }
  \int_step_inline:nnnn { \l_tmpa_int } { -1 } { 2 } 
   {
    \int_set:Nn \l_tmpb_int { \int_rand:nn { 1 } { ##1 } }
    \tl_set_eq:Nc \l_tmpa_tl { l_jfbu_shuffle_##1_tl }
    \tl_set_eq:cc { l_jfbu_shuffle_##1_tl } { l_jfbu_shuffle_ \int_use:N \l_tmpb_int _tl }
    \tl_set_eq:cN { l_jfbu_shuffle_ \int_use:N \l_tmpb_int _tl } \l_tmpa_tl
   }
 \tl_set:Nx #1 % more manual approach, ideally using \seq_set_from_clist:Nx
  {
   \s__seq
   \int_step_function:nnnN { 1 } { 1 } { \l_tmpa_int } \__jfbu_seq_construct:n
  }
 }
\cs_new:Npn \__jfbu_seq_construct:n #1
 { \exp_not:N \__seq_item:n { \exp_not:v { l_jfbu_shuffle_#1_tl } } }
Manuel
  • Wow, this works beautifully! – AlexG Apr 09 '18 at 12:31
  • @AlexG Even if you use this as a temporary solution, I would wait for a version written by Joseph or other official answers. – Manuel Apr 09 '18 at 12:39
  • Is it Fisher-Yates algorithm? (Out of curiosity that I am asking.) – AlexG Apr 10 '18 at 09:57
  • @AlexG No idea. This is handmade. Basically: pick random item, put first in the final seq and remove from the original seq (so it has -1 item than original), repeat until every item has been moved to the final seq. It could be optimized with a \seq_map_break:n. – Manuel Apr 10 '18 at 10:00
  • @AlexG Basically, sample steps with (orginal) (final) format: (1,2,3,4) (), random item 3, (1,2,4) (3), random item 3, (1,2) (3,4), random item 1, (2) (3,4,2), random item 1 (not random since there's just one left), () (3,4,2,1). – Manuel Apr 10 '18 at 10:05
  • Yes, yes, I understood. Seems to be the classic FY algorithm, as far as I can tell from reading the Wikipedia article. – AlexG Apr 10 '18 at 10:07
  • It is notable that the execution time of the shuffling itself is sensitive to the length of the control sequence names used: with l_jfbu_shuffle_##1_tl I currently get about 2.4s for 50000 items (with the code as in your expl3ification of my answer), compared with 1.85s when using l_jf_##1 in the very same code. –  Apr 29 '18 at 08:14
  • Checking a bit more interface3 it seems all things seq related are "short" (for example \seq_set_from_clist:Nn \foo {a, b , c} creates a short macro, \seq_put_left:Nn uses \tl_set:Nx which creates a short macro). So it should be \cs_set_nopar:Npx at end of expl3ification. –  Apr 29 '18 at 09:07
  • @jfbu I know, I just didn't want to edit more hehe but I add it now. – Manuel Apr 29 '18 at 15:52
  • @jfbu And that's amazing that the length matters that much! – Manuel Apr 29 '18 at 16:06
7

Starting with 12 items, it is impossible to generate all permutations from a user chosen seed because there are technically 2**28 seeds for the MetaPost-PDFTeX RNG and 2**28 < 12!.

ATTENTION I am only saying it is impossible to get all N! permutations when invoking the \seq_shuffle:N immediately after having set the random seed.
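
The counting argument can be checked with a few lines of Python. This is only a sketch of the arithmetic: the name `smallest_length_exceeding_seeds` is made up here, and the figure of 2**28 seeds is taken from the text above.

```python
import math

SEED_COUNT = 2 ** 28  # number of distinct seeds, as stated above


def smallest_length_exceeding_seeds(seed_count=SEED_COUNT):
    """Smallest n for which n! exceeds the number of seeds.

    From that length on, a shuffle performed right after seeding
    cannot reach every permutation (pigeonhole argument).
    """
    n = 1
    while math.factorial(n) <= seed_count:
        n += 1
    return n
```

With 2**28 = 268435456, 11! = 39916800 and 12! = 479001600, the function returns 12.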

It is known that the "lagged Fibonacci sequences" used by the RNG each explore large spaces: the parity bit itself has a period 2**55-1 (or perhaps 55(2**55-1)). The RNG does not consist of iterating a function F on a set of size 2**28. It uses arrays of 55 integers.

Parenthetical remark: the k low bits of generated random integers from using \pdfuniformdeviate 268435456 (2**28) depend only on the k low bits of the seed; this does not mean that each sequence itself is not random enough, but if you reduce modulo 16 you get only 16 possible distinct sequences, those corresponding to seeds 0, ..., 15.
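
To see why low bits only ever depend on low bits, here is a hedged Python sketch of a generic subtractive lagged Fibonacci recurrence with lags 55 and 24 modulo 2**28. It is an assumption that this models the recurrence well enough for the purpose; the way pdfTeX fills its 55 integers from a seed is not reproduced, and `lagged_fibonacci` with its `state` argument is a hypothetical stand-in.

```python
from collections import deque

MODULUS = 2 ** 28  # values live in 0 .. 2**28 - 1


def lagged_fibonacci(state, count):
    """Return `count` values of x[n] = (x[n-55] - x[n-24]) mod 2**28.

    `state` is a hypothetical list of 55 starting values; how pdfTeX
    derives its own 55 values from a seed is not reproduced here.
    """
    if len(state) != 55:
        raise ValueError("the state must hold exactly 55 integers")
    # window[0] is x[n-55] and window[-24] is x[n-24]
    window = deque(state, maxlen=55)
    values = []
    for _ in range(count):
        x = (window[0] - window[-24]) % MODULUS
        window.append(x)  # maxlen drops the oldest entry automatically
        values.append(x)
    return values
```

Since subtraction modulo 2**28 never carries information from high bits down to low bits, two states that agree in their k low bits produce outputs that agree in their k low bits, whatever the high bits are.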

In practice though, \pdfuniformdeviate N computes something like round(N * random / 2**28) (and maps N to 0), so it depends mainly on the high bits, not the low bits. Anyway I just wanted to demonstrate how savant I have become now. (And I could aggravate my case even more by mentioning things about counting odd vs even in batches of 165 random integers for N=2**28...)
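
Read literally, the approximation in the previous paragraph can be written down in integer arithmetic. The following Python sketch is only that reading, not pdfTeX's actual source; `approx_uniform_deviate` and its `raw` parameter (a raw random integer in 0 .. 2**28 - 1) are names assumed for the illustration.

```python
FRACTION_ONE = 2 ** 28  # raw random integers are assumed to lie in 0 .. 2**28 - 1


def approx_uniform_deviate(n, raw):
    """Scale a raw random integer to the range 0 .. n - 1.

    Computes round(n * raw / 2**28) with ties rounded up, then maps
    the value n to 0 as described in the text.
    """
    scaled = (n * raw + FRACTION_ONE // 2) // FRACTION_ONE
    return 0 if scaled == n else scaled
```

Because the result is essentially the leading part of n * raw, changing only the lowest bits of `raw` rarely changes the output.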

For fun I wrote findseed.tex, which can find a seed for \pdfuniformdeviate that generates a given permutation of N items. I used it to check that the identity permutation of 12 or 13 items is never generated (by the algorithm used in BLF's answer).

It took 12 minutes to explore the 2**28-sized space of seeds... The seed is sought from 2**28 downwards for reasons of optimization of the code for the cases when it goes through all 2**28 possible seeds...

For convenience the file is configured here to seek a seed which gives the identity permutation of 11 items. It finds 249252612 (this took about 50s on my laptop). I also looked for a seed from the bottom up and found 22635787.

\newcount\maxseedplusone

\maxseedplusone 268435456 % 2**28

\def\x #1#2\char{%
  \if-#1\doesnotexist\fi
  \pdfsetrandomseed #1#2
  \ifnum\pdfuniformdeviate 1=0 % will always be true
  \ifnum\pdfuniformdeviate 2=1 % 1 
  \ifnum\pdfuniformdeviate 3=2 % 2
  \ifnum\pdfuniformdeviate 4=3 % 3
  \ifnum\pdfuniformdeviate 5=4 % 4
  \ifnum\pdfuniformdeviate 6=5 % 5
  \ifnum\pdfuniformdeviate 7=6 % 6
  \ifnum\pdfuniformdeviate 8=7 % 7
  \ifnum\pdfuniformdeviate 9=8 % 8
  \ifnum\pdfuniformdeviate 10=9 % 9
  \ifnum\pdfuniformdeviate 11=10 % 10
  %\ifnum\pdfuniformdeviate 12=11 % 11
  %\ifnum\pdfuniformdeviate 13=12
  % \ifnum\pdfuniformdeviate 14=13
  % \ifnum\pdfuniformdeviate 15=14
  \gotit
  \fi\fi\fi\fi\fi
  \fi\fi\fi\fi\fi
  \fi%\fi\fi%\fi\fi
  \expandafter\x\the\numexpr#1#2-1\char
}%

\def\doesnotexist#1\char
   {\fi\immediate\write128{COMPATIBLE SEED DOES NOT EXIST!}}

\def\gotit#1\expandafter\x\the\numexpr#2-1\char
   {#1\immediate\write128{COMPATIBLE SEED IS #2}}

% no harm to start at 2**28 exactly
\expandafter\x\the\numexpr\maxseedplusone\char
\bye

% For identity permutation

% 5: COMPATIBLE SEED IS 268435437
% 9: COMPATIBLE SEED IS 268130861
% 11: COMPATIBLE SEED IS 249252612 (also 22635787, smallest one)
% 12: COMPATIBLE SEED DOES NOT EXIST!
% 13: COMPATIBLE SEED DOES NOT EXIST!
% 13!/2**28 = 23.19746...

% For permutation [9, 3, 7, 2, 4, 8, 5, 6, 1]
% which is obtained from transpositions: [0, 1, 1, 2, 2, 5, 2, 5, 0]
% COMPATIBLE SEED IS 268264686
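
The search strategy of findseed.tex can be imitated in Python. This is a sketch under loud assumptions: Python's `random.Random` stands in for the pdfTeX RNG, so the seeds it finds have nothing to do with the ones listed above, and `draws_give_identity` and `find_seed` are hypothetical names.

```python
import random


def draws_give_identity(seed, n):
    """True if the n successive draws under `seed` leave n items in place.

    As in the \\ifnum tests of findseed.tex, the k-th draw (taken from
    0 .. k - 1) must equal k - 1 for every k from 1 to n.
    """
    rng = random.Random(seed)  # stand-in RNG, not the pdfTeX one
    return all(rng.randrange(k) == k - 1 for k in range(1, n + 1))


def find_seed(n, max_seed):
    """Search seeds max_seed - 1, ..., 0 and return the first match, else None."""
    for seed in range(max_seed - 1, -1, -1):
        if draws_give_identity(seed, n):
            return seed
    return None
```

For example, `find_seed(4, 5000)` scans at most 5000 seeds for one that yields the identity permutation of 4 items; a return value of `None` corresponds to the "COMPATIBLE SEED DOES NOT EXIST!" message.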

Confirmation:

\documentclass{article}
\usepackage{expl3}

\ExplSyntaxOn

% a version of BLF's code without toks

\seq_new:N\g__internal_seq

\cs_new_protected:Npn\seq_shuffle_a_la_blf_without_toks:N #1
{
  \group_begin:
      \tl_set_eq:NN \__seq_item:n \__blf_shuffle_item_with_macros:
      \int_zero:N \l_tmpa_int
      #1
  % rebuild a seq variable
  \tl_gset:Nx \g__internal_seq 
          { \s__seq \__seq_construct_from_macros:w 1 \q_stop }
  \group_end:
  \tl_set_eq:NN #1 \g__internal_seq 
  \seq_gclear:N \g__internal_seq
}

\cs_set:Npn \__blf_shuffle_item_with_macros:
{
  \int_incr:N \l_tmpa_int
  \int_set:Nn \l_tmpb_int 
% MUCH SLOWER WITH { \int_rand:nn { 1 } { \l_tmpa_int } }
      { \c_one + \pdftex_uniformdeviate:D \l_tmpa_int }
% possibly random is same as max, then just does \let\foo\foo
  \tl_set_eq:cc { \int_use:N \l_tmpa_int } { \int_use:N \l_tmpb_int }
  \cs_set_nopar:cpn { \int_use:N \l_tmpb_int }
}

\cs_new:Npn \__seq_construct_from_macros:w #1 \q_stop
{
  \exp_not:N \__seq_item:n { \exp_not:v { #1 } }
  \if_int_compare:w #1 = \l_tmpa_int
    \exp_after:wN \use_none_delimit_by_q_stop:w
  \fi:  
  \exp_after:wN 
  \__seq_construct_from_macros:w
      \int_use:N \__int_eval:w #1 + \c_one \__int_eval_end: \q_stop
}

\ExplSyntaxOff
\begin{document}

\ExplSyntaxOn

\seq_new:N\g_my_seq
\clist_new:N\g_my_clist

\seq_set_from_clist:Nn \g_my_seq { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}

\sys_gset_rand_seed:n{ 249252612 }

\seq_shuffle_a_la_blf_without_toks:N \g_my_seq

\clist_set_from_seq:NN \g_my_clist \g_my_seq

\clist_use:Nnnn \g_my_clist { ~and~ } { ,~ } { ,~and~ }\par

\sys_gset_rand_seed:n{ 22635787 }

\seq_shuffle_a_la_blf_without_toks:N \g_my_seq

\clist_set_from_seq:NN \g_my_clist \g_my_seq

\clist_use:Nnnn \g_my_clist { ~and~ } { ,~ } { ,~and~ }

\ExplSyntaxOff
\end{document}


The following python snippet can serve to find the numbers to put on the right sides of the \ifnum tests in the findseed.tex file above, when one targets a given permutation.

def getoriginalshuffling(perm):
    r"""From a shuffled list of integers 1, ..., N
    this produces a new list which indicates how this
    was obtained by transpositions as in the BLF form
    of the FY algorithm from the identity permutation
    as starting point.

    Attention: the produced list contains 0, ..., N-1
    It tells the values that \pdfuniformdeviate must
    return in succession during the BLF algorithm.

    The input *must* be a permutation of 1, ..., N
    """
    L = perm[:]
    size = len(L)
    for x in range(size,0,-1):  # stops at 1
        i = L.index(x)
        L[i] = L[x-1]
        L[x-1] = i
    return L
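
For a cross-check in the other direction, the following hedged sketch replays a list of draws and rebuilds the permutation. It assumes the algorithm fills slot k with the content of slot j and then puts the new item into slot j; the name `apply_draws` is made up for this illustration.

```python
def apply_draws(draws):
    """Rebuild the permutation of 1, ..., N produced by a list of draws.

    draws[k] is the value (in 0 .. k) returned by the RNG when item k + 1
    is inserted: slot k receives the content of slot draws[k], and the
    new item goes into slot draws[k].
    """
    slots = []
    for k, j in enumerate(draws):
        if not 0 <= j <= k:
            raise ValueError("draw number %d must lie in 0..%d" % (k + 1, k))
        slots.append(None)     # open slot k
        slots[k] = slots[j]    # move the occupant of slot j to slot k
        slots[j] = k + 1       # the new item takes slot j
    return slots
```

Replaying [0, 1, 1, 2, 2, 5, 2, 5, 0] gives back [9, 3, 7, 2, 4, 8, 5, 6, 1], the example recorded in the comments of findseed.tex.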


BLF's answer announces new \seq_shuffle:N which is faster than all code below. Technically it uses \toks.

(modified after checking out latex3 dev repo) I tested the latex3 dev code, and as it uses a higher level reconstruction of the "seq" as the last step, its speed is about the same as the code below: a 2x gain from using \toks is compensated by a 2x loss from the reconstruction step. Thus the code (now at top of answer) is about the same speed and is not limited to 32767 items (with pdftex). Now, if the latex3 dev code with toks were to switch to a more low-level approach to the reconstruction step it would reclaim 2x improvement.

The code (found at top of this answer) is faster than my earlier expl3 attempt for two reasons:

  1. it copies over a key trick by BLF which merges the shuffle itself with the initial step storing items into containers. These containers are \toks in BLF's code, and their usage proves 2x faster than using macros, despite shortest possible names for them.

  2. it turns out that an even more substantial contributing factor to increased efficiency is the usage of \pdfuniformdeviate primitive in place of \int_rand:nn (check out expl3 commented code for explanations on what \int_rand:nn achieves and this comment on why it has a cost. There are some issues with the pdftex RNG).

Actually if I use \int_rand:nn the execution time is multiplied by more than 4x factor on the entire range from 10 items to 1000 items and by 3x at 50000 items!

The code is to be found at top of this answer together with the check that the seed 249252612 generates the identity permutation of 11 items!

Thus, in my earlier expl3 code (second code snippet) I have belatedly inserted usage of \pdfuniformdeviate (as was used in my original answer of course, which was not using latex3). This single change makes it about 2.5x--3x faster (the effect is smaller than above, because the code was slower to start with). At 50000 items, the gain is about 2x.

To give a rough idea, on my current computer (not very fast), the method here now shuffles 1000 items in about 5 milliseconds, the older one did that in about 8.5 milliseconds, and when I was using \int_rand:nn in this older code it took about 25 milliseconds.

For 50000 items, the respective timings are about 0.41s and 0.75s, the latter rising to about 1.6s when \int_rand:nn is used.

The code (incorporating a trick of BLF, but without toks) has now been incorporated into the LaTeX document at the top of this answer.


The code before that, now using \pdfuniformdeviate, which gives about a 2.5x--3x speed gain compared to its earlier self, which dutifully used \int_rand:nn!

\cs_set:Npn \_jfbu_seq_item:n #1
{
  \int_incr:N \l_tmpa_int
  \cs_set_nopar:cpn { l_ \int_use:N \l_tmpa_int } { \exp_not:n { #1 } }
}

\cs_new_protected:Npn\jfbu_seq_shuffle:N #1
{
  % define temp macros to hold the items from the seq
  % THIS ABUSES KNOWLEDGE OF EXPL3 INTERNALS
  \tl_set_eq:NN \tl_tmpa:n \__seq_item:n
  \tl_set_eq:NN \__seq_item:n \_jfbu_seq_item:n
  \int_zero:N \l_tmpa_int
  #1
  \tl_set_eq:NN \__seq_item:n \tl_tmpa:n

  % shuffle seq items

  \int_step_inline:nnnn { \l_tmpa_int } { -1 } { 2 }
  {
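    \tl_set:Nx \l_j_tl 
% MUCH SLOWER WITH \int_rand:nn 
%         { \int_rand:nn { 1 } { ##1 } }
% draw j from 1..i (i = ##1) and evaluate it to a plain integer,
% so that it can be used inside the csname below
          { \int_eval:n { 1 + \pdftex_uniformdeviate:D ##1 } }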
    \tl_set_eq:Nc \l_tmpa_tl     { l_ ##1 }
    \tl_set_eq:cc { l_ ##1 }     { l_ \l_j_tl }
    \tl_set_eq:cN { l_ \l_j_tl } \l_tmpa_tl
  }

  % rebuild a seq variable avoiding the perils of the save stack
  % THIS ABUSES KNOWLEDGE OF EXPL3 INTERNALS

  \tl_set:Nx #1 { \s__seq \_jfbu_seq_construct:n 1 . }

}

\cs_set_nopar:Npn \jfbu_gob_til_dot:w #1 . { }

\cs_new:Npn \_jfbu_seq_construct:n #1 .
{
  \exp_not:N \__seq_item:n { \use:c { l_ #1 } }
  \if_int_compare:w #1 = \l_tmpa_int
    \exp_after:wN \jfbu_gob_til_dot:w
  \fi:  
  \exp_after:wN \_jfbu_seq_construct:n \int_use:N \__int_eval:w #1 + \c_one .
}

Some related older remarks

  • As was observed quite rightly by @AlexG, the whole business of the \fontdimen's (or intarray) to construct first a permutation of integers can be dispensed with.

  • The code is the latest iteration of my attempt at doing things with expl3's language. However, while learning it I also peeked into the internals, and as a result this code abuses knowledge of how a "seq" is coded internally.

  • At this level of optimization, the code speed is quite dependent on the length of the names used for the temporary macros it creates, indexed by integers (digits tokens).

  • Of course the code is faster than the ones I posted earlier, which used \xintiloop, itself a generic higher-level construct, whereas here the code, although couched in expl3 notation, is to a large extent almost directly composed of TeX primitives.

  • Thus in the latest iteration I shamelessly use macro names \l_1, \l_2, etc., which are very short indeed but no longer carry any prefix.




These are some earlier parts of the answer.

This answer has evolved in stages which can be seen in its revision history.

  • initially, I provided a LaTeX2e + xinttools approach, dealing with comma separated lists both as input and output. I wanted to test again the usage of \fontdimen storage as in a previous answer of mine. We only need to construct a permutation on integers which will serve as indices to the sequence data so this is perfectly well adapted.

  • I felt I needed to provide a more genuine expl3 approach, thus in a second stage I read the OP more closely and saw it was a matter of "seq" typed variables, and after great efforts and goodwill I managed to provide an approach handling such user variables. I used \seq_put... and \seq_pop... macros which I had seen employed in other answers.

  • Gradually with help of @Manuel, the whole code got converted into expl3 language, inclusive of the \fontdimen method which is available abstracted in l3intarray.(1) In passing I did some speed tests comparing the put and pop versus my original methods of \xintAssignArray and \xintiloop in an \edef, and the speed difference proved to be great. But @Manuel provided pure expl3 alternatives which are in the same ballpark in terms of speed as my original xinttools methods. He has now added the final result to his answer.

(1) it was pointed out in a comment by @BrunoLeFloch that l3intarray is now public in expl3 dev repo. For the moment I had to use the private macro names starting with \__intarray.
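
The idea behind the \fontdimen and intarray stages, namely shuffling a permutation of integer indices and only then reading the items through it, can be sketched in Python as follows. The function name `shuffled_via_indices` is hypothetical, and Python's `random` module replaces the TeX random number primitives.

```python
import random


def shuffled_via_indices(items, rng=None):
    """Return a shuffled copy of `items` built from a shuffled index array.

    The integer array plays the role of the \\fontdimen/intarray storage:
    only indices are swapped, the items themselves are read once at the end.
    """
    rng = rng or random.Random()
    data = list(items)
    index = list(range(len(data)))
    for i in range(len(index) - 1, 0, -1):   # i = N-1 down to 1
        j = rng.randrange(i + 1)             # random position in 0 .. i
        index[i], index[j] = index[j], index[i]
    return [data[k] for k in index]
```

The items may be arbitrary objects since they are never compared or copied during the shuffle; as noted in the remarks above, the index array can also be dropped entirely by swapping the stored items directly.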

To keep this answer to reasonable size, I am keeping here only:

  • my original xinttools answer,

  • and its conversion to expl3 language and functionalities. It is not complete conversion because I use the \edef+\xintiloop method of reconstruction of a "seq" out of macros holding item values; this is heretical because it uses the current internal data structure of a "seq" and of course this is very bad. (for expl3ification even of this very bad thing, see "Fourth variant" in second code snippet below).



\documentclass{article}
\usepackage{expl3}
\usepackage{xinttools}

\ExplSyntaxOn

\__intarray_new:Nn \g_jfbu_intarray { 10000 }

\cs_new_protected:Npn \jfbu_genshuffle:n #1 
{ 
  \int_step_inline:nnnn { 1 } { 1 } { #1 } 
    { \__intarray_gset_fast:Nnn \g_jfbu_intarray { ##1 } { ##1 } }
  \int_step_inline:nnnn { #1 } { -1 } { 2 } 
  { 
    \int_set:Nn \l_tmpa_int { \int_rand:nn { 1 } { ##1 } } 
    \int_set:Nn \l_tmpb_int { \__intarray_item_fast:Nn \g_jfbu_intarray { ##1 } }
    \__intarray_gset_fast:Nnn \g_jfbu_intarray { ##1 } 
                            { \__intarray_item_fast:Nn \g_jfbu_intarray { \l_tmpa_int } }
    \__intarray_gset_fast:Nnn \g_jfbu_intarray { \l_tmpa_int }
                            { \l_tmpb_int } 
  } 
}

% attention, mix of expl3 and old TeX from then on!
\cs_new:Npn\jfbu_expand_once_and_brace #1 { { \exp_not:V { #1 } } }

% will be incorporated in future release of xint
\long\def\xintbracediloopindex #1\xintiloop_again\fi\xint_gobble_iii #2%
     {{#2}#1\xintiloop_again\fi\xint_gobble_iii {#2}}%

\cs_new_protected:Npn \seq_shuffle_inplace:N #1
 {
  \int_zero:N \l_tmpa_int
  \seq_map_inline:Nn #1
    { 
      \int_incr:N \l_tmpa_int 
      \tl_set:cn { l_jfbu_\int_use:N \l_tmpa_int _tl } { ##1 } 
    }
  \int_gset:Nn \g_tmpa_int { \l_tmpa_int }
  \jfbu_genshuffle:n \g_tmpa_int
  \edef#1{\s__seq % this does not expand (= \relax)
          \xintiloop[1+1]
          \noexpand\__seq_item:n 
          \expandafter \jfbu_expand_once_and_brace
            \csname l_jfbu_\expandafter\__intarray_item_fast:Nn
                           \expandafter\g_jfbu_intarray
% \xintiloopindex isn't really a macro holding the index, we must expand
% it where it "sees". In particular it can't be "braced",
% which limits its usage within macro arguments...
% (one needs variant macros using delimited arguments)
% I added a variant \xintbracediloopindex here to avoid having to define
% such macros
             \xintbracediloopindex _tl\endcsname
          \unless\ifnum\xintiloopindex=\g_tmpa_int
          \repeat}%
}

\ExplSyntaxOff

\begin{document}

\ExplSyntaxOn
% for reproducible results
\sys_gset_rand_seed:n { 123456 }

%initialize ordered seq with 1000 items!
\edef\foo{\xintiloop [0+1]
          \xintiloopindex
          \ifnum\xintiloopindex<999
          ,\repeat}
\seq_gset_from_clist:Nc\g_my_seq {foo}
%\show\g_my_seq
% Now measure total time needed:

\pdfresettimer
\seq_shuffle_inplace:N \g_my_seq
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T
%\show\g_my_seq

%initialize ordered seq with 3000 items!
\edef\foo{\xintiloop [0+1]
          \xintiloopindex
          \ifnum\xintiloopindex<2999
          ,\repeat}
\seq_gset_from_clist:Nc\g_my_seq {foo}

% Now measure total time needed:

\pdfresettimer
\seq_shuffle_inplace:N \g_my_seq
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

\ExplSyntaxOff

\end{document}

And here is the code used for the timing of these alternatives.

\documentclass{article}
\usepackage{expl3}
\usepackage{xinttools}

\ExplSyntaxOn

\__intarray_new:Nn \g_jfbu_intarray { 10000 }

\cs_new_protected:Npn \jfbu_genshuffle:n #1 
{ 
  \int_step_inline:nnnn { 1 } { 1 } { #1 } 
    { \__intarray_gset_fast:Nnn \g_jfbu_intarray { ##1 } { ##1 } }
  \int_step_inline:nnnn { #1 } { -1 } { 2 } 
  { 
    \int_set:Nn \l_tmpa_int { \int_rand:nn { 1 } { ##1 } } 
    \int_set:Nn \l_tmpb_int { \__intarray_item_fast:Nn \g_jfbu_intarray { ##1 } }
    \__intarray_gset_fast:Nnn \g_jfbu_intarray { ##1 } 
                            { \__intarray_item_fast:Nn \g_jfbu_intarray { \l_tmpa_int } }
    \__intarray_gset_fast:Nnn \g_jfbu_intarray { \l_tmpa_int }
                            { \l_tmpb_int } 
  } 
}

\ExplSyntaxOff

\begin{document}

\ExplSyntaxOn
% for reproducible results
\sys_gset_rand_seed:n { 123456 }

%initialize ordered seq with 3000 items!
\edef\foo{\xintiloop [0+1]
          \xintiloopindex
          \ifnum\xintiloopindex<2999
          ,\repeat}
\seq_gset_from_clist:Nc\g_my_seq {foo}

\seq_set_eq:NN  \g_my_savedseq \g_my_seq

% Measure time for counting items and setting up random permutation itself
\pdfresettimer
  \int_gset:Nn \g_tmpa_int { \seq_count:N \g_my_seq } % 3000
  \jfbu_genshuffle:n \g_tmpa_int
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Measure time for popping all items and preparing the tl temp variables
\cs_generate_variant:Nn \seq_pop_left:NN { Nc }
\pdfresettimer
  \int_step_inline:nnnn { 1 } { 1 } { \g_tmpa_int } 
    { \seq_pop_left:Nc  \g_my_seq { l_jfbu_ #1 _tl} }
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Measure time for an alternative way of popping out all items into an "array"
% of \l_jfbu_<index> macros
% Notice that this counts again.
% Code from @Manuel.

\pdfresettimer
\int_zero:N \l_tmpa_int
\seq_map_inline:Nn \g_my_savedseq
 { \int_incr:N \l_tmpa_int 
   \tl_set:cn { l_jfbu_\int_use:N \l_tmpa_int _tl} { #1 } }
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Measure time for pushing back shuffled items

% First variant:

\pdfresettimer
  \int_step_inline:nnnn { 1 } { 1 } { \g_tmpa_int }
    { \seq_put_right:Nv \g_my_seq { l_jfbu_ \__intarray_item_fast:Nn \g_jfbu_intarray { #1 } _tl } }
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Second variant:

% attention, mix of expl3 and old TeX !
\cs_new:Npn\jfbu_expand_once_and_brace #1
  { { \exp_not:V { #1 } } }

% will be incorporated in a future release of xint
\long\def\xintbracediloopindex #1\xintiloop_again\fi\xint_gobble_iii #2%
     {{#2}#1\xintiloop_again\fi\xint_gobble_iii {#2}}%

\pdfresettimer
  \edef\g_my_seq_var{\s__seq
          \xintiloop[1+1]
          \noexpand\__seq_item:n 
          \expandafter \jfbu_expand_once_and_brace
            \csname l_jfbu_\expandafter\__intarray_item_fast:Nn
                           \expandafter\g_jfbu_intarray
             \xintbracediloopindex _tl\endcsname
          \unless\ifnum\xintiloopindex=\g_tmpa_int
          \repeat}%
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Third variant

\cs_new:Npn \__jfbu_shuffle_step:n #1
 {
  { \exp_not:v 
    { l_jfbu_ \__intarray_item_fast:Nn \g_jfbu_intarray { #1 } _tl } } ,
 }

\cs_generate_variant:Nn \seq_set_from_clist:Nn { Nx }

\pdfresettimer
\seq_set_from_clist:Nx \g_my_seq_varvar
   {
    \int_step_function:nnnN { 1 } { 1 } { \g_tmpa_int } \__jfbu_shuffle_step:n
   }
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% Fourth variant (but abuses knowledge of internal expl3 seq structure)

\cs_new:Npn \__jfbu_shuffle_step_var:n #1
 {
   \exp_not:N \__seq_item:n 
    { \exp_not:v 
      { l_jfbu_ \__intarray_item_fast:Nn \g_jfbu_intarray { #1 } _tl } 
    }
 }

\pdfresettimer
\cs_set_nopar:Npx \g_my_seq_varvarvar
   {
    \s__seq
    \int_step_function:nnnN { 1 } { 1 } { \g_tmpa_int } \__jfbu_shuffle_step_var:n
   }
\edef\T{\the\dimexpr\pdfelapsedtime sp\relax}
\show\T

% \show\g_my_seq
% \show\g_my_seq_var
% \show\g_my_seq_varvar
% \show\g_my_seq_varvarvar

\ifx\g_my_seq_var \g_my_seq\else \ERROR\fi
\ifx\g_my_seq_varvar \g_my_seq\else \ERROR\fi
\ifx\g_my_seq_varvarvar \g_my_seq\else \ERROR\fi

\ExplSyntaxOff    
\end{document}

The same seed does not produce same results as in the non-expl3 part of the answer, presumably from the fact expl3 adds its own twists/improvements to random number generation. See @JosephWright's comment





This is the initial answer, which focused on comma separated values both on input and output (and it complains at various locations that this is not optimal storage).

It does not expand the items of the comma separated list, for example:

\def\apple{\Error}
\def\banana{\Error}
\def\dog{\Error}

\SetToShuffledCSV\foo{ \apple , \banana , strawberry , raspberry , 
                        chuckberry , pear , cherry, apricot, 
                        cat , \dog }

\show\foo

\SetToShuffledCSVExpandOnce\foo{\foo}

\show\foo

\SetToShuffledCSVExpandOnce\foo{\foo}

\show\foo

prints in the console: (user hits return key at each ? prompt)

> \foo=macro:
->\banana , cat, strawberry, raspberry, \dog , cherry, pear, \apple , chuckberr
y, apricot.
l.83 \show\foo

? 
> \foo=macro:
->\dog , \apple , cat, \banana , pear, strawberry, raspberry, apricot, cherry, 
chuckberry.
l.87 \show\foo

? 
> \foo=macro:
->apricot, pear, chuckberry, \banana , \apple , raspberry, cat, strawberry, che
rry, \dog .
l.91 \show\foo

? 

(the spaces after \banana, \apple, \dog are not space tokens in the \foo contents, but TeX always adds such a space with \show).

The first example was with an explicit comma separated list. A variant of \SetToShuffledCSV was then used to handle \foo as input which needs to be expanded once.

I also need another variant to "f-expand" the argument, for the examples.

Of course it would be better to re-frame all of this using the expl3 language (the business about variants illustrates it!), and I do apologize I did not do so ; I thought Henri's answer had all I needed to copy over, but then I realized I was lacking some knowledge and would need to dig into the documentation in order to provide a user interface with expl3 "clist" and other types.

As re-iterated in code comments below it would be more efficient to use all the way an "array" type of data (as produced by xinttools's \xintAssignArray).

I was triggered by Henri's comment about readability; I think plain old TeX is very readable, and here we go.

\documentclass{article}
\usepackage{xinttools}

\newcount\cnt
\newcount\cnti
\newcount\cntj
\newcount\cntk

\font\czzc=cmr10 at 666sp
\fontdimen10000\czzc = 0sp % make room ...
% cf  texmf.cnf, we could use 5000000 for example:
% Words of font info for TeX (total size of all TFM files, approximately).
% Must be >= 20000 and <= 147483647 (without tex.ch changes).
% font_mem_size = 8000000

% no \fontdimen0, hence needs indexing starting at 1
\newcommand\GenShuffle{% important: called with \cnt holding length = N
% prepare 1, 2, ..., N
   \cnti 1
   \xintloop
      \fontdimen\cnti\czzc=\cnti sp
   \ifnum\cnti<\cnt
      \advance\cnti 1
   \repeat
% \cnti holds also N here. Now implement:
%    for i = #list,2,-1 do
%       local j = math.random(i)
%       list[i], list[j] = list[j], list[i]
%    end
   \xintloop
     \cntj=\numexpr 1+\pdfuniformdeviate\cnti\relax % random from 1 to i (incl.)
     \cntk=\fontdimen\cnti\czzc                % store "list[i]"
     \fontdimen\cnti\czzc=\fontdimen\cntj\czzc % set "list[i]" to "list[j]"
     \fontdimen\cntj\czzc=\cntk sp             % set "list[j]" to "list[i]"
   \advance\cnti -1
   \ifnum\cnti>1 
   \repeat
}
\newcommand\ExpandOnlyOnce[1]{\unexpanded\expandafter{#1}}
\newcommand\SetToShuffledCSV[2]{%
% #1 is a macro which will hold the new csv-list
% #2 is a csv list (it is NOT expanded in any way)
% we convert it to an "array" and then back
% Manipulating braced items is more congenial to xinttools
% than comma separated lists, but let's use csv nevertheless
    \xintAssignArray\xintCSVtoListNoExpand{#2}\to\ShArray
    \cnt\ShArray{0} % number of items
    \GenShuffle     % generate random permutation
    \edef#1{\xintiloop[1+1]
            \expandafter\ExpandOnlyOnce
              \csname ShArray\number\fontdimen\xintiloopindex\czzc\endcsname
            \unless\ifnum\xintiloopindex=\cnt
            , % (space intentional)
            \repeat}%
}%
\newcommand\SetToShuffledCSVExpandOnce[2]{%
    \expandafter\SetToShuffledCSV\expandafter#1\expandafter{#2}}%
\newcommand\SetToShuffledCSVExpandFullFirst[2]{%
    \expandafter\SetToShuffledCSV\expandafter#1\expandafter{\romannumeral-`0#2}}%

\begin{document}

\pdfsetrandomseed 123456

\SetToShuffledCSV\foo{ 0,1,2,3,4,5,6,7,8,9 }

% \show\foo

First example: \foo

\pdfsetrandomseed 123456

\SetToShuffledCSV\foo{ apple , banana , strawberry , raspberry , 
                        chuckberry , pear , cherry, apricot, 
                        cat , dog }

Second First example: \foo

% overhead to get the csv list and then convert it again
% to braced items, and then to an "array".
% would be better to have "array" as data-type to start with

\SetToShuffledCSVExpandFullFirst\foo{\xintListWithSep{,}{\xintSeq{0}{99}}}

Second example: \foo

\SetToShuffledCSVExpandFullFirst\foo{\xintListWithSep{,}{\xintSeq{0}{499}}}

Third example: \foo

%\show\foo
\end{document}

Execution is reasonably fast, but it could be much better if the main data type were not comma-separated values: internally we use an "array" in the style of xinttools, so there is a conversion csv -> list of braced items -> array. Furthermore, for the second and third examples the csv is itself generated by a macro.
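The permutation itself is stored in \fontdimen parameters of a scratch font, which \GenShuffle uses as an integer array. A stripped-down sketch of that trick follows; the font name \myarr, the size 777sp and the slot count 100 are hypothetical choices for this illustration only.

```latex
\documentclass{article}
\begin{document}
% A fresh font instance that serves only as storage (\myarr is hypothetical).
\font\myarr=cmr10 at 777sp
% Enlarge the parameter table to 100 slots. TeX allows this only for the
% most recently loaded font, so it must come right after \font.
\fontdimen 100\myarr=0sp
% "myarr[7] = 42": \fontdimen assignments are always global.
\fontdimen 7\myarr=42sp
% Read the slot back as an integer (the dimension expressed in sp).
\count255=\fontdimen 7\myarr
\typeout{slot 7 holds \number\count255}
\end{document}
```

Indexing starts at 1 because there is no \fontdimen0, which is why \GenShuffle also counts from 1.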


  • Thank you, @jfbu, for providing your TeX based solution! I have tested your code on a seq var containing 0,1,...,999 which takes about 0.4 s. I have also worked on my non-Lua version to improve its speed by reducing the number of direct operations on the Sequence variable, but haven't updated my answer yet. – AlexG Apr 23 '18 at 10:38
  • @jfbu For GenShuffle in expl3 syntax, see my answer ;-). – AlexG Apr 23 '18 at 10:57
  • @jfbu Ah, ok. The code in Manuel's comment looks a bit complicated. Maybe it's only a matter of formatting. – AlexG Apr 23 '18 at 13:28
  • On the seed: there are some issues with the pdfTeX (etc.) RNG, which we code around such that we get a uniform distribution across the entire range. – Joseph Wright Apr 23 '18 at 13:44
  • @jfbu Okay, I added a full expl3 version of your commands to my answer; if you want to add that to your answer, I will remove it from mine. I think it's easy to understand. In any case, you could clean up your answer :) maybe leave the full TeX and a full expl3 answer, or maybe the original plus your final edit. I will remove my comments from this answer. – Manuel Apr 23 '18 at 17:04
  • @Manuel great. I think I need indeed to clean my answer but I propose you keep the final version, because you wrote almost all of the expl3 code in it!. For info: on 3000 items your code for reconstructing the "seq" takes about 0.045s on my computer versus 0.015s for \edef+\xintiloop way. This is in same ballpark roughly, of course expl3 can not be as perfect as xintiloop ;-) –  Apr 23 '18 at 18:12
  • @jfbu Ok to all. Out of curiosity, could you check the time if rather than creating a clist and then converting to seq, we made the seq directly? Just change \seq_set_from_clist:Nx #1 { in my answer to \cs_set_nopar:Npx #1 { \s__seq and change the definition of \__jfbu_shuffle_step:n to \exp_not:N \__seq_item:n { \exp_not:v { l_jfbu_shuffle_ \__intarray_item_fast:Nn \g_jfbu_intarray { #1 } _tl } }. – Manuel Apr 23 '18 at 18:55
  • @jfbu No, I think that's better to leave in a comment. expl3 idea is that you don't have to care for the implementation, and that would be more to do. clist is logical and I accept myself to create that from scratch. But it's nice to see that it's a decent improvement. – Manuel Apr 23 '18 at 19:14
  • @jfbu recursion is even faster than \xintiloop and \__int_array* methods :-) – AlexG Apr 25 '18 at 13:51
  • @jfbu now mine is again faster (~25 %) ;-) – AlexG Apr 27 '18 at 11:02
  • Oh, I apologize. This change was @Manuel 's suggestion. – AlexG Apr 27 '18 at 11:43
  • @AlexG yeah, let's blame @Manuel I concur ;-) –  Apr 27 '18 at 12:08
  • In the latest expl3 (perhaps not yet released?) l3intarray is public, and to construct your sequence you can try \seq_set_from_function:NnN #1 { \int_step_function:nN \l_tmpa_int } \__seq_jfbuild:n with \cs_new:Npn \__seq_jfbuild:n #1 { \exp_not:v { item_ #1 } }. – Bruno Le Floch Apr 27 '18 at 12:24
  • Bon appetit, :-)! – AlexG Apr 27 '18 at 12:30
  • Another option is to build a list with some separator then to use \seq_set_split:Nnn which has been around for a while. – Bruno Le Floch Apr 27 '18 at 15:14
  • @BrunoLeFloch Manuel used \seq_set_from_clist:Nx initially with an \int_step_function:nnnN internally to loop over indices and build a clist, which was then converted into a seq. Is that what you hint at with \seq_set_split:Nnn? This method by Manuel (who was my first expl3 teacher) was intrinsic but a bit slower than the more direct code, which knows what a seq looks like. –  Apr 27 '18 at 17:25
  • @jfbu Yes, that's what I was suggesting. I suspect that the slowness comes largely from \int_step_function. By the way, I'm trying out various implementations to include in expl3; could you tell me which of your chunks of code I should compare with what I have now? (I can shuffle 50000 items in about 5 seconds but perhaps my computer is just faster.) – Bruno Le Floch Apr 28 '18 at 16:10
  • @BrunoLeFloch Didn't know about \seq_set_from_function:, nice one. I like the things that you keep adding to latex3 :) – Manuel Apr 29 '18 at 16:08
  • I've tested the LuaTeX solution for very large sequences (say, 10000 items) and it seems surprisingly slow (several seconds). Am I messing up somewhere? Also, see my new implementation mentioned below. – Bruno Le Floch Apr 29 '18 at 21:29
  • @BrunoLeFloch I think you intended to post that comment to this other answer –  Apr 29 '18 at 21:45
4

2nd UPDATE

Challenged by the speed of jfbu's method, I did some measurements on intermediate steps of my previous code. It turned out that the shuffling itself is very fast, but most of the time was spent on rebuilding the sequence variable from the shuffled items using the standard expl3 function \seq_put_left:Nn.

Therefore, \seq_put_left:Nn is replaced by a macro based on recursion, which also makes use of knowledge about the internals of an expl3 sequence variable (figured out by applying \show to such a variable). An input stack overflow error occurring with large sequences could be fixed thanks to this answer and a comment by user Manuel.

The code is readable, compact, purely expl3, and very fast.

Complete example:

\documentclass{article}
\usepackage{expl3}\ExplSyntaxOn

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% shuffling acc to Durstenfeld-Knuth
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\cs_new_protected:Npn\seq_shuffle:N#1{
  \int_zero_new:N\l_items_int
  %count and convert seq elements into numbered tokenlist vars
  \seq_map_inline:Nn#1{
    \int_incr:N\l_items_int
    \tl_set:cn{item_\int_use:N\l_items_int}{##1}
  }
  %shuffle seq elements
  \int_step_inline:nnnn{\l_items_int}{-1}{2}{
    \tl_set:Nx\l_j_tl{\int_rand:nn{1}{##1}}
    \tl_set_eq:Nc\l_tmpa_tl{item_##1}
    \tl_set_eq:cc{item_##1}{item_\l_j_tl} \tl_set_eq:cN{item_\l_j_tl}\l_tmpa_tl
  }
  %rebuild seq variable "the hard way" by recursion
  \tl_set:Nx#1{\s__seq\_seq_build:w 1\q_stop}
}
\cs_new:Npn\_seq_build:w #1\q_stop{
  \if_int_compare:w#1>\l_items_int
    \exp_after:wN\use_none_delimit_by_q_stop:w
  \else:
    \exp_not:N\__seq_item:n{\exp_not:v{item_#1}}
  \fi:
  \exp_after:wN\_seq_build:w\int_use:N\__int_eval:w #1+1\__int_eval_end:\q_stop
}  
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\seq_new:N\g_my_seq
\sys_gset_rand_seed:n{ 123456 }

\begin{document}
\typeout{++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++}
%shuffle seq with 1000 items
\int_step_inline:nnnn{0}{1}{999}{
  \seq_put_right:Nn\g_my_seq{#1}
}
\pdfresettimer \seq_shuffle:N\g_my_seq
\typeout{I~shuffled~\int_use:N\l_items_int\space~items~in~\dim_to_decimal:n{\pdfelapsedtime sp}~s.}

\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}\par

%shuffle seq with 3000 items
\seq_clear:N\g_my_seq
\int_step_inline:nnnn{0}{1}{2999}{
  \seq_put_right:Nn\g_my_seq{#1}
}
\pdfresettimer \seq_shuffle:N\g_my_seq
\typeout{I~shuffled~\int_use:N\l_items_int\space~items~in~\dim_to_decimal:n{\pdfelapsedtime sp}~s.}
\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}
\typeout{++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++}
\end{document}
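For readers on a newer kernel: according to the comment under the question, \seq_shuffle:N and \seq_gshuffle:N are part of expl3 as of version 2018-04-29. A minimal sketch under that assumption follows; the wrapper \ShowMySeq is a hypothetical name used only here. Note that on such a kernel the \cs_new_protected:Npn\seq_shuffle:N definition above clashes with the kernel function and needs a different name.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\seq_new:N \g_my_seq
\seq_gset_from_clist:Nn \g_my_seq { 0,1,2,3,4,5,6,7,8,9 }
% Kernel-provided shuffle (global variant), assumed available.
\seq_gshuffle:N \g_my_seq
% \ShowMySeq is a hypothetical document-level wrapper for this sketch.
\newcommand \ShowMySeq
  { \seq_use:Nnnn \g_my_seq { ~and~ } { ,~ } { ,~and~ } }
\ExplSyntaxOff
\begin{document}
\ShowMySeq
\end{document}
```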

1st UPDATE

Well, this is my humble expl3-based attempt at implementing an "in-place" shuffling code (Durstenfeld-Knuth method), modifying the original sequence variable.

The first version of the code operated directly on both ends of the sequence variable passed as an argument to \seq_shuffle:N. But it turned out to be painfully slow.

The new version, still quite compact, converts the given seq var into token list variables, which are much faster to access. Now, shuffling is about as fast as user jfbu's solution.

%shuffling acc to Durstenfeld-Knuth
\cs_generate_variant:Nn\seq_pop_left:NN{Nc}
\cs_new_protected:Npn\seq_shuffle:N#1{
  \tl_set:Nx\l_seq_count_tl{\seq_count:N#1}
  \int_step_inline:nnnn{1}{1}{\l_seq_count_tl}{
    \seq_pop_left:Nc#1{item_##1}
  }
  \int_step_inline:nnnn{\l_seq_count_tl}{-1}{2}{
    \tl_set:Nx\l_j_tl{\int_rand:nn{1}{##1}}
    \tl_set:Nv\l_tmpa_tl{item_##1}
    \tl_set:cv{item_##1}{item_\l_j_tl} \tl_set:cV{item_\l_j_tl}\l_tmpa_tl
    \seq_put_left:Nv#1{item_##1}
  }
  \seq_put_left:Nv#1{item_1}
}

Complete example:

\documentclass{article}
\usepackage{expl3}\ExplSyntaxOn

%shuffling acc to Durstenfeld-Knuth
\cs_generate_variant:Nn\seq_pop_left:NN{Nc}
\cs_new_protected:Npn\seq_shuffle:N#1{
  \tl_set:Nx\l_seq_count_tl{\seq_count:N#1}
  \int_step_inline:nnnn{1}{1}{\l_seq_count_tl}{
    \seq_pop_left:Nc#1{item_##1}
  }
  \int_step_inline:nnnn{\l_seq_count_tl}{-1}{2}{
    \tl_set:Nx\l_j_tl{\int_rand:nn{1}{##1}}
    \tl_set:Nv\l_tmpa_tl{item_##1}
    \tl_set:cv{item_##1}{item_\l_j_tl} \tl_set:cV{item_\l_j_tl}\l_tmpa_tl
    \seq_put_left:Nv#1{item_##1}
  }
  \seq_put_left:Nv#1{item_1}
}

%initialize ordered seq
\seq_new:N\g_my_seq
\int_step_inline:nnnn{999}{-1}{0}{
  \seq_put_left:Nn\g_my_seq{#1}
}

%shuffle
\seq_shuffle:N\g_my_seq

\begin{document}
\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}
\end{document}

Old version:

It looks quite compact, yet three additional token-list variables need to be defined for internal use. On the other hand I have no idea about its efficiency in terms of time; "pop" and "push" operations have to be used repeatedly on both ends of the sequence in order to isolate a randomly chosen sequence item.
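A rough count of the work hints at why this version is slow. This is only a sketch: it counts pop/put pairs and assumes that j is uniformly distributed on 1, ..., i, as \int_rand:nn{1}{##1} is meant to deliver. For a given i the loop body performs 2(j-1)+1 pop/put pairs, and one more pair follows after the loop, so for a sequence of n items:

```latex
% Expected number of pop/put pairs for a sequence of n items
\[
  \mathrm{E}\bigl[\,2(j-1)+1\,\bigr] = 2\cdot\frac{i+1}{2} - 1 = i ,
  \qquad
  1 + \sum_{i=2}^{n} i = \frac{n(n+1)}{2} .
\]
```

So the number of sequence operations grows quadratically with n (55 pairs on average for the ten-item example), and each of these operations has to rebuild the sequence variable.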

\cs_new_protected:Npn\seq_shuffle:N#1{
  \int_step_inline:nnnn{\seq_count:N#1}{-1}{2}{
    \tl_set:Nx\l_j_tl{\int_rand:nn{1}{##1}}
    \int_step_inline:nnnn{1}{1}{\l_j_tl-1}{
      \seq_pop_left:NN#1\l_tmpa_tl \seq_put_right:NV#1\l_tmpa_tl
    }
    \seq_pop_left:NN#1\l_chosen_tl
    \int_step_inline:nnnn{1}{1}{\l_j_tl-1}{
      \seq_pop_right:NN#1\l_tmpa_tl \seq_put_left:NV#1\l_tmpa_tl
    }
    \seq_put_right:NV#1\l_chosen_tl
  }
  %move the one remaining (unchosen) element to the right end of the sequence
  \seq_pop_left:NN#1\l_chosen_tl \seq_put_right:NV#1\l_chosen_tl
}

Complete example:

\documentclass{article}
\usepackage{expl3}\ExplSyntaxOn

\cs_new_protected:Npn\seq_shuffle:N#1{
  \int_step_inline:nnnn{\seq_count:N#1}{-1}{2}{
    \tl_set:Nx\l_j_tl{\int_rand:nn{1}{##1}}
    \int_step_inline:nnnn{1}{1}{\l_j_tl-1}{
      \seq_pop_left:NN#1\l_tmpa_tl \seq_put_right:NV#1\l_tmpa_tl
    }
    \seq_pop_left:NN#1\l_chosen_tl
    \int_step_inline:nnnn{1}{1}{\l_j_tl-1}{
      \seq_pop_right:NN#1\l_tmpa_tl \seq_put_left:NV#1\l_tmpa_tl
    }
    \seq_put_right:NV#1\l_chosen_tl
  }
  \seq_pop_left:NN#1\l_chosen_tl \seq_put_right:NV#1\l_chosen_tl
}

%initialize ordered seq
\seq_gset_from_clist:Nn\g_my_seq{0,1,2,3,4,5,6,7,8,9}
\seq_shuffle:N\g_my_seq

\begin{document}
\seq_use:Nnnn\g_my_seq{~and~}{,~}{,~and~}
\end{document}
AlexG
  • 54,894
  • The updated code is much faster. – AlexG Apr 23 '18 at 11:27
  • Is there any reason to use tl rather than int for integers? And note that once you have every item with tl, you don't need \tl_set: but \tl_set_eq: suffices. – Manuel Apr 25 '18 at 14:29
  • Oh my, replacing with \tl_set_eq: makes it even faster. Thank you! Just wanted to avoid using an integer register. – AlexG Apr 25 '18 at 14:50
  • @Manuel: The only problem with recursion: the maximum number of seq items is 4994, otherwise I get ! TeX capacity exceeded, sorry [input stack size=5000]. – AlexG Apr 25 '18 at 14:54
  • @AlexG +1 but what you call recursion is quite the opposite of what people in TeX call "tail recursion". You are leaving tokens up in the token stream (stressing the input save stack). This is indeed very fast, for example in xinttools, \xintSeq does that when it has no optional argument. And it has the 5000 limit. At any rate good point about not needing the intarray. You are of course absolutely right and this is already a win. –  Apr 25 '18 at 16:41
  • \tl_use:c{item_#1} will not prevent expansion so this can not be used with a sequence whose items contain arbitrary tokens. I used \exp_not:v {item_#1} after I had learned expl3 ;-) –  Apr 27 '18 at 10:47
  • Thank you, didn't know that trick! Fortunately, it does not slow down the code. – AlexG Apr 27 '18 at 11:28
  • In the latest expl3, rather than abusing internals you can try \seq_set_from_function:NnN #1 { \int_step_function:nN \l_tmpa_int } \__seq_jfbuild:n with \cs_new:Npn \__seq_jfbuild:n #1 { \exp_not:v { item_ #1 } } (comment copied from the same comment I made on jfbu's answer). – Bruno Le Floch Apr 27 '18 at 12:25
  • Thank you, Bruno! Is it already on CTAN? I will try this... – AlexG Apr 27 '18 at 12:36
  • @AlexG finally I don't use \exp_not:v. See my latest update: again faster ;-). –  Apr 29 '18 at 11:44
  • @AlexG texlive 2017, complete example of 2nd update does not compile here: :42: Undefined control sequence. l.42 \pdfresettimer \seq_shuffle:N\g_my_seq ./shuffle-zahlen.tex:43: Undefined control sequence. <argument> \pdfelapsedtime sp l.43 ...in~\dim_to_decimal:n{\pdfelapsedtime sp}~s.} – Keks Dose Apr 29 '18 at 20:57
  • @keksdose need to test this, though I think I didn't use anything beyond frozen tl17 – AlexG Apr 30 '18 at 05:22
  • @KeksDose Perhaps you used lualatex? – AlexG Apr 30 '18 at 07:03
  • @AlexG Sorry, yes. With pdfLaTeX it is fast: I shuffled 9000 items in 0.20052 s. – Keks Dose Apr 30 '18 at 07:22