14

How can one use \pdfmatch with regular expressions and where can one find a listing and description of all the available pdfTeX primitives and help on their use?

Edit

The latest issue of the pdfTeX documentation now contains a short description of the command, with examples.

yannisl
  • Unfortunately, many of the pdftex primitives are not properly documented. The pdftex source code seems to be the only definitive reference. – Lev Bishop Oct 20 '10 at 18:05
  • @Yiannis: The documentation of Heiko Oberdiek's askinclude package has a section on how to use \pdfmatch. I have not yet studied this and the source code of pdftex, but I will when I get time. – Bruno Le Floch Jan 03 '11 at 21:30
  • http://groups.google.com/group/comp.text.tex/browse_thread/thread/23d3473d1bfbee80/9e5620948638f596?lnk=gst&q=pdfmatch# --- and --- http://groups.google.com/group/comp.text.tex/browse_thread/thread/fa3d97ea41a80294/edf0cfa6aaf16d24?lnk=gst&q=pdfmatch#edf0cfa6aaf16d24 --- have extra pointers (Heiko Oberdiek in both cases). Sorry don't know how to make links in comments. – Bruno Le Floch Jan 04 '11 at 22:30
  • @Bruno Thx for the links. (BTW, this is a Markdown-enabled website, so [Markdown](http://daringfireball.net/projects/markdown/syntax) allows you to put inline web links.) – chl Jan 22 '11 at 21:02

2 Answers

13

See the pdfTeX 1.30.0 announcement, in particular:

  - \pdfmatch [icase] [subcount <number>] {<pattern>}{<string>}
    Implements pattern matching using the POSIX regex.
    It returns the same values as \pdfstrcmp, but with the following
    semantics: 
      -1: error case (invalid pattern, ...)
       0: no match
       1: match found
    Options:
    * icase: case insensitive matching
    * subcount: it sets the table size for found subpatterns.
      A number "-1" resets the table size to the start default.
  - \pdflastmatch <number>
    The result of \pdfmatch is stored in an array. The entry "0" contains
    the match, the following entries submatches. The positions of the
    matches are also available. They are encoded:
      <position> "->" <match string>
    The position "-1" with an empty string indicates that this entry is not
    set.
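
As a rough illustration of the announcement text above, here is a minimal plain pdfTeX sketch. The pattern, the test string and the file name are made up for the example, and it assumes the default subpattern table is large enough for two groups. Backslashes are deliberately avoided in the pattern, since the argument is expanded before matching and an escaped character would need extra care (for instance with \string).

```latex
% Plain pdfTeX sketch; compile with: pdftex pdfmatch-demo.tex
% (the file name is hypothetical).
% \pdfmatch expands to -1, 0 or 1, so it can sit directly inside \ifnum.
\ifnum\pdfmatch icase {([a-z]+)-([0-9]+)}{Issue TEX-42 is open}=1
  % Entry 0 holds the whole match; entries 1 and 2 hold the
  % parenthesised subpatterns, each encoded as <position>-><string>.
  \message{[whole: \pdflastmatch0]}
  \message{[group 1: \pdflastmatch1]}
  \message{[group 2: \pdflastmatch2]}
\else
  % Reached for 0 (no match) as well as -1 (error, e.g. invalid pattern).
  \message{[no match or invalid pattern]}
\fi
\bye
```

The messages appear on the terminal and in the log file. Because the position is stored together with the matched text, anything that needs only the string has to strip the leading `<position>->` part itself.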
12

For regular expressions specifically, the l3regex package provides a cross-engine solution (needs eTeX and \strcmp).

For instance,

\RequirePackage{l3regex}
\ExplSyntaxOn
\seq_new:N \l_foo_seq % declare the sequence that receives the matches
\regex_extract_all:nnN { \w+ } { Hello,~ world! } \l_foo_seq
\seq_show:N \l_foo_seq
\ExplSyntaxOff

shows Hello and world as the two items of the resulting sequence (list).
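
For a plain yes/no test, closer in spirit to what \pdfmatch returns, something along these lines should work. It is a sketch only: it assumes a recent expl3 kernel, in which the regex module is already included, and the pattern and test string are invented for the example.

```latex
\documentclass{article}
\usepackage{expl3} % assumption: recent expl3 kernel with the regex module built in
\ExplSyntaxOn
% Write a line to the terminal depending on whether the string contains
% digits, a hyphen, then digits. Spaces in the pattern are ignored here.
\regex_match:nnTF { \d+ - \d+ } { pages~12-34 }
  { \iow_term:n { regex~matched } }
  { \iow_term:n { regex~did~not~match } }
\ExplSyntaxOff
\begin{document}
\end{document}
```

Unlike the POSIX patterns of \pdfmatch, l3regex uses its own syntax with escapes such as \d and \w, so patterns cannot always be copied from one to the other unchanged.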

Here's a list of various questions with at least one answer that uses l3regex.