
I want to typeset a table with regular expressions in one of its columns. The data for the table (originally shown as an image) is the set of rows used in the code below.

I tried the following code:

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}
\begin{table} [!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\label{regex-list}
\begin{tabularx}{\columnwidth} {|e | s | b |}
 \hline
 \multicolumn{1}{|c|}{\textbf{ID}} &
  \multicolumn{1}{c|}{\textbf{Secret Type}} &
  \multicolumn{1}{c|}{\textbf{Regular Expression}}\\
 \hline \hline
 65 & AWS API Secret & \b([A-Za-z0-9+/]{40})[ \r\n'"\x60] \\ \hline
 71 & Azure Client Secret & (?i)(%s).{0,20}([a-z0-9_\.\-~]{34}) \\ \hline
 278 & Generic Pattern & (?i)(?:pass|token|cred|secret|key)(?:.|[\n\r]){0,40}(\b[\x21-\x7e]{16,64}\b) \\ \hline
 605 & Slack Token &   (xoxb$\vert$xoxp$\vert$xapp$\vert$xoxa$\vert$xoxr)-[0-9]10,13[a-zA-Z0-9]* \\ \hline
 640 & Stripe API Key & [rs]k_live_[a-zA-Z0-9]{20,30} \\ \hline
\end{tabularx}
\end{table}

However, the table comes out broken and the regular expressions are not displayed properly.


Then I tried wrapping each regular expression in \verb, as shown here:

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}
\begin{table} [!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\label{regex-list}
\begin{tabularx}{\columnwidth} {|e | s | b |}
 \hline
 \multicolumn{1}{|c|}{\textbf{ID}} &
  \multicolumn{1}{c|}{\textbf{Secret Type}} &
  \multicolumn{1}{c|}{\textbf{Regular Expression}}\\
 \hline \hline
 65 & AWS API Secret & \verb/\b([A-Za-z0-9+/]{40})[ \r\n'"\x60]/ \\ \hline
 71 & Azure Client Secret & \verb/(?i)(%s).{0,20}([a-z0-9_\.\-~]{34})/ \\ \hline
 278 & Generic Pattern & \verb/(?i)(?:pass|token|cred|secret|key)(?:.|[\n\r]){0,40}(\b[\x21-\x7e]{16,64}\b)/ \\ \hline
 605 & Slack Token &  \verb/(xoxb$\vert$xoxp$\vert$xapp$\vert$xoxa$\vert$xoxr)-[0-9]10,13[a-zA-Z0-9]*/ \\ \hline
 640 & Stripe API Key & \verb/[rs]k_live_[a-zA-Z0-9]{20,30}/ \\ \hline
\end{tabularx}
\end{table}

The regular expressions are still not displayed properly.


What is a good approach to typesetting regular expressions in a LaTeX table?

Setu Kumar Basak

3 Answers


I suggest you (a) load the xurl package and employ its \path macro to typeset the regex strings and (b) execute \catcode 37=11 before \begin{tabularx} in order to remove the TeX-special nature of the % symbol (ASCII code: 37).

The argument of \path can get line-broken at arbitrary places. Most TeX-special characters -- including {, }, \ (backslash) and _ (underscore) -- can be handled without fuss by \path. AFAICT, the only TeX-special character that can cause drama in the argument of \path is %. That's why it's necessary to perform step (b) above -- unless, of course, none of the regex strings contain the % character to begin with. (In your table, though, the % character does occur.)

Note that because the TeX-special meaning of % -- the start of a comment -- gets disabled by action (b) above, TeX-style comments are not allowed within the scope of \catcode 37=11. Here, the scope ends at \end{table}.


\documentclass{article}
\usepackage{tabularx}
\usepackage{xurl} % allow line breaks at arbitrary locations
\begin{document}

\begin{table}[!htb]
\setlength\extrarowheight{2pt} % for a less cramped "look"
\caption{List of Regex Patterns to filter GitHub repositories.\strut}
\label{regex-list}
% Assign category code 11 ("letter") to "%" symbol:
\catcode 37=11
\begin{tabularx}{\columnwidth}{| l | l | X |}
\hline
ID & Secret Type & Regular Expression \\
\hline \hline
65 & AWS API Secret & \path{\b([A-Za-z0-9+/]{40})[ \r\n'"\x60]} \\ \hline
71 & Azure Client Secret & \path{(?i)(%s).{0,20}([a-z0-9_\.\-~]{34})} \\ \hline
278 & Generic Pattern & \path{(?i)(?:pass|token|cred|secret|key)(?:.|[\n\r]){0,40}(\b[\x21-\x7e]{16,64}\b)} \\ \hline
605 & Slack Token & \path{(xoxb|xoxp|xapp|xoxa|xoxr)-[0-9]10,13[a-zA-Z0-9]*} \\ \hline
640 & Stripe API Key & \path{[rs]k_live_[a-zA-Z0-9]{20,30}} \\ \hline
\end{tabularx}
\end{table}

\end{document}
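
The point about the scope of \catcode 37=11 can be illustrated with a table that is not inside a float. The following is only a sketch under the assumption that an explicit \begingroup ... \endgroup pair is acceptable in your document; the surrounding sentences are made-up filler. Inside the group no TeX comments are used, because % is an ordinary character there.

```latex
\documentclass{article}
\usepackage{tabularx}
\usepackage{xurl}
\begin{document}

Text before the table. % an ordinary comment, still allowed here

% The group opened on the next line keeps the catcode change local.
\begingroup
\catcode 37=11
\noindent
\begin{tabularx}{\linewidth}{| l | X |}
\hline
ID & Regular Expression \\ \hline
71 & \path{(?i)(%s).{0,20}([a-z0-9_\.\-~]{34})} \\ \hline
\end{tabularx}
\endgroup

% After the group has been closed, comments work again.
Text after the table.

\end{document}
```

Note that no comment is placed on the \catcode line itself or anywhere before \endgroup: a % there would be typeset rather than ignored.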

Mico

For learning purposes, I'll point out what went wrong in your original attempts.

Attempt 1. escape characters manually

You need to escape every TeX-special character. For the pipe symbol, either use the T1 font encoding (recommended!) or escape each occurrence manually.

Anyway, the following code works:

\documentclass{article}
\usepackage{tabularx}
\begin{document}

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}
\begin{table}[!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\begin{tabularx}{\columnwidth}{|e | s | b |}
\hline
\multicolumn{1}{|c|}{\textbf{ID}} &
\multicolumn{1}{c|}{\textbf{Secret Type}} &
\multicolumn{1}{c|}{\textbf{Regular Expression}}\\
\hline \hline
65 & AWS API Secret & \textbackslash b([A-Za-z0-9+/]\{40\})[ \textbackslash r\textbackslash n'"\textbackslash x60] \\ \hline
71 & Azure Client Secret & (?i)(\%s).\{0,20\}([a-z0-9\_\textbackslash.\textbackslash-\~{}]\{34\}) \\ \hline
278 & Generic Pattern & (?i)(?:pass|token|cred|secret|key)(?:.|[\textbackslash n\textbackslash r])\{0,40\}(\textbackslash b[\textbackslash x21-\textbackslash x7e]\{16,64\}\textbackslash b) \\ \hline
605 & Slack Token & (xoxb$\vert$xoxp$\vert$xapp$\vert$xoxa$\vert$xoxr)-[0-9]10,13[a-zA-Z0-9]* \\ \hline
640 & Stripe API Key & [rs]k\_live\_[a-zA-Z0-9]\{20,30\} \\ \hline
\end{tabularx}
\end{table}

\begin{table}[!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\begin{tabularx}{\columnwidth}{|e | s | b |}
\hline
\multicolumn{1}{|c|}{\textbf{ID}} &
\multicolumn{1}{c|}{\textbf{Secret Type}} &
\multicolumn{1}{c|}{\textbf{Regular Expression}}\\
\hline \hline
65 & AWS API Secret & \texttt{\textbackslash b([A-Za-z0-9+/]\{40\})[ \textbackslash r\textbackslash n'"\textbackslash x60]} \\ \hline
71 & Azure Client Secret & \texttt{(?i)(\%s).\{0,20\}([a-z0-9\_\textbackslash.\textbackslash-\~{}]\{34\})} \\ \hline
278 & Generic Pattern & \texttt{(?i)(?:pass|token|cred|secret|key)(?:.|[\textbackslash n\textbackslash r])\{0,40\}\discretionary{}{}{}(\textbackslash b[\discretionary{}{}{}\textbackslash x21-\textbackslash x7e]\{16,64\}\textbackslash b)} \\ \hline
605 & Slack Token & \texttt{(xoxb$\vert$xoxp$\vert$xapp$\vert$xoxa$\vert$xoxr)-[0-9]10,13[a-zA-Z0-9]*} \\ \hline
640 & Stripe API Key & \texttt{[rs]k\_live\_[a-zA-Z0-9]\{20,30\}} \\ \hline
\end{tabularx}
\end{table}

\end{document}
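
The remark about the pipe symbol and the T1 font encoding can be checked with a tiny test document. This is a minimal sketch, not part of the table above; the sample strings are arbitrary. With the default OT1 encoding, a bare | in the roman text font is printed as an em-dash, which is why the first table needs either T1 or a manual replacement.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % comment this line out to see the OT1 behaviour
\begin{document}

% Bare pipe in the roman font: correct with T1, an em-dash with OT1.
pass|token|cred

% Typewriter fonts contain a "|" glyph under either encoding.
\texttt{pass|token|cred}

% \textbar is an encoding-independent text-mode alternative to $\vert$.
pass\textbar token\textbar cred

\end{document}
```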

Attempt 2. use \verb

It's extremely difficult to escape all these characters correctly, however. As such, you can attempt to use \verb.

However,

  • \verb is explicitly documented to be unsupported in the tabularx documentation, so this is only a hack. Refer to how to use fancyvrb Verbatim in tabularx?
  • More recent versions offer "partial support", but ! cannot be used as a delimiter.
\begin{tabularx}{2cm}{c}
\verb|a|
\verb|\\\a\b|  % some spaces are managed
%\verb!a!  % breaks!
\verb/a/
\end{tabularx}

To make \verb work anyway, I use my package cprotectinside. You also have to undo the redefinition of \verb that tabularx performs, which is why the original \verb is saved as \normalverb. The end result is:

\documentclass{article}
\usepackage{tabularx}
\usepackage{cprotectinside}
\begin{document}

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}

\let\normalverb\verb

\cprotectinside{@}{
\begin{table}[!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\begin{tabularx}{\columnwidth}{|e | s | b |}
\hline
\multicolumn{1}{|c|}{\textbf{ID}} &
\multicolumn{1}{c|}{\textbf{Secret Type}} &
\multicolumn{1}{c|}{\textbf{Regular Expression}}\\
\hline \hline
65 & AWS API Secret & @\normalverb`\b([A-Za-z0-9+/]{40})[ \r\n'"\x60]`@ \\ \hline
71 & Azure Client Secret & @\normalverb`(?i)(%s).{0,20}([a-z0-9_\.\-~]{34})`@ \\ \hline
278 & Generic Pattern & @\normalverb`(?i)(?:pass|token|cred|secret|key)(?:.|[\n\r]){0,40}(\b`\discretionary{}{}{}\normalverb`[\x21-\x7e]{16,64}\b)`@ \\ \hline
605 & Slack Token & @\normalverb`(xoxb|xoxp|xapp|xoxa|xoxr)-[0-9]10,13[a-zA-Z0-9]*`@ \\ \hline
640 & Stripe API Key & @\normalverb`[rs]k_live_[a-zA-Z0-9]{20,30}`@ \\ \hline
\end{tabularx}
\end{table}
}

\end{document}

By the way, pay attention to warnings. Redefining the column type b is not recommended:

Package array Warning: Redefining primitive column b on input line 8.

but it's not the concern here.
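
If you want to get rid of that warning, one option is to pick column names that do not clash with the p, m and b column types of the array package. The sketch below uses the hypothetical names S and R and, as a further assumption on my part, an ordinary l column for the ID so that only two X columns remain. The tabularx documentation asks that the \hsize factors add up to the number of X columns, so the factors here are chosen to sum to 2 while keeping roughly the original 18:80 proportion.

```latex
\documentclass{article}
\usepackage{tabularx}
% Hypothetical column names: S (secret type) and R (regular expression).
% Two X columns are used, so the factors add up to 2 (0.4 + 1.6).
\newcolumntype{S}{>{\hsize=0.4\hsize}X}
\newcolumntype{R}{>{\hsize=1.6\hsize}X}
\begin{document}

\noindent
\begin{tabularx}{\linewidth}{| l | S | R |}
\hline
ID & Secret Type & Regular Expression \\ \hline
640 & Stripe API Key & \texttt{[rs]k\_live\_[a-zA-Z0-9]\{20,30\}} \\ \hline
\end{tabularx}

\end{document}
```

The regex cell is escaped by hand here only to keep the example self-contained; any of the approaches in this thread can be used in the R column instead.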

Alternative: lstinline

You still need to escape things, but it is easier than the standard escaping.

\documentclass{article}
\usepackage{tabularx}
\usepackage{listings}
\begin{document}

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}

\let\normalverb\verb

% set font of lstinline to use texttt font
\lstset{basicstyle=\ttfamily}

\begin{table}[!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\begin{tabularx}{\columnwidth}{|e | s | b |}
\hline
\multicolumn{1}{|c|}{\textbf{ID}} &
\multicolumn{1}{c|}{\textbf{Secret Type}} &
\multicolumn{1}{c|}{\textbf{Regular Expression}}\\
\hline \hline
65 & AWS API Secret & \lstinline`\\b([A-Za-z0-9+/]\{40\})[ \\r\\n'"\\x60]` \\ \hline
71 & Azure Client Secret & \lstinline`(?i)(\%s).\{0,20\}([a-z0-9_\\.\\-~]\{34\})` \\ \hline
278 & Generic Pattern & \lstinline`(?i)(?:pass|token|cred|secret|key)(?:.|[\n\r])\{0,40\}(\b`\discretionary{}{}{}\lstinline`[\\x21-\\x7e]\{16,64\}\\b)` \\ \hline
605 & Slack Token & \lstinline`(xoxb|xoxp|xapp|xoxa|xoxr)-[0-9]10,13[a-zA-Z0-9]*` \\ \hline
640 & Stripe API Key & \lstinline`[rs]k_live_[a-zA-Z0-9]\{20,30\}` \\ \hline
\end{tabularx}
\end{table}

\end{document}

user202729

You could use the \regexp{} command that comes with the biblatex package:

\documentclass{article}
\usepackage{tabularx}
\usepackage{biblatex}
\begin{document}

\newcolumntype{e}{>{\hsize=0.02\hsize}X}
\newcolumntype{s}{>{\hsize=0.18\hsize}X}
\newcolumntype{b}{>{\hsize=0.80\hsize}X}
\begin{table}[!htb]
%\small
\footnotesize
\caption{List of Regex Patterns to filter GitHub repositories.}
\label{regex-list}
\begin{tabularx}{\columnwidth}{|e | s | b |}
\hline
\multicolumn{1}{|c|}{\textbf{ID}} &
\multicolumn{1}{c|}{\textbf{Secret Type}} &
\multicolumn{1}{c|}{\textbf{Regular Expression}}\\
\hline \hline
65 & AWS API Secret & \regexp{\b([A-Za-z0-9+/]{40})[ \r\n'"\x60]} \\ \hline
71 & Azure Client Secret & \regexp{(?i)(\%s).{0,20}([a-z0-9_\.\-~]{34})} \\ \hline
278 & Generic Pattern & \regexp{(?i)(?:pass|token|cred|secret|key)(?:.|[\n\r]){0,40}(\b[\x21-\x7e]{16,64}\b)} \\ \hline
605 & Slack Token & \regexp{(xoxb|xoxp|xapp|xoxa|xoxr)-[0-9]10,13[a-zA-Z0-9]*} \\ \hline
640 & Stripe API Key & \regexp{[rs]k_live_[a-zA-Z0-9]{20,30}} \\ \hline
\end{tabularx}
\end{table}

\end{document}

The only character that needs escaping is %, as stated in the biblatex docs:

Perl escape sequences like \t for a tab, \n for a newline, \A for the start of a string or \d for a digit can be used, without TeX trying to execute them as commands, as can be special characters like ^, _ or {..} and #. Only the % must be protected: to match a single % in the bib, use \% in the regular expression, a \% is matched by \\\%.