
The post "Auto generate List of \url usages within document" describes a method for generating an index of the URLs used in a document, including page numbers. However, the URLs are sorted alphanumerically. I'd like to make a URL index that groups them by page, in the order in which the links appear on each page. Furthermore, my links are actually in \href commands, and I would like to include the link text in the index:

p. 1
     Wikipedia... http://wikipedia.org
     SXE... http://stackexchange.com
p. 2
     Google... http://google.com

I've created an MWE following the instructions from Heiko Oberdiek in the post linked above (compile with xelatex links.tex, makeindex links-url, xelatex links.tex). Note that the output I currently get is, as expected (but not desired), in alphabetical order rather than page order.


\documentclass[12pt,article]{memoir}

\settrimmedsize{11in}{8.5in}{*}
\settrims{0in}{0in}
\settypeblocksize{9in}{6.5in}{*}
\setlrmargins{1in}{*}{*}
\setulmargins{1in}{*}{*}
\setheadfoot{\onelineskip}{2\onelineskip}
\setheaderspaces{*}{1.5\onelineskip}{*}

\checkandfixthelayout

\PassOptionsToPackage{hyphens}{url}
\usepackage[linktoc=all,frenchlinks,pdfborderstyle={/S/U/W .5},citebordercolor={1 1 1},linkbordercolor={1 1 1},urlbordercolor={1 1 1}]{hyperref}
\makeatletter
\g@addto@macro{\UrlBreaks}{\UrlOrds}
\makeatother

\usepackage{filecontents}

\begin{filecontents*}{\jobname-url.mst}
% Input style specifiers
keyword "\\urlentry"
% Output style specifiers
preamble "\\begin{theurls}"
postamble "\\end{theurls}\n"
group_skip ""
headings_flag 0
item_0 "\n\\urlitem{"
delim_0 "}{"
delim_t "}"
line_max 500
\end{filecontents*}

\usepackage{pdfescape}

\makeatletter
\newwrite\file@url
\openout\file@url=\jobname-url.idx\relax

\def\instring#1#2{TT\fi\begingroup
  \edef\x{\endgroup\noexpand\in@{#1}{#2}}\x\ifin@}

\newcommand*{\write@url}[1]{%
  \begingroup
    \EdefEscapeHex\@tmp{#1}%
    \protected@write\file@url{}{%
      \protect\urlentry{\@tmp}{\thepage}%
    }%
  \endgroup
}
\let\saved@hyper@linkurl\hyper@linkurl
\renewcommand*{\hyper@linkurl}[2]{%
  \write@url{#2}%
  \saved@hyper@linkurl{#1}{#2}%
}
\newcommand*{\listurlname}{List of URLs}
\newcommand*{\printurls}{%
  \InputIfFileExists{\jobname-url.ind}{}{}%
}
\newenvironment{theurls}{%
  \begin{center}{\Huge \listurlname}%
\vspace{.7in}\\ \end{center}
{All of these links were active as of \today.} \\
\quad \\ \noindent
  %\section*{\listurlname}%
  \@mkboth{\listurlname}{\listurlname}%
  \let\write@url\@gobble
  \ttfamily\footnotesize
  \raggedright
  \hspace{-1.4em}
}{%
  \par
}
\newcommand*{\urlitem}[2]{%
  \hangindent=1em
  \hangafter=1
  \begingroup
    \urlpages{#2}%
    \EdefUnescapeHex\@tmp{#1}%
    \expandafter\url\expandafter{\@tmp}%
  \endgroup
  \newline
}
\newcommand*{\urlpages}[1]{%
{%
\normalfont
\space(\if\instring{,}{#1}{pp.\space#1}\else{p.\space\hyperlink{page.#1}{#1}}\fi)}
}
\makeatother

\begin{document}
\href{http://wikipedia.org}{Wikipedia} \href{http://stackexchange.com}{SXE}  
\newpage
\href{http://google.com}{Google}

\section*{Index of URLs}
\printurls
\end{document}
Moriambar
Joe Corneli

2 Answers


Since the URLs are not to be sorted alphabetically, sorting via Makeindex is not necessary. The page numbers only require the label/reference system (two LaTeX runs).

For each URL, the example stores the URL, the link text, and the page in a reference. The zref package allows arbitrary data fields in its references.
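As a standalone illustration of that mechanism (separate from the full example further down), here is a minimal sketch of a zref reference carrying a custom data field. The names demo@note, demo, \demolabel and \demoref are hypothetical and chosen only for this sketch; the zref macros are the same ones the full example relies on. Two LaTeX runs are needed before the stored data shows up.

```latex
\documentclass{article}
\usepackage{zref-base}

\makeatletter
% Hypothetical property "demo@note": stores whatever \demo@note
% contains at the moment the label is set.
\newcommand*{\demo@note}{}
\zref@newprop{demo@note}{\demo@note}
% Hypothetical property list "demo": the note plus the page number.
\zref@newlist{demo}
\zref@addprops{demo}{demo@note,page}

% \demolabel{<label>}{<note>}: sets a label that carries <note>
\newcommand*{\demolabel}[2]{%
  \begingroup
    \def\demo@note{#2}%
    \zref@labelbylist{#1}{demo}%
  \endgroup
}

% \demoref{<label>}: prints the stored note and its page
\newcommand*{\demoref}[1]{%
  \zref@refused{#1}%
  \zref@extractdefault{#1}{demo@note}{??}\
  (page \zref@extractdefault{#1}{page}{??})%
}
\makeatother

\begin{document}
Some text.\demolabel{first}{a stored note}
\newpage
Stored data: \demoref{first}
\end{document}
```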

The macro \UrlList then prints the list of URLs. The list can be configured via the environment UrlListEnv and the macros \UrlListPage and \UrlListItem. The example below uses a description environment with page description labels.

\documentclass{article}
\usepackage[colorlinks]{hyperref}

\usepackage{zref-base,zref-lastpage}
\usepackage{etexcmds}
\usepackage{pdfescape}

\makeatletter

% Help counters for numbering URLs
\newcounter{UrlList}
\newcounter{UrlListAux}
\renewcommand*{\theUrlList}{UrlList\the\value{UrlList}}
\renewcommand*{\theUrlListAux}{UrlList\the\value{UrlListAux}}

% Each URL gets a reference with text, url and page number.
\zref@newprop{UrlList@Text}{\UrlList@Text}
\zref@newprop{UrlList@HexLink}[3F3F]{\UrlList@HexLink}
\zref@newlist{UrlList}
\zref@addprops{UrlList}{UrlList@Text, UrlList@HexLink, page}
\zref@newprop{UrlList@Max}{\the\value{UrlList}}
\zref@addprops{LastPage}{UrlList@Max}

% Switch is needed in the list of URLs to disable URL recording.
\newif\ifUrlList@

% Hyperref internal is redefined to write the label with the URL data
\newcommand\saved@hyper@linkurl{}
\let\saved@hyper@linkurl\hyper@linkurl
\renewcommand{\hyper@linkurl}[2]{%
  % #1: text
  % #2: URL
  \ifUrlList@
  \else
    \begingroup
      \refstepcounter{UrlList}%
      \protected@edef\UrlList@Text{#1}%
      \@onelevel@sanitize\UrlList@Text
      \EdefEscapeHex\UrlList@HexLink{#2}%
      \zref@labelbylist{\theUrlList}{UrlList}%
    \endgroup
  \fi
  \saved@hyper@linkurl{#1}{#2}%
}

% \UrlList checks whether there are URL references and prints the list of URLs
\newcommand*{\UrlList}{%
  \zref@refused{LastPage}%
  \edef\UrlList@Max{%
    \zref@extractdefault{LastPage}{UrlList@Max}{-1}%
  }%
  \ifnum\UrlList@Max<0 %
    \@latex@warning@no@line{Rerun LaTeX to get list of URLs}%
  \else
    \UrlList@true
    \begin{UrlListEnv}%
      \let\UrlList@LastPage\@empty
      \setcounter{UrlListAux}{0}%
      \@whilenum\value{UrlListAux}<\UrlList@Max\do{%
        \stepcounter{UrlListAux}%
        \zref@refused{\theUrlListAux}%
        \zref@ifrefundefined{\theUrlListAux}{%
        }{%
          \EdefUnescapeHex\UrlList@Link{%
            \zref@extract{\theUrlListAux}{UrlList@HexLink}%
          }%
          \zref@def@extract\UrlList@Text{\theUrlListAux}{UrlList@Text}%
          \zref@def@extract\UrlList@Page{\theUrlListAux}{page}%
          \edef\UrlList@Next{%
            \noexpand\UrlListItem{%
              \etex@unexpanded\expandafter{\UrlList@Page}%
            }{%
              \etex@unexpanded\expandafter{\UrlList@Link}%
            }{%
              \etex@unexpanded\expandafter{\UrlList@Text}%
            }%
          }%
          \ifx\UrlList@Page\UrlList@LastPage
          \else
            \expandafter\UrlListPage\expandafter{\UrlList@Page}%
            \let\UrlList@LastPage\UrlList@Page
          \fi
          \UrlList@Next
        }%
      }%
    \end{UrlListEnv}%
  \fi
}
\makeatletter

% USER configuration

% Environment UrlListEnv surrounds the list of URLs, if
% URLs are available.
\newenvironment{UrlListEnv}{%
  \begin{description}%
}{%
  \end{description}%
}

% \UrlListPage{<page>}
% Sets the page header
\newcommand*{\UrlListPage}[1]{%
  \item[\hyperlink{page.#1}{Page #1}]\mbox{}%
}

% \UrlListItem{<page>}{<URL>}{<text>}
% Formats a URL entry
\newcommand*{\UrlListItem}[3]{%
  \\\relax#3 \dots\ \href{#2}{\nolinkurl{#2}}%
}  
\makeatother

\begin{document}
\href{http://wikipedia.org}{Wikipedia} \href{http://stackexchange.com}{SXE}  
\newpage
\href{http://google.com}{Google}

\section*{Index of URLs}
\UrlList
\end{document}

Result (page 2 of the output): [image not reproduced]
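If the layout from the question ("p. 1" headers, link text followed directly by dots and the URL) is preferred, the two user macros can be redefined after the definitions above. This is only a sketch of one possible configuration. It assumes the preamble of the example is already in place and changes nothing but the wording and spacing.

```latex
% Place after the "USER configuration" block of the example.

% Page header as "p. <page>" instead of "Page <page>"
\renewcommand*{\UrlListPage}[1]{%
  \item[\hyperlink{page.#1}{p.~#1}]\mbox{}%
}

% Entry as "<text>... <URL>", with no space before the dots
\renewcommand*{\UrlListItem}[3]{%
  \\\relax#3\dots\ \href{#2}{\nolinkurl{#2}}%
}
```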

Heiko Oberdiek
  • The command \url is no longer defined. That is logical, because there is no key text for the index list. But it is often useful to display the URL in the text and also list it in the index with key information. Is there a way to make the \url command work with two arguments (URL + key text), so that the URL is displayed in the document and listed in the index the same way as with \href? – Piroooh Apr 21 '20 at 15:05

After trying a few different approaches, here is a rather hacky solution that works, although it requires some by-hand editing of the .ind file, so there's room for improvement!

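\usepackage{makeidx}
\makeindex
% Keep the original \href so the redefinition below can still call it
\let\hrefold\href
% Running counter, used as the sort key of each subentry
\newcounter{IndexItemCounter}
% #1: URL, #2: link text
\renewcommand*{\href}[2]{\stepcounter{IndexItemCounter}%
\hrefold{#1}{#2}\index{\thepage@{p. \thepage}%
!{\theIndexItemCounter}@{#2\ldots\ \url{#1}}}}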

A few comments:

  • There's no real need for the standard indexing feature which produces page numbers following each entry (for now I just delete those lines from the .ind file).
  • # inside of URLs throws the indexing mechanism for a loop.
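One possible way around the # problem, borrowed from the code in the question, is to hex-encode the URL with the pdfescape package before it is written to the index file and to decode it again when the list is typeset. The sketch below only shows that round trip in isolation. The macro names \DemoUrl, \DemoHex and \DemoBack and the example.org URL are made up for illustration, and wiring this into the \href redefinition above is not shown.

```latex
\documentclass{article}
\usepackage{pdfescape}

% Hypothetical demo URL; '#' is made an ordinary character
% while the definition is read.
\begingroup
  \catcode`\#=12
  \gdef\DemoUrl{http://example.org/#top}
\endgroup

% Hex-encode the URL: the result contains hex digits only,
% so no special characters reach the index file.
\EdefEscapeHex\DemoHex{\DemoUrl}
% Decode it again, as one would when typesetting the list.
\EdefUnescapeHex\DemoBack{\DemoHex}

\begin{document}
Encoded: \texttt{\DemoHex}

Decoded: \texttt{\DemoBack}
\end{document}
```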

To make the display look reasonable I do a tiny bit of customization with memoir:

\renewcommand{\indexname}{Web Pages Cited in This Book}
\onecolindex
\printindex

I also added \raggedright at the top of the .ind file.
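The \raggedright step might be avoidable without touching the .ind file: memoir calls the hook \preindexhook after the index title is typeset and before the entries start. The following sketch assumes the one-column index used above, where a declaration placed in that hook stays in effect for the entries; I have not checked the two-column layout.

```latex
\renewcommand{\indexname}{Web Pages Cited in This Book}
% memoir hook, empty by default; runs between the index title
% and the first entry
\renewcommand{\preindexhook}{\raggedright}
\onecolindex
\printindex
```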

Joe Corneli