Special characters in the URL of a hyperref \href generate a wrong URL

\documentclass{article}
\usepackage{hyperref}
\begin{document}
\href{https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}
\end{document}

The link I am getting is:

https://www.m\unhbox \voidb@x \bgroup \let \unhbox \voidb@x \setbox \@tempboxa \hbox {a\global \mathchardef \accent@spacefactor \spacefactor }\let \begingroup \endgroup \relax \let \ignorespaces \relax \accent 23 a\egroup \spacefactor \accent@spacefactor logm\OT1\ae le.dk/MoM-arkiv/MoM_36/MoM36_3.pdf

The link actually comes from a BibTeX file, but I suppose that is not the real problem. The BibTeX file is:

@Article{Q87401587,
  author =   {Sune Gregersen},
  title =    {En hvislen i bækken},
  journal =  {Mål \& mæle},
  year =     {2014},
  pages =    {5-8},
  URL =      {https://www.målogmæle.dk/MoM-arkiv/MoM\_36/MoM36\_3.pdf},
  wikidata = {Q87401587}
}

The latex file that is using the file is:

\documentclass{article}
\usepackage{hyperref}

\begin{document}
\cite{Q87401587}
\bibliographystyle{acl_natbib}
\bibliography{tmp}
\end{document}

Running BibTeX yields this .bbl file:

\begin{thebibliography}{}
\expandafter\ifx\csname natexlab\endcsname\relax\def\natexlab#1{#1}\fi

\bibitem[{Gregersen(2014)}]{Q87401587}
Sune Gregersen. 2014.
\newblock \href{https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}.
\newblock {\em Mål \& mæle\/} pages 5--8.

\end{thebibliography}

Is there a way to fix this?

3 Answers

With a current LaTeX and a UTF-8 encoded input file that doesn't happen; the link comes out as you wrote it. But your link is then actually not correctly encoded: PDF requires percent-encoding of special characters. So it is better to use either punycode, as in the other answer, or the percent-encoded form, in which each non-ASCII character is replaced by its UTF-8 bytes (å becomes %C3%A5, æ becomes %C3%A6):

\documentclass{article}
\usepackage{hyperref}
\begin{document}

\href{https://www.m%C3%A5logm%C3%A6le.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}
\end{document}

With the new PDF management, which you load with \DocumentMetadata{}, you can also let hyperref re-encode the URL:

\DocumentMetadata{}
\documentclass{article}
\usepackage{hyperref}
\begin{document}
\hrefurl[urlencode]{https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}
\end{document}

Ulrike Fischer

The keyword here is "punycode"; the issue is answered at "Punycode (unicode) in domain names is not converted properly".

The punycode-converted latex file will work:

\documentclass{article}
\usepackage{hyperref}
\begin{document}
\href{https://www.xn--mlogmle-exan.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}
\end{document}
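
Since the link in the question comes from a BibTeX file, the same substitution can presumably be made in the URL field of the entry. The following is only a sketch under some assumptions: it uses the plainnat style as a stand-in for acl_natbib (which may format the field differently), it leaves the underscores in the URL unescaped, and it expects a current LaTeX so that the UTF-8 title and journal name survive pdfLaTeX.

```latex
% Compile with pdflatex, bibtex, then pdflatex twice.
\documentclass{article}
\begin{filecontents}[overwrite]{tmp.bib}
@Article{Q87401587,
  author =   {Sune Gregersen},
  title =    {En hvislen i bækken},
  journal =  {Mål \& mæle},
  year =     {2014},
  pages =    {5-8},
  URL =      {https://www.xn--mlogmle-exan.dk/MoM-arkiv/MoM_36/MoM36_3.pdf},
  wikidata = {Q87401587}
}
\end{filecontents}

\usepackage{natbib}
\bibliographystyle{plainnat} % assumption: stands in for acl_natbib
\usepackage{hyperref}        % makes the URL in the bibliography a link

\begin{document}
\cite{Q87401587}
\bibliography{tmp}
\end{document}
```

Only the host name changes (www.målogmæle.dk becomes www.xn--mlogmle-exan.dk); the path contains no non-ASCII characters, so it stays as it is.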
0

I recommend you compile your document with either LuaLaTeX or XeLaTeX, i.e., one of the fully Unicode-aware engines, instead of pdfLaTeX. If you make the switch, you'll get the output shown in the screenshot below, which I take to be what you require.

Note that the sample code employs the plainnat bibliography style since my TeX distribution, MacTeX2022, doesn't have access to the acl_natbib bib style. Observe also that I've removed the backslash characters that somebody (the OP, maybe?) appears to have inserted before the _ characters in the URL string.

[Screenshot of the compiled output]

The entry in the bbl file looks like this:

\bibitem[Gregersen(2014)]{Q87401587}
Sune Gregersen.
\newblock En hvislen i bækken.
\newblock \emph{Mål \& mæle}, pages 5--8, 2014.
\newblock URL \url{https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}.

% !TEX TS-program = lualatex   %% or 'xelatex', if you prefer
\documentclass{article}
\begin{filecontents}[overwrite]{tmp.bib}
@Article{Q87401587,
  author =   {Sune Gregersen},
  title =    {En hvislen i bækken},
  journal =  {Mål \& mæle},
  year =     {2014},
  pages =    {5-8},
  URL =      {https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf},
  wikidata = {Q87401587}
}
\end{filecontents}

\usepackage[numbers]{natbib}
\bibliographystyle{plainnat} % I don't have the 'acl_natbib' bib style

\usepackage{xurl} % allow linebreaks at arbitrary points in URL strings
\urlstyle{same}
\usepackage[colorlinks,allcolors=blue]{hyperref}

\begin{document}
\noindent
\href{https://www.målogmæle.dk/MoM-arkiv/MoM_36/MoM36_3.pdf}{En hvislen i bækken}

\nocite{*}
\bibliography{tmp}
\end{document}

Mico
    Even with LuaLaTeX you should percent-encode a URL. LuaLaTeX cannot change the rules for how URLs must be encoded in a PDF (even if many viewers handle unencoded ones nevertheless). – Ulrike Fischer Feb 03 '23 at 18:23