
What is the proper way to generate punycode URIs using either \url or \href?

MWE using lualatex:

 \documentclass{article}
 \usepackage[utf8]{inputenc}
 \usepackage[unicode=true]{hyperref}
 \begin{document}
 \href{http://π.example.com/}{http://π.example.com/}
 \end{document}

In the generated PDF the link comes out as "http://.example.com/" (the π is dropped) instead of the punycoded "http://xn--1xa.example.com". I need the link to appear in the language of my target audience inside their PDF readers, so they should see http://π.example.com/ when they hover over the link. Percent-encoding the character displays correctly in the PDF viewer:

 http://\%CF\%80.example.com/

but takes you to %CF%80.example.com rather than xn--1xa.example.com.

Jon
  • It's not that they are not converted properly; it's that hyperref does not claim to, and does not, implement punycode at all. It might be an interesting exercise, but it is not done. For now, just use a punycode converter and add the encoded URL directly. – David Carlisle Jun 19 '17 at 19:34

1 Answer


Hyperref doesn't currently implement punycode, but you can use any online encoder and add the encoded string explicitly, for example:


\documentclass{article}
 %\usepackage[utf8]{inputenc}
 \usepackage[unicode=true]{hyperref}
\usepackage{fontspec}
\setmainfont{Arial}

 \begin{document}

 \href{http://xn--1xa.example.com}{http://π.example.com/}

 \end{document}

Note that I removed inputenc, which should not be used with LuaTeX, and switched to a font that has both Greek and Latin glyphs.
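
If several such links are needed, the pattern above could be wrapped in a small helper so that the punycoded target and the Unicode display text stay side by side. This is only a minimal sketch: \idnlink is a hypothetical name, not something hyperref provides, and it assumes plain hostnames without characters such as # or % that would need special handling inside a macro argument. The punycoded form still has to be produced by an external encoder.

```latex
\documentclass{article}
\usepackage[unicode=true]{hyperref}
\usepackage{fontspec}
\setmainfont{Arial}

% Hypothetical helper (not part of hyperref):
%   #1 = punycoded host, used as the link target
%   #2 = Unicode host, used as the visible link text
\newcommand{\idnlink}[2]{\href{http://#1/}{http://#2/}}

\begin{document}

\idnlink{xn--1xa.example.com}{π.example.com}

\end{document}
```

Compiled with lualatex, this should behave like the explicit \href above: the visible text is the Unicode form, while the link target is the punycoded form.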

Since you hinted that you are using LuaTeX, you could in principle use this Lua punycode implementation:

https://github.com/HalosGhost/lua-punycode/tree/master/src

However, it seemed to expect a slightly different version of the utf8 library than the one LuaTeX provides as unicode.utf8, so I couldn't get it to work just now.

David Carlisle
  • Thanks for taking the time to write this! It is very helpful and is the solution I had to implement. While this doesn't pass the Unicode characters to the URI handler (so punycoding is offloaded to the application that opens the URI), this appears to be as good as it gets in pdf documents. I couldn't even create a link to a Unicode URI in Adobe Pro. – Jon Jul 12 '17 at 18:14
  • @Jon Actually, your comment leaves me confused: if you wanted the Unicode displayed in the document and a Unicode URI passed to the URL handler, why did you need punycode at all? I thought you wanted to pass the punycoded string as the URL. – David Carlisle Jul 12 '17 at 18:39