
As discussed in "How to make the correct hash-symbol in C Sharp", I chose to render the name of the C# language on my resume as C$^\sharp$.

I'd like to keep that typographical touch. But I'm afraid that will make it impossible for automated tools to spot “C#” in my resume.

Is there a way to print C# with a sharp (♯) but have it behave as a hash (#) as far as the find and copy-paste features of a PDF reader are concerned?

I'm using LuaLaTeX.

  • The more common problem is likely to be that PotentialNewJobCo's HR dept is still using DerpSoft ResumeScraperPro 2005, which is unable to extract data from anything that isn't a .doc file and will puke on the pdf just as totally as if you sent them raw tex. – Dan Is Fiddling By Firelight Sep 12 '13 at 14:58

2 Answers

The PDF format has a feature called "ActualText" that lets you specify a replacement string for copy and paste:

\documentclass{article}
\usepackage{accsupp}
\newcommand*{\Csharp}{%
  C%
  \BeginAccSupp{
    method=hex,
    unicode=false,
    ActualText=23,
  }%
  $^\sharp$%
  \EndAccSupp{}%
}
\begin{document}
\Csharp
\end{document}

Adobe Reader supports this feature, but some other PDF viewers do not.

Heiko Oberdiek
  • Works as expected in (at least) Evince 3.8.3 and Acrobat 9.4.1 (Linux), (the text 'C#' is copied). – alfC Sep 11 '13 at 20:45
  • Does it make any difference whether I specify the # your way or with method=plain, ActualText={\#}? The latter seems to be more easily understandable as source code. (Works with AR, doesn’t work with SumatraPDF and PDF-XChange Viewer.) And is there a reason you’re using the starred version of \newcommand? The new macro doesn’t even take an argument here. – doncherry Sep 11 '13 at 20:47
  • @doncherry, it works in the viewers I reported earlier. (and also in zathura 0.2.3 and poppler's pdftotext) – alfC Sep 11 '13 at 20:57
  • @doncherry: \# cannot be used, because it is not expandable and does not expand to a plain #. \ltx@hashchar of package ltxcmds could be used. But that needs \makeatletter and \makeatother. A naked # causes much trouble because of its catcode as parameter character. Thus, IMHO, method=hex is the easiest method for this case. – Heiko Oberdiek Sep 11 '13 at 21:11
  • @HeikoOberdiek I don’t fully understand expansion, but it does seem to me like \# works – I can search for and copy the # in the pdf just fine. – doncherry Sep 11 '13 at 21:15
  • What about \begingroup\catcode`#=12 #\catcode`#=6\endgroup? – Francis Sep 11 '13 at 21:16
  • @doncherry: It's sheer luck. In LaTeX \# is defined via \chardef and therefore it is not expanded, but written as the two bytes \ and #. Then the PDF viewer ignores the backslash, because it does not escape anything with a special meaning. Using non-obvious side effects in both TeX and the PDF format is not my understanding of "plain". – Heiko Oberdiek Sep 11 '13 at 22:05
  • @Francis: Changing the catcode of # allows a clean specification of ActualText=# with method=plain. However, the code for the catcode change(s) makes the code less readable. Therefore I decided to stick to method=hex. – Heiko Oberdiek Sep 11 '13 at 22:09
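
For comparison, the two method=plain alternatives mentioned in the comments could look roughly like the sketch below. This is only an illustration, not code from the answer: the macro names \CsharpCatcode and \CsharpHashchar are made up for this example, and it assumes that method=plain expands its argument and writes the result as is, as the comments describe.

```latex
\documentclass{article}
\usepackage{accsupp}
\usepackage{ltxcmds}% provides \ltx@hashchar

% Variant 1 (hypothetical name): make # an ordinary character (catcode 12)
% while the macro is defined, then restore its parameter catcode (6).
\catcode`\#=12
\newcommand*{\CsharpCatcode}{%
  C%
  \BeginAccSupp{method=plain,ActualText=#}%
  $^\sharp$%
  \EndAccSupp{}%
}
\catcode`\#=6

% Variant 2 (hypothetical name): \ltx@hashchar expands to a plain #,
% but the @ in its name requires \makeatletter/\makeatother.
\makeatletter
\newcommand*{\CsharpHashchar}{%
  C%
  \BeginAccSupp{method=plain,ActualText=\ltx@hashchar}%
  $^\sharp$%
  \EndAccSupp{}%
}
\makeatother

\begin{document}
\CsharpCatcode{} and \CsharpHashchar
\end{document}
```

Both variants need extra setup around the definition, which is the readability cost that makes method=hex with ActualText=23 the shorter choice here.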

Here is a hack that uses \ooalign and tikz:

\documentclass{article} 
\usepackage{tikz}

\newcommand{\Chash}{\ooalign{\hidewidth\tikz\node[inner sep=0pt,opacity=0]{C\#};\cr C$^\sharp$ \cr}}

\begin{document}
I know \Chash and else.
\end{document}

Explanation: \ooalign puts a transparent C\# over C$^\sharp$, so that the text C# can be found in the PDF. I believe this approach does not depend on which PDF viewer you are using.

Update: You can replace \# with a plain # by temporarily changing its catcode to 12. Be sure to switch the catcode back to 6 afterwards, because forgetting to do so can break \newcommand:

\documentclass{article} 
\usepackage{tikz}

\catcode`#=12
\newcommand{\Chash}{\ooalign{\hidewidth\tikz\node[inner sep=0pt,opacity=0]{C#};\cr C$^\sharp$ \cr}}
\catcode`#=6

\begin{document}
I know \Chash and else.
\end{document}

[Screenshot: searching for C# in the PDF]

Francis