1

The internet is awash with questions and answers for producing a nice looking tilde. But my goal is to produce output (PDF, let's say) from which one can copy-paste into another application and get the same result as typing the ~ character from the keyboard. I don't care how it looks, I only care if it can be copy-pasted.

This is not a question about how to display a tilde symbol. \textasciitilde, \~{}, and $\sim$ all do that. However, the first two generate (UTF-8)CC83 = U+0303 = "combining tilde" and the last generates (UTF-8)E288BC = U+223C from the mathematical operators table. (As determined by pasting into emacs and using hexl-mode.)

It seems as though there should be a font-independent solution, considering that ~ is in ASCII (U+7E, to be specific), and therefore should be present in every font. The answer may be font (encoding) dependent, since ~ is not a required glyph for general text encodings (LaTeX font encodings section 3.1). But all (La)TeX code I've found (including the url package) generates higher Unicode characters, and it would be surprising if an ASCII character were impossible.

James6M
  • 11
  • 1
    Welcome to TeX.SE. – Mico Feb 10 '23 at 04:31
  • 3
    Assuming you're interested in a LaTeX-based solution, have you considered \textasciitilde? As in, \documentclass{article} \begin{document} \textasciitilde \end{document}? – Mico Feb 10 '23 at 04:33
  • @Mico I just looked at a large PDF file, of uncertain origin, which claimed to list all LaTeX code. But \textasciitilde was not listed, and no other code would apply. So the OP request is certainly not unreasonable. However, I discovered that \~{ } works (it puts a tilde "over" a space, resulting in an ordinary tilde that can be copied and pasted in text, as requested). – rallg Feb 10 '23 at 04:47
  • 2
    @rallg - A large pdf file of entirely certain origin -- The Comprehensive LaTeX Symbol List -- currently features 18,150 symbols and their corresponding LaTeX commands, in 585 [!] separate tables. The instruction \textasciitilde shows up in Table 2, "Predefined LaTeX 2ε Text-mode Commands". (A footnote to this table points out that "\~{} can be used ... instead of \textasciitilde".) This makes me think that suggesting \textasciitilde as a solution wasn't particularly unreasonable. – Mico Feb 10 '23 at 06:52
  • 2
    your last sentence is false. ~ is in ASCII but it is not in the default OT1 font encoding used by latex for example. – David Carlisle Feb 10 '23 at 07:17
  • @DavidCarlisle Check again: LaTeX font encodings page 19, table cell '176 = "7E. – James6M Feb 13 '23 at 07:18
  • @rallg I've updated the question to specify what hasn't worked for me, including \~{ }, but apparently that worked for you. Could you give specifics: Were you copying out of a PDF? What PDF viewer? What were you pasting into? – James6M Feb 13 '23 at 07:27
  • @James6M The general statement that being in ASCII means it should work is false; see 1 < 2 in OT1 for a clearer example. For ~, the character in slot 7E of OT1 is the tilde accent, so ̃ not ~, which is why \~{} (putting a tilde accent on nothing) is an approximation and is the definition of \textasciitilde in OT1. – David Carlisle Feb 13 '23 at 08:59
  • You can use the accsupp package to provide a real ~ for cut and paste around a visual \~{} – David Carlisle Feb 13 '23 at 09:03
  • I voted to re-open, I'll post an answer working for cut and paste once it is open – David Carlisle Feb 13 '23 at 09:09
  • which engine and which fonts do you use? – Ulrike Fischer Feb 13 '23 at 09:30
  • @James6M Used lualatex on Linux, copied out of Evince, pasted into BASH command window. It is quite possible that the system automatically substitutes ascii tilde when it sees something else, depending on the system font. So it may be a system-dependent or font-dependent thing. But hunting that down, is above my minimal programming knowledge. – rallg Feb 13 '23 at 16:36
  • @rallg thanks. With various permutations I'm consistently getting the U+0303 character. My guess is that the difference between our experiences is on the font (or font encoding) side. – James6M Feb 15 '23 at 02:35
  • @DavidCarlisle has given an answer that works, but this might also. Try \verb|~| compiled with xelatex. (Not tested.) – barbara beeton Feb 15 '23 at 04:15
  • @James6M you get U+0303 from \textasciitilde (or \string~ or \char"7E) in lualatex or xelatex? That would be weird. – David Carlisle Feb 15 '23 at 10:23

2 Answers

3

The general statement that being in ASCII means a character should work is false. For a clearer example, see what

1 < 2 

produces in OT1. For ~, the character in slot 7E of OT1 is the tilde accent, so ̃ not ~. That is why \~{}, which puts a tilde accent on nothing, is only an approximation, and it is also the definition of \textasciitilde in OT1.

You can use the accsupp package to provide a real ~ for cut and paste around a visual \~{}:

\documentclass{article}
\usepackage{accsupp}
\begin{document}

[\textasciitilde] [\BeginAccSupp{method=hex,unicode,ActualText=007E}\textasciitilde\EndAccSupp{}]

\end{document}

makes two bracketed tildes that look identical on the page (image not reproduced here).

If I cut and paste to

https://w3c.github.io/xml-entities/unicode-names.html?%5B%CB%9C%5D%20%5B~%5D%0A

I get

Result:

U+005b LEFT SQUARE BRACKET &lsqb; &lbrack; \lbrack [
U+02dc SMALL TILDE &tilde; &DiacriticalTilde; \texttildelow
U+005d RIGHT SQUARE BRACKET &rsqb; &rbrack; \rbrack ]
U+0020 SPACE \space
U+005b LEFT SQUARE BRACKET &lsqb; &lbrack; \lbrack [
U+007e TILDE \textasciitilde
U+005d RIGHT SQUARE BRACKET &rsqb; &rbrack; \rbrack ]

showing the second one is a ~
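If the tilde is needed in many places, the accsupp wrapping above could be bundled into a macro. The following is only a sketch along those lines: the name \copytilde is made up for illustration and is not defined by accsupp or any other package.

```latex
\documentclass{article}
\usepackage{accsupp}

% \copytilde is a hypothetical helper name, not part of any package.
% It prints the usual OT1 approximation of a tilde, but tags it with
% ActualText so that PDF viewers honouring ActualText copy U+007E.
\newcommand*{\copytilde}{%
  \BeginAccSupp{method=hex,unicode,ActualText=007E}%
  \textasciitilde
  \EndAccSupp{}%
}

\begin{document}

Home directory: \copytilde/project

\end{document}
```

Whether the pasted text actually contains U+007E still depends on the PDF viewer respecting ActualText, so it is worth checking with the viewer you intend to use.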

Or better (via Ulrike), use \pdfglyphtounicode:

\documentclass{article}
\pdfglyphtounicode{tilde}{007E}
\begin{document}

[\textasciitilde]

\end{document}
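On an older pdfTeX or LaTeX installation the glyph-to-Unicode mapping may not be switched on by default, in which case the single \pdfglyphtounicode line might have no visible effect. A sketch of the more explicit setup, assuming a pdfTeX recent enough to provide the \pdfgentounicode and \pdfglyphtounicode primitives and a TeX distribution that ships glyphtounicode.tex:

```latex
\documentclass{article}

% Assumption: pdfTeX engine; these primitives do not exist in XeTeX.
\input{glyphtounicode}           % load the standard glyph-name table
\pdfgentounicode=1               % ask pdfTeX to write ToUnicode maps
\pdfglyphtounicode{tilde}{007E}  % then override the entry for "tilde"

\begin{document}

[\textasciitilde]

\end{document}
```

The override is placed after \input{glyphtounicode} so that it is not overwritten by the table's own entry for the tilde glyph. Note that this remaps every use of that glyph, including tilde accents typeset from the same fonts.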

Note that this is only an issue in the 7-bit OT1 encoding. If you use any reasonable encoding, such as T1 in pdflatex or TU (Unicode) in lualatex, then \textasciitilde will work by default.
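As a minimal sketch of the T1 route, assuming pdflatex and that outline (Type 1) versions of the T1-encoded fonts are installed:

```latex
\documentclass{article}

% Assumption: compiled with pdflatex. With lualatex or xelatex, leave
% fontenc out, since those engines default to the Unicode TU encoding.
\usepackage[T1]{fontenc}  % slot "7E of T1 holds a real ASCII tilde glyph

\begin{document}

[\textasciitilde]

\end{document}
```

If only bitmap versions of the T1 fonts are available, selection and copying in the viewer may behave differently, so treat this as something to test rather than a guarantee.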

David Carlisle
  • 757,742
  • Interesting. I just double-checked on my system (Linux, lualatex, Evince, paste to nano editor). The uncompressed PDF (via qpdf) does not exhibit 7E where I would expect it to be, using a hex editor. But when copied and pasted to nano, the pasted character was indeed 7E. On the other hand, I can get the combining tilde via other code. So apparently, something at the system level is providing the 7E. Above my head. – rallg Feb 15 '23 at 01:30
  • @DavidCarlisle Your last sentence may be the most helpful to me. \usepackage[T1]{fontenc} did result in a cut&paste-able ~. It unfortunately also made a PDF in which my viewers think each 10pt character is about 60pt tall for selection purposes, making it unusable; but that's probably an unrelated problem. Where can I find more about pdfglyphtounicode? It's not in my installation, and neither Google nor CTAN could help (except to say that it isn't in XeTeX). – James6M Feb 15 '23 at 04:13
  • @James6M It is a pdftex command, so it is in the pdftex manual. But I do not understand your comment. If you are using xetex you definitely should never be using fontenc or T1 encoding; if you do, all hyphenation will be wrong. Why use a Unicode TeX and then force it to use a legacy 8-bit font encoding instead of Unicode? – David Carlisle Feb 15 '23 at 07:57
  • @DavidCarlisle Thanks for the pointer to the pdftex manual. I'm not using XeTeX. I was just saying that the only info I got from a Google search was that pdfglyphtounicode is not available in XeTeX. I am using pdfetex, but a rather old version, in a computing environment where I do not have the power to upgrade. This may lead to gymnastics on my end, but I don't believe those are pertinent to my original question. – James6M Feb 18 '23 at 03:13
0

If \~{} did not work, try this. Works for me:

In Preamble,

\makeatletter
\edef\realtilde{\expandafter\@gobble\string\~}
\makeatother

In document body, examples:

Hello\realtilde World % Prints without space.
Hello \realtilde\space World % Prints with both spaces.

I do not recall where I got the macro definition. Years ago.

rallg
  • 2,379
  • Thanks. This didn't work for me, but I believe that's due to the underlying fact that I am currently restricted to the OT1 font encoding, with which no solution exists. I need to solve other problems to use a better encoding, but those problems are separate from the current question. – James6M Feb 18 '23 at 03:27