182

I would like to produce a PDF from which my readers can copy the rendered formulae and paste them directly into their TeX documents.

Example formula

And the copied text would automatically be:

\int x\sin ax\;\mathrm{d}x = \frac{\sin ax}{a^2}-\frac{x\cos ax}{a}+C

Is this possible?

cmhughes
  • 100,947
Jonas
  • 3,353
  • 12
    I added the accessibility tag here- I know it may not have been the original intention, but your question has very important connotations in this respect. – cmhughes Jun 18 '13 at 22:36

3 Answers

174

The PDF format supports a feature called "ActualText": replacement text that a viewer uses for copy-paste in place of the typeset glyphs. Not all PDF viewers support it, but Acrobat Reader does.
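To make the mechanism concrete, here is a minimal sketch of what such a marked span looks like when written by hand. It assumes pdfLaTeX (the `\pdfliteral` primitive is not available under that name in every engine), and the formula `x^2` is only an illustration. Characters such as backslashes and parentheses would need escaping inside the PDF string, which this sketch avoids by using none.

```latex
\documentclass{article}
\begin{document}
% Open a marked-content span whose replacement text is "x^2"
% (pdfLaTeX only; no escaping is done here, so the text must not
% contain backslashes or unbalanced parentheses)
\noindent\pdfliteral{/Span <</ActualText (x^2)>> BDC}%
$x^2$%
% Close the marked-content span
\pdfliteral{EMC}
\end{document}
```

The accsupp package used below wraps this idea and, with method=escape, takes care of escaping the string.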

\documentclass{article}
\usepackage{accsupp}

\newcommand*{\copyable}[1]{%
  \BeginAccSupp{%
    ActualText=\detokenize{#1},%
    method=escape,
  }%
  #1%
  \EndAccSupp{}%
}

\begin{document}
\[
  \copyable{\int x\sin ax\;\mathrm{d}x = \frac{\sin ax}{a^2}-\frac{x\cos ax}{a}+C}
\]
\end{document}

Result

Copy-paste result (\detokenize adds spaces after command names):

\int x\sin ax\;\mathrm {d}x = \frac {\sin ax}{a^2}-\frac {x\cos ax}{a}+C

A more verbatim-like copy is much more complicated:

\documentclass{article}
\usepackage{accsupp}

\makeatletter
\newcommand*{\copyable}{%
  \begingroup
  \@sanitize
  \catcode`\%=14 % allow % as comment char, also needed for \%
  \@copyable
}
\newcommand*{\@copyable}[1]{%
  \endgroup
  \BeginAccSupp{%
    ActualText=\detokenize{#1},%
    method=escape,
  }%
  \scantokens{#1}%
  \EndAccSupp{}%
}
\makeatother

\begin{document}
\[
  \copyable{\int x\sin ax\;\mathrm{d}x = \frac{\sin ax}{a^2}-\frac{x\cos ax}{a}+C}
\]
\end{document}

Copy-paste result:

\int x\sin ax\;\mathrm{d}x = \frac{\sin ax}{a^2}-\frac{x\cos ax}{a}+C

However, the argument of \copyable is read with verbatim catcodes. Therefore this trick does not work if \copyable is used inside the argument of another macro (or inside the environments of package amsmath), because the tokens are already fixed by then. In that case \detokenize still does its job, so you at least get the previous result with spaces after command names.
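The limitation can be seen in a small test document. This is a sketch that reuses the definitions above unchanged; the `\fbox` wrapper and the fraction are just illustrative choices, and the copied text noted in the comments is what one would expect from a viewer that supports ActualText.

```latex
\documentclass{article}
\usepackage{accsupp}

\makeatletter
\newcommand*{\copyable}{%
  \begingroup
  \@sanitize
  \catcode`\%=14 % allow % as comment char, also needed for \%
  \@copyable
}
\newcommand*{\@copyable}[1]{%
  \endgroup
  \BeginAccSupp{%
    ActualText=\detokenize{#1},%
    method=escape,
  }%
  \scantokens{#1}%
  \EndAccSupp{}%
}
\makeatother

\begin{document}
% Top level: the argument is read with verbatim catcodes,
% so the copied text should be \frac{a}{b}
\[ \copyable{\frac{a}{b}} \]

% Inside the argument of \fbox the tokens are already fixed,
% so \detokenize adds a space: expect \frac {a}{b}
\fbox{$\copyable{\frac{a}{b}}$}
\end{document}
```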

Discussion about areas

\copyable inserts whatsits that do not influence the mathematical spacing. Thus the whole equation can be split into several \copyable pieces:

\[
  \copyable{\int x\sin ax\;\mathrm{d}x}
  \copyable{=}
  \copyable{\frac{\sin ax}{a^2}}
  \copyable{-}
  \copyable{\frac{x\cos ax}{a}}
  \copyable{+}
  \copyable{C}
\]

And where the mapping from formula to copy-paste text is correct, \copyable can be omitted:

\[
  \copyable{\int x\sin ax\;\mathrm{d}x}
  =
  \copyable{\frac{\sin ax}{a^2}}
  -
  \copyable{\frac{x\cos ax}{a}}
  + C
\]

The area that the PDF viewer shows for selection depends on the formula and especially on the PDF viewer.

Acrobat Reader usually only shows the area above the first symbol of the formula; in the case of the integral sign, this is a small rectangle above it. The last example with AR9/Linux:

AR selecting all

Evince 3.4.0 is a little better in the coverage of the rectangle. The first equation with the whole equation in \copyable:

Evince selecting the whole equation

But Evince has trouble separating the equations. Here I wanted to select the second equation only, but it gets mixed up with the third:

Evince selecting second equation

Okular 0.14.3 has a nicer selection tool, but if the \copyable is split up into terms (second and third equations of the last example), the terms get mixed up, which is fatal for mathematicians.

Heiko Oberdiek
  • 271,626
  • Would it be possible to avoid this problem using a LuaTeX solution? (I am by all means not a LuaTeX expert, so I am just asking) – Federico Poloni Jun 18 '13 at 11:59
  • 1
    @FedericoPoloni: Not really, LuaTeX does not change the way TeX reads its input and converts it to tokens. – Heiko Oberdiek Jun 18 '13 at 12:09
  • 4
    Just checked: it is also copyable in okular and evince viewers, even better than in Acrobat Reader. A very useful feature indeed. – g.kov Jun 18 '13 at 14:09
  • 1
    Would it be possible to match the area one can select with the actual formula symbols? In Adobe Reader, the area I am able to select in order to copy the formula in the MWE is above the formula and very small. Or is this only an Adobe Reader problem? – maetra Jun 19 '13 at 09:32
  • 2
    Is there a solution that works with Preview in Mac OS X? – lhf Jun 19 '13 at 17:27
  • @lhf: It depends on the capabilities of the Preview in Mac OS X. If it does not support the ActualText feature, then annotations might be an alternative. But that adds an additional symbol. – Heiko Oberdiek Jun 19 '13 at 17:37
  • Strangely, Adobe Reader XI does not copy underscores under windows. – maetra Dec 17 '13 at 10:34
  • @maetra: There are not any underscores in the answer. With some encodings (e.g. T1) the underscore character is a glyph of a font and therefore copyable. But the default in LaTeX (OT1) is a simple rule that is not copyable. – Heiko Oberdiek Dec 19 '13 at 16:37
  • 2
    If you add e.g. x_i to your formula, you will see that Adobe Reader only copies xi, while for example evince copies x_i correctly. I assume this is an Acrobat bug and there is nothing one can do about it. Adding \usepackage[T1]{fontenc} does not help. – maetra Dec 20 '13 at 10:26
  • @maetra: Yes, I agree with your assumption. The underscore in /ActualText is not copied by AR 9.5/Linux. Another PDF viewer, Xpdf 3.03, works correctly. – Heiko Oberdiek Dec 21 '13 at 21:52
  • Can this be made to work with unicode? E.g. \copyable{$π≠∞$} (which displays correctly when the unicode characters are defined) just copies as $$ for me (using okular). – Scz Aug 30 '16 at 13:59
14

It is also possible to use the Mathpix desktop app (for Mac and PC (beta)): take a screenshot of the formula in the PDF, and the app converts it to LaTeX instantly and copies the result to your clipboard, so all you have to do is paste it into your editor of choice.

Here is a brief demo using Mathpix to extract LaTeX from a PDF and pasting into Overleaf:

[Image: Mathpix demo]

Jonas
  • 3,353
Kaitlin
  • 201
1

Using https://tex.stackexchange.com/a/621117/250119, it's possible to redefine various commands in order to make the copypaste work automatically.

Adapted a bit from https://tex.stackexchange.com/a/233397/250119 and the top answer in https://tex.stackexchange.com/a/119718/250119.

Unfortunately, this solution redefines various standard LaTeX math environments, so it is fragile and may break other things unexpectedly. If something breaks, the first debugging step is to comment out this part.

Only $...$, \(...\), \[...\], \begin{align*}, \begin{gather*} are handled.


\documentclass{article}
\usepackage{amsmath}
\usepackage{accsupp}
\newcommand\copypaste[2]{%
    \BeginAccSupp{method=escape,ActualText={\detokenize{#2}}}%
        #1%
    \EndAccSupp{}%
}

\ExplSyntaxOn
% ======== redefine $...$
\char_set_catcode_active:N $
\cs_new:Npn $ #1 $ {
  \c_math_toggle_token \copypaste{#1}{$#1$} \c_math_toggle_token
}
\char_set_catcode_other:N $
\AtBeginDocument{
  \char_set_catcode_active:N $
}

% ======== redefine \(...\)
\cs_new_eq:NN \jonas__old_open_parenthesis: \(
\cs_gset:Npn \( #1 \) {
  \jonas__old_open_parenthesis: \copypaste{#1}{\( #1 \)} \)
}

% ======== redefine \[...\]
\cs_new_eq:NN \jonas__old_open_square_bracket: \[
\cs_gset:Npn \[ #1 \] {
  \jonas__old_open_square_bracket: \copypaste{#1}{\[ #1 \]} \]
}

% ======== redefine align*
\ProvideDocumentCommand \NewEnvironmentCopy {mm} {  % lines taken from https://tex.stackexchange.com/a/680717/250119, can be deleted if LaTeX version is new enough
  \expandafter \NewCommandCopy \csname#1\expandafter\endcsname \csname#2\endcsname
  \expandafter \NewCommandCopy \csname end#1\expandafter\endcsname \csname end#2\endcsname
}
\ProvideDocumentCommand \RenewEnvironmentCopy {mm} {
  \expandafter \RenewCommandCopy \csname#1\expandafter\endcsname \csname#2\endcsname
  \expandafter \RenewCommandCopy \csname end#1\expandafter\endcsname \csname end#2\endcsname
}

\clist_map_inline:nn { align*,gather* } {
  \message{======== #1}
  \NewEnvironmentCopy{old-#1}{#1}  % backup the old definition
  \RenewDocumentEnvironment{#1}{b}{
    \copypaste{
      \begin{old-#1}
      ##1
      \end{old-#1}
    }{
      \begin{#1} ^^J
      ##1 ^^J
      \end{#1}
    }
  }{}
}
% ======== done
\ExplSyntaxOff

\begin{document}
What I have is $x^2 + y^2 = z^2$ and \(\frac{1}{2}\). Try to copy/paste me.
\[ y = \frac{\cos{x}}{1+\cos{x}} \]
Some aligned equations:
\begin{align*}
  1+2 &= 3 \\
  4+5 &= 9
\end{align*}
Gather environment:
\begin{gather*}
  1+2 = 3 \\
  4+5 = 8+1
\end{gather*}
Other environments (not handled):
\begin{align}
  1+2 &= 3
\end{align}
\begin{equation}
  1+2=3
\end{equation}
\begin{equation*}
  1+2=3
\end{equation*}

\end{document}
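For an environment that the redefinitions do not cover, the two-argument \copypaste helper can also be called by hand. The following is a minimal sketch that uses only the helper definition from the listing above, without any of the redefinitions. The choice of the equation environment and of the copy-paste text in the second argument is just an illustration, and \detokenize will again insert spaces after command names.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{accsupp}
% Same helper as above: #1 is typeset, #2 becomes the copy-paste text
\newcommand\copypaste[2]{%
    \BeginAccSupp{method=escape,ActualText={\detokenize{#2}}}%
        #1%
    \EndAccSupp{}%
}

\begin{document}
\begin{equation}
  % The second argument is only detokenized, never executed
  \copypaste{1+2=3}{\begin{equation} 1+2=3 \end{equation}}
\end{equation}
\end{document}
```

Because nothing is redefined here, this variant should be less fragile, at the cost of writing each formula twice.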

output

user202729
  • 7,143