23

How can I convert TeX to plain text (txt) or doc? I'm using TeXstudio.

I have problems with some characters when I copy text from the PDF.

Sean Allred
  • 27,421
  • tex file is a plain text file. You can open it with any text editor. What you wish is probably to copy from the output pdf file and paste the contents somewhere. This is not so simple since the pdf could have a lot of formatted text. – Sigur Jun 25 '15 at 14:29
  • I need text with tex syntax – mskuratowski Jun 25 '15 at 14:31
  • It is not clear what you want. Do you want to copy from the pdf and paste it as TeX commands? – Sigur Jun 25 '15 at 14:32
  • 1
    I have both the tex and the pdf file. I want plain text without the code syntax that I used in the tex file – mskuratowski Jun 25 '15 at 14:35
  • If you really want to strip all LaTeX commands from the *.tex file you could use pandoc. See here: http://blog.philippklaus.de/2010/11/use-pandoc-to-convert-latex-to-markdown/ for example. Pandoc: http://pandoc.org/ – phx Jun 25 '15 at 14:35
  • 2
    You might also do well to read http://www.seanallred.com/tex/2015/05/25/tex-terminology.html, OP. See detex. – Sean Allred Jun 25 '15 at 15:38
  • Use the detex program. – musarithmia Jun 25 '15 at 15:38
  • Or see this question, http://tex.stackexchange.com/questions/34029/can-one-define-an-expandable-command-that-removes-control-sequences-from-its-arg, to strip code of its control sequences. – Steven B. Segletes Jun 25 '15 at 15:43
  • Sounds like mskuratowski wants an ascii text file minus the manuscript, and only the tex commands. I found this question because I want to convert a latex file to plain text minus the tex markup, formatted only with whitespace and ascii characters (not everything crammed together in one giant paragraph with zero whitespace). How to convert latex to text? – user12711 Dec 30 '21 at 23:15

8 Answers

19
pandoc --to=plain --wrap=none evaluation.tex

I found pandoc very useful. It can convert LaTeX to various formats, including doc and plain text. Installation is straightforward, and all well-known operating systems (Linux, Mac OS X and Windows) are supported. To try it online see this.
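If you need to convert several files, the command above can be wrapped in a small script. This is only a minimal sketch: the function names and the output naming rule are my own assumptions, and it uses `docx` for the Word case because that is the Word format pandoc writes.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(source, target_format="plain"):
    """Return the pandoc argument list for converting `source`.

    Hypothetical helper: plain text is written next to the source as .txt,
    any other format as .<format> (for example .docx).
    """
    src = Path(source)
    suffix = ".txt" if target_format == "plain" else "." + target_format
    command = [
        "pandoc",
        "--to=" + target_format,
        "--output=" + str(src.with_suffix(suffix)),
    ]
    if target_format == "plain":
        # Same flag as in the command above: do not hard-wrap paragraphs.
        command.append("--wrap=none")
    command.append(str(src))
    return command


def convert(source, target_format="plain"):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(source, target_format), check=True)
```

Calling `convert("evaluation.tex")` should then write `evaluation.txt`, and `convert("evaluation.tex", "docx")` should write `evaluation.docx`, provided pandoc is installed.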

Flow
  • 1,013
MajidL
  • 301
8
detex yourfile > yourfile.txt

https://en.wikibooks.org/wiki/LaTeX/Export_To_Other_Formats#Convert_to_plain_text

yihui.dev
  • 251
5

Since modern word processors (like MS Word or openoffice/libreoffice) allow HTML input, there is an alternative path: Convert your LaTeX document to HTML (with latex2html or tex4ht) and import the HTML to the word processor.

There is also a tool called latex2rtf, which converts LaTeX to RTF, a generic word processor format.

5

I'm developing a GUI tool that can do the exact thing, remove latex commands. Check out PyDetex. If you encounter any issue, please let me know =)

You have three options to run this app:

  1. Install it through pip, running:

    python3 -m pip install --upgrade pydetex

    python3 -m pydetex.gui

  2. Download a binary from PyDetex GitHub releases

  3. Run through your web browser (not recommended, but for testing purposes it's great!) https://repl.it/github/ppizarror/PyDetex

Then simply paste your code into the first text area and press the "Process" button (you can also process text straight from your clipboard, or copy the results to it). The software should automatically recognize the language, check for repeated words (if enabled in its settings), etc. It can also convert equations to plain text.

You can also import files with the "Open File" button, open a dictionary that retrieves synonyms, antonyms, and definitions for the word selected in the app, etc.

If you have any suggestions, let me know. Also, I'll be happy if you contribute, or create any issue through Github.

3

I've seen extremely similar questions to yours posted on this site before, so doing a prior search might help. However, the general consensus that I've gleaned from those questions' answers is that programs for removing (La)TeX syntax directly from your .tex file frequently return only semi-successful results.
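To illustrate why such tools are often only partly successful, here is a deliberately naive sketch that strips markup with regular expressions. It is not how detex or pandoc work internally; `naive_detex` is a hypothetical helper written only for this example.

```python
import re


def naive_detex(source):
    """Strip simple LaTeX markup with regular expressions (lossy by design)."""
    # Drop comments (a % that is not escaped as \%).
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop \begin{...} and \end{...} markers.
    text = re.sub(r"\\(begin|end)\{[^}]*\}", "", text)
    # Replace \cmd{arg} or \cmd[opt]{arg} with arg, for brace-free arguments only.
    text = re.sub(r"\\[a-zA-Z]+\*?(\[[^\]]*\])?\{([^{}]*)\}", r"\2", text)
    # Drop any command names that are still left.
    text = re.sub(r"\\[a-zA-Z]+\*?", "", text)
    # Drop leftover grouping braces.
    return text.replace("{", "").replace("}", "")


print(naive_detex(r"\section{Intro} Some \emph{important} text. % note"))
print(naive_detex(r"\textbf{nested \emph{markup}} and $\frac{a}{b}$"))
```

The first line comes out readable, but in the second one the fraction collapses to `$ab$`: the structure of the equation is lost, which is the kind of semi-successful result mentioned above.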

You mentioned that a .doc(x) file would be acceptable. If you have a current version of MS Word, you should use it to import the PDF rather than copying and pasting the PDF contents.

More information about the characters that are causing problems would be helpful (e.g., are they symbols in equations?), but using packages for Unicode functionality in your .tex file prior to creating the PDF may help.

crai_n
  • 301
2

I've just used TeX4ht in MiKTeX 21.2 to convert to HTML, then opened that in Word and saved it as .docx. It works quite well.

klefftech
  • 31
  • 4
0

You could use OpenDetex.

detex my.tex > my.txt
Flow
  • 1,013
0

One possibility nobody has mentioned so far is to compile to DVI or PDF first and then convert that to plain text. Two tools for this:

  1. pdftotext
  2. catdvi
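A rough sketch of how the choice between the two could be scripted, assuming both programs are installed. The function name is hypothetical; `-layout` is a pdftotext option that tries to keep the original page layout.

```python
from pathlib import Path


def build_text_extraction_command(compiled_file):
    """Return the command that extracts plain text from a PDF or DVI file."""
    path = Path(compiled_file)
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        # pdftotext writes its result to the output file named as last argument.
        return ["pdftotext", "-layout", str(path), str(path.with_suffix(".txt"))]
    if suffix == ".dvi":
        # catdvi prints the text to standard output, so redirect it yourself.
        return ["catdvi", str(path)]
    raise ValueError("expected a .pdf or .dvi file, got: " + str(path))
```

The returned list can be passed to `subprocess.run`; for the catdvi case you would also pass an open file as `stdout` to capture the text.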
stefanct
  • 841
  • 6
  • 16