When I convert a PDF created with MiKTeX to plain text, all the diacritics end up in the wrong place. I need to submit my thesis to an online plagiarism checker.

Example:

pr´ce a diskusia mˆˇu tvorit aj jednu y y y a oz samostatn´ ˇast a spoloˇne

How can I fix it?

OK, here is a minimal working example:

\documentclass[12pt, oneside]{book}
\usepackage[T1]{fontenc} % <---- Like this?
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage[slovak]{babel}
\linespread{1.2}

\begin{document}     
ľščťžýáíéäúôň % these are some of the misbehaving characters
\end{document}
MikeS
  • Probably by using \usepackage[T1]{fontenc}. But as you didn't show your code, this is pure guessing. – Ulrike Fischer Mar 23 '15 at 14:36
  • This is in my preamble: \documentclass[12pt, oneside]{book} \usepackage[utf8]{inputenc} \usepackage{graphicx} \usepackage[slovak]{babel} \linespread{1.2} – MikeS Mar 23 '15 at 14:42
  • @MikeS, that does not provide the text; please update your question with all the information. Comments are not really for posting large pieces of code. – daleif Mar 23 '15 at 14:59
  • @MikeS, try Ulrike's suggestion (I usually add it before inputenc). With it I seem to get the correct text out both when copying from Adobe Reader and via pdftotext on the command line. Everything was done with TeX Live 2014 on Linux. – daleif Mar 23 '15 at 15:24
  • You mean like in my edited example? It did not work for me (copying from Acrobat to Word). The text is still wrong and looks kind of pixelated... :-( – MikeS Mar 23 '15 at 15:37
  • If the font is pixelated: install the cm-super package or use \usepackage{lmodern} to switch to a Type 1 font. – Ulrike Fischer Mar 23 '15 at 16:02

1 Answer

I have been using the glyphtounicode support file from TeX Live for some time now (even with the Czech/Slovak IL2 font encoding). Unfortunately, I cannot test it with MiKTeX. We run:

pdflatex mal-sk.tex
pdftotext -enc UTF-8 mal-sk.pdf

The result is: 1 ľščťžýáíéäúôň, that is, the page number followed by the document content. The code is enclosed below. Please try it and see whether it fits your needs.

% pdflatex mal-sk.tex
% pdftotext -enc UTF-8 mal-sk.pdf
\documentclass[12pt, oneside]{book}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage[slovak]{babel}
\linespread{1.2}
\input glyphtounicode
\pdfgentounicode=1
\begin{document}     
ľščťžýáíéäúôň
\end{document}
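
A possible variant, offered only as a sketch: it combines the glyphtounicode lines above with the \usepackage{lmodern} suggestion from the comments, which was proposed there as a way to avoid pixelated fonts. The file name mal-sk-lm.tex is hypothetical, and whether the combination behaves the same way under MiKTeX is an assumption that has not been tested here.

```latex
% Hypothetical file name: mal-sk-lm.tex
% pdflatex mal-sk-lm.tex
% pdftotext -enc UTF-8 mal-sk-lm.pdf
\documentclass[12pt, oneside]{book}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{lmodern}     % Type 1 Latin Modern fonts, as suggested in the comments
\usepackage[slovak]{babel}
\input glyphtounicode    % load the glyph-name-to-Unicode table
\pdfgentounicode=1       % ask pdfTeX to write ToUnicode maps into the PDF
\begin{document}
ľščťžýáíéäúôň
\end{document}
```

If the extracted text is still wrong, comparing the pdftotext output of this file with that of the original example may help show whether the font or the missing Unicode mapping is responsible.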
Malipivo