
I am trying to make PDF/A from LaTeX (not from LuaTeX etc.) with TeX Live 2013 on Fedora Linux. This works:

\documentclass{article}

\begin{document}

What cat says? \label{meow}

See question on page \pageref{meow}.

\end{document}

I compile it with latex, then dvips, and finally:

gs -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE -dUseCIEColor \
   -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite \
   -sOutputFile=meow.pdf meow.ps

Now, when I add \usepackage[pdfa]{hyperref}, I get validation errors like

dc:description :: Wrong value type. Expected type 'lang alt'.
The XMP property 'dc:title' is not synchronized with the document
information entry 'Title'.
A device-specific color space (Annotation C or IC) without an
appropriate output intent is used.

Documentation of hyperref says

-- result is usually not in PDF/A, because many features aren't controlled by hyperref --

but how to control those?

Should I even try this path? There are alternatives like pdfx. Which path to use?

tanGIS

1 Answer

  1. Put the color profile eciRGB_v2.icc in the working directory.
  2. Add the following code to your preamble:
    \usepackage{hyperxmp}
    \usepackage[pdfa, linktoc=none]{hyperref}

    % ===============================
    % Embedding the color profile.
    % Requires eciRGB_v2.icc in the working directory
    % http://www.eci.org/_media/downloads/icc_profiles_from_eci/ecirgbv20.zip
    \immediate\pdfobj stream attr{/N 3}  file{eciRGB_v2.icc}
    \pdfcatalog{%
        /OutputIntents [ <<
            /Type /OutputIntent
            /S/GTS_PDFA1
            /DestOutputProfile \the\pdflastobj\space 0 R
            /OutputConditionIdentifier (eciRGB v2)
            /Info(eciRGB v2)
        >> ]
    }
  3. Add metadata with hyperxmp and \hypersetup:

    \title{Title}
    \author{First Author, Last Author}
    \hypersetup{%
                 pdfauthortitle={Title of the Author},
                 pdfcopyright={Copyright (C) 20xx, Copyrightholder},
                 pdfsubject={Something},
                 pdfkeywords={Keyword1, Keyword2},
                 pdflicenseurl={http://creativecommons.org/licenses/by-nc-nd/3.0/},
                 pdfcaptionwriter={Scott Pakin},
                 pdfcontactaddress={Street},
                 pdfcontactcity={City},
                 pdfcontactpostcode={101},
                 pdfcontactcountry={Country},
                 pdfcontactemail={email@institute.edu},
                 pdfcontacturl={http://www.institute.edu},
                 pdflang={en},
                 bookmarksopen=true,
                 bookmarksopenlevel=3,
                 hypertexnames=false,
                 linktocpage=true,
                 plainpages=false,
                 breaklinks
             }
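
The validator messages quoted in the question complain that XMP properties such as dc:title are not synchronized with the document information entries. As a hedged sketch, and assuming the installed hyperxmp version does not pick up \title and \author on its own (this may depend on the version), the same strings can also be set explicitly through hyperref's pdftitle and pdfauthor keys, so that the Info entries and the XMP packet are generated from one source:

```latex
% Assumption: this goes in the preamble, after hyperxmp and hyperref are loaded.
% The values are placeholders; they should match what is given to \title and \author.
\hypersetup{%
    pdftitle={Title},
    pdfauthor={First Author, Last Author}
}
```

If the validator still reports a mismatch, comparing the XMP packet with the Info dictionary of the generated PDF should show which field differs.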
    

Everything put together results in a document like this:

    % ===============================
    % Filename: test.tex

    \documentclass{article}
    \usepackage{hyperxmp}
    \usepackage[pdfa, linktoc=none]{hyperref}

    % ===============================
    % Embedding the color profile.
    % Requires eciRGB_v2.icc in the working directory
    % http://www.eci.org/_media/downloads/icc_profiles_from_eci/ecirgbv20.zip
    \immediate\pdfobj stream attr{/N 3}  file{eciRGB_v2.icc}
    \pdfcatalog{%
        /OutputIntents [ <<
            /Type /OutputIntent
            /S/GTS_PDFA1
            /DestOutputProfile \the\pdflastobj\space 0 R
            /OutputConditionIdentifier (eciRGB v2)
            /Info(eciRGB v2)
        >> ]
    }

    % ----------------------------------------------
    % Add metadata
    \title{Title}
    \author{First Author, Last Author}
    \hypersetup{%
                 pdfauthortitle={Title of the Author},
                 pdfcopyright={Copyright (C) 20xx, Copyrightholder},
                 pdfsubject={Something},
                 pdfkeywords={Keyword1, Keyword2},
                 pdflicenseurl={http://creativecommons.org/licenses/by-nc-nd/3.0/},
                 pdfcaptionwriter={Scott Pakin},
                 pdfcontactaddress={Street},
                 pdfcontactcity={City},
                 pdfcontactpostcode={101},
                 pdfcontactcountry={Country},
                 pdfcontactemail={email@institute.edu},
                 pdfcontacturl={http://www.institute.edu},
                 pdflang={en},
                 bookmarksopen=true,
                 bookmarksopenlevel=3,
                 hypertexnames=false,
                 linktocpage=true,
                 plainpages=false,
                 breaklinks
             }

    \begin{document}
    What cat says? \label{meow}
    See question on page \pageref{meow}.
    \end{document}

The resulting PDF is PDF/A-1b compliant:

[Image: result of an online PDF validation tool]

Update: Since the original answer, things have changed several times. For a while the luatex85 package had to be loaded; today loading it breaks compilation, and it is no longer needed. The packages now have to be loaded before the OutputIntent is set.

This works only with pdfLaTeX, LuaHBTeX and LuaLaTeX.
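
For setups where the pdfTeX-style names \pdfobj, \pdflastobj and \pdfcatalog turn out not to be defined under LuaLaTeX, the following is a minimal sketch of an engine-aware variant. It is not part of the original answer, which needs no such switch according to the update above. It assumes that LuaTeX's native \pdfextension and \pdffeedback primitives are available in that case; the macro names \iccprofileobj and \outputintentdict are made up for this illustration.

```latex
% Assumption: placed in the preamble after \usepackage{hyperxmp} and
% \usepackage[pdfa]{hyperref}; eciRGB_v2.icc is in the working directory.
\ifdefined\pdfobj
  % pdfTeX-style names exist (pdfLaTeX, or LuaLaTeX with compatibility macros)
  \immediate\pdfobj stream attr{/N 3} file{eciRGB_v2.icc}
  \edef\iccprofileobj{\the\pdflastobj}
\else
  % native LuaTeX primitives
  \immediate\pdfextension obj stream attr{/N 3} file{eciRGB_v2.icc}
  \edef\iccprofileobj{\the\numexpr\pdffeedback lastobj\relax}
\fi

% Hypothetical helper macro holding the same dictionary as in the answer
\def\outputintentdict{%
    /OutputIntents [ <<
        /Type /OutputIntent
        /S/GTS_PDFA1
        /DestOutputProfile \iccprofileobj\space 0 R
        /OutputConditionIdentifier (eciRGB v2)
        /Info(eciRGB v2)
    >> ]
}

\ifdefined\pdfcatalog
  \pdfcatalog{\outputintentdict}
\else
  \pdfextension catalog{\outputintentdict}
\fi
```

This sketch does not apply to the latex + dvips route, where neither set of primitives exists.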

tanGIS
DG'
  • Thanks, now we are close to a solution. However, I guess that you mean "compile with latex", not with pdflatex. Also, I still get complaints like "The XMP property 'dc:description' is not synchronized with the document information entry 'Subject'." – Jori Mäntysalo Oct 07 '13 at 12:54
  • Well, LaTeX + dvips won’t work with ghostscript (only Adobe Distiller), so you have to resort to Dvipdfm instead of dvips. – DG' Oct 07 '13 at 14:34
  • dvipdfm gives plenty of warnings but produces a .pdf anyway. However, the resulting file has no hyperlinks. gs on my system won't work if the input and output files are the same. – Jori Mäntysalo Oct 08 '13 at 05:56
  • There is, as far as I can see, no good solution for working hyperlinks with your setup. You should really consider using pdflatex instead of plain latex. – DG' Oct 08 '13 at 08:38
  • If I start with your example, I get no hyperlinks at all. So I remove the draft option and get a PDF with hyperlinks. Then, when I run the gs command on that PDF, I get a PDF that is almost PDF/A, but not exactly; see for example the dc:description error above. – Jori Mäntysalo Oct 09 '13 at 11:01
  • LaTeX simply is not the right tool for this job, and you should either switch to pdflatex (which is the standard LaTeX-engine in TeXlive) or try to live without hyperlinks. – DG' Oct 09 '13 at 12:00
  • Sorry about the confusion: I used pdflatex and got a PDF, but not exactly PDF/A. – Jori Mäntysalo Oct 09 '13 at 12:39
  • Updated my answer. Should work now. – DG' Oct 10 '13 at 10:18
  • This is a hard one... /ICCProfile must have an absolute path as its argument. Also gs . . . PDFA_def.ps test.pdf must be gs . . . /absolute/path/PDFA_def.ps test.pdf. But then Evince shows the PDF but complains "Fontconfig warning: ignoring finnish: not a valid language tag". I had LANG=finnish in the environment, but setting it to C didn't help. The validator also says "The value of the key N is 4 but must be 3." etc. On what operating system did you get this working? – Jori Mäntysalo Oct 11 '13 at 12:56
  • I am on OSX. PDFA_def.ps should reside within the library directory of ghostscript, which varies (look it up in the documentation). The easy way would be to find the file and just change the /ICCProfile entry to /ICCProfile (eciRGB_v2.icc). You can also delete all the metadata except for the title. – DG' Oct 11 '13 at 18:49
  • I marked this as accepted. At least I can now compile on mac, if I really must have good PDF/A. And I guess I will come back reading this example after TeXlive 2014 will be out. – Jori Mäntysalo Oct 24 '13 at 11:24
  • What changes do we have to make to get it to output a PDF/X-1a compliant PDF? – Ibn Saeed May 05 '15 at 17:49
  • --> http://tex.stackexchange.com/a/242314/29873 – DG' May 05 '15 at 19:05
  • Great solution. Better than configuring the pdfx package, including the *.xmp files. Thank you! – user3072843 Feb 28 '18 at 15:13