4

I have written an XSD schema and documented it using xmlpad, which automatically generates HTML documentation for the schema. Now I want to generate a PDF from that HTML document. My plan was to convert the HTML to LaTeX and then run pdflatex to produce the PDF, but I don't know how to convert an HTML document to LaTeX. Is there an open-source tool for this? Can anyone suggest a tool or another approach? Ultimately, I need to convert an HTML document to PDF.

Thorsten
  • 12,872
pavani
  • 2,973
  • 2
    I don't think it is a good idea. Every browser's "print" button can convert HTML to PDF directly, without the need to go through LaTeX. – Federico Poloni Jan 27 '12 at 12:53
  • 2
    Have a look at this question: http://tex.stackexchange.com/questions/3079/how-do-i-convert-html-to-latex – Psirus Jan 27 '12 at 13:01
  • I have used the open-source project TCPDF several times to convert HTML code directly to PDF. To be honest, I don't see any advantage in doing this with LaTeX. However, TCPDF requires a local PHP installation or a web server to run on. – dhst Jan 27 '12 at 13:54

3 Answers

2

I don't know about LaTeX, but with ConTeXt MkIV you can parse XML. For an example of parsing HTML, see the My Way article by Thomas Schmitz.

Aditya
  • 62,301
2

You could use pandoc to convert the HTML to LaTeX and then generate the PDF from the output.

Thorsten
  • 12,872
dfc
  • 542
0

Thanks to the solution from dfc! This is based on his suggestion.

You can use pandoc to convert HTML directly to PDF. For example, to convert the Google C++ style guide (saved here as cppguide.html) to a PDF document:

pandoc cppguide.html -o cppguide.pdf

However, I prefer to first convert to LaTeX via pandoc and then edit the .tex file to suit my needs (the default PDF generated above uses the LaTeX article class, and I prefer to tune the page layout to save paper). For example:

pandoc -s cppguide.html -o cppguide.tex

Then use your favorite LaTeX editor (e.g. AUCTeX) to edit it as you like, and run pdflatex on it.
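
Put together, the two-step route might look roughly like the script below. This is only a sketch: the file name cppguide.html is the example from above, the second pdflatex pass is my assumption (it is the usual way to settle cross-references and a table of contents), and the guard simply skips the conversion when pandoc or the input file is missing.

```shell
#!/bin/sh
# Sketch of the two-step route: HTML -> LaTeX via pandoc, then LaTeX -> PDF.
# The file name is the example used in this answer; substitute your own.
input="cppguide.html"
base="${input%.html}"   # strip the .html suffix, giving "cppguide"

if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
    # -s writes a standalone document with a full preamble.
    pandoc -s "$input" -o "$base.tex"

    # Edit "$base.tex" at this point if you want to change the layout.

    # Run pdflatex twice so that references and the table of contents settle.
    pdflatex -interaction=nonstopmode "$base.tex"
    pdflatex -interaction=nonstopmode "$base.tex"
else
    echo "pandoc or $input not found; nothing converted" >&2
fi
```

Keeping the .tex file as an intermediate step is what makes the manual layout changes possible; if you do not need them, the one-line pandoc call above is enough.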

Refer to http://pandoc.org/demos.html for pandoc usage demos.

oracleyue
  • 296
  • You can specify a custom preamble for Pandoc-generated LaTeX files, so there's probably no need to edit the resulting file. – Nicola Gigante Sep 06 '16 at 13:05
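
Following up on that comment, here is a rough sketch of what a custom preamble could look like. The file names preamble.tex and cppguide.html are placeholders, and the geometry settings are just an example of a tighter page layout; this assumes pandoc's default LaTeX template does not already load geometry with conflicting options.

```shell
#!/bin/sh
# Sketch: pass extra preamble lines to pandoc instead of editing the .tex file.
# File names are placeholders; the geometry options are only an example.
input="cppguide.html"
header="preamble.tex"

# Lines written here end up in the preamble of the generated LaTeX document.
cat > "$header" <<'EOF'
\usepackage[a4paper,margin=2cm]{geometry}
EOF

if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
    # -H (--include-in-header) inserts the contents of the file into the header.
    pandoc -s -H "$header" "$input" -o cppguide.pdf
else
    echo "pandoc or $input not found; only $header was written" >&2
fi
```

For margins alone, setting the template variable with `-V geometry:margin=2cm` is another option that avoids the extra file.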