I'm reproducing a book from its archive.org PDF using LaTeX. For QA purposes I'd like a side-by-side display of the original page and the LaTeX-typeset page, with the following requirements:
- SyncTeX works, so I can jump directly from the PDF to the source to fix issues (I use TeXstudio).
- Page numbers (i.e. the folio, not the PDF page number) are not changed.
- The typesetting is unaltered by the inclusion of the image.
However, every approach I've tried so far has one or more issues, so here I am looking for sage advice.
More info on the workflow
Generating the .tex source is largely automated using scripts. I've settled on having one pageXXX.tex per page, collected into directories by parts and chapters, which I can work on and typeset in isolation. Original line breaks are preserved by inserting \linebreak everywhere. Page breaks are implicitly handled by matching the page geometry and font sizes. Including some custom TeX code on each page is not a problem.
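For readers unfamiliar with this kind of setup, here is a minimal sketch of what one per-page file and its driver might look like. The file name page001.tex, its flat location (the real files sit in part/chapter directories) and the placeholder text are assumptions made for illustration only.

```latex
% Sketch only: file name and contents are placeholders.
% filecontents writes page001.tex next to this file on the first run.
\begin{filecontents}{page001.tex}
First line exactly as broken in the original,\linebreak
second line exactly as broken in the original,\linebreak
final line of the paragraph.
\end{filecontents}

\documentclass{book}
% Page geometry and font size would be set here to match the original,
% so that page breaks fall in the same places without explicit commands.
\begin{document}
\input{page001}
\end{document}
```

With placeholder lines this short, \linebreak stretches each line to the full measure and produces Underfull \hbox warnings; with real lines that match the original measure this should not happen.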
What I've tried (in the order I've tried them)
- Method #1: use pdftk to interleave pages from the original.
  - Advantages:
    - LaTeX typesetting is unaffected.
    - Simple to get working; no changes to the .tex files are needed.
  - Problems:
    - Breaks SyncTeX (as I recall) because the page numbers are changed.
    - Before correction, a single pageXXX.tex quite often spills over onto the next page. Every time that happens, all subsequent page images are shifted w.r.t. the typeset page number and appear at the wrong location.
- Method #2: call pdfpages in every pageXXX.tex and insert the original page after the typeset page.
  - Advantages:
    - The sync between page image and page is preserved because a specific page number is requested in every new pageXXX.tex file.
  - Problems:
    - Calling pdfpages forces a new paragraph, which alters the typesetting.
    - Sometimes results in ghost pages or orphans that otherwise wouldn't occur.
    - Strange interactions with lettrine: pageWithDropCap + pdfpages sometimes results in a drop-cap-shaped "hole" on the following page.
    - pdfpages modifies the page counter, which has to be corrected for by yet more brittle hacks.
- Method #3: double the page width and use the extra space for the image, using atbegshi + [absolute]textpos + \includegraphics.
  - Advantages:
    - Page numbers and typography are unmodified.
    - Each page asks for a specific image, so there are no sync drift issues if a page overflows.
  - Problems:
    - I'm including an \AtBeginShipoutNext at the top of every pageXXX.tex, but the actual shipout occurs when LaTeX decides to do it. It seems that sometimes that's after the last line of a pageXXX.tex, and sometimes only after looking at the first line of the next pageXXX.tex, without including that line in the page. The result is that the \AtBeginShipoutNext calls get squashed together and apply to a single page, with unfortunate results.
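To make Method #3 concrete, here is a minimal sketch of the kind of setup I mean. It is not my exact code: for brevity it places the image with atbegshi's own \AtBeginShipoutUpperLeft and a picture-mode \put instead of [absolute]textpos, and the helper name \scanpage, the file name original.pdf, the 6in x 9in original size and the margins are all assumptions. It assumes pdfLaTeX.

```latex
% Sketch only: paper size, margins and original.pdf are assumptions.
% oneside keeps the text block in the left half on every page.
\documentclass[oneside]{book}
% Paper twice as wide as an assumed 6in x 9in original.
\usepackage[paperwidth=12in,paperheight=9in,
            left=1in,top=1in,textwidth=4in,textheight=7in]{geometry}
% demo draws a black box instead of loading the file;
% drop the option once original.pdf exists.
\usepackage[demo]{graphicx}
\usepackage{atbegshi}
\usepackage{picture}  % lets \put take lengths instead of plain numbers

% \scanpage{N} (hypothetical helper): put page N of the scan on the
% right half of whichever page is shipped out next.
\newcommand{\scanpage}[1]{%
  \AtBeginShipoutNext{%
    \AtBeginShipoutUpperLeft{%
      % Origin is the upper-left corner of the paper; y grows upwards,
      % so the image's lower-left corner goes at (half width, -height).
      \put(\dimexpr 0.5\paperwidth\relax,\dimexpr -\paperheight\relax){%
        \includegraphics[page=#1,width=0.5\paperwidth,
                         height=\paperheight]{original.pdf}%
      }%
    }%
  }%
}

\begin{document}
% Contents of a hypothetical page012.tex:
\scanpage{12}%
First line of the page as broken in the original,\linebreak
second line, and so on.
\end{document}
```

Because \AtBeginShipoutNext only queues code for the next shipout, two \scanpage calls that are both read before a page is actually shipped will both fire on that same page, which is exactly the squashing problem described above.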
I'm open to any alternatives, but this last method should work perfectly if only I could ensure a (single) shipout at the end of each pageXXX.tex. I've tried manually including \pagebreak[4], but this sometimes results in extra blank pages, depending on whether LaTeX has already decided the page is full and shipped it or not.
I've also tried using the needspace package to improvise an "idempotent" page break, but things didn't seem to work as expected (spurious paragraph breaks, vertical spacing issues).
I've included as much detail as possible in hope that the information will be useful to others working on similar projects.

minipage and samepage may be helpful: http://tex.stackexchange.com/questions/30734/how-make-sure-two-elements-stay-on-the-same-page – touhami Jul 07 '15 at 22:28