I'm wondering if there's an easy way to have LaTeX output two lists:

  1. pages that contain figures and/or other color objects, and
  2. pages that do not.

I'd like to print the color pages on a color printer (expensive) and the rest of the pages on a B&W printer. Ideally I'd have one PDF and two txt files with CSV page lists that I can just paste into a print dialog.

If this can't easily be done, is there a way to modify the list of figures format so it just lists the page numbers, comma-separated? That way I could compile a temporary version of the PDF with that LOF and copy the numbers to a text file. I don't know how I'd then get the complement of that set of page numbers, though.

I saw this post which recommended doing it manually or using PDFpages, but it seemed a little inconclusive.

Update

Andrey has provided a nice solution below to output a CSV list. I now realize that for my thesis, the list is long enough that it won't fit in a print dialog, and my printing place says they prefer two separate PDF docs anyway.

So I wonder whether it is possible to integrate the pdfpages method demonstrated here, so that it takes the CSV lists produced by Andrey's method and uses them to create two additional PDF files, one with the color pages and one with the B&W pages. That method is as follows, where inputPDFfilename is the full PDF:

\documentclass{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages={3-6, 17, 28, 29-31}]{inputPDFfilename}
\end{document}

One tricky aspect is that pdfpages takes absolute page numbers. I'm not familiar enough with the packages and syntax in Andrey's method to modify it myself at present. It would be cool if this could be fully integrated into the same LaTeX project that creates my thesis.

SSilk

2 Answers

This solution hacks into the figure environment to output a list of color pages. You can also use \MarkColorPage to mark other places with color content. The data is read back on the next LaTeX pass and is used to construct the black-and-white list as well. The lists (with absolute page numbers) are then written to the files <jobname>.bwlist and <jobname>.colorlist. I had to add \clearpage at the end; otherwise the last page number was off by one.

Two passes are needed because a correct page number can (as far as I know) only be obtained at page shipout time, which is when \write and \iow_shipout_x:Nn expand their argument. If there were a way to similarly defer arbitrary code for execution at shipout, correct lists could be obtained in one pass.
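
To see the deferral in isolation, here is a minimal sketch in plain LaTeX2e. It is not part of the solution, and the stream name \pagelog and the .pagelog extension are made up for illustration. The \immediate\write expands \thepage as soon as the line is processed, while the plain \write is stored with the page material and only expanded when that page is shipped out, so the two can disagree for text that ends up near a page break.

```latex
\documentclass{article}
\newwrite\pagelog % hypothetical output stream, for illustration only
\immediate\openout\pagelog=\jobname.pagelog
\begin{document}
Some text on the first page.
% Expanded immediately, while TeX is still collecting material for the page:
\immediate\write\pagelog{immediate: \thepage}
% Stored as a whatsit and expanded only when the page is shipped out:
\write\pagelog{deferred: \thepage}
\clearpage % force the shipout so the deferred write happens before closing
\immediate\closeout\pagelog
\end{document}
```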

(This is my very first attempt at programming with LaTeX3. Comments are highly appreciated!)

\documentclass{article}

\usepackage{expl3,xparse}
\usepackage{atbegshi}

\ExplSyntaxOn
\clist_new:N \g_lp_bw_clist
\clist_new:N \g_lp_color_clist
\int_new:N   \g_lp_page_int
\iow_new:N   \g_lp_stream

\cs_new:Nn \lp_add_bw_page:n {
  \clist_gput_right:Nx \g_lp_bw_clist {#1}
}
\cs_new:Nn \lp_add_color_page:n {
  \clist_gput_right:Nx \g_lp_color_clist {#1}
}
\cs_new:Nn \lp_test_page: {
  \int_gincr:N \g_lp_page_int
  \exp_args:NNx \clist_if_in:NnF \g_lp_color_clist { \int_use:N \g_lp_page_int } {
    \lp_add_bw_page:n { \int_use:N \g_lp_page_int }
  }
}
\cs_new:Nn \lp_write_list:Nn {
  \iow_open:Nn \g_lp_stream { \str_use:N \c_sys_jobname_str .#2 }
  \iow_now:Nx  \g_lp_stream { \clist_use:Nn #1 { , } }
  \iow_close:N \g_lp_stream
}
\cs_new:Nn \lp_write_lists: {
  \clist_gremove_duplicates:N \g_lp_color_clist
  \lp_write_list:Nn \g_lp_bw_clist    { bwlist }
  \lp_write_list:Nn \g_lp_color_clist { colorlist }
}
\cs_new:Nn \lp_mark_color_page: {
  \iow_shipout_x:Nn \g_lp_stream {
    \exp_not:N \lp_add_color_page:n { \int_use:N \g_lp_page_int }
  }
}

\AtBeginDocument{
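  % Read the marks recorded on the previous run, if any. The file holds
  % expl3 code, so it has to be read with expl3 catcodes in force.
  \ExplSyntaxOn
  \file_if_exist:nT { \str_use:N \c_sys_jobname_str .clp } {
    \file_input:n { \str_use:N \c_sys_jobname_str .clp }
  }
  \ExplSyntaxOff
  \iow_open:Nn \g_lp_stream { \str_use:N \c_sys_jobname_str .clp }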
}
\AtBeginShipout{
  \lp_test_page:
}
\AtEndDocument{
  \clearpage
  \iow_close:N \g_lp_stream
  \lp_write_lists:
}
\cs_set_eq:NN \MarkColorPage \lp_mark_color_page:

\char_set_catcode_letter:N @
\RenewDocumentEnvironment { figure } { o } {
  \IfNoValueTF {#1} {
    \@float { figure }
  }{
    \@float { figure } [#1]
  }
  \lp_mark_color_page:
}{
  \end@float
}
\char_set_catcode_other:N @
\ExplSyntaxOff

\begin{document}

\listoffigures

\clearpage
\begin{figure}
\caption{Test}
\end{figure}

\clearpage
\begin{figure}[!htbp]
\caption{Test}
\end{figure}

\clearpage
Hello World

\end{document}
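
For color content outside a figure, \MarkColorPage goes right next to the colored material. A small sketch of the document body, assuming the preamble above plus \usepackage{xcolor} (xcolor is my addition here, only so that there is something colored to mark):

```latex
% In the preamble, in addition to the code above:
% \usepackage{xcolor}

\clearpage
% \MarkColorPage records, at shipout time, the page this spot ends up on.
Some black text, then \textcolor{red}{a red remark}\MarkColorPage{}
and more black text.
```

After two LaTeX runs, the page carrying the red remark should show up in <jobname>.colorlist instead of <jobname>.bwlist.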

To create separate PDFs, use this document code after the page lists have been produced:

\documentclass{article}

\usepackage{pdfpages}

\newread\pagein
\openin\pagein=thesis.\listtype
\read\pagein to \pages
\closein\pagein

\begin{document}

\edef\optarg{[pages={\pages}]}
\expandafter\includepdf\optarg{thesis.pdf}

\end{document}
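
If the list file is missing (for example because the thesis has not been compiled yet), \openin fails silently and \read then tries to read the list from the terminal. A possible guard, as a sketch only: it would replace the four lines from \newread to \closein above, and the fallback to pages=- (all pages) is my own choice, not part of the original answer.

```latex
\newread\pagein
\openin\pagein=thesis.\listtype\relax
\ifeof\pagein
  % \openin failed: warn and fall back to all pages (an assumption; adjust as needed)
  \typeout{Warning: thesis.\listtype\space not found, including all pages}
  \def\pages{-}
\else
  \read\pagein to \pages
\fi
\closein\pagein
```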

\listtype should be either bwlist or colorlist. You can either create two copies of the document and replace \listtype with the appropriate name in each one, or define it on the command line (see this question for details):

pdflatex '\def\listtype{bwlist} \input{something.tex}'
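
A third option, sketched here with made-up file names, is a pair of two-line wrapper files. This avoids shell quoting altogether, which may be convenient on Windows, where cmd.exe does not treat single quotes as quoting characters. It assumes the splitting document above is saved as split.tex.

```latex
% --- file thesis-color.tex (hypothetical name), produces thesis-color.pdf ---
\def\listtype{colorlist}
\input{split.tex}

% --- file thesis-bw.tex (hypothetical name), produces thesis-bw.pdf ---
\def\listtype{bwlist}
\input{split.tex}
```

Each wrapper is compiled on its own (pdflatex thesis-color.tex, then pdflatex thesis-bw.tex), after the thesis itself has been compiled twice so that the page lists are up to date.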
Andrey Vihrov
  • Rather than the \bool_while_do construct with \tmp_counter you can use \prg_stepwise_inline:nnnn {#1} {1} {#2-1} { \clist_gput_right:Nx \bwlist {##1} }. The convention for variables is to start with the scope (l, g or c), then the package name if any, then the name of the variable itself, and finally the type. For instance, \g_fig_color_clist, \g_fig_bw_clist, \g_fig_last_page_int, \g_fig_stream, and \l_fig_tmpa_int if you had needed that counter. – Bruno Le Floch Aug 25 '11 at 10:38
  • @Bruno Le Floch: Thanks. I completely forgot about naming conventions. – Andrey Vihrov Aug 25 '11 at 11:49
  • @Andrey: Thanks for the response! Unfortunately, this doesn't work very well for me. I just tried it with my thesis, and it sort of seems to do what it's supposed to, except the color pages list only picks up probably 50% of the pages with figures, and it occasionally picks up a page that doesn't have a figure or any color stuff on it. Any thoughts? My thesis is too big to post an example, obviously. Thanks. – SSilk Aug 26 '11 at 02:31
  • @SSilk: The method I used to obtain page numbers was not reliable. I now took a different approach. It's a bit slower and is not so extensible, but all pages are considered, including roman-numbered ones etc. – Andrey Vihrov Aug 26 '11 at 10:45
  • @Andrey: This works nicely now. I just have a further question, regarding whether this can be integrated with pdfpages to automatically create two separate PDFs, one color and one B&W, from the same project that creates my thesis. Could you look at the update to my question and let me know if that's an easy modification to your code? If it's not, I will accept your answer and start a new question for that modification. Thanks. – SSilk Aug 28 '11 at 15:30
  • @Andrey: Is it possible to maybe use zref or something like that to get the absolute values instead? I had an earlier question asking about absolute page numbers that might be helpful. http://tex.stackexchange.com/questions/24236/how-to-find-absolute-page-number-as-an-integer – SSilk Aug 30 '11 at 21:32
  • @SSilk: Do you mind a solution that uses Unix/POSIX utilities? – Andrey Vihrov Aug 31 '11 at 10:57
  • @Andrey: I don't know much about that stuff. I'm compiling my thesis under Windows using MikTeX and TeXnic center, so a solution that integrates reasonably with those items would be preferable. I don't mind running command line stuff though, e.g. if a simple batch file can be run after compiling my thesis that would then split it into color & BW files. Thanks. – SSilk Aug 31 '11 at 19:43
  • @SSilk: Test it now. – Andrey Vihrov Sep 01 '11 at 13:22
  • @Andrey: This seems to be outputting color pages, but has a bug. When I use it with my thesis, everywhere I've put figure placement tags (e.g. !htbp) gets turned into text, i.e. the tag shows up in the PDF. As a result, some of my floats are now too big for their pages (giving me warnings), and overall the layout has shifted figures around to different pages, so while the pagelist output is correct, it's the correct output for this new and wrong layout. Any thoughts? See next comment for warnings details (one is related to your code). – SSilk Sep 01 '11 at 16:24
  • @Andrey: When I compile with this version, I get a warning xparse warning: "redefine-environment" with arg. spec.. This warning points me to the last closing brace (third last line) of the preamble portion of your code. This is followed by several float too large warnings that I don't get without your code. Thanks. – SSilk Sep 01 '11 at 16:26
  • @SSilk: I hope I fixed the bug now. The warning is still there, but it should not cause issues. Let's see if I can get this working any time soon :-) – Andrey Vihrov Sep 01 '11 at 18:59
  • @Andrey: That seems to do it, at least for the color pages. I didn't check the B&W ones carefully, but I assume they're good too. I'm running it from a batch file manually since I realize I don't want this running every time I compile my thesis. Thanks! – SSilk Sep 02 '11 at 01:12
  • @Andrey: I recently updated MikTeX, and this solution stopped working for me. It now gives me a warning Undefined Control Sequence \iow_new:N at the line \iow_new:N \g_lp_out_stream. When I first tried your solution, I was running MikTeX 2.8. I then upgraded to 2.9, and I think it built fine then, but more recently I just updated some packages for 2.9, and it has stopped. Do you have any thoughts on this? Thanks. – SSilk Sep 27 '11 at 01:33
  • @SSilk: The LaTeX3 bundle was recently updated. I changed the code to work with the new version. – Andrey Vihrov Sep 28 '11 at 05:46
  • Note that \c_job_name_tl has become a string, so you should do \str_use:N \c_sys_jobname_str – egreg Sep 30 '15 at 09:47
You could do this on the PDF directly without involving LaTeX. See e.g. Split a PDF into separate files containing colour and B&W pages.

Alan Munn
hiwk