
I am trying to use pdfpages and maintain the internal bookmarks within the included PDF files. The pdfpages README suggests using the pax package for this purpose, so I have installed that from CTAN and refreshed my database (MiKTeX 2.9).

I am running Windows 7 (64-bit) and have installed JRE and JDK (in that order) and Strawberry Perl (to folder C:\StrawberryPerl\).

I downloaded PDFBox version 0.7.3 (which is supposed to be compatible with pax) from http://sourceforge.net/projects/pdfbox/files/ and installed it to C:\PDFBox.

Then I added C:\PDFBox\ and C:\MiKTeX\scripts\pax\ to my system Path variable and rebooted.

Then I installed pdfannotextractor.pl using the command line:

perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl --install

with the following result:

C:\>perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl --install
PDFAnnotExtractor 0.1l, 2012/04/18 - Copyright (c) 2008, 2011, 2012 by Heiko Oberdiek.
* Nothing to do, because PDFBox is already found:
  C:\PDFBox

C:\>

So PDFBox seems to be installed satisfactorily. However, when I try to run the pax script using the following command:

java -jar C:\MiKTeX\scripts\pax\pax.jar FileWithBookmarks.pdf

I get this result:

Exception in thread "main" java.lang.NoClassDefFoundError: org/pdfbox/cos/ICOSVisitor
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
    at java.lang.Class.getMethod0(Unknown Source)
    at java.lang.Class.getMethod(Unknown Source)
    at sun.launcher.LauncherHelper.getMainMethod(Unknown Source)
    at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)
Caused by: java.lang.ClassNotFoundException: org.pdfbox.cos.ICOSVisitor
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    ... 6 more

If instead I use the following command:

perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl FileWithBookmarks.pdf

I get the same java exception as above.

Can anyone help?

UPDATE: After adding C:\PDFBox\ to my CLASSPATH, here is my command and the debugging results:

C:\>perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl --debug FileWithBookmarks.pdf
PDFAnnotExtractor 0.1l, 2012/04/18 - Copyright (c) 2008, 2011, 2012 by Heiko Oberdiek.
* CLASSPATH: [.;C:\Program Files (x86)\Java\jre6\lib\ext\QTJava.zip;C:\PDFBox\]
* is_win: [1]
* Which kpsewhich: [C:\MiKTeX\miktex\bin\kpsewhich.EXE]
* Backticks: [kpsewhich --progname pdfannotextractor --format texmfscripts pax.jar]
* Exit code: [0/success]
* pax.jar: [C:/MiKTeX/scripts/pax/pax.jar]
* PDFBox in CLASSPATH: [yes]
* Which java: [C:\Windows\system32\java.EXE]
* System: [java -cp C:/MiKTeX/scripts/pax/pax.jar;C:\PDFBox;.;C:\Program Files (x86)\Java\jre6\lib\ext\QTJava.zip;C:\PDFBox\ pax.PDFAnnotExtractor FileWithBookmarks.pdf]
Usage: java [-options] class [args...]
       (to execute a class)
or  java [-options] -jar jarfile [args...]
       (to execute a jar file)
where options include:
-d32          use a 32-bit data model if available
-d64          use a 64-bit data model if available
-server       to select the "server" VM
-hotspot      is a synonym for the "server" VM  [deprecated]
              The default VM is server.

-cp <class search path of directories and zip/jar files>
-classpath <class search path of directories and zip/jar files>
              A ; separated list of directories, JAR archives,
              and ZIP archives to search for class files.
-D<name>=<value>
              set a system property
-verbose[:class|gc|jni]
              enable verbose output
-version      print product version and exit
-version:<value>
              require the specified version to run
-showversion  print product version and continue
-jre-restrict-search | -no-jre-restrict-search
              include/exclude user private JREs in the version search
-? -help      print this help message
-X            print help on non-standard options
-ea[:<packagename>...|:<classname>]
-enableassertions[:<packagename>...|:<classname>]
              enable assertions with specified granularity
-da[:<packagename>...|:<classname>]
-disableassertions[:<packagename>...|:<classname>]
              disable assertions with specified granularity
-esa | -enablesystemassertions
              enable system assertions
-dsa | -disablesystemassertions
              disable system assertions
-agentlib:<libname>[=<options>]
              load native agent library <libname>, e.g. -agentlib:hprof
              see also, -agentlib:jdwp=help and -agentlib:hprof=help
-agentpath:<pathname>[=<options>]
              load native agent library by full pathname
-javaagent:<jarpath>[=<options>]
              load Java programming language agent, see java.lang.instrument
-splash:<imagepath>
              show splash screen with specified image
See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.
* Exit code: [1]

C:\>
Speravir
Brian
  • Judging from the error messages, you need to add at least one of the PDFBox jar files to Java's classpath; which one, I don't know right now. A side note: there was no need to install pax manually from CTAN; you could have used the MiKTeX Package Manager as well. – Speravir Oct 24 '12 at 02:17
  • Please add option --debug, it does give more clues. – Heiko Oberdiek Oct 24 '12 at 13:48
  • See --debug results above (I can't paste them here for some reason). – Brian Oct 24 '12 at 18:19
  • Brian, are the errors gone or not? The classpath in java -cp C:/MiKTeX/scripts/pax/pax.jar;C:\PDFBox;.;C:\Program Files (x86)\Java\jre6\lib\ext\QTJava.zip;C:\PDFBox\ ... looks dubious. I suppose the spaces caused the problems. (As you can see, C:\PDFBox\ had already been added to the classpath by pax, I think, but it was probably not picked up.) Also, you need to address Heiko (or me) the way I did in my comment below. – Speravir Oct 24 '12 at 22:04
  • @HeikoOberdiek: I just want to point you to Brian’s edit. – Speravir Oct 24 '12 at 22:05
  • @Speravir: Thanks, but I'm still getting the same java error. I uninstalled everything Java and reinstalled the JDK and JRE to paths without spaces; and then rebooted my computer. After all that I'm still getting the same java error when I run perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl --debug FileWithBookmarks.pdf (but the java -cp part no longer has the duplicate C:\PDFBox\ references). – Brian Oct 25 '12 at 01:11

3 Answers


In addition to Heiko’s answer and just for convenience (Windows only):

Create a file pax.bat (or pax.cmd, or whatever name you prefer instead of pax) in the bin subfolder of your local texmf tree. Under MiKTeX you may first need to create such a tree; see "Create a local texmf tree in MiKTeX".

The preferred variant is to run the Perl script (this requires an installed Perl distribution):

Edit pax.bat and adjust the paths to your setup:

@echo off
SETLOCAL

set CLASSPATH=C:\PDFBox\lib\PDFBox-0.7.3.jar;%CLASSPATH%

perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl %*

You can even leave out the set CLASSPATH line if you create the folder <localtexmf>\scripts\pax\lib, put PDFBox-0.7.3.jar in it, and refresh the filename database (FNDB).

Then, on the Command Prompt, you can call pax FileWithBookmarks.pdf or pax --debug FileWithBookmarks.pdf > paxdebug.log. This assumes that there is no other pax.exe or similar on the system path; otherwise always make the call with the full name, pax.bat ....

Executing java directly is a bit more complicated:

Again, edit pax.bat and adjust the paths to your setup:

@echo off
SETLOCAL

set CLASSPATH=C:\PDFBox\lib\PDFBox-0.7.3.jar;C:\MiKTeX\scripts\pax\pax.jar;%CLASSPATH%

java pax.PDFAnnotExtractor %*

Note that pax.jar has been added to the classpath here. I prefer to set the environment variable CLASSPATH, but the command line option -classpath (or -cp for short) works as well, as shown by Heiko.
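For readers who would rather script the direct java call than keep a batch file, here is a minimal sketch in Python. It is not part of pax: the names build_pax_command and run_pax are made up for illustration, and the paths you pass in have to match your own installation. The command is built as an argument list, so spaces in paths need no extra quoting.

```python
import os
import subprocess


def build_pax_command(pdf_file, pax_jar, pdfbox_jar, sep=os.pathsep):
    """Build the java call as an argument list (hypothetical helper).

    sep is the classpath separator: ';' on Windows, ':' on Linux/Unix.
    """
    classpath = sep.join([pax_jar, pdfbox_jar])
    return ["java", "-cp", classpath, "pax.PDFAnnotExtractor", pdf_file]


def run_pax(pdf_file, pax_jar, pdfbox_jar):
    """Run the extractor without a shell and return java's exit code."""
    return subprocess.call(build_pax_command(pdf_file, pax_jar, pdfbox_jar))
```

With the paths used in this thread, the call would look like run_pax("FileWithBookmarks.pdf", r"C:\MiKTeX\scripts\pax\pax.jar", r"C:\PDFBox\lib\PDFBox-0.7.3.jar").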

Speravir

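Unless you have an unpacked PDFBox in C:\PDFBox, the CLASSPATH is wrong. A directory entry only works for unpacked class files; for a packaged PDFBox the .jar file itself must be listed, for example C:\PDFBox\PDFBox-0.7.3.jar (adjust the path to wherever the .jar sits in your installation).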

Neither C:\PDFBox\ nor C:\MiKTeX\scripts\pax\ needs to be added to the system Path variable.

The spaces in the argument for java's option -cp should not be a problem, because the Perl script uses the array form of function system. But it can be tested:

java -cp "C:\MiKTeX\scripts\pax\pax.jar;C:\PDFBox\PDFBox-0.7.3.jar" pax.PDFAnnotExtractor FileWithBookmarks.pdf

Remarks:

  • In Linux/Unix the path separator : is used instead of ;.
  • Project pax does not support newer versions of PDFBox. The supported versions are 0.7.2 and 0.7.3.
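Whether a given classpath entry really provides the class from the error message can also be checked mechanically. The following is only a rough Python sketch, not something that ships with pax: the helper name find_class_providers is invented for this example, and it handles plain directories and .jar/.zip archives only (no wildcard entries).

```python
import os
import zipfile

# Class named in the NoClassDefFoundError from the question.
MISSING_CLASS = "org/pdfbox/cos/ICOSVisitor.class"


def find_class_providers(classpath, class_file=MISSING_CLASS, sep=os.pathsep):
    """Return the classpath entries that contain class_file.

    Hypothetical helper for illustration only.
    """
    providers = []
    for entry in classpath.split(sep):
        if not entry:
            continue
        if os.path.isdir(entry):
            # A directory helps only if the unpacked class tree is inside it.
            if os.path.isfile(os.path.join(entry, *class_file.split("/"))):
                providers.append(entry)
        elif zipfile.is_zipfile(entry):
            # A .jar is a ZIP archive; look the class up by its archive name.
            with zipfile.ZipFile(entry) as jar:
                if class_file in jar.namelist():
                    providers.append(entry)
    return providers
```

Calling find_class_providers(os.environ.get("CLASSPATH", "")) returns an empty list when no entry holds the class, which is the situation that would be expected if only the C:\PDFBox directory is listed and the .jar sits somewhere below it.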
Heiko Oberdiek
  • I'm still getting the same java error when I use the Perl wrapper, but I can now successfully bypass the wrapper and execute the command @HeikoOberdiek provided above (with lib added in the path): java -cp "C:\MiKTeX\scripts\pax\pax.jar;C:\PDFBox\lib\PDFBox-0.7.3.jar" pax.PDFAnnotExtractor FileWithBookmarks.pdf, which generates the expected FileWithBookmarks.pax file. Thanks to @Speravir and @HeikoOberdiek for their helpful responses. – Brian Oct 25 '12 at 12:56
  • @alfC: I have updated the answer to add the requirement for special versions of PDFBox and cleaned up the comments. – Heiko Oberdiek Jul 21 '13 at 22:21

This installation tutorial relies on chocolatey, because it makes installation much easier.

Preparation steps

To install chocolatey, open a cmd prompt with administrative privileges and run:

@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"

See https://chocolatey.org/install for details. Afterwards, follow these steps:

MiKTeX

  • Install pax using the MiKTeX package manager to C:\MiKTeX.
  • Execute perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl --install to download a PDFBox version that is compatible with pax.
  • Ignore the error regarding "MiKTeX Configuration Utility"
  • Start "MiKTeX Settings"
  • Click on "Refresh FNDB"
  • Click on "Update Formats"
  • Now, pdfannotextractor.pl is ready to go

TeX Live

  • Execute mkdir "C:/Users/USERNAME/.texlive2018"
  • Execute perl C:\texlive\2018\texmf-dist\scripts\pax\pdfannotextractor.pl --install

If the first command did not succeed, you'll see something like the following:

> perl C:\texlive\2018\texmf-dist\scripts\pax\pdfannotextractor.pl --install
PDFAnnotExtractor 0.1l, 2012/04/18 - Copyright (c) 2008, 2011, 2012 by Heiko Oberdiek.
!!! Error: Cannot create directory `C:/Users/Oliver/.texlive2018/texmf-var'!
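The directory step can also be scripted so that it is safe to repeat. This is a small Python sketch under the assumption that the directory in question is .texlive2018 in the user's home folder, as in the mkdir command above; the function name ensure_texlive_user_dir is made up for this example.

```python
import os


def ensure_texlive_user_dir(home, year="2018"):
    """Create <home>/.texlive<year> if it is missing and return its path.

    Hypothetical helper; the directory name follows the mkdir step above.
    """
    path = os.path.join(home, ".texlive" + year)
    os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path
```

For the current user this would be called as ensure_texlive_user_dir(os.path.expanduser("~")) before running the --install command.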

(Inspiration by https://tex.stackexchange.com/a/44104/9075)

koppor