Alright, here's a go at numbering lines in a PDF (or any other image format) without access to the source.
I wrote a little shell script that uses ImageMagick (at least version 6.6.9-4) to find the vertical position of every text line. It converts a given PDF into separate raster images for each page and splits each of these into a left and a right half. Each half is then shrunk to a width of one pixel (so basically the horizontal average) and a height of 1000 pixels, and turned into a monochrome image with a given threshold (text vs. no text). Every run of consecutive text pixels is reduced to a single pixel (the middle of a line). The result is output as text and piped to sed, which cleans it up and removes all the non-text rows. Finally, a txt file is written with the position of each line in thousandths of the page height.
findlines.sh:
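#!/bin/sh
# Split each page of $1.pdf into a left and a right half: $1-0, $1-1, ...
convert "$1.pdf" -crop 50x100% "png:$1"
for f in "$1"-*; do
    # Shrink to a 1x1000 column, threshold, thin each text line to one pixel, then keep only the row index
    convert "$f" -flatten -resize '1X1000!' -black-threshold 99% -white-threshold 10% -negate -morphology Erode Diamond -morphology Thinning:-1 Skeleton -black-threshold 50% txt:- |
        sed -e '1d' -e '/#000000/d' -e 's/^[^,]*,//' -e 's/[(]//g' -e 's/:.*//' -e 's/,/ /g' > "$f.txt"
done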
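To see what the sed stage of the script does on its own, here is a minimal sketch that runs the same sed expressions on a made-up pixel enumeration. The helper name clean_pixels and the six sample rows are assumptions for illustration; the exact header and colour columns that ImageMagick prints differ between versions.

```shell
#!/bin/sh
# clean_pixels is a hypothetical helper name; the sed expressions are the
# ones used in findlines.sh.
clean_pixels() {
    sed -e '1d' \
        -e '/#000000/d' \
        -e 's/^[^,]*,//' \
        -e 's/[(]//g' \
        -e 's/:.*//' \
        -e 's/,/ /g'
}

# Made-up enumeration of a 1x6 image (illustrative, not real ImageMagick output).
# White rows stand for detected text lines, black rows for everything else.
sample='# ImageMagick pixel enumeration: 1,6,255,srgb
0,0: (0,0,0)  #000000  black
0,1: (0,0,0)  #000000  black
0,2: (255,255,255)  #FFFFFF  white
0,3: (0,0,0)  #000000  black
0,4: (0,0,0)  #000000  black
0,5: (255,255,255)  #FFFFFF  white'

# Prints the row index of every non-black pixel, one per line.
printf '%s\n' "$sample" | clean_pixels
```

The header is dropped by '1d', black rows by '/#000000/d', and the remaining substitutions strip everything except the y coordinate, which is what ends up in the .txt file.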
Running the script takes about 1 second per page and produces a set of files named basename-<number>.txt. Numbering starts at 0, so even <numbers> contain the line positions for the left column and odd <numbers> those for the right column (the macro below reads file (page-1)*2 for the left and (page-1)*2+1 for the right). These files can then be read by pgfplotstable (at least v1.4) and used to typeset the line numbers on top of the imported PDF. I defined a command that takes five arguments: the page number and four line numbers. The four line numbers tell the macro at which "raw" line numbers the "real" text lines start and end in the left and right column. Setting \pgfkeys{print raw line numbers=true} shows the raw line numbers as found by the algorithm in red.
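As a quick way to check which files belong to which page, the following sketch reproduces the index arithmetic that the macro below uses ((page-1)*2 for the left half, (page-1)*2+1 for the right half). The function name halves_for_page is made up for this example and is not part of findlines.sh.

```shell
#!/bin/sh
# halves_for_page BASENAME PAGE
# Hypothetical helper: prints the two .txt files for a 1-based page number,
# assuming the zero-based numbering of the half-page images.
halves_for_page() {
    base=$1
    page=$2
    left=$(( (page - 1) * 2 ))
    right=$(( left + 1 ))
    printf '%s-%s.txt %s-%s.txt\n' "$base" "$left" "$base" "$right"
}

halves_for_page article 1
halves_for_page article 3
```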
\documentclass{article}
\usepackage{tikz}
\usepackage{pgfplotstable}
\newif\ifprintrawlinenumbers
\pgfkeys{print raw line numbers/.is if=printrawlinenumbers,
print raw line numbers=true}
\newcommand{\addlinenumbers}[5]{
\pgfmathtruncatemacro{\leftnumber}{(#1-1)*2}
\pgfmathtruncatemacro{\rightnumber}{(#1-1)*2+1}
\pgfplotstableread{\pdfname-\leftnumber.txt}\leftlines
\pgfplotstableread{\pdfname-\rightnumber.txt}\rightlines
\begin{tikzpicture}[font=\tiny,anchor=east]
\node[anchor=south west,inner sep=0] (image) at (0,0) {\includegraphics[width=14cm,page=#1]{\pdfname.pdf}};
\begin{scope}[x={(image.south east)},y={(image.north west)}]
\pgfplotstableforeachcolumnelement{[index] 0}\of\leftlines\as\position{
\ifprintrawlinenumbers
\node [font=\tiny,red] at (0.04,1-\position/1000) {\pgfplotstablerow};
\fi
\pgfmathtruncatemacro{\checkexcluded}{
(\pgfplotstablerow>=#2 && \pgfplotstablerow<=#3) ? 1 : 0
}
\ifnum\checkexcluded=1
\pgfmathtruncatemacro\linenumber{\pgfplotstablerow-#2+1}
\node [font=\tiny,align=right,anchor=east] at (0.08,1-\position/1000) {\linenumber};
\fi
}
\pgfplotstablegetrowsof{\leftlines}
\pgfmathtruncatemacro\rightstart{min((\pgfplotsretval-#2),(#3-#2+1))}
\pgfplotstableforeachcolumnelement{[index] 0}\of\rightlines\as\position{
\ifprintrawlinenumbers
\node [font=\tiny,red,anchor=east] at (1.0,1-\position/1000) {\pgfplotstablerow};
\fi
\pgfmathtruncatemacro{\checkexcluded}{
(\pgfplotstablerow>=#4 && \pgfplotstablerow<=#5) ? 1 : 0
}
\ifnum\checkexcluded=1
\pgfmathtruncatemacro\linenumber{\pgfplotstablerow-#4+\rightstart+1}
\node [font=\tiny] at (0.96,1-\position/1000) {\linenumber};
\fi
}
\end{scope}
\end{tikzpicture}
}
\begin{document}
\def\pdfname{article}
\addlinenumbers{1}{20}{50}{2}{65}
\pgfkeys{print raw line numbers=false}
\addlinenumbers{2}{0}{69}{0}{64}
\addlinenumbers{3}{19}{47}{21}{48}
\end{document}
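The way raw line numbers are turned into printed line numbers can be checked outside LaTeX. The sketch below mirrors the arithmetic of \addlinenumbers in plain shell; the function names are made up for this example, and the row count of 60 for the left column is an assumed value, not something taken from the example PDF.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the arithmetic inside \addlinenumbers.

# left_number RAW START END: printed number for a raw row in the left
# column, or nothing if the row lies outside START..END.
left_number() {
    raw=$1; start=$2; end=$3
    if [ "$raw" -ge "$start" ] && [ "$raw" -le "$end" ]; then
        echo $(( raw - start + 1 ))
    fi
}

# right_start ROWS START END: number of lines printed in the left column,
# i.e. min(ROWS - START, END - START + 1), as in the macro.
right_start() {
    rows=$1; start=$2; end=$3
    a=$(( rows - start ))
    b=$(( end - start + 1 ))
    if [ "$a" -lt "$b" ]; then echo "$a"; else echo "$b"; fi
}

# right_number RAW START END OFFSET: printed number for a raw row in the
# right column, continuing after the left column.
right_number() {
    raw=$1; start=$2; end=$3; offset=$4
    if [ "$raw" -ge "$start" ] && [ "$raw" -le "$end" ]; then
        echo $(( raw - start + offset + 1 ))
    fi
}

# Arguments of \addlinenumbers{1}{20}{50}{2}{65}, with an assumed 60 raw rows
# in the left column.
offset=$(right_start 60 20 50)
echo "first left line:  $(left_number 20 20 50)"
echo "last left line:   $(left_number 50 20 50)"
echo "first right line: $(right_number 2 2 65 "$offset")"
```

With these assumed values the left column is numbered 1 to 31 and the right column continues at 32.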
As a proof of concept, here's the output for the first two pages of an article from the Environmental Science & Technology Journal. I think it works really well. I haven't been able to call findlines.sh from within LaTeX, though, so this step has to be performed manually before compiling the .tex file.


[…] -morphology operator needed to reduce sequences of consecutive white pixels down to one). Are you sure that you're using GraphicsMagick, though? On my system, I call GraphicsMagick using gm convert (instead of just convert). What output do you get when you call convert -version? – Jake May 23 '11 at 16:05

[…] \listfiles shows my pgfplots and TikZ versions to be pgfplots.sty 2010/07/14 Version 1.4.1 (git show 1.4.1-1-g64c9e95 ), tikz.sty 2010/10/13 v2.10 (rcs-revision 1.76). – Jake Jun 01 '11 at 04:15

[…] findlines.sh script so that it's more accurate? Here's the example PDF I used. (I'm reviewing papers for a conference that uses the same template, this is my submission.) – Joe Corneli Mar 15 '16 at 22:00

[…] sed -e '/black/d' was looking for), but rather as numerical values with transparency (graya(0,0,0,1)). I believe this might be due to changes in recent versions of ImageMagick. I've edited the answer to avoid the problem by looking for #000000 instead. I've tested this on your file using ImageMagick 6.7.7-10, and it seems to work fine. I'd be grateful if you could let me know whether it works for you. – Jake Mar 16 '16 at 18:21

[…] \addlinenumbers command per page? If that's not the problem, it would be good if you could open a new question and provide an example PDF and the version number of your ImageMagick program. About the scaling, I'll try to come up with a way of keeping the PDF at the original size. I'll let you know when I update the answer. – Jake May 25 '16 at 17:59

[…] line 4: -black-threshold: command not found convert: Skeleton @ error/convert.c/ConvertImageCommand/3339. and only empty .txt files – Arne Jun 30 '23 at 08:59

[…] Version: ImageMagick 7.1.1-12 Q16-HDRI x86_64 21239 – Arne Jul 03 '23 at 12:07