9

Is there any tool that compares two DVI files? I would like to check whether they are identical after I make some changes to the source TeX file.

Baranas
  • Any special reason you want to work on the dvi file directly and not convert them to ps and then to pdf so that you can employ tools like diffpdf or pdftotext, depending on whether you're interested in the text content or the layout? – Christian Feb 15 '13 at 11:05
  • Not quite an answer to the question, but might be useful to anyone with this question: to avoid the timestamp annoyance, you can make TeX think its run started at some fixed time, using a program like faketime. For example, run faketime '2008-12-24 08:15:42' tex foo.tex or faketime '2008-12-24 08:15:42' pdftex --output-format=dvi foo.tex – ShreevatsaR Nov 21 '19 at 21:03

4 Answers

10

TeX adds a comment into the DVI file, e.g.:

 TeX output 2013.02.15:1010

It contains the date and time that complicates the comparison of the DVI files:

  • The time stamp can be set explicitly by setting \year, \month, \day, and \time. If both runs use the same values, the time stamps in the two DVI files are the same and you can compare the files with the usual programs (diff, comp, …).

  • The comment has a fixed location in the preamble of the DVI file format. The fifteenth byte contains the length of the comment that follows. In the standard case (a 27-byte comment such as the one above, so 15 + 27 = 42), just ignore the first 42 bytes of the file. The values for numerator, denominator and magnification that are skipped this way are repeated in the postamble anyway. On Linux the stripping can be done with tail by setting option -c to the file length minus 42.

  • Another possibility is dvitype, which outputs the DVI contents in a more human-readable form; the eighth line, which holds the DVI comment and timestamp, can be ignored in the comparison.
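
The byte-level approach from the second bullet can be sketched in Python. This is only a minimal sketch: the names strip_dvi_comment and same_dvi are made up for illustration, and it assumes both files start with a well-formed DVI preamble (opcode 247, comment length in the fifteenth byte). Instead of hard-coding 42 bytes, it reads the comment length from the file.

```python
PRE = 247  # DVI opcode that starts the preamble


def strip_dvi_comment(data: bytes) -> bytes:
    """Return the DVI bytes without the preamble comment and its length byte."""
    if len(data) < 15 or data[0] != PRE:
        raise ValueError("not a DVI file: missing preamble")
    k = data[14]  # fifteenth byte: length of the comment that follows
    if len(data) < 15 + k:
        raise ValueError("truncated DVI preamble")
    # Keep opcode, id, numerator, denominator and magnification (14 bytes);
    # drop the length byte and the k comment bytes.
    return data[:14] + data[15 + k:]


def same_dvi(path_a: str, path_b: str) -> bool:
    """Compare two DVI files, ignoring only the preamble comment."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return strip_dvi_comment(fa.read()) == strip_dvi_comment(fb.read())
```

A call such as same_dvi("before.dvi", "after.dvi") (hypothetical file names) would then report whether the two runs differ in anything but the comment. If the two comments have different lengths, the byte offsets stored later in the file shift as well, so the files will probably compare as different even when the pages are the same.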

ShreevatsaR
Heiko Oberdiek
9

You can use dvitype, which converts a DVI file to "human readable" form. The start of the output is

This is DVItype, Version 3.6 (TeX Live 2012)
Options selected:
  Starting page = * 
  Maximum number of pages = 1000000
  Output level = 4 (the works)
  Resolution = 300.00000000 pixels per inch
numerator/denominator=25400000/473628672
magnification=1000;       0.00006334 pixels per DVI unit
' TeX output 2013.02.15:1014'

and so you should ignore the lines up to and including the last line shown, because its time stamp will almost certainly differ.

The rest of the file will contain something like

Postamble starts at byte 353.
maxv=41484288, maxh=26673152, maxstackdepth=13, totalpages=1
Font 14: cmtt10---loaded at size 655360 DVI units 
Font 7: cmr10---loaded at size 655360 DVI units 

42: beginning of page 1 
87: down4 41484288 v:=0+41484288=41484288, vv:=2628 
92: push 
level 0:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628) 
93: down4 -39649280 v:=41484288-39649280=1835008, vv:=116 
98: down4 37683200 v:=1835008+37683200=39518208, vv:=2503 
103: push 
...

These are low-level instructions that associate each font with a unique number and set type on the pages.
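
Skipping the header before comparing can be sketched in Python as follows. This is a rough sketch, not a finished tool: the function names are hypothetical, the two dvitype outputs are assumed to have been saved as text already, and the DVI comment is recognised simply as the first line that starts with a single quote, as in the listing above.

```python
def dvitype_body(text: str) -> list[str]:
    """Drop everything up to and including the quoted DVI comment line."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Assumption: the comment line looks like ' TeX output 2013.02.15:1014'
        if line.startswith("'"):
            return lines[i + 1:]
    raise ValueError("no DVI comment line found")


def same_dvitype(text_a: str, text_b: str) -> bool:
    """True if two dvitype dumps agree after the header is removed."""
    return dvitype_body(text_a) == dvitype_body(text_b)
```

Because the banner line with the TeX Live version is also part of the skipped header, dumps produced on different installations can still be compared this way.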

egreg
7

The dvii utility can calculate a message digest for each page, e.g.

dvii -p -M1 filex > before.md
cat before.md
[message digest: simple sum (ignore font)]
p:[1/1]::9C8E26458F1B019011D2F28DA18B18CC
p:[2/2]::9C8E26468F1B029011D2F28DA18B18CC
p:[3/3]::9C8E26478F1B039011D2F28DA18B18CC
p:[4/4]::9C8E26488F1B049011D2F28DA18B18CC

You can then compare the .md files instead of the .dvi files.
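
To see which pages changed rather than only whether the files differ, the two digest listings can be compared page by page. The following Python sketch assumes the line format shown above (p:[n/m]::DIGEST) and, as a further assumption, treats the first number in the brackets as the page identifier; the function names are made up for illustration.

```python
import re

# Matches digest lines such as p:[3/3]::9C8E26478F1B039011D2F28DA18B18CC
DIGEST_LINE = re.compile(r"^p:\[(\d+)/(\d+)\]::([0-9A-Fa-f]+)$")


def read_digests(text: str) -> dict:
    """Map page number -> digest; lines in any other format are ignored."""
    digests = {}
    for line in text.splitlines():
        match = DIGEST_LINE.match(line.strip())
        if match:
            digests[int(match.group(1))] = match.group(3).upper()
    return digests


def changed_pages(before: str, after: str) -> list:
    """Pages whose digest differs, including pages present in only one file."""
    a, b = read_digests(before), read_digests(after)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

With the contents of before.md and a second listing from the changed document, changed_pages returns an empty list when every page digest matches.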

Joseph Wright
deimi
2

If you want a textual dump of a DVI file, you can also use dviasm as an alternative to dvitype (mentioned in egreg's answer).

I find the output to be shorter and more readable than that of dvitype.

For example, for a file d.dvi, dviasm d.dvi produces the following (41 lines):

[preamble]
id: 2
numerator: 25400000
denominator: 473628672
magnification: 1000
comment: ' TeX output 2024.02.07:0323'

[postamble]
maxv: 633pt
maxh: 407pt
maxs: 3
pages: 1

[font definitions]
fntdef: cmr10 at 10pt

[page 1 0 0 0 0 0 0 0 0 0]
down: 633pt
push:
  down: -605pt
  down: 575pt
  push:
    down: -540pt
    push:
      right: 77pt
      fnt: cmr10 at 10pt
      set: 'hello'
      right: 3.333328pt
      set: 'w'
      right: -0.277786pt
      set: 'orld'
    pop:
  pop:
  down: 30pt
  push:
    push:
      right: 232pt
      set: '1'
    pop:
  pop:
pop:

while dvitype d.dvi produces the following (60 lines):

This is DVItype, Version 3.6 (TeX Live 2023/Arch Linux)
Options selected:
  Starting page = * 
  Maximum number of pages = 1000000
  Output level = 4 (the works)
  Resolution = 300.00000000 pixels per inch
numerator/denominator=25400000/473628672
magnification=1000;       0.00006334 pixels per DVI unit
' TeX output 2024.02.07:0323'
Postamble starts at byte 171.
maxv=41484288, maxh=26673152, maxstackdepth=3, totalpages=1
Font 23: cmr10---loaded at size 655360 DVI units

42: beginning of page 1
87: down4 41484288 v:=0+41484288=41484288, vv:=2628
92: push
level 0:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
93: down4 -39649280 v:=41484288-39649280=1835008, vv:=116
98: down4 37683200 v:=1835008+37683200=39518208, vv:=2503
103: push
level 1:(h=0,v=39518208,w=0,x=0,y=0,z=0,hh=0,vv=2503)
104: down4 -35389440 v:=39518208-35389440=4128768, vv:=262
109: push
level 2:(h=0,v=4128768,w=0,x=0,y=0,z=0,hh=0,vv=262)
110: right3 5046272 h:=0+5046272=5046272, hh:=320
[ ]
114: fntdef1 23: cmr10
135: fntnum23 current font is cmr10
136: setchar104 h:=5046272+364090=5410362, hh:=343
137: setchar101 h:=5410362+291271=5701633, hh:=361
138: setchar108 h:=5701633+182045=5883678, hh:=373
139: setchar108 h:=5883678+182045=6065723, hh:=385
140: setchar111 h:=6065723+327681=6393404, hh:=406
141: right3 218453 h:=6393404+218453=6611857, hh:=419
145: setchar119 h:=6611857+473316=7085173, hh:=449
146: right2 -18205 h:=7085173-18205=7066968, hh:=448
149: setchar111 h:=7066968+327681=7394649, hh:=469
150: setchar114 h:=7394649+256683=7651332, hh:=485
151: setchar108 h:=7651332+182045=7833377, hh:=497
152: setchar100 h:=7833377+364090=8197467, hh:=520
[hello world]
153: pop
level 2:(h=0,v=4128768,w=0,x=0,y=0,z=0,hh=0,vv=262)
154: pop
level 1:(h=0,v=39518208,w=0,x=0,y=0,z=0,hh=0,vv=2503)
155: down3 1966080 v:=39518208+1966080=41484288, vv:=2628
159: push
level 1:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
160: push
level 2:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
161: right4 15204352 h:=0+15204352=15204352, hh:=963
166: setchar49 h:=15204352+327681=15532033, hh:=984
[ 1]
167: pop
level 2:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
168: pop
level 1:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
169: pop
level 0:(h=0,v=41484288,w=0,x=0,y=0,z=0,hh=0,vv=2628)
170: eop

The indentation of the dviasm output also makes it easier to read. On the other hand, the dvitype output might be more useful in some cases (though definitely not for human consumption) because it also shows the computed coordinates.
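
For the original question of checking whether two DVI files are identical, two dviasm dumps can be diffed once the comment: line with the time stamp is left out. The Python sketch below uses the standard difflib module; the function name dviasm_diff is hypothetical, and the dumps are assumed to be available as text already.

```python
import difflib


def dviasm_diff(text_a: str, text_b: str) -> list:
    """Unified diff of two dviasm dumps, ignoring the preamble comment line."""

    def body(text: str) -> list:
        # Drop only the line that carries the time stamp.
        return [ln for ln in text.splitlines() if not ln.startswith("comment:")]

    return list(
        difflib.unified_diff(body(text_a), body(text_b), "before", "after", lineterm="")
    )
```

An empty result means the two dumps agree apart from the time stamp; otherwise the returned lines show which instructions changed.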

Credit: https://tex.stackexchange.com/a/305550/250119

user202729