At the Desktop Publishing Pioneers meeting, Donald Knuth lamented that converting to PDF using his fonts yields PDF files without searchable text. Has this been fixed yet?
2 Answers
7
Type 3 fonts can be given encoding vectors as of the pdftex in the 2018 release, thanks to Pali Rohar.
Karl Berry
Will, Latin Modern is simply not a replacement for Computer Modern. We extensively looked into the idea of using LM pfbs (to get the improvements) with CM tfms, but there are just a few discrepancies in the metrics, mostly relating to accent positioning. But any discrepancies at all are too many, and Jacko did not want to change LM. (Even less so at this late date, I'm sure.) So bluesky pfbs remain the only usable set.
as for xetex and luatex, they are nothing like drop-in replacements for pdftex, let alone knuthian tex, so they aren't the answer here either. – Karl Berry May 26 '19 at 22:45
i did discover that apparently dvipdfmx creates searchable pdfs. i suppose only for basic ASCII, but that's better than nothing. i didn't experiment beyond echo '\relax hello\end' | tex; dvipdfmx texput.dvi, but it's interesting.
i believe that knuth wants to use dvips, though (since he knows the code), so it's not an answer either. – Karl Berry May 26 '19 at 22:49
anyway, bb and i asked tom rokicki, who set up Knuth's system (ubuntu), to consider options. maybe it'll be a topic at the conference :).
4
The fix for this would be for Latin Modern to be updated to match the preferred shapes, and to use it in an up-to-date TeX variant such as xe(la)tex or lua(la)tex.
WillAdams
Since it's Knuth asking, this isn't an option for him. Some of the shapes in Latin Modern (e.g. the \ss) are noticeably different, and Knuth explicitly said he wants to keep using fonts that are under his control. Also, Knuth does not and will not use LaTeX. – barbara beeton May 22 '19 at 15:08
Okay, xetex and luatex then --- surprised by the font shape issue --- I'd thought that the Latin Modern fonts were done to match the latest Computer Modern shapes. Was there an update to the Blue Sky Research fonts to make them acceptable? If not, didn't Richard Kinch make a set of Type 1 fonts derived from CM? Isn't there some tool which will take a .ps file with Type 3 CM fonts and replace them with the matching Type 1s? – WillAdams May 23 '19 at 01:55
The Blue Sky Computer Modern is Type 1; those were updated by Y&Y to agree with Knuth's shape changes. But apparently Knuth prefers to use the original Metafont set, which will never be anything but Type 3. – barbara beeton May 23 '19 at 02:58
Sorry, this is not an answer to this question. If you use Latin Modern then you're no longer using Knuth's fonts; you're using someone else's approximation of the Computer Modern shapes. Not only do they often give poorer results, more importantly for Knuth the rasterization is no longer under his control as it's up to the PDF viewer to rasterize the shapes on-the-fly depending on the screen resolution and zoom level and whatever, rather than rasterize (convert to bitmaps) once, check the results, and make sure that always the same bitmaps are used. – ShreevatsaR May 23 '19 at 22:08
BTW the tool which takes a .ps file with Type 3 CM fonts and replaces them with corresponding Type 1 fonts is pkfix originally by Heiko Oberdiek, and it's great for casual users. (Type 1 fonts have many advantages for on-screen reading, after all.) But as the fonts are not exactly matching and as they are no longer pre-rasterized fonts, it's not an answer to this question. – ShreevatsaR May 23 '19 at 22:32
Encoding dictionary or ToUnicode CMap, both of which are supported for Type 3 fonts (see 9.6.5 "Type 3 Fonts", Table 112 on p. 259). Missing feature in pdfTeX, maybe? – ShreevatsaR May 23 '19 at 22:20
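To make the ToUnicode idea concrete, here is a rough sketch of what such a CMap contains: a table from each one-byte character code in the font to the Unicode text it stands for. This is only an illustration of the PDF-level structure, not what pdftex, dvips, or dvipdfmx actually emit. The function name build_tounicode_cmap is made up for this example, and the sample codes assume the usual OT1 layout of cmr10 (0x0B is the ff ligature, 0x19 is \ss).

```python
def build_tounicode_cmap(mapping):
    """Return the text of a ToUnicode CMap for one-byte character codes.

    mapping: dict from character code (0..255) to the Unicode string
    that the glyph at that code represents. (Hypothetical helper.)
    """
    entries = []
    for code in sorted(mapping):
        if not 0 <= code <= 0xFF:
            raise ValueError(f"character code out of range: {code}")
        # Destination strings are written as UTF-16BE hex.
        utf16 = mapping[code].encode("utf-16-be").hex().upper()
        entries.append(f"<{code:02X}> <{utf16}>")

    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",
        "endcodespacerange",
    ]
    # A single bfchar section may hold at most 100 entries, so chunk them.
    for start in range(0, len(entries), 100):
        chunk = entries[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        lines.extend(chunk)
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


# Sample entries, assuming the OT1 layout used by cmr10.
sample = {0x41: "A", 0x0B: "ff", 0x19: "\u00df"}
print(build_tounicode_cmap(sample))
```

The resulting text would go into a stream that the Type 3 font dictionary's /ToUnicode entry points to. A ligature such as ff can map to two characters, which is why the destination is a string rather than a single code point.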