
TeX4ebook makes an "epub" folder which it automatically zips to make a .epub file. Occasionally (doing my best to be nice here) tex4ht produces code which needs to be hand-corrected. There must be an easy way at the command line to re-zip the folder to make the epub file after hand-editing, and I would love to know what it is rather than try to figure it out! Thanks!

Nat Kuhn
  • With all due respect, you should have made clear in your question that you were not asking about command line tools to create zip archives but, rather, specifically about how to use whatever command line tool you prefer to create a zip archive with whatever features. But, if you already know how to make a zip archive and you know that epubs are zips, I can't for the life of me understand why you don't just try it and see if it works. How many seconds does it take to create a zip compared with asking a question here? I naturally assumed that you were unable to perform such an experiment. – cfr Nov 20 '14 at 01:09
  • Actually, either way this question is off-topic for this site since it apparently has nothing to do with TeX. – cfr Nov 20 '14 at 01:11
  • Tidying up a bit... – cfr Nov 20 '14 at 01:12
  • @cfr: I tagged it "tex4ebook" which is the system I am using. As I mentioned, when you are using tex4ebook it happens fairly frequently that the output files need hand-editing, and then manual repackaging. That is why I believe it is on-topic. – Nat Kuhn Nov 20 '14 at 01:16
  • But your question has nothing to do with TeX. It is about the epub format i.e. it is essentially 'Is an .epub just a .zip archive? If not, what other conditions must be satisfied?' The fact that TeX is involved at an earlier point is no more relevant, as far as I can tell, than the fact that I use TeX to produce a set of slides is relevant to the question of how to project them on the screen. Your question isn't about the TeX part of the workflow. As such, it is off-topic as far as I can tell. If not, please clarify your question to explain. – cfr Nov 20 '14 at 01:20
  • I've posted a query asking if there's a better site to ask questions about the epub format. – cfr Nov 20 '14 at 01:22
  • @cfr, as I noted above "there may be some subtlety involved that works under some circumstances and fails under others" so an experiment is not a great idea for something where I am making a professional-level result. – Nat Kuhn Nov 20 '14 at 01:22
  • But surely you would always check? It isn't as if you'd just create a whole bunch and publish without checking each one? – cfr Nov 20 '14 at 01:24
  • @cfr: you may not be familiar with tex4ebook, built and maintained by michal.h21. It automatically packages tex4ht output into an epub. How to deal with situations where the tex4ht output needs to be hand-edited is part of the tex4ebook workflow. The tex4ebook tag was proposed and accepted. So obviously there may be a better place to get an answer to this question, but I believe it is on-topic. – Nat Kuhn Nov 20 '14 at 01:27
  • That doesn't follow. All kinds of things are part of the workflow but it doesn't mean they are on-topic. And michal.h21 won't get pinged because it only works when somebody has already commented on the same question/answer. – cfr Nov 20 '14 at 01:35
  • http://www.web-books.com/Publishing/epub.htm – cfr Nov 20 '14 at 03:41
  • Thank you @cfr. I found that page helpful for my general understanding. I don't see anything there about -qXr9D which is why I was hoping for a more specific answer. – Nat Kuhn Nov 20 '14 at 11:56

1 Answer


To make this question on-topic, I will answer two questions: the one that was asked, and whether there is a better way to modify the output.

To answer the first question: the generated and post-processed files are saved in the directory filename-outputformat/OEBPS/, so if you want to edit files by hand, do it there.

Say you have a file named sample.tex and the output format is mobi for Kindle. Open a terminal and go to the output directory:

cd "dir with the TeX file/sample-mobi"

Edit the files in the OEBPS directory and run these commands:

zip -qXr9D sample.epub OEBPS
kindlegen sample.epub
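
A hedged note on the archive layout: the EPUB container format expects the mimetype file to be the first entry in the archive and stored uncompressed, and it also requires META-INF/container.xml, so an archive built from OEBPS alone may be accepted by kindlegen yet rejected by epubcheck. The following is a minimal sketch, not part of tex4ebook, assuming the project directory holds mimetype, META-INF and OEBPS at its top level. The function name repack_epub is made up for illustration.

```shell
#!/bin/sh
# Sketch: rebuild an .epub from an unpacked project directory.
# Assumption: the directory contains mimetype, META-INF/ and OEBPS/.
# repack_epub is a hypothetical helper name, not a tex4ebook command.
repack_epub() {
    src_dir=$1     # unpacked project directory, e.g. sample-epub
    out_file=$2    # archive to create, e.g. sample.epub

    # Make the output path absolute, because we change directory below.
    case $out_file in
        /*) ;;
        *) out_file=$PWD/$out_file ;;
    esac

    (
        cd "$src_dir" || exit 1
        # Start from a fresh archive so that mimetype really is first.
        rm -f "$out_file"
        # -0 stores the file uncompressed, -X omits extra file attributes.
        zip -q -X -0 "$out_file" mimetype || exit 1
        # Add everything else with full compression, without directory entries.
        zip -q -X -r -9 -D "$out_file" META-INF OEBPS
    )
}
```

It could be called as repack_epub sample-epub sample.epub, after which running epubcheck on the result is still advisable.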

A better way than manually editing the output files is to write a make4ht build file and create filters that fix the problems automatically. Build files have the same base name as the main TeX file and the extension .mk4, so it would be sample.mk4 in our case:

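-- sample.mk4: make4ht build file (Lua)
local filter = require "make4ht-filter"
-- Drop a comma (and any whitespace after it) that sits directly before "{"
local cssfix = function(s)
  return s:gsub("%,%s*%{","{")
end
-- Filters referenced by name, applied to the HTML files
local process = filter{"cleanspan", "fixligatures", "hruletohr"}
-- Custom filter chain for the CSS files
local cssprocess = filter{cssfix}
-- Two LaTeX passes with tex4ht loaded
Make:htlatex()
Make:htlatex()
-- Run the filters on the matching output files
Make:match("html$",process)
Make:match("css$",cssprocess)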

This is a Lua script. Make:htlatex runs LaTeX with tex4ht loaded once; we call it twice (htlatex uses three passes by default).

Make:match runs a function on every output file whose name matches the given pattern. We use the function process for .html files and cssprocess for .css files. These functions are created by the filter function, which takes a table of processing functions or names of filters (see the make4ht documentation for details).

To fix the issue with a trailing comma in the CSS file, I created the function cssfix, which takes the CSS file as a string and removes any comma (and following whitespace) that sits directly before a left brace. The modified string is then saved.
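
To check what that substitution does without running a full build, a rough shell equivalent can be tried on a CSS file. This is only a sketch: it swaps the Lua pattern for a sed expression, the function name fix_css_commas is made up for illustration, and unlike the Lua version it works line by line, so a comma at the end of one line with the brace on the next line is left alone.

```shell
#!/bin/sh
# Sketch: sed approximation of the cssfix Lua filter.
# fix_css_commas is a hypothetical helper name.
# Reads the named files (or standard input) and writes the result to stdout.
fix_css_commas() {
    # Remove a comma plus any following whitespace when "{" comes next.
    sed -E 's/,[[:space:]]*\{/{/g' "$@"
}
```

For example, fix_css_commas sample.css prints the corrected CSS to the terminal and leaves the file itself untouched.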

michal.h21
  • Thank you very much, Michal. You have anticipated my next question, but I had thought it best to separate them. As you can see, I'm still new to the site and learning how to do things. Oddly, kindlegen does not seem to take mobi input, so I am using epub. Just want to confirm it is the same -qXr9D switches to package an epub... Thanks again. – Nat Kuhn Nov 20 '14 at 11:51
  • @NatKuhn with tex4ebook -f mobi, the epub version is created first and then converted with kindlegen. -qXr9D is just a set of switches telling the zip command to use full compression, as an epub is basically just a zip file with a different extension – michal.h21 Nov 20 '14 at 12:03
  • I don't know if OS X still does this but, if it does, I assume it would be best to delete files such as .DS_Store prior to repackaging if files are edited by hand? – cfr Nov 20 '14 at 15:22
  • @cfr I don't use OS X, so I don't know whether it creates such a directory. Does it exist for every directory? And does the zip command pack hidden files and folders too? – michal.h21 Nov 20 '14 at 15:30
  • It certainly used to create them all over the place. (It was a file not a directory.) I still have them all over the place after migrating to Linux because eliminating them is a pain (and I'm worried about letting a find rip through my home directory automatically). I don't know if zip does or not but it is a problem when creating archives on OS X in general. (E.g. a tar will include such files which then show up when used on other systems.) – cfr Nov 20 '14 at 17:46
  • FWIW I just ran the output file through epubcheck and while it flagged various figures as existing in the zip file but not being declared in the OPF file it did not flag .DS_Store. My guess is that the zip utility that ships with OSX is (now) smart enough not to do this. – Nat Kuhn Nov 20 '14 at 18:51
  • @cfr it seems that dot files are added by the zip command as well. But epubcheck would produce a validation error if some spurious file were included in the epub file, so maybe this file isn't created when the directory is created programmatically (or no tex4ebook user uses OS X). – michal.h21 Nov 20 '14 at 19:01
  • (I started composing previous message before Nat added his comment) – michal.h21 Nov 20 '14 at 19:03
  • @NatKuhn Good to know. – cfr Nov 20 '14 at 21:20
  • Looks like I spoke too soon. @michal.h21 I think you are right that when the directory is created by the program, there is no .DS_Store, but after you visit the directory in the "Finder" it is there, so subsequent builds will include it. There is an explanation here (http://osxdaily.com/2013/04/30/how-to-exclude-files-from-a-zip-archive/) of how to deal with the problem. When I try the "zip" as above but with the switches, I can't get the file to pass epubcheck the way your packaged file does, though. – Nat Kuhn Nov 21 '14 at 00:41
  • @NatKuhn so does tex4ebook include .DS_Store or not? I am not sure from your previous comment. – michal.h21 Nov 21 '14 at 07:22
  • It looks as though when you first run it and it creates the project-format directory, .DS_Store is not included. It seems that OSX doesn't create .DS_Store until you look at the folder in a Finder window. If you re-run tex4ebook at that point, .DS_Store will be included. – Nat Kuhn Nov 21 '14 at 11:27
  • The shell command find . -name .DS_Store -delete will delete all the .DS_Store files in the current directory and all subdirectories. (http://superuser.com/questions/112078/delete-matching-files-in-all-subdirectories) – Nat Kuhn Nov 22 '14 at 02:26
  • So when I package the OEBPS directory as you are suggesting, it seems to work in kindlegen, but it completely fails epubcheck, with two ERRORs: Mimetype entry missing or not the first in archive and Required META-INF/container.xml resource is missing, and then it stops. Running epubcheck on the tex4ebook .epub file generates > 1000 "ERRORs", going through the whole thing. When I zip the entire project-epub directory, I get all the >1000 "ERRORs" and one additional one: Mimetype entry missing or not the first in archive – Nat Kuhn Nov 22 '14 at 03:12
  • @NatKuhn I think there are two good ways: add an option to the zip command to exclude .DS_Store files from packing (or all files whose names begin with ., to guard against other possible metadata files), or call a command from the mk4 file to delete such files. Maybe you can post that as a standalone question, so we can research the problem better with some code samples? – michal.h21 Nov 23 '14 at 11:10
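
The two approaches from the last comment can be sketched in shell as follows. This is an illustration under assumptions rather than a tested tex4ebook recipe: the function names strip_ds_store and zip_excluding_ds_store are made up, the find command is the one quoted in the comments, and -x is the zip option for excluding files by pattern. The second function covers only the exclusion step; a complete EPUB still needs its mimetype entry first in the archive, as the epubcheck errors quoted above point out.

```shell
#!/bin/sh
# Sketch: two ways to keep .DS_Store files out of a repackaged archive.
# Both function names are hypothetical helpers.

# Way 1: delete every .DS_Store file below the given directory.
strip_ds_store() {
    find "$1" -name .DS_Store -type f -delete
}

# Way 2: leave the files on disk but tell zip to skip them.
# Usage: zip_excluding_ds_store ARCHIVE PATH...
zip_excluding_ds_store() {
    archive=$1
    shift
    # '*.DS_Store' matches the file at any depth in the tree.
    zip -q -X -r -9 -D "$archive" "$@" -x '*.DS_Store'
}
```

Deleting is the simpler choice when the directory is regenerated on every build anyway, while excluding leaves the working tree untouched.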