
Problem

I use LaTeX for two different things: my mathematics thesis, and writing a novel (because who isn't an amateur novelist these days?). From my experience with Vimdiff (diff as a back end), merging different sets of edits together is horribly painful. I once spent more time merging edits in a particular chapter than I did writing it. Diff does a horrible job of actually identifying what hasn't changed. It wants to delete a paragraph and then add it back in a slightly different way later among a ton of other paragraphs.

What I find often happens is that Diff will falsely flag a bunch of lines/paragraphs as having been changed when in fact they have not. This is the source of my problems: wading through the false positives. Diff suggests deleting perfectly good lines only to add them back in. There is a bit more to it than that, but that is the gist of it.

However, this is with a two-way merge. Does a three-way merge mitigate this?
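To make the false-positive problem concrete, here is a minimal sketch using Python's standard `difflib` module (not Git's own diff machinery, which is an assumption-free comparison only in spirit). It compares a one-line paragraph before and after a single word is changed. The function names `line_changes` and `word_changes` and the sample sentence are made up for illustration.

```python
import difflib


def line_changes(old: str, new: str) -> int:
    """Count the lines a line-based comparison marks as removed or added."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for entry in diff if entry.startswith(("- ", "+ ")))


def word_changes(old: str, new: str) -> list:
    """Return the word-level edits (everything that is not an 'equal' run)."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return [
        (tag, old_words[i1:i2], new_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# A paragraph written as one long line, with a single word edited.
old_paragraph = "The rain fell softly on the old tin roof."
new_paragraph = "The rain fell steadily on the old tin roof."

print(line_changes(old_paragraph, new_paragraph))  # 2: whole line removed, whole line added
print(word_changes(old_paragraph, new_paragraph))  # [('replace', ['softly'], ['steadily'])]
```

The line-based view reports the entire paragraph as deleted and re-added, while the word-based view isolates the one word that changed. This is only an illustration of granularity; it says nothing about how a particular merge tool resolves conflicts.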

Specific Use-Case

  • I write a chapter
  • I branch at the end of that chapter and then edit it.
  • I write another chapter to give myself time to forget my preconceived notions of what the previous chapter should be like (easier to kill darlings and identify when details are missing)
  • I then go to the previous chapter, unedited, and edit it again.
  • I merge this group of edits into the first set of edits (this is what hurts).

So what I want to do is write a chapter in the "Writing" branch. This way I have an easy way to see exactly what my story looks like without any edits. Good for finding regressions.

After committing I then branch this into a "first edit branch". I keep the original manuscript unchanged, and then edit the .tex file making changes.

I check out the writing branch again, after committing to the first edit branch. This time I write another chapter. Supposing I had just finished editing chapter 4, I will be writing chapter 5.

After committing the writing branch, I branch it again to the second edit branch, and do the same thing as I had before.

I then write another chapter in Writing branch, before going in for more edits. This time, instead of branching the writing branch, I merge the second set of edits and the first set of edits together. The edits should be different because I am looking at them at different times. This is useful for me.

And this is the point at which I am concerned. The merger.

The Question

How well does Git handle that? I use LOTS of comments, which seems to be the main reason diff gets confused. For what it's worth, latexdiff doesn't seem to have these problems, presumably since it is designed specifically for LaTeX.

To be clear, I don't have any practical experience using LaTeX and Git. Before I get myself into something I think could turn VERY messy, I want to better understand if Git can work as I want, or if I should just expect.

Edits/Asides

I'm not saying diff is useless or itself horrible, just that it is the wrong tool for the job. It also isn't part of the question. The question is about Git. Not Diff.

latexdiff, in my experience, does a great job. Is it possible to use that as a back end? Or something like that? It goes word by word.

This is not about producing a differenced LaTeX document for viewing, as latexdiff or latexdiff-vc do (the vc standing for version control). Potentially, I might look for a way to use latexdiff in place of diff3 (which, from what I've read, is what Git uses).

For what it's worth, I'm essentially using this workflow already, but without the aid of branches. I'm hoping that the use of branches will help me stop commenting out text as a means of version control.

As a side note, this should help improve merging: https://stackoverflow.com/questions/5587626/git-merging-within-a-line. In particular, achoo5000's answer, which works sentence by sentence.
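One way to approximate sentence-by-sentence merging with a line-based tool is to put each sentence on its own line before committing. The following is a minimal sketch of that idea, not the method from the linked answer. The function name `one_sentence_per_line` is hypothetical, the sentence detection is a naive regular expression (it will split after abbreviations such as "e.g."), and any paragraph containing a `%` is left untouched so that LaTeX comments are never joined onto real text.

```python
import re

# Naive sentence boundary: whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(text: str) -> str:
    """Reflow each blank-line-separated paragraph to one sentence per line."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    reflowed = []
    for paragraph in paragraphs:
        if "%" in paragraph:
            # Conservative choice: do not touch paragraphs with LaTeX comments.
            reflowed.append(paragraph)
            continue
        joined = " ".join(paragraph.split())  # collapse existing line breaks
        reflowed.append("\n".join(_SENTENCE_END.split(joined)))
    return "\n\n".join(reflowed) + "\n"


sample = "I wrote a chapter. Then I edited it! Did it help?\n"
print(one_sentence_per_line(sample), end="")
```

Because TeX treats a single line break as a space, this reflow should not change the typeset output for ordinary prose, while a one-sentence edit then touches only one line in a line-based diff.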

Nero gris
  • You might find the answers to https://tex.stackexchange.com/q/54140/86 useful. – Andrew Stacey Jul 16 '18 at 21:39
  • You know that diff handles files line-based, right? It doesn't do a horrible job at all if you know how to use it properly (using many shorter lines rather than one big one, etc.). It has its flaws though, especially if there is a small change in many lines. As always, there are alternatives. E.g., https://stackoverflow.com/q/112932/5853002 shows possible (GUI) alternatives. git offers not only basic diff, but other diff modes as well (--diff-algorithm). Also look at the --minimal, --patience, and --word-diff arguments of git diff. – nox Jul 16 '18 at 21:43
  • I am aware that diff functions line by line. Which is why it is inadequate for my needs. It would indeed work fine for diffing, say, the C or Julia code for my thesis, but the LaTeX for a novel where paragraphs can be quite long? Not so great. It's a case of the wrong tool for the job. A hammer does a horrible job at fixing two pieces of paper to each other. You use a stapler for that. That doesn't mean that a hammer is useless or itself horrible, just at the task mentioned. – Nero gris Jul 16 '18 at 21:48
  • You should not write your paragraphs as a single line; wrap them to some consistent length such as 80 characters. This is good practice even without considering source control, but if you are using any source control/diff system it is the most important thing to make the system usable. – David Carlisle Jul 16 '18 at 21:49
  • Ideally, this would certainly help. However that is not a natural way to write prose. When programming, I absolutely stick to 80 character lines. I have a nice red line set in Vim for that. – Nero gris Jul 16 '18 at 21:51
  • Why??? In what way is it natural to write an entire paragraph as a single line? I understand that some wysiwyg systems use the return key to mean a paragraph break, but that is just the input syntax of those systems, not anything at all natural, and it is very unnatural for a plain text format like tex. – David Carlisle Jul 16 '18 at 21:53
  • Why is not relevant to this question. Accept that for me it is highly unnatural. To answer the question, however, it breaks my flow. – Nero gris Jul 16 '18 at 21:54
  • Well, if your paragraphs are on one line, any diff-based merge tool, whether svn or git or anything else, is going to be painful. I agree it's your choice, but the answer to your question is "git is good at merges, given reasonable input" – David Carlisle Jul 16 '18 at 21:56
  • I will ask an alternative question. Please see edits in a few minutes. – Nero gris Jul 16 '18 at 21:58
  • This might now be a duplicate of https://tex.stackexchange.com/questions/1325/using-latexdiff-with-git, the accepted answer for which refers to https://gitlab.com/git-latexdiff/git-latexdiff – John Palmieri Jul 16 '18 at 22:05
  • It is not. That is about creating a differenced output for viewing as a PDF. Furthermore, latexdiff-vc is for easily making differenced LaTeX code for visual comparison. Neither deals with merging. – Nero gris Jul 16 '18 at 22:08
  • It is not clear to me why you are branching and what you are trying to merge. Write a draft chapter, commit (or commit as often as you want), write a second draft chapter (again committing whenever you want), edit the first chapter (with commits). There is no merging or branching. – StrongBad Jul 16 '18 at 22:23
  • I'll clarify in the body of the question. Just give me a few minutes. – Nero gris Jul 16 '18 at 22:25
  • @StrongBad It's a bit more verbose. Let me know if that helps. – Nero gris Jul 16 '18 at 22:34
  • There are a lot of people who espouse the wonders of the branching workflow that you describe but, equally, others criticise this approach because it leads to excessive and unnecessary merging. I think that it is better to create feature branches only when you need them rather than building them into your workflow. I use git all of the time for LaTeX and find it is indispensable. To pull up past commits and apply latex diff I use this script. –  Jul 16 '18 at 23:01
  • For what it's worth, this is essentially what I'm already doing, but without the aid of branches. The difference is I have two parallel edits, as opposed to write(N+1)==>edit(N+1), edit(N),edit(N-1)==> write (N+2). – Nero gris Jul 16 '18 at 23:25
  • @DavidCarlisle I can't imagine breaking at 80 characters. That does feel unnatural. Breaking at the end of each sentence is much more natural and generally works reasonably well. (It's not great for long sentences, I admit, but probably there are fewer of those in a novel, so maybe that wouldn't be so much of an issue.) – cfr Jul 17 '18 at 01:25
  • @cfr 80 chars is perfectly natural. – Johannes_B Jul 17 '18 at 04:51
  • This question is off-topic. – Johannes_B Jul 17 '18 at 04:51
  • @cfr I would not break by hand, just set the editor to wrap at a sensible column – David Carlisle Jul 17 '18 at 05:50
  • Or one could use tools like fmt or par before you commit in order to limit the chars per line, that's what I'm doing. You can nicely integrate them in, e.g., vim. – nox Jul 17 '18 at 06:57
  • @Johannes_B If questions about git's interaction with LaTeX are off topic, why are there tags for it? And one for revision control. – Nero gris Jul 17 '18 at 12:34
  • @Nerogris But this is not about integration or interaction. It is just about git. – cfr Jul 17 '18 at 17:18

1 Answer

From the description of your workflow, you have an original version of a chapter and two edited versions of the chapter. I run into this problem in writing with academic collaborators all the time. I write an original draft and send it out to two co-authors. I get back their, often conflicting, recommendations and need to decide what I want to keep (my text, or some of their text). No matter what, this type of merge is always going to be a nightmare. Consider the case where version B changes the beginning of a sentence and version C changes the end of the sentence and both changes accomplish the same thing.

Original (Version A)

"Merging sucks!"

Version B

Nero gris screamed "Merging sucks!"

Version C

"Merging sucks!" screamed Nero gris.

Even if you write only a single sentence per line, a line-level (or even a word-level) diff is going to say that text got added. It comes down to you to decide which version you like better. A graphical representation of this might look something like

Nero gris screamed "Merging sucks!" screamed Nero gris.

which is, in my opinion, not helpful. I prefer merging B with A to create A' and then C with A' to create a new version D.
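The combined result shown above can be reproduced with a small word-level sketch. This is not what Git or diff3 does internally; it is a deliberately naive union, written with Python's standard `difflib`, that only handles pure insertions relative to the original and ignores deletions and replacements. The function names `insertions` and `naive_union` are made up for illustration.

```python
import difflib


def insertions(base_words: list, new_words: list) -> dict:
    """Map each position in the base to the words inserted there."""
    matcher = difflib.SequenceMatcher(a=base_words, b=new_words, autojunk=False)
    return {
        i1: new_words[j1:j2]
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag == "insert"
    }


def naive_union(base: str, version_b: str, version_c: str) -> str:
    """Apply the insertions from both edited versions to the base text."""
    base_words = base.split()
    from_b = insertions(base_words, version_b.split())
    from_c = insertions(base_words, version_c.split())
    merged = []
    for position in range(len(base_words) + 1):
        merged += from_b.get(position, [])
        merged += from_c.get(position, [])
        if position < len(base_words):
            merged.append(base_words[position])
    return " ".join(merged)


version_a = '"Merging sucks!"'
version_b = 'Nero gris screamed "Merging sucks!"'
version_c = '"Merging sucks!" screamed Nero gris.'

print(naive_union(version_a, version_b, version_c))
# Nero gris screamed "Merging sucks!" screamed Nero gris.
```

Neither insertion overlaps the other, so an automatic tool sees no conflict, yet the sentence now says the same thing twice. Only a human reader can tell that B and C are alternatives rather than complementary edits.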

Merging, however, is a separate issue from version control. Once you figure out how to merge your three versions together, committing them to Git (or whatever VCS floats your boat) is a piece of cake.

StrongBad
  • Yeah, this scenario will always require user input. My understanding is that in this case Git forces user intervention. However, my issue has been when Diff falsely identifies a line as having been edited when it has, in fact, stayed the same. It seems to get a bit... overeager to say something has changed (better than a false negative!) and suggests deleting several lines just to add them back in. Sorry. I should have mentioned this as one of my chief issues. I'll add it to the body. So I suppose a BETTER question is: does Git better identify what changed? – Nero gris Jul 16 '18 at 23:33
  • @Nerogris Does Git do this itself at all? It doesn't use a backend? When I use Git, it uses vimdiff, I thought. (But maybe that's my package manager rather than Git.) – cfr Jul 17 '18 at 01:28
  • @cfr I'm not referring to git diff but rather git merge which uses diff3 as a backend. – Nero gris Jul 17 '18 at 02:09