In many of the questions asked here, people recommend using version control for LaTeX documents, especially large ones such as theses.
The issue is that for a traditional line-based diff to be of any use, each sentence must be on its own line. Otherwise, a small edit can show up as a change to an entire paragraph, so the diff gives little insight into what actually changed and things get messy really fast.
The plan is for all my new documents to follow the one-sentence-per-line format. But I would also like to start using version control for some of my existing (and large) documents.
I was wondering whether there's a tool that takes a .tex file as input and spits out a new .tex file in which the sentences are separated by a newline character. I am mainly interested in a UNIX tool, but the more portable, the better.
Note: Unfortunately, the problem is not as simple as inserting a newline after every period. For example, when the tool finds "e.g." in the text, it needs to be smart enough not to insert a newline there. Also, though this is more of an annoyance than a big issue, the last character in a paragraph is very likely to be a period, so the tool needs to avoid inserting an extra newline there. Maybe there is a tool that uses the LaTeX internals to identify where a sentence has actually ended?
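To make the requirements above concrete, here is a minimal Python sketch of the kind of behaviour I have in mind. It is only an illustration under several assumptions: the names ABBREVIATIONS, split_paragraph and split_sentences are made up for this example, the abbreviation list is deliberately tiny, the uppercase test only covers English, and the code knows nothing about LaTeX itself (comments, verbatim blocks and math are treated as ordinary text, so joining lines can break % comments).

```python
import re

# Hypothetical, deliberately short list; extend it for your own documents.
ABBREVIATIONS = ("e.g.", "i.e.", "cf.", "etc.", "Fig.", "Eq.")

# Candidate boundary: '.', '!' or '?' followed by whitespace and an
# uppercase letter (the lookahead keeps the letter out of the match).
BOUNDARY = re.compile(r"([.!?])\s+(?=[A-Z])")


def split_paragraph(paragraph):
    """Put each sentence of one paragraph on its own line."""
    text = " ".join(paragraph.split())  # collapse existing line breaks
    lines = []
    start = 0
    for match in BOUNDARY.finditer(text):
        candidate = text[start:match.end(1)]
        # Do not break after a known abbreviation such as "e.g.".
        if candidate.endswith(ABBREVIATIONS):
            continue
        lines.append(candidate)
        start = match.end()
    # The rest of the paragraph; no extra newline after the final period.
    lines.append(text[start:])
    return "\n".join(lines)


def split_sentences(source):
    """Split every paragraph (separated by blank lines) into sentences."""
    paragraphs = re.split(r"\n\s*\n", source.strip())
    return "\n\n".join(split_paragraph(p) for p in paragraphs) + "\n"
```

Paragraph breaks are kept as single blank lines, and the check with endswith is crude: any word that happens to end in one of the listed abbreviations also suppresses the break.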
Can you show an example of that please? – sudosensei Dec 19 '13 at 18:03
/([.,;:!?"”–—])\s+([[:upper:]])/$1 \n$2/, i.e. look for punctuation followed by at least one whitespace character (\s or [ \t]), followed by an uppercase letter ([A-Z] would work for English), and replace the whitespace with a space followed by a single newline, keeping the rest as is. Note that some regex flavors use \1 instead of $1. – Crissov Dec 19 '13 at 21:47
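As a rough illustration of the substitution described in the comment above, here is how it might look in Python. This is a sketch, not the commenter's own code: the function name break_after_punctuation is made up, whitespace is matched with \s, and [A-Z] stands in for the uppercase class because Python's re module does not support POSIX bracket classes.

```python
import re

# Punctuation, then one or more whitespace characters, then an uppercase letter.
PATTERN = re.compile(r'([.,;:!?"”–—])\s+([A-Z])')


def break_after_punctuation(text):
    """Replace the whitespace with a space plus a single newline."""
    # In the replacement, \1 and \2 are the captured punctuation and letter.
    return PATTERN.sub(r"\1 \n\2", text)


print(break_after_punctuation("Hello there. General Kenobi, You are bold."))
```

Note that this also breaks after commas and after abbreviations such as "e.g." whenever a capitalised word follows, so it does not address the abbreviation problem from the question by itself.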