To improve my writing, I want to reduce the number of nominalizations (a.k.a. zombie nouns) that I use. For that purpose I would like to have all nominalizations automatically highlighted in the generated PDF.
Essentially, I want words highlighted automatically in the same way as the Writer's Diet Test does it: given a sample text, the test highlights be-verbs (am, is, are, was, were, be, being, been), nominalizations, prepositions, adjectives and adverbs, and waste words (it, this, that, and there).
Nominalizations can be detected as words ending with the following suffixes: ion, ism, ty, ment, ness, ance or ence (although with some false positives, e.g., the word "city" would be detected as a nominalization, but this would be OK). So I think it should be possible to highlight nominalizations using regular expressions.
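To make the rule concrete, here is a rough sketch of the matching I have in mind, written in Python only because it is easy to test on its own; it is not part of the LuaLaTeX setup. The names `SUFFIXES`, `NOMINALIZATION`, and `find_nominalizations` are made up for illustration, and the optional trailing "s" for plurals is my own assumption on top of the suffix list above.

```python
import re

# Suffixes from the list above. The optional trailing "s" (for plurals such as
# "nominalizations") is an assumption added for this sketch.
SUFFIXES = ("ion", "ism", "ty", "ment", "ness", "ance", "ence")

# At least one letter, then one of the suffixes, then an optional "s",
# all bounded by word boundaries.
NOMINALIZATION = re.compile(
    r"\b[A-Za-z]+(?:" + "|".join(SUFFIXES) + r")s?\b",
    re.IGNORECASE,
)


def find_nominalizations(text):
    """Return every word in text that ends with one of the suffixes."""
    return NOMINALIZATION.findall(text)


sample = (
    "The proliferation of nominalizations in a discursive formation may be an "
    "indication of a tendency toward pomposity and abstraction."
)
print(find_nominalizations(sample))
```

On the sample sentence from the MWE, this picks out proliferation, nominalizations, formation, indication, pomposity, and abstraction. As expected, it also flags false positives such as "city".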
As a starting point, I used the accepted solution to the question "Highlight every occurrence of a list of words?".
Specifically, I used the luahighlight.sty package and the Lua module highlight.lua provided in that solution.
The following is an MWE that uses luahighlight.sty and highlight.lua to highlight be-verbs, prepositions, and waste words:
\documentclass[a4paper]{article}
\usepackage{xcolor}% no driver option: luahighlight needs LuaLaTeX, and xcolor detects the driver itself
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{luahighlight}
% highlight be-verbs
\highlight[orange]{am}
\highlight[orange]{is}
\highlight[orange]{are}
\highlight[orange]{was}
\highlight[orange]{were}
\highlight[orange]{be}
\highlight[orange]{being}
\highlight[orange]{been}
% highlight prepositions
\highlight[green]{about}
\highlight[green]{above}
\highlight[green]{across}
\highlight[green]{after}
\highlight[green]{against}
\highlight[green]{along}
\highlight[green]{among}
\highlight[green]{around}
\highlight[green]{at}
\highlight[green]{before}
\highlight[green]{behind}
\highlight[green]{below}
\highlight[green]{beneath}
\highlight[green]{beside}
\highlight[green]{between}
\highlight[green]{beyond}
\highlight[green]{by}
\highlight[green]{down}
\highlight[green]{during}
\highlight[green]{for}
\highlight[green]{from}
\highlight[green]{in}
\highlight[green]{inside}
\highlight[green]{into}
\highlight[green]{like}
\highlight[green]{near}
\highlight[green]{of}
\highlight[green]{off}
\highlight[green]{on}
\highlight[green]{onto}
\highlight[green]{out}
\highlight[green]{outside}
\highlight[green]{over}
\highlight[green]{past}
\highlight[green]{since}
\highlight[green]{through}
\highlight[green]{throughout}
\highlight[green]{till}
\highlight[green]{to}
\highlight[green]{toward}
\highlight[green]{under}
\highlight[green]{underneath}
\highlight[green]{until}
\highlight[green]{up}
\highlight[green]{upon}
\highlight[green]{with}
\highlight[green]{within}
\highlight[green]{without}
% highlight waste words
\highlight[pink]{it}
\highlight[pink]{this}
\highlight[pink]{that}
\highlight[pink]{there}
\highlight[pink]{these}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
The proliferation of nominalizations in a discursive formation may be an
indication of a tendency toward pomposity and abstraction.
\end{document}
It gives the following output:
How can I now also highlight nominalizations? In the sample text the words proliferation, nominalizations, formation, indication, pomposity, and abstraction should be highlighted.
In the end, the output should be highlighted like this:
(The above output is the result of running the sample text through the online Writer's Diet test).


