I am writing a proposal with strict character limits.

I use the following in the preamble:

% Compile with  --enable-write18 or --shell-escape options   
\immediate\write18{texcount -char -inc -tex -sum <file> > <path>/count.tex}

the following throughout the document, as necessary:

%TC:ignore 
%TC:endignore 

and the following at the end:

% display information on document
\section{Document info}
\verbatiminput{<path>/count}

Here is an MWE:

\documentclass{article}     
\usepackage{moreverb}

% Compile with --enable-write18 or --shell-escape options
\immediate\write18{texcount -char -inc -tex -sum mwe.tex > count.tex}

\begin{document}

\section*{Section A}

Content for section A

\section*{Another section with subsections}

Introduction to section

\subsection*{The first subsection}

This subsection has somewhat more voluminous content which tends to go on and on...

\subsection*{The second subsection }

Content aimed at testing whether math and symbols are counted: $\int$, $\Sigma$.

\section*{A section with externalized content}

\input{externalized.tex}

%TC:ignore

\section*{An ignored section}

\input{externalized.tex}

%TC:endignore

% display information on document
\section{Document info}
\verbatiminput{count.tex}

\end{document}

The problem is that I spend too much time deciphering the results.

Here is what a good solution would look like, in my view:

  1. The user specifies a constraint (a maximum number of characters) for each section, stored in \def\thecharacterlimit{<limit>}.

  2. The output looks like: <section name>: <actual chars> / <limit no. chars>. A common use case may be the equivalent measured in words instead of characters.

  3. The expression <actual chars> / <limit no. chars> is colored red or green depending on whether the count is above or below the limit, respectively.

  4. Ideally, the results for a given section would be displayed at the end of the corresponding section.

  5. It could also display the amount by which a section is over its limit.

That said, a quick fix or a few tricks would be hugely valuable.

** UPDATE **

I am now getting the output:

This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 20.12)
entering extended mode
(mwe.tex
LaTeX2e <2020-10-01> patch level 2
L3 programming layer <2020-12-07> xparse <2020-03-03>
("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/base\article.cls"
Document Class: article 2020/04/10 v1.4m Standard LaTeX document class
("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/base\size10.clo"))

("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/filecontents\filec ontents.sty"

Package filecontents Warning: This package is obsolete. Disabling it and (filecontents) passing control to the filecontents environment (filecontents) defined by the LaTeX kernel.

)'texcount' is not recognized as an internal or external command, operable program or batch file.

LaTeX Warning: File `texcountinc.tex' already exists on the system. Not generating it from this source.

("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/xstring\xstring.st y" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/generic/xstring\xstring. tex")) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/base\ifthen.sty") ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/xcolor\xcolor.sty"

("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/graphics-cfg\color .cfg") ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/graphics-def\pdfte x.def")) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/hyperref\nameref.s ty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/refcount\refcount. sty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/generic/ltxcmds\ltxcmds. sty") ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/generic/infwarerr\infwar err.sty")) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/generic/gettitlestring\g ettitlestring.sty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/kvoptions\kvoption s.sty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/graphics\keyval.st y") ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/generic/kvsetkeys\kvsetk eys.sty")))) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/lipsum\lipsum.sty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/l3kernel\expl3.sty " ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/l3backend\l3backen d-pdftex.def")) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/l3packages/xparse
xparse.sty" ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/l3packages/xparse
xparse-generic.tex")) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/latex/lipsum\lipsum.ltd. tex")) (mwe.aux) ("C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\tex/context/base/mkii\supp-p df.mkii" [Loading MPS to PDF converter (version 2006.09.02).] ) (texcountinc.tex) \sectioncount=

Are there any hints in the log as to why the implementation is no longer working? Are any dependent packages out of date?

Via TeXworks with MiKTeX, I configure the typesetting as follows

adding paths

other settings

I uninstalled and reinstalled texcount. I can confirm that texcount.exe is installed here:

C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\miktex\bin\x64

and texcount.pl is installed here:

C:\Users\Chris\AppData\Local\Programs\MiKTeX 2.9\scripts\texcount

  • Counting words in a TeX document is hard. You can probably achieve your aim with LuaTeX but that involves non-trivial processing. For an external solution texcount works well. – TeXnician Feb 09 '20 at 22:10
  • Edited question to express that the full solution described is not required; a quick fix would be helpful. – John Chris Feb 10 '20 at 18:23
  • In your MWE the section called An ignored section is the only thing that is not ignored (the rest of the document is between ignore and endignore), is that intentional? Also, do you want the counting information only for one specified section, or for all (not-ignored) sections in the document? – Marijn May 30 '20 at 13:36
  • @Marijn, good find. Edited question. Does it make sense now? – John Chris May 30 '20 at 13:48
  • The new error that you get seems to indicate that you don't have texcount installed (anymore). Does it work when you run texcount from a terminal/command prompt? Alternatively, since it is Windows, it could also be a path issue (i.e., texcount is installed but the editor or command prompt that you use to compile your document does not have access to the path where texcount is). Unrelated question: is it still the same proposal? Did it pass? – Marijn Apr 27 '21 at 12:43
  • @Marijn, kind thanks for this. I added some notes. Regarding the unrelated question (smiling) no we haven't submitted yet, waiting on a solution to this technical issue ;) Fingers crossed. – John Chris Apr 27 '21 at 14:26
  • Interestingly, I now find that the MWE proposed below compiles on one Windows system, but not another. I wonder what I should be looking for? – John Chris Apr 27 '21 at 14:39
  • I am not really familiar with TeXworks, but these paths seem to be where TeXworks looks for programs such as texcount when you select them from a menu (or press a button or keyboard shortcut etc.). However, the MWE runs texcount from within pdflatex, so TeXworks does not have any control of the paths there, this is presumably handled by the operating system (i.e., Windows). So, a few things to check: first, check in Windows Explorer (on both systems) if texcount is actually present in the indicated folder, or in another folder (use search in Explorer if you don't see it), or – Marijn Apr 27 '21 at 15:18
  • if texcount is not present at all. If it is not there, install it (through the MiKTeX package manager). If it is there, open a Command Prompt and type texcount + enter. If that doesn't work, find the system-wide path settings in the Windows Control Panel and add the path for texcount. Close the command prompt and open a new one before you test. If it does work in the Command Prompt but not through TeXworks (if you modify the Windows path then you possibly need to restart TeXworks as well) then you need to find a way to sync the path variable somehow. – Marijn Apr 27 '21 at 15:23
  • Problem solved. Yes, the Windows Path environment variable is the key. Kind thanks for your suggestions. – John Chris Apr 28 '21 at 09:27
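
The path check discussed in the comments can also be run from inside a document. The following is only a rough sketch, assuming pdfTeX (\pdfshellescape is a pdfTeX primitive) and assuming that texcount accepts -version. The file name tcversion.txt is a made-up name for illustration.

```latex
\documentclass{article}
\usepackage{verbatim}

% Assumption: pdfTeX engine. \pdfshellescape is 1 when shell escape is
% fully enabled (0 = disabled, 2 = restricted).
\ifnum\pdfshellescape=1
  % tcversion.txt is a hypothetical scratch file name.
  \immediate\write18{texcount -version > tcversion.txt}
\else
  \typeout{Shell escape is not fully enabled; texcount will not be run.}
\fi

\begin{document}

\IfFileExists{tcversion.txt}
  {\verbatiminput{tcversion.txt}}
  {No output file was written.}

\end{document}
```

If the file is missing or empty, that would suggest the shell started by pdflatex could not find or run texcount, which points back to the Path setting rather than to the TeX code. A file left over from an earlier run can mask this, so delete tcversion.txt before testing.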

1 Answer

The obvious answer (to me at least) is to pipe the texcount output through a script by inserting | scriptingprogram before the >. But let's assume you want an entirely TeX-based solution. I made the following changes:

  • The hardest part (for me) ended up being that texcount puts a # in its output. I finally decided to ignore that line and simply only use lines with section in them (I also couldn't get grep -v '#' to work).
  • It seemed like -inc wasn't showing section counts of input files; -merge does.
  • I'm using \jobname to make this a bit more portable (but \input{\jobname suffix} confuses texcount).

After each [sub]section, you can use the command \withlimit{#} to set a limit for that region. If you don't, you'll end up with -1 at the end. texcount gets confused if \withlimit is on the same line as the section command. TeX doesn't really work with associative arrays, so \withlimit defines some commands based on the section title. This means that the section titles are really fragile. Anything strange (math, other commands, etc) in them will mess things up.
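
The "commands based on the section title" trick can be shown in isolation. This is a minimal sketch rather than the code used in the full example: \setlimit and \getlimit are hypothetical helper names, and the limit@ prefix is an arbitrary choice to reduce the chance of name clashes.

```latex
\documentclass{article}

% Hypothetical helpers (not part of the full example):
% \setlimit{<title>}{<limit>} defines a macro whose name contains the title.
\newcommand*{\setlimit}[2]{%
  \expandafter\def\csname limit@#1\endcsname{#2}}

% \getlimit{<title>} returns the stored limit, or -1 if none was set.
\newcommand*{\getlimit}[1]{%
  \ifcsname limit@#1\endcsname
    \csname limit@#1\endcsname
  \else
    -1%
  \fi}

\begin{document}

\setlimit{Section A}{25}

Limit for Section A: \getlimit{Section A}.

Limit for an unknown title: \getlimit{No such section}.

\end{document}
```

The lookup only works when the title expands to plain characters, which is why math or other commands in a section title break the approach.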

I adapted reading a file line by line from https://tex.stackexchange.com/a/137198/107497 and used a fair bit of string manipulation. This (and the associative arrays) suggests to me that a LaTeX3 (or any other programming language) approach would be simpler.
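
For comparison, here is a rough sketch of what an expl3 version might look like. It is not a drop-in replacement for the code below. It assumes LaTeX 2020-10-01 or later (for \NewDocumentCommand and \ExplSyntaxOn without extra packages), and it reads a made-up sample file, demoCount.txt, instead of real texcount output. All demo names (\setlimit, \showcounts, \demo_show_line:n) are hypothetical.

```latex
\documentclass{article}
\usepackage{xcolor}

% Made-up sample data standing in for the filtered texcount output.
\begin{filecontents*}[overwrite]{demoCount.txt}
21+3+0 (1/0/0/0) Section: Section A
67+4+0 (1/0/0/0) Subsection: The first subsection
\end{filecontents*}

\ExplSyntaxOn
\prop_new:N \g_demo_limits_prop
\ior_new:N  \g_demo_counts_ior
\seq_new:N  \l_demo_match_seq
\tl_new:N   \l_demo_title_tl
\tl_new:N   \l_demo_limit_tl

% \setlimit{<section title>}{<limit>}: store a limit keyed by the title.
\NewDocumentCommand \setlimit { m m }
  { \prop_gput:Nnn \g_demo_limits_prop {#1} {#2} }

% Parse one line: leading count, then the title after "ection: ".
\cs_new_protected:Npn \demo_show_line:n #1
  {
    \regex_extract_once:nnNTF { \A (\d+) \+ .* ection: \s (.*) \Z }
      {#1} \l_demo_match_seq
      {
        \tl_set:Nx \l_demo_title_tl { \seq_item:Nn \l_demo_match_seq { 3 } }
        % Fall back to -1 when no limit was stored for this title.
        \prop_get:NVNTF \g_demo_limits_prop \l_demo_title_tl \l_demo_limit_tl
          { }
          { \tl_set:Nn \l_demo_limit_tl { -1 } }
        \tl_use:N \l_demo_title_tl :~
        \int_compare:nNnTF
          { \seq_item:Nn \l_demo_match_seq { 2 } } > { \l_demo_limit_tl }
          { \textcolor { red } }
          { \textcolor { green } }
          { \seq_item:Nn \l_demo_match_seq { 2 } / \l_demo_limit_tl }
        \par
      }
      { }
  }

% \showcounts{<file>}: print one line per section entry in the file.
\NewDocumentCommand \showcounts { m }
  {
    \ior_open:NnTF \g_demo_counts_ior {#1}
      {
        \ior_str_map_inline:Nn \g_demo_counts_ior
          { \demo_show_line:n {##1} }
        \ior_close:N \g_demo_counts_ior
      }
      { File~#1~not~found. }
  }
\ExplSyntaxOff

\begin{document}
\setlimit{Section A}{25}
\showcounts{demoCount.txt}
\end{document}
```

With the sample data, Section A (21 against a limit of 25) should come out green and the subsection without a stored limit should be shown against -1 in red, mirroring the behaviour of the code below. The property list replaces the title-based command names, and lines that do not match the pattern are skipped, so a grep step would probably not be needed.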

\documentclass{article}     

\usepackage{filecontents}
\begin{filecontents*}{texcountinc.tex}
Input contents.

\subsection*{Input section title}
\withlimit{40}
File contents that will get input elsewhere. $a^2+b^2=c^2$.
\[\frac{\sin A}a=\frac{\sin B}b=\frac{\sin C}c\]
\end{filecontents*}

% Compile with  --enable-write18 or --shell-escape options   
\immediate\write18{texcount -char -merge -tex -sum \jobname.tex | grep -i section > \jobname Count.txt} % counts characters
%\immediate\write18{texcount -merge -tex -sum \jobname.tex | grep -i section > \jobname Count.txt} % counts words
\newcommand{\limitcount}{-1}

\usepackage{xstring}
\usepackage{ifthen}
\usepackage{xcolor}
\usepackage{nameref}

\newcommand{\processCount}{%
 \newread\counts
 \def\zpar{\par}
 \openin\counts=\jobname Count.txt
 \loop
 \read\counts to \sectioncount
 \ifx\sectioncount\zpar\else
 \showcount{\sectioncount}\\
 \fi
 \ifeof\counts
 \else
 \repeat
}

\newcommand*{\showcount}[1]{%
 % e.g. 67+18+0 (1/0/0/0) S[ubs]ection: The first subsection
 \StrBehind{#1}{ection: }[\sectiontitleplusspace]
 \StrGobbleRight{\sectiontitleplusspace}{1}[\sectiontitle]
 \StrBefore{#1}{+}[\thiscount]
 \expandafter\ifcsname\sectiontitle limit\endcsname%
  \renewcommand{\limitcount}{\csname\sectiontitle limit\endcsname}%
 \else%
  \renewcommand{\limitcount}{-1}%
 \fi%
 \sectiontitle:
 {%
  \ifthenelse{\thiscount>\limitcount}{%
   \textcolor{red}{\thiscount/\limitcount}%
   \ifthenelse{\limitcount>-1}{%
    \ (over by \number\numexpr\thiscount-\limitcount\relax)%
   }{}%
  }{%
   \textcolor{green}{\thiscount/\limitcount}%
  }%
 }
}

\makeatletter
\newcommand*{\withlimit}[1]{%
 \expandafter\newcommand\csname\@currentlabelname limit\endcsname{#1}
}
\makeatother

\usepackage{lipsum}

\begin{document}

\section*{Section A Under Limit}
\withlimit{25}

Content for section A

\section*{Another section with subsections over limit}
\withlimit{15}

Introduction to section

\subsection*{The first subsection no stated limit}

This subsection has somewhat more voluminous content which tends to go on and on...

\subsection*{Extra spaces ignored  }
\withlimit{50}

Content aimed at testing whether math and symbols are counted: $\int$, $\Sigma$.


\section*{texcount doesn't understand lipsum}
\withlimit{1}

\lipsum[1]

\subsection*{Inputing Content}
\withlimit{18}

\input{texcountinc}

%TC:ignore 

\section*{An ignored section}

\lipsum[2]

%TC:endignore 

% display information on document
\section{Document info}
\processCount

\end{document}

final sample output

Teepeemm
  • 6,708
  • Brilliant stuff. Three questions/comments emerge: (1) I would be curious to see explicit examples of how to count words versus characters. (2) Does this work on externalized section content (e.g. if \input is called within a section)? (3) Note that \limitcount must be redefined in each section, since each section will have a different value for \limitcount. – John Chris Jun 02 '20 at 10:29
  • @JohnChris The answer to (1) and (2) is that we're simply parsing the output of texcount, so anything it can do so can we. Removing -char counts words instead. \input seems to work with -merge (but not -inc). I've implemented (3), as long as you don't have anything unusual in the section titles. – Teepeemm Jun 02 '20 at 16:06
  • Testing on a W10 system with TeXworks and Perl v5.30.0, the file <name>Count.txt is written, yet is an empty file. Can you post the compile sequence? – John Chris Jul 05 '20 at 15:51
  • @JohnChris Just pdflatex --enable-write18 <name>.tex on a Mac with Perl v5.24.2 (although I am now seeing an error "File texcountinc not found in path"). – Teepeemm Jul 05 '20 at 16:12
  • Note that one must install grep. On a Windows system, the download available here: http://gnuwin32.sourceforge.net/packages/grep.htm worked. One must also add the location of the grep executable to the user's path definition. – John Chris Jul 05 '20 at 17:08
  • Any simple way to exclude sections from counting? (for instance the last section in which the count results are displayed) – John Chris Jul 05 '20 at 17:15
  • @JohnChris Isn't that the purpose of %TC:(end)ignore? You could skip the grep part by iterating through the file and only using lines that contain section. – Teepeemm Jul 05 '20 at 17:18
  • Yes, correct. My apologies. The implementation works brilliantly after installing grep. Well done. – John Chris Jul 05 '20 at 17:32
  • When porting to a more complex document that calls more packages, I get ! LaTeX Error: Command \limit already defined. Any chance there could be a conflict with the variable names chosen for this implementation? – John Chris Jul 05 '20 at 18:02
  • @JohnChris It wouldn't surprise me. You should be able to change the three occurrences of limit to something unique to avoid conflict. But "\limit already defined" would suggest that \@currentlabelname is empty, which sounds like you have an empty section heading. – Teepeemm Jul 05 '20 at 22:13
  • Thanks, @Teepeemm. Replacing limit does not solve the issue. To the best of my knowledge, there are no empty section headings. I have some long section headings with question marks, ? and /. I also have \chapter{<text>}. – John Chris Jul 06 '20 at 09:28
  • It might be worthwhile making this into a new question, that links back to this, with a minimal example. – Teepeemm Jul 06 '20 at 12:36
  • It seems there is a conflict with the titlesec package. I will produce an MWE in a new question. – John Chris Jul 12 '20 at 17:14
  • I can no longer get this MWE working – John Chris Apr 27 '21 at 11:00