4

I tried out a - at least in my eyes - very simple way to capitalize words:

\documentclass{article}

\begin{document}
\begingroup%
\obeyspaces%\catcode`\ \active
\def {\space\MakeUppercase}%
Hello world
\endgroup
\end{document}

Needless to say, I also tried to wrap this into a macro like

\documentclass{article}

\newcommand{\capitalize}[1]{\begingroup\obeyspaces\def {\space\MakeUppercase}#1\endgroup}

\begin{document}
\capitalize{Hello world}
\end{document}

which will cause TeX to complain about the syntax of the inner \def and make it \inaccessible.

What exactly is going wrong and is there a way to get around this?

egreg
Ruben

5 Answers

6

This is the usual "\verb does not work inside an argument" problem.

\obeyspaces changes the catcode of the space character, which means that a space read from the file afterwards is converted to an active token. Catcode changes have no effect on tokens that have already been created. In your case the entire argument of \newcommand has already been tokenized, so there is no space token at all after \def: what TeX sees is \def{.
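The timing issue can be seen in a small test document. This is only a sketch: the macro name \demo is made up for the illustration, and the comments describe the behaviour I would expect from the standard definition of \obeyspaces.

```latex
\documentclass{article}
\begin{document}
% Case 1: the spaces are read from the file *after* \obeyspaces has run,
% so each of them becomes an active token; the gap between a and b
% should come out visibly wider than a normal interword space.
\begingroup\obeyspaces%
a   b%
\endgroup

% Case 2: the argument of the (hypothetical) macro \demo is tokenized
% while it is being grabbed, that is, before \obeyspaces is executed.
% The three spaces have already collapsed into one ordinary space token.
\newcommand{\demo}[1]{\begingroup\obeyspaces#1\endgroup}
\demo{a   b}
\end{document}
```

The second case is the situation in the question: the catcode change comes too late to affect anything that has already been read.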

You would need to change the catcode of space before the \newcommand, and \capitalize would need to change the catcode of space before taking its argument. For this and other reasons I wouldn't use a catcode change here; instead, simply use a delimited argument to find normal spaces:

\documentclass{article}

\newcommand{\capitalize}[1]{\xcapitalize#1 \relax}
\def\xcapitalize#1 #2{%
#1%
\ifx\relax#2%
\else
\space\MakeUppercase{#2}%
\expandafter\xcapitalize
\fi}

\begin{document}
\capitalize{Hello world and this}
\end{document}
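To see how the delimited argument walks through the words, here is a hand-written expansion trace for the example above. It is written as comments only, so it is not meant to be compiled on its own.

```latex
% Trace of \capitalize{Hello world and this}.
% \xcapitalize takes #1 up to the next space and #2 as the token after it.
%
% \xcapitalize Hello world and this \relax
%   #1 = Hello   #2 = w        typesets "Hello", then " W"
% \xcapitalize orld and this \relax
%   #1 = orld    #2 = a        typesets "orld", then " A"
% \xcapitalize nd this \relax
%   #1 = nd      #2 = t        typesets "nd", then " T"
% \xcapitalize his \relax
%   #1 = his     #2 = \relax   typesets "his"; \ifx\relax\relax is true,
%                              so the recursion stops here
```

The space and \relax appended by \capitalize act as the end marker, which is why the last word is still delimited by a space.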

A version, requested in the comments, that capitalizes the first letter as well and allows the argument to be a macro:

\documentclass{article}

\newcommand{\capitalize}[1]{\ignorespaces
\expandafter\expandafter\expandafter
\xcapitalize\expandafter\space #1 \relax}
\def\xcapitalize#1 #2{%
#1%
\ifx\relax#2%
\else
\space\MakeUppercase{#2}%
\expandafter\xcapitalize
\fi}

\begin{document}
\capitalize{hello world and this}

\newcommand\zzz{hello world and this}
\capitalize{\zzz}
\end{document}
David Carlisle
  • Can you work out the latter point? – Ruben Dec 19 '14 at 16:57
  • @Ruben see updated answer – David Carlisle Dec 19 '14 at 17:17
  • I see... You recursively loop over a "space-separated list". – Ruben Dec 19 '14 at 17:26
  • @Ruben yes, egreg's version does the same but he uses latex3 code because I'm in the latex3 project and he isn't or something:-) – David Carlisle Dec 19 '14 at 17:29
  • Ok! I better not ask why a member of the LaTeX-3-team refuses to use expl3 :) As an additional request: How would you make this code work with macros as input argument, e.g. \capitalize{\abc} or \capitalize\abc? – Ruben Dec 19 '14 at 23:03
  • @Ruben \newcommand{\capitalize}[1]{\expandafter\xcapitalize#1 \relax} so long as you only want to expand one level of macros before looking for space. (Incidentally there'd be no way to do that with a catcode based version) – David Carlisle Dec 19 '14 at 23:07
  • The first word isn't capitalized. – wipet Dec 20 '14 at 04:56
  • @wipet totally right. I guess a quick fix would be to add something like \newcommand{\capitalizer}[1]{\capitalize{\MakeUppercase#1}} before. (\capitalizer is then the top-level macro.) – Ruben Dec 20 '14 at 11:17
  • @wipet I know, it is strange but I thought that was the requested spec, perhaps the OP has too much exposure to Java or something:-) It's what the version in the question did, just capitalize after a space – David Carlisle Dec 20 '14 at 11:20
  • @Ruben I didn't capitalize the first letter as the code in the question did not, and had an example input where that was not required – David Carlisle Dec 20 '14 at 11:22
  • I actually did not request this one. But, in fact it would be good. Do you think the fix I proposed above is ok? – Ruben Dec 20 '14 at 11:50
  • @Ruben I put a version in the updated answer – David Carlisle Dec 20 '14 at 12:03
  • Neat. Thought about a phantom space too, but I did not manage to get there on my own. Thank you for your time. Definitely worth the accept vote. – Ruben Dec 20 '14 at 12:10
3

This solution requires a recently updated TeX distribution. Its advantage over the approach with \obeyspaces is that no category code is changed, so the macro can also be used in the argument to other commands.

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
\NewDocumentCommand{\capitalize}{m}
 {
  \ruben_capitalize:n { #1 }
 }

\seq_new:N \l__ruben_capitalize_words_seq
\seq_new:N \l__ruben_capitalize_out_seq

\cs_new_protected:Npn \ruben_capitalize:n #1
 {
  \seq_set_split:Nnn \l__ruben_capitalize_words_seq { ~ } { #1 }
  \seq_set_map:NNn \l__ruben_capitalize_out_seq \l__ruben_capitalize_words_seq
   {
    \tl_mixed_case:n { ##1 }
   }
  \seq_use:Nn \l__ruben_capitalize_out_seq { ~ }
 }
\ExplSyntaxOff

\begin{document}
\capitalize{Hello world}
\end{document}

If you also want to capitalize strings that are passed as a macro, just change the definition of \capitalize to

\NewDocumentCommand{\capitalize}{m}
 {
  \ruben_capitalize:o { #1 }
 }

and add

\cs_generate_variant:Nn \ruben_capitalize:n { o }

after the definition of \ruben_capitalize:n (that is, just before \ExplSyntaxOff).

Complete example:

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
\NewDocumentCommand{\capitalize}{m}
 {
  \ruben_capitalize:o { #1 }
 }

\seq_new:N \l__ruben_capitalize_words_seq
\seq_new:N \l__ruben_capitalize_out_seq

\cs_new_protected:Npn \ruben_capitalize:n #1
 {
  \seq_set_split:Nnn \l__ruben_capitalize_words_seq { ~ } { #1 }
  \seq_set_map:NNn \l__ruben_capitalize_out_seq \l__ruben_capitalize_words_seq
   {
    \tl_mixed_case:n { ##1 }
   }
  \seq_use:Nn \l__ruben_capitalize_out_seq { ~ }
 }
\cs_generate_variant:Nn \ruben_capitalize:n { o }
\ExplSyntaxOff

\newcommand{\myhello}{hello world}

\begin{document}
\capitalize{Hello world}

\capitalize{hello world}

\capitalize{\myhello}

\capitalize\myhello
\end{document}
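A note for readers on a current TeX distribution: to my knowledge \tl_mixed_case:n was later deprecated and removed from expl3, so the code above may stop with an undefined control sequence. The following is a sketch of the same idea, assuming a LaTeX kernel recent enough to provide \text_titlecase_first:n (and \ExplSyntaxOn and \NewDocumentCommand without loading xparse). The substitution is my assumption and not part of the original answer.

```latex
\documentclass{article}

\ExplSyntaxOn
\seq_new:N \l__ruben_capitalize_words_seq
\seq_new:N \l__ruben_capitalize_out_seq

\cs_new_protected:Npn \ruben_capitalize:n #1
 {
  % split the input at spaces
  \seq_set_split:Nnn \l__ruben_capitalize_words_seq { ~ } { #1 }
  % wrap every word in \text_titlecase_first:n (assumed replacement
  % for \tl_mixed_case:n)
  \seq_set_map:NNn \l__ruben_capitalize_out_seq \l__ruben_capitalize_words_seq
   {
    \text_titlecase_first:n { ##1 }
   }
  % put the words back together with spaces in between
  \seq_use:Nn \l__ruben_capitalize_out_seq { ~ }
 }
\cs_generate_variant:Nn \ruben_capitalize:n { o }

\NewDocumentCommand{\capitalize}{m}
 {
  \ruben_capitalize:o { #1 }
 }
\ExplSyntaxOff

\newcommand{\myhello}{hello world}

\begin{document}
\capitalize{hello world}

\capitalize{\myhello}
\end{document}
```

Like \tl_mixed_case:n, \text_titlecase_first:n also lowercases the rest of each word, so the behaviour should match the original for inputs such as the ones shown.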

The classical approach with \obeyspaces requires that you issue it before absorbing the argument:

% First setup obeyspace and give a meaning to active space
\newcommand{\capitalize}{\begingroup\obeyspaces\setupcapspace\docapitalize}
% Just absorb the argument and end the group
\newcommand{\docapitalize}[1]{#1\endgroup}
% Define (locally) the behavior of active space
\begingroup\lccode`~=`\ % <--- don't forget this one
  \lowercase{\endgroup\newcommand\setupcapspace{\def~{\space\MakeUppercase}}}

The last two lines can also be

{\obeyspaces\gdef\setupcapspace{\def {\space\MakeUppercase}}}

but the \lowercase approach avoids \obeyspaces and possible problems with spurious spaces.
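For convenience, the three definitions can be put into a complete test document. This is only an assembled sketch of the fragments above; the sample text is taken from the other answers.

```latex
\documentclass{article}

% First set up \obeyspaces and give a meaning to the active space
\newcommand{\capitalize}{\begingroup\obeyspaces\setupcapspace\docapitalize}
% Just absorb the argument (read with active spaces) and end the group
\newcommand{\docapitalize}[1]{#1\endgroup}
% \lowercase turns the active ~ into an active space inside the definition
\begingroup\lccode`~=`\ %
  \lowercase{\endgroup\newcommand\setupcapspace{\def~{\space\MakeUppercase}}}

\begin{document}
\capitalize{Hello world and this}
\end{document}
```

Because the catcode change happens before \docapitalize reads its argument, this only works when \capitalize is used at the top level, not inside the argument of another command.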

However, a delimited argument approach is surely better, because it allows \capitalize to be used in the argument to other commands.

egreg
3

A LuaLaTeX-based solution. We define a new macro called \capitalize that employs the Lua functions string.gsub, string.upper, and tex.sprint. The argument of \capitalize can be either a hard-coded string or a macro that expands to a string.

% !TEX TS-program = lualatex
\documentclass{article}
{\catcode\%=12 
 \gdef\capitalize#1{
   \directlua{ str="#1"; 
               tex.sprint ( string.gsub(" "..str, "%W%l",
                string.upper):sub(2)) } }
} 
\begin{document}

\capitalize{Once upon a time there was a princess
  who lived in a great palace that was close to the 
  edge of a dark and mysterious forest.}
\end{document}
Mico
  • Only the first letter of each word should be capitalized, not the whole string. – egreg Dec 20 '14 at 12:21
  • Maybe {\catcode`\%=12 \gdef\capitalize#1{\directlua{str="#1"; tex.sprint(string.gsub(" "..str, "%W%l", string.upper):sub(2))}}} would be better. – Mark Wibrow Dec 20 '14 at 12:42
  • @egreg - Ooops, I had missed that! (I was thinking, this is way too easy...) Mark Wibrow's suggested alternative code is perfect; I'll modify the answer accordingly. – Mico Dec 20 '14 at 12:57
  • @MarkWibrow - Many thanks, your code is perfect! – Mico Dec 20 '14 at 12:58
2

For this particular purpose (\capitalize) I prefer not to use an active space but to split the parameter into words. The reason is that the user can type more than one space between words, the end of a line is interpreted by the token processor as a space, etc. So my suggestion is:

\def\capitalize#1{\def\tmp{}\capitalizeA#1 {} }
\def\capitalizeA#1 {\ifx\end#1\end\else \capitalizeB#1 \expandafter\capitalizeA\fi}
\def\capitalizeB#1#2 {\tmp\def\tmp{ }\uppercase{#1}#2}

\capitalize{Hello world, how are you?}

This macro works in plain TeX and in LaTeX because only TeX primitives are used.
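As a sketch of how this looks inside a LaTeX document, the three definitions can go into the preamble unchanged. The input below deliberately contains several consecutive spaces and a line break; both are turned into single space tokens when the argument is read, so I would expect the output "Hello World, How Are You?".

```latex
\documentclass{article}

% Definitions copied from above: split at spaces, uppercase the first
% token of every word, and stop at the empty group appended at the end.
\def\capitalize#1{\def\tmp{}\capitalizeA#1 {} }
\def\capitalizeA#1 {\ifx\end#1\end\else \capitalizeB#1 \expandafter\capitalizeA\fi}
\def\capitalizeB#1#2 {\tmp\def\tmp{ }\uppercase{#1}#2}

\begin{document}
\capitalize{hello   world,
  how are you?}
\end{document}
```

Note that \tmp is a scratch name that this code redefines, so it should not be used for anything else in the same document.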

On the subject of an active character inside \def:

The plain TeX macro file OPmac provides the macro \adef, which activates and defines a character. It can be used inside another macro. Example:

\input opmac

\def\MakeUppercase#1{\uppercase{#1}}
\def\capitalize{\begingroup\adef{ }{ \MakeUppercase}\capitalizeA}
\def\capitalizeA#1{#1\endgroup}

\capitalize{Hello world, how are you?}

\end
wipet
  • Why is the detour with \MakeUppercase needed? – Ruben Dec 19 '14 at 17:46
  • @Ruben I think, because \uppercase needs the {.} in its argument, i.e., \uppercase a wouldn't work. While \MakeUppercase a would work. – Manuel Dec 19 '14 at 17:48
  • @Manuel but here, \MakeUppercase is a synonym of \uppercase, or not? – Ruben Dec 19 '14 at 17:50
  • @Ruben \uppercase is a primitive, so it has its own “syntax” (in this case, it requires the braces around the argument), \MakeUppercase is a normal defined macro which acts like all of them: if there are braces, then the argument is the content of the braces, if there are no braces, then the first token is the argument, so \MakeUppercase world would be converted to \uppercase{w}orld while \uppercase world would throw an error because of the missing braces around w. – Manuel Dec 19 '14 at 17:53
  • @Ruben note this isn't the definition of \MakeUppercase used in latex as in your code in the question and in the other answers. (The latex \MakeUppercase uppercases ß to SS, for example.) – David Carlisle Dec 19 '14 at 23:15
  • @DavidCarlisle OK, but this isn't the problem of first letter in the word. If we need ß to SS then this is engine dependent. For example works \def\Uppercase#1{\begingroup\def\ss{SS}\uppercase{\edef\tmp{#1}}\expandafter\endgroup\tmp } \Uppercase{ß} in csplain, because encTeX is activated here. – wipet Dec 20 '14 at 04:53
  • but still for a question that is clearly using latex it would be better if you didn't give an answer that "works with plain or latex" that redefines (breaks) a core command defined in the latex format. You could have called that command anything in this answer, there was no need to over-write the latex command – David Carlisle Dec 20 '14 at 11:06
  • @DavidCarlisle The first part of my answer is usable for plain or LaTeX, the second part is for plain only. The \MakeUppercase is defined only in the second (plain only) part. – wipet Dec 20 '14 at 11:13
1

While your question is about nested \defs, the application is capitalizing words. The titlecaps package does this with the \titlecap macro. It allows wide flexibility in the argument, including font style and size changes. It also lets you set exclusion words that are not capitalized (except, optionally, as the first word of the argument). It can, to a large extent, cope with leading punctuation (like parentheses and brackets) when capitalizing words, and it can capitalize letters with diacritics.

In the MWE, I show your "hello world" example and then employ the over-the-top sample from the package documentation.

\documentclass{article}
\usepackage{titlecaps}
\def\bs{$\backslash$}

\begin{document}
\titlecap{hello world}

\Addlcwords{for a is but and with of in as the etc on to if}
\titlecap{% 
to know that none of the words typed in this paragraph were initially
upper cased might be of interest to you.  it is done to demonstrate the
behavioral features of this package.  first, you should know the words
that i have pre-designated as lower case.  they are:  ``for a is but and
with of in as the etc on to if.''  you can define your own list.  note
that punctuation, like the period following the word ``if'' did not mess
up the search for lower case (nor did the quotation marks just now).
punctuation which is screened out of the lower-cased word search pattern
include . , : ; ( ) [ ] ? ! ` ' however, I cannot screen text braces;
\{for example in\} is titled, versus (for example in), since the braces
are not screened out in the search for pre-designated lower-case words
like for and in.  However, \texttt{\bs textnc} provides a workaround:
\{\textnc{for example in}\}.  titlecap will consider capitalizing
following a (, [, \{, or - symbol, such as (abc-def).  you can use your
text\textit{\relax xx} commands, like i just did here with the prior xx,
but if you want the argument of that command to not be titled, you
either need, in this example, to add \textit{xx} to the lowercase word
list, which you can see i did not.  instead, i put ``\bs relax~xx'' as
the argument, so that, in essence, the \bs relax was capitalized, not
the x.  Or you could use \texttt{\bs textnc} .  here i demonstrate that
text boldface, \textbf{as in the \bs textbf command}, also works fine,
as do \texttt{texttt}, \textsl{textsl}, \textsc{textsc},
\textsf{textsf}, \textit{etc}.  titlecap will work on diacritical marks,
such as \"apfel, \c cacao \textit{etc.}, \scriptsize fontsize \LARGE
changing commands\normalsize\unskip, as well as national symbols such as
\o laf, \ae gis, and \oe dipus.  unfortunately, i could not get it to
work on the \aa~nor the \l~symbols. the method will work with some
things in math mode, capitalizing symbols if there is a leading space,
$x^2$ can become $ x^2$, and it can process but it will not capitalize
the greek symbols, such as $\alpha$, and will choke on most macros, if
they are not direct character expansions.  Additionally,
\textsf{titlecaps} also works with font changing declarations, for
example, \bs itshape\bs sffamily. \itshape\sffamily you can see that it
works fine.  likewise, any subsequent \bs textxx command will, upon
completion, return the font to its prior state, such as this
\textbf{textbf of some text}.  you can see that i have returned to the
prior font, which was italic sans-serif. now I will return to upright
roman\upshape\rmfamily.  a condition that will not behave well is inner
braces, such as \ttfamily \bs titlecap\{blah \{inner brace material\}
blah-blah\}. \rmfamily see the section on quirks and limitations for a
workaround involving \texttt{\bs textnc}.  titlecap will always
capitalize the first word of the argument (\textbf{even if it is on the
lower-case word list}), unless \texttt{\bs titlecap} is invoked with an
optional argument that is anything other than a capital p.  in that case,
the first word will be titled \textit{unless} it is on the lowercase
word list.  for example, i will do a \bs titlecap[\relax s]\{\relax
a~big~man\} and get ``\titlecap[s]\textnc{a big man}'' with the ``a''
not titled.  i hope this package is useful to you, but as far as using
\textsf{titlecaps} on such large paragraphs\ldots \textbf{do not try
this at home!}}
\end{document}
