This question is based on this answer.

I found that the first letter of a word goes missing when the word follows a dash, as the MWE below shows:

\documentclass{article}
\usepackage{readarray}
\usepackage{ifthen}
\newcounter{index}\setcounter{index}{0}
\def\firstletters#1{%
  \getargsC{#1}%
  \whiledo{\theindex<\narg}{%
    \stepcounter{index}%
    \edef\nextword{\csname arg\romannumeral\theindex\endcsname}%
    \expandafter\getfirst\nextword\relax%
  }%
}
\def\getfirst#1#2\relax{#1}
\begin{document}
\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
\end{document}

Kumaresh PS

6 Answers

The datatool package provides \DTLinitials. For example:

\documentclass{article}

\usepackage{datatool-base}

\begin{document}

\DTLinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}

\end{document}

T.i.a.t.o.t.E.B.S.T.-T.f.s.T.T.

This automatically inserts a period after each initial, but that can be prevented by redefining \DTLafterinitials, \DTLbetweeninitials and \DTLafterinitialbeforehyphen to do nothing.

\documentclass{article}

\usepackage{datatool-base}

\renewcommand*{\DTLbetweeninitials}{}
\renewcommand*{\DTLafterinitials}{}
\renewcommand*{\DTLafterinitialbeforehyphen}{}

\begin{document}

\DTLinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}

\end{document}

TiatotEBST-TfsTT

If you need the initials in an expandable context, you first need to use \DTLstoreinitials, which will save the initials in the command provided in the second argument:

\DTLstoreinitials{This is a test of the Emergency Broadcast System.
This-Test. for sample. This T.}{\initials}

\initials

Edit: if you also want to remove the hyphen from the initials, just redefine \DTLinitialhyphen to do nothing as well:

\renewcommand*{\DTLinitialhyphen}{}

Edit2: Note that \DTLinitials is designed primarily for names (its original purpose was for use with the abbreviated bibliography style provided by databib) so it assumes its argument is a series of letters separated by spaces or hyphens. Additionally from the manual:

Be careful if the initial letter has an accent. The accented letter needs to be placed in a group, if you want the initial to also have an accent, otherwise the accent command will be ignored.

So, as per your comment below:

\DTLinitials{{\"{O}}zg\"{u}r}

Or use XeLaTeX or LuaLaTeX with UTF-8 characters. This is similar to the limitations of \makefirstuc (from mfirstuc).

Also from the datatool manual:

In fact, any command which appears at the start of the name that is not enclosed in a group will be ignored.

This means that, say

\DTLinitials{\MakeUppercase{m}ary ann}

will produce m.a. not M.a.

Nicola Talbot

Here is a solution based on classical TeX only:

\def\firstletters{\bgroup \catcode`-=10 \catcode`(=10 \filA}
\def\filA#1{\filB#1 {\end} }
\def\filB#1#2 {\ifx\end#1\egroup \else#1\expandafter\filB\fi} 

\firstletters{This is a test of the Emergency Broadcast System. 
   This-Test. for sample (per se). This T.}

\bye
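
For readers less familiar with plain TeX, here is a commented sketch of how the three macros above cooperate, as I read them. The names \initialsdemo, \idA and \idB are hypothetical renames so that the sketch does not clash with the original definitions, and only the hyphen is remapped here. With the input shown it should typeset TiaTT.

```latex
% Plain TeX; compile with tex or pdftex.
\def\initialsdemo{%
  \bgroup            % open a group so the catcode change stays local
  \catcode`-=10      % read '-' as a space while the argument is tokenized
  \idA}
% Append a sentinel "word" {\end}, followed by the space that delimits it.
\def\idA#1{\idB#1 {\end} }
% #1 = first token of the current word, #2 = the rest up to the next space.
\def\idB#1#2 {%
  \ifx\end#1%          sentinel reached?
    \egroup          % yes: close the group, which restores the catcodes
  \else
    #1\expandafter\idB % no: print the initial, then handle the next word
  \fi}

\initialsdemo{This is a This-Test}

\bye
```

The catcode must be changed before the argument is read, which is why the work is split between a macro without parameters and \idA, which actually grabs the text.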
wipet
  • I like this simplicity! Is there a way to preserve specific other characters? For example, given the input: "Hello, you wonderful world!" I would like the output: "H, yww!" (preserving the ',' and the '!') – Dan Cranston Jun 05 '23 at 14:52
  • Also, I am much less familiar with TeX than with LaTeX, so could you offer a little explanation of what your solution is doing? (I get the general idea of \def and #1#2, but don't understand the \bgroup, \egroup, and \catcode) Thanks! – Dan Cranston Jun 05 '23 at 14:54
\documentclass{scrartcl}

\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand \firstletters { m } { \kumaresh_firstletters:n { #1 } }
\cs_new_protected:Npn \kumaresh_firstletters:n #1
 {
  \tl_set:Nn \l_tmpa_tl { #1 }
  \tl_replace_all:Nnn \l_tmpa_tl { - } { ~ }
  \seq_set_split:NnV \l_tmpa_seq { ~ } \l_tmpa_tl
  \seq_map_inline:Nn \l_tmpa_seq { \tl_head:n { ##1 } }
 }
\ExplSyntaxOff

\begin{document}

\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}

\end{document}

Here's a version that copes with traditional TeX accents (I listed only a few; add any others you need to the definition). This is probably at the limit of what is reasonable with the predefined expl3 scratch variables; it is recommended to define your own variables rather than rely on the default \l_tmpa_tl and friends.

This version also copes, in a basic way, with commands of the form \emph{words here}, converting that to \emph{wh}. It likewise handles [brackets and (parentheses)] (and whatever other symbols you add), converting that to bap.

\documentclass{scrartcl}

\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand \firstletters { m } { \kumaresh_firstletters:n { #1 } }
\cs_new_protected:Npn \kumaresh_firstletters:n #1
 {
  \tl_set:Nn \l_tmpa_tl { #1 }
  \tl_replace_all:Nnn \l_tmpa_tl { - } { ~ } % here we convert dashes into spaces for our function
  \tl_map_inline:nn { [( } % here we remove certain symbols (and whatever you add) so that it doesn't interfere
   { \tl_remove_all:Nn \l_tmpa_tl { ##1 } }
  \seq_set_split:NnV \l_tmpa_seq { ~ } \l_tmpa_tl
  \seq_map_inline:Nn \l_tmpa_seq { \kumaresh_firstletters_head:n { ##1 } }
 }
\cs_generate_variant:Nn \tl_if_in:NnTF { NV }
\tl_const:Nn \c_kumaresh_accents_tl
 { \^ \" \' \` \H \. \d \~ \v } % here should be all accents
\tl_new:N \g_kumaresh_fl_exceptions_tl
\tl_gset:Nn \g_kumaresh_fl_exceptions_tl
 { \MakeUppercase \emph \textbf } % add here functions for your exceptions
\cs_new_protected:Npn \kumaresh_firstletters_head:n #1
 {
  \tl_set:Nx \l_tmpa_tl { \tl_head:n { #1 } }
  \tl_if_in:NVTF \c_kumaresh_accents_tl \l_tmpa_tl
   { \kumaresh_firstletter_accent:NNw #1 \q_stop }
   {
    \tl_if_in:NVTF \g_kumaresh_fl_exceptions_tl \l_tmpa_tl
     { \kumaresh_firstletter_exception:Nnw #1 \q_stop }
     { \tl_use:N \l_tmpa_tl }
   }
 }
\cs_new_protected:Npn \kumaresh_firstletter_accent:NNw #1 #2 #3 \q_stop
 { #1 {#2} }
\cs_new_protected:Npn \kumaresh_firstletter_exception:Nnw #1 #2 #3 \q_stop
 { #1 { \kumaresh_firstletters:n { #2 } } }
\ExplSyntaxOff

\begin{document}

\firstletters{\"{O}zg\"{u}r \MakeUppercase{This is} a \emph{test of} the \textbf{Emergency Broadcast} System. (This-Test). [for sample]. This \'T.}

\end{document}
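
As a sketch of the advice above about defining your own variables, the first, simpler version could be written with dedicated variables instead of \l_tmpa_tl and \l_tmpa_seq. The variable names \l_kumaresh_fl_text_tl and \l_kumaresh_fl_words_seq are hypothetical; the logic is otherwise unchanged from the first listing.

```latex
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
% Dedicated variables (hypothetical names) instead of the shared scratch ones
\tl_new:N  \l_kumaresh_fl_text_tl
\seq_new:N \l_kumaresh_fl_words_seq
\NewDocumentCommand \firstletters { m } { \kumaresh_firstletters:n { #1 } }
\cs_new_protected:Npn \kumaresh_firstletters:n #1
 {
  \tl_set:Nn \l_kumaresh_fl_text_tl { #1 }
  % treat hyphens as word separators
  \tl_replace_all:Nnn \l_kumaresh_fl_text_tl { - } { ~ }
  % split at spaces and print the first token of each item
  \seq_set_split:NnV \l_kumaresh_fl_words_seq { ~ } \l_kumaresh_fl_text_tl
  \seq_map_inline:Nn \l_kumaresh_fl_words_seq { \tl_head:n { ##1 } }
 }
\ExplSyntaxOff

\begin{document}
\firstletters{This is a test. This-Test.}
\end{document}
```

Private variables avoid surprises when another package or a nested call also uses the scratch variables.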

Manuel
  • Yes, I wrote this and then forgot. – Manuel Jul 08 '16 at 21:41
  • Your macro correctly returns nothing, i.e., an empty string, if the argument of \firstletters is empty. However, the macro doesn't appear to guard against the possibility that the very first few characters of the string might not be letters. E.g., \firstletters{.This} returns . rather than T. – Mico Jul 08 '16 at 21:42
  • This is a general approach that separates each string at spaces and takes the first character. And only one extra step is caring about -. If one needs to have more things in account one might want to do \tl_map_inline:nn { ().; } { \tl_remove_all:Nn \l_tmpa_tl { ##1 } } and that will remove all those symbols from the equation. – Manuel Jul 08 '16 at 21:46
  • \firstletters{\"{O}zg\"{u}r} produces O, whereas \firstletters{{\"{O}}zg\"{u}r} produces the proper output Ö. Any advice on this? – Kumaresh PS Jul 09 '16 at 09:05
  • @KumareshPS That's simple. I will add in a few minutes. But coping with plain \firstletters{Özgür} requires more work. – Manuel Jul 09 '16 at 11:00
  • @KumareshPS Done. I think it could easily be generalized further to work on \firstletters{\MakeUppercase{This type} and \emph{that too}} to output \MakeUppercase{Tt}a\emph{tt}. In case you are interested. – Manuel Jul 09 '16 at 11:16
  • @Manuel: Marvelous! You are a L3 Magician... – Kumaresh PS Jul 09 '16 at 11:24
  • @KumareshPS Added, plus taking care of ([ but you can add whatever to that list. – Manuel Jul 09 '16 at 11:42

With a regex, we keep the first letter of each word and remove everything after it, up to and including the next space or hyphen.

\documentclass{article}
\usepackage{xparse,l3regex}

\ExplSyntaxOn
\NewDocumentCommand{\firstletters}{m}
 {
  \kumaresh_firstletters:n { #1 }
 }

\tl_new:N \l_kumaresh_fl_input_tl

\cs_new_protected:Nn \kumaresh_firstletters:n
 {
  \tl_set:Nn \l_kumaresh_fl_input_tl { #1 ~ }
  \regex_replace_all:nnN { ([A-Za-z]).*?[-\s]} { \1 } \l_kumaresh_fl_input_tl
  \tl_use:N \l_kumaresh_fl_input_tl
 }
\ExplSyntaxOff

\begin{document}
\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
\end{document}
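
If leading non-letters or an empty argument matter, one possible variation is a two-pass replacement. This is only a sketch under stated assumptions: it assumes a kernel recent enough that the regex functions are built into expl3, it treats only ASCII letters as letters, and the names \firstlettersonly and \l_sketch_fl_input_tl are hypothetical. For the input shown it should reduce the text to TiatTT.

```latex
\documentclass{article}
\usepackage{xparse} % loads expl3; the regex functions are assumed to be in the kernel
\ExplSyntaxOn
\tl_new:N \l_sketch_fl_input_tl % hypothetical variable name
\NewDocumentCommand{\firstlettersonly}{m} % hypothetical command name
 {
  \tl_set:Nn \l_sketch_fl_input_tl { #1 }
  % pass 1: shrink every run of ASCII letters to its first letter
  \regex_replace_all:nnN { ([A-Za-z])[A-Za-z]* } { \1 } \l_sketch_fl_input_tl
  % pass 2: drop every remaining token that is not an ASCII letter
  \regex_replace_all:nnN { [^A-Za-z] } { } \l_sketch_fl_input_tl
  \tl_use:N \l_sketch_fl_input_tl
 }
\ExplSyntaxOff

\begin{document}
\firstlettersonly{()This is a test. This-Test.}
\end{document}
```

Because every non-letter is a separator here, a word such as don't contributes two initials, and an empty argument yields nothing because no trailing space is appended.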

egreg
  • Your method doesn't appear to notice that the very first characters of the string may not be alphabetical characters. E.g., if the string is given by "()This ...", your macro returns "()T" rather than "T". Also, it looks like if the string is entirely empty, a single space rather than an empty string is returned. – Mico Jul 08 '16 at 21:32
  • @Mico There are no real specifications, so I just assumed letters, spaces and hyphens; it's easy to cope with empty strings if needed. – egreg Jul 08 '16 at 21:51

Here's another LuaLaTeX-based solution. It tests if the string contains any alphabetical characters, and it does nothing if no alphabetical characters are found. It is not assumed that the first character of the string is a letter-type character. The proposed solution can handle non-ASCII-encoded letters such as ä, Ä, and Å.

\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode} % for 'luacode' env. and '\luaexec' macro
\begin{luacode}
local i, w , wstring
function fl ( s )
   i = unicode.utf8.find ( s , "%w")
   -- Do nothing if i=="nil", i.e., if 's' doesn't 
   -- contain at least one alphabetical character:
   if i ~= nil then
      -- Pick up the first letter of first word:
      wstring = unicode.utf8.sub ( s , i , i ) 
      s = unicode.utf8.sub ( s , i+1 )
      -- Pick up the first letters of all remaining words:
      for w in unicode.utf8.gmatch ( s , "%W%w" ) do
         wstring = wstring .. unicode.utf8.sub ( w , 2 )
      end
      tex.sprint ( wstring )
   end
end
\end{luacode}
\newcommand{\firstletter}[1]{\luaexec{fl(\luastring{#1})}}

\begin{document}
\firstletter{This is a test of the Emergency Broadcast System. This-Test. for sample. This T. per se}

% Same string, but with additional non-letter characters
\firstletter{@--?#&$() []<>^_ This is a test of the 
   Emergency    Broadcast System. This--Test. 
   for sample. This T. 
   (per se)}

% Words that start with non-ASCII-encoded characters
\firstletter{$$$ähnlich "öffentlich *übrigens !?<>Äpfel 
   Özgür  ((((^Übung    .ßcheusslich+++ ,===Ångstrom}

\firstletter{!@#$^&*()!@#$^&*()_+-={}[]|\\;<>?Ö} 

% Two strings without any "words"
a\firstletter{"("§$&/)@@=}b\firstletter{}c 

\end{document}
Mico

This takes the earlier, insufficient answer you cite (which was mine, by the way) and augments it by making the - active and equal to a space before executing the earlier code. The dash-made-space thus allows the subsequent letter to be detected as the beginning of a new word.

\documentclass{article}
\usepackage{readarray}
\usepackage{ifthen}
\newcounter{index}\setcounter{index}{0}
\catcode`-=\active %
\def-{ }
\catcode`-=12 %
\def\firstletters{\catcode`-=\active \firstlettersX}
\def\firstlettersX#1{%
  \setcounter{index}{0}% reset so that repeated calls start at the first word
  \getargsC{#1}%
  \whiledo{\theindex<\narg}{%
    \stepcounter{index}%
    \edef\nextword{\csname arg\romannumeral\theindex\endcsname}%
    \expandafter\getfirst\nextword\relax%
  }%
  \catcode`-=12 %
}
\def\getfirst#1#2\relax{#1}
\begin{document}
\firstletters{This is a test of the Emergency Broadcast System. This-Test. for sample. This T.}
- - -Dash restored
\end{document}

An identical approach can be used if you need to pick up the first letter following other punctuation, such as ( or [. For example:

\documentclass{article}
\usepackage{readarray}
\usepackage{ifthen}
\newcounter{index}\setcounter{index}{0}
\catcode`-=\active %
\def-{ }
\catcode`-=12 %
\catcode`(=\active %
\def({}
\catcode`(=12 %
\def\newpunct{%
  \catcode`-=\active %
  \catcode`(=\active %
}
\def\oldpunct{%
  \catcode`-=12 %
  \catcode`(=12 %
}
\def\firstletters{\newpunct\firstlettersX}
\def\firstlettersX#1{%
  \setcounter{index}{0}% reset so that repeated calls start at the first word
  \getargsC{#1}%
  \whiledo{\theindex<\narg}{%
    \stepcounter{index}%
    \edef\nextword{\csname arg\romannumeral\theindex\endcsname}%
    \expandafter\getfirst\nextword\relax\relax%
  }%
  \oldpunct%
}
\def\getfirst#1#2\relax{#1}
\begin{document}
\firstletters{This is a test of the Emergency Broadcast System (per se). 
    This-Test. for sample. This T.}
- - -Dash restored (and paren too)
\end{document}

  • Your methods appear not to pick up the possibility that the very first character of the string needn't be a letter, and they crash if the first character is a space. :-( – Mico Jul 08 '16 at 21:27
  • @Mico As to the leading-space issue, I just edited to add an extra \relax in the code, \getfirst\nextword\relax\relax, which fixes that issue. As to the first problem with non-alphabetic lead characters, that is the exact problem that is addressed by this solution. I have shown how to do it with dashes and a left paren; other characters can be added as well. – Steven B. Segletes Jul 10 '16 at 19:06