I’d like to use \DeclareUnicodeCharacter to map Unicode characters, specified by their (hexadecimal) code points, to alternative expressions or graphics that should replace them. For example:
\DeclareUnicodeCharacter{014F}{\u{o}}
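For context, a complete minimal document around that line might look like the following sketch. It assumes compilation with pdflatex (where inputenc applies); the fontenc line and the sample text are illustrative additions, not part of my actual setup.

```latex
% Minimal sketch; assumes compilation with pdflatex, not xelatex or lualatex.
\documentclass{article}
\usepackage[T1]{fontenc}    % assumption: T1 output encoding
\usepackage[utf8]{inputenc} % already the default in recent LaTeX releases; harmless
% Map the single code point U+014F (o with breve) to an accent command.
\DeclareUnicodeCharacter{014F}{\u{o}}
\begin{document}
Test: ŏ
\end{document}
```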
While this specific example works fine, a Unicode character can be not only a single code point but also a grapheme cluster, i.e. a sequence of multiple code points that forms a unit and a single visible character. Example:
U+006E U+0303 = ñ (sometimes, there are equivalents like U+00F1)
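To make the two representations concrete, here is a small sketch that writes both forms with TeX's ^^^^ notation so the code points stay visible in the source. It assumes xelatex (or lualatex) with the default font; whether the combining tilde is positioned well depends on the font.

```latex
% Sketch; assumes xelatex (or lualatex), which read UTF-8 natively.
\documentclass{article}
\begin{document}
% Decomposed: U+006E followed by the combining tilde U+0303 (two code points).
n^^^^0303

% Precomposed: the single code point U+00F1.
^^^^00f1
\end{document}
```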
It seems the command \DeclareUnicodeCharacter comes from the inputenc package and supports values between 0 and 10FFFF only, which is enough for single code points but apparently offers no way to address composed grapheme clusters. When using XeLaTeX, though, the implementation does not come from inputenc, right?
So with inputenc or with a “native” implementation, is there any way to map grapheme clusters instead of just single code points? For example:
\DeclareUnicodeCharacter{006E0303}{...}
# or
\DeclareUnicodeCharacter{006E,0303}{...}
Edit:
The use case is something like \DeclareUnicodeCharacter or \newunicodechar (perhaps without a complete extra package), but for units of multiple code points instead of just single code points, in order to create custom mappings.
It seems TECkit mappings, referenced via the Mapping attribute of fontspec, may provide exactly this functionality, including mapping multiple code points (Edit: but apparently only to “plain text”, not to commands). However, this approach is not elegant, is not contained in the same text/source file, and requires separate tooling.
There’s also \XeTeXinterchartoks, but this doesn’t really make definitions easy to write, especially for multiple individual grapheme clusters (as opposed to character blocks).




\DeclareUnicodeCharacter is not defined in xelatex, and inputenc is not usable with xelatex, both are for classic 8-bit TeX systems such as pdflatex. – David Carlisle Jan 06 '22 at 07:45
\def\ntilde{...} and/or do regex search/replace in your editor. – user202729 Jan 06 '22 at 10:10
\DeclareUnicodeCharacter would only lack support for pairs of code points for this use case (even though not in XeLaTeX), I thought there would be something similar that allows me to create custom mappings. – caw Jan 06 '22 at 14:13
ucharclasses, since that must be doing something similar, and it may be XeTeXinterchartoks. – caw Jan 06 '22 at 14:19
\DeclareUnicodeCharacter works eg \DeclareUnicodeCharacter{014F}{\u{o}} Please show what you are doing as that should give errors with xelatex that the command is undefined. That is, I stopped understanding the question at "While this specific example works fine" – David Carlisle Jan 06 '22 at 15:32