
There is a question here that describes exactly the problem I am facing, but for a different language. Unfortunately, it was closed for lack of clarity and of a reproducible MWE. This is an attempt to reformulate that question with the unnecessary details stripped out.

Assume that in a script X, the correct sequence of letters is {a,c,b,ab}. Notice that ab is supposed to be treated as a single letter (I know this doesn't make sense in English, but assume it for the sake of the example).

Now consider the following example:

\documentclass{article}
\usepackage{imakeidx}
\makeindex[title={Index},name={foo}]

\begin{document}
abcd\index[foo]{abcd}.

acbd\index[foo]{acbd}.

bcad\index[foo]{bcad}.

bbcd\index[foo]{bbcd}.

\printindex[foo]
\end{document}

My terms are (obviously) sorted in the following order:

  1. abcd
  2. acbd
  3. bbcd
  4. bcad

but given the natural order of my script, the sorting I want is:

  1. acbd
  2. bcad
  3. bbcd
  4. abcd

Is there any way to write custom rules to get this order? Note that rules are essential here: I am not working with a finite list or a handful of words, so I need to automate the sorting for a long list of words.

PS: Sticking to imakeidx is not a necessity; I am open to other packages. The script I am working with needs XeLaTeX or LuaLaTeX anyway, so Lua-based solutions are also welcome. Basically, any approach that works is acceptable.

Niranjan
  • You can use Xindy for collating in languages other than English. – egreg Jan 20 '23 at 07:55
  • @egreg I tried it, but Xindy also fails to understand ab as a single character. – Niranjan Jan 20 '23 at 09:37
  • I wonder what index does with o\" (also a single letter)? – John Kormylo Jan 20 '23 at 17:48
  • Ch is a single character in Czech. I assume that configurable tools can accept arbitrary "single characters" for different languages. For example, OpTeX can set this differently for various languages when it sorts a list alphabetically (see section 2.33 in the OpTeX doc and the \_compoundchars macro). – wipet Jan 22 '23 at 17:36

1 Answer


If I understand your question correctly, you want to sort by a rule in which compound characters can be declared. Your example can be configured in OpTeX as follows:

\ii abcd
\ii acbd
\ii bcad
\ii bbcd

\_def\_sortingdataTEST {a c b ^^A}  % language TEST, order: a c b ^^A
\_def\_compoundcharsTEST {ab:^^A}   % language TEST, compound char ^^A=ab
\_def\_sortinglang {TEST}           % sorting by language TEST

\makeindex

\bye

Run OpTeX twice: the first run creates the index data, and the second run uses it. You don't need to run any external software; sorting is done at macro level inside OpTeX.
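For a document that has to stay with LaTeX and imakeidx, the same ordering can be approximated with makeindex's `key@entry` syntax, where the text before `@` is used only for sorting and the text after it is printed. This is a minimal sketch, not the OpTeX mechanism above. The sort keys use an assumed encoding (a→a, c→b, b→c, ab→d, and d→e, since d has no stated position in the script) and are written by hand here. For a long word list they would have to be generated by a script or macro that applies the same mapping and matches the compound letter ab before the single letters.

```latex
\documentclass{article}
\usepackage{imakeidx}
\makeindex[title={Index},name={foo}]

\begin{document}
% Sort keys use an assumed encoding: a->a, c->b, b->c, ab->d, d->e.
% makeindex sorts on the text before @ and prints the text after it.
abcd\index[foo]{dbe@abcd}.   % letters: ab, c, d

acbd\index[foo]{abce@acbd}.  % letters: a, c, b, d

bcad\index[foo]{cbae@bcad}.  % letters: b, c, a, d

bbcd\index[foo]{ccbe@bbcd}.  % letters: b, b, c, d

\printindex[foo]
\end{document}
```

Because the keys compare as abce < cbae < ccbe < dbe, the index should list the entries as acbd, bcad, bbcd, abcd.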

wipet
  • Thanks a lot for the answer. Unfortunately, I won't be able to switch my LaTeX document to OpTeX for now, but as a resource this will certainly be helpful to others. – Niranjan Jan 23 '23 at 01:42