Continuing https://tex.stackexchange.com/a/441877/165772 using @moewe's file babalpha-fl-gs from https://gist.github.com/moewew/158481168f4a2135764f96fc608a1998, consider the following code:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[USenglish,french,main=ngerman]{babel}
\usepackage{csquotes}
\usepackage{babelbib}
\usepackage{hyperref}
\bibliographystyle{babalpha-fl-gs}%%% As @moewe pointed out, the same issue exists with babalpha-fl. But if you have to modify it, I'd appreciate it if you modify babalpha-fl-gs instead.
\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@string{acmp = {ACM Press}}
@string{aw = {Addison-Wesley}}
@string{ol = {Oldenbourg Wissenschaftsverlag}}
@string{pren = {Prentice Hall}}
@book{Bertrand-CalculDesProbabilites,
  title = {Calcul des probabilit{\'e}s},
  author = {Joseph Bertrand},
  year = 1889,
  publisher = {Gauthier-Villars},
  language = {french}}
@book{Bergstra_89,
  Author = {Jan Aldert Bergstra},
  Isbn = {0-201-41635-2},
  Language = {USenglish},
  Note = {Editors: J. Heering and P. Klint},
  Publisher = acmp # { and } # aw,
  Series = {ACM Press Frontier Series},
  Title = {Algebraic specification},
  Year = {1989}}
@book{Eckel_99,
  Author = {Bruce Eckel},
  Language = {USenglish},
  Publisher = pren,
  Title = {Thinking in {C++}},
  Year = 1999}
@book{Eckel_02,
  Author = {Bruce Eckel},
  Language = {USenglish},
  Publisher = pren,
  Title = {Thinking in {Java}},
  Year = {2002}}
@misc{BroyEtAl-ModellierungVerteilterSysteme,
  Author = {Manfred Broy},
  Language = {ngerman},
  Note = {Vorlesungsskript},
  Title = {{Modellierung} {verteilter} {Systeme}},
  Year = 2014}
@book{Brooks_87,
  Author = {Rodney Allen Brooks},
  Language = {ngerman},
  Publisher = ol,
  Title = {{LISP}: Programmieren in Common {L}isp},
  Year = 1987}
\end{filecontents}
\begin{document}
\cite{Bertrand-CalculDesProbabilites,Bergstra_89,Eckel_99,Eckel_02,Brooks_87,BroyEtAl-ModellierungVerteilterSysteme}
\bibliography{\jobname}
\end{document}

In the output of a standard pdflatex-bibtex-loop, two entries get the same abbreviation:

[Image: output of the pdflatex+bibtex loop]

How can I disambiguate the Ber89 entries while retaining the order of the other entries? (If at all possible, I would rather not switch from babalpha-fl to some other style or from bibtex to biber, since that would cause a range of other compatibility problems in a non-minimal example with tons of other packages and, most likely, changes in formatting that a publisher would have to agree to.)

  • Just to point out that the same issue exists with an unmodified version of babalpha-fl as well. (edit: Turns out alpha.bst has the same issue, ...) – moewe Feb 01 '19 at 20:52
  • This is quite tricky: What order would you like to see when you cite another work by Joseph Bertrand from 1890: [Ber89a], [Ber90], [Ber89b] (chronological) or [Ber89a], [Ber89b], [Ber90] (sorted only by the alphabetic label)? – moewe Feb 01 '19 at 21:07
  • I feared you would say that... – moewe Feb 01 '19 at 21:10
  • Alpha style is a relic of the past, don't use it. It used to be the rule when papers (or even books) were typewritten and the list of references could not be known in advance. With modern technology, you can use author-year style or numeric style. As you clearly see, a reader cannot distinguish between papers by Bertrand and Bergstra. If you like to be verbose, use author-year (a book, for instance); for a paper, numeric style is more than enough. – egreg Feb 09 '19 at 21:13

1 Answer

Original answer

This is quite an interesting edge case.

The alpha-based .bst files actually use slightly different labels for sorting and for citations. The sorting label sort.label is made up of the letter combination and the four-digit year, while the citation label label is made up of the letter combination and the last two digits of the year.

So both entries in the MWE have the label Ber89, but Bertrand-CalculDesProbabilites has sort.label ber1889 and Bergstra_89 has sort.label ber1989.

Interestingly, the extra.label information to avoid name clashes is calculated based on the sort.label, even though only the cite labels will be visible in the document. Because the sort.labels differ, no extra.label is added. (NB: This is true, but it is not the whole story. The sorting itself also plays a role. See the longer explanation below.)
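The difference between the two labels can be sketched in a few lines of Python. This is only an illustration of the description above, not the .bst code itself: the helper name make_labels is hypothetical, and lowercasing is used as a rough stand-in for sortify.

```python
def make_labels(letters, year):
    """Return (label, sort_label) as described above.

    Hypothetical helper: `letters` is the name part of the label,
    `year` is the year as a string.
    """
    label = letters + year[-2:]                 # visible label: last two digits
    sort_label = (letters + year[-4:]).lower()  # sorting label: four-digit year
    return label, sort_label


bertrand = make_labels("Ber", "1889")
bergstra = make_labels("Ber", "1989")
print(bertrand)  # ('Ber89', 'ber1889')
print(bergstra)  # ('Ber89', 'ber1989')
```

The visible labels coincide while the sorting labels differ, which is exactly the situation in which no extra.label is generated.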

A simple solution is to make sure label and sort.label are the same. That can be done by replacing the entire definition of FUNCTION {calc.label} with

FUNCTION {calc.label}
{ type$ "book" =
  type$ "inbook" =
  or
    'author.editor.key.label
    { type$ "proceedings" =
      'editor.key.organization.label
        { type$ "manual" =
            'author.key.organization.label
            'author.key.label
          if$
        }
      if$
    }
  if$
  year field.or.null purify$ #-1 #2 substring$
  *
  duplicate$
  'label :=
  sortify 'sort.label :=
}

The extended MWE

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[USenglish,french,main=ngerman]{babel}
\usepackage{csquotes}
\usepackage{babelbib}
\usepackage{hyperref}
\bibliographystyle{babalpha-fl-gs}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@book{Bertrand-CalculDesProbabilites,
  title     = {Calcul des probabilit{\'e}s},
  author    = {Joseph Bertrand},
  year      = 1889,
  publisher = {Gauthier-Villars},
  language  = {french}
}
@book{Bergstra_89,
  author    = {Jan Aldert Bergstra},
  isbn      = {0-201-41635-2},
  note      = {Editors: J. Heering and P. Klint},
  publisher = {ACM Press and Addison-Wesley},
  series    = {ACM Press Frontier Series},
  title     = {Algebraic specification},
  year      = {1989},
  language  = {USenglish},
}
@book{Bertrand-90,
  title     = {Calcul des probabilit{\'e}s II: Return of the Kolmogorov},
  author    = {Joseph Bertrand},
  year      = 1890,
  publisher = {Gauthier-Villars},
  language  = {french}
}
\end{filecontents}

\begin{document}
\cite{Bertrand-CalculDesProbabilites,Bergstra_89,Bertrand-90}
\bibliography{\jobname}
\end{document}

will then produce

[Ber89a] Jan Aldert Bergstra: Algebraic specification. ACM Press Frontier Series. ACM Press and Addison-Wesley, 1989, ISBN 0-201-41635-2. Editors: J. Heering and P. Klint.

[Ber89b] Joseph Bertrand: Calcul des probabilités. Gauthier-Villars, 1889.

[Ber90] Joseph Bertrand: Calcul des probabilités II : Return of the Kolmogorov. Gauthier-Villars, 1890.


About the edited question.

Sorting alphabetic bibliographies is tricky.

The straightforward sorting method of ordering by the alphabetic labels as seen in the document first and then adding additional information (let's say name, year, title in that order) to break a tie produces output that some may find objectionable when different centuries are involved:

[Elk02] Anne Elk: A Theory on Einiosauruses. 2002.

[Elk99] Anne Elk: A Theory on Brontosauruses. 1999.

The people who don't like this output argue that the chronological order should override the strictly "label"-based sorting. That's why alpha actually sorts using the alphabetic part of the label and four digits of the year instead of the two that are shown in the output.
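A small Python sketch of the two orderings for the Elk example follows. The helpers label and sort_label are hypothetical and only mimic the two-digit and four-digit variants described above, with lowercasing as a stand-in for sortify.

```python
entries = [
    ("Elk", 1999, "A Theory on Brontosauruses"),
    ("Elk", 2002, "A Theory on Einiosauruses"),
]


def label(letters, year):
    # Visible label: letters plus the last two digits of the year.
    return f"{letters}{year % 100:02d}"


def sort_label(letters, year):
    # Sorting label: letters plus the four-digit year, lowercased.
    return f"{letters}{year:04d}".lower()


by_label = sorted(entries, key=lambda e: label(e[0], e[1]))
by_sort_label = sorted(entries, key=lambda e: sort_label(e[0], e[1]))

print([label(l, y) for l, y, _ in by_label])       # ['Elk02', 'Elk99']
print([label(l, y) for l, y, _ in by_sort_label])  # ['Elk99', 'Elk02']
```

Sorting by the visible label puts 2002 before 1999; sorting by the four-digit variant restores chronological order.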

It would be interesting to hear what sort order would be desired for the following entries:

[Uth02] Alice Uthor: A Book. 2002.

[Uth03] Emma Uthrinson: B Book. 2003.

[Uth04] Alice Uthor: Another Book. 2004.

[Uth05] Emma Uthrinson: Bn Book. 2005.

If you normally argue that the whole year should beat the label to keep works of the same author in chronological order, you may be more inclined to keep works of the same author together and would have to accept that in those cases the author name should beat the label again.

In the end that could lead to a sorting scheme that gives precedence to full names and year over the actual citation label. While that would be more attractive to those who are very familiar with the references in the bibliography (i.e. the author of the bibliography), because works by the same author are kept together and in chronological order, it might be harder to navigate for the (as yet uninformed) reader, who only has the alphabetic labels to go on. I probably can't argue that there is a real risk that a reader would not manage to find the correct citation label when the sorting is more or less decoupled from the only bit of information she has (namely the label), but it is not inconceivable that she would have to spend a bit more time finding the right reference in a large bibliography with several works by authors with similar name abbreviations.

No matter which non-strictly-label-based sorting you go for, you always risk a situation where two labels of the same base form are separated by a different label:

[Ber89] Victoria Bergman: Title. 1889.

[Ber90] Victoria Bergman: Title. 1890.

[Ber89] Sophie Bergstra: Title. 1989.

The way alpha and other BibTeX styles based on alpha.bst assign the disambiguating extra labels is very susceptible to situations like this. It works as follows.

BibTeX iterates over the list of sorted entries (where sorted means sorted as the entries would appear in the bibliography). At each entry it checks whether the base sort label (e.g. ber1889) is the same as the previous base sort label. If that is the case, a counter is incremented and an extra.label letter is added. (BibTeX then has to do a reverse pass to add 'a' to the first entry of each group with the same base label. It cannot do that in the first pass, because when the first item with a particular label is processed it is not yet clear whether it will remain the only item with that base label.)

Note that this happens in a very simple loop where only the previous label is remembered. There is no list of all previous labels.

Therefore this method breaks down when the same base labels are passed to BibTeX with a different label in between. In Ber89, Ber90, Ber89 the previous and current label always differ (even if we only look at the label and not the sort label), hence no extra label is generated.
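The two passes can be mimicked in Python to see the failure. This is a simplified sketch of the loop described above, not a translation of the .bst file: the function name add_extra_labels is hypothetical, and it works on a plain list of base labels that is assumed to be in bibliography order already.

```python
def add_extra_labels(base_labels):
    """Mimic the forward/reverse pass over an already sorted list of base labels."""
    extras = [""] * len(base_labels)

    # Forward pass: only the previous base label is remembered.
    last_label, last_num = None, ord("a")
    for i, base in enumerate(base_labels):
        if base == last_label:
            last_num += 1
            extras[i] = chr(last_num)  # "b", "c", ...
        else:
            last_num = ord("a")
            last_label = base

    # Reverse pass: an entry directly followed by a "b" is the first of its group.
    next_extra = ""
    for i in reversed(range(len(base_labels))):
        if next_extra == "b":
            extras[i] = "a"
        next_extra = extras[i]

    return [base + extra for base, extra in zip(base_labels, extras)]


print(add_extra_labels(["Ber89", "Ber89", "Ber90"]))  # ['Ber89a', 'Ber89b', 'Ber90']
print(add_extra_labels(["Ber89", "Ber90", "Ber89"]))  # ['Ber89', 'Ber90', 'Ber89']
```

As soon as a different label sits between two equal ones, the comparison with the previous entry never succeeds and both entries keep the bare label.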

We can try to work around this problem by decoupling the extra label iteration from the actual sorting for the bibliography. First we sort the items by their visible label to generate a list where the same labels follow each other. Based on that list we generate the extra.label. Then we sort the entries for the bibliography using the sort labels.
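The idea of the two decoupled sorts can be sketched in Python as follows. The data and the helpers label and sort_key are hypothetical stand-ins for illustration; in particular, the real sort key in the style also contains names and title, and the letters are assigned by the forward/reverse pass rather than by grouping.

```python
from itertools import groupby

entries = [  # (letters, year, author)
    ("Ber", 1889, "Joseph Bertrand"),
    ("Ber", 1890, "Joseph Bertrand"),
    ("Ber", 1989, "Jan Aldert Bergstra"),
]


def label(e):
    # Visible label: letters plus two-digit year.
    return f"{e[0]}{e[1] % 100:02d}"


def sort_key(e):
    # Simplified stand-in for the bibliography sort key (four-digit year).
    return f"{e[0]}{e[1]:04d}".lower()


# Stage 1: sort by visible label (ties broken by the real sort key),
# so that entries with equal labels end up next to each other.
stage1 = sorted(entries, key=lambda e: (label(e), sort_key(e)))

# Assign a, b, c, ... within each run of equal visible labels.
final_label = {}
for base, group in groupby(stage1, key=label):
    run = list(group)
    for k, e in enumerate(run):
        suffix = chr(ord("a") + k) if len(run) > 1 else ""
        final_label[e] = base + suffix

# Stage 2: sort again by the real sort key for the bibliography itself.
bibliography = sorted(entries, key=sort_key)
print([final_label[e] for e in bibliography])  # ['Ber89a', 'Ber90', 'Ber89b']
```

The disambiguation letters are fixed in the first stage, so the second sort can reorder the entries chronologically without losing them.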

The diff (against the original babalpha-fl-gs.bst from https://tex.stackexchange.com/a/441877/35864 and https://gist.github.com/moewew/158481168f4a2135764f96fc608a1998) for the required changes is

--- babalpha-fl-gs.bst  2019-02-02 13:38:29.856655800 +0100
+++ babalpha-fl-gs-sort.bst 2019-02-02 13:40:07.936905700 +0100
@@ -1,3 +1,11 @@
+%%%%%% `babalpha-fl-gs-sort.bst'
+%%%%%% babalpha-fl-gs with tweaked sorting
+%%%%%% for https://tex.stackexchange.com/q/472951/35864
+%%%%%% 2019-02-02 MW
+%%%%%% available at
+%%%%%% https://gist.github.com/moewew/6a59fc23db6d2ab219b6f189a3645a06
+%%%%%% header of `babalpha-fl-gs.bst' follows
+%%%%%%
 %%%% `babalpha-fl-gs.bst'
 %%%% a copy of `babalpha-fl.bst' that automatically tries to suppress
 %%%% historically problematic abbreviations
@@ -59,7 +67,7 @@
     year
   }
   {}
-  { label extra.label sort.label }
+  { label extra.label sort.label real.sortkey }

 INTEGERS
   { output.state
@@ -1498,7 +1506,9 @@
   if$
 }

-FUNCTION {presort}
+% label generation: extra label
+
+FUNCTION {calc.real.sortkey}
 {
   calc.label
   sort.label
@@ -1529,10 +1539,22 @@
   sort.format.title
   *
   #1 entry.max$ substring$
+  'real.sortkey :=
+}
+
+FUNCTION {labelgenpresort}
+{
+  calc.real.sortkey
+  label
+  "    "
+  *
+  real.sortkey
+  *
+  #1 entry.max$ substring$
   'sort.key$ :=
 }

-ITERATE {presort}
+ITERATE {labelgenpresort}

 SORT

@@ -1549,13 +1571,13 @@
 }

 FUNCTION {forward.pass}
-{ last.sort.label sort.label =
+{ last.sort.label label =
     { last.extra.num #1 + 'last.extra.num :=
       last.extra.num int.to.chr$ 'extra.label :=
     }
     { "a" chr.to.int$ 'last.extra.num :=
       "" 'extra.label :=
-      sort.label 'last.sort.label :=
+      label 'last.sort.label :=
     }
   if$
 }
@@ -1580,6 +1602,18 @@
 ITERATE {forward.pass}
 REVERSE {reverse.pass}

+% actual sorting
+
+FUNCTION {presort}
+{
+  real.sortkey
+  'sort.key$ :=
+}
+
+ITERATE {presort}
+
+SORT
+
 FUNCTION {begin.bib}
 {
   et.al.char.used

The new file babalpha-fl-gs-sort.bst can be found at https://gist.github.com/moewew/6a59fc23db6d2ab219b6f189a3645a06

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[USenglish,french,main=ngerman]{babel}
\usepackage{csquotes}
\usepackage{babelbib}
\usepackage{hyperref}
\bibliographystyle{babalpha-fl-gs-sort}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@string{acmp = {ACM Press}}
@string{aw = {Addison-Wesley}}
@string{ol = {Oldenbourg Wissenschaftsverlag}}
@string{pren = {Prentice Hall}}
@book{Bertrand-CalculDesProbabilites,
  title     = {Calcul des probabilit{\'e}s},
  author    = {Joseph Bertrand},
  year      = 1889,
  publisher = {Gauthier-Villars},
  language  = {french},
}
@book{Bergstra_89,
  author    = {Jan Aldert Bergstra},
  isbn      = {0-201-41635-2},
  language  = {USenglish},
  note      = {Editors: J. Heering and P. Klint},
  publisher = acmp # { and } # aw,
  series    = {ACM Press Frontier Series},
  title     = {Algebraic specification},
  year      = {1989},
}
@book{Eckel_99,
  author    = {Bruce Eckel},
  language  = {USenglish},
  publisher = pren,
  title     = {Thinking in {C++}},
  year      = 1999,
}
@book{Eckel_02,
  author    = {Bruce Eckel},
  language  = {USenglish},
  publisher = pren,
  title     = {Thinking in {Java}},
  year      = {2002},
}
@misc{BroyEtAl-ModellierungVerteilterSysteme,
  author   = {Manfred Broy},
  language = {ngerman},
  note     = {Vorlesungsskript},
  title    = {Modellierung verteilter Systeme},
  year     = 2014,
}
@book{Brooks_87,
  author    = {Rodney Allen Brooks},
  language  = {ngerman},
  publisher = ol,
  title     = {{LISP}: Programmieren in Common {Lisp}},
  year      = 1987,
}
@book{Bertrand-90,
  title     = {Calcul des probabilit{\'e}s II: Return of the Kolmogorov},
  author    = {Joseph Bertrand},
  year      = 1890,
  publisher = {Gauthier-Villars},
  language  = {french},
}
\end{filecontents}

\begin{document}
\nocite{*}
\cite{Bertrand-CalculDesProbabilites,Bergstra_89,Bertrand-90}
\bibliography{\jobname}
\end{document}

produces

[Ber89a] Joseph Bertrand: Calcul des probabilités. Gauthier-Villars, 1889.

[Ber90] Joseph Bertrand: Calcul des probabilités II : Return of the Kolmogorov. Gauthier-Villars, 1890.

[Ber89b] Jan Aldert Bergstra: Algebraic specification. ACM Press Frontier Series. ACM Press and Addison-Wesley, 1989, ISBN 0-201-41635-2. Editors: J. Heering and P. Klint.

[Bro87] Rodney Allen Brooks: LISP: Programmieren in Common Lisp. Oldenbourg Wissenschaftsverlag, 1987.

[Bro14] Manfred Broy: Modellierung verteilter Systeme, 2014. Vorlesungsskript.

[Eck99] Bruce Eckel: Thinking in C++. Prentice Hall, 1999.

[Eck02] Bruce Eckel: Thinking in Java. Prentice Hall, 2002.

moewe
  • +1 So this seems to be a very old bug, right? – Dr. Manuel Kuehner Feb 01 '19 at 21:30
  • @Dr.ManuelKuehner If you want to call it a bug (and I would be tempted to say this is more than an undocumented feature), then yes, it is quite old (alpha.bst has had the relevant code since at least 1988). But since issues like this occur only in very specific situations (same author/name label part, same last two digits of the year, but different century) it probably did not come up very often. – moewe Feb 01 '19 at 21:42
  • @user49915 Well, it depends on your definition of 'wrong'. That's why I asked if you could waive the requirement for chronological sorting. I'm not sure if there is a way here to have your cake and eat it too. I'll try to have a look at that tomorrow, but I wouldn't be too hopeful. – moewe Feb 01 '19 at 22:00
  • @user49915 It should be possible to modify the style to sort by author and year, basically ignoring the label, to get the Uthor/Uthrinson order you would like to see, so if you are really interested in that you can ask a new question about that or look into it yourself. In the final babalpha-fl-gs-sort.bst you probably only need to change calc.real.sortkey a bit. Implementation-wise, the style leaves the original sorting scheme intact and adds a decoupled extra run for the disambiguation letters; that means it should be safe and should only change the label disambiguation for the better ... – moewe Feb 04 '19 at 09:19
  • ... (i.e. fix the issue the question was about). The idea of adding additional letters to make the abbreviations unique again is interesting, but I imagine it would be tricky to implement in BibTeX, where a list of all labels is not readily available, so a serious amount of backtracking would be required. Again, if you think it is worthwhile you can ask a new question. I'm quite certain that I would not find an adequate solution, but I would have a look nonetheless. In extreme cases you might even end up with the full name of the author and a year: – moewe Feb 04 '19 at 09:22
  • The possibly variable length would somehow negatively impact the advantages of alphabetic citations, namely their compactness. (Given that and the issue of sorting alphabetic styles, I myself am leaning towards using numeric or author-year styles. Edge cases and some conceptual things are a bit meh with alphabetic.) – moewe Feb 04 '19 at 09:23
  • I will not upload babalpha-fl-gs-sort.bst to CTAN. Firstly, because I have come to the conclusion that I don't believe in the idea of the "gs" bit of the style. Secondly, because uploading packages implies responsibility for maintenance that I'm not happy to take for this project. Thirdly because I believe that the general interest would be quite low (there is already enough stuff on CTAN). But if you believe it would make sense to put it on CTAN and are ready to maintain the style feel free to do so. ... – moewe Feb 04 '19 at 09:28
  • ... I explicitly waive any rights under the stack exchange license agreement that would prevent you from publishing my code under LPPL 1.3c. To be precise I relicense my contributions to babalpha-fl-gs-sort.bst and babalpha-fl-gs.bst discussed and linked above under the LPPL 1.3c. It is my understanding that together with the code not written by me, which is also licensed under the LPPL, the entire file can be distributed under the terms of the LPPL. – moewe Feb 04 '19 at 09:31