What is the ideological difference between CSL and bibtex/biblatex? I know there has been discussion here on how one can use CSL with LaTeX, but how (and why) they differ, and why several methods of styling citations/bibliographies should exist, is not obvious at all. Is it that bib(la)tex is aimed at citing with LaTeX, while CSL is aimed at citing with office suites like MS Word/LibreOffice? Which is more universal (and in what sense)? Why does CSL seem to be more popular among software like Zotero and Mendeley (is writing styles simpler than in biblatex)? Do other standardized systems for formatting citations/references exist, apart from CSL and bib(la)tex?

I have tried to tackle these questions myself. Frankly, I found the CSL specification not very illuminating (to me, biblatex is documented more rigorously). In fact, CSL does not even specify which metadata is required for which document type (well, in biblatex the required/optional division has also become somewhat non-strict and style-dependent, but it's still there). Even stranger, the type definitions in CSL are not explained at all! E.g., what is the difference between plain article and the other article types (article-journal, article-magazine and article-newspaper)? Or the difference between post and post-weblog? Nowhere could I find the answers.

The particularly bizarre thing for me is the following. Both biblatex and CSL have the same ultimate goal: styling citations and bibliographies. Yet the overlap between CSL variables and biblatex data fields is only about 60-70%. How can they do the same thing with different data? For example, biblatex has no analogs of these CSL variables: archive_location, archive-place, call-number, medium, reviewed-author, illustrator, jurisdiction,... At the same time, CSL has no analogs of these biblatex fields: subtitle, titleaddon, eprint, reprinttitle, origlanguage,...

Sorry that the questions are not strictly TeX-related, but where else can I hope to get an answer to them?

tkp
Maximko
  • Though your question seems fairly clear (even if broad), may I ask if you have an agenda in this question (in the sense that you want to use this for solving a particular problem / making a particular choice). I ask because, even though one might compare biblatex and CSL in this sense, one does not really usually have a choice between them, given any particular setup for producing your overall text. So, in LaTeX, either BibTeX or biblatex. In M$Word or LibreOffice some of the major players in bibliography management provide CSL based structures. – gusbrs Jun 04 '18 at 19:04
  • The context for my question is that I'm developing a desktop app to manage bibliographic items. I found myself unhappy with the analogous software I happened to use (which exactly and why is a different topic :) ). So I want to make an app with the prime focus on accessibility to pdf-files (because I often have to read many papers simultaneously), but I would also like to implement the functionality of a citation manager, which, ideally, would support both bib(la)tex and CSL equally well. Hence, the question :) – Maximko Jun 05 '18 at 03:13
  • So for myself, there is actually no choice - I've been using bibtex for years (now switched to biblatex). But I thought native support for CSL would also be nice. – Maximko Jun 05 '18 at 03:15
  • see https://github.com/fiduswriter/biblatex-csl-converter for a converter between CSL and BibLaTeX databases. – michal.h21 Jun 05 '18 at 09:00
  • I think the special case is APA (American Psychological Association) style, which is IMHO more complete in biblatex: https://ctan.org/pkg/biblatex-apa – koppor Jun 05 '18 at 11:36
  • @Maximko Please check the ultimate comparison of reference management software and the Wikipedia comparison for related work. - In JabRef we are working hard towards new features. Currently, we are working on full text search. Full biblatex support included. You are really invited to join the team to make JabRef even better. – koppor Jun 05 '18 at 11:40
  • @Maximko Ah, I see! Development is one situation where the choice between them would exist. ;) I'll be glad to see your new app around, when it comes. – gusbrs Jun 05 '18 at 11:59

3 Answers

Update

People who are interested in using CSL with LaTeX may want to have a look at https://www.ctan.org/pkg/citation-style-language. See for example https://tex.stackexchange.com/a/618815/35864.
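
For orientation, a minimal document using that package might look like the sketch below. It assumes a reasonably recent TeX distribution that ships citation-style-language together with its bundled apa style; the file name csl-demo.bib and the key knuth1984 are made up for this example. With LuaLaTeX the citations should be processed during the run; with other engines the package's external citeproc-lua program has to be run between two LaTeX passes, much like BibTeX or Biber.

```latex
\documentclass{article}

% Write a small .bib file next to the document (file name is an assumption).
\begin{filecontents}{csl-demo.bib}
@book{knuth1984,
  author    = {Donald E. Knuth},
  title     = {The {\TeX}book},
  publisher = {Addison-Wesley},
  address   = {Reading, MA},
  year      = {1984},
}
\end{filecontents}

\usepackage{citation-style-language}
\cslsetup{style = apa}          % select a CSL style by name
\addbibresource{csl-demo.bib}   % same command name as in biblatex

\begin{document}
See \cite{knuth1984}.

\printbibliography
\end{document}
```

The user-level commands deliberately resemble biblatex's, so switching an existing document over is mostly a matter of changing the preamble.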


This is probably not going to be a full answer, but maybe it can shed some light on the whole issue of CSL and biblatex.

History and introduction

biblatex development started back in 2006 with big milestones in 2008 and version 1.0 – the first official release – in 2010. According to Wikipedia, CSL came about at roughly the same time if maybe a little earlier (there are pages dating back to 2004 that are associated with it, Zotero supported CSL upon release in 2006).

biblatex basically is a reimplementation and reinvention of the BibTeX way of creating bibliographies that had been with the TeX world since the late 1980s. With BibTeX, the .bst files, written in their own reverse Polish notation language, determine the output of the bibliography. BibTeX compiles all the information from the .bib file and writes it to the .bbl in the expected format; LaTeX then simply imports the .bbl and typesets its contents. With biblatex, on the other hand, the formatting is done on the LaTeX side. That means that bibliography styles and citation commands can be written in TeX, making them more accessible to those who don't want to learn the reverse Polish notation of BibTeX. The backend (BibTeX or Biber, the latter is needed to enjoy the full set of biblatex features) is reduced to extracting the relevant entries from the .bib file, sorting them and parsing them into a usable data structure for LaTeX.

All this means that biblatex is still intricately tied to the LaTeX world and that it is probably not of much use outside of it. biblatex's standard data model is an extension of the 'usual BibTeX data model' (i.e. those types and fields supported by the BibTeX styles). Even before biblatex, a not insignificant number of functions of some bibliography packages were implemented on the LaTeX side, and even more styles allowed modifications to be applied via LaTeX macros (think natbib, jurabib, ...), but biblatex offered a unified LaTeX interface for all levels of the style.
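
To make "formatting is done on the LaTeX side" concrete, here is a minimal sketch, assuming a standard biblatex installation with Biber as backend. The file name demo.bib and the entry key are made up for the example. The single \DeclareFieldFormat line changes how book titles are printed without touching the backend or the .bib data.

```latex
\documentclass{article}

% Write a small .bib file next to the document (file name is an assumption).
\begin{filecontents}{demo.bib}
@book{knuth1984,
  author    = {Donald E. Knuth},
  title     = {The {\TeX}book},
  publisher = {Addison-Wesley},
  location  = {Reading, MA},
  date      = {1984},
}
\end{filecontents}

\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{demo.bib}

% Style change written in plain LaTeX: book titles in small caps
% instead of the default italics. Biber only delivers the data.
\DeclareFieldFormat[book]{title}{\textsc{#1}}

\begin{document}
\textcite{knuth1984} describes the system.

\printbibliography
\end{document}
```

Compile with LaTeX, then Biber, then LaTeX again. The equivalent change in a BibTeX workflow would mean editing a .bst file in its stack-based language.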

CSL is an XML-based language for citation and bibliography styles. It aims to be a universal language that can be used by all kinds of reference managers and word processors. From what I could see, CSL has its own version of most – if not all – features of biblatex (I assume it has some features that biblatex does not have and that it can solve some things in a nicer way; I'm especially thinking about citation vs. bibliography sorting). The great advantage of CSL over biblatex is its community and reach: many people seem to be using CSL and there is a vast number of styles available and ready to use. There are a number of biblatex styles available, but in the end people often end up having to do some modifications to their styles themselves. I assume this happens rarely with CSL styles.

The ideological difference

I don't actually think that there is a big ideological difference other than biblatex is basically LaTeX-only and CSL is supposed to be a universal standard.

Why should there exist different methods?

I assume things like this just happen. Furthermore I'd venture the guess that neither project was initially aware of the other.

BibTeX had been around for a while when biblatex was conceived, but it had been a bit complicated even for experienced TeX users because it used a different language altogether. biblatex is still backwards compatible with BibTeX and so the transition to biblatex was made easy for people. biblatex and BibTeX share the same .bib file base that has been around for years, people have become used to it, many programmes and websites can export to .bib ... there is a .bib ecosystem – and once one gets the hang of it, .bib files are quite easy to use.

In the TeX world there is more to bibliographies than just biblatex: Why do amsrefs and biblatex co-exist? Why are there natbib and jurabib, ...?

CSL on the other hand was never only aimed at TeX and so it didn't really need TeX to get off the ground. Since TeX already had a working system for citations and bibliographies there never was a real need for a CSL implementation. I assume these two facts contributed to the result that we see today: No one has implemented CSL for LaTeX yet – and those that consider it would seriously have to ponder if the work spent on such a project is worth it.

Why is CSL popular among reference managers?

Because it is a well-developed standard (at least it seems to be to me) based on the ubiquitous XML that people are familiar with and can actually understand. biblatex on the other hand is written in LaTeX and configured in LaTeX, which means that if a piece of software wanted to use it, it would have to parse LaTeX (in one way or another, not necessarily all of it, but XML seems easier).

I have never written a CSL style, but I would say that writing a CSL style looks easier than writing a biblatex style, especially if you are not familiar with LaTeX. The structure of the standard biblatex styles and most contributed styles (following the examples set by the standard styles) exhibit some idiosyncrasies that one might not necessarily follow if one were to write a biblatex style from scratch in a similar fashion to CSL, so the comparison might be a bit unfair: apa.csl weighs in at around 720 lines, while apa.bbx and apa.cbx (the main part of biblatex-apa) add up to around 2800 lines of code.

Small overlap

On paper the overlap between CSL and biblatex may indeed only be 60-70% (my feeling would be that the overlap is higher, but I haven't checked), but the differences are in field and entry types that are rarely used and not that common. If we add weights proportional to usage I'd say we end up at a level of overlap close to 90%. For the most part your average bibliography (a journal article with DOI, a book, a paper in a collection, a simple website ...) could probably be transferred from one system to the other without loss of information.

As to why the nominal overlap is smaller than one might expect for similar systems, that is probably just a matter of culture and who implemented it. I think it is fair to say that the development of BibTeX and biblatex was carried out by essentially one developer with input from the TeX community (mostly, but certainly not exclusively drawn up from STEM-related fields; biblatex is hugely influenced by the law-focussed jurabib project). CSL development seems to have been driven by one or two people in the beginning as well, but their input probably came not only from TeX, but from a more diverse or at least different group of people (Open/Libre Office, Zotero). Many of the fields from CSL that you list as missing in biblatex look like something a librarian might immediately think of. The fields listed as missing in CSL but present in biblatex are a more inhomogeneous bunch and it seems that they address specific issues that might be considered to be of minor importance by some (subtitle makes for shorter author-title citations where the subtitle can usually be dropped, titleaddon for additional notes, eprint for repositories such as the arXiv).

moewe
  • Awesome! Thanks. The only bit left unclear to me is why the CSL specification is rather sloppy about the definitions of entry types and variables, given the wide-spread usage of CSL. – Maximko Jun 05 '18 at 07:58
  • @Maximko I wondered about that as well. I could not find any specification about what the entry types mean, but I guess common sense helps for the often used ones. But I'm lost when it comes to post and post-weblog, that is probably a question for https://github.com/citation-style-language/documentation/issues. BTW: You may want to hold off accepting this answer, maybe someone else with a better perspective on CSL comes along and weighs in... – moewe Jun 05 '18 at 08:12
  • @Maximko A comment on the entrytypes and fields. My view is that, for any system, entrytypes and fields can be arbitrary, as long as they are defined in your style/system. You will find, e.g. biblatex-chicago with entrytype article but entrysubtype newspaper or magazine. I myself have at a customized data model archcollection and archdoclocus, which seem very much like CSL's archive_location, archive-place. I'd say this terminology is largely standardized, but is not so on several "fringe cases". But of course, the differences complicate the portability of the data. – gusbrs Jun 05 '18 at 11:55
  • @moewe I'm aware that biblatex is intimately tied to LaTeX, but would it, in principle, be possible to make it communicate with any other app? (I don't mean currently, but whether this would be doable). Or are we here really in the traditional "Only TeX can parse TeX"? – gusbrs Jun 05 '18 at 12:16
  • @gusbrs Some parts of the data model are indeed in flux if you will, but I consider this a bit unfortunate. It is also a result of the user-definable data model where people can add fields and types of their own. (Which also means that the biblatex core does not include and formalise every type or field in use.) I'd have thought that CSL with its centralised style database, rigid specification and easily switched styles might have a chance to and would really benefit from nailing down the meanings of types and fields. – moewe Jun 05 '18 at 13:17
  • @moewe I'm not sure how unfortunate it is. There is here a tradeoff between flexibility and standardization I think. Also, given the independence of the different systems, I'm surprised that so much of the specification overlap (no doubt due to considerations of data portability). As to CSL, I've played with it some time ago (not deep though) and, if I remember correctly, one could edit the .csl file and add there an entrytype with its corresponding "driver". – gusbrs Jun 05 '18 at 13:26
  • @gusbrs What do you have in mind when you want biblatex to communicate with another app? I thought about mentioning the "only TeX can parse TeX", but decided not to mention it in the end. biblatex often uses only a subset of LaTeX and depending on what you want to do, you might get away with parsing only biblatex-level commands like \printfield, \usebibmacro, \newbibmacro, \DeclareFieldFormat and friends directly. But all that is more or less about other programmes parsing .bbx and .cbx files. – moewe Jun 05 '18 at 13:27
  • @moewe I mean if it would somehow be possible to call biblatex as a "client" by an arbitrary program, supply it with a \cite{key} or \printbibliography[...] (and due data context) and receive the result in formatted text (as opposed to typeset text). (I know, strange idea, but would solve so many problems...) – gusbrs Jun 05 '18 at 13:34
  • @gusbrs Mhhh, I guess in that case one would have to re-implement biblatex in a way that it outputs the results in a usable format. Due context could also mean the entire text as it should appear (biblatex has tests to check if things are on the same page, etc. etc.), so that could be a bit tricky. My money would have been on implementing the general functions of biblatex and parsing .bbx and .cbx files... – moewe Jun 05 '18 at 13:47
  • Tricky as I feared. Thanks for sharing your thoughts on the matter. – gusbrs Jun 05 '18 at 13:54
  • @moewe Not trying to lighten up an old thread, just for the record. This old answer https://tex.stackexchange.com/a/91608/105447 suggests that BibDesk sort of tries to use BibTeX as a "client" and is capable of dealing with biblatex styles previews (BibTeX backended). The idea is simple but, as far as I can tell (I haven't yet looked into details), seems promising. – gusbrs Nov 29 '18 at 11:03
  • @gusbrs I assume they just normally produce a PDF with LaTeX and BibTeX (and convert that to text or an image?). JabRef on the other hand seems to offer CSL-based style previews. – moewe Nov 29 '18 at 11:14
  • @moewe Yes, that's what I understand it does. Of course, as it does with BibTeX, there is really no reason why it could not be done with biber. Extending the idea, figure the following setting. I have a light markup file, let's say markdown, with biblatex commands. I feed this into some script which strips those commands, prepares a dummy LaTeX file with appropriate biblatex options, runs it, converts the result with tex4ht and replaces the biblatex commands in the original markdown file. Too wild a scenario? – gusbrs Nov 29 '18 at 11:23
  • @gusbrs Sure, that sounds about feasible, with the usual caveat that one may have to expect some rough edges at each conversion step. The question is why one would want to go through such great lengths to use biblatex. I don't happen to think that biblatex is particularly user-friendly (especially at first contact), so if you are into biblatex already, you are probably quite into LaTeX in general. And if you are not, then CSL might be much easier. (Of course there are situations where even a LaTeX fan might want or be forced to use something else.) – moewe Nov 29 '18 at 12:09
  • @moewe It does sound feasible, but yes, rough edges are bound to abound. This is a long-standing issue for me, for which I'm yet to find a really good solution. As to reasons, I can think of at least two (my own ones). For one, and I've already told you that before, I can't think of any other system/tool besides biblatex which would fulfill my citation/reference requirements on a comparable basis. Second, yes I'm into LaTeX and I prefer to use it, but I frequently have to meet the requirement of a .docx or similar for submission. It is sometimes also convenient for sharing with non-LaTeX folks. – gusbrs Nov 29 '18 at 12:18
  • @moewe Again, just for the record. Joseph Wright's comment at GitHub: "Better long-term would be I guess to have a way of redirecting bibmacro output to the log." will leave me eagerly waiting for this test suite for spurious reasons. :) That would be pretty close to what I had in mind to start with. – gusbrs Dec 16 '18 at 13:38
  • @gusbrs Well, don't hold your breath. As I understand it, the current system is fundamentally unexpandable in some places and so writing to the .log or producing 'usable LaTeX code with only minimal markup' (similar to BibTeX's .bbl) are very far off at the moment. – moewe Dec 16 '18 at 13:49

I'm the creator of CSL, which I originally created for very simple reasons:

  1. While I used LaTeX, I work in a field which requires Word-compatible files; LaTeX not accepted.
  2. I saw no reason why such styling needed to be tightly-coupled to the output format. So my prototype XSLT implementation had output drivers for LaTeX (bypassing BibTeX), HTML, and RTF, IIRC. I found that worked pretty well and so was a proof-of-concept (I used it to format my first book), and has since remained a priority.

One additional advantage of that output flexibility is a much larger potential user base for CSL styles, which in turn can lead to many more supported styles.

I have a lot of respect for biblatex though; it's a great solution if you can work exclusively in LaTeX.

But you probably want to use CSL if you need non-LaTeX output targets.

  • just one addition regarding BibLaTeX and output formats: TeX4ht can convert documents that use BibLaTeX to HTML and ODT. – michal.h21 May 03 '22 at 20:28

Not an answer, more a musing on feasibility.

Given biblatex's current abilities, a CSL-Latex "style" certainly looks possible.

Further, using multiple citation styles in the one document is also possible with current biblatex, so probably the CSL-equivalent of that could be called something like CSLL, for the meta-language.

Biblatex is doing so many things, the concept of "axes" might be useful. CSL would involve the style axis (and perhaps the item-sequence axis).


Using legal citation as an example, legal citation is basically an author-title system, where the author is understood (a court, a parliament, a regulatory authority), and the title is the name of the case, statute or regulation.

The level of granularity required for a bibentry is driven by the lowest-level specification across all legal citation styles. For example, one style will require an upright v ("versus") followed by an abbreviation dot (v., eg Modern Law Review) where another style specifies dotless italic v (v, eg McGill in Canada, with c, "contre", in Canadian French). So the granularity requires the party separator (if there is more than one party) to be a field in its own right.

So here there are already four style "atoms": (i) whether or not there is a party separator abbreviation in the case name; (ii) whether it is v or c or something else; (iii) whether it is italic or not; (iv) whether it is dotted or not. This is all really just toggling and choices.

Then there are times when multiple citation styles are required in the same document, and having to use just one citation command to handle that would be convenient.

Using key-value pairs is one way to implement a (multi)style like this, so that a particular citation style becomes "just" a package option which sets a set of (usually many) key-value pairs.


I

For example, McGill-style could be selected via package option:

\usepackage[...,lawcitestyle=mcgill,...]{biblatex}

which in turn sets the style parameters and toggles

\newcommand\lcsetstylemcgill{%
   \togglefalse{partysepdotted}%
   \toggletrue{partysepitalic}%
   \toggletrue{partynamesitalic}%
   ...
}

which in turn means that the bibmacros and fieldformats need only look at their own fields, logically independent of which actual style the user has selected (or switched to or from):

\DeclareFieldFormat{partysep}{%
  % #1 is the party separator text (v, c, ...)
  \iftoggle{partysepitalic}
    {\mkbibemph{#1}\iftoggle{partysepdotted}{\adddot}{}}% italic separator
    {#1\iftoggle{partysepdotted}{\adddot}{}}% upright separator
}

(Note: the above code could be re-written more efficiently)

Changing styles, even mid-sentence, becomes trivial. Just toggle the toggle(s). Or better: a switch to do that.
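
Such a switch could look roughly like the following sketch. It uses only etoolbox and plain text instead of biblatex field formats, so that it runs on its own. The command \lcsetstyle, the style name mlr and the demo macro \partysep are made up for this illustration; only the toggle names come from the code above.

```latex
\documentclass{article}
\usepackage{etoolbox}

% The style "atoms" as etoolbox toggles.
\newtoggle{partysepdotted}
\newtoggle{partysepitalic}

% One setter per style; each just flips the toggles.
\newcommand\lcsetstylemcgill{%
  \togglefalse{partysepdotted}%
  \toggletrue{partysepitalic}}
\newcommand\lcsetstylemlr{% hypothetical "Modern Law Review" setter
  \toggletrue{partysepdotted}%
  \togglefalse{partysepitalic}}

% Hypothetical switch: \lcsetstyle{<name>} calls \lcsetstyle<name>
% if it exists and warns otherwise.
\newcommand\lcsetstyle[1]{%
  \ifcsdef{lcsetstyle#1}
    {\csuse{lcsetstyle#1}}
    {\PackageWarning{lawcite-demo}{Unknown style '#1'}}}

% Stand-in for the partysep field format: it reads only the toggles.
\newcommand\partysep{%
  \iftoggle{partysepitalic}{\textit{v}}{v}%
  \iftoggle{partysepdotted}{.}{}}

\begin{document}
\lcsetstyle{mcgill}Croome \partysep\ Tasmania;
\lcsetstyle{mlr}Croome \partysep\ Tasmania.
\end{document}
```

The formatting macro never needs to know which style is active, which is the point of breaking a style down into toggles.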


II

Sometimes, though, the sequence of items needs to be re-arranged "on the fly".

So with a user-defined sequence of items stored in a bibentry field (in effect, the items are a type of very simple markdown):

@book{ualbertabook,
author={Kevin P McGuinness},
title={Canadian Business Corporations Law},
edition = {3},
volume = {1},
publisher={LexisNexis Canada},
location = {Toronto},
date={2017},
keywords = {lawbook},
yoptions = {name, dot, space, title, comma, space, edition, space, lparen,  location, colon, space, publisher, comma, space, year, rparen, space, volume, dot},
}

the citation command can iterate through the list, doing a regex-replace

\regex_replace_all:nnN 
    { comma } 
    { \c{addcomma} } 
    \l_myscriptname_tl
   ...

Easier to see the item-sequence as a table:

[Image: table of the user sequence]

This is a "bottom-up" approach, where style "atoms" build up to the result.
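
As a rough, self-contained illustration of walking such an item list, the sketch below swaps the regex replacement for a simpler technique: a comma-list map with a string case switch from expl3. The command \renderseq and the \demo... macros are hypothetical stand-ins for the real citation command and for biblatex's field printing. A recent LaTeX kernel (2020-10 or later) is assumed.

```latex
\documentclass{article}

% Stand-ins for bibliography fields (would be \printfield etc. in biblatex).
\newcommand\demoname{Kevin P McGuinness}
\newcommand\demotitle{\textit{Canadian Business Corporations Law}}
\newcommand\demoedition{3}
\newcommand\demolocation{Toronto}
\newcommand\demopublisher{LexisNexis Canada}
\newcommand\demoyear{2017}
\newcommand\demovolume{1}

\ExplSyntaxOn
% \renderseq{<items>}: map over the comma-separated items in order.
% Known punctuation names print punctuation; any other item is taken
% to be a field name and printed via the matching \demo<name> macro.
\NewDocumentCommand \renderseq { m }
  {
    \clist_map_inline:nn {#1}
      {
        \str_case:nnF {##1}
          {
            { comma }  { , }
            { dot }    { . }
            { colon }  { : }
            { space }  { ~ }
            { lparen } { ( }
            { rparen } { ) }
          }
          { \use:c { demo ##1 } }
      }
  }
\ExplSyntaxOff

\begin{document}
\renderseq{name, dot, space, title, comma, space, edition, space,
  lparen, location, colon, space, publisher, comma, space, year,
  rparen, space, volume, dot}
\end{document}
```

Reordering the printed citation then only requires reordering the item list, not touching any macro.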


III

And rather than necessarily storing the style information in a bibentry, dynamic commands can be used.

Here is a very simple example:

With an intended sequence/structure like this:

guide=aglc
 citetype=case
 items
    item:title
      itemtitlepart:partya
        itemtitlepart:partya:format=italic
        itemtitlepart:partya:delim=space
      itemtitlepart:partysep
        itemtitlepart:partysep:text:default=v
        itemtitlepart:partysep:format=italic
        itemtitlepart:partysep:delim=space
      itemtitlepart:partyb
        itemtitlepart:partyb:format=italic
        itemtitlepart:partyb:delim=none
      item:title:format=none
      item:title:delim=space
    item:refmnc
       item:refmnc:year
          item:refmnc:year:format=brackets
          item:refmnc:year:delim=space            
       item:refmnc:courtname
          item:refmnc:courtname:format=none
          item:refmnc:courtname:delim=space 
       item:refmnc:casenumber
          item:refmnc:casenumber:format=none
          item:refmnc:casenumber:delim=none

then using a high-level style-pointer like this:

\renewcommand\lguide{aglc}

and a bibentry like this:

@case{croome,
  partya = {Croome},
  partyb = {Tasmania},
  caseshortname = {Croome},
%paper
  reportyear={1997},
  reportvolume = {191},
  reportseries = {CLR},
  reportpage = {119},
 }

cited like this:

\yycite[\nopp 125]{croome}

which uses the cite command to call (in this case) a core macro

\DeclareCiteCommand{yycite}%
{\usebibmacro{prenote}}%
{%
\usebibmacro{yycore}% <===================
}%
{%
\multicitedelim%
}%
{\usebibmacro{postnote}}

which in turn calls the sequence of items

\newbibmacro{yycore}{%
%\usebibmacro{set:multidelim}%
\usebibmacro{yycore:seq}% <===================
\usebibmacro{yycore:postnoteprelim}%
}

which in this case is static/hard-coded, but can instead be a dynamic macro:

\newbibmacro{yycore:seq}{%
\usebibmacro{yy:case:partya}%
\usebibmacro{yy:case:partyb}%
\usebibmacro{yy:case:mnc}%
\usebibmacro{yy:case:report}%   
}

which in turn unfolds or expands into (for example, looking at the first item) the formatted item and its (formatted) delimiter:

\newbibmacro{yy:case:partya}{%
\iffieldundef{partya}{}{%
\printfield[\lguide :case:item:titlepart:partya:format]{partya}%
\printfield[\lguide :case:item:titlepart:partya:delim]{partya}%
}%
}

the item's formatting for the selected style looking like this

\DeclareFieldFormat[case]{aglc:case:item:titlepart:partya:format}{\mkbibitalic{#1}}

and the delimiter looking like this

\DeclareFieldFormat[case]{aglc:case:item:titlepart:partya:delim}{\addspace}

and so, similarly for the other fields, to produce:

[Image: the citation formatted in aglc style]

the whole aglc:case:item:titlepart:partya:delim string could be dynamically built up from :-separated macros defined from reading an input file (someone will still have to write the legal CSL file(s), though :) ).


Biblatex is already applying styles (through style files), so the equivalent of CSL.

Biblatex is also doing style transformations ("CST") on style objects ("CS:O"), so, intuitively, mapping a CSL-style into a Biblatex-style, at the atom-to-atom level, should be straightforward. Like a bundle of options.

Doing it at the style-level, where the "atoms" already form a style "molecule" so to speak, would be much more intricate (though not impossible) with, presumably, unwrapping and re-wrapping things.

Cicada
  • There is a lumber-room of code and modules at https://github.com/texcicada/lawcite for the curious. I'll have to systematise it. Too many axes being explored. ;) – Cicada Oct 12 '21 at 12:29
  • there is also new Lua CSL processor: https://github.com/zepinglee/citeproc-lua. It is not ready yet, but it is definitely intended to be used with LuaLaTeX. – michal.h21 Oct 12 '21 at 14:38
  • @michal.h21 Do you want to put an update on https://tex.stackexchange.com/questions/69267/citation-style-language-csl too? – Cicada Oct 13 '21 at 05:39
  • Sure. I've created a simple LaTeX interface for the library and posted it as an answer. – michal.h21 Oct 13 '21 at 13:07