15

Is there a fool-proof way to extract all bibtex citation-keys that are cited in a .tex file?

I do not mean regular-expression magic on the .tex-file because this is bound to cause problems when switching between natbib, apacite etc. which all use different citation commands. Also, citations made using \nocite{*} will not be included ...

I thought about looking into the .bbl file, which does contain all references included in the final document, but the format of the .bbl file differs vastly between packages as well, so extracting the keys is difficult.

Thorsten
thias

3 Answers

11

With bibtool you can do as follows:

bibtool -- preserve.key.case=on -x file.aux -o bibliography.bib

This extracts your cited bibliography. Now if you just grep the file for lines with @ in them, you get fairly close to a list of keys. The option preserve.key.case=on ensures that the case of the keys is not altered (in response to the comment below).
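The "grep for lines with @" step can be sketched in Python as follows. This is only an illustration, not part of bibtool: the function name `bib_keys` and the regular expression are assumptions of this sketch, and it assumes each entry starts on its own line in the form `@type{key,`, which is how the extracted file is typically laid out.

```python
import re

# Start of a BibTeX entry such as "@article{Knuth84,"; captures the
# entry type and the key. Assumes the entry begins at the start of a line.
ENTRY_RE = re.compile(r"^\s*@(\w+)\s*[{(]\s*([^,\s]+)\s*,", re.MULTILINE)

# "@" blocks that do not define a citable key.
NON_ENTRIES = {"string", "preamble", "comment"}


def bib_keys(bib_text):
    """Return the citation keys found in BibTeX source, in file order."""
    return [
        key
        for kind, key in ENTRY_RE.findall(bib_text)
        if kind.lower() not in NON_ENTRIES
    ]
```

For example, `bib_keys(open("bibliography.bib").read())` would list the keys of the file written by the command above.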

ttq
Seamus
  • Nice! However, I want to get along without bibtool. I found that it destroys some of my entries. For example, it decapitalizes my keys all the time... – thias Oct 19 '11 at 13:37
  • See also bibexport [https://tex.stackexchange.com/a/41823/28411] – alexis Jul 23 '20 at 20:20
9

The citations are contained in the .aux file.

\usepackage{atveryend}
\makeatletter
\let\origcitation\citation
\AtEndDocument{\def\mycites{\@gobble}%
  \def\citation#1{\g@addto@macro\mycites{,#1}\origcitation{#1}}}
\AtVeryEndDocument{\typeout{***^^JCited keys: \mycites^^J***}}
\makeatother

This will show on screen and in the .log file, at the end of the LaTeX run, a message such as

***
Cited keys: xxx,yyy,*
***

It would be possible to avoid the appearance of *, but I don't think it's worth the trouble. Only actually cited keys will appear (BibTeX uses \citation{*} as a signal for including the whole database).

One can output the citations to an auxiliary file, instead:

\usepackage{atveryend}% provides \AtVeryEndDocument
\makeatletter
\let\origcitation\citation
\AtEndDocument{\def\mycites{}%
  \def\citation#1{\g@addto@macro\mycites{#1^^J}\origcitation{#1}}}
\AtVeryEndDocument{\newwrite\citeout\immediate\openout\citeout=\jobname.cit
  \immediate\write\citeout{\mycites}\immediate\closeout\citeout}
\makeatother

Then, if the file is test.tex, the citation keys will be saved in the file test.cit one per line.
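The same keys can also be read from the .aux file outside of TeX. The following is a minimal sketch, assuming the classic BibTeX workflow in which every citation is recorded as a `\citation{...}` line; the function name `cited_keys` is made up for this example. It splits comma-separated arguments (which some setups such as natbib can write), skips the `*` written by `\nocite{*}`, and returns each key once in order of first appearance. It does not cover biblatex, which records citations differently.

```python
import re

# Argument of every \citation{...} line in the .aux file.
CITATION_RE = re.compile(r"\\citation\{([^}]*)\}")


def cited_keys(aux_text):
    """Return the unique cited keys in order of first appearance."""
    keys = []
    seen = set()
    for argument in CITATION_RE.findall(aux_text):
        # The argument may be a single key or a comma-separated list.
        for key in argument.split(","):
            key = key.strip()
            # "*" is BibTeX's marker for \nocite{*}, not a real key.
            if key and key != "*" and key not in seen:
                seen.add(key)
                keys.append(key)
    return keys
```

With the example above, `cited_keys(open("test.aux").read())` would give the keys as a Python list that a script can process further.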

egreg
  • Yes, I noted that \nocite{*} citations do not appear. However, in the .aux file, all items appear in \bibcite{} commands (at least in my current test-setup)... – thias Oct 19 '11 at 13:34
  • \bibcite entries are written when reading the thebibliography environment, so also keys coming from \nocite{*} will be there. The right entries are the \citation ones. – egreg Oct 19 '11 at 14:39
  • so, replacing \citation with \bibcite everywhere in your code will output all the used citation keys? That would be exactly what I need... – thias Oct 19 '11 at 14:58
  • I tested it and it works. One minor issue: latex breaks the output at (probably) exactly 80 characters such that the list is broken at weird places. Since I want to use the output in a script, is there a way to output it without line-breaks? – thias Oct 19 '11 at 15:00
  • @thias Yes, it's possible, see edited answer – egreg Oct 19 '11 at 15:14
  • In unix-like systems, one can write cat myfile.aux | grep "\\\\citation" | sed 's/\\citation{\(.*\)}/\1/g' | sort | uniq > myfile.cit and the resulting file contains all used citations, and each of them exactly once. – yo' Feb 03 '12 at 13:35
  • @tohecz: Good approach. However, to be fool-proof you additionally have to replace , by newlines in the sed output to get one key per line – the \citation argument may be a comma-separated list of keys. – Daniel Feb 03 '12 at 13:48
  • @Daniel This one should be safer: cat myfile.aux | grep "\\\\citation" | sed -e 's/\\citation{\(.*\)}/\1/g' -e 's/,/\n/g' -e 's/ *\(.*\) */\1/g' | sort | uniq > myfile.cit. How does it work: grep selects the correct lines, sed 1st extracts the parameter, 2nd splits the commas, 3rd trims the whitespace, sort and uniq are self-explanatory. – yo' Feb 03 '12 at 13:57
  • @tohecz: Not really, apparently it is not so easy to insert a newline with sed. However, the following works: cat paper.aux | grep "\\\\citation" | sed -e 's/\\citation{\(.*\)}/\1/g' -e 's/ *\(.*\) */\1/g' | tr ',' '\012' | sort -u – Daniel Feb 03 '12 at 14:09
  • @Daniel well, I tested it and it worked :-/ can you post the line from .aux that caused you the troubles? – yo' Feb 03 '12 at 14:15
  • @Daniel btw, my LaTeX puts in the .aux file \citation{a}<newline>\citation{b} for \cite{a,b} in the code, hence it seems the whole thing with comma is unnecessary – yo' Feb 03 '12 at 14:17
  • @tohecz I can confirm that my sed doesn't allow \n in the substitution string. There are many variations of sed around. When you use natbib, arguments to \citation can have the comma. – egreg Feb 03 '12 at 14:29
  • @tohecz: I guess this depends on the bibliography style. If you use \bibliographystyle{acm}, for instance, \cite{a,b,c} is transformed into \citation{a,b,c}. So substitution of commas to newlines seems to be more robust. – Daniel Feb 03 '12 at 14:32
  • @egreg: What is the biblatex way of doing this? Is it possible? – Dror Feb 04 '14 at 14:21
  • @Dror I believe so, but it's probably better to open a new question. – egreg Feb 04 '14 at 14:27
3

Various TeX-aware programming editors have macros to achieve this. For instance, there's a package called bibmacros for use with WinEdt which (inter alia) does the job you describe. It works on the .aux file created by LaTeX and BibTeX, and creates a new bib file called jobname-minimal.bib, where jobname is the name of the aux file (without the "aux" extension, of course). Other editors must have similar macros, either built-in or accessible as extra packages.

Mico
  • Hey, that's right! For emacs, I found M-x reftex-create-bibtex-file... – thias Oct 19 '11 at 15:05
  • @thias --- Really? I don't see that option in reftex 4.31. Could you elaborate? (I usually use bibtool, but it doesn't handle cross-references in the .bib file very well, so an emacs solution would be great.) – jon Feb 03 '12 at 14:55
  • Found in menu [Ref] -> [Global Actions] -> [Create BibTeX File] – thisirs Nov 14 '12 at 09:15