This question led to a new package:
checkcites

While reading papers, I put those that I'd like to refer to or mention in customized .bib files (one .bib file for each paper). (Thanks to Mendeley this is easy and straightforward; they call these "collections".)

My intention is that all entries in the .bib file should be mentioned in the text.

How can I check which bibliography entries I have not referred to?

Tim N
  • If TeX code would be helpful here, or more details, please let me know; I'd love to improve the question. – Grzegorz Wierzowiecki Feb 03 '12 at 13:23
  • Only those entries \cited will appear in the bibliography. If you want to make a .bib file that consists of only those entries cited, then see this: http://tex.stackexchange.com/questions/32032/extract-all-citations-from-tex-file – Seamus Feb 03 '12 at 13:28
  • If you are in a unix-like environment, I can write for you a short script that does exactly what you need. The information on used references is stored in the .aux file in the form \citation{ref-label}. – yo' Feb 03 '12 at 13:29
  • Maybe you could extract the cited references from your .bib file using bibexport to a new file and then diff both to see the differences. – Paulo Cereda Feb 03 '12 at 13:33

7 Answers

Another way is to use the refcheck package and add \nocite{*} in your document. This package will warn you (amongst other things) about unused references. My MWE:

\begin{filecontents}{\jobname.bib}
@BOOK{foo:2012a,
  title = {My Title One},
  publisher = {My Publisher One},
  year = {2012},
  editor = {My Editor One},
  author = {Author One}
}

@BOOK{foo:2012b,
  title = {My Title Two},
  publisher = {My Publisher Two},
  year = {2012},
  editor = {My Editor Two},
  author = {Author Two}
}

@BOOK{foo:2012c,
  title = {My Title Three},
  publisher = {My Publisher Three},
  year = {2012},
  editor = {My Editor Three},
  author = {Author Three}
}
\end{filecontents}

\documentclass{article}

\usepackage{refcheck}

\begin{document}

Hello world \cite{foo:2012a,foo:2012b}.

\nocite{*}

\bibliographystyle{plain}
\bibliography{\jobname}

\end{document}

The output:

[Image: My output]

The reference keys not cited in your document will be displayed between ?...?. Besides, refcheck adds the following line to your .log file:

<Info by RefCheck> Unused bibitem `foo:2012c' on page 1.

Hope it helps. :)

Update: Based on this question, egreg and I came up with a Lua script called checkcites. The idea of this script is to detect unused or undefined references from LaTeX auxiliary (.aux) or bibliography (.bib) files.

Consider the following example: document.tex

\begin{filecontents}{\jobname.bib}
@BOOK{foo:2012a,
  title = {My Title One},
  publisher = {My Publisher One},
  year = {2012},
  editor = {My Editor One},
  author = {Author One}
}

@BOOK{foo:2012b,
  title = {My Title Two},
  publisher = {My Publisher Two},
  year = {2012},
  editor = {My Editor Two},
  author = {Author Two}
}

@BOOK{foo:2012c,
  title = {My Title Three},
  publisher = {My Publisher Three},
  year = {2012},
  editor = {My Editor Three},
  author = {Author Three}
}

@BOOK{foo:2012d,
  title = {My Title Four},
  publisher = {My Publisher Four},
  year = {2012},
  editor = {My Editor Four},
  author = {Author Four}
}
\end{filecontents}

\documentclass{article}

\begin{document}

Hello world \cite{foo:2012a,foo:2012c},
how are you \cite{foo:2012e}, and
goodbye \cite{foo:2012d,foo:2012a}.

\bibliographystyle{plain}
\bibliography{\jobname}

\end{document}

After compiling it, an auxiliary file document.aux will be generated. Now we will use texlua (which is available in both MiKTeX and TeX Live distros) to run checkcites on the .aux file:

$ texlua checkcites.lua document.aux

The script will look for all unused and undefined references. The output will be:

[Image: checkcites terminal]

If you want to look only for unused references in your .bib file, you can add the --unused flag:

$ texlua checkcites.lua --unused document.aux

The argument order doesn't matter, so you can also call:

$ texlua checkcites.lua document.aux --unused 

Similarly, using

$ texlua checkcites.lua --undefined foo.aux

will make the script only look for undefined references in your .tex file.

To look for both unused and undefined references, you can also use the --all flag. If no flag is provided, checkcites will behave as if --all was provided.

The script is now available on CTAN and tlmgr will install it on TeX Live 2011 (or later). Hope you guys like our humble script. :)

Update: As checkcites is now available on CTAN and in TeX Live, there's no need to call texlua explicitly or to keep the original checkcites.lua file; a simple call to

$ checkcites --unused document.aux

will suffice, since the script is properly wrapped. :)

Paulo Cereda
  • Wow, I was not aware of this package before. Pretty elegant! – Daniel Feb 03 '12 at 22:56
  • Very nice package. I was wondering, is it possible to extend the package so that it creates an output file? for example, a latex file that cites the uncited entries. Or at least a simple text file with the list of unused entries that the user can later exploit to generate the report. It can be useful when you are collaborating and you want to send a PDF with this information to your coworkers. Thank you! – Felipe Aguirre Aug 23 '12 at 07:33
  • I just tested it with my Thesis.tex, which is divided into several .tex files (I use \include to input each chapter). Sadly, checkcites does not work in this case. A citation in Chap1.tex is logged in Chap1.aux, so there is no record of it in Thesis.aux. I naively tried to run checkcites on Chap1.aux, but it does not have \bibdata with the name of the .bib, so it can't find the bibliography to compare with. Any ideas? Anyway, it is still a great package. – Felipe Aguirre Aug 23 '12 at 07:50
  • Hi @Felipe! :) Thanks for the feedback. Currently, checkcites can handle multiple .bib files, but AFAIK only one .aux file. I'll take a look at the code and try to work on an improved version. :) About the output to an external file, a workaround for now would be to redirect the output with $ checkcites mydoc.aux > out.txt. Hope it helps. :) – Paulo Cereda Aug 23 '12 at 10:15
  • That is a good solution for the moment. Although, I encourage you to add the functionality of creating a PDF output. It would be much more elegant! I would do it myself, but I don't know LUA and I am writing my thesis, I don't have much time. Thank you!! – Felipe Aguirre Aug 23 '12 at 10:47
  • on the MWE for refcheck solution, I get a File 'refcheck.sty' not found. I get the same error when trying to use it in my larger document. Any suggestions? – SwimBikeRun Jun 17 '13 at 02:46
  • @SwimBikeRun: Sounds like you have an outdated TeX distro. If you have a vanilla install of TeX Live (the one provided by TUG), you need to run tlmgr in order to update your distro (for both Linux and Windows). Now, if you have TeX provided by your Linux distro, you'll need to rely on the available packages in the software manager (apt-get, yum, etc). Take a look at this topic it might help. :) – Paulo Cereda Jun 17 '13 at 10:33
  • I have a thesis.tex file using biber/biblatex, pointing to only one .bib file in a subdirectory, with the thesis.aux file in the same directory as thesis.tex. Yet I am getting the error "I found 0 citation(s). I couldn't find any bibliography files. I'm afraid I have nothing to do now.". Can someone point out lines of enquiry I can follow to resolve this, please? – DGarside Oct 25 '13 at 08:09
  • @DGarside: I believe checkcites doesn't support files in other directory levels, that is, both .aux and .bib have to be at the same level. A possible solution would be copying the .aux file to the same directory of the .bib file; I know it doesn't sound like a good solution at all, but it's what I can think of at the moment. :) – Paulo Cereda Oct 25 '13 at 10:46
  • Perhaps you could edit your answer a bit, as it seems from Know which entry of bib file has not been used that calling texlua explicitly is not necessary. – Torbjørn T. Mar 26 '15 at 09:16
  • @PauloCereda I cannot get checkcites to work. Does it require that I use the \cite command in my .tex document? I am using \parencite{} and \textcite{}. Also, the checkcites script says it does not find any bibliography file, but the .bib file is right there in the directory from which I run the script. – Yoda May 28 '17 at 19:44
  • @Yoda: it's been a while, so pardon if I do not remember how my code works. :) If I recall correctly, the script inspects the .aux file and looks for \bibcite. Regarding the .bib file, I might need more info, if possible. Cheers! – Paulo Cereda May 29 '17 at 19:59
  • This is a great tool! As far as I understand, checkcites is currently not able to handle multiple .aux files (which occur e.g. when using \include). Are you planning on adding this feature? –  Jun 09 '17 at 10:02
  • @jpmath: Hi, thanks for the kind words! You are right, checkcites does not support multiple .aux files at the moment, but we are indeed planning to support this feature! Also, the script is not biblatex compliant, so we want to make it compatible too! I am in a hurry because of my thesis, but I will try to do my best and update the tool as soon as possible. :) – Paulo Cereda Jun 09 '17 at 10:17
  • That is great news (also making it biblatex compliant). All the best for your thesis! –  Jun 12 '17 at 17:07
  • I have a project split into several *.tex (*.aux) files, and I was going to use a for loop in the shell as a workaround for the fact that checkcites only works with one file at a time. However, I always get the "I couldn't find any bibliography files." message, likely because the bibliography is only included (directly referenced) in the main.tex file, but not in the other files. I am leaving this comment here because other users will probably also bump into the same issue. – thiagowfx Jul 10 '17 at 02:08
  • @jpmath: new version updated, support for multiple .aux files was included, as well as biblatex support. Cheers! – Paulo Cereda Aug 26 '17 at 10:25
  • @thiagowfx: I added support for multiple .aux files, hope it helps! Cheers! – Paulo Cereda Aug 26 '17 at 10:26
  • For biblatex users, use the new --backend option, e.g. checkcites --backend biber document. No need to provide a file extension! – ApolloLV Jul 20 '18 at 17:57
  • @ApolloLV: Oh my, I forgot to update my own answer! Will do that ASAP. Thanks for reminding me of this! – Paulo Cereda Jul 20 '18 at 18:05
  • For me checkcites $(find . -type f -name '*.aux') worked pretty well! – Daniel Eisenreich Aug 19 '19 at 11:37
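
On the \include problem raised in these comments: for every \include'd chapter, LaTeX writes a line of the form \@input{chapter.aux} into the main .aux file, so the per-chapter files can be discovered from there instead of being searched for with find. The following is only a minimal sketch: all_aux_files is a hypothetical helper name, and it assumes you run it from the directory in which the main .aux file was written, so that the recorded paths resolve.

```shell
#!/bin/sh
# all_aux_files MAIN_AUX (hypothetical helper, not part of checkcites):
# print the main .aux file, then every .aux file it pulls in through
# \@input{...} lines, one file name per line.
all_aux_files() {
  printf '%s\n' "$1"
  sed -n 's/^\\@input{\([^}]*\)}.*/\1/p' "$1"
}
```

It could then be used in the same way as the find call above, for example checkcites $(all_aux_files Thesis.aux). This relies on the shell's word splitting, so file names containing spaces are not handled.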

This question deserves a brief answer to demonstrate how simple checkcites actually is. At your prompt simply type ...

$ checkcites my_document.aux

... and it prints out a nicely formatted list of unused + undefined references.

This works out of the box if you have a full installation of a recent TeX Live distribution (2011+), where the checkcites command should already be on your path. It's a simple command-line tool; no modification of your document is required (e.g. no \usepackage).

I hope my answer will save others from manually downloading the script from CTAN and studying the docs, only to find out that RTFM is overkill in this case :).

bluenote10

One possibility would be the following:

  1. Switch the bibtex bibliography style to unsrt (with biblatex pass the sorting=none option)
  2. Compile your document --> PDF with cited references
  3. At the very end of your document (before the \bibliography command) add a \nocite{*}
  4. Compile your document --> PDF with all references
  5. Compare the bibliography of both PDFs. The "extra" entries at the end are those you have not yet cited.

However, if you work in a UNIX-like environment, I would probably use tools like grep and sed to extract the keys from the .aux file and the .bib file and diff them:

cat paper.aux | grep '\\citation' | sed -e 's/\\citation{\(.*\)}/\1/g' -e 's/ *\(.*\) */\1/g' | tr ',' '\012' | sort -u > paper.keys
cat paper.bib | grep "@.*{.*," | sed -e 's/.*{\(.*\),/\1/' | sort -u > bib.keys
grep -v -x -F -f paper.keys bib.keys

The first line extracts all keys given in \citation{key1[,key2 ...]} commands from the aux-file into distinct lines, sorts and deduplicates them (every key appears exactly once), and redirects the result into the file paper.keys. The second line does the same for the keys contained in @<some type>{key, lines in the bib file. The third line prints out every line from bib.keys that is not contained in paper.keys, which is the delta you are interested in.
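
The same three steps can also be wrapped into small shell functions that report both directions (unused and undefined keys) by comparing whole lines with comm. This is only a sketch under stated assumptions: the function names cited_keys, bib_keys and report_keys are made up for illustration, each .bib entry is assumed to start on its own line in the form @type{key, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of any package.

# cited_keys AUX: print each key found in \citation{...} lines, one per line.
# Comma-separated lists are split; empty lines and the "*" that \nocite{*}
# writes to the .aux file are dropped.
cited_keys() {
  grep '\\citation{' "$1" |
    sed -e 's/.*\\citation{\([^}]*\)}.*/\1/' |
    tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    grep -v -e '^$' -e '^\*$' |
    LC_ALL=C sort -u
}

# bib_keys BIB: print the key of every entry, assuming "@type{key," starts a line.
# @string, @comment and @preamble blocks are skipped because they define no key.
bib_keys() {
  grep '^[[:space:]]*@' "$1" |
    grep -E -i -v '^[[:space:]]*@(string|comment|preamble)' |
    sed -e 's/^[^{]*{[[:space:]]*//' -e 's/[[:space:]]*,.*$//' |
    LC_ALL=C sort -u
}

# report_keys AUX BIB: list keys that are defined but never cited, and vice versa.
report_keys() {
  tmp=$(mktemp -d) || return 1
  cited_keys "$1" > "$tmp/cited"
  bib_keys "$2" > "$tmp/defined"
  echo "Unused (defined but never cited):"
  LC_ALL=C comm -13 "$tmp/cited" "$tmp/defined"   # lines only in the second file
  echo "Undefined (cited but not defined):"
  LC_ALL=C comm -23 "$tmp/cited" "$tmp/defined"   # lines only in the first file
  rm -rf "$tmp"
}
```

Usage would be report_keys paper.aux paper.bib. Both lists are sorted in the C locale because comm expects its inputs in the same order that sort produced.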

Daniel
  • Why the four backslashes in the argument to grep? And wouldn't it be simpler starting with grep '\\citation' paper.aux? – egreg Feb 03 '12 at 14:39
  • @egreg I think that Daniel took it from me ;) and these things are just a matter of habit I think ;) – yo' Feb 03 '12 at 15:04
  • @tohecz Well, on my system it doesn't work with four backslashes. – egreg Feb 03 '12 at 15:06
  • @egreg: tohecz and I discussed how to build this line in another question; he is right that I took the grep part from him. The point is: if you put the parameter in double quotes, the shell will reduce the four backslashes to two and grep reduces those to one. If you put it in single quotes (which you probably tried), the shell will not perform substitution, so four backslashes are passed to grep and the pattern does not match. I have modified the solution to use single quotes. – Daniel Feb 03 '12 at 16:37
  • @egreg: I prefer cat <file> | grep <expression> just because it is then more visible which <file> I am working on (especially if <expression> is nontrivial). This is slightly slower, though. – Daniel Feb 03 '12 at 16:40
  • @Daniel No, I didn't use single quotes: apparently it's tcsh that doesn't reduce \\ to one backslash inside double quotes. So if you want the one-liner to run on as many shells as possible, single quotes seem the best choice. – egreg Feb 03 '12 at 16:58
  • +1. In the middle line, all.keys should be bib.keys, since that's what you use in the last grep. – Ramashalanka Aug 23 '12 at 00:57
  • +1 Because in contrast to the script mentioned above this one works perfect for multiple .aux files. – Christophe De Troyer Jun 14 '15 at 10:26

Given that you are using bibtex and not biblatex, the Bibtool utility may be able to help you. Say you have a myreferences.bib file you've created in Mendeley or wherever. You write paper.tex with the intention of explicitly citing every item from myreferences.bib. To see if you've forgotten anything, use bibtool to extract a bibliography of all and only the references cited in paper.tex and compare its contents to the original:

bibtool -x paper.aux -o paperrefs.bib

Now you can compare paperrefs.bib to myreferences.bib:

diff myreferences.bib paperrefs.bib

Ideally you'll find there's no difference. You could probably work this up into a script of the sort mentioned by tohecz, but using bibtool means you don't have to come up with a regular expression to catch the various forms of citation. Your original myreferences.bib should be sorted alphabetically for this to work properly.

Kieran

This answer will give you a clue, but only for biblatex, I believe. (I won't duplicate it here)

It allows you to use \nocite{*} to place all the unreferenced .bib file entries in a separate section of the bibliography.

Normally, one might publish this as 'additional reading', for example, but in your case you can use it to help prune your .bib file.

  • This works only with biblatex. I don't think that the OP uses biblatex ... P.S. I guess you mean \nocite{*} instead of \nocite(*) – Thorsten Feb 03 '12 at 13:36

On Unix-like systems, you can create the following bash script, called e.g. diffrefs:

#!/bin/bash
f=${1%.???}
cat $f.aux | grep "\\\\citation" | sed -e 's/\\citation{\(.*\)}/\1/g' -e 's/,/\n/g' -e 's/ *\(.*\) */\1/g' | sort | uniq > $f.cit
# cat $f.aux | grep "\\\\citation" | sed -e 's/\\citation{\(.*\)}/\1/g' -e 's/ *\(.*\) */\1/g' | sort | uniq > $f.cit
cat $f.bib | grep "^@" | sed 's/.*{\|,.*//g' | sort >$f.bit
echo "Unused but defined references:"
diff $f.cit $f.bit | grep "^>"
echo "Undefined but used references:"
diff $f.cit $f.bit | grep "^<"

Usage: diffrefs myfile

Explanation:

  • cat $f.aux ...: outputs the auxiliary file (cat), takes the information on used citations (grep), extracts the labels, removes white-space (sed), sorts and removes duplicates (sort, uniq)
  • cat $f.bib ...: outputs the BibTeX file (cat), takes the first line of each entry (grep), removes everything before { and after , (sed), and sorts the result (sort).
  • diff ...: prints differences between two files, the characters < and > say "which file is larger by that line"

Edit: I found that LaTeX automatically performs the comma separation of the \cite{...} parameter, so I removed it from the script.

Edit: As of 2021, the earlier version of the code (with the comma splitting) is the one that works with elsarticle, so it has been restored; the other version is kept as a comment (#).
The script requires the .tex and .bib files to share the same base name (if they don't, use ln -s mybib.bib LaTeXFileName.bib).

Hastur
yo'
  • No, the argument to \citation can contain commas, depending on the package used for citation, for instance natbib. – egreg Feb 03 '12 at 14:43

Since I ran into some issues with the refcheck package, a rather simple option would be to load hyperref with the pagebackref option. Then, doing a \nocite{*} in the main TeX file, the bibliography gets back references to the pages where an entry was cited. No citation -> no back reference.

\begin{filecontents}{\jobname.bib}
@BOOK{foo:2012a,
  title = {My Title One},
  publisher = {My Publisher One},
  year = {2012},
  editor = {My Editor One},
  author = {Author One}
}

@BOOK{foo:2012b,
  title = {My Title Two},
  publisher = {My Publisher Two},
  year = {2012},
  editor = {My Editor Two},
  author = {Author Two}
}

@BOOK{foo:2012c,
  title = {My Title Three},
  publisher = {My Publisher Three},
  year = {2012},
  editor = {My Editor Three},
  author = {Author Three}
}
\end{filecontents}

\documentclass{article}

\usepackage[pagebackref]{hyperref}

\begin{document}

Hello world \cite{foo:2012a,foo:2012b}.

\nocite{*}

\bibliographystyle{plain}
\bibliography{\jobname}

\end{document}
mafp