174

Normally, LaTeX will only create references for the BibTeX entries cited in the text. Is there a way of extracting these entries into a different .bib file automatically? For example if I have a .bib file with two entries, and only one is cited in a particular text, I need a way of creating a new .bib file with just that reference.

lockstep
  • 250,273
  • 2
    For Mac users of BibDesk, there is BBL to BIB with BibDesk. – Adam Liter Sep 19 '13 at 03:08
  • 2
    For Windows, JabRef provides this capacity: see here –  Jul 18 '14 at 14:10
  • might be a duplicate of http://tex.stackexchange.com/questions/417/how-to-split-all-bibtex-referenced-entries-from-a-big-bibtex-database-to-a-copy – lenz Feb 23 '15 at 00:09
  • 1
    BBL to BIB seems cool but does not seem to be compatible with biber/biblatex. – Stefan Müller Nov 29 '18 at 09:18
  • Is it possible to do this in overleaf somehow? – canIchangethis Nov 07 '22 at 09:08
  • @canIchangethis: yes. by creating a custom latexmkrc along the lines of $lualatex = 'lualatex %O %S; bash ./export.sh';, where you can put whatever shell commands you like in export.sh. – Eric Jun 21 '23 at 17:04
  • Thank you @Eric. Your code in an otherwise empty latexmkrc does generate my PDF but does not generate the selective bibfile - at least don't know where I would find it in my overleaf. Any suggestion how I can use a shorter bib export tool in a biblatex, pdflatex, and biber workflow in overleaf? – canIchangethis Dec 19 '23 at 19:28
  • 1
    @canIchangethis: did you add one of the other answers below to the contents of export.sh? If so, the generated file should appear in the other outputs within the log pane of overleaf. – Eric Dec 20 '23 at 17:13
  • wait, export.sh is also a file? How do I create this? – canIchangethis Dec 21 '23 at 20:14

11 Answers

154

With a TeX Live distribution (possibly also with MiKTeX) there is a bibexport program. Assuming your document is myarticle.tex, you have to compile it normally and then you call

bibexport -o extracted.bib myarticle.aux

where extracted.bib is the name that you want to give to your new .bib file. Notice that you have to give the extension .aux (or no extension at all).

Then you have to change the name of the .bib file in your document, in order to use extracted.bib.
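As a rough illustration of what bibexport works from: BibTeX records every cited key in the .aux file as a \citation{...} line. The Python sketch below lists those keys. The function name cited_keys is made up for this example and is not part of bibexport, and the sketch assumes a BibTeX-style .aux file (it does not look at the aux commands written by biblatex).

```python
import re

# BibTeX-style .aux files record citations as \citation{key1,key2}.
CITATION_RE = re.compile(r'\\citation\{([^}]*)\}')


def cited_keys(aux_text):
    """Return the citation keys found in .aux content, in first-seen order.

    Hypothetical helper for illustration only. A \\nocite{*} in the
    document appears here as the single key '*'.
    """
    seen = []
    for group in CITATION_RE.findall(aux_text):
        for key in group.split(','):
            key = key.strip()
            if key and key not in seen:
                seen.append(key)
    return seen


if __name__ == "__main__":
    # Read the .aux file produced by a normal LaTeX run and list its keys.
    with open("myarticle.aux", encoding="utf-8") as f:
        for key in cited_keys(f.read()):
            print(key)
```

Running it next to myarticle.aux prints one key per line, which is a quick way to check what bibexport should end up exporting.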

egreg
  • 1,121,712
  • 9
    In MiKTeX 2.9 this tool is missing :-( – Mensch Feb 15 '13 at 16:24
  • 9
    Although this falls outside of the scope of the question, perhaps a brief word about how this tool does not work with biber-driven bibliographies (or with .bib files that use biblatex fields such as date instead of year) is in order. – jon Feb 15 '13 at 16:29
  • 3
    @jon bibexport uses BibTeX for doing its job. The manual tells how to add new fields (section 1.4). For biblatex/Biber compatibility the script should look not only for \citation commands in the aux file but also for \abx@aux@cite. – egreg Feb 15 '13 at 16:42
  • 7
    @egreg -- Right. I was thinking more for people who look at your answer, but don't realize why bibexport isn't working for them; most of them are not going to be modifying bibexport.sh. Another problem that might arise is if you use non-standard entry types (from BibTeX's perspective). I use @Collectio{<key>,..., e.g., for essay collections; that comes through as @{<key>,.... – jon Feb 15 '13 at 20:02
  • +1 for the usage example – the bibexport help page doesn't mention that the main argument is the .aux file (it might be obvious, but wasn't for me). – lenz Feb 23 '15 at 00:05
61

JabRef can do this in both command-line and GUI modes.

First, GUI mode:

Keep your master.bib file open in JabRef. Then choose Tools → New subdatabase based on AUX file to get

screenshot of AUX file import dialog box

Here, select the .aux file, click Parse, and then click Generate. A sub-database should open in JabRef. Save it.

From the command line, assuming that you take care of paths, run:

jabref.jar -a filename[.aux],newBibFile[.bib]

See also command line options to jabref.

  • 5
    In contrast to the bibexport-solution above, this one works when working with biblatex and biber. – Florian Jan 16 '18 at 10:30
43

This is a supplement to pavel's answer which aims to address an issue raised in the comments. It is therefore a more specific solution than the one there: the simpler command will work fine if you don't need to resolve crossref fields in .bib entries.

In order to resolve crossref fields in a .bib file when using biblatex/biber, you need to tell biber what to do.

Given <filename>.tex, run:

pdflatex <filename>.tex
biber --output_format=bibtex --output_resolve <filename>.bcf

Where latex, xelatex, lualatex etc. can be substituted for pdflatex as appropriate. So long as it generates your .bcf it is fine.
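If you run this often, the two commands above can be chained from a small script. This is only a minimal Python sketch, assuming the chosen engine and biber are on your PATH. The function names are invented for this example, and the biber options are exactly the ones shown above.

```python
import subprocess


def biber_export_commands(basename, engine="pdflatex"):
    """Build the two command lines from this answer for <basename>.tex.

    Hypothetical helper: 'engine' can be any program that writes
    <basename>.bcf (pdflatex, xelatex, lualatex, ...).
    """
    return [
        [engine, basename + ".tex"],
        ["biber", "--output_format=bibtex", "--output_resolve",
         basename + ".bcf"],
    ]


def run_export(basename, engine="pdflatex"):
    """Run the engine, then biber, stopping at the first failure."""
    for cmd in biber_export_commands(basename, engine):
        subprocess.run(cmd, check=True)
```

Calling run_export("myarticle", engine="lualatex") would run lualatex on myarticle.tex and then biber on myarticle.bcf.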

cfr
  • 198,882
31

With biblatex/biber you can use

biber document-base-name.bcf --output_format=bibtex

To resolve crossref fields, add the option --output-resolve-crossrefs.

pavel
  • 861
15

In addition to egreg's answer, I'd like to point out an alternative solution. Nelson Beebe has developed utilities called bibextract, citetags and citefind to handle sub-bibliography databases. You can obtain them here.

In this case, you would compile the document normally and then type in a shell

citetags myarticle.aux > myarticle-tags
citefind myarticle-tags mybib.bib > mysubbib.bib

The first command prints all the citation keys used in your .tex, while the second selects all the entries from mybib.bib with keys from myarticle-tags. Of course, one can easily write a script to merge the two commands if needed.
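One possible way to merge the two commands is sketched below in Python. It assumes citetags and citefind are installed and on your PATH, and it calls them with exactly the arguments shown above. All function names are invented for this example.

```python
import subprocess


def citetags_command(aux_path):
    """Command line that prints the citation keys found in the .aux file."""
    return ["citetags", aux_path]


def citefind_command(tags_path, bib_path):
    """Command line that selects the tagged entries from the big .bib file."""
    return ["citefind", tags_path, bib_path]


def extract_sub_bib(aux_path, bib_path, out_path, tags_path):
    """Run citetags, then citefind, redirecting output as the shell '>' does.

    Hypothetical wrapper: tags_path is the intermediate file that holds
    the citation keys, out_path is the new, smaller .bib file.
    """
    with open(tags_path, "w", encoding="utf-8") as tags:
        subprocess.run(citetags_command(aux_path), stdout=tags, check=True)
    with open(out_path, "w", encoding="utf-8") as out:
        subprocess.run(citefind_command(tags_path, bib_path),
                       stdout=out, check=True)
```

For the file names used above, the call would be extract_sub_bib("myarticle.aux", "mybib.bib", "mysubbib.bib", "myarticle-tags").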

Although it works in a similar fashion to bibexport, this solution has the advantage that it does not delete biblatex fields such as date, which bibexport does by default, as mentioned in the comments. It also works with biblatex if you use bibtex as a backend but, as far as I know, not if you use biber.

Corentin
  • 9,981
8

I thought I'd chime in since this came up for me when searching, and none of the given answers worked for me.

As comments allude to, the bibexport program doesn't seem to be included in some LaTeX distributions anymore (it's certainly not in the texlive version I'm using).

I also tried the citetags/citefind commands described by @Corentin but they produced an empty file as output. I don't use biber or OSX either, so the other answers didn't help.

I then found that bibtool can do this, as follows:

bibtool -x article.aux -o NewBib.bib

It also turns out that this answer has already been given at https://tex.stackexchange.com/a/136839/89790

Warbo
  • 191
4

I managed to do it with JabRef 4.3.1 using the command line only. It works in Windows 10 as well. The answer provided by @user11232 is out of date. The right command is:

jabref.jar -n -a old_ref.aux,new_ref.bib old_ref.bib

-n means we do not use the GUI and the output will be in new_ref.bib.

You can find jabref.jar on the new website of JabRef. The documentation for the command-line usage is also updated in a new place.

Hui Li
  • 41
  • Of all the solutions presented in this thread, this is the only one that reliably extracts all used references from my bibliography into a small bibliography. – Christian Herenz Jul 21 '20 at 20:24
1

You can also take the cited references from the .bbl file and remove references from the .bib file not listed in .bbl with this Python script:

import re

bib_file = 'large_bib_file_to_clean.bib'
bbl_file = 'corresponding_bbl_file.bbl'
out_file = 'updated.bib'

def get_cited_entries(bbl_file):
    # Collect the key of every \bibitem in the .bbl file
    with open(bbl_file, 'r', encoding='utf-8') as f:
        bbl_content = f.read()
    bibitem_matches = re.findall(r'\\bibitem.*{([^}]*)}', bbl_content)
    return set(bibitem_matches)

def remove_unused_entries(bib_file, cited_entries, out_file):
    with open(bib_file, 'r', encoding='utf-8') as f:
        bib_content = f.read()

    # Split .bib content into entries; odd indices hold '@type{',
    # even indices hold the text that follows it
    bib_entries = re.split(r'(@\w+{)', bib_content)
    new_bib_content = ''

    # Iterate through entries and keep only cited ones
    for i in range(1, len(bib_entries), 2):
        entry_type = bib_entries[i].strip()
        entry = bib_entries[i + 1].strip()
        key_match = re.match(r'^([^,]*)', entry)
        if key_match:
            key = key_match.group(1).strip()
            if key in cited_entries:
                new_bib_content += entry_type + entry + '\n\n'

    # Write the reduced bibliography to a new .bib file
    with open(out_file, 'w', encoding='utf-8') as f:
        f.write(new_bib_content)

cited_entries = get_cited_entries(bbl_file)
remove_unused_entries(bib_file, cited_entries, out_file)

i_gal
  • 11
  • 1
0

Corentin's solution worked for my case where I needed to combine cited references from multiple .tex documents into one bib file. I had trouble installing bibextract on OSX for two reasons: 1) nawk is not installed by default and 2) the CHECKSUM command in the makefile prevented the sh and awk files from being installed. After running ./configure, modify the Makefile as follows:

change:

SEDCMD          = $(SED) -e 's=@LIBDIR@=$(LIBDIR)=g' \
                     -e 's=@BINDIR@=$(BINDIR)=g' \
                     -e 's=/bin/sed=$(SED)=g' 

to

SEDCMD          = $(SED) -e 's=@LIBDIR@=$(LIBDIR)=g' \
                     -e 's=@BINDIR@=$(BINDIR)=g' \
                     -e 's=/bin/sed=$(SED)=g' \
                     -e 's=nawk -f=awk -f=g'

then change

        $(SEDCMD) $$f.sh | $(CHECKSUM) > $(BINDIR)/$$f ; \
        $(SEDCMD) $$f.awk | $(CHECKSUM) > $(LIBDIR)/$$f.awk ; \
        $(SEDCMD) $$f.man | $(CHECKSUM) > $(MANDIR)/$$f.$(MANEXT) ; \

to

        $(SEDCMD) $$f.sh > $(BINDIR)/$$f ; \
        $(SEDCMD) $$f.awk > $(LIBDIR)/$$f.awk ; \
        $(SEDCMD) $$f.man > $(MANDIR)/$$f.$(MANEXT) ; \

then run sudo make install and bibextract will work as noted above.

0

If you use WinEdt as the front end to your TeX distribution, I suggest you download and install the bibMacros package (if you haven't already done so). Upon installation, you will see a new ribbon item, labelled "BibTeX". In the tex file that generated the bibliography, click on the "BibTeX" ribbon item to open a drop-down list of further items, and then on the "Extract from Aux" item. These actions let you create a new bib file that contains just the items that are cited in the document.

Mico
  • 506,678
-1

Complete steps for using "bibexport" to extract only the cited references from a bigger .bib file. A Unix environment is needed.

  1. Download bibexport from here. The important files are 'bibexport.dtx' and 'bibexport.ins'.

  2. In the terminal, process the '.ins' file with latex, using the command: latex bibexport.ins. I am assuming that LaTeX is already installed on your system.

  3. This will generate a bunch of files named 'bibexport.sh', 'expcites.bst', 'expkeys.bst', 'export.bst' and a log file.

  4. Copy all three '.bst' files and 'bibexport.sh' to the folder where you are writing the article in LaTeX. Also, make the 'bibexport.sh' file executable, e.g. with 'chmod 755 bibexport.sh'.

  5. If you have already compiled your article, 'MyArticle.tex', successfully, then you should have a file with the '.aux' extension. If not, compile 'MyArticle.tex' first using 'latex MyArticle.tex' and 'bibtex MyArticle.aux'. Do this a few times until the compilation finishes without any errors or warnings.

  6. Now it's time to extract only the cited references from the big bibliography file, say 'BibliographyAll.bib'. Use the command " ./bibexport.sh -o ./BibliographyForThisPaper.bib ./MyArticle.aux" in the terminal.

  7. A file named 'BibliographyForThisPaper.bib' will be generated, which contains only the references that have been used in the article.

    You can use this file in your main LaTeX file with \bibliography{./BibliographyForThisPaper}. For other options and usage, see the bibexport manual, which also comes with the package; you just have to build it with "pdflatex ./bibexport.dtx".

  • Before installing by hand, check if the package is available prebuilt for your TeX distribution (or as a package for your system). – vonbrand Feb 20 '20 at 22:45