I've installed BibTool and it seems to work for e.g. printing a formatted bibliography to the terminal. I'm trying to extract a bibliography from an .aux file from a document compiled with BibLaTeX. As a MWE here's such a document:

\documentclass[12pt, oneside, article, a4paper]{memoir}

\usepackage[backend=biber, natbib, style=authoryear-comp]{biblatex}
\ExecuteBibliographyOptions{alldates=short, language=british, sortcites}

\usepackage{kantlipsum}

\title{BibTool MWE}
\author{Thomas Hodgson}
\date{11 May 2013}
\bibliography{bibtool_mwe}

\begin{document}
\maketitle

\kant

\citet{Melville2007}

\printbibliography
\end{document}

This compiles and generates a bibtool_mwe.aux file that looks like this:

\relax 
\providecommand*{\memsetcounter}[2]{}
\abx@aux@sortscheme{nyt}
\@writefile{toc}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\@writefile{lof}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\@writefile{lot}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\abx@aux@cite{Melville2007}
\abx@aux@page{1}{3}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {section}{References}{3}}
\abx@aux@page{2}{3}
\memsetcounter{lastsheet}{3}
\memsetcounter{lastpage}{3}

And, in case it's necessary here's the .bib file:

@book{Melville2007,
    Author = {Melville, Herman},
    Booktitle = {Moby-Dick},
    Date-Added = {2013-05-12 09:35:11 +0000},
    Date-Modified = {2013-05-12 09:37:33 +0000},
    Location = {London},
    Publisher = {Vintage},
    Title = {Moby-Dick},
    Year = {2007}}

When I run the following on the command line, BibTool produces a blank line. When I leave it with Ctrl-D it generates a .bib file, but the file is empty.

bibtool -x bibtool_mwe.aux -o test.bib

For a while I thought this was the same problem as this question. I'm using BibLaTeX as well. But I was getting similar error messages only when I got BibTool's syntax wrong. In any case, my problem couldn't be solved in the way suggested by the answer to that question, because my .aux file is different from the one described there.

twsh
  • I guess that BibTool looks for \citation commands in the .aux file and finds none, because biblatex writes \abx@aux@cite instead of \citation. – egreg May 12 '13 at 10:23
  • @MarcoDaniel It's a similar question, but it seems like Seamus (who asked the original question) solved the problem by editing the .aux file. Which I can't do in the same way. – twsh May 12 '13 at 13:39
  • @egreg That makes sense. Do you think there's no way to solve it? It occurred to me that it probably wouldn't be too hard to extract citekeys from a .bbl file. If I decide it's worth my time I'll play around with that. – twsh May 12 '13 at 13:41
  • The citation keyword is used in tex_aux.c in the C source. Maybe it's not so difficult to adapt the program to accept also \abx@aux@cite; however there's no \bibdata in the .aux file. Probably teaching BibTool to look in the .bcf file is the best thing. It's an XML file, so parsing it shouldn't be so difficult. – egreg May 12 '13 at 13:47
  • What exactly are you trying to do? biber has a "tool" mode now which can output .bib instead of .bbl - this allows you to (re)encode, change entries with complex remapping etc. and then output a .bib based on the changes. It's a bit more semantic than bibtool as the changes are based on an internal data model instead of the syntax of a raw file. – PLK May 12 '13 at 21:01
  • @PLK I would like to be able to easily extract from my main .bib file a .bib file containing the subset of entries actually cited in a particular document. In other words, to do exactly what I could do with BibTool's 'extract from .aux file' feature if I used BibTeX rather than BibLaTeX. – twsh May 12 '13 at 21:16
  • @egreg I wrote myself a script that looks at a .bcf file and gives me a regular expression that BibTool can use to extract entries that are found in the .bcf. I think that's good enough for me. – twsh May 12 '13 at 21:18
  • @Tom Maybe you can share! – egreg May 12 '13 at 21:24
  • I've added my solution as an answer, because code can't be formatted properly in a comment. – twsh May 12 '13 at 21:42
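
As the comments above point out, biblatex records each citation in the .aux file as \abx@aux@cite rather than \citation. The following is a minimal sketch, not part of the original thread, of how the keys could be read directly from such a file in Python. It assumes the single-argument form `\abx@aux@cite{key}` shown in the question's .aux listing; other biblatex versions may write the macro differently. The function name `citekeys_from_aux` is made up for illustration.

```python
import re

# Matches the single-argument form seen in the question's .aux file,
# e.g. \abx@aux@cite{Melville2007}. Assumption: other biblatex versions
# may use a different argument layout and would need a different pattern.
CITE_RE = re.compile(r"\\abx@aux@cite\{([^}]*)\}")


def citekeys_from_aux(aux_text):
    """Return the cited keys in order of first appearance, without duplicates."""
    # dict.fromkeys keeps insertion order and drops repeated keys.
    return list(dict.fromkeys(CITE_RE.findall(aux_text)))


# A few lines copied from the .aux file in the question.
sample = r"""\relax
\abx@aux@sortscheme{nyt}
\abx@aux@cite{Melville2007}
\abx@aux@page{1}{3}
"""

print(citekeys_from_aux(sample))  # prints ['Melville2007']
```

For a real document the text would come from reading the .aux file, for example with `open("bibtool_mwe.aux").read()`.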

3 Answers

Just run biber as follows:

biber mainfile --output_format bibtex

where mainfile.tex is your tex file. You will then get a file

mainfile_biber.bib

containing the relevant bib entries.

Andrew Swann

I wrote, or at least adapted from the answers to this question on Stack Overflow, the following Python function to extract a regular expression from a .bcf file:

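from xml.dom import minidom


def xmltocitekeys(file):
    """Print the cite keys found in a .bcf file as one BibTool alternation."""
    xmldoc = minidom.parse(file)
    taglist = xmldoc.getElementsByTagName('bcf:citekey')

    # Each <bcf:citekey> element holds one key as its text content.
    keylist = [str(x.childNodes[0].nodeValue) for x in taglist]

    # Raw string, because a bare '\|' is not a valid Python escape sequence.
    print(r'\|'.join(keylist))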

The output can then be used with BibTool, for example:

bibtool -X "output" -o extract.bib -i bibliography.bib

By the way, I needed to set an option in BibTool in order to preserve the capitalisation of my citekeys.
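
To make the workflow concrete, here is a small self-contained sketch of the same idea that returns the pattern instead of printing it and then assembles the BibTool command line from the answer. The sample XML is a hypothetical, heavily trimmed stand-in for a real .bcf file (the namespace URI is a placeholder, and the second key is invented), and `bcf_to_pattern` is an illustrative name rather than the author's code.

```python
import shlex
from xml.dom import minidom

# Hypothetical, trimmed stand-in for a .bcf file. The namespace URI is a
# placeholder; a real .bcf file contains much more than this.
SAMPLE_BCF = """<bcf:controlfile xmlns:bcf="urn:example:bcf">
  <bcf:section number="0">
    <bcf:citekey>Melville2007</bcf:citekey>
    <bcf:citekey>Jones2013</bcf:citekey>
  </bcf:section>
</bcf:controlfile>
"""


def bcf_to_pattern(xml_text):
    """Join every <bcf:citekey> value with the '\\|' alternation used above."""
    doc = minidom.parseString(xml_text)
    keys = [node.firstChild.nodeValue
            for node in doc.getElementsByTagName("bcf:citekey")]
    return r"\|".join(keys)


pattern = bcf_to_pattern(SAMPLE_BCF)

# Same invocation as in the answer, with the pattern quoted for the shell.
command = shlex.join(
    ["bibtool", "-X", pattern, "-o", "extract.bib", "-i", "bibliography.bib"]
)
print(command)
```

The printed line can be pasted into a shell. `shlex.join` needs Python 3.8 or later; it is used here only so that the backslash and pipe in the pattern survive shell quoting.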

twsh
  • Can you explain more clearly how to use this Python script? This looks very useful, but my knowledge of Python is limited to knowing that it exists.... – jon Sep 03 '13 at 03:49
  • There's a later version on GitHub here. It has a README with some more details. Let me know if that needs clarifying. – twsh Sep 03 '13 at 12:15
  • No, that's easy enough to understand. A provisional test suggests it works fine, although it is grabbing some extra entries from the master bibliography for a reason I don't have time to track down at the moment (about 100 entry keys instead of the 70 that are actually cited). – jon Sep 03 '13 at 15:10
  • Great. If you find out what's going on with the extra keys I'd be interested to hear about it. – twsh Sep 03 '13 at 16:29
  • I've learnt that when BibTool extracts from a bibliography it takes any key that is a superstring of the string you give it. So if you ask for Jones2013 it will take Jones2013a, Jones2013b etc if it finds them. Is that where the extra entries are coming from for you? – twsh Sep 08 '13 at 23:56
  • I'm sorry: I haven't had time to investigate further, but there seems to be more to it than just that. Some of the extracted entries from the main .bib file only overlap insofar as they share an @string 'shorthand' (e.g., a superfluous entry would have location = cambridge, where cambridge expands from @STRING{cambridge="Cambridge" }, as it is written to the new .bib file by bibtool). But I really haven't had time to explore yet.... – jon Sep 09 '13 at 02:02
  • That's puzzling. If you feel like sending me the files you're using I'll look in to it. (My profile here has a link to my site and email.) – twsh Sep 09 '13 at 19:19
  • I've now rewritten the tool so it doesn't use BibTool (see the latest version on GitHub). It doesn't get extra keys for me. – twsh Sep 16 '13 at 10:22

Biber can do this for you in "tool" mode. Put this in your biber.conf file (or in any file which you then tell biber about with the -g option):

<config>
  <sourcemap>
    <maps datatype="bibtex" map_overwrite="1">
      <map>
        <map_step map_field_source="entrykey" map_match="^(?!(?:key1|key2))" map_final="1"/>
        <map_step map_entry_null="1"/>
      </map>
    </maps>
  </sourcemap>
</config>

and call biber like this (assuming your .bib file is called "foo.bib"):

biber --tool foo.bib

or if you are not using the default biber.conf file:

biber -g <conf_file> --tool foo.bib

This will output another .bib file called foo_bibertool.bib with only the entries with "key1" and "key2". You can change the regexp accordingly to select what you want. I will add a "map_not_match" option to biber as a convenience for the next version so you don't have to use negative regexps. This config file essentially goes through the .bib file you pass and ignores any entries which match the regexp.

This is completely independent of any .bcf or .aux etc. See the biber manual, section 3.12.
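
If the list of keys is long, the config file above can be generated rather than typed. The following Python sketch is one possible way to do that and is not taken from the biber manual. The function names are made up, the keys are assumed to be supplied by the caller, and the pattern deliberately differs from the one above by adding a `$` inside the lookahead so that keeping `key1` does not also keep `key10`. Python's `re.escape` and `re.search` are used here only as a stand-in for biber's Perl regular expressions, which is an assumption that should hold for ordinary alphanumeric keys.

```python
import re
from xml.sax.saxutils import quoteattr


def drop_pattern(keys):
    """Regexp matching every entry key that is NOT exactly one of `keys`."""
    alternation = "|".join(re.escape(k) for k in keys)
    # Same negative lookahead as in the config above, plus `$` (an addition
    # made for this sketch) so that only whole keys are kept.
    return "^(?!(?:" + alternation + ")$)"


def biber_tool_config(keys):
    """Return the text of a biber config that nulls all entries not in `keys`."""
    # quoteattr adds the surrounding quotes and escapes XML special characters.
    match = quoteattr(drop_pattern(keys))
    return f"""<config>
  <sourcemap>
    <maps datatype="bibtex" map_overwrite="1">
      <map>
        <map_step map_field_source="entrykey" map_match={match} map_final="1"/>
        <map_step map_entry_null="1"/>
      </map>
    </maps>
  </sourcemap>
</config>
"""


print(biber_tool_config(["key1", "key2"]))
```

The output can be saved to a file and passed to biber with the -g option as described above.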

PLK
  • The purpose of BibTool (among others) is to extract from a big .bib file only the entries actually appearing in a document, without the need of specifying the keys one by one, but just examining the .aux file. – egreg May 13 '13 at 08:41
  • The biber command-line option --output-format=bibtex might be closer to what the questioner is looking for. Check out the manual. – michel Nov 14 '13 at 19:54