I used to work with Citavi and recently tried out Jabref.

Due to the nature of the beast, i.e. I always had to retouch the .bib file, I recently exported and consolidated the sources I had stored in Citavi and JabRef and worked them into a single .bib file, which I am going to use with biblatex and biber.

I intend to use either Notepad++, Texstudio or Geany for editing the .bib-files. So now I am thinking about simplifying / improving the workflow when I need to edit it.

First off, I thought I'd split it up into 27 files: A-Z plus one for numerals. Yet in the past, when I added sources that came out of a collaboration, I didn't specifically account for the corresponding author if there was one, who is of course often named first. That might prove tedious if I'd like to look at the code for a specific entry. In terms of functionality this is irrelevant, since I assume one is not going to omit any letters.

What do you do to stay sane when working with large .bib files? Or is it not that much of a hassle for you?

henry
  • Having a separate file for the first letter of the different surnames seems a bit inconvenient because you'd usually have to use all of them and it may be difficult to spot an error if you omit some of them. I'd put all definitions in the same file or have different bibliography files for different categories of references. –  Mar 04 '14 at 12:52
  • One file + jabref. That works very well for me. Here I described my workflow: http://tex.stackexchange.com/questions/18848/workflow-for-managing-references/115299#115299 – quinmars Mar 04 '14 at 13:03
  • I use different files for different kinds of abbreviations. One for author, editor and organisation name. One for journals and series. One for places. One for publishers. One for everything else. Then I have bib files by category e.g. one for general reference (dictionaries etc., regular non-specialist non-fiction), one for literature etc. This makes it easy to cross-reference because all entries for papers in a given anthology, for example, are in the same file as the entry for the anthology. Then I have some scripts for transforming downloaded references into my preferred format. – cfr Mar 04 '14 at 13:09
  • Marc, I meant only to use one letter for one surname of course. :) cfr, your answer would make much more sense to me if you meant source types instead of abbreviations. But you are saying you have created your own system of authors (persons), editors, and other kinds of entities who could be authors? I don't fully understand yet how all sources for one anthology are in one place then. – henry Mar 04 '14 at 13:27
  • Henry, if you use the @user notation, the addressee will get a notification about your comment. (The originator of the question will always be informed.) You write 27 files, which suggests you have one file for surnames starting with a, one for surnames starting with b, and so on, and one for digits. This really doesn't make sense to me and in my comment I tried to explain why I thought it didn't make sense. I don't have my own system of authors; I just have one large bibliography file. –  Mar 04 '14 at 13:49
  • @MarcvanDongen Oh, sorry. Well, I meant that the file A would contain all sources with keys from Aantes-2009 to Azzam-1973. And to create these keys, one usually bases them on the corresponding author or the first one in alphabetical order. I do realize I omitted that entirely. :/ Or you realized that and you still find the approach confusing. But anyhow, I am still leaning towards keeping it all in one file. – henry Mar 04 '14 at 14:02
  • Yes, I think it defeats the purpose because in general you don't know all citations in the paper, so you either can include all bibliography files, or a selection. The first approach would work but the second approach is prone to errors as you may omit a file. –  Mar 04 '14 at 14:14
  • @cfr Do you have some overlapping content in the bib files then? (Please also do note my second previous comment where I forgot to address you.) – henry Mar 04 '14 at 14:47
  • The simplest solution, in my opinion, is to have one .bib file, which you use while composing all documents or books (etc.). Once, say, the article has been accepted by a journal and the reference list will be more or less frozen, use a tool -- either bibtool if using BibTeX or biber itself -- to extract the citations you are using for that article into a separate .bib file unique to that article. This solves virtually all danger of reduplicating entries, having identical keys, etc. For entry keys, use lastname+year, and lastname+year+1st letter of 1st 4 title words to disambiguate. – jon Mar 04 '14 at 15:43
  • @henry I tried explaining in comments but it got really messy so see my (non-)answer which is hopefully a little clearer. – cfr Mar 04 '14 at 17:08

2 Answers

I use one file and Notepad++ when I'm on my Windows box (jEdit on Linux most of the time). My keys are (normally*) in the form Surname_Key_Words. I strip any keywords fields (manually or using a regex find/replace), because I just find they slow down searching. A find is pretty simple: searching for Nakamura_ would only find me the following entry.

@article{Nakamura_GaN_LED,
    title = {High-Power {GaN} {P}-{N} Junction Blue-Light-Emitting Diodes},
    author = {Shuji Nakamura and Takashi Mukai and Masayuki Senoh},
    journal = {Japanese Journal of Applied Physics},
    volume = {30},
    number = {Part 2, No. 12A},
    pages = {L1998-L2001},
    numpages = {4},
    year = {1991},
    url = {http://jjap.jsap.jp/link?JJAP/30/L1998/},
    doi = {10.7567/JJAP.30.L1998},
    publisher = {The Japan Society of Applied Physics}
}

I also keep copies of the listed papers in a flat folder with the file name equal to the bibtex key. This works on bibtex and biblatex (which I use now but I have to maintain back-compatibility without too much hassle). It seems to work well for ~200 entries and counting.

I found that JabRef led to too much "click this, then click that, then type this" nonsense, and I switched to quality editors when I had many entries with similar title elements that needed {}, which is easily solved by a regex find/replace.

I use the same editors for all my .tex files with appropriate command macros for compiling etc.

*other keys:

  • colleagues -- Firstname_key_words
  • me -- Me_keyword
  • others -- company_Key_Words
  • etc.
Chris H
  • Chris, very interesting answer, thank you. I didn't know of jEdit, thanks. Btw, for maintaining the files on Windows and Unix, I switched from underscores to hyphens because one is then able to use ctrl + (shift +) left/right when renaming a file. This is not possible with an underscore or dot AFAIK. – henry Mar 04 '14 at 13:34
  • If I end up doing a lot of LaTeX on Windows I'll install jEdit there. Though I prefer Notepad++, the preference is so slight I'd rather use the same editor on both systems. Ctrl+left etc. works in Nautilus rename and Evince "save as" on Ubuntu 12.04 with Xfce. The underscores are less good in jEdit as they don't count as word breaks, but I don't find it an issue. – Chris H Mar 04 '14 at 13:49

I don't think there is a definitive answer to this question at all. For one thing, I think what suits one person will not suit another. Moreover, people differ in how many entries they have to handle, where they obtain new data from and what they need to be able to do with that data later on.

However, I realised that trying to explain this in comments was not working very well. Maybe this will be a little clearer.

Overview

I currently use different files for different kinds of abbreviations. One for author, editor and organisation names. One for journals and series. One for places. One for publishers. One for everything else. Then I have bib files by category e.g. one for general reference (dictionaries etc., regular non-specialist non-fiction), one for literature etc. This makes it easy to cross-reference because all entries for papers in a given anthology, for example, are in the same file as the entry for the anthology. (Note that not all of the required data is in the same file. But neither bibtex nor biber care about that.) Then I have some scripts for transforming downloaded references into my preferred format.

Example

Hopefully this will make the above a little clearer...

Suppose I have an anthology of papers on aardvarks edited by Camel B. Jones which contains a paper on their diet by Maria N. Davies.

Camel B. Jones (ed.). 1992. The Life and Times of Aardvarks. Oxford: Oxford University Press.

Maria N. Davies. 1992. 'What Aardvarks Like To Eat'. In Camel B. Jones (ed.), The Life and Times of Aardvarks. Oxford: Oxford University Press. Pages 141-156.

Then in authors.bib, I might have:

@string{davies-maria-n = {Davies, Maria N.}}
@string{jones-camel-b = {Jones, Camel B.}} 

In pub.bib, I might have:

@string{oup = {Oxford University Press}}

places.bib might contain:

@string{oxon = {Oxford}}

Then zoo.bib might contain all entries to do with zoology. For new keys, I try to use something like lastname1-lastname2-...-initiallettersoftitle so my entries would look something like this:

@incollection{davies-wal2e,
  author = davies-maria-n,
  crossref = {jones-lta},
  title = {What Aardvarks Like To Eat},
  pages = {141--156}}

@collection{jones-lta,
  editor = jones-camel-b,
  publisher = oup,
  address = oxon,
  title = {The Life and Times of Aardvarks},
  booktitle = {The Life and Times of Aardvarks},
  year = 1992}

The idea is to ensure consistency by using strings for things which are used multiple times (or might be used multiple times). Titles are unique - or, rather, they do not vary in consistent, patterned ways. So they go in in the normal way as there is no point in bothering to define strings for them. But anything where consistency is an issue gets a string defined in the appropriate file of abbreviations.

Now, if I need to cite 'What Aardvarks Like To Eat', I of course need to load all of the relevant bib files in the appropriate order. That is, I need to load the files containing the strings before loading the .bib files which use them. For example:

\bibliography{authors,pub,places,zoo}

But that's just code you copy and paste into a new document, put in a template or load using \input.
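
Since the question mentions biblatex and biber, here is a minimal sketch of the same loading order written with biblatex's \addbibresource, which takes one file per call and expects the .bib extension. The file names and the citation key are the ones from the example above; the document class and the rest of the scaffolding are only assumptions for illustration. It also assumes that biber makes @string definitions from earlier resources available to later ones, so it is worth checking on your own setup.

```latex
\documentclass{article}
\usepackage[backend=biber]{biblatex}

% Abbreviation files first, so the strings are defined
% before any entry that uses them is read.
\addbibresource{authors.bib}
\addbibresource{pub.bib}
\addbibresource{places.bib}

% The category file with the actual entries comes last.
\addbibresource{zoo.bib}

\begin{document}
% Key taken from the zoo.bib example above.
\cite{davies-wal2e}

\printbibliography
\end{document}
```

biblatex still accepts the legacy \bibliography{authors,pub,places,zoo} form in the preamble, so either spelling can go into a template.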

cfr
  • To be frank, this is a bit above my head, but nonetheless very interesting. Why do you do this? To somehow "force" some kind of principle of nesting/using macro-like items in this part of your document or to just shorten the time it takes to type out every tidbit of information? Or just to divide up the information? Can you give a ballpark figure on how many strings of authors and publishers you have defined (i.e. abbreviated)? – henry Mar 04 '14 at 17:26
  • I very much appreciate the detailed explanation! On second thought, it doesn't seem that complicated. – henry Mar 04 '14 at 21:28
  • About 3,600 authors/editors/organisations, journals/series, places and publishers according to grep. I initially did it to save typing for frequently used items. But it got irritating to change them when a less frequently used item became frequently used, as often happens. At this point, though, the primary motive is consistency. It means I write an author's name, for example, once or a journal title once. So I can be sure that precisely the same name will be used throughout and that biblatex, for instance, will recognise all relevant entries as authored by the same person. Etc. – cfr Mar 04 '14 at 22:41
  • @henry Also, I now name directories for authors where I have electronic copies of articles. Lots of older entries don't conform to this but for newer ones, the bibkey is lastname-initialsoftitle and the pdf is lastname-initialsoftitle.pdf and stored in a directory named using the string which abbreviates the author's name. This makes things easy to find. It also allows me to use some custom macros which use this patterning. These are not why I started doing it but it has turned out to be very useful for various reasons of this kind. And it is more complicated to explain than it is to do! – cfr Mar 04 '14 at 22:45