12

Is there any way to get Biblatex to automatically validate and format ISBNs? That is, I would like to be able to enter a field such as

isbn = {9789549090666},

in my .bib file—not worrying about the presence or proper placement of hyphens—and have Biblatex render this as "ISBN 978-954-90906-6-6" when I print the bibliography with \printbibliography. It would also be useful if Biblatex could first validate the ISBN and, if it's an older ISBN-10, convert it to ISBN-13.

If there's no easy way of doing this in Biblatex, is there perhaps an external program I can run on my .bib file to do the validation and formatting? All I've found so far are online tools, and some of them improperly group the digits of the ISBN. (For those that aren't aware, while ISBNs have a fixed number of digits, the lengths of the hyphen-separated groups are variable.)

Psychonaut
  • 2
    The best way would be to use an external script in Python, Perl, Ruby, or the like, I think. – Romain Picot Oct 18 '15 at 11:52
  • 2
    It can certainly be done with some expl3 code. But I don't see why you would want it. Normally one copies and pastes the ISBN. Why remove the hyphens after pasting and then reinsert them later? – Ulrike Fischer Oct 18 '15 at 12:08
  • That is definitely something you want to address in your .bib file, i.e. before the file is processed by Biber (and then later by LaTeX). After all you wouldn't expect LaTeX to fix typos in the author field or to check that certain URLs exist. That is something you (or the citation manager of your choice) need to make sure when writing the .bib file. – moewe Oct 18 '15 at 12:10
  • 2
    @UlrikeFischer: I don't remove the hyphens. The problem is that many online bibliographic databases, and even some publishers, don't properly format their ISBNs. (It's not unheard of for me to find things like "ISBN 978-9549090666" written on the copyright page of books or conference proceedings.) Rather than me manually figuring out the correct hyphenation based on the ISBN country and publisher codes, it would be more convenient for a tool to do it for me. – Psychonaut Oct 18 '15 at 12:38
  • 2
    I had very good experience with the online tool at https://tools.wmflabs.org/isbn/IsbnCheckAndFormat/. I firmly believe script languages like python (see How to automatically apply ISBN hyphenation?) are much better suited to do this than LaTeX, though I believe a LaTeX3 (expl3) implementation could be possible. – moewe Oct 18 '15 at 13:44
  • For proper formatting of the hyphens you will have to include at least the following three lists from Wikipedia (though I admit there is a pattern): https://en.wikipedia.org/wiki/List_of_ISBN_identifier_groups, https://en.wikipedia.org/wiki/List_of_group-0_ISBN_publisher_codes, https://en.wikipedia.org/wiki/List_of_group-1_ISBN_publisher_codes. But this covers only English-language publishing; you will need the same for all other languages. – moewe Oct 18 '15 at 13:51
  • For validating ISBN, see http://tex.stackexchange.com/questions/39719/calculating-checksum – egreg Oct 18 '15 at 14:25
  • 1
    If you use Biber as the backend, it will check ISBNs and emit a warning if they are invalid. – PLK Oct 19 '15 at 10:46
  • Biber also checks ISSNs and ISMNs – PLK Oct 19 '15 at 11:06
  • @PLK That is very interesting. Apparently the module Business::ISBN that is used to check for validity can also format the ISBN with hyphens in the right places. Maybe it is possible to enable the use of the as_string routine in Biber to get nicely placed hyphens. – moewe Oct 19 '15 at 13:03
  • Please try the DEV version of biber from Sourceforge. There are options '--isbn10', '--isbn13' and '--isbn-normalise', which force ISBNs to ISBN-10 or ISBN-13 and/or normalise them with the correct hyphenation patterns, respectively. The module used has a large database of hyphenation patterns. – PLK Oct 19 '15 at 21:11

3 Answers

16

Please try biber 2.2 (along with biblatex 3.1). The --validate-datamodel option will report invalid ISBNs. The new option --isbn10 forces ISBNs to 10-digit format and --isbn13 forces them to 13-digit format; --isbn-normalise formats them with hyphens in the correct places.

The module which does this in Biber has a database of ISBN hyphenation patterns, which is updated with new releases of the module.

PLK
  • This doesn't seem to work at all for me. Biber 2.2 + biblatex 3.1 now strips all hyphenation from ISBNs, regardless whether or not --isbn-normalise is used. – Psychonaut Nov 06 '15 at 11:13
  • 2
    This was a bug; it is fixed in the biber 2.3 dev version. – PLK Nov 06 '15 at 16:54
9

The main problem with recreating the hyphens is the ISBN itself.

It is built as a number with 13 digits:

ISBN: prefix - country - publisher - book - check number

For example: 978 - 3 - 86680 - 192 - 7. The prefix has 3 digits, the check digit 1, and the country number in this case 1 (other country numbers are longer, such as 954 in the question), 5 in total. That leaves 8 digits for publisher and book together.

So we have now recreated: 978-3-86680192-7.

And here is the problem: you have to know all publisher numbers to recreate the hyphen between publisher and book. There are publishers with a 7-digit number and only one digit for books (such a publisher can produce at most 10 books); others have 3 digits for the publisher and 5 for the book number.
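
To make the dependence on a publisher table concrete, here is a minimal Python sketch. The function name `hyphenate_isbn13` and the table `KNOWN_PUBLISHERS` are made up for illustration, and the table contains only the three publisher codes that appear in ISBNs quoted in this thread; a real tool would need the complete, regularly updated range data.

```python
# Illustrative excerpt only: country/group number -> known publisher codes,
# taken from the three ISBNs quoted in this thread. Not the official data.
KNOWN_PUBLISHERS = {
    "3": ["8273", "86680"],   # German-language group
    "954": ["90906"],         # group of the ISBN in the question
}


def hyphenate_isbn13(isbn: str) -> str:
    """Insert hyphens into a 13-digit ISBN using the lookup table above."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError(f"not a 13-digit ISBN: {isbn!r}")
    prefix, body, check = digits[:3], digits[3:12], digits[12]
    for group, publishers in KNOWN_PUBLISHERS.items():
        if not body.startswith(group):
            continue
        rest = body[len(group):]
        for publisher in publishers:
            if rest.startswith(publisher):
                book = rest[len(publisher):]
                return "-".join([prefix, group, publisher, book, check])
    # Without a table entry the publisher/book split cannot be determined.
    raise LookupError(f"group or publisher not in table: {isbn!r}")


print(hyphenate_isbn13("9783866801927"))
```

Any ISBN whose publisher is missing from the table raises `LookupError`, which is exactly the limitation described above: the split cannot be computed from the digits alone.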

That's the reason I would not try to recreate the lost hyphens with LaTeX. Better use one of the named web sites in the comments to get the lost hyphens back and add them to your bib entry.

Then you can use the method from question Calculating checksum (see comment of @egreg).

To validate an ISBN you need to know whether the publisher and book numbers are valid (current publisher or one no longer operating? was the book available?) and whether the check digit is valid.

Older 10-digit ISBNs can be converted to current 13-digit ISBNs by prefixing 978 and dropping the old check digit. Then you have to recalculate the check digit and append it. See for example both ISBNs for the LaTeX companion: ISBN-10: 3827316898 and ISBN-13: 978-3827316899. The nine digits 382731689 are the same in both; only the check digit differs. With hyphens the ISBN is 978-3-8273-1689-9, with 3 for German, 8273 for Pearson Studium, and 1689 for the book "Der LaTeX-Begleiter".
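
The conversion can be sketched in a few lines of Python. This is only an illustration under the usual check-digit rules (ISBN-10: weights 10 down to 1, total divisible by 11; ISBN-13: alternating weights 1 and 3, total divisible by 10); the function names are my own and do not come from any library.

```python
def isbn10_is_valid(isbn10: str) -> bool:
    """Check an ISBN-10: weights 10..1, total must be divisible by 11."""
    if len(isbn10) != 10 or not isbn10[:9].isdigit():
        return False
    last = isbn10[9].upper()
    if not (last.isdigit() or last == "X"):
        return False
    # An 'X' in the last position stands for the value 10.
    values = [int(c) for c in isbn10[:9]] + [10 if last == "X" else int(last)]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first 12 digits: weights alternate 1, 3, 1, 3, ..."""
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978, drop the old check digit, append the recalculated one."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if not isbn10_is_valid(digits):
        raise ValueError(f"invalid ISBN-10: {isbn10!r}")
    first12 = "978" + digits[:9]
    return first12 + isbn13_check_digit(first12)


print(isbn10_to_isbn13("3827316898"))  # the example from this answer
```

The result is unhyphenated; placing the hyphens still needs the publisher information discussed above.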

Mensch
  • 3
    @cfr Older ISBNs can be converted by prefixing 978. Then you have to recalculate the final check digit and use it. See for example both ISBNs for the LaTeX companion: ISBN-10: 3827316898 and ISBN-13: 978-3827316899. The nine digits 382731689 are the same in both. – Mensch Oct 19 '15 at 00:31
  • I understand the arguments against using LaTeX in particular to add the hyphens, but I don't think it's helpful to claim that a website-based formatter is necessarily a better solution. All the online tools discussed here so far either misformat the ISBNs or process them one at a time. Using these to validate and format the ISBNs of a .bib file containing hundreds of entries would be a nightmare. – Psychonaut Oct 19 '15 at 07:53
  • 1
    @Psychonaut To validate the ISBN, changed or not, you need information. Where do you have this information? In an SQL database on your computer? How do you keep it up to date? The internet has this information here and there. LaTeX is not built to connect to the internet and validate ISBNs. That is a job for a script, not LaTeX: parse your bib file, extract the ISBN, title, author and publisher, and then check on the internet whether there is such an ISBN with the same title, author and publisher. And with the publisher you know the publisher number and the place for the 4th hyphen. – Mensch Oct 19 '15 at 08:48
  • I agree, the problem is well suited to a script with a database of country and publisher numbers (either stored locally or queried from the Internet). I was taking issue with your recommendation to "use one of the named websites", because so far no one has named any website that properly processes ISBNs in batch. – Psychonaut Oct 19 '15 at 08:59
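
As a rough sketch of the batch side of such a script, the following Python scans the text of a .bib file for isbn fields and reports those whose check digit is wrong. The regular expression and the function names are assumptions made for illustration; the pattern only handles simple `isbn = {...}` or `isbn = "..."` fields, and the check says nothing about whether the publisher or book actually exists.

```python
import re

# Simplified pattern (an assumption): matches isbn = {...} or isbn = "..."
ISBN_FIELD = re.compile(r'\bisbn\s*=\s*[{"]([^}"]*)[}"]', re.IGNORECASE)


def check_digit_ok(isbn: str) -> bool:
    """Return True if the ISBN-10 or ISBN-13 check digit is consistent."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) == 13 and digits.isdigit():
        # ISBN-13: weights alternate 1, 3; the total must be divisible by 10.
        total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(digits))
        return total % 10 == 0
    if (len(digits) == 10 and digits[:9].isdigit()
            and (digits[9].isdigit() or digits[9] == "X")):
        # ISBN-10: weights 10..1, 'X' counts as 10, divisible by 11.
        last = 10 if digits[9] == "X" else int(digits[9])
        values = [int(c) for c in digits[:9]] + [last]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    return False


def find_bad_isbns(bib_text: str) -> list[str]:
    """List every isbn field value in the text that fails the check."""
    return [m.group(1) for m in ISBN_FIELD.finditer(bib_text)
            if not check_digit_ok(m.group(1))]


sample = """
@book{a, isbn = {9789549090666}}
@book{b, isbn = {978-3-8273-1689-9}}
@book{c, ISBN = "9783827316898"}
"""
print(find_bad_isbns(sample))
```

To run it on a real file, read the file into a string first and pass that to `find_bad_isbns`.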
0

Assuming the ISBN-13 starts with 978, the following plain TeX code will generate the correct check digit (substitute the nine digits following 978 in this example, where zeros are given):

\newcount\isbncount
\newcount\scratchcount
\def\isbncalc#1#2#3#4#5#6#7#8#9{%
    \isbncount=38%total yielded by initial 978
%       i.e. 1x9 plus 3x7 plus 1x8
    \scratchcount=#1 \multiply \scratchcount by 3
    \advance \isbncount by \scratchcount
    \advance \isbncount by #2
    \scratchcount=#3 \multiply \scratchcount by 3
    \advance \isbncount by \scratchcount
    \advance \isbncount by #4
    \scratchcount=#5 \multiply \scratchcount by 3
    \advance \isbncount by \scratchcount
    \advance \isbncount by #6
    \scratchcount=#7 \multiply \scratchcount by 3
    \advance \isbncount by \scratchcount
    \advance \isbncount by #8
    \scratchcount=#9 \multiply \scratchcount by 3
    \advance \isbncount by \scratchcount
    \the \isbncount}
\isbncalc{0}{0}{0}{0}{0}{0}{0}{0}{0}
\bye

At least, that will yield the weighted sum from which the check digit follows: the check digit is the number required to round the sum up to the next multiple of 10 (so e.g. if 187 is produced, the check digit is 3; if the sum is already a multiple of 10, the check digit is 0). Clever people on here might be able to take this further so that the actual check digit is produced.
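
For comparison, here is the same calculation including the final step, as a minimal Python sketch rather than TeX. The function names are my own; `weighted_sum_978` is meant to return the same number the macro above prints, and `check_digit_from_sum` applies the rounding rule.

```python
def weighted_sum_978(nine_digits: str) -> int:
    # Mirrors the plain TeX macro above: start at 38 (the 978 prefix),
    # then weight the nine following digits 3, 1, 3, 1, ...
    if len(nine_digits) != 9 or not nine_digits.isdigit():
        raise ValueError("expected exactly nine digits")
    total = 38
    for position, char in enumerate(nine_digits):
        weight = 3 if position % 2 == 0 else 1
        total += weight * int(char)
    return total


def check_digit_from_sum(total: int) -> int:
    # Distance to the next multiple of 10; the outer % 10 maps 10 to 0.
    return (10 - total % 10) % 10


total = weighted_sum_978("000000000")
print(total, check_digit_from_sum(total))
```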

For correct generation of hyphens I would suggest https://www.loc.gov/publish/pcn/isbncnvt_pcn.html (which will also convert between ISBN-10 and ISBN-13). For English-language publishers the digit after 978 is usually zero, surrounded by hyphens (other groups use different and sometimes longer numbers, such as the 3 and 954 seen elsewhere on this page), and the check digit is preceded by a hyphen, but the position of the other hyphen isn't (AFAIK) produced automatically by a mathematical operation.

KersouMan
  • For some reason all my carriage returns disappeared from the code. It will work if a carriage return is introduced after the first % sign, after '1x8', and (for legibility) after '\the\isbncount}'. Sorry about that - I don't know how the glitch happened. – John in Oxford Aug 26 '21 at 13:34
  • To enter code and retain the carriage returns, highlight the code block with your mouse, then click on the {} icon at the top of the question box. (An edit doing this has been proposed, but needs another approval before taking effect.) – barbara beeton Aug 26 '21 at 13:56