69

Many academic journals practice double-blind peer review, which means that the author is required to blind their document by removing any references that can reveal the author's identity. How can I take advantage of the power of TeX and friends to blind or anonymize a document? Or, put more bluntly: how do people successfully blind their LaTeX documents?

Here are some techniques I can think of:

  • Search and replace terms such as one's name and one's affiliation.
  • Use macros to write sensitive terms, e.g. \newcommand\myname{N.N.}, and replace the definition when blinding.
  • Keep sensitive information in separate documents which are not included into the main document when it is blinded.
  • Use version control so that the blinding changes can easily be tracked and reversed.

Answers that go into detail on how to implement any of the techniques outlined above, or on their respective merits, will also be appreciated.
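
To make the macro and separate-file techniques concrete, here is a minimal sketch. The switch \ifblind, the placeholder names, and the file name acknowledgements.tex are all hypothetical, chosen only for illustration:

```latex
\documentclass{article}

% Hypothetical switch: \blindtrue for review, \blindfalse for the final version.
\newif\ifblind
\blindtrue

\ifblind
  % Placeholders used in the blinded version.
  \newcommand{\myname}{N.N.}
  \newcommand{\myaffiliation}{[affiliation withheld]}
\else
  % Real values (invented here) used in the final version.
  \newcommand{\myname}{Jane Doe}
  \newcommand{\myaffiliation}{Example University}
\fi

\title{This is my title}
\author{\myname\\ \myaffiliation}

\begin{document}
\maketitle

Main text goes here.

% Sensitive material lives in its own file and is only read when not blinding.
% \InputIfFileExists does nothing if acknowledgements.tex is missing.
\ifblind\else
  \InputIfFileExists{acknowledgements}{}{}
\fi
\end{document}
```

Only the single line \blindtrue has to change between the two versions, which also keeps the difference small under version control.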

N.N.
  • 36,163
  • 1
    Is it acceptable to have a different layout in the anonymized version and the final version? To maintain an identical layout I could imagine using \phantom to remove critical information. – Christian Lindig Jan 23 '12 at 13:03
  • 1
    @ChristianLindig It is acceptable, but solutions that maintain identical layouts are also interesting. Please post the \phantom solution. – N.N. Jan 23 '12 at 13:05
  • 5
    A small nit-pick: I think this is called a double-blind review. Typically author names are visible to the reviewers to see if they are duplicating their early efforts and that's called blind review. At least in IEEE circles it goes that way. Moreover, most paper submission systems anonymize the .pdf files on-the-fly hence you simply skip your name. – percusse Jan 23 '12 at 13:56
  • 3
    @percusse There may be other terms that need to be blinded other than the author names. – N.N. Jan 23 '12 at 14:13
  • Are you submitting source or PDF? @percusse I'm not sure about the 'typically'. Double-blind is increasingly standard in my field. Triple-blind is not unheard of, either. – cfr Dec 29 '14 at 03:17
  • @cfr PDF. And I used typically for single-blind part. See the edit. There are still lots of technical fields that use single. But in my opinion, blind review system doesn't work if the field is not big enough. Everybody recognizes the others most of the cases. – percusse Dec 29 '14 at 03:21
  • @percusse That's too bad. I agree there are limits to what can be achieved, but I'm sorry that people do not make the effort in your field. – cfr Dec 29 '14 at 03:28
  • @cfr Not really. I think blinding is not the way to go anyway. I think the reviewers should be also visible so that people behave responsibly. I'm not a big fan of the current format. – percusse Dec 29 '14 at 03:33
  • @percusse I disagree but this is not the place. I'm not a huge fan of the current format either, but I don't think doing away with anonymity would improve things. Not in my field. But, as I say, this is not the place. – cfr Dec 29 '14 at 03:38
  • 1
    I once just used \newcommand\hide[1]{$[$hidden$]$} and \newcommand\hide[1]{#1}, - not layout preserving and not handling bibliography items. – Finn Årup Nielsen Feb 18 '16 at 10:14

7 Answers

21

Instead of writing special macros, I would use the \hl highlighting feature of the soul package. For the final version you can then cancel its effect.

\documentclass{article}
\usepackage{lipsum}
\usepackage{xcolor}
\usepackage{soul}
\makeatletter
\newif\if@blind
\@blindtrue % use \@blindfalse for the final version
\if@blind
  \sethlcolor{black}% black highlight over black text hides the words visually
\else
  \let\hl\relax % \hl{text} now simply typesets text
\fi
\makeatother
\title{This is my title}
\author{\hl{Y Lazarides}}
\begin{document}
\maketitle
\hl{In our paper (Jones 22)} \lipsum*[1]
\end{document}

For the final version, set the switch to false by changing \@blindtrue to \@blindfalse.


If you just want to gobble the text rather than blank it out, change the code to:

\if@blind \def\hl#1{}\else
   \let\hl\relax
\fi
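
If the layout should stay the same while the text is kept out of the PDF, the \phantom idea raised in the comments on the question can be sketched as below. This is only a sketch under assumptions: the switch \ifblind and the macro \hide are hypothetical names, and because \phantom builds an unbreakable box, it is only suitable for short fragments that do not need to break across lines.

```latex
\documentclass{article}

% Hypothetical switch: \blindtrue for review, \blindfalse for the final version.
\newif\ifblind
\blindtrue

\ifblind
  % Reserve the space of the argument without typesetting its glyphs.
  % \leavevmode makes it safe at the start of a paragraph.
  \newcommand{\hide}[1]{\leavevmode\phantom{#1}}
\else
  \newcommand{\hide}[1]{#1}
\fi

\begin{document}
This work extends \hide{Jones (2022)} to a new setting.
\end{document}
```

The width of the gap still reveals the length of the hidden text, so this trades some secrecy for an unchanged layout.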


To summarize some of the comments: there does not appear to be any safe and efficient way of automatically "blinding" a document. If it is important to you that you are not discriminated against because of someone else's prejudices, perhaps the safest bet is to hit the "Save As" button and rewrite some portions of the paper.

yannisl
  • 117,160
  • Could you please elaborate how \hl works and how you use it to blank out? – N.N. Jan 23 '12 at 12:20
  • Under normal circumstances \hl is used for highlighting. We first set its color to black and on final version we set the macro \hl to relax. I just added some more details. – yannisl Jan 23 '12 at 12:43
  • 7
    But with \@blindtrue you can still see the blinded text by marking it in the resulting PDF. – N.N. Jan 23 '12 at 12:57
  • 5
    @YiannisLazarides The information is still available even if you use @blindtrue. For example, try pdftotext:-) –  Jan 23 '12 at 12:57
  • @N.N. You can gobble the text rather than blank it out if you wish, just added it to the edit. This might reflow the document though. A more complicated way is to replace all the characters one by one and encrypt them, this will also give you a problem as hyphenation might change. – yannisl Jan 23 '12 at 13:13
  • Is a solution that blanks out text but where the text is not accessible by marking it or by pdftotext possible? – N.N. Jan 23 '12 at 13:17
  • @MarcvanDongen I extended it to gobble the text; however, the look of the paper will change, as the document will reflow. If you are hyphenating, you are pretty much stuck, I guess. A determined peer will always guess the author; one can apply Markov analysis ;) How about giving them only the paper version? :) – yannisl Jan 23 '12 at 13:18
  • 3
    @YiannisLazarides The fact that hyphenation exists and that you can see the length of the reference contributes to information about the reference. I'd simply insert the same text for each reference: [information about reference removed for the purpose of blind reviewing]. –  Jan 23 '12 at 13:18
  • 1
    Reflowing the text should not be so much of an issue for the submitted version of the paper. @MarcvanDongen: With respect to references it is considered good practice not to remove them, but to refer to them in third-person language. If this works, depends on the context, of course. (Actually, with Google and Co, double-blind reviewing has become more or less ridiculous, but it is important to play the game seriously if it is a formal requirement by the journal or conference.) – Daniel Jan 23 '12 at 13:59
  • @Daniel I was commenting on the proposal to blacken out/gobble references, not omitting them. As a matter of fact, if your peers know your work (which they should) they will immediately notice which work you're referring to, even if you leave it out. When I wrote [information about reference removed for the purpose of blind reviewing] I was suggesting a solution for an unpublished work. –  Jan 24 '12 at 20:47
  • @Daniel Double-blind review is not made ridiculous because referees can determine an author's identity. Well-intentioned referees will not immediately hit Google to determine an author's identity because they will respect the need to review blind. (Whereas a well-intentioned referee might well look at the document properties or copy-paste when writing a report or hover a mouse or whatever.) Nobody suggests that blinding work prevents people discovering authors' identities. Rather, it allows people not to know. – cfr Dec 29 '14 at 03:25
21

Try my censor package for obliterating text while preserving its original spacing. EDITED to demonstrate \xblackout and \censorbox, in addition to \censor and \blackout. The \xblackout will bleed slightly into the margins.

Both \blackout and \xblackout work across linebreaks and paragraph boundaries. However, hyphenation is lost inside these macros.

\documentclass{article}
\usepackage[margin=1in]{geometry}
\usepackage{censor,caption}
\parskip 1ex
\begin{document}
The \censor{Liberty} missile, with charge diameter (CD) of
\censor{80}~mm, revealed a penetration capability of 1.30,
1.19, and 1.37~CD in three recent tests into armor steel.

\blackout{%
The Liberty missile, with charge diameter (CD) of 80~mm,
revealed a penetration capability of 1.30, 1.19, and 1.37~CD
in three recent tests into armor steel.}

\xblackout{%
The Liberty missile, with charge diameter (CD) of 80~mm,
revealed a penetration capability of 1.30, 1.19, and 1.37~CD
in three recent tests into armor steel.}

\begin{table}[ht]
\centering
\caption{This is my \protect\censor{censored caption.}}
\censorbox{
\begin{tabular}{|c|c|}
\hline
This & is my\\
tabular & content
\end{tabular}
}
\end{table}
\end{document}
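
For blinding a paper, the same markup can serve both versions: the package's \StopCensoring switch, which also appears in the example in the comments below, turns the censoring off. A minimal sketch, with an invented name and affiliation:

```latex
\documentclass{article}
\usepackage{censor}
\begin{document}
% Uncomment the next line for the final, unblinded version.
%\StopCensoring
This study was carried out by \censor{Jane Doe} at
\censor{Example University}.
\end{document}
```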


  • This does black out per word and seems less secure than blacking out whole sentences. – N.N. Feb 13 '13 at 06:16
  • The \blackout command leaves spaces intact, so that LaTeX glue works properly in formatting paragraphs (which is the intent of \blackout). But the \censor command of that package will censor spaces. However, it won't work across more than a single line of text. I'll give some thought to how to make the result completely black. I have an idea... – Steven B. Segletes Feb 21 '13 at 16:15
  • Thanks for the package. Is the \censor command supposed to work in twocolumn mode? It does not seem to respect column width. Example: \documentclass[twocolumn]{article}\usepackage{censor}\begin{document}Try my censor package for obliterating text while \censor{preserving its original spacing. EDITED to demonstrate xblackout and censorbox}\end{document}. – Finn Årup Nielsen Feb 17 '16 at 16:34
  • I suppose that the blackout commands are for larger blocks of text (as written in the documentation) and \censor should only be used for individual words? – Finn Årup Nielsen Feb 17 '16 at 16:41
  • 1
    @FinnÅrupNielsen In general, I would say, yes to your question. I'm sure each user will have their own set of exceptions. – Steven B. Segletes Feb 17 '16 at 17:49
  • This is great! To turn it into a good package for blinding though you'd want to have an option to remove cites when they occur under a box. – Peter Gerdes Nov 29 '21 at 00:21
  • @PeterGerdes I'm working on package revisions presently. Can you give an example of what you mean by "cites when they occur under a box"? – Steven B. Segletes Nov 29 '21 at 03:10
  • So suppose you want to anonymize something like this in your paper: \censor{In \cite{berry2020} Berry} shows... but that's the only use of \cite{berry2020} in the document. It would be nice to have an option to suppress appearance of berry2020 in the bibliography (but still have the cite printed out under box to get the right spacing). Not sure if you want to put in your package but would be ideal for paper blinding usage. Or at least issue a warning. – Peter Gerdes Nov 29 '21 at 12:27
  • @PeterGerdes Thanks for the example. I will see what I can do. I will reply on this thread, eventually. – Steven B. Segletes Nov 29 '21 at 13:52
  • 1
    @PeterGerdes I have today uploaded v4.0 of the censor package to CTAN. It should be available for download in the coming days. Flexibility is offered for incorporating macros into censored content, but only if the macro content is expandable. Unfortunately, \ref is not expandable. However, with the refcount package, \getrefnumber is an expandable macro that may serve as an alternative. – Steven B. Segletes Dec 23 '21 at 18:19
  • Wow, thanks for listening to the feedback and I understand the limitations. – Peter Gerdes Dec 24 '21 at 14:56
  • Masking German Umlauts like "a for ä seems to be not possible. – user3072843 Apr 11 '22 at 12:56
  • @user3072843 The current version of censor, 2022/02/09 4.1, can handle umlauts specified as \"a; however, if specified as unicode, ä, it must be compiled via xelatex or lualatex. – Steven B. Segletes Apr 11 '22 at 13:15
  • @user3072843 \documentclass{article} \usepackage{censor}[2022/02/09] \begin{document} %\StopCensoring \blackout{How to handle umlauts ä, ö, ü or \"a, \"o, and \"u?} \end{document} – Steven B. Segletes Apr 11 '22 at 13:26
20

In general, there is more to preparing for double blind reviewing than just syntactically replacing names. It is for instance very easy to reveal one's identity by referring to "my" or "our" previous work, by citing (yet-)unpublished articles, or by otherwise discussing information that is not publicly available. Even if such obvious giveaways are avoided, there are other clues: saying that you build upon work that has been published very recently, discussing the fine points of very recent published research, citing many articles from yourself, even in the third person.

In short, you can't avoid paying attention to these issues in proofreading.

N.N.
  • 36,163
  • Indeed. There still may be more or less efficient ways to handle the more subtle blinding. Simple replacing terms might not suffice for these issues but version control might help track such changes and there might be ways to mark sensitive sentences that should be blinded. – N.N. Jan 24 '12 at 07:33
  • While this is true, there are also limits to what is expected and reasonable. If your article really does build on something else you just published, there may be no way of not saying so without undermining referees' ability to appreciate the point of your article. Similarly, if very few people work on a topic, it may be that the very content of your paper more-or-less gives the game away. There's nothing you can do about that, though, and journals don't expect you to blind stuff to the point of making it impossible to submit work at all. – cfr Dec 29 '14 at 03:21
19

I modified Antal S-Z's answer to this question to allow text to be completely blinded - i.e. the text to hide will be removed entirely from the document such as to prevent it from showing when marking it with the mouse cursor or using some other tool to analyse the document. However, the layout and appearance of the surrounding text will remain intact (although some slight differences may appear due to changes in hyphenation).

Here is the entire code:

\documentclass{minimal}
\usepackage{soul}
\usepackage{tikz}
\usetikzlibrary{calc}

\makeatletter
\newif\if@anonymize

\@anonymizetrue    % Uncomment to hide text
%\@anonymizefalse  % Uncomment to show text

\if@anonymize
  \newcommand{\highlight@DoHighlight}{
    \fill [outer sep = -15pt, inner sep = 0pt, color=black]
          ($(begin highlight)+(0,8pt)$) rectangle ($(end highlight)+(0,-3pt)$) ;
  }

  \newcommand{\highlight@BeginHighlight}{
    \coordinate (begin highlight) at (0,0) ;
  }

  \newcommand{\highlight@EndHighlight}{
    \coordinate (end highlight) at (0,0) ;
  }

  \newdimen\highlight@previous
  \newdimen\highlight@current
  \newlength{\item@width}

  \DeclareRobustCommand*\anonymize{%
    \SOUL@setup
    \def\SOUL@preamble{%
      \begin{tikzpicture}[overlay, remember picture]
        \highlight@BeginHighlight
        \highlight@EndHighlight
      \end{tikzpicture}%
    }%
    %
    \def\SOUL@postamble{%
      \begin{tikzpicture}[overlay, remember picture]
        \highlight@EndHighlight
        \highlight@DoHighlight
      \end{tikzpicture}%
    }%
    %
    \def\SOUL@everyhyphen{%
      \discretionary{%
        \SOUL@setkern\SOUL@hyphkern
        \SOUL@sethyphenchar
        \tikz[overlay, remember picture] \highlight@EndHighlight ;%
      }{%
      }{%
        \SOUL@setkern\SOUL@charkern
      }%
    }%
    %
    \def\SOUL@everyexhyphen##1{%
      \SOUL@setkern\SOUL@hyphkern
      \settowidth{\item@width}{##1}%
      \makebox[\item@width]{}%
      \discretionary{%
        \tikz[overlay, remember picture] \highlight@EndHighlight ;%
      }{%
      }{%
        \SOUL@setkern\SOUL@charkern
      }%
    }%
    %
    \def\SOUL@everysyllable{%
      \begin{tikzpicture}[overlay, remember picture]
        \path let \p0 = (begin highlight), \p1 = (0,0) in \pgfextra
          \global\highlight@previous=\y0
          \global\highlight@current =\y1
        \endpgfextra (0,0) ;
        \ifdim\highlight@current < \highlight@previous
          \highlight@DoHighlight
          \highlight@BeginHighlight
        \fi
      \end{tikzpicture}%
      \settowidth{\item@width}{\the\SOUL@syllable}%
      \makebox[\item@width]{}%
      \tikz[overlay, remember picture] \highlight@EndHighlight ;%
    }%
    \SOUL@
  }
\else
  \newcommand{\anonymize}[1]{#1}
\fi
\makeatother

\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras sit amet urna
nulla. Nam placerat risus quis elit placerat consectetur a in magna. Ut vitae
urna vitae urna sagittis mollis sed a velit. \anonymize{Phasellus enim tellus,
  dictum nec sagittis sit amet, viverra at leo.} Pellentesque faucibus orci non
urna facilisis sed venenatis lacus ultricies. In in diam ut massa sodales
consequat at ut mi. Mauris pharetra tortor et nunc iaculis aliquet sodales
turpis convallis. Pellentesque habitant morbi tristique senectus et netus et
malesuada fames ac turpis egestas. \anonymize{Morbi vulputate}, risus non
accumsan vulputate, justo mauris pretium lorem, nec rhoncus mi nisl sit amet
lacus. Aenean metus nunc, sagittis in dictum sed, facilisis sit amet ante. Fusce
enim lorem, pharetra non congue a, vehicula id metus. Nam facilisis, velit
condimentum volutpat tristique, elit elit tristique est, a varius nulla purus
non mi. Nam elementum viverra ligula sit amet hendrerit. Aenean sit amet tempus
turpis. Suspendisse at risus quis eros semper cursus.

\anonymize{Nunc eleifend, augue non lacinia sagittis, lorem elit ullamcorper
  libero, non placerat massa lectus vel nisi. Phasellus nunc elit, porttitor
  tempor placerat et, semper sed leo. Integer commodo molestie pretium. Ut eu
  dolor velit. Phasellus sed dui nunc. Donec iaculis est eu felis accumsan
  sodales. Vivamus hendrerit dignissim faucibus.}

In congue condimentum metus in ornare. Etiam at diam vitae mi laoreet
consectetur. \anonymize{Curabitur at turpis commodo nisi tempus tincidunt. Nam
  vestibulum lacinia mi, vitae auctor erat consequat ac.} Phasellus semper
blandit orci ac varius. Praesent et magna a mi faucibus porta a non libero. Cum
sociis natoque penatibus et magnis dis parturient montes, \anonymize{nascetur}
ridiculus mus. Nulla facilisi. Nullam commodo volutpat ante ac ornare. Donec
convallis diam accumsan ipsum porta eu elementum leo lacinia. Cras tincidunt
semper mauris, ut mollis lectus consectetur quis. \anonymize{Pellentesque sem
  urna}, fringilla eget faucibus quis, condimentum nec mi. Quisque odio felis,
fermentum quis feugiat placerat, dapibus vitae massa. \anonymize{Ut semper elit
  eget dolor imperdiet posuere.}
\end{document}

And here is an example showing a text with and without anonymization. The setting can be controlled by simply commenting and uncommenting two lines of code.

With anonymity: Text which has been anonymized

Without anonymity: Original text without anonymity

gablin
  • 17,006
  • @YiannisLazarides: True, some slight changes may occur due to changes in hyphenation. I'll update the answer accordingly. – gablin Feb 28 '12 at 12:55
  • 1
    @gablin If you are parsing through soul macros, you can just replace the letters with a blank letter; some fonts have them. – yannisl Feb 28 '12 at 13:04
  • @YiannisLazarides: Wouldn't that change the width of the text to anonymize? Or is there a blank equivalent for every visible character? – gablin Feb 28 '12 at 13:08
  • There was one called the null font that, if I remember correctly, could perhaps do it; I am not sure if it still exists. – yannisl Feb 28 '12 at 14:21
  • @YiannisLazarides \newif\ifparanoid\paranoidtrue But then, first the PDF would still contain the text (except it will be in the null font, but a simple copy-paste would remove the font information and give the clear text), and even if it didn't, it would be relatively easy to discover the clear text by comparing the width of those blank glyphs with the width of the glyphs in the normal font. \paranoidfalse – Suzanne Soy Feb 12 '13 at 21:59
  • @GeorgesDupéron There is no safe solution (even with the gobbling of text as per my answer). It is fairly elementary NLP analysis to determine an author (if you have previous papers) it will take less than 5 minutes with the nltk and elementary python. – yannisl Feb 12 '13 at 22:28
  • I have a question. I see mention of the word "overlay" in your code, regarding tikz invocations. Does the anonymize replace the text with a black line, or merely overlay it? If it is a mere overlay, there is a good chance that a PDF output would contain the anonymous text in its bowels, even though you couldn't see it visually. – Steven B. Segletes Feb 21 '13 at 16:19
  • I presume that the overlay is only used in order to be able to place the tikz pictures (i.e. the black highlights) arbitrarily across the document; I've merely modified a previous solution so I'm not 100% sure how it works. However, in this solution the actual text that is to be anonymized is completely removed and subsequently replaced with blanks of equal size to the text. So there is no risk of retrieving the text from the PDF. Or at least there shouldn't be. =) – gablin Feb 25 '13 at 07:34
  • @YiannisLazarides, there are many ways of getting round blinding, in many situations, so this is more a method to keep the (reasonably) honest (reasonably) honest. Rather like the lock on a door. Reviewers are generally too busy to put a lot of effort into un-anonymising submissions. Except of course when they feel threatened, but the author of a paper has no control over that. – Chris H Sep 23 '13 at 13:09
  • I copied your code and got white boxes. How can I turn them black? – user3072843 Jan 30 '20 at 21:47
  • This blows up if you try and cover a cite command with the anonymize. That's unfortunate since covering up cites is one of the standard things you need to do when blinding. – Peter Gerdes Nov 29 '21 at 00:15
6

Instead of defining many commands like \myname, \collabname, etc, I would define just one command that takes nonblinded text and replaces it with blinded text; once the review is done you can redefine it to return the text unchanged, instead.

\documentclass{article}

    % Create the dimension variables outside the macro, so they'll
    % be created once. (Each new creation consumes a new register)
    \newlength{\sohigh}%
    \newlength{\sowide}%

    \def\blind#1{%
% To use blank lines in code, the comment mark is necessary -- 
% else, LaTeX inserts \pars.
%
        % Set the dimensions of the black stripe
        \settoheight{\sohigh}{\hbox{H}}%
        \settowidth{\sowide}{\hbox{#1}}%
%
        % ... and use them.
        \rule{\sowide}{\sohigh}%
    }

% Alternative, for if you don't particularly care about pretty boxes
% or if the length of the blacked-out text would provide a clue.
%\def\blind#1{CENSORED}

% When you no longer want to blind, use this instead.
%\def\blind#1{#1}

\begin{document}
Hello \blind{World}.
\end{document}

There should be a \begin{blind}...\end{blind}, too, but as I have little experience with LaTeX I don't know how to define these. Improvements welcome.
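
One possible way to get such an environment is sketched below. It assumes a LaTeX kernel from 2020-10-01 or later, where \NewDocumentEnvironment and the +b argument type (which grabs the whole environment body, paragraphs included) are built in. The switch \ifblind and the placeholder wording are hypothetical.

```latex
\documentclass{article}

% Hypothetical switch: \blindtrue for review, \blindfalse for the final version.
\newif\ifblind
\blindtrue

\ifblind
  % Grab the body with +b and discard it, printing a fixed placeholder instead.
  \NewDocumentEnvironment{blind}{+b}{[text removed for blind review]}{}
\else
  % Final version: the environment does nothing and the body is typeset as usual.
  \NewDocumentEnvironment{blind}{}{}{}
\fi

\begin{document}
Some public text.

\begin{blind}
We thank our colleagues at Example University for their help.

This second paragraph is hidden as well.
\end{blind}
\end{document}
```

Because the body is discarded rather than covered, the hidden text does not end up in the PDF, at the cost of the document reflowing.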

Esteis
  • 2,917
5

Some journals also require you to mask citations. The apa6 class provides commands for masking citations.

StrongBad
  • 20,495
  • Is this possible with biblatex? – N.N. Feb 21 '12 at 20:20
  • 1
    The documentation for apa6 says it works with biblatex and natbib. The documentation for biblatex-apa does not mention it, so I am guessing the cite commands are defined in the apa6 .cls file and not the biblatex-apa .cbx file (but this is a guess). – StrongBad Feb 22 '12 at 11:22
1

A simple approach if you want to have two different versions of text--rather than blacking out--is to define a dummy command and then use \ifdefined and \else to select which version of the text to use.

For example:

%Near top of document
\newcommand{\DOUBLEBLIND}{} %Identify that we are building the double blind review version
...
This work included a novel Internet-based link between 
\ifdefined\DOUBLEBLIND
  % Stuff you want to see in the double blind version (or empty if omitting in review version)
  Lab1 and Lab2, based on [withheld].
\else
  % Stuff you want to hide in the double blind version (or leave off \else block if nothing to add)
  Actual_Name and Other_Real_Name, based on \cite{give_away_paper}.
\fi

In this form, typesetting the document will create the review version. To create the final, simply comment out the new command by placing a % in front of the \newcommand... line and the alternate text will be used instead.
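
If the same choice recurs many times, the conditional can be wrapped once in a two-argument macro so that the body text stays readable. This is only a sketch; \blindalt is a hypothetical name, and the lab names are placeholders.

```latex
\documentclass{article}

% Comment out the next line to build the final version.
\newcommand{\DOUBLEBLIND}{}

\ifdefined\DOUBLEBLIND
  % Review version: keep the first (blinded) argument.
  \newcommand{\blindalt}[2]{#1}
\else
  % Final version: keep the second (real) argument.
  \newcommand{\blindalt}[2]{#2}
\fi

\begin{document}
This work included a novel Internet-based link between
\blindalt{Lab1 and Lab2, based on [withheld]}%
         {Real Lab A and Real Lab B, based on our earlier paper}.
\end{document}
```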

Bryan P
  • 527