
I have a table that contains confidential information, but there is no problem with publishing the boxplot derived from it.

To produce the boxplot, I am using pgfplots, which reads the data directly from the table.

Is it safe to use the resulting boxplot from pgfplots, in the sense that no data from my table will be included (e.g. as metadata) in the resulting PDF file?

  • Can't you just use the 'boxplot prepared' handler so that the confidential data never goes into LaTeX in the first place? – Alexander Sep 13 '13 at 11:20
  • I will do as you say, even though it is a little bit more complicated: http://tex.stackexchange.com/questions/117435/read-boxplot-prepared-values-from-a-table – Laurent Dudok de Wit Sep 13 '13 at 11:23
  • Well, every plot can be reverse engineered, and if it is a vector graphic it is even easier to get the coordinates. So better not to publish your plot if you have serious concerns. Otherwise, rely on the reader's laziness. – percusse Sep 13 '13 at 11:50
  • @percusse: I think Laurent is not concerned with people getting the values of the box plot, but with people getting the values that were used to calculate the box plot values (the box plot values are only summary statistics of the confidential data). – Jake Sep 13 '13 at 12:04
  • Create a pdf with pgfplot in standalone and include it as an image? – Ethan Bolker Sep 13 '13 at 12:22
  • @Jake Oh, then I would say yes, it is safe, since pgfplots only produces paths. I wish it was unsafe though, I could use that hack forever :) – percusse Sep 13 '13 at 12:53

1 Answer


You are safe to publish the resulting vector graphics.

From what you said, I assume you are using the statistics library and its boxplot plot handler, as in the following example (please confirm that this is how you generate the graphics):

\documentclass{standalone}

\usepackage{pgfplots}

\usepgfplotslibrary{statistics}

\begin{document}

\begin{tikzpicture}
\begin{axis}[y=1cm,ytick=\empty]
    \addplot+[boxplot]
        table[row sep=\\,y index=0] {
        data\\
        1\\ 2\\ 1\\ 5\\ 4\\ 10\\
        7\\ 10\\ 9\\ 8\\ 9\\ 9\\
    };
\end{axis}
\end{tikzpicture}

\end{document}


In this case, pgfplots computes the boxplot prepared values (the summary statistics: whiskers, quartiles, and median) from the input data. These values are (a) visible in the plot and (b) recoverable by reverse engineering the drawn paths. But the original table values cannot be recovered from the resulting vector graphics (except for any outliers, of course, since each outlier is drawn as an individual mark).
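If you prefer that the raw data never reaches LaTeX at all, as suggested in the comments, the summary statistics can be passed in directly with boxplot prepared. The following is only a minimal sketch: the five numbers are illustrative placeholders that you would compute outside of LaTeX, not the values pgfplots derives from the example table above, and the compat level is an assumption.

```latex
\documentclass{standalone}

\usepackage{pgfplots}
\pgfplotsset{compat=1.8}% assumed compat level; adjust to your installation

\usepgfplotslibrary{statistics}

\begin{document}

\begin{tikzpicture}
\begin{axis}[y=1cm,ytick=\empty]
    % Only the five summary statistics are given here.
    % The numbers are illustrative placeholders, not computed from real data.
    \addplot+[
        boxplot prepared={
            lower whisker=1,
            lower quartile=3,
            median=7.5,
            upper quartile=9,
            upper whisker=10,
        },
    ] coordinates {};% empty list: no outliers are drawn
\end{axis}
\end{tikzpicture}

\end{document}
```

With this variant the .tex source itself contains nothing but the summary statistics, so it can be shared along with the PDF.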


Unrelated side-remark:

Note that this answer holds only for boxplot. If you had a scatter plot or a line plot, the entire coordinate data could be reverse engineered. In fact, the clickable library of pgfplots not only makes it possible to reverse engineer any coordinate, but also embeds a copy of the input coordinates in the resulting .pdf.

David Carlisle