
I am trying to create a stacked histogram like this one:

[image]

Each row of my data consists of a class (0-4) and a float value (a mean) belonging to that class.

I already have a histogram of all mean values, but I would like to color each bar according to how many values of each class fall into it.

What I have:

[image]

Code:

\documentclass{article}
\usepackage{filecontents}
\usepackage{pgfplots, pgfplotstable}
% first number is a class (0-4), second float number is the mean of the data
\begin{filecontents*}{data.csv}
                1,0.177597344546
                1,0.18105947348
                1,0.177429493018
                2,0.244377481246
                4,0.185496836789
                0,0.180714004683
                4,0.187928321127
                3,0.188037364067
                4,0.187302774169
                3,0.188172520266
\end{filecontents*}
\begin{document}
\begin{tikzpicture}
    \begin{axis}[
            ybar stacked,
            ymin=0,
        ]
        \addplot +[
            hist={
                bins=20,
            }   
        ] table [x, y, col sep=comma] {data.csv};
    \end{axis}
\end{tikzpicture}
\end{document}
cmhughes
mjspier

2 Answers


I found a solution, but I had to split my data. I created one data file for each class (0-4), which can easily be done with grep:

grep 0, data.csv > mean0.csv

If it were possible to filter the rows when the data is loaded (for example, to load only the rows where x=0), this step would be unnecessary.
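As a more defensive alternative to the grep call, the split could also be scripted. The following is only a sketch, assuming the headerless two-column class,value layout of data.csv shown in the question; the function names and the `prefix` parameter are made up for illustration and are not part of the original answer. Unlike a plain substring match, it compares the whole class field, so it would keep working if a class label such as 10 were ever added.

```python
import csv
from collections import defaultdict


def split_by_class(rows):
    """Group (class, value) rows into {class_label: [value_string, ...]}."""
    groups = defaultdict(list)
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        # strip the leading indentation that the sample file has
        label, value = row[0].strip(), row[1].strip()
        groups[label].append(value)
    return dict(groups)


def write_class_files(in_path="data.csv", prefix="mean"):
    """Write one file per class: mean0.csv, mean1.csv, ... (hypothetical names)."""
    with open(in_path, newline="") as f:
        groups = split_by_class(csv.reader(f))
    for label, values in groups.items():
        with open(f"{prefix}{label}.csv", "w", newline="") as out:
            writer = csv.writer(out)
            for value in values:
                writer.writerow([label, value])
    return groups
```

Calling `write_class_files()` in the directory that contains data.csv would then produce the per-class files used below.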

Note: xmin and xmax have to be set so that every histogram uses the same range, and therefore the same bins.

\documentclass{article}
\usepackage{filecontents}
\usepackage{pgfplots, pgfplotstable}
% first number is a class (0-4), second float number is the mean of the data
\begin{filecontents*}{mean0.csv}
                0,0.180714004683
\end{filecontents*}
\begin{filecontents*}{mean1.csv}
                1,0.177597344546
                1,0.18105947348
                1,0.177429493018
\end{filecontents*}
\begin{filecontents*}{mean2.csv}
                2,0.244377481246
\end{filecontents*}
\begin{filecontents*}{mean3.csv}
                3,0.188037364067
                3,0.188172520266
\end{filecontents*}
\begin{filecontents*}{mean4.csv}
                4,0.185496836789
                4,0.187928321127
                4,0.187302774169
\end{filecontents*}
\begin{document}
\begin{figure}
    \centering
        \begin{tikzpicture}
            \begin{axis}[
                    ybar stacked,
                    ybar legend,
                    ylabel={\# traces},
                    xlabel={mean},
                    ymin=0,
                    xmin=0.17,
                    xmax=0.25,
                    legend style={
                        at={(0.5,-0.20)},
                        anchor=north,
                        legend columns=-1
                    },
                ]
                \addplot +[hist={bins=20}]table [x, y, col sep=comma] {mean0.csv};
                \addplot +[hist={bins=20}]table [x, y, col sep=comma] {mean1.csv};
                \addplot +[hist={bins=20}]table [x, y, col sep=comma] {mean2.csv};
                \addplot +[hist={bins=20}]table [x, y, col sep=comma] {mean3.csv};
                \addplot +[hist={bins=20}]table [x, y, col sep=comma] {mean4.csv};
                \legend{pro1, pro2, pro3, pro4, pro5}
            \end{axis}
        \end{tikzpicture}
    \caption{caption}
\end{figure}
\end{document}

The histogram now looks like this:

[image]

mjspier

Just for completeness: I also created another solution, in which I calculate the histogram bins beforehand with a Python script.

Before, my data looked like this:

<class, mean_value>    
1,0.177597344546
1,0.18105947348
1,0.177429493018
...

After running the Python script, the data looks like this:

<bin_value   class1  class2 class3 class4 class5>
0.177093 471 882 0 0 0
0.180632 538 127 0 135 0
0.184171 0 0 0 691 556
....
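The original script is not shown, so the following is only a sketch of how such a table could be produced, not the script that generated the numbers above. It assumes the headerless class,value layout of data.csv, equal-width bins over a fixed range, and it writes the bin center in the first column (whether the original used centers or edges is not stated). All function names are hypothetical.

```python
import csv


def read_rows(path="data.csv"):
    """Yield (class_label, value) pairs from a headerless class,value CSV file."""
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                yield row[0].strip(), float(row[1])


def stacked_bins(rows, classes, bins, lo, hi):
    """Count values per bin and class over `bins` equal-width bins on [lo, hi].

    Returns a list of (bin_center, [count for each class in `classes`]).
    """
    width = (hi - lo) / bins
    index = {label: i for i, label in enumerate(classes)}
    counts = [[0] * len(classes) for _ in range(bins)]
    for label, value in rows:
        if label not in index or not lo <= value <= hi:
            continue  # ignore unknown classes and out-of-range values
        # a value equal to hi is counted in the last bin
        b = min(int((value - lo) / width), bins - 1)
        counts[b][index[label]] += 1
    return [(lo + (b + 0.5) * width, counts[b]) for b in range(bins)]


def format_table(table):
    """Render the table as space-separated rows, one bin per line."""
    return "\n".join(
        " ".join([f"{x:.6f}"] + [str(n) for n in row]) for x, row in table
    )
```

For the data in the question, something like `print(format_table(stacked_bins(read_rows(), ["0", "1", "2", "3", "4"], 20, 0.17, 0.25)))` would print a table in this layout. Bin centers are used here because a plain `ybar` plot centers each bar on its x coordinate.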

The final LaTeX example:

\documentclass{article}
\usepackage{filecontents}
\usepackage{pgfplots, pgfplotstable}
\begin{filecontents*}{data2.csv}
0.177093 471 882 0 0 0
0.180632 538 127 0 135 0
0.184171 0 0 0 691 556
0.187710 0 0 0 183 453
0.191249 0 0 0 0 0
0.194788 0 0 0 0 0
0.198327 0 0 0 0 0
0.201866 0 0 0 0 0
0.205405 0 0 0 0 0
0.208944 0 0 0 0 0
0.212483 0 0 0 0 0
0.216022 0 0 0 0 0
0.219561 0 0 0 0 0
0.223100 0 0 0 0 0
0.226639 0 0 0 0 0
0.230178 0 0 0 0 0
0.233717 0 0 0 0 0
0.237256 0 0 0 0 0
0.240795 0 0 124 0 0
0.244334 0 0 885 0 0
\end{filecontents*}
\begin{document}
        \begin{tikzpicture}
            \begin{axis}[
                    ybar stacked,
                    ylabel={\# traces},
                    xlabel={mean},
                    ymin=0,
                    legend style={
                        at={(0.5,-0.20)},
                        anchor=north,
                        legend columns=-1
                    },
                ]  
                \addplot +[ybar]table [x index=0,y index=1] {data2.csv};
                \addplot +[ybar]table [x index=0,y index=2] {data2.csv};
                \addplot +[ybar]table [x index=0,y index=3] {data2.csv};
                \addplot +[ybar]table [x index=0,y index=4] {data2.csv};
                \addplot +[ybar]table [x index=0,y index=5] {data2.csv};
                \legend{pro1, pro2, pro3, pro4, pro5}
            \end{axis}
        \end{tikzpicture}
\end{document}

The resulting graph is the same as above.

If anyone wants it, I can also post the Python script.

mjspier