
I have a data file and want to plot a histogram of it with the number of occurrences printed on top of each bar. I was able to produce such a plot in Mathematica; the result is shown in the attached screenshot.

But I could not produce such a plot in LaTeX. How can I do that? Here is my minimal code:

\documentclass{standalone}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{filecontents}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}

\begin{filecontents*}{data.csv}
0.999705, 0.102439, 2.22161, 4.48747, 1.21895, 6.4383, 1.89919, 0.131524, 0.255719,
 0.678656, 1.207, 2.4247, 1.33127, 1.50532, 1.14534, 3.62333, 1.3151, 1.1334, 1.14764, 
3.37807, 0.314127, 0.780982, .94461, 1.76052, 1.15505, 0.641343, 0.501159, 0.838922, 0.392404, 2.40693, 1.2087, 0.939336, 0.620025, 0.778478, 1.70643, 1.50092, 0.610803, 0.449405, 0.372005, 0.437747, 2.17335, 0.147226, 0.275256, 0.285204, .332344, 0.390268, 0.598056, 2.78572, 0.843533,0.869065, 
1.40148, 0.713403, 0.560139, 0.64868, 0.860224, 1.15303, 1.45957, 1.1884, 1.15756, 0.151852, 0.655366, 1.04536, 0.815271, 1.18471, 1.47575, 1.5487, 3.5261, 2.02479, 1.86159, 
2.20584, 2.10486, 2.75795, 1.41652, 0.685807, 4.80702, 1.69252, 1.08762, 0.541417, 0.552933, 0.60403, 0.661523, 1.93877, 4.95087, 0.667625, 0.643584, 0.721016, 0.746126, 0.577656, 3.09755, 2.66435, 0.56278, 0.799503, 0.783744, 0.576326, 0.669558, 0.977875, 1.54727, 1.80504, 1.08556, 0.674201, 0.808802, 3.41343, 1.82106, 1.32317, 0.960459, 2.83347, 1.746, 0.995808, 3.18927, 0.168725, 0.24383, 0.636872, 0.986101, 0.782347, 0.963776
\end{filecontents*}


\begin{document}

\begin{tikzpicture}
\begin{axis}[
ybar,
width=\textwidth,
ylabel = {Number},
xlabel = {$ \Delta_{\text{ours}} $},
xtick={0,0.2,...,8},
ytick={0,1,...,50},
 ]
\addplot table [x,col sep=comma] {data.csv};
\end{axis}
\end{tikzpicture}
\end{document}
AYBRXQD

2 Answers

3

Your data is in a format that requires some manipulations.

\documentclass{standalone}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{filecontents}
\usepackage{catchfile}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}

\begin{filecontents*}{data.csv}
0.999705, 0.102439, 2.22161, 4.48747, 1.21895, 6.4383, 1.89919, 0.131524, 0.255719,
 0.678656, 1.207, 2.4247, 1.33127, 1.50532, 1.14534, 3.62333, 1.3151, 1.1334, 1.14764, 
3.37807, 0.314127, 0.780982, .94461, 1.76052, 1.15505, 0.641343, 0.501159, 0.838922, 0.392404, 2.40693, 1.2087, 0.939336, 0.620025, 0.778478, 1.70643, 1.50092, 0.610803, 0.449405, 0.372005, 0.437747, 2.17335, 0.147226, 0.275256, 0.285204, .332344, 0.390268, 0.598056, 2.78572, 0.843533,0.869065, 
1.40148, 0.713403, 0.560139, 0.64868, 0.860224, 1.15303, 1.45957, 1.1884, 1.15756, 0.151852, 0.655366, 1.04536, 0.815271, 1.18471, 1.47575, 1.5487, 3.5261, 2.02479, 1.86159, 
2.20584, 2.10486, 2.75795, 1.41652, 0.685807, 4.80702, 1.69252, 1.08762, 0.541417, 0.552933, 0.60403, 0.661523, 1.93877, 4.95087, 0.667625, 0.643584, 0.721016, 0.746126, 0.577656, 3.09755, 2.66435, 0.56278, 0.799503, 0.783744, 0.576326, 0.669558, 0.977875, 1.54727, 1.80504, 1.08556, 0.674201, 0.808802, 3.41343, 1.82106, 1.32317, 0.960459, 2.83347, 1.746, 0.995808, 3.18927, 0.168725, 0.24383, 0.636872, 0.986101, 0.782347, 0.963776
\end{filecontents*}


\begin{document}

\begin{tikzpicture}
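% read the comma-separated file into \mydata
\CatchFileEdef{\mydata}{data.csv}{}
\edef\mymax{0}
\edef\mymin{0}
% bin index = truncated 2*value, i.e. bins of width 0.5; count the entries per bin
\foreach \X in \mydata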
{\pgfmathtruncatemacro{\myX}{2*\X}
\pgfmathtruncatemacro{\mymax}{max(\mymax,\myX)}
\pgfmathtruncatemacro{\mymin}{min(\mymin,\myX)}
\xdef\mymax{\mymax}
\xdef\mymin{\mymin}
\ifcsname mybin\romannumeral\myX\endcsname
 \expandafter\xdef\csname mybin\romannumeral\myX\endcsname{\the\numexpr
    \csname mybin\romannumeral\myX\endcsname+1}
\else
 \expandafter\xdef\csname mybin\romannumeral\myX\endcsname{1}
\fi
}
% build the coordinate list (bin centre, count), using 0 for empty bins
\edef\mydata{}
\pgfplotsforeachungrouped \X in {\mymin,\the\numexpr\mymin+1,...,\mymax}
{\pgfmathsetmacro{\myx}{\X/2+0.25}
\ifcsname mybin\romannumeral\X\endcsname
\edef\mydata{\mydata (\myx,\csname mybin\romannumeral\X\endcsname)}
\else
\edef\mydata{\mydata (\myx,0)}
\fi}
\begin{axis}[nodes near coords,
bar width=2.2em,
nodes near coords style={anchor=south},
width=\textwidth,
ybar,
ylabel = {Number},
xlabel = {$ \Delta_{\text{ours}} $},
xmin=\mymin,ymin=0,
xmax=0.5+\mymax/2,
 ]
\addplot[draw=black,fill=orange!50] coordinates {\mydata};
\end{axis}
\end{tikzpicture}
\end{document}
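
As a reading aid, the binning done by the loops above can be written out as follows. This is only a sketch of what the code computes; the symbols $k$ and $x_k$ are introduced here for illustration and do not appear in the answer. Note that `\pgfmathtruncatemacro` truncates, which coincides with the floor function only because the data are non-negative.

```latex
\[
  k = \lfloor 2x \rfloor, \qquad
  \text{bin } k = \left[\tfrac{k}{2},\, \tfrac{k+1}{2}\right), \qquad
  x_k = \tfrac{k}{2} + 0.25
\]
% Example: x = 1.21895 gives 2x = 2.4379, so k = 2,
% i.e. the bin [1, 1.5) with the bar centred at x_2 = 1.25.
% The largest value, 6.4383, gives k = 12, so xmax = 0.5 + 12/2 = 6.5.
```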


  • Dear @Schrödinger's cat, thank you so much, but this plot differs considerably from what I uploaded. For example, the spacing between the bars is different, and the numbers of occurrences on top of the bars are not integers. I want the plot to be exactly the same as what I uploaded. – AYBRXQD Mar 06 '20 at 17:36
  • @AYBRXQD This plot plots the data, which consists of fractional numbers. Your Mathematica screenshot shows data consisting of integers. If you want to reproduce the Mathematica plot, you need to provide the data that is used in that plot, not some other data. Once this is done, one can adjust the bar width and colors to produce a precise match. –  Mar 06 '20 at 17:44
  • Dear @Schrödinger's cat, the data file constructed in the LaTeX code was extracted from Mathematica, exactly. – AYBRXQD Mar 06 '20 at 17:48
  • @AYBRXQD IMHO the question is rather misleading. You post code and essentially ask how to add these numbers. Adding the numbers is, however, not the important step. The important step is to bin the data, which is not mentioned at all in the question. –  Mar 06 '20 at 20:22
2

If you want to plot a histogram, then simply do that by using the statistics library. Usually the numbers on top of the bars are added with nodes near coords, but for histograms there is a known bug: first, the nodes are not centered above the bars, and second, you get an additional, superfluous node. To circumvent this, here is an adaptation of esdd's great answer (the link is in the code).

Please note that I have put the data directly into the table instead of reading it from a file. Otherwise the data would have to be arranged in a column (one value per line) and not, as you did, in rows. Here this is achieved with the trick of using row sep=\\ and replacing the commas by \\.
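
For the file-based alternative, a minimal sketch could look like the following. It assumes the values have been rewritten with one value per line; the file name `data-column.dat` is hypothetical, and only the first six values of the data set are included to keep the example short. The node placement fix is left out here, so this shows only the reading and binning step.

```latex
% hypothetical file with one value per line (first six values of the data only);
% written before \documentclass so that it also works with older LaTeX kernels
\begin{filecontents*}{data-column.dat}
0.999705
0.102439
2.22161
4.48747
1.21895
6.4383
\end{filecontents*}
\documentclass[border=5pt]{standalone}
\usepackage{pgfplots}
    \usepgfplotslibrary{statistics}
    \pgfplotsset{compat=1.16}
\begin{document}
\begin{tikzpicture}
    \begin{axis}[
        ybar,
        ymin=0,
    ]
        % same binning as in the full example: 13 bins of width 0.5 on [0, 6.5]
        \addplot+ [
            hist={
                data min=0,
                data max=6.5,
                bins=13,
            },
        ] table [y index=0] {data-column.dat};
    \end{axis}
\end{tikzpicture}
\end{document}
```

With the data in a file like this, neither `row sep=\\` nor the replacement of commas is needed.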

% used PGFPlots v1.16
\documentclass[border=5pt]{standalone}
\usepackage{amsmath}
\usepackage{pgfplots}
    \usepgfplotslibrary{statistics}
    \pgfplotsset{compat=1.16}

    % -------------------------------------------------------------------------
    % modified from <https://tex.stackexchange.com/a/181132/95441>
    % -------------------------------------------------------------------------
    \newcommand*\NNC{\pgfmathprintnumber{\pgfkeysvalueof{/data point/y}}}
    % default value for enlarge x limits = 0.1
    \newcommand*\enlargexlimits{0.05}
    \pgfplotsset{
        hist nodes near coords/.style={
            nodes near coords style={
                xshift={
                    (\pgfkeysvalueof{/pgfplots/width}-45pt) % every plot is 45pt smaller than the width
                    /(1+2*\enlargexlimits)                  % correction for enlarge x limits
                    /\pgfkeysvalueof{/pgfplots/hist/bins}   % number of bins
                    /2% shift only half of bin width
                },
            },
            nodes near coords={%
                \pgfmathparse{
                    \pgfkeysvalueof{/data point/x} < #1*\pgfkeysvalueof{/pgfplots/hist/data max}?
                        "\noexpand\NNC"     % if true print nodes near coords
                        :                   % if false suppress the additional node near coords
                }\pgfmathresult%
            },
        },
        % if you set `hist/data max` explicitly, use value 1
        hist nodes near coords/.default={0.9},
    }
    % -------------------------------------------------------------------------
\begin{document}
\begin{tikzpicture}
    \begin{axis}[
        width=10cm,
        ybar,
        ymin=0,
        ylabel={Number},
        xlabel={$\Delta_{\text{ours}}$},
        enlarge x limits=\enlargexlimits,
    ]
        \addplot+ [
            hist={
                data min=0,
                data max=6.5,
                bins=13,
            },
            % set 1 here because we have set `data max`
            hist nodes near coords=1,
        ] table [row sep=\\,y index=0] {
            0.999705\\ 0.102439\\ 2.22161\\ 4.48747\\ 1.21895\\ 6.4383\\
            1.89919\\ 0.131524\\ 0.255719\\ 0.678656\\ 1.207\\ 2.4247\\
            1.33127\\ 1.50532\\ 1.14534\\ 3.62333\\ 1.3151\\ 1.1334\\ 1.14764\\
            3.37807\\ 0.314127\\ 0.780982\\ .94461\\ 1.76052\\ 1.15505\\
            0.641343\\ 0.501159\\ 0.838922\\ 0.392404\\ 2.40693\\ 1.2087\\
            0.939336\\ 0.620025\\ 0.778478\\ 1.70643\\ 1.50092\\ 0.610803\\
            0.449405\\ 0.372005\\ 0.437747\\ 2.17335\\ 0.147226\\ 0.275256\\
            0.285204\\ 0.332344\\ 0.390268\\ 0.598056\\ 2.78572\\ 0.843533\\
            0.869065\\ 1.40148\\ 0.713403\\ 0.560139\\ 0.64868\\ 0.860224\\
            1.15303\\ 1.45957\\ 1.1884\\ 1.15756\\ 0.151852\\ 0.655366\\
            1.04536\\ 0.815271\\ 1.18471\\ 1.47575\\ 1.5487\\ 3.5261\\ 2.02479\\
            1.86159\\ 2.20584\\ 2.10486\\ 2.75795\\ 1.41652\\ 0.685807\\
            4.80702\\ 1.69252\\ 1.08762\\ 0.541417\\ 0.552933\\ 0.60403\\
            0.661523\\ 1.93877\\ 4.95087\\ 0.667625\\ 0.643584\\ 0.721016\\
            0.746126\\ 0.577656\\ 3.09755\\ 2.66435\\ 0.56278\\ 0.799503\\
            0.783744\\ 0.576326\\ 0.669558\\ 0.977875\\ 1.54727\\ 1.80504\\
            1.08556\\ 0.674201\\ 0.808802\\ 3.41343\\ 1.82106\\ 1.32317\\
            0.960459\\ 2.83347\\ 1.746\\ 0.995808\\ 3.18927\\ 0.168725\\
            0.24383\\ 0.636872\\ 0.986101\\ 0.782347\\ 0.963776\\
        };
    \end{axis}
\end{tikzpicture}
\end{document}

image showing the result of above code
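
To make the `xshift` expression in the style less opaque, here is the arithmetic for the values used in this example (`width=10cm`, `\enlargexlimits` set to 0.05, 13 bins). The numbers are rounded, and the 45pt offset is taken from the comment in the code rather than verified independently.

```latex
\[
  \text{xshift}
  = \frac{10\,\text{cm} - 45\,\text{pt}}{(1 + 2 \cdot 0.05) \cdot 13 \cdot 2}
  \approx \frac{284.53\,\text{pt} - 45\,\text{pt}}{28.6}
  \approx 8.38\,\text{pt}
\]
% 1cm = 28.45274pt, so 10cm is about 284.53pt;
% the result is half the width of one bin on the page.
```

Each label is therefore shifted right by about half a bar width, which moves it from the left edge of the bin to its centre.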

Stefan Pinnow