Variations on this question have been asked many times before, but I have not been able to find a working solution. The closest I have found are discussions on other Stack Exchange sites about how to produce this kind of plot with R.
I am trying to produce the following plot, with red and blue lines. (The position of the lines is merely my estimate of where I expect them to appear; a solution must calculate their position from the data in the data.csv file.)
Part of my difficulty in asking this question is that I am not sure of the correct terminology for the red and blue lines that I have drawn on the plot. I do not think they are Gaussian curves, as they are not symmetrical. Nevertheless, I have included Gaussian code in the MWE below in case my implementation of it is simply wrong (there is a very, very small Gaussian plot at the bottom left of the above chart, which I have clearly not implemented successfully).
EDIT: As discussed in the comments below, I am looking for the curves to demonstrate a smooth estimation of the distribution of the results.
MWE
\documentclass{standalone}
\usepackage{pgfplots}
\usepackage{filecontents}
\begin{document}
\begin{filecontents*}{data.csv}
COLUMNA,COLUMNB
38,22
85,18
104,82
56,20
202,57
64,15
115,22
8,20
120,14
81,24
100,28
39,11
81,29
25,18
122,51
93,10
45,19
103,11
33,24
60,24
50,47
61,24
46,14
45,15
84,72
62,20
50,13
84,38
52,19
108,5
182,34
145,19
117,12
34,59
43,19
42,26
170,18
31,27
86,18
183,24
36,15
,21
,16
,26
\end{filecontents*}
\pgfmathdeclarefunction{gauss}{2}{\pgfmathparse{1/(#2*sqrt(2*pi))*exp(-((x-#1)^2)/(2*#2^2))}%
}
\begin{tikzpicture}
\begin{axis}[
ybar,
/pgf/number format/.cd,
use comma,
1000 sep={},
title={Title},
xlabel={Bins},
ylabel={Instances},
x label style={at={(axis description cs:0.5,-0.1)},anchor=north},
y label style={at={(axis description cs:0.05,0.5)},anchor=south},
%xticklabel style={rotate=90, anchor=near xticklabel},
xtick distance=50,
ytick distance=2,
width=\textwidth, %10.5cm
height=6cm,
axis y line*=left,
axis x line*=bottom,
ymin=0,
xmin=0,
xticklabel interval boundaries,
]
%%%
\addplot +[blue,
fill opacity=0.5,
hist={bins=22,
data min=0,
data max=220,
}
] table[y=COLUMNA, col sep=comma] {data.csv};
\addlegendentry{Series A}
\addplot +[red,
fill opacity=0.5,
hist={bins=22,
data min=0,
data max=220,
}
] table[y=COLUMNB, col sep=comma] {data.csv};
\addlegendentry{Series B}
\addplot [fill=red!50, draw=none, domain=0:220] {gauss(1.86,2.12)};
\end{axis}
\end{tikzpicture}
\end{document}

[…] TeX environment and then use the data to plot things as you wish ;) – Raaja_is_at_topanswers.xyz Jan 24 '19 at 13:15

[…] TeX itself would be capable of performing and plotting the necessary calculations (as it already does with a Gaussian curve), but if I understand @StefanPinnow correctly, what I am after is beyond its capabilities. – Craig Jan 24 '19 at 13:38

[…] TeX is not meant to do such tasks ;) (which, I myself learned recently). Moreover, doing such a thing will make the entire code more complicated than it should be (which, IMO could be avoided whenever it is possible). – Raaja_is_at_topanswers.xyz Jan 24 '19 at 13:41

[…] \addplot command using raw gnuplot... – Stefan Pinnow Jan 24 '19 at 14:28

[…] Gaussian. However, it doesn't represent the distribution of the so-called binned data ;) So, unless OP makes it clear what is of interest, as you said, it is indeed a futile attempt per se. For reference, https://imgur.com/a/DFcwiy0 this represent the distribution of the first column of the data that is here. – Raaja_is_at_topanswers.xyz Jan 24 '19 at 19:31

[…] f(x)=0.1*data(x-2)+0.2*data(x-1)+0.4*data(x)+0.2*data(x+1)+0.1*data(x+2) (of course not for the points at the ends of the interval). This would give you a sample of points you can draw a smooth curve through. Obviously, there are much more sophisticated recipes on the market, and I fully agree with you that one should then resort to standard software tailored for that. – Jan 24 '19 at 19:36
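
The raw gnuplot route mentioned in the comments could look roughly like the sketch below. It is not taken from the thread. It assumes that gnuplot is installed, that the document is compiled with shell escape enabled (for example pdflatex -shell-escape), and that data.csv from the MWE already exists in the working directory. It uses gnuplot's `smooth kdensity` (a Gaussian kernel density estimate with an automatically chosen bandwidth) as one possible reading of "a smooth estimation of the distribution"; the weight of 10 per data point is my own choice, picked so that the curve is on the same scale as a count histogram with bins of width 10.

```latex
% Sketch only: needs gnuplot, shell escape, and an existing data.csv.
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.15}
\begin{document}
\begin{tikzpicture}
\begin{axis}[
    xlabel={Bins},
    ylabel={Instances},
    xmin=0, xmax=220,
    ymin=0,
]
% Histogram of COLUMNA, binned as in the MWE (22 bins of width 10).
\addplot +[blue, fill=blue!30, fill opacity=0.5, mark=none,
    hist={bins=22, data min=0, data max=220},
] table [y=COLUMNA, col sep=comma] {data.csv};
% Kernel density estimate of the same column, computed by gnuplot.
% "skip 1" jumps over the CSV header line; "using 1:(10)" reads column 1
% and gives every point the weight 10 (the bin width), so the area under
% the curve matches the area of the count histogram.
\addplot [blue, thick, no markers, raw gnuplot] gnuplot {
    set datafile separator ',';
    set xrange [0:220];
    set samples 200;
    plot 'data.csv' skip 1 using 1:(10) smooth kdensity;
};
\end{axis}
\end{tikzpicture}
\end{document}
```

A curve for COLUMNB would follow the same pattern with `using 2:(10)`. If the default bandwidth smooths too much or too little, gnuplot accepts an explicit value after `kdensity`, for example `smooth kdensity bandwidth 10`; which value is appropriate depends on the data and is a judgment call.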