
I'm trying to create a box-and-whisker plot that includes the upper and lower fences and the outliers. My current plot looks like this (minus all the labeling):

[Image: a box plot that shows only Q1, the median, Q3, and the upper and lower whiskers]

\documentclass [12pt, letterpaper] {article}

\usepackage {pgfplots}
\pgfplotsset {compat=1.18}
\usepgfplotslibrary {statistics}

\begin {document}

\begin {center}
  \begin {tikzpicture}
    \begin {axis} [
      xmin = 15, xmax = 35,
      xtick = {15, 20, 25, 30, 35},
      ytick = \empty,
      width = 10cm, height = 5cm,
    ]
      \addplot + [
        boxplot prepared = {
          median = 25.5,
          upper quartile = 27,
          lower quartile = 23,
          upper whisker = 31,
          lower whisker = 17,
        },
      ] coordinates {};
    \end {axis}
  \end {tikzpicture}
\end {center}

\end {document}

The problem is that I can't figure out how to add the upper fence, the lower fence, and my outliers. I need my graph to look more like this:

[Image: a box plot with Q1, the median, Q3, the upper whisker, and the lower whisker marked, plus the upper fence, the lower fence, and some outliers, with everything labeled]

muzimuzhi Z
Le-Kat
  • +1: Nice code example. Is there a reason why there are so many spaces between commands and arguments, e.g. \end {center}? – Dr. Manuel Kuehner Jan 31 '22 at 04:03
  • There are so many spaces probably because I just yanked this plot out of the .tex file I was writing and removed all the data that was irrelevant to this question. Also, I generally prefer to overuse white space since it makes things slightly more readable (personally speaking, at least). – Le-Kat Jan 31 '22 at 04:17
  • I see, there seems to be no problem with that: https://tex.stackexchange.com/questions/202020 – Dr. Manuel Kuehner Jan 31 '22 at 04:21
  • I have provided a manual solution and some references on how I calculated it. Maybe some of the better experts will provide an automated solution. – Dr. Manuel Kuehner Jan 31 '22 at 04:38

1 Answer

  • I provide a very manual solution because I am not skilled enough for a more automated solution.
  • I calculate the position of the fences manually according to Wikipedia (as of 2022-01-30), see citation block below.
  • Then I draw a dashed line manually.
  • Watch out: the \draw commands must come before \end{axis}!
  • You can use \draw[dashed, blue] (17, 0.6) -- (17, 1.4); if you want to change the color.
  • I commented out ytick = \empty (it appears as %ytick = \empty in the code) in order to see which values I need for the y positions (0.6 and 1.4).

The same data set can also be made into a box-plot through a different approach as shown in Figure 3. This time the boundaries of the whiskers are found within the 1.5 IQR value. From above the upper quartile (Q3), a distance of 1.5 times the IQR is measured out and a whisker is drawn up to the largest observed data point from the dataset that falls within this distance. Similarly, a distance of 1.5 times the IQR is measured out below the lower quartile (Q1) and a whisker is drawn down to the lowest observed data point from the dataset that falls within this distance. All other observed data points outside the boundary of the whiskers are plotted as outliers.

(Image source: https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51)


\documentclass{article}

\usepackage{tikz}
\usepackage{pgfplots}
\pgfplotsset{compat=1.18}
\usepgfplotslibrary{statistics}

\begin{document}

\begin{center}
  \begin{tikzpicture}
    \begin{axis}[
      xmin = 15, xmax = 35,
      xtick = {15, 20, 25, 30, 35},
      %ytick = \empty,
      width = 10cm, height = 5cm,
    ]
      \addplot+[
        boxplot prepared = {
          median = 25.5,
          upper quartile = 27, % Q3
          lower quartile = 23, % Q1
          upper whisker = 31,
          lower whisker = 17,
        },
      ] coordinates {};
      % https://en.wikipedia.org/wiki/Box_plot
      % IQR = 27 - 23 = 4
      % 1.5 * IQR = 6
      % Lower Fence = Q1 - 1.5 * IQR = 17
      % Upper Fence = Q3 + 1.5 * IQR = 33
      \draw[dashed] (17, 0.6) -- (17, 1.4);
      \draw[dashed] (33, 0.6) -- (33, 1.4);
    \end{axis}
  \end{tikzpicture}
\end{center}

\end{document}

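The question also asks for outliers, which the code above does not draw. The following is only a sketch of one possible extension, not a tested drop-in solution. It assumes that the points passed to an \addplot with boxplot prepared are drawn as outliers, which is how I read the box plot section of the pgfplots manual. The outlier values 15.5 and 34 are made up for illustration because the question does not give the real ones, and the macro names \Qone, \Qthree, \LowerFence, and \UpperFence are my own. The fences are computed with \pgfmathsetmacro instead of by hand.

```latex
\documentclass{article}

\usepackage{pgfplots}
\pgfplotsset{compat=1.18}
\usepgfplotslibrary{statistics}

% Quartiles taken from the question (macro names are hypothetical).
\newcommand{\Qone}{23}
\newcommand{\Qthree}{27}
% Fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR, with IQR = Q3 - Q1.
\pgfmathsetmacro{\LowerFence}{\Qone - 1.5 * (\Qthree - \Qone)}
\pgfmathsetmacro{\UpperFence}{\Qthree + 1.5 * (\Qthree - \Qone)}

\begin{document}

\begin{center}
  \begin{tikzpicture}
    \begin{axis}[
      xmin = 15, xmax = 35,
      xtick = {15, 20, 25, 30, 35},
      ytick = \empty,
      width = 10cm, height = 5cm,
    ]
      \addplot+[
        boxplot prepared = {
          median = 25.5,
          upper quartile = 27, % Q3
          lower quartile = 23, % Q1
          upper whisker = 31,
          lower whisker = 17,
        },
      ] table [row sep = \\, y index = 0] {
        data\\
        15.5\\
        34\\
      }; % made-up outlier values, replace with the real ones
      % Dashed fences at the computed positions.
      \draw[dashed] (\LowerFence, 0.6) -- (\LowerFence, 1.4);
      \draw[dashed] (\UpperFence, 0.6) -- (\UpperFence, 1.4);
    \end{axis}
  \end{tikzpicture}
\end{center}

\end{document}
```

With the quartiles from the question, the computed fences land at 17 and 33, the same values as in the manual calculation. Labels for the individual parts would still have to be added by hand, for example with \node commands placed before \end{axis}.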