
I need to calculate the mean of a bunch of values generated by a function inside a loop. Is it possible?

\documentclass{article}
\usepackage{tikz}
\usepackage{pgfplots}
\begin{document}
\begin{tikzpicture}
\pgfmathsetseed{5411}
\def\N{5}
\node[]at(0,1){\textbf{Values}};
\foreach \n in {1,...,\N}
{
\pgfmathtruncatemacro{\value}{10*rand}
\node[]at(1+\n,1){$\value$};
%\pgfmathsomething{\mean}{"(sum of values)/N"}. 
}
\node[]at(0,0){\textbf{Mean}};
\node[]at(3,0){$\dots$};
%\node[]at(3,0){$\mean$};
\end{tikzpicture}

\end{document}

yngabl
  • Though you can almost certainly make an array-like structure (see PGF/TikZ: How to store strings in array?), you can also just keep a running total of your values, and then divide that total by N after the loop terminates. – Mike Renfro Jun 11 '18 at 19:04
  • The statistics part of the pgfplots manual (section 5.12) also has some means to compute means (or medians), e.g. through the \pgfmathprintnumber{\boxplotvalue{median}} syntax. –  Jun 11 '18 at 19:12
  • Unrelated, but: think about what you are doing. TeX is a typesetting program, with TikZ for drawing. It is not designed for complex mathematical tasks. While these are technically possible (see the math libraries of pgf and the floating-point support of the fpu library), there are programs better suited to the task. – Huang_d Jun 11 '18 at 19:24

3 Answers


This is a brute-force proposal that changes as little of your code as possible: it keeps a running total of the values inside the loop.

\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
\pgfmathsetseed{5411}
\def\N{5}
\node[]at(0,1){\textbf{Values}};
\pgfmathsetmacro{\mysum}{0}
\foreach \n in {1,...,\N}
{
\pgfmathtruncatemacro{\value}{10*rand}
\node[]at(1+\n,1){$\value$};
\pgfmathtruncatemacro{\mysum}{\mysum+\value}
\xdef\mysum{\mysum}
}
% After the loop: divide the running total by N to get the mean
\pgfmathsetmacro{\mean}{\mysum/\N}
\node[]at(0,0){\textbf{Mean}};
\node[]at(3,0){$\mean$};
\end{tikzpicture}
\end{document}


You were loading pgfplots but not using it. If you do use it, its statistics features could do these things more elegantly, but for the present situation that might be overkill.
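A note on the \xdef line, as a minimal sketch: \foreach runs each iteration inside a TeX group, so a plain local assignment to the running total is forgotten when the iteration ends. The standalone example below contrasts the two cases; the macro name \total and the values 1 to 4 are only for illustration and are not part of the code above.

```latex
\documentclass{article}
\usepackage{tikz} % provides \foreach and the pgfmath macros
\begin{document}

% Case 1: local assignment only. Each iteration starts again from 0.
\def\total{0}
\foreach \n in {1,...,4}{%
  \pgfmathtruncatemacro{\total}{\total+\n}%
}
Without a global assignment the total is \total. % prints 0

% Case 2: \xdef copies the updated value out of the group.
\def\total{0}
\foreach \n in {1,...,4}{%
  \pgfmathtruncatemacro{\total}{\total+\n}%
  \xdef\total{\total}%
}
With \verb|\xdef| the total is \total, % prints 10
and the mean is \pgfmathparse{\total/4}\pgfmathresult. % prints 2.5

\end{document}
```

The division is done once, after the loop, so only the total has to be made global.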


This is basically answered in a previous post given here, except that the data there was not created by a function inside a loop. I've updated that code as follows:

\documentclass{article}
\usepackage{sagetex}
\begin{document}
\begin{sagesilent}
MyData = []
for i in range(1,6):
  MyData += [i^2]
\end{sagesilent}

\noindent My data set is $S = \sage{MyData}$. For this data:\\
The sample size is $\sage{len(MyData)}$.\\
The mean is $\sage{mean(MyData)}$.\\
The median is $\sage{median(MyData)}$.\\
The minimum value is $\sage{min(MyData)}$.\\
The maximum value is $\sage{max(MyData)}$.\\
The standard deviation of the sample is $\sage{std(MyData)}$.\\
The sum of the data values is $\sage{sum(MyData)}$.
\end{document}


To use the sagetex package you need Sage installed locally on your computer or, if you don't want to do that, you can set up a free Cocalc account. Sage is built on Python, which makes the code easier to read and debug. Be aware that range(1,6) does not include the last item, 6, so there are 5 elements in the data set. By having a computer algebra system take care of the mathematics, you can quickly get many other statistics for your data set as well.

DJP

A fairly general method with expl3.

\documentclass{article}
\usepackage{xparse}
\usepackage{tikz}

\ExplSyntaxOn
\NewDocumentCommand{\setarray}{O{default}mm}
 {% #1 is the name of the array
  % #2 is the number of values
  % #3 is the format of the values
  \seq_clear_new:c { l_yngabl_array_#1_seq }
  \int_step_inline:nnnn { 1 } { 1 } { #2 }
   {
    \seq_put_right:cx { l_yngabl_array_#1_seq } { \fp_eval:n { #3 } }
   }
 }
\NewExpandableDocumentCommand{\getvalue}{O{default}m}
 {% #1 is the name of the array
  % #2 is the index of the item to retrieve
  \seq_item:cn { l_yngabl_array_#1_seq } { #2 }
 }
\NewExpandableDocumentCommand{\mean}{O{2}m}
 {% #1 is the optional number of decimal digits
  % #2 is the name of the array (empty for default)
  \fp_eval:n
   {
    round(
     (\seq_use:cn { \__yngabl_array:n { #2 } } { + })/
     \seq_count:c { \__yngabl_array:n { #2 } }
     ,#1)
   }
 }
\cs_new:Nn \__yngabl_array:n
 {
  l_yngabl_array_ \tl_if_blank:nTF { #1 } { default } { #1 } _seq
 }
\ExplSyntaxOff

\begin{document}

\begin{tikzpicture}
\setarray{8}{randint(10)}
\node [] at (0,1) {\textbf{Values}};
\foreach \n in {1,...,8}{
  \node [] at (1+\n,1) {$\getvalue{\n}$};
}
\node [] at (0,0) {\textbf{Mean}};
\node [] at (2,0) {$\mean{}$};
\end{tikzpicture}

\bigskip

\begin{tikzpicture}
\setarray{8}{round(rand(),3)}
\node [] at (0,1) {\textbf{Values}};
\foreach \n in {1,...,8}{
  \node [] at (1+\n,1) {$\getvalue{\n}$};
}
\node [] at (0,0) {\textbf{Mean}};
\node [] at (2,0) {$\mean[3]{}$};
\end{tikzpicture}

\bigskip

\begin{tikzpicture}
\setarray[powers]{8}{round(2^#1/42,2)}
\node [] at (0,1) {\textbf{Values}};
\foreach \n in {1,...,8}{
  \node [] at (1+\n,1) {$\getvalue[powers]{\n}$};
}
\node [] at (0,0) {\textbf{Mean}};
\node [] at (2,0) {$\mean{powers}$};
\end{tikzpicture}

\end{document}

As you see, the second mandatory argument to \setarray receives the instruction for computing a value. In this argument, #1 refers to the current array index when the computation is performed.
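To make the mechanism behind \mean easier to see, here is a stripped-down sketch of the same idea: the stored items are joined with + by \seq_use:Nn, and the resulting sum is divided by \seq_count:N inside \fp_eval:n. The names \l_demo_values_seq and \demomean and the fixed list of squares are hypothetical and chosen only for this illustration; they are not part of the code above.

```latex
\documentclass{article}
\usepackage{xparse} % only needed on older LaTeX kernels

\ExplSyntaxOn
% Hypothetical sequence holding the data (here: the squares 1^2 ... 5^2)
\seq_new:N \l_demo_values_seq
\seq_set_from_clist:Nn \l_demo_values_seq { 1, 4, 9, 16, 25 }

% Hypothetical command: (x1 + x2 + ... + xN) / N
\NewExpandableDocumentCommand{\demomean}{}
 {
  \fp_eval:n
   {
    ( \seq_use:Nn \l_demo_values_seq { + } ) /
    \seq_count:N \l_demo_values_seq
   }
 }
\ExplSyntaxOff

\begin{document}
The mean of the squares is \demomean. % (1+4+9+16+25)/5 = 11
\end{document}
```

With the full code above, the same data set should be obtainable with \setarray[squares]{5}{#1^2}, followed by \mean{squares}.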


egreg