8

In CLRS (on pages 49-50), what is the meaning of the following statement:

"$\sum_{i=1}^{n} O(i)$ is only a single anonymous function (of $i$), but is not the same as $O(1)+O(2)+\cdots+O(n)$, which doesn't really have an interpretation."

Ran G.
kesari
  • I tried to formulate your question more precisely; also note that we have latex support here so you can write nicely formatted math. I encourage you to be more specific: what exactly is confusing? What part is causing trouble? (Maybe you can then edit the title of the question accordingly as well). – Juho Jun 09 '14 at 13:46
  • 1
    see also related: https://cs.stackexchange.com/questions/366/what-goes-wrong-with-sums-of-landau-terms and https://cs.stackexchange.com/questions/2814/sums-of-landau-terms-revisited – Ran G. Jun 11 '14 at 06:37
  • 1
    Arguably, the expanded sum does not have an interpretation, either; you should write $O(\sum \dots)$ to begin with. – Raphael Jun 11 '14 at 07:29
  • 1
    Can anyone explain the intended meaning of $\sum_{i=1}^nO(i)$ ? Sum of $n$ functions of order "$i$" ? This makes little sense, as $O(i)=O(1)$. Sum of $n$ functions indexed by $i$ and of some order ?? –  Jun 11 '14 at 10:50

2 Answers

12

Since $1+2+\dots+n = O(n^2)$, it is tempting to suggest that $O(1)+O(2)+\dots+O(n) = O(n^2)$ ... but this is not in fact valid. The reason is that there might be a different constant for each term in the sum.

An example

Let me give an example. Consider the sums $S(1) = 1^2$, $S(2) = 1^2 + 2^2$, $S(3) = 1^2 + 2^2 + 3^2$, $S(4) = 1^2 + 2^2 + 3^2 + 4^2$, and so on. Note that $1^2 \in O(1)$, $2^2 \in O(2)$, $3^2 \in O(3)$, $4^2 \in O(4)$, and so on for each term in the sum. Therefore, it would be reasonable to write $S(j)=1^2 + \dots + j^2$ in the form $S(j) = O(1) + \dots + O(j)$. So can we conclude that $S(j) = O(j^2)$? Nope. In fact, $S(n) = n(n+1)(2n+1)/6$, so $S(n) = \Theta(n^3)$.
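As a quick numerical sanity check of the example above, here is a minimal Python sketch (the function name `S` simply mirrors the notation in the text). It compares the direct sum against the closed form and prints the ratios $S(n)/n^2$ and $S(n)/n^3$; the first keeps growing while the second settles near $1/3$, which is consistent with $S(n) = \Theta(n^3)$.

```python
def S(n: int) -> int:
    """Sum of the first n squares: 1^2 + 2^2 + ... + n^2."""
    return sum(i * i for i in range(1, n + 1))


for n in (10, 100, 1000, 10000):
    # The direct sum agrees with the closed form n(n+1)(2n+1)/6.
    assert S(n) == n * (n + 1) * (2 * n + 1) // 6
    # S(n)/n^2 is unbounded, S(n)/n^3 approaches 1/3.
    print(n, S(n) / n**2, S(n) / n**3)
```

A finite table like this proves nothing about asymptotics by itself, of course; it only illustrates the closed form stated above.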

If that doesn't help, let's try the following more precise mathematical development:

A formalization

Recall that the interpretation of, say, $O(n^2)$ is that it is a set of non-negative functions $f(n)$ (namely, the set of functions $f(n)$ such that there exist constants $c \ge 0, d\ge 0$ such that $f(n) \le c \cdot n^2$ for all $n\ge d$).

The closest we can come to an interpretation of $O(1) + O(2) + \dots + O(n)$ is that it is the set of functions of the form $f_1(n) + f_2(n) + \dots + f_n(n)$ such that $f_1(n) \in O(1)$, $f_2(n) \in O(2)$, ..., $f_n(n) \in O(n)$.

But now the constants for each $f_i$ can be different. Thus, each $f_i$ is a non-negative function for which there exist constants $c_i\ge 0,d_i \ge 0$ with $f_i(n) \le c_i \cdot i$ for all $n \ge d_i$.

Now, given this, what can we say about $g(n) = f_1(n) + f_2(n) + \dots + f_n(n)$? Not much useful. We know that there exists a constant $d=\max(d_1,d_2,\dots,d_n)$ such that $g(n) \le c_1 \cdot 1 + c_2 \cdot 2 + \dots + c_n \cdot n$ for all $n\ge d$. Now what can we say about this sum? Well, the answer is that we can't say anything at all. It could be arbitrarily large. It is tempting to let $c=\max(c_1,c_2,\dots,c_n)$ and say that $g(n) \le c \cdot (1+2+\dots+n) \le c \cdot n^2 = O(n^2)$... but this is not actually correct, since we need a single constant value of $c$ that works for all $n$, and the value $\max(c_1,c_2,\dots,c_n)$ is a function of $n$, not a constant.

So there might not be any constant $c$ such that $g(n) \le c \cdot (1+2+\dots+n)$; there might not be any constant $c$ such that $g(n) \le c \cdot n^2$. There is no guarantee that $g(n) \in O(n^2)$.
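To make the failure concrete, here is a small Python sketch under one assumed choice of functions: $f_i(n) = i^2$, so that $c_i = i$ works for each $f_i$ individually. The helper names `c`, `f`, `g`, `max_c` and `ratio` are hypothetical and only mirror the symbols used above. The sketch shows that $\max(c_1,\dots,c_n)$ grows with $n$, and that $g(n)/(1+2+\dots+n)$ is unbounded, so no single constant $c$ can work.

```python
def c(i: int) -> int:
    """Assumed per-term constant: c_i = i (a hypothetical choice)."""
    return i


def f(i: int, n: int) -> int:
    """f_i(n) = c_i * i = i^2; constant in n, so f_i is in O(i) for each fixed i."""
    return c(i) * i


def g(n: int) -> int:
    """g(n) = f_1(n) + f_2(n) + ... + f_n(n)."""
    return sum(f(i, n) for i in range(1, n + 1))


def max_c(n: int) -> int:
    """max(c_1, ..., c_n); this depends on n, so it is not a constant."""
    return max(c(i) for i in range(1, n + 1))


def ratio(n: int) -> float:
    """g(n) divided by 1 + 2 + ... + n."""
    return g(n) / (n * (n + 1) // 2)


for n in (10, 100, 1000):
    print(n, max_c(n), ratio(n))
```

With this choice, $g(n)$ is exactly the sum of squares from the earlier example, which is why the ratio keeps growing.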

For more reading

See https://math.stackexchange.com/q/86076/14578 and "Sums of Landau terms revisited" (https://cs.stackexchange.com/questions/2814/sums-of-landau-terms-revisited) for other questions that deal with this general issue.

D.W.
1

The reason that CLRS's comment is confusing is that, technically, $\sum_{i=1}^{n} O(i)$ is defined as $O(1) + O(2) + \ldots + O(n)$. What is really happening is that CLRS is abusing notation for the sake of simplicity:

  • $O(1)$ represents a set of functions. It includes, for example, $f(n)=1$, $f(n)=1/n$, and $f(n)=n^{1/n}$.
  • When you write $O(1) + O(2)$ you're technically adding two sets $O(1)$ and $O(2)$ with a sumset operation. When this is done with more than a constant number of terms, it can lead to unexpected behaviors, as D.W. clearly explains in another answer.

Instead, CLRS would like you to interpret $\sum_{i=1}^n O(i)$ as $\sum_{i=1}^n f(i)$ for a single generic function $f$ with $f(i) \in O(i)$. For example, they would write that $\sum_{i=1}^n (3i-5)$ is $\sum_{i=1}^n O(i)$, or $O(n^2)$.
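For the single-function reading, a short Python sketch can check the $3i-5$ example. The names `f` and `total` are hypothetical; the closed form $(3n^2 - 7n)/2$ follows from $3 \cdot n(n+1)/2 - 5n$. Because one function $f$ with one constant is summed, the total stays below $\tfrac{3}{2} n^2$ for every $n$ tested.

```python
def f(i: int) -> int:
    """The single anonymous function from the example: f(i) = 3i - 5."""
    return 3 * i - 5


def total(n: int) -> int:
    """Sum of f(i) for i = 1..n."""
    return sum(f(i) for i in range(1, n + 1))


for n in (1, 2, 10, 100, 1000):
    # Closed form: 3 * n(n+1)/2 - 5n = (3n^2 - 7n) / 2.
    assert total(n) == (3 * n * n - 7 * n) // 2
    # One constant (here 3/2) bounds the sum by c * n^2 for all n tested.
    assert total(n) <= 1.5 * n * n
    print(n, total(n), total(n) / n**2)
```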

Ari Trachtenberg
  • This explanation isn't quite right. There's nothing wrong with adding $O(1) + O(2)$. That's well-defined. $O(1)$ is a set of functions, $O(2)$ is a set of functions, and when $S,T$ are sets of functions, $S+T$ is normally understood to be the set of functions $\{f(n)+g(n) : f(n) \in S, g(n) \in T\}$. This is what is commonly intended when we add two big-Oh notations, and everything works out fine as long as you only add two (or a constant number of) big-Oh symbols. Where you get into trouble is when the number of addends is not a constant, as explained in my answer. – D.W. Jun 09 '14 at 19:41
  • I agree that this is the common definition of set addition and that it is well-defined, although I do not think that it is what is meant in common use. As you say correctly in your answer above, using set addition on more than a constant number of terms leads to problems. – Ari Trachtenberg Jun 09 '14 at 19:48
  • I prefer to define $O(f(n))$ as an anonymous element of a certain set of functions, rather than the set itself. Then $\sum_i O(i)$ means "$\sum_i f(i)$ for some function $f$ such that...", while $O(1)+O(2)+\cdots+O(n)$ means "$f_1(1)+f_2(2)+\cdots+f_n(n)$ for some functions $f_1,f_2,\dots,f_n$ such that...". Totally not the same thing. – JeffE Jun 11 '14 at 12:00