Question about "baffling" umbral calculus result

Question

I am reading a paper here and I've come to a particular passage that is confusing me. It comes on page 2 of the attached paper and it deals with the binomial theorem... The passage lays the groundwork of how umbrals were linear functionals before the spread of linear algebra but the statement that confuses me is at the end of the paragraph and states:

Before knowledge of linear algebra became widespread, the action of a linear functional (written using physicist notation as $\langle L\mid x^n\rangle=a_n$) would be conceived of as raising the index $n$ to a power, and then "treating" the sequence $a_n$ as a sequence of powers $a^n$, while reserving the right to lower the index at the proper time. No precise rules for lowering indices were stated, nor could they be, as long as the underlying conceptual framework was missing. A baffling difficulty in the calculus of umbrae was the important $rule$ $$(a+a)^n=\sum_{k=0}^{n}\binom{n}{k}a^ka^{n-k}$$ which seemed to imply $a+a\neq 2a$.

It is the last statement that confuses me. Why does this implication arise? How does it follow from the rule?

That's a really quick proof that the coefficients sum to $2^n$ — David P, Jan 13 '15 at 01:05
I understand the proof. That is easy. But I wasn't understanding the underlying statement answered below. — Eleven-Eleven, Jan 13 '15 at 03:03

Tom Copeland · Accepted Answer · 2024-01-09T22:00:32.723

This is not "a baffling difficulty", certainly not for the originators of the umbral calculus, nor for anyone who understands that umbral evaluation (the lowering of exponents of an umbral quantity) cannot be performed until one has reached a final Taylor or power series in the umbral quantity, with the factors of the umbral variable in each summand aggregated, or collapsed, via the law of exponents.

Flagging the umbral quantity with a period in the subscript, we have

$$(a.+ a.)^n = (2a.)^n = 2^n a.^n$$

which is a power series in the aggregated umbral quantity, so equal to $2^n a_n$.

Now also--and this lies at the core of the umbral calculus--using the binomial expansion

$$(a.+a.)^n = \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} =\sum_{k=0}^n \binom{n}{k} a.^n$$

$$ = a.^n \sum_{k=0}^n \binom{n}{k} = a.^n 2^n$$

and this is also a power series in the aggregated umbral quantity which evaluates to $2^na_n$.
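As a sanity check, the two routes above can be sketched in a few lines of Python. This is only an illustration, not part of the original argument: the helper names (`umbral_eval`, `poly_mul`, `poly_pow`) and the sample sequence are hypothetical, and the umbral variable is modelled as an ordinary polynomial variable whose powers are lowered to indices only at the very end.

```python
from math import comb

def umbral_eval(coeffs, a):
    """Apply the linear functional x^k -> a[k] to sum_k coeffs[k] * x^k."""
    return sum(c * a[k] for k, c in enumerate(coeffs))

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, p)
    return result

a = [1, 1, 2, 5, 14]  # hypothetical sample sequence a_0..a_4
n = 4

# Route 1: collapse first, (x + x)^n = (2x)^n, then lower the exponent.
route1 = umbral_eval(poly_pow([0, 2], n), a)

# Route 2: binomial expansion, merging x^k * x^(n-k) into x^n
# via the law of exponents BEFORE evaluating.
expanded = [0] * (n + 1)
for k in range(n + 1):
    term = poly_mul(poly_pow([0, 1], k), poly_pow([0, 1], n - k))
    for i, c in enumerate(term):
        expanded[i] += comb(n, k) * c
route2 = umbral_eval(expanded, a)

print(route1, route2, 2**n * a[n])  # all three agree: 224 224 224
```

Both routes agree because the evaluation is applied once, to a polynomial that has already been fully collapsed.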

A tricky part is that $(a.+1)^0 =\sum_{k=0}^0 \binom{0}{k}a.^k =a.^0 =a_0$, which is not necessarily equal to $1$.

Not tricky at all is $(a.+a.)^1 =\sum_{k=0}^1 \binom{1}{k}a.^ka.^{1-k} =a.^1 \sum_{k=0}^1 \binom{1}{k} = 2a_1 = a.^1 +a.^1$.

But tricky again, because of habit again, is $(a.+b.)^1 = \sum_{k=0}^1 \binom{1}{k} a.^{1-k}b.^k = a_0b_1 + b_0 a_1$, which is not necessarily equal to $a.^1 +b.^1 = a_1+b_1$, as would naively be assumed from $(x+y)^1 =x^1y^0 + x^0y^1 = x+y$. So one must distinguish between $(a. +b.)^1$ and $a. + b. = a.^1 + b.^1 = a_1+b_1$, which means one must be explicit about the application of the binomial expansion if $a_0$ and $b_0$ are not both unity.
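A small numerical sketch of the two-umbrae case, assuming the usual convention that a monomial in distinct umbrae evaluates as $a.^j b.^k \to a_j b_k$ (the function name and the sample values are hypothetical):

```python
from math import comb

def umbral_binomial(a, b, n):
    """Evaluate (a. + b.)^n as sum_k C(n,k) * a_{n-k} * b_k for distinct umbrae."""
    return sum(comb(n, k) * a[n - k] * b[k] for k in range(n + 1))

# Hypothetical sequences with a_0 and b_0 different from 1.
a = [2, 3]   # a_0 = 2, a_1 = 3
b = [5, 7]   # b_0 = 5, b_1 = 7

first_power = umbral_binomial(a, b, 1)  # a_1*b_0 + a_0*b_1 = 15 + 14 = 29
naive_sum = a[1] + b[1]                 # a_1 + b_1 = 10

print(first_power, naive_sum)  # 29 10
```

With $a_0 = b_0 = 1$ the two values coincide, which is why the distinction is easy to overlook.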

Let's redo this with an explicit evaluation operator, $\langle a.^k \rangle = a_k$:

$$\langle (a.+ a.)^n \rangle = \langle (2a.)^n \rangle =\langle 2^n a.^n \rangle = 2^n \langle a.^n \rangle = 2^n a_n$$

and

$$\langle (a.+ a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle =\langle\sum_{k=0}^n \binom{n}{k} a.^n \rangle =\langle a.^n \sum_{k=0}^n \binom{n}{k}\rangle =\langle a.^n 2^n\rangle = 2^n\langle a.^n \rangle =2^na_n $$

or

$$\langle (a.+ a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle = \sum_{k=0}^n \binom{n}{k} \langle a.^k a.^{n-k} \rangle$$

$$ = \sum_{k=0}^n \binom{n}{k} \langle a.^n \rangle = \sum_{k=0}^n \binom{n}{k} a_n = 2^na_n.$$

Problems ensue if you mistakenly adopt a blight of commutativity and/or distributivity:

$$ \langle (a.+a.)^n\rangle = (\langle (a. + a.) \rangle)^n$$

$$ =(\langle 2a. \rangle)^n = (2a_1)^n = 2^na_1^n $$

which isn't equal to $2^na_n$ unless $a_1^n = a_n$,

or, even worse,

$$ \langle (a.+a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle $$ $$= \sum_{k=0}^n \binom{n}{k}\langle a.^k \rangle \langle a.^{n-k} \rangle = \sum_{k=0}^n \binom{n}{k}a_k a_{n-k} .$$
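To see how far apart the correct and the mistaken evaluations can land, here is a minimal sketch comparing the three results numerically. The function name and the sample sequences are hypothetical, chosen only for illustration.

```python
from math import comb

def three_evaluations(a, n):
    """Return (correct, power_of_eval, termwise) for the expression (a. + a.)^n."""
    # Correct: collapse to 2^n a.^n, then lower the exponent once.
    correct = 2**n * a[n]
    # Mistake 1: evaluate inside the power, giving (2 a_1)^n.
    power_of_eval = (2 * a[1])**n
    # Mistake 2: evaluate each factor separately, giving sum C(n,k) a_k a_{n-k}.
    termwise = sum(comb(n, k) * a[k] * a[n - k] for k in range(n + 1))
    return correct, power_of_eval, termwise

# A generic (hypothetical) sequence: the three results differ.
print(three_evaluations([1, 1, 2, 5, 14], 4))   # (224, 16, 92)

# A geometric sequence a_n = 3^n: here a_1^n = a_n, so all three coincide.
print(three_evaluations([1, 3, 9, 27, 81], 4))  # (1296, 1296, 1296)
```

The geometric case is exactly the situation in which the indices really do behave like powers, so the shortcuts happen to be harmless there.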

These mistakes are akin to adopting the same blight for differentiation, as Leibniz initially did with

$$\partial_x f(x)g(x) = \partial_x f(x)\; \partial_x g(x),$$

following an algebraic gut feeling rather than a hybrid analytic-geometric intuition as Newton correctly did.

In fact, for $j,k,m,n =0,1,2,3,\cdots$, express the umbral substitution op as

$$Sub_{x^k \to a_k} = e^{a.\partial_{x=0}} = \sum_{k \geq 0} a_k \frac{\partial_{x=0}^k}{k!} $$

so that

$$Sub_{x^k \to a_k} x^m = e^{a.\partial_{x=0}}\; x^m = \sum_{k \geq 0} a_k \frac{\partial_{x=0}^k}{k!}\; x^m = a_m \; .$$
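The action of this operator on polynomials can be checked with a short sketch. It assumes polynomials are stored as coefficient lists (index = power) and that the sequence `a` is at least as long as the polynomial; the names `derivative` and `sub_op` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(p):
    """Differentiate a polynomial given as a coefficient list (index = power)."""
    return [k * c for k, c in enumerate(p)][1:]

def sub_op(p, a):
    """Apply sum_k a_k * (d/dx)^k / k!, evaluated at x = 0, to the polynomial p."""
    total = Fraction(0)
    dk = list(p)  # holds the k-th derivative of p, starting with k = 0
    for k in range(len(p)):
        value_at_zero = dk[0] if dk else 0  # constant term = value at x = 0
        total += Fraction(a[k] * value_at_zero, factorial(k))
        dk = derivative(dk)
    return total

a = [1, 1, 2, 5, 14]  # hypothetical sample sequence a_0..a_4

print(sub_op([0, 0, 0, 1], a))       # x^3         -> a_3 = 5
print(sub_op([3, 0, 2], a))          # 3 + 2x^2    -> 3*a_0 + 2*a_2 = 7
print(sub_op([0, 0, 0, 0, 16], a))   # (x+x)^4 = 16x^4 -> 16*a_4 = 224
```

The operator is linear in the polynomial it acts on, but nothing in its definition makes it multiplicative.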

We would make the same mistakes if we assumed

$$Sub_{x^k \to a_k} \;x^j x^{n-j}=(Sub_{x^k \to a_k}\; x^j) \;Sub_{x^k \to a_k} \;x^{n-j}$$

$$ = e^{a.\partial_{x=0}}\;x^j x^{n-j}=\left (e^{a.\partial_{x=0}} \; x^j \right)\;e^{a.\partial_{x=0}} \; x^{n-j} = a_ja_{n-j} \;.$$

Similarly, we would erroneously conclude

$$Sub_{x^k \to a_k}\;(x + x )^n = ((Sub_{x^k \to a_k}\;x) + (Sub_{x^k \to a_k}\;x))^n$$

$$=e^{a.\partial_{x=0}}\,(x + x )^n =\left ( \left ( e^{a.\partial_{x=0}}\;x \right ) +\left (e^{a.\partial_{x=0}}\;x \right )\right )^n$$

$$ = (a_1+a_1)^n = 2^na_1^n.$$

So, this "baffling difficulty" is such only if application of commutativity and distributivity in the differential calculus is a baffling difficulty. The originators understood when and when not to apply commutativity or distributivity.

I'll be a little harsh and criticize Rota and Roman for either consciously constructing or unconsciously falling into a straw man argument so that they can plug for the Bourbakian 'new math' algebraic-purist approach by conjuring up purely formal linear functionals and jingoistically repeating a common mantra about the 'shadowy' umbral calculus. The originators of umbral calculus understood linear algebra before it was codified (by, or at least influenced by, many of the originators, such as Boole, Blissard, Sylvester, and Cayley). I must add that there is much to glean from Roman and Rota and their collaborators about the umbral / finite operator / Sheffer polynomial calculus, but hiding the derivative op $\partial_t$ as a formal linear functional, such as in MathWorld under Sheffer polynomials, is often an impediment rather than an aid to understanding.

For me, the diff op rep for substitution is clear and readily allows / suggests extension to $a.^s =a_s$ for $s$ real or even complex in many cases if one expresses

$$Sub_{x^n \to a_n} \; x^s = e^{-(1-a.) \partial_{x=1}} \; x^s = (1-(1-a.))^s = a_s$$

and twice uses the binomial expansion to obtain a Newton series in $a_n$ or uses the Mellin transform to interpret and analytically continue $a_s$. But that's another story.
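A rough numerical sketch of that Newton series, under stated assumptions: expanding $(1-(1-a.))^s$ and then $(1-a.)^k$ by the binomial theorem gives $a_s = \sum_{k \ge 0} (-1)^k \binom{s}{k} \sum_{j=0}^{k} (-1)^j \binom{k}{j} a_j$. The function names are hypothetical, the series is simply truncated, and convergence for non-integer $s$ is only illustrated here on a geometric test sequence, not claimed in general.

```python
from math import comb

def gen_binom(s, k):
    """Generalized binomial coefficient s(s-1)...(s-k+1)/k! for real s."""
    out = 1.0
    for i in range(k):
        out = out * (s - i) / (i + 1)
    return out

def newton_series(a, s, terms):
    """Truncated Newton series for a_s; a is a function j -> a_j."""
    total = 0.0
    for k in range(terms):
        # Inner binomial expansion of (1 - a.)^k, evaluated via a.^j -> a_j.
        inner = sum((-1)**j * comb(k, j) * a(j) for j in range(k + 1))
        total += (-1)**k * gen_binom(s, k) * inner
    return total

# Integer s: the series terminates and reproduces a_n exactly.
seq = [1, 1, 2, 5, 14]  # hypothetical sample sequence
print(newton_series(lambda j: seq[j], 4, 5))         # 14.0

# Non-integer s on the geometric sequence a_j = 0.5^j: approximates 0.5^0.5.
print(newton_series(lambda j: 0.5**j, 0.5, 30))
print(0.5**0.5)
```

For integer $s = n$ this is just binomial inversion; the interesting content of the remark is the continuation to non-integer $s$, which depends on the growth of the sequence.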

Btw, Sylvester coined the terms 'umbral calculus' and 'umbrae' to suggest that manipulations of indexed sequences shadow those for regular variables, such as demonstrated above (and as intuited much earlier by Leibniz w.r.t. differentiation, see The Theory of Linear Operators by Davis). He didn't intend for the pejorative 'shadowy = shady = edgy' to become attached. Although the power and elegance of umbral calculus have the feeling of witchcraft (which Rota later lamented was lost in his machinations), it is white magic not black, but always there are mischievous apprentices.

Quoting Rota:

Although the notation of Hopf algebra satisfied the most ardent advocate of spic-and-span rigor, the translation of “classical” umbral calculus into the newly found rigorous language made the method altogether unwieldy and unmanageable. Not only was the eerie feeling of witchcraft lost in the translation, but, after such a translation, the use of calculus to simplify computation and sharpen our intuition was lost by the wayside ... .

Actually, that was probably more the provocative Rota's sentiment than Roman's, and his use of the term linear functional is not clear to me although having a background in physics I'm familiar with the bra-ket notation of Dirac. Maybe I'm abusing that term, but I'm decrying a formal symbol replacing what is a diff op. The diff op rep is extremely elegant and useful in expressing, intuiting, proving, and generalizing theorems, more so than a symbol representing a functional mapping between domain and range / codomain. In this sense, abstraction can be an impediment to abstraction. — Tom Copeland, Jul 01 '23 at 22:44
See Doron Zeilberger's review of Roman's book and note the nonintuitive, ad hoc approach based on the first equation on p. 75 of the review. — Tom Copeland, Jul 20 '23 at 17:26

score 4 · Answer 2 · answered Jan 12 '15 at 22:17

I believe the author is writing about confusing the term $a_n$ with the term $a^n$ and vice versa. Interpreted as powers, the binomial theorem is a classical result, but if you interpret the powers as indices, it seems to indicate that the $n$-th component of the sum of a sequence with itself is not the same as twice the $n$-th component of this sequence.

j4GGy

Now I think I understand where the author is coming from. Thank you. – Eleven-Eleven Jan 13 '15 at 03:07
