This is not "a baffling difficulty", certainly not for the originators of the umbral calculus, nor for anyone who understands that umbral evaluation (the lowering of exponents of an umbral quantity) cannot be performed until a final Taylor or power series in the umbral quantity has been reached, with the factors of the umbral variable in each summand aggregated, or collapsed, via the law of exponents.
Flagging the umbral quantity with a period in the subscript, we have
$$(a.+ a.)^n = (2a.)^n = 2^n a.^n$$
which is a power series in the aggregated umbral quantity, so equal to $2^n a_n$.
Now also--and this lies at the core of the umbral calculus--using the binomial expansion
$$(a.+a.)^n = \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} =\sum_{k=0}^n \binom{n}{k} a.^n$$
$$ = a.^n \sum_{k=0}^n \binom{n}{k} = a.^n 2^n$$
and this is also a power series in the aggregated umbral quantity which evaluates to $2^na_n$.
A tricky part is that $(a.+1)^0 =\sum_{k=0}^0 \binom{0}{k}a.^k =a.^0 =a_0$, which is not necessarily equal to $1$. Not tricky at all is $(a.+a.)^1 =\sum_{k=0}^1 \binom{1}{k}a.^ka.^{1-k} =a.^1 \sum_{k=0}^1 \binom{1}{k} = 2a_1 = a.^1 +a.^1$. But tricky again, because of habit again, is $(a.+b.)^1 = \sum_{k=0}^1 \binom{1}{k} a.^{1-k}b.^k = a_0b_1 + b_0 a_1$, which is not necessarily equal to $a.^1 +b.^1 = a_1+b_1$, as would naively be assumed from $(x+y)^1 =x^1y^0 + x^0y^1 = x+y$. So one must distinguish between $(a. +b.)^1$ and $a. + b. = a.^1 + b.^1 = a_1+b_1$, which means one must be explicit about the application of the binomial expansion if $a_0$ and $b_0$ are not both unity.
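A quick numerical check of these cases, as a sketch only: the sequences `a` and `b` are arbitrary values I picked so that $a_0, b_0 \neq 1$, and `eval_binomial` is a hypothetical helper name, not standard notation.

```python
from math import comb

# Illustrative sequences (assumed values), chosen with a_0 != 1 and b_0 != 1.
a = [3, 5, 7, 11]
b = [2, 4, 8, 16]
ones = [1, 1, 1, 1]  # the ordinary number 1 viewed as a sequence: 1^k = 1

def eval_binomial(a, b, n):
    """Evaluate (a. + b.)^n for distinct umbrae: sum_k C(n,k) a_{n-k} b_k."""
    return sum(comb(n, k) * a[n - k] * b[k] for k in range(n + 1))

print(eval_binomial(a, ones, 0))  # (a. + 1)^0 evaluates to a_0 = 3, not 1
print(eval_binomial(a, b, 1))     # (a. + b.)^1 = a_1 b_0 + a_0 b_1 = 22
print(a[1] + b[1])                # a. + b. = a_1 + b_1 = 9
```

The last two lines differ precisely because $a_0$ and $b_0$ are not unity here.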
Let's redo this with an explicit evaluation op $\langle \cdot \rangle$, linear and defined by $\langle a.^k \rangle = a_k$. Then
$$\langle (a.+ a.)^n \rangle = \langle (2a.)^n \rangle =\langle 2^n a.^n \rangle = 2^n \langle a.^n \rangle = 2^n a_n$$
and
$$\langle (a.+ a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle =\langle\sum_{k=0}^n \binom{n}{k} a.^n \rangle =\langle a.^n \sum_{k=0}^n \binom{n}{k}\rangle =\langle a.^n 2^n\rangle = 2^n\langle a.^n \rangle =2^na_n $$
or
$$\langle (a.+ a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle = \sum_{k=0}^n \binom{n}{k} \langle a.^k a.^{n-k} \rangle$$
$$ = \sum_{k=0}^n \binom{n}{k} \langle a.^n \rangle = \sum_{k=0}^n \binom{n}{k} a_n = 2^na_n.$$
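The collapse-then-evaluate order can be mimicked in a few lines of Python. This is a minimal sketch under my own assumptions: a polynomial in the single umbra $a.$ is stored as a dict `{exponent: coefficient}`, the helper names `mul`, `power`, and `evaluate` are hypothetical, and the sample sequence (Bell numbers) is an arbitrary choice.

```python
from math import comb

def mul(p, q):
    """Multiply polynomials in one umbra a.; exponents add (law of exponents)."""
    out = {}
    for i, c in p.items():
        for j, d in q.items():
            out[i + j] = out.get(i + j, 0) + c * d
    return out

def power(p, n):
    """Raise p to the n-th power, staying entirely at the umbral level."""
    out = {0: 1}
    for _ in range(n):
        out = mul(out, p)
    return out

def evaluate(p, a):
    """The evaluation <.>: linear, sends a.^k to a_k, applied only at the end."""
    return sum(c * a[k] for k, c in p.items())

a = [1, 1, 2, 5, 15, 52]  # sample sequence (Bell numbers), an arbitrary choice
n = 4

# Route 1: (a. + a.)^n = (2 a.)^n, collapsed before evaluating.
route_collapse = evaluate(power({1: 2}, n), a)

# Route 2: binomial expansion, each a.^k a.^(n-k) collapsed to a.^n first.
expansion = {}
for k in range(n + 1):
    term = mul({k: 1}, {n - k: 1})
    for e, c in term.items():
        expansion[e] = expansion.get(e, 0) + comb(n, k) * c
route_binomial = evaluate(expansion, a)

print(route_collapse, route_binomial)  # both are 2^4 * a_4 = 240
```

The point of the design is that `evaluate` never sees a product of two umbral factors; `mul` has already merged them into a single power.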
Problems ensue if you mistakenly adopt a blight of commutativity and/or distributivity, letting the evaluation pass through powers or split across products:
$$ \langle (a.+a.)^n\rangle = (\langle (a. + a.) \rangle)^n$$
$$ =(\langle 2a. \rangle)^n = (2a_1)^n = 2^na_1^n $$
which isn't equal to $2^na_n$ unless $a_1^n = a_n$,
or, even worse,
$$ \langle (a.+a.)^n \rangle = \langle \sum_{k=0}^n \binom{n}{k} a.^k a.^{n-k} \rangle $$
$$= \sum_{k=0}^n \binom{n}{k}\langle a.^k \rangle \langle a.^{n-k} \rangle =
\sum_{k=0}^n \binom{n}{k}a_k a_{n-k} .$$
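To see how far apart these results are, here is a small numerical comparison. It is only a sketch: the sequence (Bell numbers) and the choice $n=3$ are my own illustrative assumptions.

```python
from math import comb

a = [1, 1, 2, 5, 15, 52]  # sample sequence (Bell numbers), an arbitrary choice
n = 3

# Correct: collapse a.^k a.^(n-k) to a.^n, then evaluate.
correct = 2**n * a[n]

# Mistake 1: evaluating inside the power, <(a.+a.)^n> -> (<a.+a.>)^n.
mistake_power = (2 * a[1])**n

# Mistake 2: evaluating each factor separately, <a.^k a.^(n-k)> -> a_k a_(n-k).
mistake_product = sum(comb(n, k) * a[k] * a[n - k] for k in range(n + 1))

print(correct, mistake_power, mistake_product)  # 40 8 22
```

Note that the second mistaken sum has the same form as the expansion for two distinct umbrae with equal sequences, $(a.+b.)^n$ with $b_k = a_k$, which is a different object from $(a.+a.)^n$.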
These mistakes are the same as adopting the same blight for differentiation, as Leibniz initially did with
$$\partial_x \left( f(x)g(x) \right) = \left( \partial_x f(x) \right)\; \left( \partial_x g(x) \right),$$
following an algebraic gut feeling rather than a hybrid analytic-geometric intuition as Newton correctly did.
In fact, for $j,k,m,n =0,1,2,3,\cdots$, express the umbral substitution op as
$$Sub_{x^k \to a_k} = e^{a.\partial_{x=0}} = \sum_{k \geq 0} a_k \frac{\partial_{x=0}^k}{k!} $$
so that
$$Sub_{x^k \to a_k} x^m = e^{a.\partial_{x=0}}\; x^m = \sum_{k \geq 0} a_k \frac{\partial_{x=0}^k}{k!}\; x^m = a_m \; .$$
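The operator form can be checked directly on polynomials. A minimal sketch, assuming polynomials are stored as coefficient lists `[c_0, c_1, ...]`; the names `derivative` and `sub` are hypothetical, and the sequence (Bell numbers) is again an arbitrary choice.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [c_0, c_1, c_2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def sub(coeffs, a):
    """Apply sum_k a_k D^k / k! (derivatives taken at x = 0) to a polynomial."""
    total = Fraction(0)
    p = list(coeffs)
    k = 0
    while p:
        # p is the k-th derivative; p[0] is its value at x = 0
        total += Fraction(a[k] * p[0], factorial(k))
        p = derivative(p)
        k += 1
    return total

a = [1, 1, 2, 5, 15, 52]  # sample sequence (Bell numbers), an arbitrary choice

x_cubed = [0, 0, 0, 1]    # x^3
print(sub(x_cubed, a))    # a_3 = 5

poly = [7, 0, 4]          # 7 + 4 x^2
print(sub(poly, a))       # 7 a_0 + 4 a_2 = 15
```

Only the term $k=m$ survives on $x^m$, since $\partial_x^k x^m$ at $x=0$ is $m!$ when $k=m$ and zero otherwise; linearity then handles any polynomial.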
We would make the same mistakes if we assumed
$$Sub_{x^k \to a_k} \;x^j x^{n-j}=(Sub_{x^k \to a_k}\; x^j) \;Sub_{x^k \to a_k} \;x^{n-j}$$
$$ = e^{a.\partial_{x=0}}\;x^j x^{n-j}=\left (e^{a.\partial_{x=0}} \; x^j \right)\;e^{a.\partial_{x=0}} \; x^{n-j} = a_ja_{n-j} \;.$$
Similarly, we would erroneously conclude
$$Sub_{x^k \to a_k}\;(x + x )^n = ((Sub_{x^k \to a_k}\;x) + (Sub_{x^k \to a_k}\;x))^n$$
$$=e^{a.\partial_{x=0}}\,(x + x )^n =\left ( \left ( e^{a.\partial_{x=0}}\;x \right ) +\left (e^{a.\partial_{x=0}}\;x \right )\right )^n$$
$$ = (a_1+a_1)^n = 2^na_1^n.$$
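As a concrete instance of both pitfalls, take $n=2$, $j=1$ and, purely as an illustrative assumption, the sequence $a_k = k!$. Then
$$Sub_{x^k \to a_k}\; x \cdot x = Sub_{x^k \to a_k}\; x^2 = a_2 = 2 \quad \text{whereas} \quad \left(Sub_{x^k \to a_k}\; x\right)^2 = a_1^2 = 1,$$
$$Sub_{x^k \to a_k}\;(x+x)^2 = Sub_{x^k \to a_k}\; 4x^2 = 4a_2 = 8 \quad \text{whereas} \quad (a_1+a_1)^2 = 4a_1^2 = 4.$$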
So, this "baffling difficulty" is such only if application of commutativity and distributivity in the differential calculus is a baffling difficulty. The originators understood when and when not to apply commutativity or distributivity.
I'll be a little harsh and criticize Rota and Roman for either consciously constructing or unconsciously falling into a straw man argument so that they can plug for the Bourbakian 'new math' algebraic-purist approach by conjuring up purely formal linear functionals and jingoistically repeating a common mantra about the 'shadowy' umbral calculus. The originators of umbral calculus understood linear algebra before it was codified (by, or at least influenced by, many of the originators, such as Boole, Blissard, Sylvester, and Cayley). I must add that there is much to glean from Roman and Rota and their collaborators about the umbral / finite operator / Sheffer polynomial calculus, but hiding the derivative op $\partial_t$ as a formal linear functional, such as in MathWorld under Sheffer polynomials, is often an impediment rather than an aid to understanding.
For me, the diff op rep for substitution is clear and readily allows / suggests extension to $a.^s =a_s$ for $s$ real or even complex in many cases if one expresses
$$Sub_{x^n \to a_n} \; x^s = e^{-(1-a.) \partial_{x=1}} \; x^s = (1-(1-a.))^s = a_s$$
and twice uses the binomial expansion to obtain a Newton series in $a_n$ or uses the Mellin transform to interpret and analytically continue $a_s$. But that's another story.
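To give a taste of that other story, here is a rough sketch of the Newton-series route. Everything specific in it is my own assumption for illustration: the test sequence $a_n = 2^{-n}$ (chosen so the series converges), the truncation at 40 terms, and the helper names `gen_binom` and `newton_interpolate`. Convergence is not guaranteed for a general sequence.

```python
from fractions import Fraction
from math import comb

def gen_binom(s, k):
    """Generalized binomial coefficient C(s, k) for rational s, integer k >= 0."""
    out = Fraction(1)
    for i in range(k):
        out *= (s - i) / Fraction(i + 1)
    return out

def newton_interpolate(a, s, terms):
    """Truncated Newton series for a_s built from a_0, ..., a_(terms-1)."""
    total = Fraction(0)
    for k in range(terms):
        # <(1 - a.)^k> = sum_j (-1)^j C(k,j) a_j   (one binomial expansion)
        diff = sum((-1)**j * comb(k, j) * a[j] for j in range(k + 1))
        # (1 - (1 - a.))^s = sum_k (-1)^k C(s,k) (1 - a.)^k   (the other one)
        total += (-1)**k * gen_binom(s, k) * diff
    return total

# Assumed test sequence a_n = 2^(-n); exact rationals avoid cancellation error.
a = [Fraction(1, 2)**n for n in range(40)]

half = newton_interpolate(a, Fraction(1, 2), 40)
print(float(half))                               # approximately 0.7071, i.e. 2**-0.5
print(newton_interpolate(a, Fraction(3), 40))    # integer s recovers a_3 = 1/8 exactly
```

For this particular sequence the series is just the binomial series for $(1-\tfrac12)^s$, so the check is easy; for a general $a_n$ the same code gives only a truncated interpolation whose convergence has to be examined case by case.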
Btw, Sylvester coined the terms 'umbral calculus' and 'umbrae' to suggest that manipulations of indexed sequences shadow those for regular variables, such as demonstrated above (and as intuited much earlier by Leibniz w.r.t. differentiation, see The Theory of Linear Operators by Davis). He didn't intend for the pejorative 'shadowy = shady = edgy' to become attached. Although the power and elegance of umbral calculus have the feeling of witchcraft (which Rota later lamented was lost in his machinations), it is white magic not black, but always there are mischievous apprentices.
Quoting Rota:
Although the notation of Hopf algebra satisfied the most ardent advocate of spic-and-span rigor, the translation of “classical” umbral calculus into the newly found rigorous language made the method altogether unwieldy and unmanageable. Not only was the eerie feeling of witchcraft lost in the translation, but, after such a translation, the use of calculus to simplify computation and sharpen our intuition was lost by the wayside ... .