
Let $R$ be a commutative ring with unity element $1$. Let $f(x)=r_0+r_1x+\dots+r_nx^n\in R[x]$ and define its derivative as $f'(x)=r_1+2r_2x+\dots+nr_nx^{n-1}$. Prove that $(f+g)'(x)=f'(x)+g'(x)$ and that $(fg)'(x)=f'(x)g(x)+f(x)g'(x)$.

So I'm pretty sure I worked out the addition, but am struggling with the multiplication.

Let $f(x),g(x)\in R[x]$ with $f(x)=r_0+r_1x+r_2x^2+\dots+r_nx^n$ and $g(x)=s_0+s_1x+s_2x^2+\dots+s_nx^n$ (padding with zero coefficients if necessary so both run up to the same index $n$).

$$\begin{align} (f+g)'(x)&=\bigl((r_0+s_0)+(r_1+s_1)x+\dots+(r_n+s_n)x^n\bigr)' \\ &=(r_1+s_1)+2(r_2+s_2)x+\dots+n(r_n+s_n)x^{n-1} \\ &=(r_1+s_1)+(2r_2+2s_2)x+\dots+(nr_n+ns_n)x^{n-1} \\ &=f'(x)+g'(x) \end{align}$$

For the multiplication, this is what I've got so far...

$$(fg)(x)=r_0s_0+(r_0s_1+r_1s_0)x+\dots+m_{2n}x^{2n}$$ where $$m_j=\sum_{i+k=j} r_is_k$$ for $j=0,1,2,\dots,2n$,

$$(fg)'(x)=(r_0s_1+r_1s_0)+2(r_0s_2+r_1s_1+r_2s_0)x+\dots+2n\,m_{2n}x^{2n-1}$$ $$=(r_0s_1+r_1s_0)+(2r_0s_2+2r_1s_1+2r_2s_0)x+\dots+2n\,m_{2n}x^{2n-1}$$

This is where I lose it, if I'm even correct up to this point...

  • Hint: try just proving it for monomials, since you already proved that the derivative is linear/additive. That is, just prove it in the special case that $f = ax^n$ and $g = bx^m$. – Nick May 20 '19 at 20:21

4 Answers


Man. I remember doing proofs in this vein for my own algebra course and they were a bit of a pain. So I kind of feel you. In any event, my approach would be slightly different for brevity's sake if nothing else. I'll begin by establishing the basics and a concept you may (or may not) have learned thus far in your course, so this might take some time.

I don't know if this will really help you, but I found this approach a lot easier to work with.


Basics:

We can define a polynomial $f \in R[x]$ as a sort of "infinite vector" of coefficients, like so

$$\sum_i a_i x^i = f = (a_0,a_1,a_2,...,a_n,0,0,0,...)$$

where $a_n \ne 0$, $a_i \in R \; \forall i$, and $\deg(f) = n$. Define a second polynomial $g$ similarly:

$$\sum_i b_i x^i = g = (b_0,b_1,b_2,...,b_m,0,0,0,...)$$

where $b_m \ne 0$ and $\deg(g) = m$ (which may or may not be $n$). Then we can define polynomial multiplication either as a double sum...

$$fg = \sum_{n \ge 0} \left( \sum_{i+j = n} a_i b_j \right) x^n$$

...or as a "coefficient vector" ...

$$fg = (c_0,c_1,c_2,...,c_{m+n},0,0,...)$$

where

$$c_n = \sum_{i+j = n} a_i b_j $$

Finally, we can say that $f=g$ if $a_i = b_i \; \forall i$ and, of course, they're polynomials from the same ring (equivalently, the coefficient vectors are the same).

I think that establishes all the basics. You don't have to use the coefficient "vector" analogue if you don't want to: if $f,g\in R[x]$, then $f=g$ if and only if their coefficients are the same. Similarly, you can pick out generic coefficients, as we do later, without appealing to the "vector" analogue, but the vector picture might be easier to conceptualize. I'm not sure there's much motivation for it beyond that, other than being slightly easier to type up.
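If it helps to see these definitions in action, here is a minimal sketch (my addition, not part of the proof) with polynomials as plain Python lists of coefficients; the names `poly_mul` and `poly_deriv` are just illustrative, and integer coefficients stand in for elements of a general commutative ring $R$.

```python
# Minimal sketch: polynomials as coefficient lists, with the convolution
# product and formal derivative defined exactly as above.

def poly_mul(a, b):
    """Coefficient list of fg, where c_n = sum over i+j=n of a_i * b_j."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def poly_deriv(a):
    """Formal derivative: (a_0, a_1, a_2, ...) -> (a_1, 2*a_2, 3*a_3, ...)."""
    return [k * ak for k, ak in enumerate(a)][1:]
```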


Digression on Indices:

As you might have noticed we're playing a little loose with indices here and in particular not defining strict upper/lower bounds. I'm doing this because I imagine you're more than familiar enough with this stuff conceptually to know that, for the individual polynomials, the sum runs on the nonnegative integers from $0$ to the degree of the polynomial. Arguably if we wanted to, we could reference later terms with indices greater than the degree and still have this all work out nicely, since they're just zero. For example, $a_{n+k}$ for $k \ge 1$ is always $0$ so nothing really changes. So we don't really have to stop at a particular upper bound either.

I'm doing it this way because one of the main problems I had in my time dealing with polynomials was always finagling with the indices. I would sometimes spend more time trying to get the indices worked out instead of just digging into the proof (it cost me some time on tests).

So I ended up just deciding to do this work in a relatively loose manner instead since the core is the same. It makes a lot of the indexing easier to work with, conceptualize, and just comprehend, and I believe it's fully justified.


Onto the Derivatives:

Define $f',g'$ by

$$f' = (a_1,2a_2,3a_3,...,ka_k,...,na_n,0,0,...) \;\;\;\;\; g' = (b_1,2b_2,3b_3,...,kb_k,...,mb_m,0,0,...)$$

where $f,g$ had degrees $n,m$ respectively and definitions as prescribed earlier. Consider a generic coefficient from the "vector" for $fg$:

$$c_k = \sum_{i+j = k} a_i b_j $$

Then in $(fg)'$, the coefficient of $x^{k-1}$ is $kc_k$, from the definition of the derivative. To make things even clearer: if we let $(fg)'=(\delta_1,\delta_2,...,\delta_{m+n},0,0,...)$, where $\delta_k$ denotes the coefficient of $x^{k-1}$, then

$$\delta_k=kc_k=\sum_{i+j = k} k a_i b_j$$

We want to show that this $k^{th}$ coefficient is equal to the $k^{th}$ one in $f'g+fg'$.

Let $f'g+fg'=(d_1,d_2,...,d_{m+n},0,...)$, with $d_k$ again the coefficient of $x^{k-1}$. Then we want to show $\delta_k = d_k$.

From the definitions of polynomial multiplication and addition (addition is just performed "component-wise" on the "vectors"), we see

$$d_k = \underbrace{\sum_{i+j = k} ia_i \cdot b_j}_{\text{from f'g}} + \underbrace{\sum_{i+j = k} a_i\cdot jb_j}_{\text{from fg'}}$$

This is where the benefits of the "looser" indices become apparent. We can immediately combine the sums, since they run over the same index set. Then, using the ring axioms and commutativity, we can move $i$ and $j$ around, factor, and use the constraint $i+j=k$. Symbolically,

$$\begin{align} d_k &= \sum_{i+j = k} ia_i \cdot b_j + \sum_{i+j = k} a_i\cdot jb_j \\ &= \sum_{i+j = k} ia_i b_j + a_ijb_j \\ &= \sum_{i+j = k} ia_i b_j + ja_ib_j\\ &= \sum_{i+j = k} (i+j)a_i b_j\\ &= \sum_{i+j = k} ka_i b_j \end{align}$$

Now, recall, from our earlier discussion: $\delta_k$, which comes from $(fg)'$, has the definition

$$\delta_k = \sum_{i+j = k} k a_i b_j$$

Thus, $\delta_k = d_k$. We thus conclude: $(fg)'=f'g+fg'$.
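For a concrete sanity check of this identity, here is a short continuation of the earlier sketch (again my addition); `poly_add` is a hypothetical helper for componentwise addition, and `f`, `g` are arbitrary example coefficient lists.

```python
# Quick check of (fg)' = f'g + fg' on example data, reusing poly_mul and
# poly_deriv from the sketch above.

def poly_add(a, b):
    # Componentwise addition, padding the shorter list with zeros.
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

f = [2, 0, 5, 1]   # 2 + 5x^2 + x^3
g = [3, 4, 7]      # 3 + 4x + 7x^2

lhs = poly_deriv(poly_mul(f, g))
rhs = poly_add(poly_mul(poly_deriv(f), g), poly_mul(f, poly_deriv(g)))
assert lhs == rhs  # (fg)' equals f'g + fg'
```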

PrincessEev
  • I have been introduced to this notation before jumping into polynomials expressed as I have above. Thank goodness this actually makes sense to me. I appreciate it! – Mather Guy May 20 '19 at 23:00

First show that the Leibniz rule works for monomials. That is, let $f(x) = rx^a$ and $g(x) = sx^b$. Then $(fg)(x) = rsx^{a+b}$ so that \begin{align*}(fg)'(x) &= (a+b)rsx^{a+b-1} \\ &= arx^{a-1}sx^{b} + bsx^{b-1}rx^{a} \\&= f'(x)g(x) + g'(x)f(x).\end{align*} Since you have shown that the addition rule works on finite sums, note that $(fg)(x) = \sum_{i=0}^n \sum_{j=0}^m f_i(x)g_j(x)$ is a finite sum of monomial products (where $f_i(x) = r_ix^i$ and $g_j(x) = s_jx^j$). Thus, \begin{align*}(fg)'(x) &= \sum_{i=0}^n \sum_{j=0}^m f_i'(x)g_j(x) + g_j'(x)f_i(x) \\&= f'(x)g(x) + g'(x)f(x).\end{align*} This last equality holds because, evidently, $f'(x) = \sum_{i=0}^n f_i'(x)$ and similarly $g'(x) = \sum_{j=0}^m g_j'(x)$, so that rearranging terms in the finite sum yields the desired result.
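As a side note (my addition, not part of the answer), the monomial case is easy to check symbolically, e.g. with SymPy. Here `diff` is SymPy's analytic derivative, which agrees with the formal derivative on polynomials, and the exponents `a`, `b` are arbitrary example values.

```python
# Symbolic check of the Leibniz rule for monomials f = r*x**a, g = s*x**b.
from sympy import symbols, diff, expand

x, r, s = symbols('x r s')
a, b = 3, 5  # example exponents; any nonnegative integers work

f = r * x**a
g = s * x**b

lhs = expand(diff(f * g, x))
rhs = expand(diff(f, x) * g + f * diff(g, x))
assert lhs == rhs  # (fg)' == f'g + fg' for monomials
```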

Kyle
  • I think you mean the Leibniz rule, not the chain rule, right? (Although you should also be able to show the chain rule works using a similar argument - first, show $(f^n)'(x) = n f(x)^{n-1} f'(x)$ by induction using the Leibniz rule, i.e. the chain rule works for $g \circ f$ when $g$ is a single power term; and then extend to general polynomials $g$ by linearity.) – Daniel Schepler Jul 12 '23 at 16:30
  • @DanielSchepler thanks for pointing that out—you're right of course. I've edited the answer. – Kyle Nov 07 '23 at 22:28

You can see the derivative in the following way. Consider the ring $R[x,y]$ and form $f(x+y)-f(x)$. This polynomial vanishes for $y=0$, so it is divisible by $y$ in $R[x,y]$. Call $\hat{f}(x,y)$ the quotient, so that $f(x+y)-f(x)=y\hat{f}(x,y)$, and set $$ f'(x)=\hat{f}(x,0) $$ It is immediate, by the binomial theorem, that if $f(x)=x^n$, for $n>0$, then $f'(x)=nx^{n-1}$, and that the derivative of a constant polynomial is $0$. It is also immediate, by a direct computation, that the map $f\mapsto f'$ is linear. Thus it is the standard formal derivative.

Now take $h(x)=f(x)g(x)$. Then \begin{align} h(x+y)-h(x) &=f(x+y)g(x+y)-f(x)g(x) \\[4px] &=f(x+y)g(x+y)-f(x)g(x+y)+f(x)g(x+y)-f(x)g(x) \\[4px] &=\bigl(f(x+y)-f(x)\bigr)g(x+y)+f(x)\bigl(g(x+y)-g(x)\bigr) \\[4px] &=y\hat{f}(x,y)g(x+y)+f(x)y\hat{g}(x,y) \\[4px] &=y\bigl(\hat{f}(x,y)g(x+y)+f(x)\hat{g}(x,y)\bigr) \end{align} showing that $$ \hat{h}(x,y)=\hat{f}(x,y)g(x+y)+f(x)\hat{g}(x,y) $$ Now evaluate at $y=0$ both sides, to get $$ \hat{h}(x,0)=\hat{f}(x,0)g(x+0)+f(x)\hat{g}(x,0) $$ which is to say $$ h'(x)=f'(x)g(x)+f(x)g'(x) $$
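This construction is easy to experiment with; here is a small SymPy sketch (my addition), where `f` is just an example polynomial and `cancel` performs the exact division of $f(x+y)-f(x)$ by $y$.

```python
# Compute fhat(x, y) = (f(x+y) - f(x)) / y and recover f'(x) as fhat(x, 0).
from sympy import symbols, expand, cancel, diff

x, y = symbols('x y')
f = 4*x**3 + 2*x + 7  # example polynomial

fhat = cancel((f.subs(x, x + y) - f) / y)        # exact polynomial quotient
assert expand(fhat.subs(y, 0)) == expand(diff(f, x))
```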

egreg

Let $R$ be a commutative, unital ring and let $S=R[x_i\mid i\in I]$ be the polynomial ring over $R$ with variables indexed by $I$. Let $j\in I$. Since $S$ is a free $R$-module with basis $\mathcal{B}=\{1\}\cup\bigcup_{n=1}^{+\infty}\{x_{i_1}\cdots x_{i_n}\mid (i_1,\dots,i_n)\in I^n\}\subset S$, we can define an $R$-linear map $\frac{\partial}{\partial x_j}:S\to S$ by sending each monomial $f\in\mathcal{B}$ to its formal derivative $\frac{\partial f}{\partial x_j}$ (computed by the usual power rule) and extending $R$-linearly.

Lemma. The map $\frac{\partial }{\partial x_j}:S\to S$ satisfies Leibniz's rule.

Proof by Calculus II:

$\underline{\text{First step:}}$ We show that the map $\frac{\partial}{\partial x_j}:S\to S$ satisfies Leibniz's rule if $R=\mathbb{R}$ and $I$ is finite. We have an obvious real-linear map $\operatorname{ev}:\mathbb{R}[x_1,\dots,x_n]\to C^\infty(\mathbb{R}^n,\mathbb{R})$ from the polynomial ring to the real algebra of infinitely-differentiable functions $\mathbb{R}^n\to\mathbb{R}$. On the one hand, $\operatorname{ev}$ is injective because $\mathbb{R}$ is infinite and we can apply this result; on the other hand, the diagram $$ \require{AMScd} \begin{CD} \mathbb{R}[x_1,\dots,x_n]@>{\dfrac{\partial}{\partial x_j}}>>\mathbb{R}[x_1,\dots,x_n]\\ @VVV@VVV\\ C^\infty(\mathbb{R}^n,\mathbb{R})@>>{\dfrac{\partial}{\partial x_j}}>C^\infty(\mathbb{R}^n,\mathbb{R}) \end{CD} $$ commutes. Since the bottom map satisfies Leibniz's rule (by Calculus II), so does the top map (the vertical maps are injective).

$\underline{\text{Second step:}}$ We will reduce the problem over arbitrary $R$ to the first step.

Exercise. Suppose we have a commutative diagram $$ \require{AMScd} \begin{CD} A@>{D}>>M\\ @VVV@VVV\\ B@>>{D'}>N \end{CD} $$ where

  1. $A\to B$ is a ring homomorphism,
  2. $M$ is an $A$-module, $N$ is a $B$-module,
  3. $D$, $D'$ are additive,
  4. $A\to B$ is onto and
  5. $M\to N$ is $A$-linear.

Then, if $D$ satisfies Leibniz's rule, so does $D'$.

Now, take a surjection $R_0=\mathbb{Z}[y_j\mid j\in J]\to R$. Then we have a commutative diagram $$ \require{AMScd} \begin{CD} R_0[x_i\mid i\in I]@>{\dfrac{\partial}{\partial x_j}}>>R_0[x_i\mid i\in I]\\ @VVV@VVV\\ R[x_i\mid i\in I]@>>{\dfrac{\partial}{\partial x_j}}>R[x_i\mid i\in I] \end{CD} $$ From the exercise, it suffices to show that the top map satisfies the Leibniz rule. Since $R_0[x_i\mid i\in I]=\mathbb{Z}[x_i,y_j\mid i\in I, j\in J]$, relabeling variables, it suffices to show that $\frac{\partial}{\partial x_j}:\mathbb{Z}[x_i\mid i\in I]\to\mathbb{Z}[x_i\mid i\in I]$ satisfies the Leibniz rule. In turn, since elements of $\mathbb{Z}[x_i\mid i\in I]$ are finite sums of monomials, each involving only finitely many variables, it suffices to show that $\frac{\partial}{\partial x_j}:\mathbb{Z}[x_1,\dots,x_n]\to\mathbb{Z}[x_1,\dots,x_n]$ satisfies the Leibniz rule. But $$ \require{AMScd} \begin{CD} \mathbb{Z}[x_1,\dots,x_n]@>{\dfrac{\partial}{\partial x_j}}>>\mathbb{Z}[x_1,\dots,x_n]\\ @VVV@VVV\\ \mathbb{R}[x_1,\dots,x_n]@>>{\dfrac{\partial}{\partial x_j}}>\mathbb{R}[x_1,\dots,x_n] \end{CD} $$ commutes and the vertical maps are injective, and we win. $\square$
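Just to make the last step concrete, here is a tiny SymPy check (my addition) of the Leibniz rule for $\partial/\partial x_1$ on example polynomials with integer coefficients in $\mathbb{Z}[x_1,x_2]$.

```python
# Leibniz rule for d/dx1 on example integer polynomials in Z[x1, x2].
from sympy import symbols, diff, expand

x1, x2 = symbols('x1 x2')
f = 3*x1**2*x2 + x2**3 - 5
g = x1*x2 + 2*x1**4

lhs = expand(diff(f * g, x1))
rhs = expand(diff(f, x1) * g + f * diff(g, x1))
assert lhs == rhs
```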