Background

I am currently working on an exercise out of Robert Ash's Abstract Algebra: The Basic Graduate Year (which can be found on his website free of charge) and I am struggling to understand a portion of the solution that he gives, also available at the given link.


The Exercise

The exercise I am working on is Problem 5 of Section 2.4, and it asks the following.

Let $I$ be a proper ideal of $F[[X]]$. Show that $I\subseteq \langle X \rangle$, so that $\langle X\rangle$ is the unique maximal ideal of $F[[X]]$.

A brief note on the notation: here $F[[X]]$ denotes the ring of formal power series with coefficients in the field $F$, where Ash defines a formal power series as being of the form $$f(X)=a_0+a_1X+a_2X^2+\ldots$$


My Question

For this exercise, Ash gives the following solution:

Suppose that $f(X) = a_0 + a_1X +\ldots$ belongs to $I$ but not to $\langle X\rangle$. Then $a_0$ cannot be $0$, so by ordinary long division we can find $g(X) \in F[[X]]$ such that $f(X)g(X) = 1$. But then $1\in I$, contradicting the assumption that $I$ is proper.

What I don't understand here is the "by ordinary long division" part. I am not familiar with the notion of dividing one power series by another, and so I have a feeling that's not what he's talking about here. At first my thought was that $g(X)$ would simply equal $f^{-1}(X)$, but if that is the case, then why would he mention the bit about long division? Also, how does $a_0\neq 0$ help us, other than telling us that $f(X)\notin\langle X \rangle$, something we already assumed? What am I missing here?


As always, I appreciate any help you are able to give!

1 Answer

If $f(X) = a_0 + a_1 X + a_2 X^2 + \dots$ with $a_0 \ne 0$ then we can write $f(X) = a + Xg(X)$ where $a = a_0$ and $g(X) = a_1 + a_2X + a_3X^2 + \dots$. Then, from the identity $(1 - X)^{-1} = 1 + X + X^2 + X^3 + \dots$, we obtain

$$ \frac{1}{f(X)} = \frac{1}{a + Xg(X)} = \frac{a^{-1}}{1 + a^{-1}Xg(X)} = a^{-1} \left( \sum_{n = 0}^\infty (-a^{-1}Xg(X))^n \right). \tag{1} $$

The series in $(1)$ converges because the $n$-th term, i.e. $(-a^{-1})^n X^ng(X)^n$, is divisible by $X^n$, so every monomial appearing in it has degree at least $n$. Thus the coefficient of $X^m$ in

$$ \sum_{n = 0}^\infty (-a^{-1}Xg(X))^n $$

does not change past the first $m + 1$ terms of the sum. This is exactly what it means to converge in $F[[X]]$.

Recall that a sequence $f_0(X), f_1(X), \dots$ converges to $f(X)$ in $F[[X]]$ if for all $m \in \mathbf{N}$ there exists $N \in \mathbf{N}$ such that for all $n \ge N$, $[X^m] f_n(X) = [X^m]f(X)$. I.e. the coefficient of $X^m$ is constant for all but finitely many indices.
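As a small worked instance of $(1)$ (the particular series is chosen only for illustration), take $f(X) = 2 + X$, so that $a = 2$ and $g(X) = 1$. Then

$$ \frac{1}{2 + X} = \frac{1}{2} \sum_{n = 0}^\infty \left(-\frac{X}{2}\right)^n = \frac{1}{2} - \frac{1}{4} X + \frac{1}{8} X^2 - \frac{1}{16} X^3 + \cdots, $$

and multiplying back gives $(2 + X)\left(\frac{1}{2} - \frac{1}{4} X + \frac{1}{8} X^2 - \cdots\right) = 1$, since for $m \ge 1$ the coefficient of $X^m$ in the product is $2 \cdot \frac{(-1)^m}{2^{m+1}} + \frac{(-1)^{m-1}}{2^m} = 0$. In this example the $n$-th term contributes only to the coefficient of $X^n$, so the coefficient of $X^m$ is settled once the partial sum reaches $n = m$.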

This process of finding $1/f(X)$ is what is meant by ordinary long division. If you want to actually calculate the coefficients, you can use the identity

$$ [X^m] \frac{1}{f(X)} = a^{-1} [X^m] \sum_{n = 0}^m (-a^{-1}Xg(X))^n $$

since later terms in the sum do not affect this coefficient.
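To make the coefficient formula concrete, here is a minimal Python sketch, assuming $F = \mathbf{Q}$ and representing a power series by the list of its first few coefficients. The function names (`mul_trunc`, `inverse_by_geometric_sum`, `inverse_by_long_division`) are hypothetical and chosen only for this illustration. The second function uses the recurrence obtained by comparing coefficients in $f(X)b(X) = 1$, which is one way to carry out the division term by term; both should agree.

```python
from fractions import Fraction


def pad(f, n):
    """Return the first n coefficients of f as Fractions, padded with zeros."""
    coeffs = [Fraction(c) for c in f[:n]]
    return coeffs + [Fraction(0)] * (n - len(coeffs))


def mul_trunc(p, q, n):
    """Coefficients of X^0..X^(n-1) in the product of p and q."""
    out = [Fraction(0)] * n
    for i, pi in enumerate(p[:n]):
        for j, qj in enumerate(q[:n - i]):
            out[i + j] += pi * qj
    return out


def inverse_by_geometric_sum(f, n):
    """First n coefficients of 1/f via a^{-1} * sum_{k<n} (-a^{-1} X g(X))^k."""
    f = pad(f, n)
    a = f[0]
    if a == 0:
        raise ValueError("constant term must be nonzero")
    # u = -a^{-1} X g(X): drop the constant term of f, then scale by -1/a.
    u = [Fraction(0)] + [-c / a for c in f[1:]]
    total = [Fraction(0)] * n
    power = pad([1], n)  # u^0
    for _ in range(n):  # u^k is divisible by X^k, so k >= n cannot matter
        total = [t + p for t, p in zip(total, power)]
        power = mul_trunc(power, u, n)
    return [t / a for t in total]


def inverse_by_long_division(f, n):
    """First n coefficients b_m of 1/f, solving f * b = 1 one coefficient at a time."""
    f = pad(f, n)
    a = f[0]
    if a == 0:
        raise ValueError("constant term must be nonzero")
    b = []
    for m in range(n):
        # Coefficient of X^m in f*b is a*b_m + sum_{k=1}^{m} f_k * b_{m-k}.
        s = sum((f[k] * b[m - k] for k in range(1, m + 1)), Fraction(0))
        target = Fraction(1) if m == 0 else Fraction(0)
        b.append((target - s) / a)
    return b


if __name__ == "__main__":
    f = [1, -1, -1]  # f(X) = 1 - X - X^2
    print(inverse_by_geometric_sum(f, 8))
    print(inverse_by_long_division(f, 8))
```

For $f(X) = 1 - X - X^2$ both functions return the Fibonacci numbers $1, 1, 2, 3, 5, 8, 13, 21$, and multiplying the result by $f$ with `mul_trunc` gives $1$ up to the truncation order. Exact `Fraction` arithmetic is used so that the division by $a$ introduces no rounding.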

Trevor Gunn