
There's a kind of proof regularly used in linear algebra (proving facts about transformations, direct sums, bases, ...) that I accept as correct, but I still can't connect my intuition and logic to the formal proof writing.
After avoiding fully understanding it for long enough, I decided to try to eradicate all my minor doubts about it.

Context:
Given a vector space $V$, suppose we have $U_1, U_2, \dots, U_m$, all subspaces of $V$, such that their direct sum equals $V$.
By saying that $V = U_1 \oplus U_2 \oplus \cdots \oplus U_m$, we know two things:

First: $V = U_1 + U_2 + \cdots + U_m = \{u_1 + u_2 + \cdots + u_m : u_1 \in U_1,\ u_2 \in U_2,\ \dots,\ u_m \in U_m\}$

Second: Every element of $V$ can be obtained by a unique combination of $u_j$, each one contained in $U_j$ for $1 \le j \le m$
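To make sure I'm reading the definition right, here's the smallest example I could come up with (my own illustration, not part of any statement above): in $V = \Bbb R^2$, take $U_1 = \{(x,0) : x \in \Bbb R\}$ and $U_2 = \{(0,y) : y \in \Bbb R\}$. Then $$\Bbb R^2 = U_1 \oplus U_2,$$ since every $(x,y)$ equals $(x,0) + (0,y)$, and there is no other way to split it between $U_1$ and $U_2$.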

What I certainly understand:

I know intuitively that if $V = U_1 \oplus U_2 \oplus \cdots \oplus U_m$, then the vector $0 \in V$ can only be obtained by summing the zero vectors of $U_1, U_2, \dots, U_m$ (or else it wouldn't be a direct sum).
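(A quick sanity check of that intuition, again my own illustration: if in $\Bbb R^2$ we took $U_1 = U_2 = \{(x,0) : x \in \Bbb R\}$, then $$0 = (1,0) + (-1,0)$$ would be a nontrivial representation of $0$, so the sum $U_1 + U_2$ couldn't be direct.)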

Here's the proof I'm having a problem with:

I can formally (but not intuitively and logically) go step by step to show that the previous fact implies that EVERY element of $V$ can only be obtained by a unique combination of $u_j$, with each $u_j \in U_j$ for $1 \le j \le m$:

Step 1 -
If $V = U_1 \oplus U_2 \oplus \cdots \oplus U_m$, then any element $x \in V$ can be written as $x = u_1 + u_2 + \cdots + u_m$, with each $u_j \in U_j$.

Step 2 -
Suppose we have another combination for $x$: $x = v_1 + v_2 + \cdots + v_m$, with each $v_j \in U_j$.

Step 3 -
Subtracting the two equations leads to: $0 = (u_1 - v_1) + (u_2 - v_2) + \cdots + (u_m - v_m)$

Step 4 -
This leads to the fact that $u_1 = v_1,\ u_2 = v_2,\ \dots,\ u_m = v_m$ (because we need a trivial combination to yield the $0$ vector in $V$)

Step 5 -
Hence every element of $V$ can only be obtained by a unique combination of elements, one from each subspace $U_j$.

My concern is that I can't really connect the fact I intuitively know (about the requirement of the trivial combination) with the fact that is implied by it.

I think I have two problems:
1 - I can't fully understand what is going on in the transition from Step 1 to Step 2 (are we starting a proof by contradiction?). I can't connect Steps 1 and 2 with the result achieved in Step 4. Does Step 4 contradict what we assumed in Step 2?
2 - I can't automatically believe or know (without running the proof steps in my head) that the trivial combination for the vector $0$ in $V$ implies a unique combination for any element of $V$.

It's a bit hard to ask something specific about the logical steps of this proof because I can't fully and intuitively understand it.

I hope I conveyed a not-so-confusing message.


1 Answer


We can actually consider this basic approach as a direct proof, instead. We supposed that $$u_1+\cdots+ u_m=v_1+\cdots+v_m$$ where $u_j,v_j\in U_j$ for $1\le j\le m$. In order to show that our representations are unique, it suffices to show that these two arbitrary representations are in fact the same. In particular, we need to show that $$u_j=v_j$$ for each $1\le j\le m$. Equivalently, we're given that $$0=(u_1-v_1)+\cdots+(u_m-v_m)$$ and must prove that $$0=u_j-v_j$$ for each $1\le j\le m.$ Since each $U_j$ is a subspace of $V$ and $u_j,v_j\in U_j$, then $u_j-v_j\in U_j$. Thus, this representation of the zero vector is the trivial one, so each $u_j-v_j=0,$ as desired.
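For concreteness (just an illustration of the above, not an extra assumption): in $\Bbb R^2$ with $U_1=\{(a,0):a\in\Bbb R\}$ and $U_2=\{(0,b):b\in\Bbb R\}$, suppose $$(3,5)=(a,0)+(0,b)=(a',0)+(0,b').$$ Then $$0=(a-a',0)+(0,b-b'),$$ where $(a-a',0)\in U_1$ and $(0,b-b')\in U_2$, so the triviality of the representation of zero forces $a=a'$ and $b=b'$; that is, $(3,0)$ and $(0,5)$ are the only possible pieces.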


There is a general principle behind these types of proofs (aside from the fact that for elements $x,y$ of a vector space $V$ we have $x=y$ if and only if $0=x-y$). Suppose that $V=U_1+\cdots+U_m$ for some subspaces $U_j$ of $V$, let $W:=U_1\times\cdots\times U_m$ and consider the function $f:W\to V$ given by $$f(u_1,\dots,u_m)=u_1+\cdots+u_m.$$ Each $U_j$ is a vector space over the same field as $V$, so $W$ is, too (with termwise addition and scalar multiplication). We can readily see that $f$ is a linear transformation, and so saying "the elements of $V$ have unique representations as sums $u_1+\cdots+u_m$ with $u_j\in U_j$ for $1\le j\le m$" is equivalent to saying "$f$ is a vector space isomorphism." Since $U_1+\cdots+U_m=V$, then $f$ maps $W$ onto $V$ (so every vector of $V$ has a representation). Ultimately, then, the given proof shows that if the kernel of $f$ is trivial (contains only the zero vector), then $f$ is one-to-one (so that the representations are unique).
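As a sanity check on this reformulation (an illustrative instance only): with $V=\Bbb R^2$, $U_1=\{(x,0)\}$, and $U_2=\{(0,y)\}$, we have $W=U_1\times U_2$ and $$f\bigl((x,0),(0,y)\bigr)=(x,y),$$ which is plainly linear, onto, and one-to-one, hence an isomorphism; this matches the fact that every vector of $\Bbb R^2$ splits uniquely across $U_1$ and $U_2$.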

This is true in general, though. Let $X,Y$ be vector spaces over some field and let $L:X\to Y$ be any linear transformation. Then $L$ is one-to-one if and only if the kernel of $L$ is trivial. For any $y\in Y$, we call $$L^{-1}(y):=\{x\in X:L(x)=y\}$$ the fiber of $y$ (under $L$). Note that the fibers are disjoint from each other, and that every element of $X$ lies in exactly one fiber (since $L$ is a function). Note moreover that $L$ is one-to-one if and only if each fiber contains no more than one element (if $L$ isn't onto, then some will be empty). The kernel of $L$ is simply the fiber of $Y$'s zero vector, so it's clear that the kernel will be trivial if $L$ is one-to-one. What about the converse, though? Why is it that if we know the kernel of $L$ is trivial, then we know that all the fibers have at most one element?

Well, let's take any $y\in Y$ such that $L^{-1}(y)\ne\emptyset$. Fix $x\in L^{-1}(y).$ For any $z$ in the kernel of $L$, we have $$L(x+z)=L(x)+L(z)=y+0=y.$$ Thus, $x+L^{-1}(0)\subseteq L^{-1}(y)$, where for subsets $Z$ of $X$ we have $$x+Z:=\{x+z:z\in Z\}.$$ On the other hand, if $x'\in L^{-1}(y)$, then putting $z=-x+x'$, we have $x'=x+z$ and $$L(z)=L(-x)+L(x')=-L(x)+L(x')=-y+y=0,$$ so $x'\in x+L^{-1}(0)$. Since this holds for all $x'\in L^{-1}(y)$, then $x+L^{-1}(0)\supseteq L^{-1}(y)$, whence $x+L^{-1}(0)=L^{-1}(y).$ Put another way, all the non-empty fibers are simply translations of the kernel of $L$. That's why a given fiber can have no more points in it than the kernel of $L$, which is why triviality of the kernel of $L$ means that $L$ is one-to-one, which is why uniqueness of all representations follows from uniqueness of the representation of zero.
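For a concrete picture (again, only an illustration): take $L:\Bbb R^2\to\Bbb R$ given by $L(x,y)=x$. The kernel is the $y$-axis $\{(0,t):t\in\Bbb R\}$, and for any $c\in\Bbb R$ the fiber of $c$ is the vertical line $$L^{-1}(c)=\{(c,t):t\in\Bbb R\}=(c,0)+L^{-1}(0),$$ so every non-empty fiber is a translate of the kernel, exactly as above.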

It actually follows from the above reasoning that all non-empty fibers under $L$ are translations of each other (since they're all translations of the kernel of $L$), so instead of assuming that the kernel was trivial, we could instead have assumed that some non-empty fiber had only one point. The thing that's special about the kernel of $L$ is that, unlike the other fibers under $L$, it is a vector subspace of $X$, so it has nice properties like closure under addition and scalar multiplication.

Cameron Buie
  • Nice answer, Cameron. May I ask you for a linear algebra book recommendation? I'm not sure I know what a fiber is. Thanks. – nerdy Apr 25 '13 at 23:39
  • Also, do you know how one could think intuitively (without mentally running through $T(x)=T(y) \implies T(x-y)=0 \implies x-y=0 \implies x=y$) about why a trivial kernel of a transformation implies a one-to-one transformation? – nerdy Apr 25 '13 at 23:51
  • Let me give you a few examples to illustrate what a fiber is. In general, if $X,Y$ are sets and $f:X\to Y$ is a function, then for $y\in Y$, the fiber of $y$ under $f$ is just the set of things in $X$ that $f$ sends to $y$. Consider the function $f:\Bbb R^2\to\Bbb R$ given by $f(x,y)=\sqrt{x^2+y^2}.$ The fibers of negative numbers are empty; the fiber of zero is the origin; and for positive $r,$ the fiber of $r$ is the circle about the origin of radius $r.$ – Cameron Buie Apr 26 '13 at 00:32
  • Consider $g:\Bbb R^2\to\Bbb R$ given by $g(x,y)=xy$. For $t\neq 0$, the fiber of $t$ is the curve $y=t/x$, and the fiber of zero is the axes. Now, those aren't linear transformations, so the fibers aren't simply translations of each other. On the other hand, consider the function $h:\Bbb R^2\to\Bbb R$ given by $h(x,y)=y-2x$. The fibers under $h$ are all the lines in the plane with slope $2$ (the fiber of $b$ is the line with slope $2$ and $y$-intercept $b$). – Cameron Buie Apr 26 '13 at 00:44
  • The fact that non-empty fibers of a linear transformation are translations of each other is what allows us to conclude that a linear transformation is one-to-one if and only if its kernel is trivial. I'll think on some good linear algebra texts. – Cameron Buie Apr 26 '13 at 00:51