I explained this in "Why is time inversion?"

Red is a reference delta impulse (of height 1), whereas green is a typical response to that impulse, denoted by the function $h(t)$. In an LTI system the output is proportional to the input: if a delta impulse of height $a$ is applied at time 0, the output at time $t$ is $a\,h(t)$. Now, instead of a single impulse at the origin, apply a series of input impulses at various times. What will the output be? Say there was an input impulse of height $a_1$ at time $T_1$. Since the current time is $t$, that impulse occurred $t-T_1$ seconds ago, and by time invariance its contribution to the current output $y(t)$ is $a_1 h(t - T_1)$. Another impulse, of height $a_2$ at time $T_2$, contributes $a_2 h(t-T_2)$. So $y(t) = a_1 h(t-T_1) + a_2 h(t-T_2)$. You simply add up the contributions because the system is linear.
In general, you have $y_t = \sum_{i=0}^{t} a_i h(t-i)$, where $a_i$ is the height of the input impulse at time $i$. That is the convolution formula.
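To make the superposition argument concrete, here is a minimal Python sketch that builds the output by adding one shifted, scaled copy of the impulse response per input impulse. The function name `convolve_by_superposition` and the sample values of `h` and `a` are made up for illustration; the only assumption is a discrete-time, causal system with a finite-length impulse response.

```python
def convolve_by_superposition(a, h):
    """Return y with y[t] = sum_i a[i] * h[t - i].

    a: input impulse heights, a[i] is the impulse applied at time i
    h: impulse response samples, h[k] is the response k steps after a unit impulse
    """
    y = [0.0] * (len(a) + len(h) - 1)
    for i, a_i in enumerate(a):        # impulse of height a_i at time i
        for k, h_k in enumerate(h):    # its effect k steps later, at time i + k
            y[i + k] += a_i * h_k      # linearity: contributions just add up
    return y


# Hypothetical example: impulses of height 2 at t=0 and height 3 at t=2.
h = [1.0, 0.5, 0.25]
a = [2.0, 0.0, 3.0]
y = convolve_by_superposition(a, h)
print(y)  # [2.0, 1.0, 3.5, 1.5, 0.75]
```

At $t=2$, for instance, the output is $2 \cdot h(2) + 3 \cdot h(0) = 0.5 + 3 = 3.5$: the first impulse is two steps old, the second has just arrived.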
It also appears when you multiply two polynomials: $(a_0 + a_1 z + a_2 z^2 + \dots)(b_0 + b_1 z + b_2 z^2 + \dots) = \sum_{n=0}^\infty c_n z^n$, where $c_n = \sum_{i=0}^{n} a_i b_{n-i}$. That is why series are often represented by their z-transforms: you can simply multiply the transforms, and the convolution happens in the background.
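The polynomial connection can be checked numerically. The sketch below (the helper names `poly_mul` and `poly_eval` are my own, and the coefficients are arbitrary) computes the product coefficients with the convolution sum $c_n = \sum_i a_i b_{n-i}$ and then confirms that evaluating the result at some $z$ gives the same number as multiplying the two polynomials evaluated separately.

```python
def poly_mul(a, b):
    """Coefficients of a(z) * b(z), lowest power first, via c[n] = sum_i a[i] * b[n - i]."""
    c = [0] * (len(a) + len(b) - 1)
    for n in range(len(c)):
        for i in range(len(a)):
            if 0 <= n - i < len(b):    # skip terms where b[n - i] does not exist
                c[n] += a[i] * b[n - i]
    return c


def poly_eval(coeffs, z):
    """Evaluate coeffs[0] + coeffs[1]*z + coeffs[2]*z**2 + ..."""
    return sum(coef * z**k for k, coef in enumerate(coeffs))


a = [1, 2, 3]   # 1 + 2z + 3z^2
b = [4, 5]      # 4 + 5z
c = poly_mul(a, b)
print(c)        # [4, 13, 22, 15]

# Multiplying the polynomials and convolving the coefficients agree.
z = 2
print(poly_eval(a, z) * poly_eval(b, z), poly_eval(c, z))  # 238 238
```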