A first, short attempt, because I could spend hours of your time on this.
- The reality (whether it exists or not) looks, sounds, appears complicated
- As our senses are limited, and our brain is too slow, we need to shrink our perceptions, enough to understand most of what we need to grasp and share
- So, given our limitations, we need (and are only capable of) approximations of that reality, represented here by some signal $s(t)$
- Depending on the time we have, the purpose, the extent, different kinds of approximations are required.
Convergence is the science of 'how approximations can be expected to be close enough to the reality'.
To draw a circle by hand, you only need to know that $\pi \approx 3$ (an error below 5%). With a compass on a sheet of graph paper, perhaps $\pi \approx 3.15$ is sufficient. At the Jet Propulsion Laboratory (JPL), "for interplanetary navigation, we use 3.141592653589793"; see details in How Many Decimals of Pi Do We Really Need?
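To put numbers on these three levels of precision, here is a small sketch that computes the relative error of each value against the double-precision `math.pi`. The labels and the helper name `relative_error` are mine, chosen for illustration only.

```python
import math

# Candidate approximations of pi quoted in the text above.
approximations = {
    "hand-drawn circle": 3.0,
    "compass on graph paper": 3.15,
    "JPL navigation": 3.141592653589793,
}

def relative_error(approx: float, exact: float = math.pi) -> float:
    """Relative error |approx - exact| / |exact|."""
    return abs(approx - exact) / abs(exact)

errors = {name: relative_error(value) for name, value in approximations.items()}
for name, err in errors.items():
    print(f"{name:>24}: {err:.2e}")
```

The JPL value coincides with the double-precision constant, so its error cannot be resolved at this precision; the point is only that each purpose tolerates a different error.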
Convergence is also about how more complex questions are answered (see for instance how a function can be approximated by sines or cosines):
- can a calibration function be approximated by a parabola?
- if I want a raise of 20% over two years, is an increase of only $9.6\%$ in each of the two years enough? (The answer is yes.)
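The raise question can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the two yearly increases compound; the variable names are illustrative.

```python
import math

target = 1.20   # total factor wanted after two years (+20 %)
yearly = 0.096  # proposed yearly increase (+9.6 %)

achieved = (1 + yearly) ** 2        # compounding over two years
exact_rate = math.sqrt(target) - 1  # yearly rate giving exactly +20 %

print(f"two raises of 9.6 % give a total of {achieved - 1:.4%}")
print(f"exact yearly rate for +20 %: {exact_rate:.4%}")
```

Two raises of 9.6% slightly overshoot the target, because the exact yearly rate, $\sqrt{1.2}-1$, is a little below 9.6%.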
In more academic terms: suppose that there is a function $s(t)$. One has little knowledge about it, except that it wiggles. So one can choose a set of functions $\psi_\omega (t)$, depending on a parameter $\omega$, that wiggle too, like sines or complex exponentials. Can we approximate $s(t)$ with a certain combination of a finite number of $\psi_\omega (t)$? In the standard linear case, we seek a linear combination of the $\psi_\omega (t)$:
$$\sum a_\omega \psi_\omega (t)$$
The question can be rephrased as follows: take a given interval for $t$, and an increasing family of sets $\Omega_\lambda$, $\lambda\in \Lambda$, from which the $\omega$ are drawn. The informal idea is: the bigger the number of terms (so the bigger the cardinality of $\Omega_\lambda$), the better the approximation should be. In other terms, is
$$s(t)-\sum_{\omega \in \Omega_\lambda} a_\omega \psi_\omega (t)$$
very close to $0$, and how, depending on $t$ and $\Omega_\lambda$. Or: how fast does
$$\Delta\left(s(t)-\sum_{\omega \in \Omega_\lambda} a_\omega \psi_\omega (t)\right)$$
tend to zero, where $\Delta(\cdot)$ is a measure of closeness (a distance, a norm, a divergence).
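As an illustration, here is a minimal sketch under assumptions of my own: $s(t)$ is a square wave of period $2\pi$, $\psi_\omega(t)=\sin(\omega t)$ with $\omega$ odd, $a_\omega = 4/(\pi\omega)$ (its Fourier sine coefficients), and $\Delta$ is a discretised root-mean-square norm over one period. None of these choices comes from the question; they only make the quantities above concrete.

```python
import math

def s(t: float) -> float:
    """Example signal: a square wave of period 2*pi (an assumption for the demo)."""
    return 1.0 if math.sin(t) >= 0 else -1.0

def partial_sum(t: float, n_terms: int) -> float:
    """First n_terms of the Fourier sine series of the square wave:
    a_w * psi_w(t) with psi_w(t) = sin(w t), a_w = 4 / (pi w), w = 1, 3, 5, ..."""
    return sum(4.0 / (math.pi * w) * math.sin(w * t)
               for w in range(1, 2 * n_terms, 2))

def rms_error(n_terms: int, n_samples: int = 2000) -> float:
    """Delta(.) chosen here as a discretised L2 (root-mean-square) norm."""
    step = 2 * math.pi / n_samples
    total = 0.0
    for i in range(n_samples):
        t = (i + 0.5) * step  # midpoints avoid sampling exactly at the jumps
        total += (s(t) - partial_sum(t, n_terms)) ** 2
    return math.sqrt(total / n_samples)

term_counts = [1, 4, 16, 64]
errors = [rms_error(n) for n in term_counts]
for n, e in zip(term_counts, errors):
    print(f"{n:3d} terms: RMS error = {e:.4f}")
```

With this choice of $\Delta$ the error decreases as terms are added, though slowly, because the signal is discontinuous. With a maximum-error (sup) norm instead, the error near the jumps would not vanish (the Gibbs phenomenon), which is why the choice of $\Delta$ matters.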