The normalization is indeed by 1 / sqrt(scale), and it's an L2-norm; the trick's in the scale wavelet.
I'll use wavelet='morl' throughout. Prior to integrating, we can inspect the wavelet here; it's returned by wavelet.wavefun, which is binary-compiled, but after some guesswork, I found to it to match exactly with
scipy.signal.morlet2(1024, 1024 / 16) * sqrt(1024 / 16) * np.pi**(.25)
which is, from source, using Wiki's notation, $\psi(t) = \psi_{\sigma}(t/a)$, where $a$ is scale, and
$$
\psi_{\sigma}(t) = e^{j\sigma t} e^{-t^2/2} \tag{1} \label{1}
$$
(scale and $\pi^{-1/4}$ cancel). This is what's integrated via cumsum(psi) * step, and then resampled for all scales.
Resampled vs. recomputed
What exactly is this resampling doing in in terms of Eq 1? Is it just a higher resolution of the wavelet at the same scale, or is it equivalent to recomputing Eq 1 at each scale? Conveniently enough, latter, but only approximately, and approximation degrades substantially for small scale (-- code1):
Notice from code1, however, the recomputed wavelet:
Ns = len(int_psi_scale)
w = real(morlet2(Ns, Ns / 16) * sqrt(Ns / 16) * np.pi**(.25)) # repeats first blob
w /= scale
The recomputing includes 1 / scale, which, together with * sqrt(scale), equals 1 / sqrt(scale). Mystery solved.
Your code is wrong, where's * step?
Replaced by 1 / scale. How?
In MAE code, notice that for scale=64, we have int_psi_scale == int_psi, which == cumsum(psi) * step. For w_int, we do cumsum(w) / scale. And 1 / scale is... == step. Thus, the pre-integrated wavelet, psi, is just w at scale=64 (in morlet2 code above, 1024 / 16 == 64, checks out), and step happens to be ... conveniently? == 1 / scale when integrating.
Then, why is 1 / scale there? Unclear. Two possibilities in mind: (1) preserving the wavelet's norm at integration; (2) scaling the wavelet, independent of integration.
- If the wavelet was L1 or L2 normalized before integration, either will be preserved. This is from chain rule; just replace $f$ with $\psi$, and $k$ with $1/a$:
$$
\int f(k x) dx = \frac{1}{|k|} \int f(x) dx
$$
- This seems likelier, as the
diff later is closely equivalent to undoing the integration, defeating the purpose of (1). Why rescale the wavelet? Normalization - see here.
You cheated earlier; there is no w /= scale
True, the code actually shows w_int = cumsum(w) / scale, but the two are exactly the same. It's thus the earlier question of where 1 / scale "belongs", or "comes from". This is answered here, and in other part below.
Why step == 1 / scale at integration? (-- for reference, from here (in code1, $n$ is x)):
Just a coincidence, or is step, along $n_i$, carefully crafted to yield the convenient resampling properties, which in turn demand step = 1 / scale? Don't know, might update answer later.