I agree with Hilmar's comment: these are probably harmonics generated by nonlinearities of the system, combined with how the logarithmic frequency sweep response is deconvolved into an impulse response.
Your sweep has instantaneous frequency $f$ at time $t$, following:
$$f(t) = \exp(0.7090076835\,t/\text{s})\times20\text{ Hz}.$$
This is the logarithmic sweep (a linear sweep on a logarithmic frequency scale) that satisfies $f(0\text{ s}) = 20\text{ Hz}$ and $f(10\text{ s}) = 24\text{ kHz}$.
Solving this for the time $t$ at which frequency $f$ appears gives:
$$\Rightarrow\quad t(f) = 1.41042195\ln(f/\text{Hz})\text{ s} - 4.225246556\text{ s}.$$
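As a quick sanity check of the constants above, here is a minimal Python sketch. The names `RATE`, `sweep_freq` and `sweep_time` are my own labels, not anything from the question; the start frequency, end frequency and duration are the ones stated in this answer.

```python
import math

F0_HZ = 20.0       # sweep start frequency (from the answer)
F1_HZ = 24_000.0   # sweep end frequency (from the answer)
T_S = 10.0         # sweep duration in seconds (from the answer)

# Exponent coefficient of the sweep, in 1/s (about 0.709)
RATE = math.log(F1_HZ / F0_HZ) / T_S

def sweep_freq(t_s):
    """Instantaneous frequency f(t) in Hz at time t_s seconds."""
    return F0_HZ * math.exp(RATE * t_s)

def sweep_time(f_hz):
    """Inverse mapping t(f): time in seconds at which f_hz appears."""
    return math.log(f_hz / F0_HZ) / RATE

# The two constants of t(f) = a*ln(f/Hz) - b
print("a =", 1.0 / RATE, "s")
print("b =", math.log(F0_HZ) / RATE, "s")
print("t(24 kHz) =", sweep_time(F1_HZ), "s")
```

The round trip `sweep_time(sweep_freq(t))` should return `t` up to floating-point error, which is an easy way to confirm that the two formulas are inverses of each other.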
The frequency-domain division in your deconvolution formula "IFFT(FFT(Recording)/FFT(Sweep))" shifts any frequency $f$, which appears at time $t(f)$ in the sweep, by $-t(f)$, back to time 0 in the impulse response. However, the $n$th harmonic of frequency $f$ has frequency $nf$ but is produced at time $t(f)$, so it is shifted by $-t(nf)$ from time $t(f)$, to time:
$$t_n(f) = t(f) - t(nf).$$
Due to the frequency sweep being logarithmic, this turns out to be independent of $f$:
$$\Rightarrow\quad t_n = t_n(f) = -1.410421949\ln(n)\text{ s}.$$
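The cancellation can be checked by writing $t(f) = a\ln(f/\text{Hz}) - b$, where $a$ and $b$ are just my shorthand for the two constants in the expression for $t(f)$. The offset $b$ drops out of the difference and the logarithms combine:
$$t(f) - t(nf) = a\ln(f/\text{Hz}) - a\ln(nf/\text{Hz}) = -a\ln\left(\frac{nf}{f}\right) = -a\ln(n).$$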
We can tabulate some values:
$$\begin{align}
t_1 &= 0,\\
t_2 &= -0.9776299980\text{ s},\\
t_3 &= -1.549506886\text{ s},\\
t_4 &= -1.955259996\text{ s},\\
t_5 &= -2.269986558\text{ s}.\end{align}$$
Add any delay in the system to those values, and that is where the peak corresponding to each harmonic will appear in the deconvolution result. Multiplication or division in the discrete frequency domain performs circular convolution or deconvolution, so negative times wrap around to the end of the vector. I don't know for sure what size discrete Fourier transform (DFT) and inverse DFT you used, but for a vector spanning 14 seconds you'd see the 2nd harmonic peak at $14\text{ s} + (-0.9776299980\text{ s}) = 13.02237000\text{ s}$, or a bit later due to any delays in the system. That seems to agree with your finding of "13.02 seconds", although I would expect your number to be closer to 13.03 s if you have the mic at least 1 m away from the speaker, since sound takes about 3 ms to travel 1 m. If you look at a spectrogram of your sweep response, the harmonics should be visible there above the fundamental sweep.
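The wrap-around can be reproduced with a small simulation. This is only a sketch under assumptions of mine: a 48 kHz sample rate, no acoustic delay, and a toy distortion model that simply adds a 2nd harmonic at 10% amplitude (muted once it would exceed the Nyquist frequency) rather than a physical nonlinearity. All variable names are hypothetical.

```python
import numpy as np

FS = 48_000                        # assumed sample rate; Nyquist equals the sweep's end frequency
F0, F1, T = 20.0, 24_000.0, 10.0   # sweep parameters from the answer
TOTAL = 14.0                       # length of the zero-padded vectors in seconds
K = np.log(F1 / F0) / T            # exponent coefficient of the sweep

t = np.arange(int(T * FS)) / FS
# Phase is the integral of 2*pi*f(t) for f(t) = F0*exp(K*t)
phase = 2 * np.pi * F0 / K * (np.exp(K * t) - 1.0)
sweep = np.sin(phase)

# Toy distortion (assumption): a 2nd harmonic at 10% amplitude,
# switched off once 2*f(t) would exceed the Nyquist frequency
f_inst = F0 * np.exp(K * t)
second_harmonic = 0.1 * np.sin(2 * phase) * (2 * f_inst < FS / 2)
recording = sweep + second_harmonic

# Circular deconvolution by frequency-domain division, zero-padded to 14 s
n = int(TOTAL * FS)
h = np.fft.irfft(np.fft.rfft(recording, n) / np.fft.rfft(sweep, n), n)

# Search the second half of the vector, away from the linear response at t = 0
start = n // 2
peak_time = (start + np.argmax(np.abs(h[start:]))) / FS
expected = TOTAL - T * np.log(2) / np.log(F1 / F0)
print("largest peak in second half at", peak_time, "s; predicted", expected, "s")
```

With these assumptions the largest peak in the second half of `h` should land near the predicted 13.02 s. A real measurement would add the propagation delay on top of that.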
More generally, for a rising sweep going from frequency $f_0$ at time 0 to frequency $f_1$ at time $T$:
$$f(t) = \exp\left(\frac{\ln\left(f_1/f_0\right)}{T}\times t\right)\times f_0,$$
$$\Rightarrow\quad t(f) = \frac{T\ln(f/f_0)}{\ln(f_1/f_0)},$$
the impulse response of an $n$th harmonic will be time-shifted by deconvolution to: $$t_n = -\frac{T\ln(n)}{\ln(f_1/f_0)}.$$
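The general formula is easy to wrap in a helper. The following is a minimal sketch; `harmonic_offset` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math

def harmonic_offset(n, duration_s, f_start_hz, f_end_hz):
    """Time offset t_n (seconds) of the n-th harmonic's impulse response
    after deconvolving a rising logarithmic sweep."""
    return -duration_s * math.log(n) / math.log(f_end_hz / f_start_hz)

# Offsets for the 10 s, 20 Hz to 24 kHz sweep discussed above
for n in range(1, 6):
    print(n, harmonic_offset(n, 10.0, 20.0, 24_000.0))
```

For the sweep in the question this reproduces the tabulated values of $t_1$ to $t_5$. With circular deconvolution, add the vector length to each negative offset to find where the peak wraps to.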
`ifft(fft(pad(recording)) * fft(pad(ifft(1 / fft(sweep)))))`, if no improvement then no boundary effects / "time aliasing" – OverLordGoldDragon May 23 '23 at 16:05
`ifft(fft(pad(recording)) * (1/fft(pad(sweep))))`. What I mean is that I believe you can pad the initial sweep before inversion and get the same results (i.e. pad then invert in the frequency domain). Could you provide some insight on that? – ZaellixA May 23 '23 at 16:24
`ifft(1/fft(sweep))` as a filter in convolution - but that's not what's happening. We seek to undo a convolution by `sweep`. Then, it's all about the forward transform - was it "full", "same", "valid"? Certainly not circular. Whatever it was, we first replicate it with FFT convolution, and then it becomes a simple division. I think for most cases, `1 / fft(pad(sweep))` is correct, and my version isn't. Sorry for the confusion – OverLordGoldDragon May 28 '23 at 08:42