It appears that your equalization for dispersion worked well for the positive frequencies if I read the plots correctly (at least on a error magnitude plot, to evaluate that visually better I recommend plotting in dB).
If you only need frequencies below 6000 rad/s, consider decimating your signal to a lower rate as the first step in processing. Decimating to a sampling rate of 14000 to 15000 rad/s would allow reasonable margin for simpler filtering in the decimation process but really simplify downstream processing over the current sampling rate of 62,832 rad/s. The decimation filter when properly designed will also remove all higher frequency noise that may otherwise alias in.
Unless your time domain waveform is a series of square pulses (which I don't recommend as they require infinite frequency support), your first plot may be suffering from significant spectral leakage. If your waveform is square pulses, the subsequent filtering to be suggested further below will help with that but in any case I recommend windowing your signal in time which will provide greater dynamic range (at the expense of frequency resolution). A great window choice for time-frequency localization is the Kaiser Window which comes very close with less computation to the ideal DPSS Window for sampled systems. To proceed with this create a window function in time matching the length of your data, multiply your data sample by sample with that window and then take the FFT. You should see a dramatic improvement in the spectral floor. I demonstrate this below with the DTFT of the Kaiser Window compared to a default rectangular window. The first parameter is the number of samples (so this one below was just 30 samples long) and the second parameter is the $\beta$ factor this window uses; the higher the value for $\beta$ (in this case we used 10) the deeper the side-lobes (higher dynamic range) and wider the main lobe (less frequency resolution).

It also appears that you may have some interpolation distortion (did you up-sample your signal?). If so, filtering can be optimized with multi-band filters at the interpolation images, but the need for that will be clearer once you review your windowed FFT spectrum.
Also it appears that you only wanted to compensate for the positive frequencies of an otherwise real signal. Don't you want to provide your equalization to both the positive and negative frequencies (the regions you showed in red)? The error is not indicating that.
My preferred and simple low pass filtering technique is a least squared linear phase filter, which provides optimum filtering in a least squared sense (will be guaranteed to have the least error from target frequency bands for a given number of filter taps) and is linear phase so will not contribute any further dispersion (importantly).
Least squared filter coefficients can be determined easily from MATLAB, Octave and Python scipy.signal using the firls function. The transition band between pass band and stop band will drive the number of taps needed to achieve a certain rejection requirement.
Finally for doing such equalization, compare your technique to the Wiener-Hopf equations as summarized in this post here which also provides a least-squared solution to the equalization (dispersion correction) problem.