If we go mad and inject an element of real-world signal processing: an audio bandwidth of 20,000 Hz is ample to carry speech and/or music for a standard broadcast signal on a VHF/FM transmission.
Where you have a continuous audio signal to sample, the Nyquist rate is double the highest frequency present (i.e. double the bandwidth, for a signal that starts at 0 Hz). Hence a signal ranging from 0 to 20,000 Hz implies a Nyquist rate of 40 kHz. If the sampling rate exceeds this, the resulting digitised signal will be free of aliasing distortion, provided the input really contains nothing above 20,000 Hz. Hence standard sampling rates of 44,100 Hz and 48,000 Hz are commonly used in digital audio, including broadcasting.
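As a rough illustration of the aliasing point, here is a minimal Python sketch. The function names `nyquist_rate` and `alias_frequency` are made up for this example, and it assumes a single pure tone sampled with no anti-aliasing filter in front of the sampler.

```python
def nyquist_rate(max_freq_hz: float) -> float:
    """Twice the highest frequency present in the signal."""
    return 2.0 * max_freq_hz


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    Frequencies fold back around multiples of the sampling rate, so the
    result always lies between 0 and sample_rate_hz / 2.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    print(nyquist_rate(20_000))             # 40000.0
    print(alias_frequency(15_000, 48_000))  # 15000 (below fs/2, unchanged)
    print(alias_frequency(30_000, 48_000))  # 18000 (above fs/2, folds back)
```

A 30 kHz tone sampled at 48 kHz would show up at 18 kHz, inside the audio band, which is why anything above half the sampling rate has to be removed before sampling.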
We might describe the use of 48,000 Hz as oversampling, since it exceeds the Nyquist rate. In practice, because the margin over 40 kHz (and the difference from 44,100 Hz) is not very great, it is not generally regarded as 'true' oversampling.
Historically, there have been higher sampling rates. For instance, some DAT (Digital Audio Tape) recorders offered a 96,000 Hz mode -- double the more usual 48,000 Hz.
Obviously, you can potentially get greater resolution in time -- and so potentially better signal quality -- if you double the sampling rate, since you are halving the interval between samples (the 'step' size along the time axis). But you are doubling the data rate too. From an economic standpoint this implies a greater cost for transmission and for storage, even though in engineering terms it represents a 'purer' signal.
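The cost side is simple arithmetic. A minimal sketch, assuming uncompressed PCM with 16-bit samples and two (stereo) channels; the stereo assumption and the function name `pcm_data_rate_bps` are mine, for illustration only.

```python
def pcm_data_rate_bps(sample_rate_hz: int,
                      bits_per_sample: int = 16,
                      channels: int = 2) -> int:
    """Raw bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000):
        bps = pcm_data_rate_bps(rate)
        megabytes_per_minute = bps * 60 / 8 / 1e6
        print(f"{rate} Hz: {bps} bit/s, {megabytes_per_minute:.2f} MB per minute")
```

Under these assumptions 48 kHz works out to 1,536,000 bit/s and 96 kHz to exactly twice that, so every doubling of the sampling rate doubles both the transmission bandwidth and the storage needed.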
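The lowpass (anti-aliasing) filter must pass all signal up to 20,000 Hz and reject everything above half the sampling rate. Sampling at exactly double 20,000 Hz would leave no gap between the two, demanding an impossibly sharp cut-off. By sampling somewhat faster than that, you avoid the expense of a too-sharp cut-off, and a too-long settling time, in this filter.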
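To see how much room the filter gets at each rate, here is a small sketch. It makes a simplifying assumption: the filter must be flat up to 20 kHz and fully attenuating by half the sampling rate. The function name `transition_band_hz` is hypothetical.

```python
def transition_band_hz(sample_rate_hz: float,
                       passband_edge_hz: float = 20_000.0) -> float:
    """Width of the gap between the passband edge and half the sampling rate.

    Assumes the filter must pass everything up to passband_edge_hz and
    reject everything from sample_rate_hz / 2 upward.
    """
    width = sample_rate_hz / 2.0 - passband_edge_hz
    if width <= 0:
        raise ValueError("sample rate leaves no room for a realisable filter")
    return width


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000):
        print(f"{rate} Hz: {transition_band_hz(rate):.0f} Hz of transition band")
```

With these assumptions the gap is 2,050 Hz at 44.1 kHz, 4,000 Hz at 48 kHz and 28,000 Hz at 96 kHz. Calling it with 40,000 Hz raises the error, since exactly twice the bandwidth leaves no gap at all. The wider the gap, the gentler (and cheaper) the filter can be.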
Sampling at the higher rate of 96 kHz -- so-called "oversampling" (sampling at well above the Nyquist rate) -- has engineering benefits, i.e. sound quality benefits.
Firstly, for any given level of quantization (16-bit quantization is mostly used), the total quantization noise power is fixed by the size of the quantization step, but the higher the sampling rate, the less of that noise falls inside the audio band. By reducing the in-band quantization "noise", oversampling improves the SNR (signal-to-noise ratio) by roughly 3 dB for each doubling of the sampling rate.
This was a factor in the appeal of those DAT recorders that offered a 96 kHz mode, double the standard 48 kHz sampling rate.
Secondly, as quantization noise is approximately white (i.e. spread evenly across frequency), it is distributed across the whole band from 0 Hz up to half the sampling frequency; and since this band broadens as the sampling rate increases while the audio band stays at 20 kHz, the higher rate of 96 kHz allows the lowpass filter to remove more of it.
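To put rough numbers on the quantization-noise argument, the sketch below uses the usual textbook approximation for an ideal uniform quantizer driven by a full-scale sine wave, with white quantization noise, no noise shaping, and an ideal lowpass filter at the edge of the audio band. Those are all assumptions, and the function name `in_band_snr_db` is made up for illustration.

```python
import math


def in_band_snr_db(bits: int,
                   sample_rate_hz: float,
                   bandwidth_hz: float = 20_000.0) -> float:
    """Approximate SNR inside the audio band after ideal lowpass filtering.

    6.02 * bits + 1.76 dB is the SNR when sampling exactly at the Nyquist
    rate; oversampling adds 10 * log10(fs / (2 * B)) dB because only the
    fraction 2 * B / fs of the white noise stays in band.
    """
    ideal = 6.02 * bits + 1.76
    oversampling_gain = 10.0 * math.log10(sample_rate_hz / (2.0 * bandwidth_hz))
    return ideal + oversampling_gain


if __name__ == "__main__":
    for rate in (40_000, 44_100, 48_000, 96_000):
        print(f"16-bit at {rate} Hz: {in_band_snr_db(16, rate):.2f} dB")
```

Under these assumptions 16-bit audio comes out at about 98.1 dB when sampled at 40 kHz and about 101.9 dB at 96 kHz. Each doubling of the sampling rate is worth about 3 dB, or roughly half a bit of extra resolution.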
Accordingly, the conclusion is that sampling the audio above the Nyquist rate of 40 kHz will avoid aliasing distortion, and that a modest margin above it (as with 44,100 Hz or 48,000 Hz) keeps down the cost of the lowpass filter. Also, sampling at a much higher rate than the Nyquist rate will provide audio quality improvements, at some increased economic cost.