I think there are some limitations in the existing answers, so I'll share my two cents; hopefully some of these thoughts are helpful.
The FIR-filtered-white-noise method works nicely for generating arbitrary-length sequences, but if you are trying to simulate a noise source with non-trivial structure in its PSD, that spectrum can be difficult to emulate with a FIR filter of reasonable length.
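As a quick sketch of the FIR approach: one way is to design the taps from the square root of the target PSD (since filtering white noise multiplies its PSD by |H(f)|²) using `scipy.signal.firwin2`, then run white Gaussian noise through the filter. The PSD shape, tap count, and sequence length below are all made up for illustration:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)

# Hypothetical target PSD on a normalized frequency grid [0, 1]
# (1 = Nyquist): a simple 1/f-ish shape, purely for illustration.
freq = np.linspace(0.0, 1.0, 64)
psd = 1.0 / (1.0 + 10.0 * freq)

# The filter's magnitude response should be the square root of the
# PSD, because filtering white noise scales the PSD by |H(f)|^2.
taps = signal.firwin2(255, freq, np.sqrt(psd))

# Shape unit-variance white Gaussian noise with the filter.
white = rng.standard_normal(100_000)
shaped = signal.lfilter(taps, 1.0, white)
```

Note the tradeoff mentioned above: sharp features in the PSD need many taps to approximate well, so `firwin2`'s tap count would have to grow with the complexity of the target spectrum.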
The inverse-Fourier method described in the other answers could also use a little more discussion. Randomizing only the phase (and not the magnitude) of the Fourier coefficients yields an infinite sequence with precisely the correct long-term average PSD (good), but also a 100%-repeatable PSD in every window of a PSD estimate (bad). The PSD of each window should 'dance' a little over time for a true stationary noise source, so both the magnitude and the phase should be randomized. One option: for the power at each frequency, multiply the PSD component by the square of a unit-variance Gaussian random number, and draw the phase uniformly from 0 to 2pi. Or you could generate the real and imaginary components independently, each as a unit-variance Gaussian (the complex coefficient has a variance of 2 at this point), and scale both by sqrt(PSD/2) so that the final variance of each coefficient equals the PSD at that frequency. Note that all of this assumes you have a stable, long-term average of the PSD; if not, it might be worth smoothing it first.
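The second variant (independent Gaussian real/imaginary parts scaled by sqrt(PSD/2)) can be sketched as follows. The flat placeholder PSD and the normalization convention (targeting E[|X[k]|²] = N·psd[k], i.e. a periodogram |X|²/N) are assumptions here; substitute your own PSD and convention:

```python
import numpy as np

rng = np.random.default_rng(1)

N = 4096                       # window length (samples)
# Illustrative flat PSD of 1.0 in every rfft bin; substitute your
# measured PSD. This targets E[|X[k]|^2] = N * psd[k].
psd = np.ones(N // 2 + 1)

# Independent Gaussian real and imaginary parts, each with variance
# N * psd / 2, so each complex coefficient has variance N * psd.
scale = np.sqrt(N * psd / 2.0)
X = scale * (rng.standard_normal(N // 2 + 1)
             + 1j * rng.standard_normal(N // 2 + 1))

# The DC and Nyquist bins of a real signal's spectrum must be real;
# the sqrt(2) restores their variance after dropping the imaginary part.
X[0] = X[0].real * np.sqrt(2.0)
X[-1] = X[-1].real * np.sqrt(2.0)

x = np.fft.irfft(X, n=N)
```

With this scaling, a flat PSD of 1 gives a time-domain variance of about 1, which is an easy sanity check to run on whatever normalization you settle on.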
There are also some potential issues with generating an arbitrary-length noise sequence from finite-length windows, each produced by an inverse Fourier transform of a finite-length PSD. If you feed the same input to each successive inverse FFT, you get a problem similar to the constant-magnitude issue above: each 'random' sinusoidal component in the time domain continues forever at the same magnitude (and, in this case, phase). That would cause the autocorrelation to spike at a lag of one FFT window, which is a clear sign of bad noise.
On the other hand, if you build each successive frequency-domain window from freshly randomized magnitude/phase values scaled by the PSD at each frequency, there are potential discontinuities when stitching the windows together in the time domain. For example, if the PSD has a large contribution from low frequencies (e.g. DC at wavenumber 0, or other low wavenumbers), there will be an artifact at the boundary between successive time-domain windows, where the magnitude and/or phase abruptly jumps from one random value to the next. I'm not sure what the textbook mitigation is, but what comes to mind is something like Welch's method in reverse: generate twice as many noise windows as needed, taper each one with a Hann window, and stack them with an overlap of half the window length to keep the magnitude and phase transitioning smoothly between windows. The Hann taper in the time domain has only a minimal smearing effect on the PSD, since in the frequency domain it is equivalent to convolution with a kernel just three bins wide (the DFT of a Hann window is nonzero only at wavenumbers 0 and ±1).
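The reverse-Welch idea could look something like this; the window length, hop, window count, and flat placeholder PSD are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 1024                        # samples per synthesis window
hop = N // 2                    # 50% overlap
n_windows = 16                  # 2x as many windows as non-overlapped stacking
psd = np.ones(N // 2 + 1)       # placeholder flat PSD; substitute your own

def synth_window(rng, N, psd):
    """One inverse-FFT window with independently randomized coefficients."""
    scale = np.sqrt(N * psd / 2.0)
    X = scale * (rng.standard_normal(psd.size)
                 + 1j * rng.standard_normal(psd.size))
    X[0] = X[0].real * np.sqrt(2.0)    # DC bin must be real
    X[-1] = X[-1].real * np.sqrt(2.0)  # Nyquist bin must be real
    return np.fft.irfft(X, n=N)

# Overlap-add: taper each window with a Hann window and stack at 50%
# overlap so magnitude/phase transitions between windows stay smooth.
hann = np.hanning(N)
out = np.zeros(hop * (n_windows - 1) + N)
for i in range(n_windows):
    out[i * hop : i * hop + N] += hann * synth_window(rng, N, psd)
```

One subtlety: for independent noise windows it is powers, not amplitudes, that add, so a plain Hann taper leaves a slight periodic ripple in the summed variance across each hop; tapering with the square root of a Hann window instead would make the overlapped power exactly constant.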
For arbitrary-length output, you could probably also just upsample/interpolate the PSD to the nearest power of 2 above the required noise sequence length and inverse-FFT the whole monstrous thing in one shot... but that would scale terribly.
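The interpolation step might look like the following (the bin counts are arbitrary); after stretching the PSD onto the longer rfft grid, the same randomized magnitude/phase synthesis described earlier applies to the big spectrum:

```python
import numpy as np

# Hypothetical: stretch a 513-bin measured PSD onto the rfft grid of a
# much longer output length via linear interpolation.
psd_small = np.ones(513)                  # placeholder measured PSD
M = 1 << 17                               # output length (power of 2)
grid_small = np.linspace(0.0, 1.0, psd_small.size)
grid_big = np.linspace(0.0, 1.0, M // 2 + 1)
psd_big = np.interp(grid_big, grid_small, psd_small)
```

The memory and compute cost of the single giant inverse FFT is what makes this approach scale poorly compared with the windowed methods above.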
Anyways, that's my 2 cents. Very open to thoughts, discussion, and criticism.