The OP mentions both a “noisy source” and “clean signals”, and that alone marks one distinction in when to use the Welch method, which I demonstrate below. Another motivation is to use tools that already apply all the correct scaling for an estimate of the power spectral density (PSD); if we are only interested in a proportional result rather than an exact estimate, that detail is less significant.
The Welch method averages shorter-length DFTs. Each shorter DFT has a wider resolution bandwidth per bin, and averaging them yields a smoother PSD than a single longer DFT would provide. To see this, compare the DFT of the longer sequence with the PSD from the Welch method: the result is less noise in the estimate of the power spectral density at the expense of frequency resolution. That trade is preferable when the signal is noisy (a random process) and its underlying spectral density does not change rapidly versus frequency.
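As a minimal sketch of that trade-off (not the code behind the plots), the snippet below compares one long periodogram with a Welch average on the same white-noise record. The seed, the record length, the segment length `nperseg = 256`, the default Hann window and `detrend=False` are all assumptions chosen for illustration.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)       # fixed seed, chosen only for repeatability
x = rng.standard_normal(2**16)       # unit-variance white Gaussian noise

# Single long DFT: one periodogram over the whole record (boxcar window).
f_long, p_long = signal.periodogram(
    x, fs=1.0, detrend=False, return_onesided=False
)

# Welch: average the periodograms of short, 50%-overlapped Hann segments.
nperseg = 256                        # hypothetical segment length
f_welch, p_welch = signal.welch(
    x, fs=1.0, nperseg=nperseg, detrend=False, return_onesided=False
)

# Fewer, wider bins for Welch, but far less scatter around the same level.
print("number of bins :", p_long.size, "vs", p_welch.size)
print("mean level     :", p_long.mean(), "vs", p_welch.mean())
print("std of estimate:", p_long.std(), "vs", p_welch.std())
```

Both estimates should sit near the same mean level, while the bin-to-bin scatter of the Welch result should be much smaller, since each of its bins is an average over many segments.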
In addition to reducing the noise in the estimate, using a validated Welch implementation such as those provided in MATLAB, Octave and Python avoids having to work out the proper scaling factors for windowing losses and resolution bandwidth, given the total number of samples, as you would when computing the PSD directly from the FFT.
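For reference, here is a sketch of the scaling that such a routine handles internally, assuming a two-sided density estimate from a single windowed DFT. The symbols are introduced here for illustration: $x[n]$ is the $N$-sample record, $w[n]$ the window, and $f_s$ the sampling rate.

$$\hat{S}_{xx}[k] = \frac{\left|\sum_{n=0}^{N-1} w[n]\,x[n]\,e^{-j2\pi kn/N}\right|^2}{f_s \sum_{n=0}^{N-1} w[n]^2}$$

This is the same as correcting the squared magnitude for the window's coherent gain, $\left(\sum_n w[n]\right)^2$, and then dividing by the resolution (equivalent noise) bandwidth $B_{\text{res}} = f_s \sum_n w[n]^2 / \left(\sum_n w[n]\right)^2$.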
I demonstrate this below using the PSD of a waveform generated from a Gaussian white noise process with a variance of 1 and $2^{18}$ total samples. I used the Welch method from Python's scipy.signal library, 'scipy.signal.welch'. Unless specified otherwise, it assumes a sampling rate of 1 Hz and returns a power spectral density in linear units of power per Hz (in whatever units of power follow from the units of the time domain samples); it does not return dB itself, so I convert the result to dB for plotting, where 0 dB is a power density of 1 W/Hz. Similarly, if the time units are in samples, the density is equivalently per unit of normalized frequency (W per cycle/sample). I used a Kaiser window with the default overlap of 50% to create the Welch PSD. For comparison, I processed the same time sequence using a similarly Kaiser-windowed FFT, properly scaled by the resolution bandwidth of the window to give an accurate estimate of the PSD as limited by that approach (with more sample-to-sample noise in the FFT estimate than in the Welch estimate).
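The comparison described above can be sketched roughly as follows. This is not the exact script used for the plots. The Kaiser `beta`, the segment length `nperseg`, the seed, `detrend=False` and `return_onesided=False` are assumptions; the last is chosen so that unit-variance noise sits at 0 dB, whereas scipy's one-sided default for real input doubles the density.

```python
import numpy as np
from scipy import signal

fs = 1.0                              # scipy's default sampling rate
n = 2**18
rng = np.random.default_rng(1)        # arbitrary seed for repeatability
x = rng.standard_normal(n)            # white Gaussian noise, variance 1

beta = 8.0                            # hypothetical Kaiser beta (not stated in the text)
nperseg = 1024                        # hypothetical Welch segment length

# Welch estimate with a Kaiser window and the default 50% overlap.
f_w, psd_w = signal.welch(
    x, fs=fs, window=("kaiser", beta), nperseg=nperseg,
    detrend=False, return_onesided=False,
)

# Single Kaiser-windowed FFT of the full record, scaled to a density:
# divide |X|^2 by fs * sum(w^2) to account for window loss and bin bandwidth.
w = signal.get_window(("kaiser", beta), n)
X = np.fft.fft(w * x)
psd_fft = np.abs(X) ** 2 / (fs * np.sum(w ** 2))
f_fft = np.fft.fftfreq(n, d=1 / fs)

# welch returns linear units, so the conversion to dB is explicit.
psd_w_db = 10 * np.log10(psd_w)
psd_fft_db = 10 * np.log10(psd_fft)

print("Welch mean level (dB):", 10 * np.log10(psd_w.mean()))
print("FFT mean level (dB)  :", 10 * np.log10(psd_fft.mean()))
```

The mean levels are computed on the linear values before converting to dB, because averaging the dB values of a noisy periodogram biases the result low.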

To confirm that the FFT method I used accurately represents the power spectral density (just with a noisier estimate), I include a zoomed-in plot below with a dashed line showing the variance of the entire DFT sequence (since this is a white noise process, the PSD is constant across the spectrum). The dashed line, computed from the noisy blue trace, shows excellent agreement with the PSD values returned by the Welch method. We also see the significantly smaller number of samples the Welch method needs to represent the spectrum, which is desirable when high frequency resolution is not needed.
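A hedged sketch of that sanity check: scale the windowed DFT so that its squared magnitude is the two-sided PSD at a 1 Hz sampling rate, take the variance of that complex sequence, and compare it with the Welch level. The record length, `nperseg`, `beta` and the seed are illustrative assumptions, not the values behind the plot.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(2)            # arbitrary seed
n, nperseg, beta = 2**16, 512, 8.0        # all hypothetical values
x = rng.standard_normal(n)                # unit-variance white noise

# Scale the DFT so |X_scaled|^2 is directly the two-sided PSD for fs = 1 Hz.
w = signal.get_window(("kaiser", beta), n)
X_scaled = np.fft.fft(w * x) / np.sqrt(np.sum(w ** 2))

# np.var of a complex array averages |X - mean|^2, so for this near
# zero-mean sequence it gives the flat PSD level (the "dashed line").
level = np.var(X_scaled)

_, psd_w = signal.welch(
    x, window=("kaiser", beta), nperseg=nperseg,
    detrend=False, return_onesided=False,
)

print("variance of scaled DFT:", level)
print("mean Welch PSD        :", psd_w.mean())
print("bins, FFT vs Welch    :", X_scaled.size, "vs", psd_w.size)
```

For white noise both numbers should land close to the time domain variance, while the Welch estimate describes the spectrum with `n / nperseg` times fewer points.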
