This answer assumes that the signal you are trying to detect is either not changing frequency with time or is changing slowly compared to your DFT window period (number of samples * sample period).
Your problem is that the SNR in your second picture is terrible, which makes it impossible to know with any confidence what is noise and what is a legitimate signal peak. You need to improve the SNR.
One way to do that is by averaging multiple DFTs. You are already doing the most difficult part of the process: sampling the data, taking the DFT, and calculating the magnitude. All you have to do is repeat that on successive blocks of samples and average the results in each bin. For example, if your DFTs are 512 samples long and you compute four of them, each bin of the averaged DFT is the mean of that bin's magnitude across the four DFTs: for bin 0 you average the four bin 0 magnitudes, for bin 1 you average the four bin 1 magnitudes, and so on.
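Mathematically it looks like this:
$ X'[k] = \frac{1}{n} \sum_{i=1}^{n} \left| X_i[k] \right| $, for all $k$
where $X'[k]$ is the averaged magnitude spectrum, $X_i[k]$ is the $i$-th DFT, $n$ is the number of DFTs, and $k$ is the bin index. Note that it is the bin magnitudes that are averaged, not the complex DFT values.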
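The averaging step can be sketched in a few lines of NumPy. This is only an illustration under some assumptions that are not in the question: the samples are real-valued, the blocks are consecutive and non-overlapping, and no window function is applied. The function name `averaged_magnitude_dft` and its parameters are made up for this example.

```python
import numpy as np

def averaged_magnitude_dft(samples, dft_len=512, num_dfts=4):
    """Average the magnitude spectra of num_dfts consecutive blocks.

    Assumes real-valued samples and non-overlapping blocks (an
    illustrative choice, not the only way to split the data).
    """
    samples = np.asarray(samples, dtype=float)
    needed = dft_len * num_dfts
    if samples.size < needed:
        raise ValueError("need at least dft_len * num_dfts samples")

    # One row per DFT block.
    blocks = samples[:needed].reshape(num_dfts, dft_len)

    # Magnitude of each block's DFT (rfft keeps bins 0 .. dft_len/2).
    magnitudes = np.abs(np.fft.rfft(blocks, axis=1))

    # Average bin k across all blocks: X'[k] = mean_i |X_i[k]|.
    return magnitudes.mean(axis=0)

if __name__ == "__main__":
    # Synthetic test signal: a weak tone buried in white noise.
    rng = np.random.default_rng(0)
    fs = 512.0
    t = np.arange(512 * 16) / fs
    x = 0.2 * np.sin(2 * np.pi * 64 * t) + rng.normal(size=t.size)
    avg = averaged_magnitude_dft(x, dft_len=512, num_dfts=16)
    print("strongest bin:", int(np.argmax(avg)))
```

Taking `np.abs` before the mean is the important detail: averaging the complex DFT outputs directly would only help if the signal's phase lined up from block to block.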
You should get about 3 dB of SNR improvement each time you double the number of DFTs. Going from 1 to 2 DFTs gets you 3 dB, 2 to 4 gets you another 3 dB, 4 to 8 another 3 dB, and so on. Unfortunately, you run into diminishing returns before long, because each additional 3 dB costs twice as many DFTs as the previous one. The number of DFTs you can average is also constrained by how quickly the signal is changing frequency. Once the total time spanned by the averaged DFT (sample period * number of samples per DFT * number of DFTs) is long enough that the signal's peak moves from one bin to another, the peak will start to get smeared across bins.
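To get a feel for these two limits, here is a small back-of-the-envelope sketch. It assumes the frequency drifts linearly at a known rate in Hz per second, which is a simplification introduced for this example, and it uses consecutive non-overlapping DFTs. The function names are hypothetical, and the gain figure is just the 3 dB per doubling rule of thumb, not a measured result.

```python
import math

def rule_of_thumb_gain_db(num_dfts):
    """Approximate SNR gain using the '3 dB per doubling' rule of thumb."""
    return 3.0 * math.log2(num_dfts)

def max_dfts_before_smearing(sample_rate_hz, dft_len, drift_hz_per_s):
    """Largest number of back-to-back DFTs before the peak drifts one bin.

    Assumes a constant (linear) drift rate; returns None when there is
    no drift, since drift then places no limit on the averaging.
    """
    if drift_hz_per_s == 0:
        return None
    bin_width_hz = sample_rate_hz / dft_len    # frequency span of one bin
    dft_duration_s = dft_len / sample_rate_hz  # time covered by one DFT
    time_to_cross_one_bin_s = bin_width_hz / abs(drift_hz_per_s)
    # Always allow at least the single DFT you are already computing.
    return max(1, int(time_to_cross_one_bin_s // dft_duration_s))

if __name__ == "__main__":
    n = max_dfts_before_smearing(8000.0, 512, 10.0)
    print("DFTs to average:", n)
    print("approximate gain (dB):", rule_of_thumb_gain_db(n))
```

If the limit comes out as 1, the signal is moving too fast for averaging at that DFT length, and a shorter DFT (wider bins, shorter blocks) may be the better trade.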