You are mixing up two different notions of sampling.
In a digital
communications system, the received (analog, continuous-time) signal
is passed through an A/D converter so that the needed further
processing (e.g. matched filtering) can be implemented on a
programmable DSP processor or as a MATLAB or C++ program or
on a special purpose ASIC. The sampling that you show in your question
is the one being carried out in the A/D converter, and as long as
the sampling rate is above the Nyquist rate, the fact that the samples
are not at the peaks of the signal is irrelevant: the original
signal is represented perfectly adequately by the samples regardless
of where the peaks in the signal are with respect to the sampling
instants.
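To see that nothing is lost when the samples miss the peaks, here is a minimal sketch. It is my own illustration, not something from your question: it assumes a periodic test signal (a 3 Hz cosine observed over 1 second) sampled 9 times per second with an arbitrary offset, and it reconstructs the signal with the periodic (Dirichlet-kernel) form of band-limited interpolation, which is exact for this setup. The names `N`, `F0`, `OFFSET`, `dirichlet` and `reconstruct` are all made up for the example.

```python
import math

N = 9          # samples per period (odd, so the kernel below is exact)
F0 = 3         # signal frequency in cycles per period; Nyquist rate 2*F0 = 6 < N
OFFSET = 0.37  # arbitrary sampling phase, chosen so that no sample hits a peak

def signal(t):
    """Assumed band-limited, periodic test signal with peaks of height 1."""
    return math.cos(2 * math.pi * F0 * t)

# A/D conversion: N equally spaced samples that all miss the peaks.
sample_times = [(n + OFFSET) / N for n in range(N)]
samples = [signal(t) for t in sample_times]

def dirichlet(u):
    """Periodic interpolation kernel for N samples per unit period."""
    s = math.sin(math.pi * u)
    if abs(s) < 1e-12:   # u is (numerically) an integer: kernel equals 1
        return 1.0
    return math.sin(math.pi * N * u) / (N * s)

def reconstruct(t):
    """Rebuild the signal value at any time t from the samples alone."""
    return sum(x * dirichlet(t - tn) for x, tn in zip(samples, sample_times))

peak_time = 1 / 3   # the cosine has a true peak here
print("largest sample value:", max(samples))
print("reconstructed value at the true peak:", reconstruct(peak_time))
```

None of the samples comes close to the peak value of 1, yet the value at the peak is recovered from the samples, which is all that later processing needs.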
The other notion of sampling has to do with the matched filter
output, which is a sequence of digital data. Which of these
data should the decision-making device use to decide
(in the simplest case) whether a 0 or a 1 was transmitted?
The answer is what you quote: it is the datum when the
signal output of the matched filter has a peak. That is,
we are picking one value from the matched filter output
sequence and making the decision based on that.
Note that

- We are not going to average the matched filter output (or
  take a weighted average of the output) and make a decision on that
  quantity.
- We are not going to take the maximum value of the actual output
  sequence (which may, because of noise, be at a different location
  than the time when the signal output of the matched filter is
  supposed to peak) and make a decision based on that.
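
The decision rule described above can be sketched as follows. This is only an illustration under assumptions of my own: a rectangular pulse of 8 samples per bit (`PULSE`), antipodal signaling (+1 for a 1, -1 for a 0), additive Gaussian noise, and a single isolated bit. The function names are hypothetical. The decision uses the one output sample at the instant where the signal part of the matched filter output is known to peak, and the loop at the end counts how often the maximum of the noisy output lands somewhere else.

```python
import random

PULSE = [1.0] * 8   # assumed rectangular pulse, 8 samples per bit

def matched_filter_output(received, pulse):
    """Convolve the received samples with the time-reversed pulse."""
    h = pulse[::-1]
    out = [0.0] * (len(received) + len(h) - 1)
    for i, r in enumerate(received):
        for j, hj in enumerate(h):
            out[i + j] += r * hj
    return out

def transmit(bit, noise_std, rng):
    """One antipodal pulse (+pulse for 1, -pulse for 0) plus Gaussian noise."""
    amplitude = 1.0 if bit else -1.0
    return [amplitude * p + rng.gauss(0.0, noise_std) for p in PULSE]

def decide_bit(received, pulse):
    """Decide from the single sample where the signal output should peak."""
    y = matched_filter_output(received, pulse)
    decision_index = len(pulse) - 1   # fixed in advance, not searched for
    return 1 if y[decision_index] > 0 else 0

# How often does the maximum of the noisy output fall at another instant?
rng = random.Random(0)
trials = 1000
moved = 0
for _ in range(trials):
    y = matched_filter_output(transmit(1, 1.0, rng), PULSE)
    if y.index(max(y)) != len(PULSE) - 1:
        moved += 1
print("noisy maximum not at the decision instant:", moved, "of", trials)
```

Note that `decide_bit` never looks for the largest output value; the index it reads is fixed by the pulse length and the (assumed known) symbol timing.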
For more information on matched filters (including explanations
of the above bulleted points), see
this other answer of mine.