Why do we define the partial derivative evaluated at $p$ in this strange way?
Let $V\subseteq\mathbb R^n$ be an open set and $f:V\rightarrow \mathbb R$ a real-valued function on $V$. Then for each $i=1,2,\dots,n$ the partial derivative (if it exists) is defined by $$ (D_if)(r^1,\dots,r^n):=\lim_{h\rightarrow 0}\frac{f(r^1,\dots,r^i+h,\dots,r^n)-f(r^1,\dots,r^i,\dots,r^n)}{h}. $$ In particular, the difference quotient only makes sense if $f(r^1,\dots,r^i+h,\dots,r^n)$ is defined for all sufficiently small $h$, which essentially requires $f$ to be defined on some open subset of $\mathbb R^n$.
Let $f\in C^\infty_p(M)$ be a germ of smooth functions at the point $p$ of the $n$-manifold $M$. That is to say, $f$ is represented by a smooth function defined near $p$, and two smooth functions defined near $p$ represent the same germ $f$ if and only if they agree on some neighborhood of $p$.
We cannot define the partial derivative of $f$ (or of any representative) directly, because $M$ is in general not an open subset of $\mathbb R^n$. However, suppose that $(U,x)$ is a local coordinate system near $p$. Then $x:U\rightarrow V$ is a homeomorphism from $U\subseteq M$ onto an open set $V\subseteq \mathbb R^n$. We may suppose without loss of generality that the germ $f$ has a representative defined on $U$, and we also write $f$ for this representative. Then the function $$ \hat f:=f\circ x^{-1} $$ is a smooth function defined on the open set $V\subseteq\mathbb R^n$. Explicitly, we have $$ \hat f(x^1(q),\dots,x^n(q))=f(q) $$ for each $q\in U$. We can now take the partial derivatives of this function, so we define $$ \frac{\partial f}{\partial x^i}(p):=(D_i\hat f)(x(p)). $$ Since $(D_i\hat f)(x(p))$ depends only on the values of $\hat f$ near $x(p)$, this does not depend on the chosen representative of the germ. We have to do this because we cannot take partial derivatives on $M$ directly, as $f$ is not a function of $n$ real variables. But the coordinate map can be used to represent $f$ as the multivariable function $\hat f$, which can then be partially differentiated. Hence the "strange" procedure.
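As a small worked example of this definition (the manifold, chart and function below are chosen purely for illustration and do not come from the question), take $M=S^1\subseteq\mathbb R^2$, $U=S^1\setminus\{(-1,0)\}$ and the angle chart $\theta:U\rightarrow(-\pi,\pi)$ whose inverse is $\theta^{-1}(t)=(\cos t,\sin t)$. For the height function $f(a,b)=b$ on $S^1$ we get $$ \hat f(t)=(f\circ\theta^{-1})(t)=\sin t,\qquad \frac{\partial f}{\partial\theta}(p)=(D_1\hat f)(\theta(p))=\cos(\theta(p)). $$ Since $p=(a,b)=(\cos\theta(p),\sin\theta(p))$, this says $\frac{\partial f}{\partial\theta}(a,b)=a$. Note that the differentiation is carried out entirely on the interval $(-\pi,\pi)\subseteq\mathbb R$ and never on the circle itself.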
How can one check if this actually satisfies the Leibniz rule?
Using the previous notation, we have for any pair of germs $f,g\in C^\infty_p(M)$ (identified with their representatives defined on the coordinate domain $U\subseteq M$) $$ \begin{aligned} \frac{\partial(fg)}{\partial x^i}(p)&=D_i(\widehat{fg})(x(p)) \overset{(1)}{=} D_i(\hat f\hat g)(x(p)) \overset{(2)}{=} (D_i\hat f)(x(p))\,\hat g(x(p))+\hat f(x(p))\,(D_i\hat g)(x(p)) \\ &\overset{(3)}{=}\frac{\partial f}{\partial x^i}(p)\,g(p)+f(p)\,\frac{\partial g}{\partial x^i}(p), \end{aligned} $$ where at (1) we have used $$ \widehat{fg}=(fg)\circ x^{-1}=(f\circ x^{-1})(g\circ x^{-1})=\hat f\hat g, $$ i.e. that composition with $x^{-1}$ is compatible with pointwise multiplication, at (2) we have used that the ordinary partial derivative $D_i$ in $\mathbb R^n$ obeys the Leibniz rule, and at (3) we have used the definition of $\frac{\partial}{\partial x^i}$ together with $\hat f(x(p))=f(p)$ and $\hat g(x(p))=g(p)$.
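To see the rule in a concrete case (again an illustrative choice, not taken from the question), let $M=S^1\subseteq\mathbb R^2$ with the angle chart $\theta$ on $U=S^1\setminus\{(-1,0)\}$, so that $\theta^{-1}(t)=(\cos t,\sin t)$, and let $f(a,b)=b$ and $g(a,b)=a$. Then $\hat f(t)=\sin t$, $\hat g(t)=\cos t$ and $\widehat{fg}(t)=\sin t\cos t$, so with $t=\theta(p)$ $$ \frac{\partial(fg)}{\partial\theta}(p)=\cos^2 t-\sin^2 t,\qquad \frac{\partial f}{\partial\theta}(p)\,g(p)+f(p)\,\frac{\partial g}{\partial\theta}(p)=(\cos t)(\cos t)+(\sin t)(-\sin t)=\cos^2 t-\sin^2 t. $$ Both sides agree, and in terms of $p=(a,b)$ the common value is $a^2-b^2$.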