Rather than "derive", I would talk about "why". For some explanations, have a look at my answers here and here. For a direct application, consider for example the $p$-variate normal distribution
$$f(x\mid\theta, \Sigma)=\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}
\exp\left( -\frac{1}{2}\langle (x-\theta),\Sigma^{-1}(x-\theta)\rangle \right);$$
the exponent is (up to a factor $-\frac{1}{2}$) the squared Mahalanobis distance of $x$ from the mean $\theta$. This is an example of a (Gaussian) kernel, widely used in density estimation. In the bivariate case, the level curves / density contours
$$\langle (x-\theta),\Sigma^{-1}(x-\theta)\rangle = K $$
are ellipses, with the usual statistical / mathematical interpretation.
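To make this concrete, here is a minimal numerical sketch (the particular values of $\theta$, $\Sigma$ and $x$ are just illustrative, and it assumes NumPy/SciPy are available): it checks that the multivariate normal log-density equals the log normalizing constant minus half the squared Mahalanobis distance, so points at equal Mahalanobis distance from $\theta$ lie on the same density contour.

```python
import numpy as np
from scipy.stats import multivariate_normal

p = 2
theta = np.array([1.0, -2.0])                    # mean (illustrative values)
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])                   # positive definite covariance
x = np.array([0.5, 0.0])

diff = x - theta
d2 = diff @ np.linalg.solve(Sigma, diff)         # squared Mahalanobis distance

# log-density assembled "by hand": normalizing constant plus -1/2 * d2
log_norm_const = -0.5 * (p * np.log(2 * np.pi) + np.log(np.linalg.det(Sigma)))
log_pdf_manual = log_norm_const - 0.5 * d2

# compare with SciPy's multivariate normal log-density
log_pdf_scipy = multivariate_normal(mean=theta, cov=Sigma).logpdf(x)
print(np.isclose(log_pdf_manual, log_pdf_scipy))  # True
```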
In more mathematical terms, the squared Mahalanobis distance is an example of a Bregman divergence, generated by the convex function $F(x)=\frac{1}{2}\langle x,\Sigma^{-1}x\rangle$. In the regression context, it is also related to leverage; I refer to specialized texts for more details.
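To see the connection explicitly (a short computation using the definition $D_F(x,y)=F(x)-F(y)-\langle \nabla F(y),x-y\rangle$ and $\nabla F(y)=\Sigma^{-1}y$):
$$D_F(x,y)=\frac{1}{2}\langle x,\Sigma^{-1}x\rangle-\frac{1}{2}\langle y,\Sigma^{-1}y\rangle-\langle \Sigma^{-1}y,x-y\rangle=\frac{1}{2}\langle (x-y),\Sigma^{-1}(x-y)\rangle,$$
i.e. half the squared Mahalanobis distance between $x$ and $y$.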
On Bregman divergences and Mahalanobis distance (with applications in topology): http://arxiv.org/pdf/0709.2196v1.pdf
On the geometry induced by the divergence with generator $F(x)=\frac{1}{2}\langle x,\Sigma^{-1}x\rangle$ (pp. 8-9 in particular):
http://bulletin.pan.pl/(58-1)183.pdf
This second reference shows that the Mahalanobis distance induces a Riemannian structure on a certain manifold, with metric tensor given by the positive definite matrix $\Sigma^{-1}$. This is nice.
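As a rough sketch of why (following the standard Bregman/Hessian-geometry construction rather than the paper's notation): the Riemannian metric induced by a Bregman generator $F$ is its Hessian, which here is constant,
$$g_{ij}(x)=\frac{\partial^2 F}{\partial x_i\,\partial x_j}=(\Sigma^{-1})_{ij},$$
so geodesics are straight lines and the geodesic distance between $x$ and $y$ is exactly the Mahalanobis distance $\sqrt{\langle (x-y),\Sigma^{-1}(x-y)\rangle}$.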