To start with, give your derivative a name, say deriv. For simplicity, let's assume there's no freedom to choose a differentiation variable, so your derivatives are always written as deriv[expression].
Now all derivatives share the following rules:
The derivative of a constant is zero:
deriv[_?NumericQ] := 0
The derivative is linear:
deriv[a_?NumericQ x_] := a deriv[x]
deriv[x_ + y_] := deriv[x]+deriv[y];
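Just as a quick check (assuming the three rules above have been entered; a and b are arbitrary undefined symbols I'm using for illustration):
deriv[2 a + 3 b + 7]
(*
==> 2 deriv[a] + 3 deriv[b]
*)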
The derivative follows the product rule (I'm assuming whatever you differentiate uses the ordinary commutative product):
deriv[a_ b_] := deriv[a] b + a deriv[b]
Using the commutative product also allows you to use the following power rule (for non-commutative products it gets more complicated):
deriv[a_^n_Integer] := n a^(n-1) deriv[a]
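Together, the product and power rules already reduce products and integer powers to derivatives of their factors; another quick check with undefined symbols a and b:
deriv[a b]
(*
==> a deriv[b] + b deriv[a]
*)
deriv[a^3]
(*
==> 3 a^2 deriv[a]
*)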
Now you have to define the specifics of your derivative. As an example, let's use the ordinary derivative with respect to x. Then we clearly have the following rule:
deriv[x] = 1
With this, we can already calculate the derivative of arbitrary polynomials with numeric coefficients:
deriv[3 x^2 + 5x + 7]
(*
==> 5 + 6 x
*)
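If your expressions also contain symbolic parameters, you can declare each of them constant with one extra rule; here c is just a hypothetical parameter name:
deriv[c] = 0;
deriv[c x^2 + x]
(*
==> 1 + 2 c x
*)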
Now we have to define how to differentiate general functions. For that, we need a notation for the derivative of a function f; for simplicity I'll restrict it to one-argument functions. So denote the derivative of a function f as d[f]. Then we can define the chain rule:
deriv[f_[expr_]] := d[f][expr] deriv[expr]
We can also define the derivatives of some specific functions:
d[Sin] = Cos; d[Cos] = (-Sin[#])&;
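Further functions can be registered the same way; for instance (these two aren't needed for the examples below, they just show the pattern):
d[Log] = (1/#)&; d[Tan] = (Sec[#]^2)&;
With these, deriv[Log[x]] evaluates to 1/x and deriv[Tan[x]] to Sec[x]^2.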
Now let's try it:
deriv[Sin[x]+Sin[Cos[x]]]
(*
==> Cos[x] - Cos[Cos[x]] Sin[x]
*)
deriv[f[g[x]]]
(*
==> d[f][g[x]] d[g][x]
*)
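The rules also compose as expected; for instance, the power rule and the chain rule together give:
deriv[Sin[x]^2]
(*
==> 2 Cos[x] Sin[x]
*)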
Of course, this definition is still quite incomplete and leaves plenty of room for optimization, but it should give you the idea.