Start with the definition: the standard enthalpy of formation of any substance, $\Delta H^\circ_\mathrm{f}$, is defined (by convention) as the enthalpy change, at 1 bar and a specified temperature (usually $\pu{298.15 K}$), for the reaction in which 1 mole of that substance is formed from its component elements in their respective standard states.
For instance, $\Delta H^\circ_\mathrm{f}$ for $\ce{H2O_{(l)}}$ is equal to $\Delta H^\circ$ for the following reaction, with every species in its standard state:
$$\ce{H2_{(g)} + 1/2O2_{(g)}->H2O_{(l)}},$$
since $\ce{H2_{(g)}}$ and $\ce{O2_{(g)}}$ are the respective standard states for hydrogen and oxygen.
Once you've accepted this definition as the convention for determining $\Delta H^\circ_\mathrm{f}$ for any substance, it directly follows that the value of $\Delta H^\circ_\mathrm{f}$ for any element in its standard state must be zero. For instance, $\Delta H^\circ_\mathrm{f}$ for $\ce{H2_{(g)}}$ is equal to $\Delta H^\circ$ for the following reaction:
$$\ce{H2_{(g)} -> H2_{(g)}},$$
which is necessarily zero.
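
The same argument can be sketched symbolically. Here $H^\circ_\mathrm{m}(\ce{X})$ is a symbol introduced purely for illustration: it stands for the molar enthalpy of X in its standard state, and only differences of such quantities are measurable. Under that assumption, the definition gives:

$$\Delta H^\circ_\mathrm{f}(\ce{H2O_{(l)}}) = H^\circ_\mathrm{m}(\ce{H2O_{(l)}}) - \left[H^\circ_\mathrm{m}(\ce{H2_{(g)}}) + \tfrac{1}{2}H^\circ_\mathrm{m}(\ce{O2_{(g)}})\right],$$

$$\Delta H^\circ_\mathrm{f}(\ce{H2_{(g)}}) = H^\circ_\mathrm{m}(\ce{H2_{(g)}}) - H^\circ_\mathrm{m}(\ce{H2_{(g)}}) = 0.$$

In the second line the product and the reactant are the same substance in the same state, so the difference vanishes whatever value $H^\circ_\mathrm{m}(\ce{H2_{(g)}})$ itself might have.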
To give an analogy: suppose you define the "altitude of formation", $\Delta z_f$, of any location on Earth as the change in altitude needed to reach that location from sea level.
Hence $\Delta z_f$ for the summit of Mt. Everest is the altitude change, $\Delta z$, associated with:
$$\ce{sea level -> summit of Everest},$$
which is 29,029 feet.
It necessarily follows, from this convention, that $\Delta z_f$ for any location at sea level is zero, since that would be equal to $\Delta z$ for:
$$\ce{sea level -> sea level}.$$