First, what can be substituted in place of the variables and how (when eliminating the $∀$ quantifiers)?
A term, i.e. either a variable, an (individual) constant, or a "complex term" built up from terms and function symbols.
At the beginning, the language of set theory has at most one individual constant : $\emptyset$.
But during the development of the theory, it is usual to "enlarge" the language with e.g. function symbols : $\mathcal P(x), \cap(x,y), \cup(x,y)$.
In the example you have linked, starting from the Empty set (or Null set) axiom : $\exists X \forall y \ \lnot (y \in X)$, by the Extensionality axiom it is easy to prove that this set is unique. Thus, we can add to the language a name for it (an individual constant : $\emptyset$).
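As a rough illustration of what "term" means here, the grammar can be sketched as an inductive type in Lean 4. The names `TermSketch`, `Term`, `emptyT`, `powX` and `unionT` are hypothetical and chosen only for this example; symbols are represented by plain strings, which is a simplifying assumption.

```lean
namespace TermSketch

/-- Terms: variables, individual constants, and applications of function symbols. -/
inductive Term where
  | var   : String → Term
  | const : String → Term
  | app   : String → List Term → Term

-- The constant ∅ and the complex terms P(x) and ∪(x, P(∅)).
def emptyT : Term := Term.const "∅"
def powX   : Term := Term.app "P" [Term.var "x"]
def unionT : Term := Term.app "∪" [Term.var "x", Term.app "P" [emptyT]]

end TermSketch
```

Any of these terms may be substituted for a universally quantified variable when eliminating $\forall$.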
If the empty set is a constant, is there a reason why axiom such as $∀a (a∉∅)$ is not in place ?
As said, it is a matter of choice; see Axiom of empty set :
In many formulations of first-order predicate logic, the existence of at least one object is always guaranteed. If the axiomatization of set theory is formulated in such a logical system with the Axiom schema of Separation as axioms, and if the theory makes no distinction between sets and other kinds of objects, then the existence of the empty set is a theorem.
Note : the empty set is a set (i.e. an object of the domain of discourse, or universe, of the theory), while the constant "$\emptyset$" is a symbol of the language that we use to formalize the theory.
Attempt at a formal derivation of some basic results...
The first step is to prove, using Extensionality, the uniqueness of the empty set (that allows us to speak of the empty set) : this fact, by way of suitable metatheorems, allows us to enlarge the language adding the individual constant "$\emptyset$".
1) $\exists X\exists X'[\forall y \lnot (y \in X) \land \forall y \lnot (y \in X') \land \lnot (X = X')]$ --- assumed
By $\exists$-elim twice followed by $\land$-elim, we have :
2) $\forall y (y \notin X)$
3) $\forall y (y \notin X')$
4) $\lnot (X=X')$
5) $a \notin X$ --- from 2) by $\forall$-elim
6) $a \notin X \to (a \in X \to a \in X')$ --- tautology : $\lnot P \to (P \to Q)$
7) $\forall z (z \in X \to z \in X')$ --- from 5) and 6) by $\to$-elim followed by $\forall$-intro
In the same way we have :
8) $\forall z (z \in X' \to z \in X)$
9) $\forall z (z \in X \leftrightarrow z \in X')$ --- from 7) and 8) by $\leftrightarrow$-intro
10) $X=X'$ --- from 9) and $\text{Extensionality}$ by $\forall$-elim and $\to$-elim
11) $\bot$ --- from 4) and 10) by $\to$-elim, allowing us to close the $\exists$-elim's
12) $\lnot \exists X \exists X'[\forall y \lnot (y \in X) \land \forall y \lnot (y \in X') \land \lnot (X = X')]$ --- from 1) and 11) by $\to$-intro, discharging assumption.
Now, by equivalences for quantifiers and suitable tautological equivalences, we have :
13) $\forall X \forall X'[\lnot \exists y (y \in X) \to (\lnot \exists y (y \in X') \to (X = X'))]$
14) $¬∃y(y∈A)$ --- from $\text{Empty set}$ by $\exists$-elim
15) $\forall X'[\lnot \exists y (y \in A) \to (\lnot \exists y (y \in X') \to (A = X'))]$ --- by $\forall$-elim
16) $\lnot \exists y (y \in A) \to \forall X'(\lnot \exists y (y \in X') \to (A = X'))$ --- $X'$ is not free in the antecedent
17) $\forall X'(\lnot \exists y (y \in X') \to (A = X'))$ --- from 14) and 16) by $\to$-elim
But the last one, together with 14), is the formal statement of uniqueness : $\exists ! X \forall y \ \lnot (y \in X)$.
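The core of the derivation above (steps 5 to 10 and their conclusion 13) can be checked mechanically. The following is a minimal Lean 4 sketch, assuming a hypothetical universe `V` with a membership relation `mem` and Extensionality stated as an axiom; these names are not part of the original derivation.

```lean
namespace EmptyUnique

-- Hypothetical setup: `V` is the universe of sets, `mem z X` reads "z ∈ X".
axiom V : Type
axiom mem : V → V → Prop

-- Extensionality: sets with the same elements are equal.
axiom extensionality : ∀ X X' : V, (∀ z, mem z X ↔ mem z X') → X = X'

-- Any two sets without elements are equal (compare step 13).
theorem empty_unique (X X' : V)
    (hX : ∀ y, ¬ mem y X) (hX' : ∀ y, ¬ mem y X') : X = X' :=
  extensionality X X' (fun z =>
    -- both directions hold vacuously: ¬P → (P → Q), as in step 6
    ⟨fun h => absurd h (hX z), fun h => absurd h (hX' z)⟩)

end EmptyUnique
```

The sketch proves uniqueness directly rather than by refuting assumption 1), but it uses the same two ingredients: the tautology $\lnot P \to (P \to Q)$ and Extensionality.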
Now we need the metatheorem for the Elimination of Defined Symbols :
Let $\Gamma$ be a theory over a language $\mathcal L$.
Assume that
$\Gamma \vdash (∃!y)R(y, x_1,\ldots, x_n)$,
pursuant to which we have defined the new function symbol $f$ by the axiom
$y = f(x_1, \ldots, x_n) ↔ R(y, x_1,\ldots, x_n)$
and thus extended $\mathcal L$ to $\mathcal L'$ and $\Gamma$ to $\Gamma'$.
Then $\Gamma \vdash \varphi$ for any formula $\varphi$ over $\mathcal L$ such that $\Gamma' \vdash \varphi$.
This metatheorem assures that extensions of theories by definitions are conservative in that they produce convenience but no additional power (the same old theorems over the original language are the only ones provable).
Having said that, we can enlarge the original language with the new constant symbol $\emptyset$ (an individual constant is a $0$-ary function symbol) and enlarge the theory with the definitional axiom :
$y=\emptyset ↔ \forall z \lnot (z \in y)$.
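To see how the definitional axiom answers the question about $∀a (a∉∅)$, here is a small Lean 4 sketch. It assumes a hypothetical universe `V`, a membership relation `mem`, a constant `emptySet` and the definitional axiom under the name `emptySet_def`; instantiating $y := \emptyset$ then gives the desired statement as a theorem.

```lean
namespace EmptyConst

-- Hypothetical setup: `V` is the universe of sets, `mem z y` reads "z ∈ y".
axiom V : Type
axiom mem : V → V → Prop

-- The new constant symbol and its definitional axiom: y = ∅ ↔ ∀ z, ¬ (z ∈ y).
axiom emptySet : V
axiom emptySet_def : ∀ y : V, y = emptySet ↔ ∀ z, ¬ mem z y

-- Instantiating y := ∅ and using ∅ = ∅ yields ∀ a, a ∉ ∅.
theorem not_mem_emptySet : ∀ a : V, ¬ mem a emptySet :=
  (emptySet_def emptySet).mp rfl

end EmptyConst
```

So $∀a (a∉∅)$ need not be taken as an axiom: once the constant is introduced by definition, it is a one-step consequence.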
The same "procedure" must be used to introduce suitable "names" corresponding to each existential axiom (or theorem); thus the Power set axiom will license us to introduce the function symbol $\mathcal P(x)$ and so, with the constant $\emptyset$ obtained from the Empty set axiom, we have the term $\mathcal P(\emptyset)$.
Now, what is $\mathcal P(\emptyset)$ ?
We have that :
$∀z[z ∈ \mathcal P(\emptyset) ↔ ∀w(w∈z → w ∈ \emptyset)]$.
We obviously have : $∀w(w ∈ \emptyset → w ∈ \emptyset)$, and thus $\emptyset \in \mathcal P(\emptyset)$.
Assume now that $X \ne \emptyset$ and $X \in \mathcal P(\emptyset)$.
We have that $∀w(w ∈ X → w ∈ \emptyset)$ and this means that $∀w \ \lnot (w ∈ X)$, i.e. $X = \emptyset$ : contradiction!
Thus, we conclude that :
$\forall z (z \in \mathcal P(\emptyset) \to z = \emptyset)$,
which, together with $\emptyset \in \mathcal P(\emptyset)$ shown above, amounts to (using set-builder notation) :
$\mathcal P(\emptyset) = \{ \emptyset \}.$
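The argument for $\mathcal P(\emptyset)$ can also be sketched in Lean 4. The setup below is an assumption made for illustration: `V`, `mem`, `emptySet` and `powerSet` are hypothetical names, and the characterizing properties of $\emptyset$ and $\mathcal P$ are simply postulated as axioms instead of being derived from the existential axioms.

```lean
namespace PowEmpty

-- Hypothetical setup: `V` is the universe of sets, `mem z X` reads "z ∈ X".
axiom V : Type
axiom mem : V → V → Prop
axiom extensionality : ∀ X X' : V, (∀ z, mem z X ↔ mem z X') → X = X'

-- The defined symbols ∅ and P, with their characterizing properties.
axiom emptySet : V
axiom not_mem_emptySet : ∀ a : V, ¬ mem a emptySet
axiom powerSet : V → V
axiom mem_powerSet : ∀ x z : V, mem z (powerSet x) ↔ ∀ w, mem w z → mem w x

-- z ∈ P(∅) ↔ z = ∅, which is the content of P(∅) = {∅}.
theorem mem_powerSet_empty (z : V) :
    mem z (powerSet emptySet) ↔ z = emptySet := by
  constructor
  · intro hz
    -- every element of z lies in ∅, so z and ∅ have the same elements
    apply extensionality
    intro w
    constructor
    · exact (mem_powerSet emptySet z).mp hz w
    · intro hw
      exact absurd hw (not_mem_emptySet w)
  · intro h
    -- z = ∅, and ∀ w (w ∈ ∅ → w ∈ ∅) holds trivially
    rw [h]
    exact (mem_powerSet emptySet emptySet).mpr (fun _ hw => hw)

end PowEmpty
```

The first bullet corresponds to $\forall z (z \in \mathcal P(\emptyset) \to z = \emptyset)$ and the second to $\emptyset \in \mathcal P(\emptyset)$.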