Since the bounty is about to expire, I am posting my findings so far.
Using the method mentioned by user42355, one can prove the following more general result.
Let $\lambda=(\lambda_1,\ldots,\lambda_{\ell})$ be an integer partition with at most $\ell$ parts, i.e., we assume the $\lambda_i$'s are integers such that $\lambda_1\ge\cdots\lambda_{\ell}\ge 0$. Define, for $N\ge \ell$,
$$
M_{\lambda}(N):=\mathbb{E}_{U(N)}\left[
|\Delta_1(U)|^{2(\lambda_1-\lambda_2)}
|\Delta_2(U)|^{2(\lambda_2-\lambda_3)}\cdots
|\Delta_{\ell-1}(U)|^{2(\lambda_{\ell-1}-\lambda_{\ell})}
|\Delta_{\ell}(U)|^{2\lambda_{\ell}}
\right]\ .
$$
Here $\Delta_j(U)$ denotes the $j\times j$ principal minor determinant in the top left corner of the matrix $U$, and the latter is random and Haar distributed in the unitary group $U(N)$.
Let $G_{\lambda}(N)$ denote the expectation of the same integrand but where the matrix $X$ is now sampled according to the Ginibre ensemble, and let $T_{\lambda}(N)$ be the the similar expectation with now a random matrix $T$ which is upper triangular with positive diagonal entries, sampled as follows.
The entries are independent. For $i<j$, $T_{ij}$ is such that $\sqrt{2}{\rm Re}(T_{ij})$ and $\sqrt{2}{\rm Im}(T_{ij})$ are independent standard $N(0,1)$ real Gaussian random variables. For $1\le i\le N$, $2T_{ii}^2$ is a chi-squared random variable with $2(N-i+1)$ degrees of freedom.
As in the answer by user42355, the key fact one needs is the Gram (or QR for numerical analysts) decomposition $X=UT$, as an equality in distribution, with $U$ and $T$ independent and distributed as above. We then immediately have
$$
G_{\lambda}(N)=M_{\lambda}(N) T_{\lambda}(N)\ .
$$
Now since all the entries of the Ginibre matrix $X$ are independent, we see that $G_{\lambda}(N)=:G_{\lambda}$ is independent of $N$. As a result,
$$
M_{\lambda}(N)=\frac{T_{\lambda}(\ell)}{T_{\lambda}(N)}M_{\lambda}(\ell)
$$
Since $|\Delta_{\ell}(U)|=1$ for $U$ unitary of size $\ell$, we get the recursion relation
$$
M_{(\lambda_1,\ldots,\lambda_{\ell})}(\ell)=
M_{(\lambda_1-\lambda_{\ell},\lambda_2-\lambda_{\ell},
\ldots,\lambda_{\ell-1}-\lambda_{\ell})}(\ell)
$$
which reduces the length of the partition seen as a sequence of nonnegative integers.
Using the formula for the moments of the chi-squared or Gamma distributions, we easily get
$$
T_{\lambda}(N)=(N)_{\lambda_1}(N-1)_{\lambda_2}\cdots(N-\ell+1)_{\lambda_{\ell}}
$$
where I used the (rising) Pochhammer symbol $(x)_n=x(x+1)\cdots(x+n-1)$.
Putting everything together, and solving the recursion on $\ell$, we arrive at
$$
M_{\lambda}(N)=\frac{(\ell)_{\lambda_1}(\ell-1)_{\lambda_2}\cdots(1)_{\lambda_{\ell}}}{(N)_{\lambda_1}(N-1)_{\lambda_2}\cdots(N-\ell+1)_{\lambda_{\ell}}}\times C_{\lambda}
$$
where
$$
C_{\lambda}=\prod_{j=2}^{\ell}\left(
\prod_{i=1}^{j-1}
\frac{(j-i)_{\lambda_i-\lambda_j}}{(j-i+1)_{\lambda_i-\lambda_j}}
\right)
$$
$$
=\prod_{j=2}^{\ell}\left(
\prod_{i=1}^{j-1}
\frac{(j-i)}{(j-i+\lambda_i-\lambda_j)}
\right)\ .
$$
In other words, we end up with the beautiful and most certainly known formula,
$$
M_{\lambda}(N)=\frac{h_{\lambda}}{\prod_{\Box\in\lambda}(N+c(\Box))}\ .
$$
Here $\Box$ denotes a box in the Ferrers diagram of $\lambda$, drawn the English (rather than French) way. The content of $\Box$ is $c(\Box)=j-i$ if the box lies in the $i$-th row (from top to bottom) and the $j$-th column (from left to right).
Finally, $h_{\lambda}$ is the product of hook lengths for the partition $\lambda$, which is also given by
$$
h_{\lambda}=\frac{(\lambda_1+\ell-1)!\ (\lambda_2+\ell-2)!\cdots \lambda_{\ell}!}{\prod_{1\le i<j\le\ell}\ (j-i+\lambda_i-\lambda_j)}\ .
$$
It is easy to see that the moment in the question corresponds to the rectangular partition $\lambda=(p,\ldots,p)$, where $p$ is repeated $k$ times.
In a previous version of the post for my question, I suspected that the moment should be one over the dimension of a representation of $U(N)$, but then I deleted that because I dismissed it as too good to be true. Yet, it is true! Namely,
$$
M_{\lambda}(N)=\frac{1}{{\rm dim}_{\mathbb{C}}(\mathbb{S}^{\lambda}(\mathbb{C}^N))}
$$
where $\mathbb{S}^{\lambda}$ is the Schur functor associated to the partition $\lambda$.
This follows from the hook content formula for ${\rm dim}_{\mathbb{C}}(\mathbb{S}^{\lambda}(\mathbb{C}^N))$ which is the number of semistandard Young tableaux of shape $\lambda$ filled with numbers from $1$ to $N$.
As far as references, for the key probabilistic Gram or QR decomposition, I looked at the notes by Menon and Trogdon mentioned by user42355, as well as other random matrix references from the areas of statistics (e.g., the book by Eaton), probability, mathematical physics (e.g., related to quantum transport), but I did not like the presentation therein (just my personal taste). This is because one has to go through rather tedious independence or conditional independence statements about partitioned Wishart matrices. The most illuminating presentation I found was the book "Analysis on Lie Groups, An Introduction" by Jacques Faraut. The needed Gram decomposition is done in Exercise 5 of Chapter 5, as a corollary of Theorem 5.3.1 which shows how to build a left Haar measure on a locally compact group $G$ homeomorphically decomposed as a product $G=PQ$ of two closed subgroups, using
a left Haar measure on $P$, a right Haar measure on $Q$, and the module function of $G$.
After finishing the computation of $M_{\lambda}(N)$, and contemplating the beautiful final formula, I realized one can prove this in a completely different way using the Peter-Weyl type formula for orthogonality of matrix elements of irreducible unitary representations of compact groups. Indeed the integrand of $M_{\lambda}$ is essentially of the form
$$
|\langle v,\pi_{\lambda}(U)v\rangle|^2
$$
for a suitable highest weight vector $v$.
Finally, note that there is related formula in my previous post:
Integration of a function over 7-sphere
The integrand is essentially the same, but the difference is about taking the expectation of a Ginibre versus a complex Wishart random matrix.