You can at least approximate colored noise by shaping the spectrum of white noise. Here are some common definitions of colored noise:
\begin{equation} S_{pink}(f) = \frac{S_{0}}{f^{\alpha}} \end{equation}
where $\alpha$ is close to 1,
\begin{equation} S_{brown}(f) = \frac{S_{0}}{f^{2}}\end{equation}
\begin{equation} S_{blue}(f) = S_{0}f\end{equation}
You can see more here. I'm not a Python person, so I won't write Python, but the basic procedure is: generate white noise $\rightarrow$ N-D FFT $\rightarrow$ N-D filter (e.g., for pink noise, element-wise multiply the white noise spectrum by $\frac{1}{\sqrt{f}}$, since the power spectral density goes as the squared magnitude of the spectrum) $\rightarrow$ N-D IFFT. This will give you an approximation of your desired colored noise.
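The procedure above can be sketched in NumPy for the 1-D case. This is a minimal illustration, not a reference implementation: the function name is mine, and leaving the DC bin unscaled (to avoid dividing by zero) is my own choice.

```python
import numpy as np

def colored_noise(n, alpha=1.0, seed=0):
    """Approximate 1/f^alpha noise (alpha=1 is pink, alpha=2 is brown)
    by spectrally shaping white Gaussian noise."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n)
    # Scale amplitudes by f^(-alpha/2) so the PSD goes as 1/f^alpha.
    # The DC bin (f = 0) is left untouched to avoid division by zero.
    scale = np.ones_like(freqs)
    scale[1:] = freqs[1:] ** (-alpha / 2)
    shaped = spectrum * scale
    return np.fft.irfft(shaped, n)

noise = colored_noise(4096, alpha=1.0)
```

Setting `alpha=-1` in the same scaling gives the blue-noise spectrum $S_0 f$.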
For generating a covariance (correlation) matrix, the function you are likely looking for is corrmtx from the spectrum package, which is equivalent to MATLAB's corrmtx. It will give you a correlation matrix estimate for 1-D data via time-averaging. However, you seem unsure about what the correlation matrix is and how to find it. For a data vector
\begin{equation}\underline{x}=\begin{bmatrix}x(0) & x(1) & \cdots & x(N-1)\end{bmatrix}^{T}\end{equation}
The correlation matrix is defined as
\begin{equation} R_{xx} = E\{\underline{x}\,\underline{x}^{H}\}\end{equation}
This means that the correlation matrix is the ensemble averaged outer product, and thus represents the complete set of statistical second moments of a data vector $\underline{x}$. Theoretically, for a vector of length $N$, the correlation matrix is an $N$-by-$N$ matrix. Estimating via time-averaging will typically decrease the size of the estimated correlation matrix.
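As a concrete illustration of estimating $R_{xx}$ by time-averaging, here is a simplified stand-in for what corrmtx constructs internally (the function name and windowing choices are mine; corrmtx's exact data-matrix layout differs):

```python
import numpy as np

def corr_matrix_time_avg(x, m):
    """Estimate an m-by-m correlation matrix from a 1-D sequence by
    averaging conjugate outer products of length-m sliding windows."""
    x = np.asarray(x)
    n = len(x)
    # Each row of `windows` is one length-m snapshot of the data.
    windows = np.stack([x[i:i + m] for i in range(n - m + 1)])
    # Summing the outer products x_i x_i^H over all snapshots is the
    # same as W^T conj(W); divide by the snapshot count to average.
    return (windows.T @ windows.conj()) / (n - m + 1)
```

For zero-mean white noise this estimate approaches $\sigma_{0}^{2}I$ as the record length grows.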
For stationary 2-D data, for example, the correlation matrix of the vectorized data has a block Toeplitz structure (with Toeplitz blocks) rather than being a simple extension of the 1-D case. Estimating correlation matrices of 2-D or higher-dimensional data is conceptually challenging, so I would start with 1-D and make sure you understand that first.
EDIT 1: Question about Covariance Matrix vs. Correlation Matrix
Wikipedia's explanation doesn't seem right. The covariance matrix is defined as
\begin{equation}K_{xx} = E\{(\underline{x}-\mu_{x})(\underline{x}-\mu_{x})^{H}\}\end{equation}
where $\mu_{x}$ is assumed constant for a Wide-Sense Stationary random process. For noise to be white, it must be zero-mean. There are several Stack Exchange posts on this; I'll link this one. This means that, for white noise,
\begin{equation} R_{xx} = K_{xx} = \sigma_{0}^{2}I\end{equation}
where $I$ is the identity matrix.
Practically speaking, any property that applies to the correlation matrix also applies to the covariance matrix, although the converse is not true. I find estimating the covariance matrix more tedious, and it doesn't always provide a huge benefit, especially when dealing with higher-dimensional data, but there could well be practical uses for it that I'm unaware of. To estimate it, you have to subtract off the mean.
Responding to your updated question: that is not how you would calculate the covariance matrix. You would still use the corrmtx function, but your input would be the data with the mean subtracted off. If you wanted to mimic ensemble averaging, you could form an $N$-by-$M$ matrix containing $M$ draws of an $N$-length random vector, where $M \gg N$. You would then take the outer product of each column and average the $M$ outer products.
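A sketch of that ensemble-averaging recipe (the function name is illustrative; the mean subtraction is what makes this the covariance rather than the correlation matrix):

```python
import numpy as np

def cov_matrix_ensemble(X):
    """Estimate the N-by-N covariance matrix from an N-by-M matrix
    whose M columns are independent draws of an N-length random
    vector: subtract the estimated mean, then average the M conjugate
    outer products."""
    X = np.asarray(X)
    mu = X.mean(axis=1, keepdims=True)   # estimated mean vector
    Xc = X - mu                          # subtract off the mean
    # Summing x_i x_i^H over the M columns equals Xc @ Xc^H.
    return (Xc @ Xc.conj().T) / X.shape[1]
```

For zero-mean white draws the result approaches $\sigma_{0}^{2}I$, matching the identity above.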