There are two problems. The first is that the height of the matrix nodes is equal to only twice the inner ysep because they have no content, whereas the T node has height equal to twice the inner ysep plus the height of the T. There are various ways to address this. One simple approach is to measure the height of the T and set that as text height for the matrix nodes. (If you have other content in the matrix in your real document, minimum text height may be more appropriate.)
The second is that you need to anchor the T node relative to the nodes of the matrix. Basically, you want the T's .base anchor at .base of the combined node. In this case, that's just diagram.base. Again, if you have more nodes, you can create the fit node and then use <name of fit node>.base.
\documentclass{standalone}
\usepackage{tikz}
\usetikzlibrary{matrix}
\usetikzlibrary{fit}
\newlength\myTht
\settoheight\myTht{T}
\begin{document}
\begin{tikzpicture}
\matrix (diagram) [matrix of nodes, nodes in empty cells, nodes={draw}, text height=\myTht] {% or minimum text height if other nodes have deeper or taller material
& \\
};
\node [anchor=base] at (diagram.base) {$T$};
\node [draw, fit=(diagram-1-1)(diagram-1-2)] {} ;
% \node (fitted) [draw, fit=(diagram-1-1)(diagram-1-2)] {} ;
% \node [anchor=base] at (diagram-1-1.base -| fitted.center) {$T$};
\end{tikzpicture}
\end{document}
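For a matrix with more nodes, the commented-out lines above can be turned into a complete example. This is only a sketch: the second row (`a & c`) is a hypothetical addition that is not in the original question, and the node name `fitted` is taken from the commented-out lines. The T takes its baseline from a first-row cell and its horizontal position from the centre of the fitted node.

```latex
\documentclass{standalone}
\usepackage{tikz}
\usetikzlibrary{matrix,fit}
\newlength\myTht
\settoheight\myTht{T}% height of the letter T
\begin{document}
\begin{tikzpicture}
  \matrix (diagram) [matrix of nodes, nodes in empty cells, nodes={draw}, text height=\myTht] {%
    & \\
    a & c \\% hypothetical extra row, only here for illustration
  };
  % fit only the two cells of the first row; the fitted node itself stays empty
  \node (fitted) [draw, fit=(diagram-1-1)(diagram-1-2)] {};
  % vertical position: base of a first-row cell; horizontal position: centre of the fitted node
  \node [anchor=base] at (diagram-1-1.base -| fitted.center) {$T$};
\end{tikzpicture}
\end{document}
```

The `-|` coordinate syntax takes the y-coordinate from the first point and the x-coordinate from the second, so the second row should have no influence on where the T ends up.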

EDIT
You could also use
\node (diagram-fit) [draw, fit=(diagram-1-1)(diagram-1-2)] {} ;
\node at (diagram-fit) {$T$};
This suggests that the default anchor for the fitted node is still center. Certainly, the T is centred without specifying any anchors. This should also work if your matrix includes more nodes, which you don't want influencing the placement of T.
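Put together as a complete document, the two-step version might look like the following sketch. It reuses the preamble and matrix from the first example, and the only change is in the last two `\node` lines.

```latex
\documentclass{standalone}
\usepackage{tikz}
\usetikzlibrary{matrix,fit}
\newlength\myTht
\settoheight\myTht{T}% height of the letter T
\begin{document}
\begin{tikzpicture}
  \matrix (diagram) [matrix of nodes, nodes in empty cells, nodes={draw}, text height=\myTht] {%
    & \\
  };
  % step 1: create and draw the fitted node with no text of its own
  \node (diagram-fit) [draw, fit=(diagram-1-1)(diagram-1-2)] {};
  % step 2: place the T in a separate node, centred on the fitted node
  \node at (diagram-fit) {$T$};
\end{tikzpicture}
\end{document}
```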
However, if you try to put the $T$ into diagram-fit at the same time as you create and draw diagram-fit, things go pear-shaped. This isn't especially unusual: there are other cases where you effectively need two separate operations to get sensible alignment by default. For example, \draw ... node ... will give you something different from \draw ... coordinate ... ; \node at () .... I don't think there's anything special about the fitted node as such. I suspect the problem is that it is not a straightforward \node. (But this is just a suspicion; I haven't looked.)
fit functionality is not made for nodes with text in it since it uses text height and text depth to set the size of the node. The given answers solve this in different ways but you're going to run into problems as soon as you want to put more complex text inside the nodes. A PGF/TikZ matrix isn't really made for multicol but solutions exist that try their best. – Qrrbrbirlbel Oct 17 '23 at 12:05