While the law of $X$ does give a perfectly good probability space for a universe consisting of a single random variable, what about situations with two, three, four, countably infinitely many, or even uncountably many random variables?
In your situation with a single random variable, you can just take $(\Omega,\mathcal{F}, P) = (\mathbb{R},\mathcal{B},P_X)$ as your probability space. Here $P_X$ is just a specified probability measure on $(\mathbb{R},\mathcal{B})$, so there's no circularity (you can define it in terms of a distribution function if you wish). Similarly, if you have a universe with two random variables, you can take $(\Omega,\mathcal{F}, P) = (\mathbb{R}^2,\mathcal{B}(\mathbb{R}^2),P_{X_1,X_2})$ for the appropriate probability measure $P_{X_1,X_2}$. The pattern continues through situations with countably many random variables, or even uncountably many (as with continuous-time stochastic processes), though the functional analysis gets more challenging. What doesn't change as the complexity increases is that the situation can be described by a probability space: in the case of a continuous-time process like Brownian motion, the natural one would look like $(\Omega,\mathcal{F}, P) = $ (all paths $B(t)$, some sigma-algebra, some measure on that space of functions). Since there are many different random systems you might want to describe, with probability spaces of varying complexity, it's easier to keep the notation abstract. However, if you have a concrete situation with a natural set of random variables, you almost always think in terms of the natural probability spaces described above, perhaps in terms of distribution functions instead of measures (the two descriptions are equivalent).
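To make the "define it in terms of a distribution function" remark concrete (the symbols $F$, $a$, $b$ below are just generic placeholders for illustration): in one dimension the measure is pinned down on intervals by $$P_X\big((a,b]\big) = F_X(b) - F_X(a), \qquad F_X(x) = P(X \le x),$$ and this determines it on all of $\mathcal{B}$ by the usual extension theorem. The same idea works in two dimensions, where the probability of a rectangle comes from the joint distribution function by inclusion–exclusion: $$P_{X_1,X_2}\big((a_1,b_1]\times(a_2,b_2]\big) = F(b_1,b_2) - F(a_1,b_2) - F(b_1,a_2) + F(a_1,a_2),$$ with $F(x_1,x_2) = P(X_1 \le x_1,\, X_2 \le x_2).$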
So why bother with the $P_X(B) = P(\{\omega : X(\omega)\in B\})$ definition? Well, $X$ might not be the only thing in the universe, and you might want to lift it out to consider it in isolation. (Of course, the two definitions coincide if $(\Omega,\mathcal{F}, P) = (\mathbb{R},\mathcal{B},P_X)$.) Or you might have chosen a different representation of the probability space than the natural one. For instance, maybe you take $$(\Omega,\mathcal{F}, P) = ([0,1], \mathcal{B}([0,1]), \mathrm{Lebesgue}).$$ This space can be used to model lots of stuff. You can model a random sequence of zeros and ones (whose "natural" outcome space is not $[0,1]$ but rather $\{0,1\}^{\mathbb{N}}$) by mapping $\omega \in [0,1]$ to its binary expansion. Or you can model a Gaussian RV by mapping $\omega \in [0,1]$ to $X = \Phi^{-1}(\omega) \in \mathbb{R}$, with $\Phi$ the standard normal CDF (as in inverse transform sampling). You can even use some trickery to model a whole sequence of random variables with given distributions (for instance, by splitting the binary digits of $\omega$ into infinitely many disjoint subsequences). As you alluded to in your question, the two representations carry the same information and are equivalent. Since there are many possible spaces that can represent a given random model, and it doesn't matter which you choose, it's natural to be abstract and write $(\Omega,\mathcal{F}, P).$
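If a concrete sketch helps, here is a minimal Python illustration of those last two constructions (the function names `coin_flips` and `gaussian` are just mine, not standard): a single draw $\omega$ from $([0,1], \mathcal{B}([0,1]), \mathrm{Lebesgue})$ is read off both as a 0/1 sequence, via its binary digits, and as a standard normal variate, via $\Phi^{-1}$ (inverse transform sampling).

```python
import random
from statistics import NormalDist

def coin_flips(omega, n=16):
    """Read off the first n binary digits of omega in [0, 1):
    under Lebesgue measure these behave as i.i.d. fair coin flips."""
    flips = []
    for _ in range(n):
        omega *= 2
        bit = int(omega)     # the digit is 1 exactly when the doubled value crossed 1
        flips.append(bit)
        omega -= bit
    return flips

def gaussian(omega):
    """Inverse transform: X = Phi^{-1}(omega) is N(0, 1) when omega ~ Uniform(0, 1)."""
    return NormalDist().inv_cdf(omega)

# One sample point omega from ([0,1], Borel, Lebesgue)...
omega = random.random()   # uniform on [0, 1)

# ...viewed two different ways, as two different random objects:
print(coin_flips(omega))  # a finite stretch of a random 0/1 sequence
print(gaussian(omega))    # a standard normal variate
```

The same $\omega$ feeds both constructions; which "random object" you see is entirely a matter of the map you apply to it, which is the sense in which the choice of representation is up to you.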