How the holographic principle actually works is still an active area of research, so don't expect a completely satisfactory answer yet: even the experts are still trying to understand it!
With your background, a good point of entry might be to study toy models based on tensor networks, where the weirdness of the holographic principle is "explained" in terms of the properties of certain quantum error-correcting codes. (The subject of quantum error-correcting codes, in turn, is relatively accessible from a purely mathematical perspective without much knowledge of physics.) An early paper on such toy models is
Even though this paper is written for an audience that is already familiar with holographic-principle research, it is written in a friendly style and cites some other background sources that might be helpful. A more recent review, which might be helpful on its own or might at least cite some other helpful references, is
These toy models (partly) address how something defined on a lower-dimensional space can act like something in a higher-dimensional space, but they don't directly address what might be the core of your question, namely how something defined on a lower-dimensional space can possibly encode all the stuff that's in a higher-dimensional space. An important point here is that our experiments in four-dimensional space-time have limited resolution, and the amount of stuff that we can actually resolve with our current experimental capabilities is nowhere near the amount of stuff that can be encoded by the alleged lower-dimensional holographic dual.
Even in principle, we probably couldn't ever resolve anything more than that. The reason is worded well by Kaplan (2016) in section 1.1 of "Lectures on AdS/CFT from the Bottom Up", http://sites.krieger.jhu.edu/jared-kaplan/files/2016/05/AdSCFTCourseNotesCurrentPublic.pdf (accessed 2018-10-14), who says it like this: "if you were to try to resolve distances of order the Planck length $\ell_{pl}\sim 10^{-35}$ meters, you would need energies of order the Planck mass..., at which point you would start to make black holes. We are fairly certain of this because the universally attractive nature of gravity permits gedanken experiments in which we could make black holes without passing through a regime of physics we don’t understand. Pumping up the energy further just results in larger and larger black holes, and the naive reductionist program [of trying to resolve things at smaller and smaller scales] comes to an end. So it seems that there’s more to understanding quantum gravity than simply finding a theory of the “stuff” that’s smaller than the Planck length - in fact there is no well-defined notion of smaller than $\ell_{pl}$."
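To put rough numbers on Kaplan's argument, here is a minimal order-of-magnitude sketch. It is my own illustration, not something from the lecture notes. It assumes the usual heuristic that resolving a distance $\ell$ takes an energy of roughly $\hbar c/\ell$, and it compares $\ell$ with the Schwarzschild radius $2GE/c^4$ of that much energy. The function names are made up for this example, and factors of order one should not be taken seriously.

```python
import math

# Physical constants in SI units (CODATA values).
HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8        # speed of light, m/s


def planck_length():
    """Planck length sqrt(hbar G / c^3), in meters."""
    return math.sqrt(HBAR * G / C**3)


def planck_mass():
    """Planck mass sqrt(hbar c / G), in kilograms."""
    return math.sqrt(HBAR * C / G)


def probe_energy(length):
    """Heuristic energy needed to resolve `length`: E ~ hbar c / length (joules)."""
    return HBAR * C / length


def schwarzschild_radius(energy):
    """Schwarzschild radius of a mass E / c^2: r = 2 G E / c^4 (meters)."""
    return 2 * G * energy / C**4


if __name__ == "__main__":
    print("Planck length (m):", planck_length())
    print("Planck mass (kg):", planck_mass())
    # Compare the target resolution with the horizon size of the probe energy.
    for length in (1e-20, planck_length(), 1e-40):
        horizon = schwarzschild_radius(probe_energy(length))
        verdict = "black hole forms first" if horizon >= length else "resolvable"
        print(f"target {length:.3e} m, horizon {horizon:.3e} m: {verdict}")
```

In this crude estimate the horizon of the probe grows as $\ell_{pl}^2/\ell$ while the target shrinks, so the two curves cross near the Planck length. Below that, pumping in more energy only makes the black hole bigger, which is the point of the quote.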
The point is that the holographic principle doesn't say that we can use a lower-dimensional space to encode all of the arbitrarily small stuff that could mathematically fit in a higher-dimensional space. What it does say (loosely translated) is that we can encode enough stuff to match what we can observe in the physical world and, this being the interesting part, that we can encode it in a way that really does behave like it lived in a higher-dimensional space. This last part is what the "error correction" toy models are meant to address, and this is what the words "bulk locality" allude to in the section title of the 2018 review cited above.