This is just a 2x interpolator: the delay between the top path and the bottom path is a 1/2 sample offset at the input rate, so the half-band filter output provides the expected interpolated value in between each of the input samples, giving a 2x up-sampled output.
To help make this very clear, in the graphic below the blue samples represent the original waveform x[n] after a fixed processing delay of some number of samples. The red samples are the output of the interpolation filter, which provides the same waveform with an additional 1/2 sample delay (easily done with a half-band filter by using an even number of taps in the filter path). The output commutator then alternates between the two paths, producing the upsampled waveform at twice the input rate.
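As a rough numerical check of this structure, here is a minimal NumPy sketch. The taps `[-1, 0, 9, 16, 9, 0, -1] / 32` are a commonly used 7-tap half-band filter chosen here purely for illustration, and the function names are hypothetical. It compares the two-path form (one filter path, one pure delay path, then a commutator) against the textbook zero-stuff-then-filter interpolator.

```python
import numpy as np

# Illustrative 7-tap half-band filter (an assumption, not taken from the figures).
H = np.array([-1, 0, 9, 16, 9, 0, -1]) / 32.0

def interp2_direct(x, h=H):
    """Zero-stuff by 2, then filter at the output rate (gain of 2 restores amplitude)."""
    u = np.zeros(2 * len(x))
    u[0::2] = x
    return np.convolve(u, 2.0 * h)[: 2 * len(x)]

def interp2_polyphase(x, h=H):
    """Two paths running at the input rate, followed by an output commutator."""
    g = 2.0 * h
    filt_path = np.convolve(x, g[0::2])[: len(x)]   # 4 symmetric taps: 1.5 samples of delay
    delay_path = np.convolve(x, g[1::2])[: len(x)]  # taps [0, 1, 0]: a pure 1-sample delay
    y = np.empty(2 * len(x))
    y[0::2] = filt_path    # interpolated ("red") samples
    y[1::2] = delay_path   # delayed original ("blue") samples
    return y

x = np.sin(2 * np.pi * 0.05 * np.arange(32))
y = interp2_polyphase(x)
print(np.allclose(y, interp2_direct(x)))
```

The filter path has 1.5 input samples of delay and the delay path has 1, which is where the 1/2 sample offset between the two comes from.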

Below I have posted slides I already had of this process as a Half Band Decimator, which combines the half-band construction with polyphase decomposition. The interpolator is formed in the same fashion by reversing this: the input feeds both the filter and the delay, and the output is commutated between the two. The major significance of doing this is that, after the polyphase transformation, all filter elements run at half the clock rate:
We start with a 7-tap half-band filter where, as in the typical half-band construction, the taps are placed to coincide with the zeros of the sinc impulse response to minimize the number of non-zero coefficients:
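To illustrate where the zero-valued taps come from, here is a small sketch that builds a 7-tap half-band filter by windowing a sinc. The Hamming window and the name `halfband_taps` are assumptions made for this example; the filter in the slides may have been designed differently.

```python
import numpy as np

def halfband_taps():
    """7-tap half-band filter from a windowed sinc with cutoff at a quarter of the sample rate."""
    n = np.arange(-3, 4)                  # tap offsets from the center tap
    h = 0.5 * np.sinc(n / 2.0)            # np.sinc(t) = sin(pi*t)/(pi*t), zero at t = +/-1
    h *= np.hamming(len(n))               # window choice is an assumption
    h[np.abs(n) == 2] = 0.0               # remove floating-point residue at the sinc zeros
    return h

h = halfband_taps()
print(np.round(h, 4))
```

Every even offset from the center (here only ±2) lands on a zero of the sinc, so only the center tap and the odd-offset taps need multipliers.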

Next, we perform the polyphase decomposition by mapping rows to columns, shown below for the decimate-by-2 case (two polyphase filters):

This then reduces to the simplified form below:
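The decomposition and its reduction can be checked numerically with the sketch below. It again assumes the illustrative taps `[-1, 0, 9, 16, 9, 0, -1] / 32`, an even-length input, and hypothetical function names. The even-indexed taps form a 4-tap branch, while the odd-indexed taps are `[0, 0.5, 0]`, so that branch reduces to a delay and a scale by 1/2.

```python
import numpy as np

# Illustrative 7-tap half-band filter (an assumption, not taken from the slides).
H = np.array([-1, 0, 9, 16, 9, 0, -1]) / 32.0

def decim2_direct(x, h=H):
    """Filter at the input rate, then keep every other output sample."""
    return np.convolve(x, h)[: len(x)][0::2]

def decim2_polyphase(x, h=H):
    """Input commutator feeds two branches that both run at half the input rate."""
    x_even, x_odd = x[0::2], x[1::2]      # assumes len(x) is even
    m = len(x_even)
    b0 = np.convolve(x_even, h[0::2])[:m]            # 4-tap branch
    b1 = np.convolve(x_odd, h[1::2])[:m]             # taps [0, 0.5, 0]: delay and scale
    b1 = np.concatenate(([0.0], b1))[:m]             # odd branch lags by one low-rate sample
    return b0 + b1

x = np.random.default_rng(1).standard_normal(64)
print(np.allclose(decim2_polyphase(x), decim2_direct(x)))
```

All multiplies in `decim2_polyphase` happen after the rate change, which is the half-clock-rate benefit mentioned above.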

Given the symmetric taps, we can further simplify this to be:
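One common way to exploit the symmetry, sketched here as an assumption about what the simplified diagram shows, is to pre-add the pairs of delayed samples that share a coefficient. For a 4-tap branch with taps `[a, b, b, a]` this needs two multiplies per output instead of four. The function name and tap values are illustrative only.

```python
import numpy as np

def symmetric_branch(x, a, b):
    """Apply taps [a, b, b, a] using pre-adders: a*(x[m] + x[m-3]) + b*(x[m-1] + x[m-2])."""
    xp = np.concatenate((np.zeros(3), x))  # zero initial state for the delay line
    x0 = xp[3:]      # x[m]
    x1 = xp[2:-1]    # x[m-1]
    x2 = xp[1:-2]    # x[m-2]
    x3 = xp[:-3]     # x[m-3]
    return a * (x0 + x3) + b * (x1 + x2)

a, b = -1 / 32.0, 9 / 32.0                 # illustrative tap values (an assumption)
x = np.random.default_rng(3).standard_normal(32)
ref = np.convolve(x, [a, b, b, a])[: len(x)]
print(np.allclose(symmetric_branch(x, a, b), ref))
```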

And the same in a more general form:
