I see zero-padding as one form of an often-used combination of two more basic operations:
- "extension",
- "windowing",
for finite-length data (signals, images).
[Interlude] I don't know a generic word for that (like companding, that combines compressing and expanding). Fellow SE.DSPers, if you know of one, please share. Otherwise, I would propose "extowing". I will start with the "windowing" part.
Windowing:
While there is a huge literature on the design of 1D windows, I have neither been taught, nor seen many references on, 2D (and nD) designs. If I focus on 2D windows per se (designed for windowing itself, not for other purposes), my bibliographic reference list (10,900+ items and counting) contains only a handful of such references, mostly old ones, like: Huang, T. S., 1970, Two-dimensional windows (quote: "Many good one-dimensional windows have been devised, however, relatively few two-dimensional windows have been investigated.") or Coulombe, S. and Dubois, E., 1996, Multidimensional windows over arbitrary lattices and their application to FIR filter design. Most per se designs I know of are:
- tensor, separable outer-products of 1D windows,
- non-separable circular extensions of 1D designs, where a centered $W(t)=w(|t-t_0|)$ window is converted to 2D with some norm $l$: $W(x)=w(l(x-x_0))$, with discretization and normalization side-effects,
- non-separable "1D-inspired" 2D optimization (like McClellan).
However, I have not seen a lot of them natively implemented in image processing software (apart from tensorized 1D windows and 2D discretized Gaussians).
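To make the first two families concrete, here is a minimal NumPy sketch. The Hann profile, the normalized coordinates in $[-1,1]$ and the function names (`hann_profile`, `separable_window`, `circular_window`) are my own illustrative choices, not a reference design; the `order` parameter plays the role of the norm $l$ above.

```python
import numpy as np

def hann_profile(r):
    """Hann-like 1D profile w(r) on a normalized radius r; zero for |r| >= 1."""
    r = np.abs(np.asarray(r, dtype=float))
    return np.where(r < 1.0, 0.5 * (1.0 + np.cos(np.pi * r)), 0.0)

def separable_window(n_rows, n_cols):
    """Tensor (outer-product) window built from two 1D profiles."""
    y = np.linspace(-1.0, 1.0, n_rows)  # centered, normalized coordinates
    x = np.linspace(-1.0, 1.0, n_cols)
    return np.outer(hann_profile(y), hann_profile(x))

def circular_window(n_rows, n_cols, order=2):
    """Non-separable window W(x) = w(l(x - x0)), with l an l_p norm of given order."""
    y = np.linspace(-1.0, 1.0, n_rows)[:, None]
    x = np.linspace(-1.0, 1.0, n_cols)[None, :]
    if np.isinf(order):
        radius = np.maximum(np.abs(y), np.abs(x))  # max norm: square level sets
    else:
        radius = (np.abs(y) ** order + np.abs(x) ** order) ** (1.0 / order)
    return hann_profile(radius)

sep = separable_window(5, 5)
circ = circular_window(5, 5)           # Euclidean norm: circular level sets
print(np.round(sep, 3))
print(np.round(circ, 3))
```

Both windows agree along the two central axes and differ elsewhere, most visibly towards the corners. The discretization side-effects mentioned above show up here too: on such a coarse grid, the "circular" window is only roughly circular.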
Extension:
Data extension is common practice in image processing, for different reasons. For instance, JPEG relies on a block discrete cosine transform (DCT), and uses extension (padding) to process images whose width or height is not divisible by 8. Additionally, the type-II DCT implicitly corresponds to a symmetric (even) extension of the data, which is useful in practice.
A mere zero-padding can be applied, but the risk of strong artifacts at the borders is very high. Useful extensions can depend strongly on the application and on the morphology of the images. For instance, many sound/vibration signals are zero-mean, and can easily be zero-extended with a little tapering. Meanwhile, standard 8-bit images have pixel values in $[0,255]$, and hence are not zero-mean. So constant (zero-order) or linear extensions are sometimes used at borders, and there exists a literature on windowing for adaptive (causal) image filtering (Mersereau, R. M. and Dudgeon, D. E., 1975, Two-dimensional digital filtering, or McClellan, J. H., 1982, Multidimensional spectral estimation).
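As a quick illustration of these extension types, NumPy's `np.pad` covers most of them. This is only a 1D sketch on a toy ramp; the variable names are mine, and the mapping between `np.pad` modes and the half-sample/whole-sample vocabulary is my reading of its documentation.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
n = 2  # number of samples added on each side

zero_ext = np.pad(x, n, mode="constant", constant_values=0)
# [0 0 1 2 3 4 0 0]   plain zero-padding
const_ext = np.pad(x, n, mode="edge")
# [1 1 1 2 3 4 4 4]   constant (zero-order) extension
half_sample_sym = np.pad(x, n, mode="symmetric")
# [2 1 1 2 3 4 4 3]   mirror about the edge, border sample repeated
whole_sample_sym = np.pad(x, n, mode="reflect")
# [3 2 1 2 3 4 3 2]   mirror about the border sample itself
whole_sample_anti = np.pad(x, n, mode="reflect", reflect_type="odd")
# [-1 0 1 2 3 4 5 6]  odd reflection around the border value
half_sample_anti = np.pad(x, n, mode="symmetric", reflect_type="odd")
# [0 1 1 2 3 4 4 5]

for name, ext in [("zero", zero_ext), ("constant", const_ext),
                  ("half-sample symmetric", half_sample_sym),
                  ("whole-sample symmetric", whole_sample_sym),
                  ("whole-sample antisymmetric", whole_sample_anti),
                  ("half-sample antisymmetric", half_sample_anti)]:
    print(f"{name:>27}: {ext}")
```

On this ramp, the whole-sample antisymmetric extension continues the linear trend, whereas zero-padding creates a jump at the right border. For images, the same calls work with a per-axis pad width.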
Both operations (windowing and extension) are naturally combined in the design of multirate or multiscale filter banks, where parallel banks of windowed pass-band filters are designed together to allow both overlap between pixel blocks (to avoid sharp discontinuities) and perfect reconstruction (exact inversion). The Lapped Orthogonal Transform (LOT) is typical, with 50% overlap on each side. In the context of paraunitary filter banks, many works have derived symmetric or antisymmetric image extensions, to benefit from the inherent symmetries in the filters. The typology is often four-fold: half-sample or whole-sample symmetry, combined with symmetry or antisymmetry. These extensions are sought to preserve "image" continuity or differentiability across blocks.
But let's get practical. If you have enough memory, my experience is that a safe first choice is to perform a 4-fold image extension (symmetric or antisymmetric depending on the data) combined with windowing: a 50% extension on each edge, and a separable design built from a 1D power-raised cosine window.
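A minimal sketch of this recipe could look as follows. Several details are assumptions on my side: the names `tapered_window` and `extend_and_window` are hypothetical, the window is kept flat over the original pixels and tapers only inside the extended margins (one possible reading, a Tukey-like shape), and the `power` exponent raises the cosine ramp.

```python
import numpy as np

def tapered_window(n_core, n_pad, power=1.0):
    """1D window: power-raised cosine ramps in the margins, flat over the core.

    Assumption: only the extended margins are tapered, the core stays at 1.
    """
    t = (np.arange(n_pad) + 0.5) / n_pad            # ramp abscissa in (0, 1)
    ramp = (0.5 * (1.0 - np.cos(np.pi * t))) ** power
    return np.concatenate([ramp, np.ones(n_core), ramp[::-1]])

def extend_and_window(image, power=1.0, antisymmetric=False):
    """Extend a 2D image by 50% on each edge, then apply a separable window."""
    image = np.asarray(image, dtype=float)
    n_rows, n_cols = image.shape
    pad_r, pad_c = n_rows // 2, n_cols // 2         # 50% of each dimension
    extended = np.pad(
        image, ((pad_r, pad_r), (pad_c, pad_c)),
        mode="symmetric",                           # half-sample mirror
        reflect_type="odd" if antisymmetric else "even",
    )
    window = np.outer(tapered_window(n_rows, pad_r, power),
                      tapered_window(n_cols, pad_c, power))
    return extended * window, (pad_r, pad_c)

img = 100.0 + np.arange(48.0).reshape(6, 8)         # toy non-zero-mean "image"
out, (pad_r, pad_c) = extend_and_window(img, power=2.0)
print(img.shape, "->", out.shape)                   # four times as many pixels
core = out[pad_r:pad_r + img.shape[0], pad_c:pad_c + img.shape[1]]
print(np.allclose(core, img))                       # original pixels untouched
```

After processing, cropping the central block recovers an output of the original size. If you prefer a window that tapers the whole extended support instead, only `tapered_window` needs to change.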