I am currently studying the FREAK descriptor and I read the article published by its designers. It states that the aim was to mimic the retinal topology, and that one of the advantages gained is that retinal receptive fields overlap, which improves performance.
I thought about it a lot, and the only explanation I could come up with is the following, from an implementation point of view: a receptive field is essentially an image patch centred on a sample point, smoothed with a Gaussian filter, and the size of the receptive field corresponds to the standard deviation of that Gaussian. The larger the size, the more pixels contribute to the filtered value, so we "mix" more information into a single value. With overlapping fields, neighbouring samples share some of those pixels.
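To make my mental model concrete, here is a rough sketch of how I picture it (this is purely hypothetical on my part, not the actual FREAK implementation; I use scipy's `gaussian_filter` as a stand-in for whatever smoothing is really used, and treat a field's support as a disc of radius ~3*sigma):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
image = rng.random((64, 64))

def receptive_field_value(img, center, sigma):
    """Model one receptive field: Gaussian-smooth the image,
    then read off the value at the field's centre pixel."""
    smoothed = gaussian_filter(img, sigma=sigma)
    return smoothed[center]

# Two neighbouring fields whose supports (~3*sigma radius) overlap.
c1, c2 = (32, 30), (32, 36)
sigma = 3.0
v1 = receptive_field_value(image, c1, sigma)
v2 = receptive_field_value(image, c2, sigma)

# Count the pixels that fall inside BOTH fields' support discs --
# these shared pixels are what "overlap" means in this picture.
yy, xx = np.mgrid[0:64, 0:64]
in1 = (yy - c1[0]) ** 2 + (xx - c1[1]) ** 2 <= (3 * sigma) ** 2
in2 = (yy - c2[0]) ** 2 + (xx - c2[1]) ** 2 <= (3 * sigma) ** 2
shared = int(np.logical_and(in1, in2).sum())
print(f"field values: {v1:.3f}, {v2:.3f}; shared pixels: {shared}")
```

In this picture, a binary comparison between `v1` and `v2` is partly influenced by the same underlying pixels, since the two Gaussians draw on a common region.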
But this guess of mine is very amateurish, so I would appreciate it if someone could give an explanation grounded in image processing, computer vision, or neuroscience.


