Edit01 (here and scattered throughout)
[Not meaning to be antagonistic… but admittedly somewhat defensive.]
Insofar as the following cites an existing work, it is the Martin and Foley item below, pp165-185 (with a noted exception). Note again that I read this years ago and did not consult it (nor anything else) when writing the original answer, and (again) that I consider that, although this can be studied as a science, most of the root knowledge is innate (of course) and implicit, and could be made explicit by any adult who has not already done so (as in the following example). (It is of course theoretically possible that I have read other sources on the same material. As far as I am concerned, most modern adults would consider almost everything here neither insightful nor controversial, with the exception of brightness (for which I have no source), height cues (which I did not mention) and the kinetic depth effect (which I did not mention and which is neither obviously nor certainly relevant to the question).)
Atmospheric colouring is an exception in that one can use this to work out that air is blue (as opposed to needing to know that air is blue to make sense of an apparent phenomenon).
Brightness is an exception for which I should indeed have supplied a reference… except that, as noted, I cannot find it in the noted textbook, and thus have no idea where I heard it (noting that I do have a memory of the excitement of learning this (or possibly working it out)).
Other citation anomalies and discrepancies are noted inline.
End — Main Body of Edit01.
Original, with references in “[]” added.
Going from memory, since no one else has covered this…
This is more a psychology question (as below).
3D vision utilises about seven different mechanisms. Binocular vision accounts for only two of these.
• Occlusion — each eye sees slightly different areas of a partly occluded object. [“Occlusion” also suggests the monocular topic “interposition”, below. As a binocular phenomenon, Martin and Foley (p183) label this “binocular disparity”, which [from a quick look] is actually about the image of the same object falling on “different” (I would say “non-corresponding”) areas of the retinas, as a function of the object being closer than, at, or further than the focal distance. [From a quick look] they do not seem to cover the fact that one eye will actually see more of a partly occluded object, which means I have no idea where I heard it (and I am thinking that I might well have worked it out myself).]
• Focus — you have to focus both of your eyes on an object to see it not-fuzzy; your system knows about (relative) eye positions and subject distance. This includes both focusing the lens in the eye [“accommodation” [p167] is lens shape] and pointing both eyes at the object [“convergence” [p182] is the eyes turning towards each other].
Binocular occlusion information only works out to about (from memory) 2m [“10 feet”, which is about 3m, Martin and Foley p183]. This range would be increased by having the eyes further apart. (Focus works out to a much greater distance; I am not sure which of the above two aspects is more useful, but I am guessing it is lens shape (and that they are of roughly similar usefulness).) [Martin and Foley mention “10 feet” (p167), citing Hochberg (1971) for accommodation (lens shape)… but appear to have chosen this figure more or less arbitrarily. Their point is that it yields “rather weak” distance information.]
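To put rough numbers on why binocular information fades with distance, and why eyes set further apart would help, here is a minimal sketch in Python. It assumes simple symmetric geometry (an object straight ahead) and an eye separation of 0.063m, which is a typical adult figure that I am assuming rather than taking from Martin and Foley; the function names are made up for the illustration. The convergence angle is the angle between the two lines of sight, and the difference in that angle between two distances is a rough measure of the binocular signal available to tell them apart.

```python
import math

# Assumed typical adult eye separation in metres (not from the cited text).
EYE_SEPARATION_M = 0.063


def convergence_angle_deg(distance_m, eye_separation_m=EYE_SEPARATION_M):
    """Angle between the two lines of sight when both eyes point at an
    object straight ahead at distance_m."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))


def disparity_deg(near_m, far_m, eye_separation_m=EYE_SEPARATION_M):
    """Difference in convergence angle between a nearer and a further
    object: a rough measure of the binocular signal separating them."""
    return (convergence_angle_deg(near_m, eye_separation_m)
            - convergence_angle_deg(far_m, eye_separation_m))


if __name__ == "__main__":
    # The same 10% step in distance, taken at increasing distances.
    for d in (0.5, 1.0, 3.0, 10.0, 30.0):
        normal = disparity_deg(d, d * 1.1)
        doubled = disparity_deg(d, d * 1.1, 2 * EYE_SEPARATION_M)
        print(f"{d:5.1f}m: {normal:.4f} deg, "
              f"eyes twice as far apart: {doubled:.4f} deg")
```

Under these assumptions, the signal for the same proportional step in distance shrinks roughly in proportion to the distance, and doubling the eye separation roughly doubles it. Where exactly the signal stops being useful depends on how small an angle the visual system can resolve, which this sketch does not model.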
Other (monocular) 3D vision cues include the following.
• Object size — many objects have a standard size [“familiar size” p167], and almost all objects have different proportions (such as leg thickness) according to how big and heavy they are [which Martin and Foley do not cover; the source is me]… and an object that is further away will have a smaller image size. [pp167-169 “size cues”, “relative size”.]
• Brightness — any given object will reflect less light into the eye from further away (because the eye is a smaller target further away). (This is more useful and important than one realises.) [From a quick check, this one does not appear to be in the noted text. I do not think (particularly) that I worked this out for myself, but if it is not in that text, then I have no idea where I heard it.]
• Perspective — many types of object (e.g. road, path, wall, river) have a stable width, or similar or analogous, and this will have a progressively smaller image size with greater distance. [“linear perspective” p170.]
• Texture — many objects have a known texture (or (assumptively) a regular texture), and the image size of the detail of the texture will decrease with distance. This works for groups of animals, and leaves, and the like, as well. [pp169-170.]
• Air tinting — the further away an object is, the more it is tinted blue by the interposing nitrogen. [“atmospheric perspective” p170. Martin and Foley also say that more distant objects appear “blurry”. I would say that that is largely because we have less visual acuity for an object that is further away (because the image size is smaller); Martin and Foley put it down to interference by air particles.]
• The physics of movement — speed and acceleration can be informative. [Martin and Foley treat this area on pp174-176. I was thinking of… what I said. M&F mention “motion cues” (the class), “motion parallax” (which involves the subject moving laterally, and multiple stationary objects), “motion perspective” (which involves the subject moving towards or away, in any environment) and “kinetic depth effect” (which is about apparent 3D-ness in rotating objects, and is ostensibly not about distance).]
[Other monocular items that Martin and Foley mention…
• Shading [p170] — objects can cast shadows on other objects. I would say that this is primarily not about distance perception.
• Interposition [p167] — the nearer object obscures part of the further object (or not).
• Height cues [pp173-174] — my take on this is that, on flat ground (and because the viewer is above the ground), an object that is further away will have its base closer to the horizon and thus visually higher… and, for objects in the air, I would say here only that it is more complicated.]
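Several of the monocular cues above (object size, perspective, texture, brightness and height) come down to simple geometry, which can be sketched in Python as follows. This is my own illustration, not something from Martin and Foley. The assumptions are that the object is seen face-on, that it is small enough for the total light reaching the eye to follow the inverse-square law (ignoring absorption by the air), that the ground is flat, and that the eye is 1.6m above it (an arbitrary figure). The function names are made up for the example.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Angle subtended at the eye by an object of size_m seen face-on."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def relative_light(distance_m, reference_m=1.0):
    """Total light reaching the eye from a small object, relative to the
    same object at reference_m (inverse-square law; air ignored)."""
    return (reference_m / distance_m) ** 2


def angle_below_horizon_deg(distance_m, eye_height_m=1.6):
    """How far below the horizon the base of an object on flat ground
    appears, for an eye at eye_height_m (an assumed height)."""
    return math.degrees(math.atan(eye_height_m / distance_m))


if __name__ == "__main__":
    # A 1.8m-tall object at successively doubled distances.
    for d in (2.0, 4.0, 8.0, 16.0):
        print(f"{d:4.1f}m: image size {visual_angle_deg(1.8, d):6.2f} deg, "
              f"light {relative_light(d, 2.0):.4f} of that at 2m, "
              f"base {angle_below_horizon_deg(d):5.2f} deg below horizon")
```

With each doubling of distance the image size roughly halves (once the object is not very close), the light collected drops to a quarter, and the base of the object moves up towards the horizon.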
Focus is very valuable for (for instance) catching a ball, whereas a task like driving [with one eye] has much more 3D information available. [Source: I worked this out using thinking. …So it’s me.]
As for having more than 2 eyes… having eyes spaced apart vertically (as well as having the existing horizontal spacing) would give additional occlusion information, but the gain would be minimal. [Source: I worked this out using thinking. …So it’s me.]
Source
The source is a custom book — that is, “Custom Book”, which is, “a compilation of chapters from… [existing] Pearson Education Australia titles.” The publisher logo “Prentice Hall” appears, but only as a logo, and not on the cover.
The Custom Book is “PSYC236 Cognition and Perception”, ©2001, Pearson Education Australia Pty Ltd. It is “Sourced from: Sensation and Perception 4th Edition”, ©2001, Martin and Foley.
The cited material is from a chapter numbered 6, which appears to be its number in the original/source book; ditto for the page numbers.