Correctly judging the poses, sizes, and shapes of objects in a scene is a functionally important component of scene understanding for biological and machine visual systems. A three-dimensional (3D) object seen from different views forms quite different retinal images, and in general many different 3D objects could form identical two-dimensional (2D) retinal images, so judgments based on retinal information alone are underspecified. However, in the very frequent case of objects on the ground, the projection to the retinal image is a 2D-to-2D mapping described by an invertible trigonometric function.
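
To make the invertibility claim concrete, the following is a minimal sketch of one possible ground-to-retina mapping, not the formulation used in this work. It assumes an idealized viewer whose eye sits at a known height above a flat ground plane and whose gaze is horizontal, and it describes the retinal position of a ground point by two visual angles. The function names, the parameter `eye_height`, and the choice of azimuth and declination as retinal coordinates are all illustrative assumptions.

```python
import math


def ground_to_visual_angles(x, z, eye_height):
    """Map a ground point to two visual angles (hypothetical parameterization).

    x is the lateral offset and z the forward distance of the point on the
    ground plane, both measured from the point directly below the eye.
    Returns (azimuth, declination) in radians, where declination is the
    angle of the line of sight below the horizon.
    """
    ground_distance = math.hypot(x, z)
    azimuth = math.atan2(x, z)
    declination = math.atan2(eye_height, ground_distance)
    return azimuth, declination


def visual_angles_to_ground(azimuth, declination, eye_height):
    """Invert the mapping: recover the ground point from the two angles.

    Only lines of sight that point below the horizon intersect the ground,
    so declination must lie in (0, pi/2].
    """
    if not 0.0 < declination <= math.pi / 2:
        raise ValueError("declination must be in (0, pi/2] to hit the ground")
    ground_distance = eye_height / math.tan(declination)
    x = ground_distance * math.sin(azimuth)
    z = ground_distance * math.cos(azimuth)
    return x, z


if __name__ == "__main__":
    # Round trip for an arbitrary ground point seen from an assumed 1.6 m eye height.
    angles = ground_to_visual_angles(1.5, 4.0, eye_height=1.6)
    print(visual_angles_to_ground(*angles, eye_height=1.6))
```

Under these assumptions each ground point corresponds to exactly one pair of angles and vice versa, which is the sense in which the mapping is 2D-to-2D and invertible. The inverse depends on knowing the eye height, and it becomes increasingly sensitive to small angular errors as the declination approaches zero, that is, for points near the horizon.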