My research is concerned with the visual processes involved in navigation. Real-time vision systems, artificial or natural, interpret the imagery they obtain with their sensors by developing space-time representations relating the images to the three-dimensional (3D) world. These representations are of various kinds; they are related to the task the system is engaged in and they span a wide spectrum, ranging from computationally simple to very complex ones. My work is devoted to the classification and study of these representations, with emphasis on their geometric aspects, for the purpose of both designing autonomous vision systems and explaining animal visual capabilities.
The unifying theme of my efforts originates in the philosophy advocated by Gibson. Since perception is direct, i.e., it happens immediately, properties of the scene in view should be directly encoded in certain structures or patterns of different image measurements. I am applying computational principles in search of these structures and the corresponding representations. In particular, the technical problems I have investigated relate to basic processes in the perception of three-dimensional motion and shape, and of the relationship between them.
Geometric studies of motion fields obtained from image sequences revealed that 3D motion is encoded in the form of global patterns in the image. By measuring qualitative properties of image motion along various appropriately grouped directions, the classical structure from motion problem is turned into a pattern recognition problem. The patterns found have been used in the implementation of algorithms for motion-related navigational tasks in robotic systems, and they have also been used in the creation and explanation of a class of illusions in human vision.
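As a rough illustration of how a qualitative pattern can stand in for a numerical motion estimate, the following sketch treats only the simplest case: a pure forward translation, for which the image motion expands away from a single image point (the focus of expansion). It is a simplified textbook case and not the algorithms referred to above. The function names, the unit focal length, the sampling grid and the synthetic depths are all assumptions made for the example. A candidate point is accepted only if the sign of the image motion, projected on the direction away from that candidate, is non-negative everywhere.

```python
def translational_flow(x, y, t, depth):
    """Image motion at (x, y) for a pure translation t = (U, V, W).

    Assumes a pinhole camera with focal length 1 and W > 0 (forward motion).
    """
    U, V, W = t
    return ((x * W - U) / depth, (y * W - V) / depth)


def fits_expansion_pattern(points, flows, candidate):
    """True if no flow vector points toward the candidate image point.

    Only the sign of the projection is used, never its magnitude.
    """
    cx, cy = candidate
    return all((x - cx) * u + (y - cy) * v >= 0
               for (x, y), (u, v) in zip(points, flows))


def find_expansion_points(points, flows, candidates):
    """Return the candidates whose sign pattern is consistent with the flow."""
    return [c for c in candidates if fits_expansion_pattern(points, flows, c)]


# Synthetic scene (hypothetical values): a grid of image points with varying depth.
t_true = (0.5, -0.5, 1.0)  # focus of expansion at (U/W, V/W) = (0.5, -0.5)
points = [(i / 20, j / 20) for i in range(-20, 21) for j in range(-20, 21)]
depths = [2 + ((i * 7 + j * 3) % 5) for i in range(-20, 21) for j in range(-20, 21)]
flows = [translational_flow(x, y, t_true, z) for (x, y), z in zip(points, depths)]

# Candidate locations on a coarse grid.
candidates = [(a / 2, b / 2) for a in range(-2, 3) for b in range(-2, 3)]

print(find_expansion_points(points, flows, candidates))
```

The depths never enter the test, which is the point of the exercise: the sign pattern depends on the motion alone, so the candidate can be recognized without recovering the shape of the scene.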
With regard to the estimation of visual shape and environmental layout, psychophysical experiments as well as computational considerations convincingly show that actual systems cannot estimate exact distances and shapes, but instead derive a distorted version of space. One of the reasons for this distortion is the difficulty of estimating the exact viewing geometry. In the case of 3D motion or stereo, if the three-dimensional viewing geometry is estimated incorrectly, a distorted version of space will in turn be computed. A study of the resulting transformation between perceptual (computed) space and actual space, which amounts to a Cremona transformation, revealed a number of properties regarding the relationship between 3D motion and shape. This transformation has been shown to explain a large amount of data from psychophysical experiments on the perception of depth. Furthermore, since the visible surfaces have positive depth, an algorithm-independent error analysis for the structure from motion problem has been conducted by analyzing the geometry of the regions where space is distorted negatively and studying the conditions under which these regions become minimal. Besides illuminating the instability of the problem, this analysis also compares the performance of spherical and planar eyes with regard to the estimation of shape and motion.
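The way an error in the viewing geometry turns into a distortion of depth can be sketched for a deliberately restricted case. The example below assumes a pure translation with no rotation, a pinhole camera with focal length 1, and depth recovered from the component of image motion along a chosen direction n. Under these assumptions the computed depth equals the actual depth multiplied by a factor that depends only on the image position and on the true and estimated translations. This special case is meant only as an illustration and is not the general transformation discussed above; all names and numerical values are hypothetical.

```python
def distortion_factor(x, y, t_true, t_est, n):
    """Ratio of computed depth to actual depth at image point (x, y).

    The flow component along n is ((x*W - U)*nx + (y*W - V)*ny) / Z.
    Dividing the same expression, built from the estimated translation,
    by that measurement gives Z times the ratio returned here.
    Returns None where the true flow component vanishes.
    """
    U, V, W = t_true
    Ue, Ve, We = t_est
    nx, ny = n
    actual = (x * W - U) * nx + (y * W - V) * ny
    assumed = (x * We - Ue) * nx + (y * We - Ve) * ny
    if actual == 0:
        return None
    return assumed / actual


def negative_depth_fraction(points, t_true, t_est, n):
    """Fraction of image points whose computed depth would be negative."""
    factors = [distortion_factor(x, y, t_true, t_est, n) for x, y in points]
    valid = [d for d in factors if d is not None]
    return sum(1 for d in valid if d < 0) / len(valid)


# Hypothetical setup: forward motion, horizontal measurement direction.
grid = [(i / 20, j / 20) for i in range(-20, 21) for j in range(-20, 21)]
t_true = (0.0, 0.0, 1.0)
n = (1.0, 0.0)

for error in (0.0, 0.2, 0.4):
    t_est = (error, 0.0, 1.0)  # horizontal error in the estimated translation
    print(error, negative_depth_fraction(grid, t_true, t_est, n))
```

In this setup the points with negative computed depth lie in the strip between the true and the estimated focus of expansion, so the strip widens as the error grows. Since visible surfaces must have positive depth, the size of such a region gives a way to judge a motion estimate without committing to a particular estimation algorithm.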
Since metric shape cannot be computed in practice, vision systems have to compute a number of alternative space and shape representations for a hierarchy of visual tasks, ranging from obstacle avoidance through homing to object recognition. My current research is concerned with understanding the large spectrum of these representations. Furthermore, I am applying the theoretical findings of my work to specific problems in navigation and in multimedia systems, in particular to the problem of video indexing in large databases.
Work on Texture