This work targets the recognition of articulated objects in single range images using techniques derived from invariance principles. Articulated objects are those whose components are attached via joints and can move with respect to one another. The input consists of a single range image obtained from an unknown viewpoint of an unknown object in an unknown articulated position. The desired output is the identity of the object together with the articulation and viewpoint parameters. In order to make this problem tractable, this identification is done against a set of models, and recognition is possible only if the model of the object is in this set.

There is a significant problem when we attempt to use invariance techniques to recognize objects with articulated components. When considering invariants, the basic question is 'invariant to what?' In other words, what is the transformation under which we want to find invariant quantities? For instance, when an object can be seen from different viewpoints, we want viewpoint invariants. In range images, viewpoint invariants are in fact (scaled) Euclidean invariants. This is a well-defined group of transformations and it applies to any object we look at, i.e. the transformation group is independent of the object. Thus viewpoint invariance can be applied generically to all objects.
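As a small illustration of what a (scaled) Euclidean invariant looks like, the sketch below checks that the sorted distances between point pairs are unchanged by a rotation and translation, and that their ratios are also unchanged by a uniform scaling. The point set, the helper names `pairwise_distances` and `random_rigid_motion`, and the choice of pairwise distances as the invariant are assumptions made for this example; they are not the invariants used in this work.

```python
import numpy as np

def pairwise_distances(points):
    """Return the sorted distances between all distinct point pairs."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return np.sort(dists[upper])

def random_rigid_motion(rng):
    """Draw a random proper rotation matrix and a translation vector."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))       # fix the column signs of the QR factor
    if np.linalg.det(q) < 0:          # force det = +1 (no reflection)
        q[:, 0] = -q[:, 0]
    return q, rng.normal(size=3)

rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))     # hypothetical stand-in for range data
R, t = random_rigid_motion(rng)

moved = points @ R.T + t              # same object, different viewpoint
scaled = 2.5 * moved                  # same object, different scale

d_before = pairwise_distances(points)
d_moved = pairwise_distances(moved)
d_scaled = pairwise_distances(scaled)

# Distances are Euclidean invariants; distance ratios survive scaling too.
euclidean_ok = np.allclose(d_before, d_moved)
ratios_before = d_before / d_before[-1]
ratios_scaled = d_scaled / d_scaled[-1]
scaled_ok = np.allclose(ratios_before, ratios_scaled)
print(euclidean_ok, scaled_ok)
```

The same check can be applied to any object, which is the sense in which viewpoint invariance is generic.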

When it comes to articulation, this is no longer the case. Each object has different articulation degrees of freedom (DOFs), i.e. a different transformation group. Just by looking at an object we cannot identify its DOFs, and therefore we do not know its transformation group. Consequently, while it is mathematically possible to find articulation invariants for each individual object, we cannot find generic articulation invariants that apply to all objects.

Since we cannot use real articulation invariants, our goal is to turn as many of the articulation DOFs as we can into generic viewpoint DOFs. The remaining DOFs, whose number will be small, are dealt with using suitable compression techniques.

The way to achieve this is to divide the object into smaller parts. Since we want to avoid explicit segmentation, these smaller parts are not necessarily the real object parts. They are arbitrary parts, such as any part of the object that is included within a sphere of a certain center and radius. Take an arm with two segments connected by a joint as an example. A sphere that contains only a rigid part of a segment has viewpoint DOFs but no articulation DOFs. A sphere that contains the joint has both viewpoint and articulation DOFs. However, this joint has only one articulation parameter, namely the angle between the two segments of the arm. All other DOFs of these arm segments have been turned into viewpoint DOFs. We view the joint within our sphere as a separate object that is seen from some unknown viewpoint.
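The sphere-based division can be sketched on a toy point set. The code below is only an illustration under simple assumptions: the object is a hypothetical two-segment arm sampled as 3D points (`make_joint`), and a sub-part is whatever falls inside a given sphere (`points_in_sphere`). Neither function name nor the sampling comes from the original work.

```python
import numpy as np

def points_in_sphere(points, center, radius):
    """Select the points that lie inside a sphere (no segmentation needed)."""
    mask = np.linalg.norm(points - center, axis=1) <= radius
    return points[mask]

def make_joint(angle, n=50, length=1.0):
    """Two straight segments meeting at the origin, `angle` radians apart."""
    s = np.linspace(0.0, length, n)[:, None]
    seg_a = s * np.array([1.0, 0.0, 0.0])
    seg_b = s * np.array([np.cos(angle), np.sin(angle), 0.0])
    return np.vstack([seg_a, seg_b])

arm_90 = make_joint(np.pi / 2)   # the same arm in two articulated positions
arm_60 = make_joint(np.pi / 3)

# A sphere placed on the first segment, away from the joint, sees a rigid
# piece: its content does not change with the joint angle.
rigid_center = np.array([0.7, 0.0, 0.0])
rigid_90 = points_in_sphere(arm_90, rigid_center, 0.2)
rigid_60 = points_in_sphere(arm_60, rigid_center, 0.2)

# A sphere centered on the joint sees both segments: its content depends
# on one articulation parameter, the angle between the segments.
joint_center = np.zeros(3)
joint_90 = points_in_sphere(arm_90, joint_center, 0.3)
joint_60 = points_in_sphere(arm_60, joint_center, 0.3)

print(np.array_equal(rigid_90, rigid_60))   # rigid part: identical
print(np.allclose(joint_90, joint_60))      # joint part: differs with angle
```

In this toy setting the sphere positions are fixed by hand; how sphere centers and radii are chosen on real range data is not specified here.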

We can now use viewpoint invariance methods for each sub-object, such as the joint, at each angle of the joint. We thus obtain invariants that depend only on the angle of the joint. Each such invariant is a smooth function of the angle, and it can be compressed using methods that take advantage of this smoothness, such as wavelets.
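To make the idea of compressing an angle-dependent invariant concrete, here is a minimal sketch under stated assumptions. The invariant (`joint_invariant`, the mean squared distance of the joint's points to their centroid) is a simple stand-in chosen for the example, and the compression uses a hand-written one-dimensional orthonormal Haar transform rather than the multi-dimensional wavelets of the actual work. The number of retained coefficients (`keep = 8`) is arbitrary.

```python
import numpy as np

def joint_invariant(angle, radius=0.3, n=40):
    """Mean squared distance to the centroid for a two-segment joint.

    The joint is modelled as two straight segments of length `radius`
    meeting at the sphere center; rigid motion leaves the value unchanged.
    """
    s = np.linspace(0.0, radius, n)[:, None]
    seg_a = s * np.array([1.0, 0.0, 0.0])
    seg_b = s * np.array([np.cos(angle), np.sin(angle), 0.0])
    pts = np.vstack([seg_a, seg_b])
    centered = pts - pts.mean(axis=0)
    return float(np.mean(np.sum(centered**2, axis=1)))

def haar_forward(signal):
    """Orthonormal Haar transform of a signal whose length is a power of 2."""
    coeffs = np.asarray(signal, dtype=float).copy()
    n = len(coeffs)
    while n > 1:
        half = n // 2
        evens, odds = coeffs[0:n:2].copy(), coeffs[1:n:2].copy()
        coeffs[:half] = (evens + odds) / np.sqrt(2.0)   # averages
        coeffs[half:n] = (evens - odds) / np.sqrt(2.0)  # details
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    signal = np.asarray(coeffs, dtype=float).copy()
    n = 1
    while n < len(signal):
        avg, diff = signal[:n].copy(), signal[n:2 * n].copy()
        signal[0:2 * n:2] = (avg + diff) / np.sqrt(2.0)
        signal[1:2 * n:2] = (avg - diff) / np.sqrt(2.0)
        n *= 2
    return signal

# Sample the invariant as a function of the single articulation parameter.
angles = np.linspace(0.1 * np.pi, np.pi, 64)
curve = np.array([joint_invariant(a) for a in angles])

# Keep only the largest-magnitude wavelet coefficients.
coeffs = haar_forward(curve)
keep = 8
threshold = np.sort(np.abs(coeffs))[-keep]
compressed = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

approx = haar_inverse(compressed)
rel_error = np.linalg.norm(approx - curve) / np.linalg.norm(curve)
print(f"kept {np.count_nonzero(compressed)} of {len(coeffs)} coefficients, "
      f"relative error {rel_error:.3f}")
```

Haar wavelets are the simplest choice and give a piecewise-constant approximation; a smoother wavelet family would likely represent a smooth curve with fewer coefficients.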

Another major advantage of the object division is that we can find so-called global invariants of each sub-object. These are invariants that depend on the whole sub-object rather than on isolated features such as points or lines. This achieves two purposes:

- we avoid the problem of feature extraction, with the high sensitivity associated with feature-based methods, and
- we avoid calculating global invariants of the whole object, which would be sensitive to occlusion and missing parts.

In summary, the main features of the approach are:

- This method does not require the segmentation of the object components or the detection of discrete features, such as plane parameters, from the range image. The sub-division of the object into spheres is done independently of the object so as to reduce the number of articulation parameters.
- In order to achieve some degree of tolerance to both noise and occlusion, we avoid both global and local invariants. Instead, we use scale-space invariants and calculate invariant characteristics of object sub-parts for different sphere radii. We thus obtain invariants for sub-objects as functions of their articulation parameters.
- These invariant functions are compressed using multi-dimensional discrete wavelets and used for storage, matching and identification. Algorithms for sub-range decompression and iterative cooperative improvements have been developed to achieve efficient matching.
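The idea of computing invariants of a sub-part at several sphere radii can be sketched as follows. This is an illustration under assumptions, not the scale-space invariants of the actual work: the invariant per sphere is taken to be the eigenvalues of the covariance of the enclosed points, the point cloud is random, and the names `sphere_invariants` and `multi_radius_descriptor` are hypothetical.

```python
import numpy as np

def sphere_invariants(points, center, radius):
    """Covariance eigenvalues (ascending) of the points inside one sphere.

    These depend on the whole sub-part rather than on isolated features,
    and they do not change under rotation or translation.
    """
    inside = points[np.linalg.norm(points - center, axis=1) <= radius]
    if len(inside) < 2:
        return np.zeros(3)           # too few points to describe a shape
    centered = inside - inside.mean(axis=0)
    cov = centered.T @ centered / len(inside)
    return np.linalg.eigvalsh(cov)

def multi_radius_descriptor(points, center, radii):
    """Stack the invariants obtained for several sphere radii."""
    return np.array([sphere_invariants(points, center, r) for r in radii])

rng = np.random.default_rng(1)
cloud = rng.uniform(-1.0, 1.0, size=(500, 3))   # hypothetical range points
center = cloud[0]                               # sphere centered on a data point
radii = [0.3, 0.5, 0.8]

# Apply a rigid motion (rotation about z plus translation) to the object
# and to the sphere center, mimicking a change of viewpoint.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.4, -1.2, 2.0])
cloud_moved = cloud @ R.T + t
center_moved = R @ center + t

desc = multi_radius_descriptor(cloud, center, radii)
desc_moved = multi_radius_descriptor(cloud_moved, center_moved, radii)
print(desc.shape, np.allclose(desc, desc_moved))
```

Small radii make the descriptor less sensitive to occlusion elsewhere on the object, while larger radii average over more points and so are less sensitive to noise; using several radii is one way to trade these off.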

An overview of the work is provided as on-line viewgraphs and as a compressed Postscript file.

An IEEE-PAMI paper is available here as a PDF file. A technical report is available as a Postscript file.