
Human Motion Capture Using Distributed Cameras

This webpage describes the NSF project on New Technology for Capture, Analysis and Visualisation of Human Movement Using Distributed Cameras, NSF ITR 0325715.

Introduction


This project is a collaboration with the Biomotion Laboratory at Stanford University and the Media Research Laboratory at New York University. The objective is to perform markerless motion capture of human subjects using multiple calibrated cameras. We use articulated shape models built from super-quadrics to represent the human body. The markerless motion capture procedure consists of estimating the body model parameters for the subject, initialising the pose for a set of frames, and tracking the pose over the complete sequence. One application of the project is the analysis of gait in subjects recovering from injuries.

People

Data acquisition

Data are acquired in the Keck Lab and with the Hydra portable motion capture facility at the University of Maryland. The Keck laboratory, which is now semi-retired, can capture from 32 cameras synchronously. Hydra is a portable system that uses ten colour cameras at 60 frames per second. Details of the capture facilities are available on their respective webpages. Several 3-D laser scans and some markerless motion capture systems were obtained from the Biomotion Laboratory at Stanford University.

Articulated human body models

The articulated human body model is constructed from super-quadric body segments that are attached to each other at joints, as shown in the figure below. The segments are connected in a kinematic chain that allows three rotational degrees of freedom at every joint. The shoulder is the one exception: because it is a complex joint, we additionally allow three (limited) translational degrees of freedom there to model it better. The different body joints are labelled in the figure.
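The two ingredients described above can be illustrated with a minimal sketch. It assumes one common super-quadric form (the superellipsoid inside-outside function) and an Rz·Ry·Rx Euler-angle convention; the project's actual parameterisation is not given on this page, and the names `superquadric_value`, `rotation_xyz`, `forward_kinematics`, `bone` and `translation` are hypothetical.

```python
import math

def superquadric_value(point, scales, e1=1.0, e2=1.0):
    """Superellipsoid inside-outside function in the segment's own frame.

    Returns < 1 inside, 1 on the surface, > 1 outside (assumed form).
    """
    x, y, z = point
    a1, a2, a3 = scales
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)

def rotation_xyz(rx, ry, rz):
    """3x3 rotation matrix Rz * Ry * Rx, angles in radians (assumed convention)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def forward_kinematics(root, links):
    """Return the root followed by the distal end of each segment in the chain.

    Each link is a dict with 'angles' (rx, ry, rz), 'bone' (vector from the
    joint to the segment end, in the segment frame) and an optional
    'translation' (joint shift in the parent frame, e.g. for the shoulder).
    """
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    joint = [float(c) for c in root]
    positions = [joint]
    for link in links:
        # Optional joint translation, expressed in the parent segment's frame.
        shift = mat_vec(R, link.get("translation", (0.0, 0.0, 0.0)))
        joint = [p + s for p, s in zip(joint, shift)]
        # Accumulate the joint rotation along the chain.
        R = mat_mul(R, rotation_xyz(*link["angles"]))
        bone = mat_vec(R, link["bone"])
        joint = [p + b for p, b in zip(joint, bone)]
        positions.append(joint)
    return positions
```

With `e1 = e2 = 1` the function reduces to an ordinary ellipsoid; other exponents give boxier or more pinched segments. A link without a `translation` entry behaves as a pure three-rotation joint.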

Figure: Obtaining the 3D super-quadric model from a scanned 3D model. (a) Voxel data. (b) 3D super-quadric model with labels.

This part is currently being updated.

Acquisition and initialisation of models

Segmentation in Eigenspace. The first part of the problem is model-driven segmentation in Laplacian eigenspace. The input is a set of voxels, and the neighbourhood relationship between voxels is used to compute the Laplacian of their adjacency graph. The nodes are then mapped to a 6-D Laplacian eigenspace using the eigenvectors corresponding to the smallest non-zero eigenvalues of the Laplacian matrix. We show that this transformation maps segments whose lengths are greater than their thicknesses to 1-D curves in eigenspace. We can then fit splines to these 1-D curves and segment them at their joints. The two images on the left show the 6-D eigenspace, in which one segment has been extracted by fitting a spline.
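The mapping to eigenspace can be sketched as follows. This is an illustration only, assuming NumPy, integer voxel coordinates, 6-connectivity between voxels and an unnormalised graph Laplacian; the page does not state which neighbourhood or normalisation the project uses, and `laplacian_eigenspace` is a hypothetical name.

```python
import numpy as np

def laplacian_eigenspace(voxels, dims=6):
    """Map occupied voxels to a `dims`-D Laplacian eigenspace.

    voxels: iterable of integer (x, y, z) coordinates.
    Returns (embedding, eigenvalues): one row per voxel, one column per
    eigenvector of the smallest non-zero eigenvalues.
    """
    voxels = [tuple(int(c) for c in v) for v in voxels]
    index = {v: i for i, v in enumerate(voxels)}
    n = len(voxels)

    # Adjacency matrix from the voxel neighbourhood (6-connectivity assumed).
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    A = np.zeros((n, n))
    for (x, y, z), i in index.items():
        for dx, dy, dz in offsets:
            j = index.get((x + dx, y + dy, z + dz))
            if j is not None:
                A[i, j] = 1.0

    # Unnormalised graph Laplacian L = D - A.
    L = np.diag(A.sum(axis=1)) - A

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(L)

    # Drop the (numerically) zero eigenvalues, one per connected component.
    keep = eigvals > 1e-9
    return eigvecs[:, keep][:, :dims], eigvals[keep][:dims]
```

For a thin, straight run of voxels the first eigenspace coordinate varies monotonically along its length, which is the 1-D curve behaviour described above. A dense adjacency matrix is used for brevity; a real voxel set would need a sparse representation.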

Model acquisition. The second part of the project is to acquire a set of key frames in which the voxels have been segmented and registered using a prior model and a probabilistic registration method. This set of key frames can be used to estimate the human body model in two steps: first estimate a skeleton-based human body model and joint locations using human body statistics and the computed skeleton, then fit a super-quadric model using the segmented voxels. The images (from left to right) show the unsegmented voxels, the segmented voxels, the computed skeleton curve, and the estimated super-quadric skeleton model. Five frames were used to estimate the model.

Tracking using multiple cues

Tracking of a complex articulated object such as a human being is a difficult task and we need to use human models to obtain robust and accurate results.

In our approach we use multiple cameras and a human shape model to track the human motion. We use both the motion and structural cues that we can obtain from the synchronized video in a predictor-corrector framework (Iterated Extended Kalman Filter). Motion cues, though reliable and robust, are not sufficient by themselves because the estimation error tends to accumulate over time and we eventually lose track. Structural information such as edges and rough silhouettes does not suffer from that problem, but it is difficult to estimate and to use in the estimation. It can be used efficiently if the initial estimate of the pose is close to the correct value. In our method we predict the change in pose using the motion cues and correct the pose using static cues such as edges and motion segmentation.
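The predictor-corrector idea can be illustrated with a toy scalar Iterated Extended Kalman Filter step. This is a sketch under simplifying assumptions, not the project's tracker: the state is a single joint angle, the motion cue is reduced to a predicted change `delta`, the static cue is a scalar measurement `z = h(x)`, and all names (`iekf_step`, `delta`, `h`, `h_prime`) are hypothetical.

```python
import math

def iekf_step(x, P, delta, Q, z, R, h, h_prime, iterations=5):
    """One predict/correct cycle of a scalar Iterated Extended Kalman Filter.

    x, P     : previous state estimate and its variance
    delta, Q : pose change predicted from motion cues and its variance
    z, R     : measurement derived from static cues and its variance
    h        : measurement function, h_prime its derivative
    iterations must be >= 1.
    """
    # Predict: apply the change in pose suggested by the motion cues.
    x_pred = x + delta
    P_pred = P + Q

    # Correct: relinearise the measurement model around each new iterate.
    x_i = x_pred
    for _ in range(iterations):
        H = h_prime(x_i)
        K = P_pred * H / (H * P_pred * H + R)
        x_i = x_pred + K * (z - h(x_i) - H * (x_pred - x_i))

    # Variance update using the gain from the final iteration.
    P_new = (1.0 - K * H) * P_pred
    return x_i, P_new

# Example: the motion cue under-predicts the change in a joint angle,
# and the static cue pulls the estimate back toward the true value.
true_angle = 0.5
x_new, P_new = iekf_step(
    x=0.3, P=0.1, delta=0.1, Q=0.01,
    z=math.sin(true_angle), R=1e-4,
    h=math.sin, h_prime=math.cos,
)
```

The prediction alone would drift, since errors in `delta` accumulate from frame to frame. The correction works here because the predicted pose is already close to the true one, so the linearisation of `h` is a good approximation.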

Publications

  • A. Sundaresan and R. Chellappa, "Markerless Motion Capture using Multiple Cameras", Computer Vision for Interactive and Intelligent Environments (Eds. C. Jaynes and R. Collins), IEEE Press, 2006.
  • A. Sundaresan and R. Chellappa, "Model driven segmentation and registration of articulating humans in Laplacian Eigenspace", IEEE Transactions on Pattern Analysis and Machine Intelligence (submitted).
  • A. Sundaresan and R. Chellappa, "Segmentation and Probabilistic Registration of Articulated Body Models", International Conference on Pattern Recognition, Hong Kong, 2006. [Best Student Paper Award in Computer Vision and Image Analysis] [pdf]
  • A. Sundaresan and R. Chellappa, "Multi-camera Tracking of Articulated Human Motion Using Motion and Shape Cues", Asian Conference on Computer Vision, 2006. [pdf]
  • A. Sundaresan and R. Chellappa, "Acquisition of Articulated Human Body Models using Multiple Cameras", IV Conference on Articulated Motion and Deformable Objects, Andratx, Mallorca, Spain, 2006. [pdf]
  • A. Sundaresan, A. RoyChowdhury, and R. Chellappa, "3D Modelling of Human Motion using Kinematic Chains and Multiple Cameras for Tracking", Eighth International Symposium on the 3-D Analysis of Human Movement, Tampa, Mar-Apr. 2004. [pdf]
  • A. Sundaresan, A. RoyChowdhury, and R. Chellappa, "Multiple View Tracking of Human Motion Modelled by Kinematic Chains", submitted to International Conference on Image Processing, 2004. [pdf]

Last updated Mar 7, 2007.