3D Modeling

Fusion of Video and Digital Elevation Map (DEM) for Visualization

The aim of this project is to develop algorithms for fusing 3D data and video for applications such as remote surveillance, enhanced tele-presence and visualization. Given a terrain map of a location and ground video, the video can be registered to the terrain map to generate texture-mapped 3D models of the environment, which can then be virtually navigated. The 3D models are rendered using OpenGL. The software is written in Visual C++ and Matlab.

3D Model Refinement using Surface Parallax

In this project, we developed an algorithm that uses surface parallax to refine a coarse and partial 3D model obtained from a Digital Elevation Map (DEM). Existing parallax-based approaches fall under "plane+parallax" methods, which assume the presence of a dominant plane in the scene. In "plane+parallax", the homography induced by the plane is estimated and used to warp the images toward a reference image. The resulting flow, or parallax field, is an epipolar field; it is estimated and used to compute the depths of points relative to the plane.

The assumption of a dominant plane is not valid in several scenarios, such as outdoor urban environments. We have developed a technique that does not require this assumption. It uses the parallax field obtained by aligning a general non-planar 3D surface (obtained from the DEM or a range sensor) to calculate depths relative to the camera. For a planar surface, a homography exists which encapsulates the effects of camera rotation and calibration. No such simple relationship exists for a non-planar surface. This approach requires camera calibration. We also assume that a "small" planar surface is present in the scene for camera motion estimation. This planar surface can be identified by least-squares fitting of planes over N×N neighborhoods on the reference surface.
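The plane search described above can be sketched as follows. This is only a minimal illustration, assuming the reference surface is a regularly sampled height grid and that the neighborhoods are non-overlapping N×N blocks; the function names `plane_fit_residual` and `most_planar_block` are hypothetical and not taken from the project software.

```python
import numpy as np

def plane_fit_residual(Z, n):
    """Fit z = a*x + b*y + c to each non-overlapping n x n block of Z.

    Returns one RMS fitting residual per block (smaller means more planar).
    Assumption: Z is a regularly sampled height grid such as a DEM.
    """
    H, W = Z.shape
    ys, xs = np.mgrid[0:n, 0:n]
    # Design matrix with columns x, y, 1 for the local plane model.
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(n * n)])
    rows, cols = H // n, W // n
    rms = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            z = Z[r * n:(r + 1) * n, c * n:(c + 1) * n].ravel()
            coef, *_ = np.linalg.lstsq(A, z, rcond=None)
            rms[r, c] = np.sqrt(np.mean((A @ coef - z) ** 2))
    return rms

def most_planar_block(Z, n):
    """Return (row, col, rms) of the top-left corner of the best-fitting block."""
    rms = plane_fit_residual(Z, n)
    r, c = np.unravel_index(np.argmin(rms), rms.shape)
    return r * n, c * n, rms[r, c]
```

In practice one would probably also threshold the residual, since the best block is not necessarily planar enough to be useful for motion estimation.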

The algorithm uses two frames, called the key frame and the offset frame. Using the estimated camera motion and the given reference depths, the offset image is warped toward the key image. Because the parallax direction can be obtained from the focus of expansion (FOE) estimates, only the parallax magnitude needs to be estimated to recover correct depths. The parallax magnitude is estimated using a tensor-based approach (a total least squares (TLS) estimator).
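To illustrate why a known parallax direction leaves only one unknown per pixel, here is a small sketch. It assumes the residual parallax at each pixel lies along the line joining that pixel to the FOE, and it uses the standard brightness constancy constraint Ix*u + Iy*v + It = 0. The function names and the (x, y) convention for `foe` are assumptions made for this example.

```python
import numpy as np

def parallax_directions(shape, foe):
    """Unit direction field pointing away from the FOE at every pixel.

    shape: (H, W) of the image; foe: (x, y) position of the FOE in pixels.
    At the FOE itself the direction is undefined and is returned as (0, 0).
    """
    H, W = shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    dx, dy = xs - foe[0], ys - foe[1]
    norm = np.hypot(dx, dy)
    norm[norm == 0] = 1.0  # avoid division by zero at the FOE
    return dx / norm, dy / norm

def scalar_constraint(Ix, Iy, It, dx, dy):
    """Reduce brightness constancy to one unknown per pixel.

    Substituting (u, v) = m * (dx, dy) into Ix*u + Iy*v + It = 0 gives
    a*m + b = 0 with a = Ix*dx + Iy*dy and b = It, where m is the signed
    parallax magnitude.
    """
    return Ix * dx + Iy * dy, It
```

The pair (a, b) returned by `scalar_constraint` is what a least squares or TLS estimator would then combine over a neighborhood.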

Reference: Amit K. Agrawal and Rama Chellappa, "3D Model Refinement using Surface Parallax", accepted in ICASSP 2004.

Extension to Hierarchical Framework

The above approach works well when the difference between the initial reference surface and the true surface is small enough to keep the parallax magnitude small. When the parallax magnitude increases, it may not be possible to estimate it at the finest level. The algorithm was therefore extended to a multi-scale framework to deal with such cases.

Gaussian pyramids of the intensity images and the reference depth map are built. The camera motion is estimated as above, and the same camera motion is used at all levels. At the coarsest level, depths are refined from the reference depth using the above approach, and the refined depths are propagated to the next finer level. Thus, the refined depth map from each coarser level becomes the initial reference surface for the next finer level, until the finest level is reached.
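The coarse-to-fine scheme can be sketched roughly as below. This is an illustrative outline under several assumptions: a simple [1, 2, 1]/4 blur stands in for the Gaussian filter, upsampling is nearest-neighbour, and `refine` is a hypothetical callback standing in for the single-level depth refinement (it would also need camera intrinsics scaled to each level, which is omitted here).

```python
import numpy as np

def downsample(img):
    """Blur with a separable [1, 2, 1]/4 kernel, then keep every second pixel."""
    k = np.array([0.25, 0.5, 0.25])
    p = np.pad(img, 1, mode="edge")
    blur = k[0] * p[:-2, 1:-1] + k[1] * p[1:-1, 1:-1] + k[2] * p[2:, 1:-1]
    p = np.pad(blur, 1, mode="edge")
    blur = k[0] * p[1:-1, :-2] + k[1] * p[1:-1, 1:-1] + k[2] * p[1:-1, 2:]
    return blur[::2, ::2]

def build_pyramid(img, levels):
    """Return [finest, ..., coarsest] with `levels` entries."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

def upsample(depth, shape):
    """Nearest-neighbour upsampling of a depth map to the given shape."""
    rows = np.minimum(np.arange(shape[0]) // 2, depth.shape[0] - 1)
    cols = np.minimum(np.arange(shape[1]) // 2, depth.shape[1] - 1)
    return depth[np.ix_(rows, cols)]

def coarse_to_fine(key, offset, ref_depth, levels, refine):
    """Refine depth from the coarsest level down to the finest.

    refine(key_l, offset_l, depth_l, level) is a hypothetical single-level
    refinement step; its output seeds the next finer level.
    """
    keys = build_pyramid(key, levels)
    offs = build_pyramid(offset, levels)
    depth = build_pyramid(ref_depth, levels)[-1]  # start at the coarsest level
    for lvl in range(levels - 1, -1, -1):
        depth = refine(keys[lvl], offs[lvl], depth, lvl)
        if lvl > 0:
            depth = upsample(depth, keys[lvl - 1].shape)
    return depth
```

Note that only the reference depth at the coarsest level is used as the starting point; every finer level starts from the upsampled result of the level before it, which matches the propagation described above.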

Robust Ego-Motion Estimation and 3D Modeling

The above two approaches assume the presence of a small planar surface for camera motion estimation. They were designed for applications such as modeling of urban terrain, in which this assumption holds well. A different approach is required to deal with more general scenarios. We have developed an algorithm which works well in outdoor as well as indoor environments and can robustly handle significant depth variations and noise in the available depth information. This algorithm does not require any assumption of a planar scene. It has been tested on real and synthetic data (both outdoor and indoor).

The algorithm iteratively estimates the camera motion and refines depths. Camera motion is estimated using the reference depth and the brightness constancy assumption. The estimated camera motion and the reference depth are then used to refine the depths, which in turn are used to refine the motion, until the ego-motion parameters converge. The motion estimate at each iteration can be used to constrain the direction of the parallax field, which resolves the aperture problem. The parallax magnitude is estimated using two models:

  1. Constant Parallax Model (CPM): The parallax magnitude is assumed to be constant within a neighborhood of each pixel.
  2. Depth Based Parallax Model (DBPM): The parallax magnitude is modeled on depth, i.e. it is assumed to be a parametric function of the depth Z at each pixel. The depth refinement can therefore produce non-smooth depth updates. In our experiments, this model performed much better than CPM when noise and significant depth variations were present.
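The alternation between motion estimation and depth refinement can be outlined as a simple fixed-point loop. The sketch below is only a skeleton: `estimate_motion` and `refine_depth` are hypothetical callbacks standing in for the actual estimation steps, and the convergence test on the change in motion parameters is an assumed stopping rule.

```python
import numpy as np

def alternate(ref_depth, estimate_motion, refine_depth, tol=1e-6, max_iter=50):
    """Alternate motion estimation and depth refinement until motion converges.

    estimate_motion(depth) -> motion parameter vector (hypothetical callback)
    refine_depth(motion, depth) -> refined depth (hypothetical callback)
    Returns (motion, depth, iterations_used).
    """
    depth = ref_depth
    motion = None
    for it in range(1, max_iter + 1):
        new_motion = np.asarray(estimate_motion(depth), dtype=float)
        depth = refine_depth(new_motion, depth)
        # Stop once the ego-motion parameters no longer change noticeably.
        if motion is not None and np.linalg.norm(new_motion - motion) < tol:
            return new_motion, depth, it
        motion = new_motion
    return motion, depth, max_iter
```

Either parallax model (CPM or DBPM) would sit inside `refine_depth`; the loop itself does not depend on which one is used.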

A structure-tensor-based approach, which provides a TLS solution, is used to estimate the parallax magnitude. Effectively, the problem is formulated as an eigenvalue analysis for CPM and a generalized eigenvalue analysis for DBPM. Confidence measures based on the eigenvalues of the resulting system are proposed; they are used to filter out regions containing incorrect depth estimates, so that the ego-motion is estimated robustly in subsequent iterations.
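For the constant parallax case, the eigenvalue formulation can be illustrated with a short sketch. It assumes each pixel in a neighborhood contributes a linear constraint a_i*m + b_i = 0 on the parallax magnitude m (for example, a_i as the image gradient projected on the parallax direction and b_i as the temporal derivative). The confidence value shown, one minus the ratio of the two eigenvalues, is an assumed illustrative choice and not necessarily the measure proposed in the paper.

```python
import numpy as np

def tls_parallax_magnitude(a, b):
    """Total least squares estimate of m from constraints a_i*m + b_i = 0.

    Builds the 2x2 tensor J = sum_i g_i g_i^T with g_i = (a_i, b_i). The
    eigenvector of the smallest eigenvalue is proportional to (m, 1).
    Returns (m, confidence); confidence is in [0, 1] (illustrative only).
    """
    G = np.column_stack([np.ravel(a), np.ravel(b)])
    J = G.T @ G
    evals, evecs = np.linalg.eigh(J)  # eigenvalues in ascending order
    v = evecs[:, 0]
    if abs(v[1]) < 1e-12 or evals[1] <= 0:
        return float("nan"), 0.0  # degenerate neighborhood, no estimate
    m = v[0] / v[1]
    conf = float(np.clip(1.0 - evals[0] / evals[1], 0.0, 1.0))
    return float(m), conf
```

A neighborhood with consistent constraints gives a smallest eigenvalue near zero and a confidence near one; noisy or inconsistent neighborhoods push the two eigenvalues closer together and could be filtered out by thresholding.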

Reference: accepted in ICIP 2004
