The aim of this project is to develop algorithms for fusing 3D data and video for applications such as remote surveillance, enhanced tele-presence, and visualization. Given a terrain map of a location and ground video, the video can be registered to the terrain map to generate texture-mapped 3D models of the environment, which can then be navigated virtually. The 3D models are rendered using OpenGL. The software is written in Visual C++ and Matlab.
In this project, an algorithm was developed to refine a coarse and partial 3D model (obtained from a Digital Elevation Map (DEM)) using surface parallax. Existing parallax-based approaches fall under "plane+parallax" methods, which assume the presence of a dominant plane in the scene. In "plane+parallax", the homography for the plane is estimated and used to warp images toward a reference image. The resulting flow, or parallax field, is an epipolar field, which is estimated and used to compute the depths of points relative to the plane.
The assumption of a dominant plane is not valid in several scenarios, such as outdoor urban environments. We have developed a technique that does not require this assumption. It uses the parallax field obtained by aligning a general non-planar 3D surface (obtained from the DEM or a range sensor) to calculate depths relative to the camera. For a planar surface, a homography exists that encapsulates the camera rotation and calibration effects. No such simple relationship exists for a non-planar surface. This approach requires camera calibration. We also assume that a "small" planar surface is present in the scene for camera motion estimation. This planar surface can be identified by least-squares fitting of planes over N*N neighborhoods on the reference surface.
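The plane search over N*N neighborhoods can be illustrated with a minimal sketch. It assumes the reference surface is a regular depth grid stored as a list of rows, models each patch as z = a*x + b*y + c, and scores a patch by its RMS fitting residual. The function names `fit_plane` and `most_planar_patch` are hypothetical and are not taken from the project software.

```python
import math

def fit_plane(depth, row, col, n):
    """Least-squares fit of z = a*x + b*y + c over the n-by-n window
    centred on (row, col). n must be odd and at least 3.
    Returns (a, b, c, rms_residual)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and >= 3")
    half = n // 2
    offsets = [(dx, dy) for dy in range(-half, half + 1)
                        for dx in range(-half, half + 1)]
    # With x and y measured from the window centre, sum(x), sum(y) and
    # sum(x*y) are all zero, so the normal equations decouple.
    sxx = syy = sxz = syz = sz = 0.0
    for dx, dy in offsets:
        z = depth[row + dy][col + dx]
        sxx += dx * dx
        syy += dy * dy
        sxz += dx * z
        syz += dy * z
        sz += z
    count = len(offsets)
    a, b, c = sxz / sxx, syz / syy, sz / count
    # RMS distance (measured along z) between the samples and the plane.
    sq = 0.0
    for dx, dy in offsets:
        r = depth[row + dy][col + dx] - (a * dx + b * dy + c)
        sq += r * r
    return a, b, c, math.sqrt(sq / count)

def most_planar_patch(depth, n):
    """Scan every valid window centre and return (row, col, rms) of the
    patch with the smallest plane-fit residual (first one found on ties)."""
    half = n // 2
    best = None
    for r in range(half, len(depth) - half):
        for c in range(half, len(depth[0]) - half):
            rms = fit_plane(depth, r, c, n)[3]
            if best is None or rms < best[2]:
                best = (r, c, rms)
    return best
```

Calling `most_planar_patch(depth, 5)` returns the centre of the 5*5 window that is closest to planar. In practice a threshold on the residual would probably also be needed to reject scenes with no usable planar patch.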
The algorithm uses two frames, called the key and offset frames. Using the estimated camera motion and the given reference depths, the offset image is warped toward the key image. Because the parallax direction can be obtained from the focus of expansion (FOE) estimates, only the parallax magnitude needs to be estimated to recover correct depths. The parallax magnitude is estimated using a tensor-based approach (a total least squares (TLS) estimator).
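The idea of fixing the parallax direction and estimating only its magnitude can be sketched as follows. This is an illustration under stated assumptions, not the estimator from the paper: the residual flow at a pixel is taken to be m*d, where d is the unit vector pointing from the FOE to the pixel, m is assumed constant over a small window, and brightness constancy Ix*u + Iy*v + It = 0 is assumed to hold. Each pixel then contributes a 2-vector (g, It) with g = Ix*dx + Iy*dy, and the TLS solution is read off the eigenvector of the 2x2 tensor with the smallest eigenvalue. The function name and the sample layout are hypothetical.

```python
import math

def parallax_magnitude_tls(samples, foe):
    """TLS estimate of a single parallax magnitude m for a window.

    samples: iterable of (x, y, Ix, Iy, It) tuples (pixel position,
             spatial gradients, temporal difference after warping).
    foe:     (x, y) position of the focus of expansion.
    Returns m (positive means motion away from the FOE), or None when
    the window carries no usable gradient along the parallax direction.
    """
    a = b = c = 0.0  # entries of the 2x2 tensor [[a, b], [b, c]]
    for x, y, ix, iy, it in samples:
        dx, dy = x - foe[0], y - foe[1]
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue  # direction is undefined at the FOE itself
        g = (ix * dx + iy * dy) / norm  # gradient along the parallax direction
        a += g * g
        b += g * it
        c += it * it
    if a + c == 0.0:
        return None  # no gradient information at all
    # Orientation of the eigenvector with the largest eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    # The eigenvector with the smallest eigenvalue is perpendicular to it.
    e0, e1 = -math.sin(theta), math.cos(theta)
    if abs(e1) < 1e-12:
        return None  # g is (near) zero everywhere: m is unobservable
    # e0*g + e1*It = 0 together with g*m + It = 0 gives m = e0 / e1.
    return e0 / e1
```

Unlike ordinary least squares, this formulation treats errors in the spatial and temporal derivatives symmetrically, which is the usual motivation for a TLS estimator.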
Reference: Amit K Agrawal and Rama Chellappa, "3D Model Refinement using Surface Parallax", accepted in ICASSP 2004.
Results
The above approach works well when the difference between the initial reference surface and the true surface is small enough to keep the parallax magnitude small. When the parallax magnitude increases, it may not be possible to estimate it at the finest level. The algorithm was extended to a multi-scale framework to deal with such cases.
A Gaussian pyramid of the intensity images and the reference depth map is built. The camera motion is estimated as above, and the same camera motion is used at all levels. At the coarsest level, depths are refined from the reference depth using the above approach, and the result is propagated to the next finer level. Thus, the refined depth map from each coarser level becomes the initial reference surface for the next level, until the finest level is reached.
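The coarse-to-fine scheme described above can be sketched as follows. The sketch assumes images and depth maps are lists of rows, uses a separable [1, 2, 1]/4 binomial kernel as a small Gaussian approximation, and upsamples by pixel replication. The single-level refinement step is passed in as a callback, `refine`, which is a hypothetical placeholder for the parallax-based refinement; none of these names come from the project software.

```python
def blur_121(img):
    """Separable [1, 2, 1]/4 blur with clamped borders."""
    rows, cols = len(img), len(img[0])
    tmp = [[(row[max(c - 1, 0)] + 2.0 * row[c] + row[min(c + 1, cols - 1)]) / 4.0
            for c in range(cols)] for row in img]
    return [[(tmp[max(r - 1, 0)][c] + 2.0 * tmp[r][c] + tmp[min(r + 1, rows - 1)][c]) / 4.0
             for c in range(cols)] for r in range(rows)]

def downsample(img):
    """Blur, then keep every second row and column."""
    return [row[::2] for row in blur_121(img)[::2]]

def build_pyramid(img, levels):
    """Return [finest, ..., coarsest] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def upsample_to(img, rows, cols):
    """Upsample by pixel replication to exactly rows x cols."""
    return [[img[min(r // 2, len(img) - 1)][min(c // 2, len(img[0]) - 1)]
             for c in range(cols)] for r in range(rows)]

def coarse_to_fine(key, offset, ref_depth, levels, refine):
    """Refine depth from the coarsest level down to the finest.

    refine(key_img, offset_img, depth, level) -> refined depth at that level
    (hypothetical callback standing in for the single-level algorithm).
    """
    key_pyr = build_pyramid(key, levels)
    off_pyr = build_pyramid(offset, levels)
    # The reference depth is only used directly at the coarsest level.
    depth = build_pyramid(ref_depth, levels)[-1]
    for level in range(levels - 1, -1, -1):
        depth = refine(key_pyr[level], off_pyr[level], depth, level)
        if level > 0:
            finer = key_pyr[level - 1]
            # The refined map becomes the reference surface one level finer.
            depth = upsample_to(depth, len(finer), len(finer[0]))
    return depth
```

Depth values are carried between levels unchanged here. Only pixel coordinates change with resolution, so a real `refine` would need camera intrinsics scaled to match each level.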
Results
The above two approaches assume the presence of a small planar surface for camera motion estimation. They were designed to target applications such as modeling of urban terrain in which the assumption is quite valid. To deal with more general scenarios, a different approach is required. We have developed an algorithm which works well in outdoor as well as indoor environments and can robustly handle significant depth variations and noise in available depth information. This algorithm doesn't require any assumption of a planar scene. The algorithm has been tested on real and synthetic data (both outdoor and indoor).
The algorithm iteratively estimates the camera motion and refines the depths. Using the reference depth and the brightness constancy assumption, the camera motion is estimated. The estimated camera motion and the reference depth are used to refine the depths, which in turn are used to refine the motion, until the ego-motion parameters converge. The motion estimate at each iteration can be used to constrain the direction of the parallax field, which resolves the aperture problem. The parallax magnitude is estimated using two models, CPM and DBPM.
A structure tensor based approach is used to estimate the parallax magnitude, which provides a TLS solution. Effectively, the problem is formulated as an eigenvalue analysis for CPM and a generalized eigenvalue analysis for DBPM. Confidence measures based on the eigenvalues of the resulting system are proposed and are used to filter out regions containing incorrect depth estimates, so that the ego-motion can be estimated robustly in subsequent iterations.
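The alternation between motion estimation and depth refinement, with confidence-based filtering, can be sketched as a simple control loop. This is a skeleton under stated assumptions: depth maps are flat lists, motion is a tuple of parameters, convergence is declared when no parameter changes by more than `tol`, and pixels are masked out when their confidence falls below `min_conf`. The callbacks `estimate_motion` and `refine_depth`, the thresholds, and the choice to feed the current depth estimate back into the refinement are all hypothetical. The actual confidence measures and stopping rule are not specified here.

```python
def alternate_motion_and_depth(ref_depth, estimate_motion, refine_depth,
                               tol=1e-6, max_iters=50, min_conf=0.5):
    """Alternate motion estimation and depth refinement until convergence.

    estimate_motion(depth, mask) -> tuple of motion parameters, using
        only pixels where mask is True.
    refine_depth(motion, depth)  -> (new_depth, confidence), two lists
        with one entry per pixel.
    Returns (motion, depth, iterations_used).
    """
    depth = list(ref_depth)
    mask = [True] * len(depth)  # trust every pixel on the first pass
    previous = None
    motion = None
    for iteration in range(1, max_iters + 1):
        motion = estimate_motion(depth, mask)
        depth, confidence = refine_depth(motion, depth)
        # Low-confidence pixels are excluded from the next motion estimate.
        mask = [c >= min_conf for c in confidence]
        if previous is not None:
            change = max(abs(m - p) for m, p in zip(motion, previous))
            if change < tol:
                return motion, depth, iteration
        previous = motion
    return motion, depth, max_iters
```

A cap on the number of iterations is included because nothing in this skeleton guarantees that the alternation converges for arbitrary callbacks.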
Reference: accepted in ICIP 2004