Date Speaker Title and Abstract
Aug 31, 2007
3:00 - 4:00 PM
A.V.W 4424
Dr. Manohar Mareboyana

Event Detection in Video Stream 

Background modeling is a complex issue in video surveillance. We address a simple case: observing a parking lot with a fixed wireless webcam. We propose a k-nearest-neighbor (kNN) distance metric for region (cluster) characterization. A set of temporal data belonging to each region is used to compute the mean and standard deviation of the kNN distance at location (x,y). The deviation of the region's kNN distance at time t from the mean value over the training set is used to decide whether an event (anomaly) has occurred. We compare the performance of the proposed method with other mathematical models used in the literature.
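The decision rule described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the three-component feature vector per region, the leave-one-out training statistic, k = 3, and the 3-sigma threshold are all assumptions made for the example, and the names `knn_distance`, `fit_region`, and `is_event` are hypothetical.

```python
import numpy as np

def knn_distance(sample, reference, k):
    """Mean Euclidean distance from `sample` to its k nearest rows of `reference`."""
    dists = np.linalg.norm(reference - sample, axis=1)
    return np.sort(dists)[:k].mean()

def fit_region(train, k=3):
    """Mean and standard deviation of leave-one-out kNN distances over training frames."""
    d = np.array([
        knn_distance(train[i], np.delete(train, i, axis=0), k)
        for i in range(len(train))
    ])
    return d.mean(), d.std()

def is_event(sample, train, mu, sigma, k=3, threshold=3.0):
    """Flag an event when the kNN distance deviates from the training mean by more than threshold * sigma."""
    d = knn_distance(sample, train, k)
    return bool(abs(d - mu) > threshold * sigma)

# Synthetic training data (assumed): 200 frames of a 3-component feature for one region.
rng = np.random.default_rng(0)
train = rng.normal(loc=100.0, scale=2.0, size=(200, 3))
mu, sigma = fit_region(train)

background_frame = np.array([100.0, 100.0, 100.0])  # resembles the training frames
changed_frame = np.array([40.0, 45.0, 50.0])        # far from anything seen in training

background_flag = is_event(background_frame, train, mu, sigma)
changed_flag = is_event(changed_frame, train, mu, sigma)
```

In this toy setup only the frame that lies far from the training data is flagged; a real system would need features and thresholds tuned to the scene.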
Sep 7, 2007
3:00 - 4:00 PM
A.V.W 4424
Aswin Sankaranarayanan
Tracking and Recognition from Multi-Camera Networks
Visual tracking of objects has attracted considerable attention over a long time. The basic issues in many application domains have been well addressed with increasingly efficient solutions over a wide class of domains. Motivated by surveillance needs, there has been a growing trend towards systems involving multiple cameras. Common examples are transport networks like roadways and subways, and public places like shopping malls. In this research proposal, we address estimation problems in the context of systems employing a host of sensors, of which video cameras form an integral part. Specifically, we look at two kinds of systems: systems employing a sparse set of cameras, and systems consisting of collocated acoustic-video nodes. In the context of video camera networks, our primary contribution is in the understanding of how random variables transform under projective transformations. While the geometric properties induced by the transformation have been well studied, there is a lack of understanding of the effect of the transformation on statistical entities. Specifically, we are interested in how the non-linearity of the transformation affects the properties of the random variable. We show that Normal random variables transform to a mixture density containing a Cauchy component. This has implications for the choice of estimators used in inference problems.
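As a small numerical illustration of the heavy-tailed behaviour described above, the sketch below simulates a one-dimensional perspective division x = X/Z with independent normal X and Z. It is not the speaker's derivation; the function name `project`, the unit variances, and the two depth settings are assumptions chosen for the example. When Z is centred at zero, the ratio is exactly standard Cauchy (upper quartile 1); when Z is centred far from zero, the ratio stays close to a scaled normal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def project(depth_mean, depth_std=1.0):
    """Simulate x = X / Z for X ~ N(0, 1) and Z ~ N(depth_mean, depth_std^2)."""
    X = rng.normal(0.0, 1.0, n)
    Z = rng.normal(depth_mean, depth_std, n)
    return X / Z

# Depth centred at zero: ratio of two zero-mean normals, i.e. a standard Cauchy.
near = project(depth_mean=0.0)
# Depth centred ten standard deviations from zero: division is nearly linear.
far = project(depth_mean=10.0)

q75_near = np.percentile(near, 75)  # a standard Cauchy has upper quartile 1
std_far = far.std()                 # close to 1 / depth_mean = 0.1
```

The sample variance of `near` does not settle as n grows, which is one reason estimators that assume finite variance can behave poorly in that regime.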

Further analysis of the form of the mixture density allows us to justify the use of linear estimators when certain geometric relationships are satisfied. The underlying theory applies to a wide range of problems tied to localization and estimation, especially in the context of multiple cameras observing a scene. The theory also finds application in vision tasks such as mosaicing and camera placement. We present a multi-target tracking system for collocated video and acoustic sensors. We formulate the tracking problem using a particle filter based on a state space approach, operating over the joint state space of both the video and acoustic modalities. For the joint operation of the filter, we combine the state vectors of the individual modalities and also introduce a time delay variable to handle the acoustic-video data synchronization issue caused by acoustic propagation delays. A novel particle filter proposal strategy for joint state space tracking is introduced, which places the random support of the joint filter where the final posterior is likely to lie. By using the Kullback-Leibler divergence measure, it is shown that the joint operation of the filter decreases the worst case divergence of the individual modalities. The resulting joint tracking filter is quite robust against video and acoustic occlusions.
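For readers unfamiliar with particle filtering, the basic predict, weight, and resample cycle can be sketched on a toy problem. The example below is a plain bootstrap particle filter for a one-dimensional random walk observed in Gaussian noise; it does not reproduce the joint acoustic-video state space, the time delay variable, or the proposal strategy described in the abstract. The model, noise levels, and the name `bootstrap_particle_filter` are assumptions made for illustration.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=1000,
                              process_std=0.5, obs_std=1.0, seed=0):
    """Posterior-mean estimates for x_t = x_{t-1} + w_t, y_t = x_t + v_t, starting from x = 0."""
    rng = np.random.default_rng(seed)
    particles = np.zeros(n_particles)
    estimates = []
    for y in observations:
        # Predict: propagate each particle through the random-walk dynamics.
        particles = particles + rng.normal(0.0, process_std, n_particles)
        # Weight: Gaussian likelihood of the observation, computed in log space.
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        estimates.append(np.sum(w * particles))
        # Resample: multinomial resampling proportional to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)

# Synthetic data generated from the same assumed model.
data_rng = np.random.default_rng(1)
truth = np.cumsum(data_rng.normal(0.0, 0.5, 100))
obs = truth + data_rng.normal(0.0, 1.0, 100)

est = bootstrap_particle_filter(obs)
rmse_filter = np.sqrt(np.mean((est - truth) ** 2))
rmse_raw = np.sqrt(np.mean((obs - truth) ** 2))
```

The bootstrap filter uses the state dynamics as its proposal, which is the simplest choice; the abstract's proposal strategy is designed to place particles closer to the final posterior than this.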

In addition to these two settings, we also discuss verification algorithms for establishing identity across non-overlapping cameras. Finally, we study methods for making particle filtering, a popular tool for nonlinear filtering, more amenable to parallel implementation.
Sep 14, 2007
3:00 - 4:00 PM
A.V.W 4424
Ashok Veeraraghavan
Dappled Photography: Mask Enhanced Cameras For Heterodyned Light Fields and Coded Aperture Refocussing


We describe reversible modulation of a 4D light field by inserting a patterned planar mask in the optical path of a lens-based camera. We can reconstruct the 4D light field from a 2D camera image without the additional lenses required by previous light field cameras. The patterned mask attenuates light rays inside the camera instead of bending them, and the attenuation recoverably encodes the rays on the 2D sensor. Our mask-equipped camera focuses just as a traditional camera does to capture conventional 2D photos at full sensor resolution, but the raw pixel values also hold a modulated 4D light field. The light field can be recovered by rearranging the tiles of the 2D Fourier transform of the sensor values into 4D planes and computing the inverse Fourier transform.

We also show how a broad-band mask placed at the lens enables us to compute refocusing at full sensor resolution for images of layered Lambertian scenes. This partial encoding of 4D ray-space data enables editing of image contents by depth to remove or suppress unwanted occluders, yet does not require computational recovery of the complete 4D light field.

Sep 21, 2007
3:00 - 4:00 PM
A.V.W 4424
Volkan Cevher
Compressive Sensing

The dogma of signal processing maintains that a signal must be sampled at a rate at least twice its highest frequency in order to be represented without error. However, in practice, we often compress the data soon after sensing, trading off signal representation complexity (bits) for some error (consider JPEG image compression in digital cameras, for example). Clearly, this is wasteful of valuable sensing resources. Over the past few years, a new theory of "compressive sensing" has begun to emerge, in which the signal is sampled (and simultaneously compressed) at a greatly reduced rate.

Compressive sensing is also referred to in the literature by the terms: compressed sensing, compressive sampling, and sketching/heavy-hitters.
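A toy example may make the idea of sampling at a reduced rate concrete. The sketch below takes m = 128 random linear measurements of a length-256 signal with only 5 nonzero entries and recovers it with orthogonal matching pursuit (OMP). OMP is one standard recovery algorithm chosen here for brevity; the abstract does not say which algorithms the talk covers. The dimensions, the Gaussian measurement matrix, and the function name `omp` are assumptions made for the example.

```python
import numpy as np

def omp(A, y, sparsity):
    """Greedy orthogonal matching pursuit: recover a `sparsity`-sparse x from y = A @ x."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 256, 128, 5  # signal length, number of measurements, number of nonzeros

# Sparse signal with k entries equal to +1 or -1 at random positions.
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], size=k)

# Random Gaussian measurement matrix and the compressed measurements.
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
y = A @ x

x_hat = omp(A, y, sparsity=k)
max_error = np.max(np.abs(x_hat - x))
```

The recovery relies on the signal being sparse; a dense signal could not be reconstructed from fewer measurements than unknowns in this way.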