Initial Background Plate Finding

The background plate is built from 50 images, sampled at every Nth frame of the video sequence. The number 50 was chosen arbitrarily; we did not see major tradeoffs in memory or speed when using a larger number of frames. The sampling interval N is then determined by the video length.
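
As a rough illustration of how N might follow from the video length, the sketch below picks up to 50 evenly spaced frame indices. The function name `sample_frame_indices`, the use of floor division, and the choice to start at frame 0 are assumptions made for illustration; they are not details stated above.

```python
def sample_frame_indices(total_frames, num_samples=50):
    """Return up to `num_samples` evenly spaced frame indices.

    Assumption: N is the video length divided by the number of
    samples, rounded down, and sampling starts at frame 0.
    """
    if total_frames <= 0:
        return []
    # Sampling interval N, never smaller than one frame.
    step = max(1, total_frames // num_samples)
    # Truncate in case the range yields one index too many.
    return list(range(0, total_frames, step))[:num_samples]


indices = sample_frame_indices(5000)  # N is 100 for a 5000-frame video
```

For a video shorter than 50 frames, this sketch simply returns every frame.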

We use a “max filter” to derive each pixel in the background plate, inspired by the median filter of [Haritaoglu]. In [Haritaoglu], each initial background plate pixel is the median of a sequence of sampled images. We instead take a background pixel to be the most common pixel value over time, i.e. the temporal mode. Comparing results from the max filter with those from a median and an average filter, we found empirically that it is slightly better and easy to compute.
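
A minimal sketch of the per-pixel "most common value" rule is given here. It assumes greyscale frames stored as nested lists of integers, which is not stated above; the function name `mode_background` is hypothetical, and ties are resolved in favour of the value seen first.

```python
from collections import Counter


def mode_background(frames):
    """Build a background plate from the per-pixel most common value.

    Assumption: `frames` is a non-empty list of equal-sized 2D lists
    of integer grey levels.
    """
    height, width = len(frames[0]), len(frames[0][0])
    plate = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Count how often each grey level occurs at this pixel.
            counts = Counter(frame[y][x] for frame in frames)
            # Keep the most frequent value (the temporal mode).
            plate[y][x] = counts.most_common(1)[0][0]
    return plate


# A bright object (200) moves across a constant background (100).
frames = [
    [[200, 100, 100]],
    [[100, 200, 100]],
    [[100, 100, 200]],
]
plate = mode_background(frames)
```

In this toy input, the moving object occupies each pixel in only one of the three frames, so the background value wins the count at every position.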

Experimentally, we also find that blurring the background plate improves segmentation results.

Figure 3 Sample frames from a lecture sequence

Background plate from every 50th frame, blurred

Figure 4 Background detected for above sequence

Motion Foreground Segmentation From the Background

Segmentation follows the flowchart below.

First, a frame is subtracted from the background plate to give the initial foreground. A binary mask is constructed that labels each pixel as either foreground or background. Empirically, a difference threshold of 10 to 15 works best. The result still contains a lot of noise.
Then a Gaussian blur is applied to the foreground to remove much of the noise.
Further noise reduction is performed using the Expand and Shrink operators described in [Jain].
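
The three steps above can be sketched as follows. This is only an illustration under several assumptions not given in the text: greyscale images as nested lists, a fixed 3x3 Gaussian kernel, the blur applied to the binary mask and re-thresholded at one half, 3x3 neighbourhoods for the Expand and Shrink operators, and Expand applied before Shrink. All function names are hypothetical.

```python
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _at(img, y, x):
    """Read a pixel, clamping coordinates to the image border."""
    y = min(max(y, 0), len(img) - 1)
    x = min(max(x, 0), len(img[0]) - 1)
    return img[y][x]


def difference_mask(frame, plate, threshold=10):
    """Step 1: mark pixels that differ from the plate by more than threshold."""
    return [[1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, plate)]


def blur_mask(mask):
    """Step 2: 3x3 Gaussian blur of the mask, re-thresholded at one half."""
    out = [[0] * len(mask[0]) for _ in mask]
    for y in range(len(mask)):
        for x in range(len(mask[0])):
            total = sum(GAUSS_3X3[dy + 1][dx + 1] * _at(mask, y + dy, x + dx)
                        for dy, dx in OFFSETS)
            out[y][x] = 1 if 2 * total >= 16 else 0
    return out


def expand(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [[max(_at(mask, y + dy, x + dx) for dy, dx in OFFSETS)
             for x in range(len(mask[0]))] for y in range(len(mask))]


def shrink(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is set."""
    return [[min(_at(mask, y + dy, x + dx) for dy, dx in OFFSETS)
             for x in range(len(mask[0]))] for y in range(len(mask))]


def segment_foreground(frame, plate, threshold=10):
    """Run differencing, blur, then Expand followed by Shrink (assumed order)."""
    mask = blur_mask(difference_mask(frame, plate, threshold))
    return shrink(expand(mask))


# Toy input: a 3x3 object plus one isolated noisy pixel on a flat plate.
plate = [[100] * 8 for _ in range(8)]
frame = [row[:] for row in plate]
for y in range(3, 6):
    for x in range(3, 6):
        frame[y][x] = 200
frame[1][1] = 150
mask = segment_foreground(frame, plate)
```

In this toy case the isolated pixel survives differencing but is removed by the blur and re-threshold, while the 3x3 object is kept. Applying Shrink before Expand instead would remove small specks rather than fill small holes, so the order is a design choice worth checking against [Jain].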

Step 1. Differencing (threshold = 10)

Step 2. Blur foreground

Step 3. Foreground noise reduction

Figure 5 Foreground segmentation steps
