CS 766 Assignment 3: Locality-constrained Linear Coding for Scene Classification

Saikat R. Gomes (saikat@cs.wisc.edu) & Stephen Lazzaro (slazzaro@cs.wisc.edu)

Contents

  1. Introduction
  2. Hard Code Word
    1. Results
  3. Locality-constrained Linear Coding
    1. Results
  4. Grid Search
  5. Sequential Hierarchy Classifier
    1. Manually assigned clusters
      1. Results
    2. Clusters from K-means
      1. Results
  6. Other Dataset Evaluation
    1. Birds
    2. Butterflies
  7. Other Experiments
    1. Results
  8. Scene Datasets
  9. Code
  10. Git Logs
  11. References

Locality-constrained Linear Coding (LLC)

We also experimented with Locality-constrained Linear Coding (LLC), which keeps the spatial pyramid matching scheme but uses a soft codeword assignment instead of a hard one. LLC makes the following modifications to the original spatial pyramid code:
  1. Rather than assigning each SIFT descriptor to just one of the M clusters, find the k nearest neighbors (among the M cluster centers) of each SIFT descriptor in each image. We chose k = 5.

  2. Then, use those nearest neighbors to reconstruct the feature x as an M x 1 vector with k nonzero values. These k values are normalized according to the distances between the corresponding cluster centers and the particular SIFT descriptor.

  3. Rather than using sum pooling and concatenating all of the features for each sub-region in the spatial pyramid scheme, LLC uses max pooling. This involves taking the maximum c value (cluster assignment value) for each cluster to construct the c vector for each pyramid level, and then normalizing that c vector by its length.
Using the LLC method, we obtained good results, with the added benefit of faster computation.
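
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the experiments. The function names `llc_code` and `max_pool` are hypothetical. The inverse-distance weighting is only one possible reading of "normalized according to the distances" in step 2. The "length" in step 3 is assumed to mean the L2 norm. The toy 2-D points stand in for 128-D SIFT descriptors and cluster centers.

```python
import math

def llc_code(descriptor, centers, k=5, eps=1e-12):
    """Soft-assign one descriptor to its k nearest cluster centers.

    Returns a list of length M (number of centers) with k nonzero
    entries, assuming k <= M.
    """
    # Step 1: distance from the descriptor to every cluster center.
    dists = [math.dist(descriptor, c) for c in centers]
    nearest = sorted(range(len(centers)), key=dists.__getitem__)[:k]

    # Step 2 (ASSUMPTION): weight each of the k centers by inverse
    # distance, then normalize so that the k weights sum to 1.
    raw = {j: 1.0 / (dists[j] + eps) for j in nearest}
    total = sum(raw.values())

    code = [0.0] * len(centers)
    for j, w in raw.items():
        code[j] = w / total
    return code

def max_pool(codes, eps=1e-12):
    """Step 3: per-cluster maximum over all codes in one pyramid cell,
    followed by division by the L2 norm (ASSUMED meaning of "length")."""
    pooled = [max(column) for column in zip(*codes)]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / (norm + eps) for v in pooled]

# Toy data: M = 4 centers, 3 descriptors falling in one pyramid cell.
centers = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.8]]

codes = [llc_code(d, centers, k=2) for d in descriptors]
feature = max_pool(codes)
```

In a full pipeline, `max_pool` would be applied to the codes in every sub-region at every pyramid level, and the pooled vectors would then be concatenated into the final image feature.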