SVM Light Demo

SVM Light is a C program by Thorsten Joachims that implements a support vector machine. This demo, which is compiled from the information at http://svmlight.joachims.org, shows how to use it as a classifier. First you need to install the SVM Light program by following the instructions below.

  1. mkdir svm_light
  2. Download svm_light.tar.gz from svmlight.joachims.org and put it in your svm_light directory.
  3. cd svm_light
  4. gunzip -c svm_light.tar.gz | tar xvf -
  5. make

Part 1 - Running the code

First we will download a simple example dataset: a text classification problem to learn which articles in a Reuters corpus are about corporate acquisitions.

  1. Download example1.tar.gz from svmlight.joachims.org and put it in your svm_light directory.
  2. gunzip -c example1.tar.gz | tar xvf -

This will create a directory called example1 that contains the training examples (train.dat) and the test examples (test.dat). To learn a support vector classifier from the training examples and then use it to classify the test examples, run these commands from the svm_light directory (the ./ prefix runs the binaries that make built there):

  1. ./svm_learn example1/train.dat example1/model
  2. ./svm_classify example1/test.dat example1/model example1/predictions

svm_classify prints the accuracy of the classifier on the test set, along with precision and recall, and writes one predicted value per test example to example1/predictions.
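When comparing several runs, it can help to recompute the accuracy directly from a predictions file. The sketch below assumes the SVM Light file conventions: each example line in test.dat begins with its target (+1 or -1), lines starting with # are comments, and the predictions file holds one real-valued output per example whose sign is the predicted class. The function name accuracy is hypothetical and not part of SVM Light.

```shell
# accuracy TEST_FILE PREDICTIONS_FILE
# Prints the percentage of test examples whose predicted sign matches the label.
# Assumption: a prediction greater than 0 means class +1, anything else class -1.
accuracy() {
    awk '
        # First file: predictions, one real value per line.
        NR == FNR { pred[FNR] = $1 + 0; next }
        # Second file: test examples; skip comment and blank lines.
        /^#/ || NF == 0 { next }
        {
            n++
            if ((($1 + 0) > 0) == (pred[n] > 0)) correct++
        }
        END {
            if (n == 0) { print "no examples found"; exit 1 }
            printf "%.2f\n", 100 * correct / n
        }
    ' "$2" "$1"
}
```

For example, 'accuracy example1/test.dat example1/predictions' should print a figure close to the one reported by svm_classify, although the two may differ if any prediction is exactly zero.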

Part 2 - Trying several kernels

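SVM Light provides several kernels: linear, polynomial, radial basis function, and sigmoid. The kernel is selected with the -t option of svm_learn. To see all the available options, type 'svm_learn -?' (or 'svm_learn -\?' if your shell treats ? as a wildcard). These are the -t values:

  0: linear (default)
  1: polynomial (s a*b+c)^d
  2: radial basis function exp(-gamma ||a-b||^2)
  3: sigmoid tanh(s a*b + c)
  4: user-defined kernel from kernel.h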

Repeat the commands from Part 1 using each of these kernels and report the test set accuracy. Which kernel would you recommend for this dataset?
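Repeating the Part 1 commands for each kernel can be scripted. The sketch below assumes that -t values 0 to 3 select the linear, polynomial, radial basis function, and sigmoid kernels (check this against the output of 'svm_learn -?'), and that the binaries and the example1 directory sit in the current directory. The names try_kernels, SVM_DIR, and DATA_DIR and the model_t/predictions_t file names are hypothetical.

```shell
# Assumed locations; override them if your layout differs.
SVM_DIR=${SVM_DIR:-.}
DATA_DIR=${DATA_DIR:-example1}

# try_kernels: train and classify once for each kernel type 0..3.
# Each run writes its own model and predictions file so results are not overwritten.
try_kernels() {
    for t in 0 1 2 3; do
        echo "=== kernel -t $t ==="
        "$SVM_DIR/svm_learn" -t "$t" \
            "$DATA_DIR/train.dat" "$DATA_DIR/model_t$t"
        "$SVM_DIR/svm_classify" \
            "$DATA_DIR/test.dat" "$DATA_DIR/model_t$t" "$DATA_DIR/predictions_t$t"
    done
}
```

After sourcing the script from the svm_light directory, run 'try_kernels' and read off the accuracy line that svm_classify prints under each header.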

Part 3 - Varying parameter settings

The radial basis function kernel (-t 2) has a parameter called gamma, which is set with the -g option. To see all the available options, type 'svm_learn -?' (or 'svm_learn -\?' if your shell treats ? as a wildcard).

Try varying gamma from 10^-6 to 10^2 and report the test set accuracy. Which setting would you recommend for this dataset?
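One way to cover that range is to step gamma through powers of ten. The sketch below assumes the radial basis function kernel is selected with -t 2 and gamma is passed with -g (check both against 'svm_learn -?'), and that the binaries and example1 sit in the current directory. The step size, the names try_gammas, SVM_DIR, and DATA_DIR, and the output file names are all hypothetical choices.

```shell
# Assumed locations; override them if your layout differs.
SVM_DIR=${SVM_DIR:-.}
DATA_DIR=${DATA_DIR:-example1}

# try_gammas: train and classify with the RBF kernel for gamma = 10^-6 ... 10^2.
# Gamma values are written in plain decimal notation, one per power of ten.
try_gammas() {
    for g in 0.000001 0.00001 0.0001 0.001 0.01 0.1 1 10 100; do
        echo "=== gamma $g ==="
        "$SVM_DIR/svm_learn" -t 2 -g "$g" \
            "$DATA_DIR/train.dat" "$DATA_DIR/model_g$g"
        "$SVM_DIR/svm_classify" \
            "$DATA_DIR/test.dat" "$DATA_DIR/model_g$g" "$DATA_DIR/predictions_g$g"
    done
}
```

If accuracy changes sharply between two neighboring values, a finer grid between them may be worth a second pass.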