Lab 5: Object-Oriented Programming (Classes and Modules)

Classes

5.0 Three Theories of Programming

There are three very well-established theories of how to structure a program. Placed roughly in chronological order, they are Imperative, Functional, and Object-Oriented programming.

  • Imperative programming envisions a program as an ordered list of instructions, which are executed one at a time. There is no concept of "scope". Repetition is arranged by telling the program where in the script to jump to.
  • Functional programming imagines a program as a collection of functions, each of which does a specific task. A program is a (possibly unnamed) function which breaks down its overall task into smaller pieces until we get to bite-sized tasks where we can easily complete them.
  • Object-oriented programming presents a program as a set of objects. An object "knows how to" do the things we expect of it because we write functions within a template for that object, and each object may also store arbitrary properties of its own. I like to use a language analogy here. A class (the "template" for an object) corresponds to a noun. Attributes within that class are adjectives. Methods and functions are verbs.
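
To make the three descriptions above a little more concrete, here is a rough sketch of one tiny task (adding up a list of numbers) written in each style. The names total_of, add, and Tally are made up for this illustration, and the class syntax is only explained in the sections that follow, so treat the last part as a preview rather than something you need to understand yet.

```python
# The task in all three styles: add up the numbers in a short list.
numbers = [3, 1, 4, 1, 5]

# Imperative style: a sequence of instructions that update a running total.
total_imperative = 0
for n in numbers:
  total_imperative = total_imperative + n

# Functional style: the task is split into small functions that call each other.
def add(a, b):
  return a + b

def total_of(values):
  # The total of an empty list is 0; otherwise add the first value
  # to the total of the rest.
  if not values:
    return 0
  return add(values[0], total_of(values[1:]))

total_functional = total_of(numbers)

# Object-oriented style: a Tally object (noun) stores its own total
# (adjective) and knows how to include a new number (verb).
class Tally:
  def __init__(self):
    self.total = 0

  def include(self, n):
    self.total = self.total + n

tally = Tally()
for n in numbers:
  tally.include(n)

print(total_imperative)
print(total_functional)
print(tally.total)
```

All three versions arrive at the same total; what changes is where the "knowledge" of how to add things up lives.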

5.1 Introducing Classes

Let's look at the simplest possible class definition.

class HelloWorld():
  message = "Hello World"

We define a class using "class", followed by the name of the class, followed by parentheses (which may be left empty or omitted entirely, as some later examples do), followed by a colon. Any code within the following block "belongs" to the class. We can access names defined within the block by prefacing them with the name of the class:

print HelloWorld.message

Note the lack of parentheses after message. This is a variable, but it belongs to HelloWorld. We can also make variables that have HelloWorld type:

i = HelloWorld()
print i.message

I've done two sneaky things here. First, I called HelloWorld() like a function, but I didn't write any functions or methods inside the class. Calling a class this way asks python to build a new object of that class, and python does so by running the class's "constructor", a function that tells python how to set up new objects of this class type. Since every class must have exactly one constructor, and I did not provide any, python gives me a default constructor.

Task:

  • Figure out what python's default constructor does. (Hint: What would you like it to do?)

The other sneaky thing I've done is to refer to message (a class variable) using i (the name of an instance). That leads us straight into...

5.2 Classes versus Instances

Let's talk classes and instances.

class SugarCookie:
  taste = "delicious!"
  health = "probably not very good for me"
  eaten = False

  def eat(self):
    if not eaten:
      print "Yum!"
      eaten = True

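I have added a method to my SugarCookie class. The method has one parameter (self) referring to the current instance. All methods start with this parameter. The class definition itself runs without complaint, but as soon as we call eat() we will get the complaint that eaten is a local variable, used before it is assigned (an UnboundLocalError). Because eat() assigns to eaten, python treats eaten as a local variable of the method rather than as the class variable. We can replace "eaten" with "SugarCookie.eaten" to refer to the class variable.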
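
Here is a small sketch of both behaviors side by side. The class names BrokenCookie and SharedCookie are made up for this illustration, and the classes are trimmed down to the eaten variable only. The code is written so that it should run under either python 2 or python 3.

```python
class BrokenCookie:
  eaten = False

  def eat(self):
    # "eaten" is assigned below, so python treats it as a local variable
    # of eat(); reading it here, before the assignment, fails.
    if not eaten:
      eaten = True

class SharedCookie:
  eaten = False

  def eat(self):
    # Naming the class makes python read and write the class variable.
    if not SharedCookie.eaten:
      SharedCookie.eaten = True

try:
  BrokenCookie().eat()
  error_name = None
except UnboundLocalError as err:
  error_name = type(err).__name__

print(error_name)

SharedCookie().eat()
print(SharedCookie.eaten)
```

Note that defining BrokenCookie causes no error at all; the complaint only appears once eat() is actually called.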

Let's try creating two cookies.

mine = SugarCookie()
yours = SugarCookie()
mine.eat()
print yours.eaten

I'll eat mine. But look: yours got eaten too! This is because eaten (and taste and health) are all class variables. They're shared among all instances of the class. Sometimes we want this behavior. In a minute, we'll make a cookie baking constructor, and we'll keep a counter of how many cookies are baked but uneaten. What do we do though if we don't want an attribute to be shared? The name "self" (a naming convention, not a true keyword) refers to the particular instance the method was called on. Let's modify eaten to self.eaten.

class SugarCookie:
  taste = "delicious!"
  health = "probably not very good for me"
  eaten = False

  def eat(self):
    if not self.eaten:
      print "Yum!"
      self.eaten = True

When I eat a SugarCookie, I don't want to eat all of the SugarCookies, I just want to eat one of them. Let's again try creating two of these objects and eating one of them. Notice that when I write self.eaten = True I do not change the class variable eaten (still shared by all cookies). Instead, I create a new instance variable, also called eaten, that hides the class variable for that one cookie. So I can see that the cookie I didn't eat still has eaten == False, while the cookie I did eat has eaten == True.

Big picture: Class variables are shared between all objects that share a particular type. So all SugarCookies have a taste of "delicious!". If I write SugarCookie.taste = "bad batch, ew", then all SugarCookies have a "bad batch, ew" taste. Instance variables are particular to the specific object they call home. Eating one cookie has no effect on other cookies.
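
If you want to check this for yourself, something like the following sketch should do it. It uses a trimmed copy of SugarCookie (no printing, no health attribute) and peeks at __dict__, which is where python keeps each object's own instance variables. We have not covered __dict__ in this lab, so take it as a diagnostic tool only.

```python
class SugarCookie:
  taste = "delicious!"
  eaten = False

  def eat(self):
    if not self.eaten:
      self.eaten = True  # creates an instance variable on this cookie only

mine = SugarCookie()
yours = SugarCookie()
mine.eat()

print(mine.eaten)          # the eaten cookie's own instance variable
print(yours.eaten)         # falls back to the class variable
print(SugarCookie.eaten)   # the class variable itself is untouched
print("eaten" in mine.__dict__)
print("eaten" in yours.__dict__)

# Changing a class variable through the class name affects every cookie.
SugarCookie.taste = "bad batch, ew"
print(mine.taste)
print(yours.taste)
```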

5.3 The Constructor

The constructor is called "__init__" (it has 2 underscores on either side). It always takes "self" as its first parameter, and has as many other parameters as inputs you want to give to it.

class GingerSnap:
  taste = "delicious!"
  count = 0

  def __init__(self):
    self.eaten = False
    GingerSnap.count += 1

  def eat(self):
    if not self.eaten:
      print "Yum!"
      self.eaten = True
      GingerSnap.count -= 1
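
A short usage sketch for the counter, assuming a trimmed copy of GingerSnap with the print statement and taste left out so that it runs under either python 2 or python 3. The variable names batch, after_baking, and after_eating are made up for this illustration.

```python
class GingerSnap:
  count = 0

  def __init__(self):
    self.eaten = False
    GingerSnap.count += 1

  def eat(self):
    if not self.eaten:
      self.eaten = True
      GingerSnap.count -= 1

# Each call to GingerSnap() runs __init__ once, so count goes up by one.
batch = [GingerSnap() for i in range(3)]
after_baking = GingerSnap.count

batch[0].eat()
batch[0].eat()  # eating the same cookie twice only counts once
after_eating = GingerSnap.count

print(after_baking)
print(after_eating)
```

The count lives on the class because it describes all cookies at once, while eaten lives on each instance because it describes a single cookie.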

Remember, you can only have one constructor per class. If I want to allow multiple ways to create an object of a given type, I can use optional parameters in my constructor:

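class Envelope:
  def __init__(self, addr, returnAddr, stamp, contents=None):
    self.addr = addr
    self.returnAddr = returnAddr
    self.stamp = stamp
    # Build a fresh list per envelope; a [] default would be shared by all of them.
    self.contents = [] if contents is None else contents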

This Envelope class must be given an address, return address, and stamp when it is created. You can also give it contents, but if you leave that out python will assume that the contents are an empty list.
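
One thing to watch for when a default value is a list: python evaluates a default value once, when the function is defined, not each time it is called. A list written directly as the default is therefore shared by every object created without that argument. A common guard is to use None as the default and build the list inside the constructor. The sketch below compares the two; SharedDefault and FreshDefault are made-up class names for this illustration.

```python
class SharedDefault:
  def __init__(self, contents=[]):
    # The same list object is reused by every call that omits contents.
    self.contents = contents

class FreshDefault:
  def __init__(self, contents=None):
    # None is a placeholder; a new list is built on each call.
    if contents is None:
      contents = []
    self.contents = contents

a = SharedDefault()
b = SharedDefault()
a.contents.append("letter")
print(b.contents)           # b sees the letter that was put into a

c = FreshDefault()
d = FreshDefault()
c.contents.append("letter")
print(d.contents)           # d is unaffected
```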

5.4 Static Methods

In addition to methods, which refer to their owning instance and are always called with statements like myCookie.eat(), we can also write functions (called "static methods") within a class. A static method does not care which instance owns it. You can still call a static method from an instance like any other method, but you can also call it from the class name. Let's modify our GingerSnap example.

class GingerSnap:
  taste = "delicious!"
  count = 0

  def __init__(self):
    self.eaten = False
    GingerSnap.count += 1

  def eat(self):
    if not self.eaten:
      print "Yum!"
      self.eaten = True
      GingerSnap.count -= 1

  @staticmethod
  def displayCount():
    print "There are", GingerSnap.count, "cookies left."

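We need the @staticmethod line immediately above the static method's definition. Without it, python treats displayCount as an ordinary method and passes the instance as the first argument whenever it is called from an instance, which fails with a TypeError because displayCount takes no parameters. Because a static method has no self parameter, make sure you don't try to access any instance variables from inside it.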
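
Here is a sketch of the two ways to call a static method, using a trimmed copy of GingerSnap. The message is built with string formatting so that the same line runs under either python 2 or python 3; the variable name snap is made up for this illustration.

```python
class GingerSnap:
  count = 0

  def __init__(self):
    GingerSnap.count += 1

  @staticmethod
  def displayCount():
    # No "self" here: the method only uses the class variable.
    print("There are %d cookies left." % GingerSnap.count)

GingerSnap.displayCount()   # called from the class, before any cookie exists
snap = GingerSnap()
snap.displayCount()         # called from an instance
GingerSnap.displayCount()   # same result as the instance call
```

Calling from the class name is handy precisely because it works even when no cookies have been baked yet.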

5.5 HL Reader Version 3

When we created our readML() functions, it wasn't really clear how to store the data in a way that other programs could use. Build a class called DataSet with instance attributes that are the various pieces of information you want to store about a dataset.

  1. Write a static method called help() that prints out a message explaining how to access the data in your DataSet class. Show this to the person next to you before you continue. It's always a good idea to get a second opinion on your interface design before you start seriously coding!
  2. Write a constructor for your DataSet type. (Hint: Your readML function could be useful here.)
  3. Since we are not trying to process the data yet, you should not need any extra methods. Test your class using the titanic_fatalities and dice_game data sets from last week.

Modules

5.6 Using Modules

Modules are essentially code files. You make them available using an import statement. Once you have imported a module, all subsequent code can access the classes, functions, and variables contained in the module file.

import numpy
print numpy.pi

The simplest form of the import statement is shown above. Generally, you want to make any needed import statements at the top of your program (before even any class or function definitions) and add a comment explaining why you need the import. The latter is important because it is very easy with larger programs to develop an enormous list of import statements that you do not actually need.

from numpy import pi
print pi

If you only need a few names from a large module, then you can use a from-import statement like this one. Here, I am just importing the variable pi from the module. (Python still loads all of numpy behind the scenes; only the name pi is made available to my code.) Notice that I don't use numpy.pi to refer to the variable after the import statement. I have essentially added pi from numpy to the names in my local module.

import numpy as np
print np.pi

Sometimes you want all of a module, but you don't want to use its full name. An import-as statement lets you (locally) rename a module. np is a pretty common renaming of numpy. Since you may be writing "NAMEOFMODULE." quite a lot, this is a rare case where it can be a good idea to shrink the module name down beyond readability.
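
The three import forms can be compared side by side. This sketch uses the built-in math module in place of numpy so that it runs without installing anything extra; the short name m is made up for this illustration.

```python
import math              # plain import: names are reached through "math."
from math import pi      # from-import: "pi" is added to this file's names
import math as m         # import-as: "m" is a second name for the same module

print(math.pi)
print(pi)
print(m.pi)

# All three refer to the same module and the same value.
print(m is math)
print(pi == math.pi)
```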

Task:

  • Close your console and start up a new one. Verify that you cannot create a SugarCookie in your console. Now import the file containing your Cookie classes and use it to create some cookies.
  • There is an unbelievable amount of useful code out there. Spyder's autocomplete tool helps you survey the functions and attributes available through a module. Use the remaining class time to explore pre-existing modules. Identify an interesting function. Figure out how to use it. When you're ready, share your discovery with the class. I recommend looking in the following modules: numpy, math, csv, sklearn (Note: The last one is a machine learning toolkit. There are some very useful tools in there, but you should expect to not be allowed to use them in the fall.)
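
If you are not working in Spyder, one possible way to survey a module is with the built-in dir() function and the __doc__ attribute, sketched below for the math module. The choice of math.hypot is just an example; in an interactive console, help(math.hypot) shows the same description in a friendlier layout.

```python
import math

# dir() lists the names a module provides; names starting with an
# underscore are internal, so they are filtered out here.
names = [name for name in dir(math) if not name.startswith("_")]
print(len(names))
print(names[:10])

# Most functions carry a short description in their __doc__ attribute.
print(math.hypot.__doc__)
print(math.hypot(3, 4))
```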

5.7 Pickling

We use the term "pickle" to refer to packing up and storing program data. There is a python module (pickle) that includes utility functions to do this. We're going to look at a version of that module called "cPickle", which is written in C for increased performance. If you're using python 3, just use pickle, which already includes the faster implementation.

import cPickle as pickle

To test the pickling process, start by creating an object. (A DataSet object or a SugarCookie object should be good for testing, but anything will do.) Run the first statement below to store the pickled version in a file, and the second to unpickle it again:

with open(filename, 'wb') as f:
  pickle.dump(myCookie, f)
with open(filename, 'rb') as f:
  cookieCopy = pickle.load(f)

The 'wb' and 'rb' modes indicate binary write and read. A pickled file is not a text file, so the line-by-line handling used in standard reading and writing does not apply; binary mode passes the raw bytes through unchanged. As a general rule, if you wrote a file using 'wb', read it using 'rb'.
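
Putting the pieces together, a complete round trip might look like the sketch below. A plain dictionary stands in for a cookie object, the file name cookie.pkl and the temporary folder are made up for this illustration, and the import is wrapped so that the same code runs under python 2 (cPickle) or python 3 (pickle).

```python
import os
import tempfile

try:
  import cPickle as pickle   # python 2
except ImportError:
  import pickle              # python 3

# Stand-in data; a DataSet or SugarCookie object would be used the same way.
myCookie = {"taste": "delicious!", "eaten": False}

# Hypothetical file name inside a fresh temporary folder.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "cookie.pkl")

with open(filename, "wb") as f:
  pickle.dump(myCookie, f)

with open(filename, "rb") as f:
  cookieCopy = pickle.load(f)

print(cookieCopy == myCookie)   # same contents
print(cookieCopy is myCookie)   # but a separate object

# Clean up the temporary file and folder.
os.remove(filename)
os.rmdir(folder)
```

The with blocks close each file as soon as the indented code finishes, which matters here because the file should be fully written before it is read back.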

Warning: Do not unpickle files you did not make yourself. Unpickling a file can run arbitrary code chosen by whoever created it, so this is a really good way to get a computer virus.
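
To see why the warning matters, here is a harmless sketch of the mechanism. A class can define a special method named __reduce__ that tells pickle which function to call when the object is unpickled. The class name Sneaky is made up for this illustration, and the function it names (the built-in sorted) is deliberately boring.

```python
import pickle

class Sneaky(object):
  def __reduce__(self):
    # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
    # A hostile file could name a far more dangerous function here.
    return (sorted, ([3, 1, 2],))

data = pickle.dumps(Sneaky())   # pickle into memory instead of a file
result = pickle.loads(data)     # unpickling calls sorted(...) for us

print(result)
print(isinstance(result, Sneaky))
```

Notice that what comes back is not a Sneaky object at all, but whatever the named function returned. The person who wrote the pickle decides what runs on your machine when you load it.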