The power of Introspection

Introspection:

code looking at other modules and functions in memory as objects, getting information about them, and manipulating them.

4.1. Diving In

Example 4.1. apihelper.py (download here)

# info takes any object that has functions or methods (like a module, which has
# functions, or a list, which has methods) and prints out the functions and their doc
# strings.

# collapse: multi-line doc strings are collapsed into a single long line (if collapse == 1)
# spacing: adjust the display of function names to make it easy to read

def info(object, spacing=10, collapse=1):
    """Print methods and doc strings.

    Takes module, class, list, dictionary, or string."""
    methodList = [e for e in dir(object) if callable(getattr(object, e))]
    processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)
    print "\n".join(["%s %s" % (method.ljust(spacing),
                                processFunc(str(getattr(object, method).__doc__)))
                     for method in methodList])

if __name__ == "__main__":
    print info.__doc__

Example 4.2 Sample Usage of apihelper.py

In [18]: from apihelper import info

In [20]: li = []

In [21]: info(li)
append     L.append(object) -- append object to end
count      L.count(value) -> integer -- return number of occurrences of value
extend     L.extend(iterable) -- extend list by appending elements from the iterable
index      L.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.
insert     L.insert(index, object) -- insert object before index
pop        L.pop([index]) -> item -- remove and return item at index (default last). Raises IndexError if list is empty or index is out of range.
remove     L.remove(value) -- remove first occurrence of value. Raises ValueError if the value is not present.
reverse    L.reverse() -- reverse *IN PLACE*
sort       L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*; cmp(x, y) -> -1, 0, 1

Example 4.3 Advanced Usage of apihelper.py

In [25]: import odbchelper

In [26]: info(odbchelper)
buildConnectionString Build a connection string from a dictionary Returns string.

In [28]: info(odbchelper, 30)
buildConnectionString          Build a connection string from a dictionary Returns string.

In [29]: info(odbchelper, 30, 0)
buildConnectionString          Build a connection string from a dictionary

        Returns string.

4.2. Using Optional and Named Arguments

Python allows function arguments to have default values

  • if the function is called without the argument, the argument gets its default value.
  • arguments can be specified in any order by using named arguments.

Example 4.4.a example of info

# spacing and collapse are optional (they have default values defined)
# object is required (it has no default value)
def info(object, spacing=10, collapse=1):

Example 4.4.b Valid Calls of info

# With only one argument (object), spacing defaults to 10 and collapse defaults to 1.
info(odbchelper)

# With two arguments, collapse gets its default value of 1.
info(odbchelper, 12)

# here you are naming the collapse argument explicitly and specifying its value.
# spacing still gets its default value of 10
info(odbchelper, collapse=0)

# Required arguments (i.e., object) can be named, and named arguments can appear in any order
info(spacing=15, object=odbchelper)
  • Arguments are simply a dictionary. The “normal” method of calling functions without argument names is actually just a shorthand where Python matches up the values with the argument names in the order they are specified in the function declaration.
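
The dictionary view of arguments can be checked directly with `**` unpacking. The following is a minimal sketch written for Python 3 (the notes themselves use Python 2 syntax); `describe_call` is a hypothetical stand-in with the same signature as info, not part of apihelper.py.

```python
def describe_call(object, spacing=10, collapse=1):
    """Hypothetical stand-in with info's signature; returns what it received."""
    return (object, spacing, collapse)

# Positional call: values are matched to names in declaration order.
positional = describe_call("obj", 15, 0)

# Named call: the order of the arguments no longer matters.
named = describe_call(collapse=0, object="obj", spacing=15)

# Dictionary call: ** unpacks a dict into named arguments.
arguments = {"object": "obj", "spacing": 15, "collapse": 0}
from_dict = describe_call(**arguments)

print(positional == named == from_dict)  # True
```

All three calls hand the function the same values; only the way the names are matched differs.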

4.3. Built-in Functions

key funcs

type(object), str(object), dir(object), callable(object)

4.3.1 type Function

  • the type function returns the datatype of any arbitrary object.
  • the possible types are listed in the types module.

Example 4.5 Introducing type

In [31]: type(1)
Out[31]: int

In [32]: li = []

In [33]: type(li)
Out[33]: list

In [34]: import odbchelper

In [35]: type(odbchelper)
Out[35]: module

In [36]: import types

In [37]: type(odbchelper) == types.ModuleType
Out[37]: True
  • type takes anything and returns its datatype. Integers, strings, lists, dictionaries, tuples, functions, classes, modules, even types are acceptable.
  • You can use the constants in the types module to compare types of objects.
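
As a small illustration of comparing against the constants in the types module, here is a sketch for Python 3. The helper name `kind_of` and the labels it returns are made up for this example.

```python
import math
import types

def kind_of(value):
    """Return a short label for a few common datatypes (illustrative helper)."""
    if type(value) == types.ModuleType:
        return "module"
    if type(value) == types.FunctionType:
        return "function"
    if isinstance(value, (list, tuple)):
        return "sequence"
    # Fall back to the name of whatever type() reports.
    return type(value).__name__

print(kind_of(math))     # module
print(kind_of(kind_of))  # function
print(kind_of([1, 2]))   # sequence
print(kind_of(1))        # int
```

isinstance is used for the sequence check because it also accepts subclasses, which a direct type comparison would reject.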

4.3.2 str Function

  • The str function coerces data into a string.
  • Every datatype can be coerced into a string.

Example 4.6 Introducing str

In [40]: horsemen = ['war','pestilence','famine']

In [41]: horsemen
Out[41]: ['war', 'pestilence', 'famine']

In [42]: horsemen.append('Powerbuilder')

In [43]: str(horsemen)
Out[43]: "['war', 'pestilence', 'famine', 'Powerbuilder']"   # str works on list

In [44]: str(odbchelper)
Out[44]: "<module 'odbchelper' from 'odbchelper.py'>"   # str works on modules

In [45]: str(None)   # str works on None
Out[45]: 'None'

4.3.2 dir Function

  • The dir function returns a list of the attributes and methods of any object: modules, functions, strings, lists, dictionaries, ...

Example 4.7 Introducing dir

In [47]: li = []

In [48]: dir(li)
Out[48]: ['__add__',
          '__class__',
          '__contains__',
          '__delattr__',
          '__delitem__',
          '__delslice__',
          '__doc__',
          '__eq__',
          '__format__',
          '__ge__',
          '__getattribute__',
          '__getitem__',
          '__getslice__',
          '__gt__',
          '__hash__',
          '__iadd__',
          '__imul__',
          '__init__',
          '__iter__',
          '__le__',
          '__len__',
          '__lt__',
          '__mul__',
          '__ne__',
          '__new__',
          '__reduce__',
          '__reduce_ex__',
          '__repr__',
          '__reversed__',
          '__rmul__',
          '__setattr__',
          '__setitem__',
          '__setslice__',
          '__sizeof__',
          '__str__',
          '__subclasshook__',
          'append',
          'count',
          'extend',
          'index',
          'insert',
          'pop',
          'remove',
          'reverse',
          'sort']

In [51]: import odbchelper

In [52]: dir(odbchelper)
Out[52]: ['__author__',
          '__builtins__',
          '__copyright__',
          '__date__',
          '__doc__',
          '__file__',
          '__license__',
          '__name__',
          '__package__',
          '__version__',
          'buildConnectionString']

4.3.3 callable Function

  • the callable function takes any object and returns True if the object can be called, or False otherwise.
  • Callable objects include functions, class methods, even classes themselves.

Example 4.8 Introducing callable

In [54]: import string

In [55]: string.punctuation
Out[55]: '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [56]: string.join
Out[56]: <function string.join>

In [57]: callable(string.punctuation)
Out[57]: False

In [58]: callable(string.join)
Out[58]: True

In [59]: print string.join.__doc__
join(list [,sep]) -> string

    Return a string composed of the words in list, with
    intervening occurrences of sep.  The default separator is a
    single space.

    (joinfields and join are synonymous)

usage of callable

By using the callable function on each of an object’s attributes, you can determine which attributes you care about (methods, functions, classes) and which you want to ignore (constants and so on) without knowing anything about the object ahead of time.
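
One way to apply this idea is sketched below for Python 3. The helper `split_attributes` is a hypothetical name, and skipping names that start with an underscore is a choice made only to keep the output short.

```python
import string

def split_attributes(obj):
    """Separate an object's public attribute names into callables and the rest."""
    names = [name for name in dir(obj) if not name.startswith("_")]
    methods = [name for name in names if callable(getattr(obj, name))]
    data = [name for name in names if not callable(getattr(obj, name))]
    return methods, data

methods, data = split_attributes(string)
print("capwords" in methods)   # True: a function
print("punctuation" in data)   # True: a string constant
```

Nothing in the helper is specific to the string module; it works on any object that dir accepts.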

4.3.4 Built-in Functions

  • type, str, dir, and all the rest of Python’s built-in functions are grouped into a special module called __builtin__

Example 4.9 Built-in Attributes and Functions

In [60]: from apihelper import info

In [61]: import __builtin__

In [62]: info(__builtin__,20)
ArithmeticError      Base class for arithmetic errors.
AssertionError       Assertion failed.
AttributeError       Attribute not found.
BaseException        Common base class for all exceptions
BufferError          Buffer error.

[...skip...]

4.4. Getting Object References With getattr

key func

getattr(object, attr)

getattr allows you to get a reference to a function (or any other attribute) without knowing its name until run time.

Example 4.10 Introducing getattr

In [63]: li = ["Larry","Curly"]

In [64]: li.pop
Out[64]: <function pop>

In [65]: getattr(li,"pop")
Out[65]: <function pop>

In [66]: getattr(li,"append")("Moe")

In [67]: li
Out[67]: ['Larry', 'Curly', 'Moe']

In [68]: getattr({},"clear")
Out[68]: <function clear>

In [69]: getattr((),"pop")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-69-e1701c6ecc00> in <module>()
----> 1 getattr((),"pop")

AttributeError: 'tuple' object has no attribute 'pop'
  • [64]: this gets a reference to the pop method of the list. Note that this is not calling the pop method; that would be li.pop().
  • [65]: This also returns a reference to the pop method, but this time, the method name is specified as a string argument to the getattr function.
  • [65]: getattr returns any attribute of any object. In this case, the object is a list, and the attribute is the pop method.
  • [66]: getattr(x, "y") is equivalent to x.y. In this case, getattr(li, "append")("Moe") is equivalent to li.append("Moe").

4.4.1 getattr with Modules

Example 4.11 The getattr Function in apihelper.py

In [71]: import odbchelper

In [72]: odbchelper.buildConnectionString
Out[72]: <function odbchelper.buildConnectionString>

In [73]: getattr(odbchelper,"buildConnectionString")
Out[73]: <function odbchelper.buildConnectionString>

In [74]: object = odbchelper

In [75]: method = "buildConnectionString"

In [76]: getattr(object,method)
Out[76]: <function odbchelper.buildConnectionString>

In [77]: type(getattr(object,method))
Out[77]: function

In [78]: import types

In [80]: type(getattr(object, method)) == types.FunctionType
Out[80]: True

In [81]: callable(getattr(object,method))
Out[81]: True
  • Using getattr, you can get the same reference to the same function. In general, getattr(object, "attribute") is equivalent to object.attribute. If object is a module, then attribute can be anything defined in the module: a function, class, or global variable.
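
The same equivalence can be tried with a standard-library module when odbchelper is not at hand. This is a short Python 3 sketch that uses math purely as an example module.

```python
import math

# The attribute name is only known as a string at run time.
name = "sqrt"
func = getattr(math, name)

print(func is math.sqrt)               # True: the very same function object
print(func(16))                        # 4.0
print(getattr(math, "pi") == math.pi)  # True: works for non-callable attributes too
```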

4.4.2. getattr As a Dispatcher

idea

For example, if you had a program that could output data in a variety of different formats, you could define separate functions for each output format and use a single dispatch function to call the right one.

Let’s imagine a program that prints site statistics in HTML, XML, and plain text formats. The choice of output format could be specified on the command line, or stored in a configuration file. A statsout module defines three functions, output_html, output_xml, and output_text. Then the main program defines a single output function:

Example 4.12 Creating a Dispatcher with getattr

import statsout

# output function takes one required argument, data, and one optional argument, format.
# if format is not specified, it defaults to text
def output(data, format="text"):
    output_function = getattr(statsout, "output_%s" % format)
    return output_function(data)

advantage

This allows you to easily extend the program later to support other output formats, without changing this dispatch function. Just add another function to statsout named, for instance, output_pdf, and pass “pdf” as the format into the output function.

bug

There is no error checking. What happens if the user passes in a format that doesn’t have a corresponding function defined in statsout? getattr will raise an AttributeError, so output never gets as far as calling a function, and the caller is left with an exception about a missing attribute instead of any output. That’s bad.

solution

getattr takes an optional third argument, a default value.

Example 4.13 getattr Default Values

import statsout

def output(data, format="text"):

    # The third argument (i.e., statsout.output_text) is a default value
    # that is returned if the attribute or method specified by the second argument wasn't found.
    output_function = getattr(statsout, "output_%s" % format, statsout.output_text)
    return output_function(data)
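
To try the dispatcher without a real statsout module, the sketch below (Python 3) substitutes a small class, `StatsOut`, as a hypothetical stand-in. Its two functions and their return formats are invented for illustration.

```python
class StatsOut(object):
    """Hypothetical stand-in for the statsout module."""

    @staticmethod
    def output_text(data):
        return "text: %s" % data

    @staticmethod
    def output_html(data):
        return "<p>%s</p>" % data

statsout = StatsOut

def output(data, format="text"):
    # Fall back to output_text when no output_<format> function exists.
    output_function = getattr(statsout, "output_%s" % format, statsout.output_text)
    return output_function(data)

print(output("42", "html"))  # <p>42</p>
print(output("42", "pdf"))   # text: 42 (no output_pdf, so the default is used)

# Without the third argument, a missing attribute raises AttributeError.
try:
    getattr(statsout, "output_pdf")
    missing_raises = False
except AttributeError:
    missing_raises = True
print(missing_raises)        # True
```

Silently falling back to plain text is a design choice; a program that should reject unknown formats could catch AttributeError and report the bad format name instead.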

4.5. Filtering Lists

List comprehensions can be combined with a filtering mechanism, where some elements in the list are mapped while others are skipped entirely.

list filtering syntax

# any element for which the filter expression evaluates true will be included in the mapping.
# All other elements are ignored,
# so they are never put through the mapping expression and are not included in the output list.
[mapping-expression for element in source-list if filter-expression]

Example 4.14 Introducing List Filtering

In [1]: li = ["a","mpilgrim","foo","b","c","b","d","d"]

In [2]: [elem for elem in li if len(elem) > 1]
Out[2]: ['mpilgrim', 'foo']

In [3]: [elem for elem in li if elem != "b"]
Out[3]: ['a', 'mpilgrim', 'foo', 'c', 'd', 'd']

In [4]: [elem for elem in li if li.count(elem) == 1]
Out[4]: ['a', 'mpilgrim', 'foo', 'c']
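  • [4]: count is a list method that returns the number of times a value occurs in the list. This filter does not eliminate duplicates by keeping one copy of each value; values that appear more than once (here, b and d) are excluded completely.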

mechanism

The mapping expression in the previous example is simple (it just returns the value of each element). As Python loops through the list, it runs each element through the filter expression. If the filter expression is true, the element is mapped and the result of the mapping expression is included in the returned list.
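
For contrast, here is a sketch (Python 3) in which the mapping expression does real work, followed by one possible filter that keeps a single copy of each value. The variable names `shouted` and `unique` are made up for this example.

```python
li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"]

# Only elements that pass the filter (length > 1) reach the mapping
# expression, which upper-cases them.
shouted = [elem.upper() for elem in li if len(elem) > 1]
print(shouted)  # ['MPILGRIM', 'FOO']

# Keep the first copy of each value: an element passes the filter only if
# it has not appeared earlier in the list.
unique = [elem for index, elem in enumerate(li) if elem not in li[:index]]
print(unique)   # ['a', 'mpilgrim', 'foo', 'b', 'c', 'd']
```

The second comprehension rescans the start of the list for every element, so it is fine for short lists but slow for long ones.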

4.6. The Peculiar Nature of and and or

and and or perform boolean logic, but they do not return boolean values; instead, they return one of the actual values they are comparing.

Example 4.15 Introducing and

In [1]: 'a' and 'b'
Out[1]: 'b'

In [2]: '' and 'b'
Out[2]: ''

In [3]: 'a' and 'b' and 'c'
Out[3]: 'c'
  • If all values are true in a boolean context, and returns the last value.
  • If any value is false in a boolean context, and returns the first false value.

Example 4.16 Introducing or

In [5]: 'a' or 'b'
Out[5]: 'a'

In [6]: '' or 'b'
Out[6]: 'b'

In [7]: '' or [] or {}
Out[7]: {}

In [8]: def sidefx():
   ...:     print "in sidefx()"
   ...:     return 1

In [9]: 'a' or sidefx()
Out[9]: 'a'
  • If any value is true, or returns that value immediately. ( or evaluates values only until it finds one that is true in a boolean context, and then it ignores the rest.)
  • If all values are false, or returns the last value.
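
The short-circuit behaviour of both operators can be observed by recording which operands actually get evaluated. This is a Python 3 sketch; `record` and `calls` are hypothetical names used only here.

```python
calls = []

def record(label, value):
    """Note that this operand was evaluated, then hand back its value."""
    calls.append(label)
    return value

# or stops at the first true operand; the second is never evaluated.
result_or = record("first", "a") or record("second", "b")
calls_or = list(calls)
print(result_or)   # a
print(calls_or)    # ['first']

# and stops at the first false operand.
del calls[:]
result_and = record("first", "") and record("second", "b")
calls_and = list(calls)
print(repr(result_and))  # ''
print(calls_and)         # ['first']

# When every operand is false, or evaluates them all and returns the last.
del calls[:]
result_all_false = record("x", "") or record("y", []) or record("z", {})
calls_all_false = list(calls)
print(result_all_false)  # {}
print(calls_all_false)   # ['x', 'y', 'z']
```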

4.6.1 Using the and-or Trick

usage

  • simplify the syntax
  • it works in places where if statements are not allowed (e.g., inside lambda functions)

Analogy with C

C:

/* expression evaluates to a if bool is true, and b otherwise*/
bool ? a : b

Python:

# expression evaluates to a if (bool and a) is true, and b otherwise
# behaves like the C expression only when a is true in a boolean context
bool and a or b

Example 4.17 Introducing the and-or Trick

In [11]: a = "first"

In [12]: b = "second"

In [13]: 1 and a or b
Out[13]: 'first'

In [14]: 0 and a or b
Out[14]: 'second'

bug

If the value of a is false, the expression will not work as you would expect it to. In other words, the and-or trick, bool and a or b, will not work like the C expression bool ? a : b when a is false in a boolean context.

Example 4.18 When the and-or Trick Fails

In [15]: a = ""

In [16]: b = "second"

In [17]: 1 and a or b
Out[17]: 'second'

solution

The real trick behind the and-or trick, then, is to make sure that the value of a is never false. One common way of doing this is to turn a into [a] and b into [b], and then take the first element of the returned list, which will be either a or b.

Example 4.19 Using the and-or Trick safely

In [18]: a = ""

In [19]: b = "second"

In [20]: (1 and [a] or [b])[0]
Out[20]: ''
  • Since [a] is a non-empty list, it is never false. Even if a is 0 or '' or some other false value, the list [a] is true because it has one element.
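
The safe form can be wrapped in a helper and checked against Python's conditional expression (a if condition else b, available since Python 2.5), which is not covered in these notes but expresses the same choice directly. This is a Python 3 sketch; `choose` is a hypothetical name.

```python
def choose(condition, a, b):
    """Safe and-or trick: [a] is never false, so the result is a or b as intended."""
    return (condition and [a] or [b])[0]

# Compare against the conditional expression for true and false values of a.
results = []
for a in ("first", "", 0, None):
    results.append(choose(1, a, "second") == (a if 1 else "second"))
    results.append(choose(0, a, "second") == (a if 0 else "second"))

print(all(results))             # True
print(repr(choose(1, "", "b"))) # ''
```

Unlike the C operator and the conditional expression, a function such as choose evaluates both a and b before it is called, so it is only a demonstration of the list-wrapping idea.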

4.7 Using lambda Functions

Python supports an interesting syntax that lets you define one-line mini-functions on the fly. Borrowed from Lisp, these so-called lambda functions can be used anywhere a function is required.

Example 4.20 Introducing lambda Functions

In [1]: def f(x):
   ...:     return x*2

In [2]: f(3)
Out[2]: 6

In [3]: g = lambda x: x*2

In [4]: g(3)
Out[4]: 6

In [5]: (lambda x: x*2)(3)
Out[5]: 6
  • [3]: This is a lambda function that accomplishes the same thing as the normal function above it.

  • [3]: lambda function:

    • function that takes any number of arguments (including optional arguments) and returns the value of a single expression
    • lambda functions cannot contain commands
    • cannot contain more than one expression
    • no parentheses around the argument list
    • return keyword is missing (it is implied, since the entire function can only be one expression)
    • function has no name, but it can be called through the variable it is assigned to

4.7.1 Real-World lambda Functions

Here are the lambda functions in apihelper.py:

# processFunc is now a function, but which function it is depends on the value of the collapse variable.
# If collapse is true, processFunc(string) will collapse whitespace;
# otherwise, processFunc(string) will return its argument unchanged

processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)

Example 4.21 split With No Arguments

In [1]: s = "this   is\na\ttest"

In [2]: print s
this   is
a       test

In [3]: print s.split()
['this', 'is', 'a', 'test']

In [5]: print " ".join(s.split())
this is a test
  • [2]: This is a multiline string, defined by escape characters instead of triple quotes. \n is a newline, and \t is a tab character.
  • [3]: split without any arguments splits on whitespace. So three spaces, a newline, and a tab character are all the same.
  • [5]: You can normalize whitespace by splitting a string with split and then rejoining it with join, using a single space as a delimiter.
  • [5]: This is what the info function does to collapse multi-line doc strings into a single line.
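
The selection of processFunc can be isolated and tried on its own. In this Python 3 sketch, `make_process_func` is a hypothetical wrapper around the same and-or expression, and `doc` is a sample doc string.

```python
def make_process_func(collapse):
    """Return a whitespace-collapsing function if collapse is true, else an identity function."""
    # The and-or trick is safe here: a lambda is always true in a boolean context.
    return collapse and (lambda s: " ".join(s.split())) or (lambda s: s)

doc = "Build a connection string from a dictionary\n\n    Returns string."

collapsed = make_process_func(1)(doc)
unchanged = make_process_func(0)(doc)

print(collapsed)         # Build a connection string from a dictionary Returns string.
print(unchanged == doc)  # True
```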

4.8 Putting It All Together

print "\n".join(["%s %s" % (method.ljust(spacing), processFunc(str(getattr(object,method).__doc__)))
                for method in methodList])

Example 4.22 Getting a doc string Dynamically

In [29]: import odbchelper

In [30]: object = odbchelper

In [31]: method = 'buildConnectionString'

In [32]: getattr(object,method)
Out[32]: <function odbchelper.buildConnectionString>

In [33]: print getattr(object,method).__doc__
Build a connection string from a dictionary

        Returns string.

Example 4.23 Why Use str on a doc string?

In [35]: def foo(): print 2

In [36]: foo()
2

In [37]: foo.__doc__

In [38]: foo.__doc__ == None
Out[38]: True

In [39]: str(foo.__doc__)
Out[39]: 'None'
  • reason: processFunc is assuming a string argument and calling its split method, which would crash if you passed it None because None doesn’t have a split method.

Example 4.24 Introducing ljust

In [41]: s = 'buildConnectionString'

In [42]: s.ljust(30)
Out[42]: 'buildConnectionString         '

In [43]: s.ljust(20)
Out[43]: 'buildConnectionString'
# pads the string with spaces to the given length.
# if the given length is smaller than the length of the string, ljust simply returns the string unchanged.
# it never truncates the string.
string.ljust(length)
  • This is what the info function uses to make two columns of output and line up all the doc strings in the second column.
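
The two-column effect can be reproduced with a couple of made-up rows. This is a Python 3 sketch; the contents of `rows` are illustrative and not real output of info.

```python
rows = [
    ("append", "add to end"),
    ("buildConnectionString", "build a string"),
]

# Pad each name to 10 characters; longer names are left as they are.
lines = ["%s %s" % (name.ljust(10), doc) for name, doc in rows]
print("\n".join(lines))
```

The first row lines up in a 10-character column, while the second name is longer than 10 characters and simply pushes its description to the right.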

Example 4.25 Printing a List

In [44]: li = ['a','b','c']

In [45]: print '\n'.join(li)
a
b
c
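
Putting the pieces together, here is a hedged sketch of how info might look in Python 3. It is not the original apihelper.py: the name `info_text` is hypothetical, it returns the text instead of printing it, and it uses a conditional expression in place of the and-or trick. The `Sample` class exists only to give it something to inspect.

```python
def info_text(obj, spacing=10, collapse=True):
    """Return one line per callable attribute of obj, with its doc string."""
    method_names = [name for name in dir(obj) if callable(getattr(obj, name))]
    process = (lambda s: " ".join(s.split())) if collapse else (lambda s: s)
    return "\n".join(
        "%s %s" % (name.ljust(spacing), process(str(getattr(obj, name).__doc__)))
        for name in method_names
    )

class Sample(object):
    """Hypothetical class used only to exercise info_text."""

    def greet(self):
        """Say hello.

        Returns nothing."""

    def undocumented(self):
        pass

report = info_text(Sample(), spacing=15)
# Skip the special methods (names starting with an underscore) for brevity.
public_lines = [line for line in report.splitlines() if not line.startswith("_")]
print("\n".join(public_lines))
```

The undocumented method shows why str is applied to the doc string: its __doc__ is None, which str turns into the text 'None' before the whitespace is collapsed.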