SourceForge.net. Fast, secure and Free Open Source software downloads

mechanize — Documentation

Full API documentation is in the docstrings and the documentation of urllib2. The documentation in these web pages is in need of reorganisation at the moment, after the merge of ClientCookie and ClientForm into mechanize.

Tests and examples

Examples

The front page has some introductory examples.

The examples directory in the source packages contains a couple of silly, but working, scripts to demonstrate basic use of the module.

See also the forms examples (these examples use the forms API independently of mechanize.Browser).

Tests

To run the tests:

python test.py

There are some tests that try to fetch URLs from the internet. To include those in the test run:

python test.py discover --tag internet

The urllib2 interface

mechanize exports the complete interface of urllib2. See the urllib2 documentation. For example:

import mechanize
response = mechanize.urlopen("http://www.example.com/")
print response.read()

Compatibility

These notes explain the relationship between mechanize, ClientCookie, ClientForm, cookielib and urllib2, and which to use when. If you’re just using mechanize, and not any of those other libraries, you can ignore this section.

  1. mechanize works with Python 2.4, Python 2.5, Python 2.6, and Python 2.7.

  2. When using mechanize, anything you would normally import from urllib2 should be imported from mechanize instead.

  3. Use of mechanize classes with urllib2 (and vice-versa) is no longer supported. However, existing classes implementing the urllib2 Handler interface are likely to work unchanged with mechanize.

  4. mechanize now only imports urllib2.URLError and urllib2.HTTPError from urllib2. The rest is forked. I intend to merge fixes from Python trunk frequently.

  5. ClientForm is no longer maintained as a separate package. The code is now part of mechanize, and its interface is now exported through module mechanize (since mechanize 0.2.0). Old code can simply be changed to import mechanize as ClientForm and should continue to work.

  6. ClientCookie is no longer maintained as a separate package. The code is now part of mechanize, and its interface is now exported through module mechanize (since mechanize 0.1.0). Old code can simply be changed to import mechanize as ClientCookie and should continue to work.

  7. The cookie handling parts of mechanize are in Python 2.4 standard library as module cookielib and extensions to module urllib2. mechanize does not currently use cookielib, due to the presence of thread synchronisation code in cookielib that is not present in the mechanize fork of cookielib.

API differences between mechanize and urllib2:

  1. mechanize provides additional features.

  2. mechanize.urlopen differs in its behaviour: it handles cookies, whereas urllib2.urlopen does not. To make a urlopen function with the urllib2 behaviour:

import mechanize
handler_classes = [mechanize.ProxyHandler,
mechanize.UnknownHandler,
mechanize.HTTPHandler,
mechanize.HTTPDefaultErrorHandler,
mechanize.HTTPRedirectHandler,
mechanize.FTPHandler,
mechanize.FileHandler,
mechanize.HTTPErrorProcessor]
opener = mechanize.OpenerDirector()
for handler_class in handler_classes:
opener.add_handler(handler_class())
urlopen = opener.open
  1. Since Python 2.6, urllib2 uses a .timeout attribute on Request objects internally. However, urllib2.Request has no timeout constructor argument, and urllib2.urlopen() ignores this parameter. mechanize.Request has a timeout constructor argument which is used to set the attribute of the same name, and mechanize.urlopen() does not ignore the timeout attribute.

UserAgent vs UserAgentBase

mechanize.UserAgent is a trivial subclass of mechanize.UserAgentBase, adding just one method, .set_seekable_responses() (see the documentation on seekable responses).

The reason for the extra class is that mechanize.Browser depends on seekable response objects (because response objects are used to implement the browser history).

I prefer questions and comments to be sent to the mailing list rather than direct to me.

John J. Lee, July 2010.