mechanize — Documentation
Full API documentation is in the docstrings and the documentation of urllib2. The documentation in these web pages is in need of reorganisation at the moment, after the merge of ClientCookie and ClientForm into mechanize.
Tests and examples
Examples
The front page has some introductory examples.
The examples directory in the source packages contains a couple of silly, but working, scripts to demonstrate basic use of the module.
See also the forms examples (these examples use the forms API independently of mechanize.Browser).
Tests
To run the tests:
python test.py
There are some tests that try to fetch URLs from the internet. To include those in the test run:
python test.py discover --tag internet
The urllib2 interface
mechanize exports the complete interface of urllib2. See the urllib2 documentation. For example:
import mechanize
response = mechanize.urlopen("http://www.example.com/")
print response.read()
Compatibility
These notes explain the relationship between mechanize, ClientCookie, ClientForm, cookielib and urllib2, and which to use when. If you’re just using mechanize, and not any of those other libraries, you can ignore this section.
mechanize works with Python 2.4, Python 2.5, Python 2.6, and Python 2.7.
- When using mechanize, anything you would normally import from urllib2 should be imported from mechanize instead.
- Use of mechanize classes with urllib2 (and vice-versa) is no longer supported. However, existing classes implementing the urllib2 Handler interface are likely to work unchanged with mechanize.
- mechanize now only imports urllib2.URLError and urllib2.HTTPError from urllib2. The rest is forked. I intend to merge fixes from Python trunk frequently.
- ClientForm is no longer maintained as a separate package. The code is now part of mechanize, and its interface is now exported through module mechanize (since mechanize 0.2.0). Old code can simply be changed to
    import mechanize as ClientForm
  and should continue to work.
- ClientCookie is no longer maintained as a separate package. The code is now part of mechanize, and its interface is now exported through module mechanize (since mechanize 0.1.0). Old code can simply be changed to
    import mechanize as ClientCookie
  and should continue to work.
- The cookie handling parts of mechanize are in the Python 2.4 standard library as module cookielib and extensions to module urllib2. mechanize does not currently use cookielib, due to the presence of thread synchronisation code in cookielib that is not present in the mechanize fork of cookielib.
API differences between mechanize and urllib2:
- mechanize provides additional features.
- mechanize.urlopen differs in its behaviour: it handles cookies, whereas urllib2.urlopen does not. To make a urlopen function with the urllib2 behaviour:
    import mechanize
    # The handlers to install; no cookie processor is included.
    handler_classes = [mechanize.ProxyHandler,
                       mechanize.UnknownHandler,
                       mechanize.HTTPHandler,
                       mechanize.HTTPDefaultErrorHandler,
                       mechanize.HTTPRedirectHandler,
                       mechanize.FTPHandler,
                       mechanize.FileHandler,
                       mechanize.HTTPErrorProcessor]
    opener = mechanize.OpenerDirector()
    for handler_class in handler_classes:
        opener.add_handler(handler_class())
    # Use this in place of mechanize.urlopen.
    urlopen = opener.open
- Since Python 2.6, urllib2 uses a .timeout attribute on Request objects internally. However, urllib2.Request has no timeout constructor argument, and urllib2.urlopen() ignores this attribute. mechanize.Request has a timeout constructor argument which is used to set the attribute of the same name, and mechanize.urlopen() does not ignore the timeout attribute.
UserAgent vs UserAgentBase
mechanize.UserAgent is a trivial subclass of mechanize.UserAgentBase, adding just one method, .set_seekable_responses() (see the documentation on seekable responses).
The reason for the extra class is that mechanize.Browser depends on seekable response objects (because response objects are used to implement the browser history).
I prefer questions and comments to be sent to the mailing list rather than direct to me.
John J. Lee, July 2010.