Parrot is Copyright (C) 2003 Douglas Thain. This program is released under the GNU General Public License. See the file COPYING for details. This manual may be out of date. Please check the Parrot Web Page for the most recent version.
vi like so:
% parrot vi /anonftp/ftp.cs.wisc.edu/RoadMap
Parrot is useful to users of distributed systems, because it frees them from rewriting code to work with new systems and relying on remote administrators to trust and install new software. Parrot is also useful to developers of distributed systems, because it allows rapid deployment of new code to real applications and real users that do not have the time, inclination, or permissions to build a kernel-level filesystem.
Parrot currently supports a variety of remote I/O systems, all detailed below. We welcome contributions of new remote I/O drivers from others. However, if you are working on a protocol driver, please drop us a note so that we can make sure work is not duplicated.
Almost any application - whether statically or dynamically linked, standard or commercial, command-line or GUI - should work with Parrot. There are a few exceptions. Because Parrot relies on the Linux ptrace interface, any program that also relies on the ptrace interface cannot run under Parrot. This means Parrot cannot run a debugger, nor can it run itself recursively. In addition, Parrot cannot run setuid programs, as the operating system considers this a security risk.
Like any software, Parrot is bound to have some bugs. Please check the known bugs page for the latest scoop.
Parrot might already be installed on your system.
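To check for Parrot, simply run parrot -v. If it is installed, you will see a message like the following:
% parrot -v
parrot version 0.9.5 built by thain@coral.cs.wisc.edu on Jul 16 2003 at 11:16:58
If you instead see the following message, Parrot is not installed or is not in your path:
parrot: Command not found.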
bin directory to your path.
For example, to install in /home/fred/parrot:
% cd /home/fred
% gunzip parrot-xxx-yyy.tar.gz
% tar xvf parrot-xxx-yyy.tar
% setenv PATH /home/fred/parrot/bin:$PATH
% gunzip parrot-xxx.tar.gz
% tar xvf parrot-xxx.tar
% cd parrot-xxx
% ./configure --prefix /home/fred/parrot -with-globus-path /usr/local/globus ...
% make
% make install
parrot command followed by any other Unix program. For example, to run a Parrot-enabled vi, execute this command:
% parrot vi /anonftp/ftp.cs.wisc.edu/RoadMap
parrot before every command you run, so try starting a shell with Parrot already loaded:
% parrot tcsh
% acroread /http/www.cs.wisc.edu/condor/doc/usenix_1.92.pdf
% grep Yahoo /http/www.yahoo.com
% set autolist
% cat /anonftp/ftp.cs.wisc.edu/[Press TAB here]
example path | remote service | more info |
/http/www.yahoo.com/index.html | Hypertext Transfer Protocol | included |
/ftp/ftp.cs.wisc.edu/RoadMap | File Transfer Protocol | included |
/anonftp/ftp.cs.wisc.edu/RoadMap | Anonymous File Transfer Protocol | included |
/chirp/target.cs.wisc.edu/path | Condor Chirp I/O | included |
/gsiftp/ftp.globus.org/path | Globus Security + File Transfer Protocol | more info |
/nest/nest.cs.wisc.edu/path | Network Storage Technology | more info |
/rfio/host.cern.ch/path | Castor Remote File I/O | more info |
/dcap/dcap.cs.wisc.edu/pnfs/cs.wisc.edu/path | DCache Access Protocol | more info |
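The naming convention in the table above (service name first, then host, then the path on that host) can be illustrated with a short Python sketch. This is only an illustration of the convention, not Parrot's own code: the function name split_remote_path and the SERVICES dictionary are hypothetical, and the service names are simply copied from the table.

```python
# Service prefixes copied from the table above. The dictionary and the
# function below are hypothetical helpers, not part of Parrot.
SERVICES = {
    "http": "Hypertext Transfer Protocol",
    "ftp": "File Transfer Protocol",
    "anonftp": "Anonymous File Transfer Protocol",
    "chirp": "Condor Chirp I/O",
    "gsiftp": "Globus Security + File Transfer Protocol",
    "nest": "Network Storage Technology",
    "rfio": "Castor Remote File I/O",
    "dcap": "DCache Access Protocol",
}


def split_remote_path(path):
    """Split "/service/host/rest" into (service, host, "/rest").

    Returns None when the path does not start with a known service
    prefix, i.e. when it looks like an ordinary local path.
    """
    # "/http/host/a/b" -> ["", "http", "host", "a/b"]
    parts = path.split("/", 3)
    if len(parts) < 3 or parts[0] != "":
        return None
    service, host = parts[1], parts[2]
    if service not in SERVICES or host == "":
        return None
    rest = "/" + parts[3] if len(parts) > 3 else "/"
    return service, host, rest
```

For example, split_remote_path("/anonftp/ftp.cs.wisc.edu/RoadMap") yields the service anonftp, the host ftp.cs.wisc.edu, and the remote path /RoadMap, while an ordinary path such as /etc/passwd yields None.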
ls happy by producing a bogus directory entry:
% parrot ls -la /http/www.yahoo.com/
-r--r--r-- 1 thain thain 0 Jul 16 11:50 /http/www.yahoo.com
A less-drastic example is found in FTP. If you attempt to perform a directory listing of an FTP server, Parrot fills in the available information -- the file names and their sizes -- but again inserts bogus information to fill the rest out:
% parrot ls -la /anonftp/ftp.cs.wisc.edu
total 0
-rwxrwxrwx 1 thain thain 2629 Jul 16 11:53 RoadMap
-rwxrwxrwx 1 thain thain 1622222 Jul 16 11:53 ls-lR
-rwxrwxrwx 1 thain thain 367507 Jul 16 11:53 ls-lR.Z
-rwxrwxrwx 1 thain thain 212125 Jul 16 11:53 ls-lR.gz
If you would like to get a better idea of the underlying behavior of Parrot, try running it with the -d remote option, which will display all of the remote I/O operations that it performs on a program's behalf:
% parrot -d remote ls -la /anonftp/ftp.cs.wisc.edu
...
ftp.cs.wisc.edu <-- TYPE I
ftp.cs.wisc.edu --> 200 Type set to I.
ftp.cs.wisc.edu <-- PASV
ftp.cs.wisc.edu --> 227 Entering Passive Mode (128,105,2,28,194,103)
ftp.cs.wisc.edu <-- NLST /
ftp.cs.wisc.edu --> 150 Opening BINARY mode data connection for file list.
...
If your program is upset by the unusual semantics of such storage systems, then consider using the Chirp protocol and server, described in more detail below.
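When reading a trace like the FTP example above, it can help to know how to interpret the 227 reply. In standard FTP, the six numbers in parentheses are the four bytes of the data connection's IP address followed by the high and low bytes of its port. The following Python sketch decodes such a reply; the function name parse_pasv is hypothetical and is only a reading aid, not part of Parrot.

```python
import re


def parse_pasv(reply):
    """Decode an FTP "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply.

    Returns (host, port), where port = p1 * 256 + p2.
    """
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a passive-mode reply: %r" % reply)
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return host, port


host, port = parse_pasv("227 Entering Passive Mode (128,105,2,28,194,103)")
print(host, port)  # prints: 128.105.2.28 49767
```

So the directory listing in the trace is fetched over a second connection to port 49767 on 128.105.2.28.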
The simplest name resolver is the mountlist, given by the -m mountfile option. This file corresponds closely to /etc/fstab in Unix. A mountlist is simply a file with two columns. The first column gives a logical directory or file name, while the second gives the physical path that it must be connected to.
For example, if a database is stored at an FTP server under the path /anonftp/ftp.cs.wisc.edu/db, it may be spliced into the filesystem under /dbase with a mount list like this:
/dbase /anonftp/ftp.cs.wisc.edu/db
Instruct Parrot to use the mountlist as follows:
% parrot -m mountfile tcsh
% cd /dbase
% ls -la
A single mount entry may be given on the command line with the -M option as follows:
% parrot -M /dbase=/anonftp/ftp.cs.wisc.edu/db tcsh
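To make the idea of a mountlist concrete, here is a minimal Python sketch of how a two-column mountlist could map a logical name onto a physical path. It is not Parrot's resolver: the function names are hypothetical, and the choice to let the longest matching logical prefix win is an assumption made for the sketch.

```python
def parse_mountlist(text):
    """Parse two-column mountlist text into (logical, physical) pairs."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # silently skip blank or malformed lines
            mounts.append((fields[0], fields[1]))
    return mounts


def resolve(path, mounts):
    """Rewrite path using the mount whose logical name is the longest prefix.

    Assumption: the longest match wins. Paths that match no mount entry
    are returned unchanged.
    """
    best = None
    for logical, physical in mounts:
        base = logical.rstrip("/")
        if path == base or path.startswith(base + "/"):
            if best is None or len(base) > len(best[0]):
                best = (base, physical)
    if best is None:
        return path
    base, physical = best
    return physical + path[len(base):]


mounts = parse_mountlist("/dbase /anonftp/ftp.cs.wisc.edu/db\n")
print(resolve("/dbase/index", mounts))
# prints: /anonftp/ftp.cs.wisc.edu/db/index
```

Note that /dbase2 is not treated as lying under /dbase, because a match must end at a path separator.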
Chirp is a simple protocol that corresponds closely to the traditional Unix I/O interface, including open(), read(), stat(), readdir(), and so forth. A standalone Chirp server can offer your programs fine-grained file access from anywhere on the network. A Chirp server is started as follows:
% chirp_server -d all -a my.authfile
The -d all option turns on debugging, which helps you to understand how it works initially. You may remove this option once everything is working. The -a my.authfile option specifies a file which gives the authentication and authorization policy for the server. More on that in a minute.
Suppose the Chirp server is running on bird.cs.wisc.edu. Using Parrot, you may access all of the Unix features of that host from elsewhere:
% parrot tcsh
% cd /chirp/bird.cs.wisc.edu/tmp
% ls -la
% ...
Naturally, one should be concerned about the security of such a service. The Chirp server has a flexible security policy which allows you to accept or deny users via one of several authentication schemes. The Chirp server may be a personal server for only you, or it may be run as the superuser and satisfy a number of users. It's up to you.
Here is a summary of the authentication schemes:
Type | Summary | Personal? | Multi-User? |
kerberos | Centralized private key system | no | yes (host cert) |
globus | Distributed public key system | yes (user cert) | yes (user cert) |
filesystem | Authenticate via a local or distributed filesystem. | yes | yes |
hostname | Reverse DNS lookup | yes | yes |
address | Identify by IP address | yes | yes |
Parrot will attempt all of the authentication types it knows until it successfully connects to a Chirp server. You must explicitly specify the security policy for the Chirp server in an authfile, passed on the command line. An example authfile is distributed with Parrot in etc/chirp.authfile.example.
Here's how it works. Each line in the file has four fields separated by colons: the authentication type, the permitted hostnames, the permitted remote users, and the corresponding local users. Asterisks may be used in the first three fields as wildcards. The fourth field must be either a valid local username or an asterisk, indicating that the local username is chosen by the authentication type. Each line in the file is compared against the calling user in order. If one matches, the user is accepted and assigned the username in the fourth field.
Here are some examples. Suppose that I wish to run a personal server as an ordinary user thain, and I am willing to trust any user calling from two different hosts called red and blue, as well as any hosts that can authenticate with my X.509 identity:
hostname:red.cs.wisc.edu:red.cs.wisc.edu:thain
hostname:blue.cs.wisc.edu:blue.cs.wisc.edu:thain
globus:*:/C=US/O=National Computational Science Alliance/CN=Douglas Thain:thain
Or, suppose that I am running a server as the superuser, and I am willing to trust any user that can authenticate via Kerberos or via the local filesystem if on the same host. In addition, I consider any user on the host operator.cs.wisc.edu to be equivalent to the user named sysop:
kerberos:*:*:*
filesystem:bird.cs.wisc.edu:*:*
hostname:operator.cs.wisc.edu:operator.cs.wisc.edu:sysop
A Chirp server creates a new process for every incoming client. If the server is run as the superuser, the process will setuid to the id of the authenticated user. If the server is run as an ordinary user, it will check that the authenticated user matches the user who owns the server; otherwise, the connection is declined.
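The authfile matching rule (four colon-separated fields, lines tried in order, first match wins) can be sketched in Python as follows. This is a rough model and not the Chirp server's actual code: the function names are hypothetical, and the use of shell-style globbing via fnmatchcase for the asterisk wildcards is an assumption. The argument default_local_user stands in for the local username chosen by the authentication type when the fourth field is an asterisk.

```python
from fnmatch import fnmatchcase


def parse_authfile(text):
    """Return a list of (type, hosts, remote_users, local_user) rules."""
    rules = []
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) == 4:  # skip blank or malformed lines
            rules.append(tuple(fields))
    return rules


def authorize(rules, auth_type, host, remote_user, default_local_user):
    """Return the local username for the first matching rule, else None.

    Assumption: "*" behaves like a shell glob. A fourth field of "*"
    means the authentication type supplies the local name, modeled here
    by default_local_user.
    """
    for rule_type, rule_host, rule_user, rule_local in rules:
        if (fnmatchcase(auth_type, rule_type)
                and fnmatchcase(host, rule_host)
                and fnmatchcase(remote_user, rule_user)):
            return default_local_user if rule_local == "*" else rule_local
    return None


rules = parse_authfile(
    "kerberos:*:*:*\n"
    "filesystem:bird.cs.wisc.edu:*:*\n"
    "hostname:operator.cs.wisc.edu:operator.cs.wisc.edu:sysop\n"
)
print(authorize(rules, "hostname", "operator.cs.wisc.edu",
                "operator.cs.wisc.edu", None))
# prints: sysop
```

With these rules, a filesystem-authenticated user calling from any host other than bird.cs.wisc.edu matches no line and is rejected.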
Each of the authentication types has a few things you should know:
Kerberos: The server will attempt to use the Kerberos identity of the host it is run on (e.g. host/coral.cs.wisc.edu@CS.WISC.EDU). Thus, it must be run as the superuser in order to access its certificates.
Globus: The server and client will attempt to perform peer-to-peer authentication using the Grid Security Infrastructure. Both sides must have access to a proxy certificate by running grid-proxy-init.
Filesystem: This method makes use of an existing filesystem (local or distributed) to establish the client's identity. It assumes that both machines share the same conception of the user database and have a common directory which they can read and write. By default, the server will pick a filename in /tmp, and challenge the client to create that file. If it can, then the server will examine the owner of the file to determine the client's username. Naturally, /tmp will only be available to clients on the same machine. However, if a shared filesystem directory is available, give that to the Chirp server via the -c option. Then, any authorized client of the filesystem can authenticate to the server. For example, at UW, we use -c /afs/cs.wisc.edu/common/tmp to authenticate via our AFS distributed file system.
Hostname: The server will rely on a reverse DNS lookup to establish the fully-qualified hostname of the calling client. The fourth field is then used to select an appropriate local username. Notice that the second and third fields of a 'hostname' line in the authfile must be identical.
Address: Like "hostname" authentication, except the server simply looks at the client's IP address.
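The challenge used by the filesystem method described above can be modeled in a few lines of Python. This is only a sketch of the idea (pick a file name, let the client create it, read the owner), not the Chirp implementation: the three function names and the "challenge." file name prefix are hypothetical, and the sketch returns the numeric owner id and leaves the lookup in the user database aside.

```python
import os
import uuid


def make_challenge(directory):
    """Server side: pick a fresh, unpredictable file name in directory."""
    return os.path.join(directory, "challenge." + uuid.uuid4().hex)


def answer_challenge(path):
    """Client side: prove access to the directory by creating the file."""
    # O_EXCL makes the call fail if the file already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)


def verify_challenge(path):
    """Server side: return the uid that owns the file, or None if absent."""
    try:
        info = os.lstat(path)  # lstat: do not follow a symlink
    except FileNotFoundError:
        return None
    try:
        os.unlink(path)  # clean up; may fail if we lack permission
    except OSError:
        pass
    return info.st_uid
```

In this model, the server would call make_challenge, send the name to the client, wait for the client to run answer_challenge, and then call verify_challenge to learn who created the file. Both sides must see the same directory and the same user ids for the result to mean anything, which is why a shared filesystem is needed for clients on other machines.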
If you have difficulty getting authorization to work, we recommend that you run the Chirp server and the corresponding Parrot client with the -d auth option. This will show details of all the authentication methods attempted, as well as the lines in the authfile that are accepted or rejected, along with the reasons why.
By default, Parrot will attempt every authentication type that it knows until one succeeds. If you wish to restrict or re-order the authentication types that Parrot uses, give one or more -a options, naming the authentication types to be used, in order. For example, to attempt only hostname and kerberos authentication, in that order:
% parrot -a hostname -a kerberos tcsh
Option | Purpose | Environment Variable |
-a <list> | Use these authentication types, in the order given. | |
-b <bytes> | Set the recommended remote I/O block size. | PARROT_LOCAL_BLOCK_SIZE |
-B <bytes> | Set the recommended local I/O block size. | PARROT_REMOTE_BLOCK_SIZE |
-c | Connect to the local Condor Chirp proxy. | PARROT_RESOLVE_CHIRP |
-C <MB> | Set the size of the I/O channel. | PARROT_CHANNEL_SIZE |
-d <system> | Enable debugging for this sub-system. | PARROT_DEBUG_FLAGS |
-h | Show this screen. | |
-m <file> | Use this file as a mountlist. | PARROT_MOUNT_FILE |
-M <local>=<remote> | Mount this remote file on this local directory. | |
-o <file> | Send debugging messages to this file. | PARROT_DEBUG_FILE |
-p <host:port> | Use this proxy for HTTP requests. | PARROT_HTTP_PROXY |
-t <dir> | Where to store temporary files. | PARROT_TEMP_DIR |
-v | Display version number. | |
The flexible debugging flags can be a great help in both debugging and understanding Parrot. To turn on multiple debugging flags, you may either issue multiple -d options:
% parrot -d ftp -d chirp tcsh
Or, you may give a space-separated list in the corresponding environment variable:
% setenv PARROT_DEBUG_FLAGS "ftp chirp"
% parrot tcsh
Here is the meaning of each of the debug flags.
syscall | This shows all of the system calls attempted by each program, even those that Parrot does not trap or modify. (To see arguments and return values, try -d libcall instead.) |
libcall | This shows only the I/O calls that are actually trapped and implemented by Parrot. The arguments and return codes are the logical values seen by the application, not the underlying operations. (To see the underlying operations try -d remote or -d local instead.) |
cache | This shows all of the shared segments that are loaded into the channel cache and shared by multiple programs. For most programs, this means all the shared libraries. |
process | This shows all process creations, deletions, signals, and process state changes. |
resolve | This shows every invocation of the name resolver. A plain file name indicates the name was not modified, while more detailed records show names that were changed or denied access. |
local | This shows all local I/O calls from the perspective of Parrot. Notice that the file descriptors and file names shown are internal to Parrot. (To see fds and names from the perspective of the job, try -d libcall.) |
remote | This shows all non-local file activity. |
http | This shows only HTTP operations. |
ftp | This shows only FTP operations. |
nest | This shows only NeST operations. |
chirp | This shows only Chirp operations. |
rfio | This shows only RFIO operations. |
poll | This shows all activity related to processes that block (explicitly or implicitly) waiting for I/O. |
time | This adds the current time to every debug message. |
pid | This adds the calling process id to every debug message. |
all | This shows all possible debugging messages. |
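If you launch Parrot from a script, the two ways of selecting debug flags can be generated from one list. The following Python sketch shows one way to do that. The helper names are hypothetical, the set of flag names is copied from the table above (plus auth, which is used with -d elsewhere in this manual), and the check against that set is a convenience of the sketch rather than a claim about how Parrot validates its arguments.

```python
# Flag names copied from the table above, plus "auth", which this manual
# uses with -d in the authentication section.
KNOWN_DEBUG_FLAGS = {
    "syscall", "libcall", "cache", "process", "resolve", "local",
    "remote", "http", "ftp", "nest", "chirp", "rfio", "poll",
    "time", "pid", "all", "auth",
}


def check_flags(flags):
    """Raise ValueError if any flag is not listed in this manual."""
    unknown = [flag for flag in flags if flag not in KNOWN_DEBUG_FLAGS]
    if unknown:
        raise ValueError("unknown debug flags: " + ", ".join(unknown))


def debug_args(flags):
    """Build repeated -d options, e.g. ["-d", "ftp", "-d", "chirp"]."""
    check_flags(flags)
    args = []
    for flag in flags:
        args.extend(["-d", flag])
    return args


def debug_env(flags):
    """Build the space-separated value for PARROT_DEBUG_FLAGS."""
    check_flags(flags)
    return " ".join(flags)


print(["parrot"] + debug_args(["ftp", "chirp"]) + ["tcsh"])
# prints: ['parrot', '-d', 'ftp', '-d', 'chirp', 'tcsh']
print(debug_env(["ftp", "chirp"]))
# prints: ftp chirp
```

The resulting argument list can be handed to a process launcher, or the string can be placed in the environment before starting Parrot.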