next up previous contents index
Next: 3.13 Setting Up for Up: 3. Administrators' Manual Previous: 3.11 The High Availability   Contents   Index

Subsections


3.12 Quill

Quill is an optional component of Condor that maintains a mirror of Condor operational data in a relational database. The condor_quill daemon updates the data in the relation database, and the condor_dbmsd daemon maintains the database itself.

As of Condor version 7.5.5, Quill is distributed only with the source code. It is not included in the builds of Condor provided by UW, but it is available as a feature that can be enabled by those who compile Condor from the source code. Find the code within the condor_contrib directory, in the directories condor_tt and condor_dbmsd.


3.12.1 Installation and Configuration

Quill uses the PostgreSQL database management system. Quill uses the PostgreSQL server as its back end and client library, libpq to talk to the server. We strongly recommend the use of version 8.2 or later due to its integrated facilities of certain key database maintenance tasks, and stronger security features.

Obtain PostgreSQL from

http://www.postgresql.org/ftp/source/

Installation instructions are detailed in: http://www.postgresql.org/docs/8.2/static/installation.html

Configure PostgreSQL after installation:

  1. Initialize the database with the PostgreSQL command initdb.

  2. Configure to accept TCP/IP connections. For PostgreSQL version 8, use the listen_addresses variable in postgresql.conf file as a guide. For example, listen_addresses = '*' means listen on any IP interface.

  3. Configure automatic vacuuming. Ensure that these variables with these defaults are commented in and/or set properly in the postgresql.conf configuration file:
    # Turn on/off automatic vacuuming
    autovacuum = on
    
    # time between autovacuum runs, in secs
    autovacuum_naptime = 60
    
    # min # of tuple updates before vacuum
    autovacuum_vacuum_threshold = 1000
    
    # min # of tuple updates before analyze
    autovacuum_analyze_threshold = 500
    
    # fraction of rel size before vacuum
    autovacuum_vacuum_scale_factor = 0.4 
    
    # fraction of rel size before analyze
    autovacuum_analyze_scale_factor = 0.2
    
    # default vacuum cost delay for 
       # autovac, -1 means use 
       # vacuum_cost_delay
    autovacuum_vacuum_cost_delay = -1  
    
    # default vacuum cost limit for 
       # autovac, -1 means use
       # vacuum_cost_limit
    autovacuum_vacuum_cost_limit = -1
    

  4. Configure PostgreSQL to accept TCP/IP connections from specific hosts. Modify the pg_hba.conf file (which usually resides in the PostgreSQL server's data directory). Access is required by the condor_quill daemon, as well as the database users ``quillreader'' and ``quillwriter''. For example, to give database users ``quillreader'' and ``quillwriter'' password-enabled access to all databases on current machine from any machine in the 128.105.0.0/16 subnet, add the following:

    host all quillreader 128.105.0.0 255.255.0.0 md5
    host all quillwriter 128.105.0.0 255.255.0.0 md5

    Note that in addition to the database specified by the configuration variable QUILL_DB_NAME, the condor_quill daemon also needs access to the database "template1". In order to create the database in the first place, the condor_quill daemon needs to connect to the database.

  5. Start the PostgreSQL server service. See the installation instructions for the appropriate method to start the service at http://www.postgresql.org/docs/8.2/static/installation.html

  6. The condor_quill and condor_dbmsd daemons and client tools connect to the database as users ``quillreader'' and ``quillwriter''. These are database users, not operating system users. The two types of users are quite different from each other. If these database users do not exist, add them using the createuser command supplied with the installation. Assign them with appropriate passwords; these passwords will be used by the Quill tools to connect to the database in a secure way. User ``quillreader'' should not be allowed to create more databases nor create more users. User ``quillwriter'' should not be allowed to create more users, however it should be allowed to create more databases. The following commands create the two users with the appropriate permissions, and be ready to enter the corresponding passwords when prompted.

    /path/to/postgreSQL/bin/directory/createuser quillreader \
    	--no-createdb --no-createrole --pwprompt
    
    /path/to/postgreSQL/bin/directory/createuser quillwriter \
    	--createdb --no-createrole --pwprompt
    

    Answer ``no'' to the question about the ability for role creation.

  7. Create a database for Quill to store data in with the createdb command. Create this database with the ``quillwriter'' user as the owner. A sample command to do this is
    createdb -O quillwriter quill
    
    quill is the database name to use with the QUILL_DB_NAME configuration variable.

  8. The condor_quill and condor_dbmsd daemons need read and write access to the database. They connect as user ``quillwriter'', which has owner privileges to the database. Since this gives all access to the ``quillwriter'' user, its password cannot be stored in a public place (such as in a ClassAd). For this reason, the ``quillwriter'' password is stored in a file named .pgpass in the Condor spool directory. Appropriate protections on this file guarantee secure access to the database. This file must be created and protected by the site administrator; if this file does not exist as and where expected, the condor_quill and condor_dbmsd daemons log an error and exit. The .pgpass file contains a single line that has fields separated by colons and is properly terminated by an operating system specific newline character (Unix) or CRLF (Windows). The first field may be either the machine name and fully qualified domain, or it may be a dotted quad IP address. This is followed by four fields containing: the TCP port number, the name of the database, the "quillwriter" user name, and the password. The form used in the first field must exactly match the value set for the configuration variable QUILL_DB_IP_ADDR . Condor uses a string comparison between the two, and it does not resolve the host names to compare IP addresses. Example:
    machinename.cs.wisc.edu:5432:quill:quillwriter:password
    

After the PostgreSQL database is initialized and running, the Quill schema must be loaded into it. First, load the plsql programming language into the server:

createlang plpgsql [databasename]

Then, load the Quill schema from the sql files in the sql subdirectory of the Condor release directory:

psql [databasename] [username] < common_createddl.sql
psql [databasename] [username] < pgsql_createddl.sql
where [username] will be quillwriter.

After PostgreSQL is configured and running, Condor must also be configured to use Quill, since by default Quill is configured to be off.

Add the file .pgpass to the VALID_SPOOL_FILES variable, since condor_preen must be told not to delete this file. This step may not be necessary, depending on which version of Condor you are upgrading from.

Set up configuration variables that are specific to the installation, and check that the HISTORY variable is set.
HISTORY                 = $(SPOOL)/history
QUILL_ENABLED           = TRUE
QUILL_USE_SQL_LOG       = FALSE
QUILL_NAME              = some-unique-quill-name.cs.wisc.edu
QUILL_DB_USER           = quillwriter
QUILL_DB_NAME           = database-for-some-unique-quill-name
QUILL_DB_IP_ADDR        = databaseIPaddress:port
# the following parameter's units is in seconds
QUILL_POLLING_PERIOD    = 10
QUILL_HISTORY_DURATION 	= 30
QUILL_MANAGE_VACUUM     = FALSE
QUILL_IS_REMOTELY_QUERYABLE = TRUE
QUILL_DB_QUERY_PASSWORD =  password-for-database-user-quillreader
QUILL_ADDRESS_FILE      = $(LOG)/.quill_address
QUILL_DB_TYPE           = PGSQL
# The Purge and Reindex intervals are in seconds
DATABASE_PURGE_INTERVAL	= 86400
DATABASE_REINDEX_INTERVAL = 86400
# The History durations are all in days 
QUILL_RESOURCE_HISTORY_DURATION  = 7
QUILL_RUN_HISTORY_DURATION = 7
QUILL_JOB_HISTORY_DURATION = 3650
#The DB Size limit is in gigabytes
QUILL_DBSIZE_LIMIT      = 20
QUILL_MAINTAIN_DB_CONN  = TRUE
SCHEDD_SQLLOG           = $(LOG)/schedd_sql.log
SCHEDD_DAEMON_AD_FILE   = $(LOG)/.schedd_classad

The default Condor configuration file should already contain definitions for QUILL and QUILL_LOG. When upgrading from a previous version that did not have Quill to a new one that does, define these two configuration variables.

Only one machine should run the condor_dbmsd daemon. On this machine, add it to the DAEMON_LIST configuration variable. All Quill-enabled machines should also run the condor_quill daemon. The machine running the condor_dbmsd daemon can also run a condor_quill daemon. An example DAEMON_LIST for a machine running both daemons, and acting as both a submit machine and a central manager might look like the following:

DAEMON_LIST  = MASTER, SCHEDD, COLLECTOR, NEGOTIATOR, DBMSD, QUILL

The condor_dbmsd daemon will need configuration file entries common to all daemons. If not already in the configuration file, add the following entries:

DBMSD = $(SBIN)/condor_dbmsd
DBMSD_ARGS = -f
DBMSD_LOG = $(LOG)/DbmsdLog
MAX_DBMSD_LOG = 10000000

Descriptions of these and other configuration variables are in section 3.3.31. Here are further brief details:

QUILL_DB_NAME and QUILL_DB_IP_ADDR
These two variables are used to determine the location of the database server that this Quill would talk to, and the name of the database that it creates. More than one Quill server can talk to the same database server. This can be accomplished by letting all the QUILL_DB_IP_ADDR values point to the same database server.

QUILL_DB_USER
This is the PostgreSQL user that Quill will connect as to the database. We recommend ``quillwriter'' for this setting. There is no default setting for QUILL_DB_USER, so it must be specified in the configuration file.

QUILL_NAME
Each condor_quill daemon in the pool has to be uniquely named.

QUILL_POLLING_PERIOD
This controls the frequency with which Quill polls the job_queue.log file. By default, it is 10 seconds. Since Quill works by periodically sniffing the log file for updates and then sending those updates to the database, this variable controls the trade off between the currency of query results and Quill's load on the system, which is usually negligible.

QUILL_RESOURCE_HISTORY_DURATION , QUILL_RUN_HISTORY_DURATION , and QUILL_JOB_HISTORY_DURATION
These three variables control the deletion of historical information from the database. QUILL_RESOURCE_HISTORY_DURATION is the number of days historical information about the state of a resource will be kept in the database. The default for resource history is 7 days. An example of a resource is the ClassAd for a compute slot. QUILL_RUN_HISTORY_DURATION is the number of days after completion that auxiliary information about a given job will stay in the database. This includes user log events, file transfers performed by the job, the matches that were made for a job, et cetera. The default for run history is 7 days. QUILL_JOB_HISTORY_DURATION is the number of days after completion that a given job will stay in the database. A more precise definition is the number of days since the history ad got into the history database; those two might be different, if a job is completed but stays in the queue for a while. The default for job history is 3,650 days (about 10 years.)

DATABASE_PURGE_INTERVAL
As scanning the entire database for old jobs can be expensive, the other variable DATABASE_PURGE_INTERVAL is the number of seconds between two successive scans. DATABASE_PURGE_INTERVAL is set to 86400 seconds, or one day.

DATABASE_REINDEX_INTERVAL
PostgreSQL does not aggressively maintain the index structures for deleted tuples. This can lead to bloated index structures. Quill can periodically reindex the database, which is controlled by the variable DATABASE_REINDEX_INTERVAL. DATABASE_PURGE_INTERVAL is set to 86400 seconds, or one day.

QUILL_DBSIZE_LIMIT
Quill can estimate the size of the database, and send email to the Condor administrator if the database size exceeds this threshold. The estimate is checked after every DATABASE_PURGE_INTERVAL. The limit is given as gigabytes, and the default is 20.

QUILL_MAINTAIN_DB_CONN
Quill can maintain an open connection the database server, which speeds up updates to the database. However, each open connection consumes resources at the database server. The default is TRUE, but for large pools we recommend setting this FALSE.

QUILL_MANAGE_VACUUM
Set to False by default, this variable determines whether Quill is to perform vacuuming tasks on its tables or not. Vacuuming is a maintenance task that needs to be performed on tables in PostgreSQL. The frequency with which a table is vacuumed typically depends on the number of updates (inserts/deletes) performed on the table. Fortunately, with PostgreSQL version 8.1, vacuuming tasks can be configured to be performed automatically by the database server. We recommend that users upgrade to 8.1 and use the integrated vacuuming facilities of the database server, instead of having Quill do them. If the user does prefer having Quill perform those vacuuming tasks, it can be achieved by setting this variable to ExprTrue. However, it cannot be overstated that Quill's vacuuming policy is quite rudimentary as compared to the integrated facilities of the database server, and under high update workloads, can prove to be a bottleneck on the Quill daemon. As such, setting this variable to ExprTrue results in some warning messages in the log file regarding this issue.

QUILL_IS_REMOTELY_QUERYABLE
Thanks to PostgreSQL, one can now remotely query both the job queue and the history tables. This variable controls whether this remote querying feature should be enabled. By default it is True. Note that even if this is False, one can still query the job queue at the remote condor_schedd daemon. This variable only controls whether the database tables are remotely queryable.

QUILL_DB_QUERY_PASSWORD
In order for the query tools to connect to a database, they need to provide the password that is assigned to the database user ``quillreader''. This variable is then advertised by the condor_quill daemon to the condor_collector. This facility enables remote querying: remote condor_q query tools first ask the condor_collector for the password associated with a particular Quill database, and then query that database. Users who do not have access to the condor_collector cannot view the password, and as such cannot query the database. Again, this password only provides read access to the database.

QUILL_ADDRESS_FILE
When Quill starts up, it can place its address (IP and port) into a file. This way, tools running on the local machine do not need to query the central manager to find Quill. This feature can be turned off by commenting out the variable.


3.12.2 Four Usage Examples

  1. Query a remote Quill daemon on regular.cs.wisc.edu for all the jobs in the queue
    	condor_q -name quill@regular.cs.wisc.edu
    	condor_q -name schedd@regular.cs.wisc.edu
    
    There are two ways to get to a Quill daemon: directly using its name as specified in the QUILL_NAME configuration variable, or indirectly by querying the condor_schedd daemon using its name. In the latter case, condor_q will detect if that condor_schedd daemon is being serviced by a database, and if so, directly query it. In both cases, the IP address and port of the database server hosting the data of this particular remote Quill daemon can be figured out by the QUILL_DB_IP_ADDR and QUILL_DB_NAME variables specified in the QUILL_AD sent by the quill daemon to the collector and in the SCHEDD_AD sent by the condor_schedd daemon.

  2. Query a remote Quill daemon on regular.cs.wisc.edu for all historical jobs belonging to owner einstein.
    	condor_history -name quill@regular.cs.wisc.edu einstein
    

  3. Query the local Quill daemon for the average time spent in the queue for all non-completed jobs.
    	condor_q -avgqueuetime
    
    The average queue time is defined as the average of (currenttime - jobsubmissiontime) over all jobs which are neither completed (JobStatus == 4) or removed (JobStatus == 3).

  4. Query the local Quill daemon for all historical jobs completed since Apr 1, 2005 at 13h 00m.
    	condor_history -completedsince '04/01/2005 13:00'
    
    It fetches all jobs which got into the 'Completed' state on or after the specified time stamp. It use the PostgreSQL date/time syntax rules, as it encompasses most format options. See http://www.postgresql.org/docs/8.2/static/datatype-datetime.html for the various time stamp formats.


3.12.3 Quill and Security

There are several layers of security in Quill, some provided by Condor and others provided by the database. First, all accesses to the database are password-protected.

  1. The query tools, condor_q and condor_history connect to the database as user ``quillreader''. The password for this user can vary from one database to another and as such, each Quill daemon advertises this password to the collector. The query tools then obtain this password from the collector and connect successfully to the database. Access to the database by the ``quillreader'' user is read-only, as this is sufficient for the query tools. The condor_quill daemon ensures this protected access using the sql GRANT command when it first creates the tables in the database. Note that access to the ``quillreader'' password itself can be blocked by blocking access to the collector, a feature already supported in Condor.

  2. The condor_quill and condor_dbmsd daemons, on the other hand, need read and write access to the database. As such, they connect as user ``quillwriter'', who has owner privileges to the database. Since this gives all access to the ``quillwriter'' user, this password cannot be stored in a public place (such as the collector). For this reason, the ``quillwriter'' password is stored in a file called .pgpass in the Condor spool directory. Appropriate protections on this file guarantee secure access to the database. This file must be created and protected by the site administrator; if this file does not exist as and where expected, the condor_quill daemon logs an error and exits.

  3. The IsRemotelyQueryable attribute in the Quill ClassAd advertised by the Quill daemon to the collector can be used by site administrators to disallow the database from being read by all remote Condor query tools.


3.12.4 Quill and Its RDBMS Schema

Notes:

Although the current version of Condor does not support Oracle, we anticipate supporting it in the future, so Oracle support in this schema document is for future reference.


3.12.4.1 Administrative Tables

Attributes of currencies Table
Name O. Type P. Type Description
datasource varchar(4000) varchar(4000) Identifier of the data source.
lastupdate ts(3) w tz ts(3) w tz Time of the last update sent to the database from the data source.



Attributes of error_sqllogs Table
Name O. Type P. Type Description
logname varchar(100) varchar(100) Name of the SQL log file causing a SQL error.
host varchar(50) varchar(50) The host where the SQL log resides.
lastmodified ts(3) w tz ts(3) w tz The last modified time of the SQL log.
errorsql varchar(4000) text The SQL statement causing an error.
logbody clob text The body of the SQL log.
errormessage varchar(4000) varchar(4000) The description of the error.
INDEX: Index named error_sqllog_idx on (logname, host, lastmodified)



Attributes of maintenance_log Table
Name O. Type P. Type Description
eventts ts(3) w tz ts(3) w tz Time the event occurred.
eventmsg varchar(4000) varchar(4000) Message describing the event.



Attributes of quilldbmonitor Table
Name O. Type P. Type Description
dbsize integer integer Size of the database in megabytes.



Attributes of quill_schema_version Table
Name O. Type P. Type Description
major int int Major version number.
minor int int Minor version number.
back_to_major int int The major number of the old version this version is compatible to.
back_to_minor int int The minor number of the old version this version is compatible to.



Attributes of throwns Table
Name O. Type P. Type Description
filename varchar(4000) varchar(4000) The name of the log that was truncated.
machine_id varchar(4000) varchar(4000) The machine where the truncated log resides.
log_size numeric(38) numeric(38) The size of the truncated log.
throwtime ts(3) w tz ts(3) w tz The time when the truncation occurred.



3.12.4.2 Daemon Tables

Attributes of daemons_horizontal Table
Name O. Type P. Type Description
mytype varchar(100) varchar(100) The type of daemon ClassAd, e.g. ``Master''
name varchar(500) varchar(500) The name identifier of the daemon ClassAd.
lastreportedtime ts(3) w tz ts(3) w tz Time when the daemon last reported to Quill.
monitorselftime ts(3) w tz ts(3) w tz The time when the daemon last collected information about itself.
monitorselfcpuusage numeric(38) numeric(38) The amount of CPU this daemon has used.
monitorselfimagesize numeric(38) numeric(38) The amount of virtual memory this daemon has used.
monitorselfresidentsetsize numeric(38) numeric(38) The amount of physical memory this daemon has used.
monitorselfage integer integer How long the daemon has been running.
updatesequencenumber integer integer The sequence number associated with the update.
updatestotal integer integer The number of updates received from the daemon.
updatessequenced integer integer The number of updates that were in order.
updateslost integer integer The number of updates that were lost.
updateshistory varchar(4000) varchar(4000) Bitmask of the last 32 updates.
lastreportedtime_epoch integer integer The equivalent epoch time of last heard from.
PRIMARY KEY: (mytype, name)
NOT NULL: mytype and name cannot be null



Attributes of daemons_horizontal_history Table
Name O. Type P. Type Description
mytype varchar(100) varchar(100) The type of daemon ClassAd, e.g. ``Master''
name varchar(500) varchar(500) The name identifier of the daemon ClassAd.
lastreportedtime ts(3) w tz ts(3) w tz Time when the daemon last reported to Quill.
monitorselftime ts(3) w tz ts(3) w tz The time when the daemon last collected information about itself.
monitorselfcpuusage numeric(38) numeric(38) The amount of CPU this daemon has used.
monitorselfimagesize numeric(38) numeric(38) The amount of virtual memory this daemon has used.
monitorselfresidentsetsize numeric(38) numeric(38) The amount of physical memory this daemon has used.
monitorselfage integer integer How long the daemon has been running.
updatesequencenumber integer integer The sequence number associated with the update.
updatestotal integer integer The number of updates received from the daemon.
updatessequenced integer integer The number of updates that were in order.
updateslost integer integer The number of updates that were lost.
updateshistory varchar(4000) varchar(4000) Bitmask of the last 32 updates.
endtime ts(3) w tz ts(3) w tz End of when the ClassAd is valid.



Attributes of daemons_vertical Table
Name O. Type P. Type Description
mytype varchar(100) varchar(100) The type of daemon ClassAd, e.g. ``Master''
name varchar(500) varchar(500) The name identifier of the daemon ClassAd.
attr varchar(4000) varchar(4000) Attribute name.
val clob text Attribute value.
lastreportedtime ts(3) w tz ts(3) w tz Time when the daemon last reported to Quill.
PRIMARY KEY: (mytype, name, attr)
NOT NULL: mytype, name, and attr cannot be null



Attributes of daemons_vertical_history Table
Name O. Type P. Type Description
mytype varchar(100) varchar(100) The type of daemon ClassAd, e.g. ``Master''
name varchar(500) varchar(500) The name identifier of the daemon ClassAd.
lastreportedtime ts(3) w tz ts(3) w tz Time when the daemon last reported to Quill.
attr varchar(4000) varchar(4000) Attribute name.
val clob text Attribute value.
endtime ts(3) w tz ts(3) w tz End of when the ClassAd is valid.



Attributes of submitters_horizontal table
Name O. Type P. Type Description
name varchar(500) varchar(500) Name of the submitter ClassAd.
scheddname varchar(4000) varchar(4000) Name of the schedd where the submitter ad is from.
lastreportedtime ts(3) w tz ts(3) w tz Last time a submitter ClassAd was sent to Quill.
idlejobs integer integer Number of idle jobs of the submitter.
runningjobs integer integer Number of running jobs of the submitter.
heldjobs integer integer Number of held jobs of the submitter.
flockedjobs integer integer Number of flocked jobs of the submitter.



Attributes of submitters_horizontal_history table
Name O. Type P. Type Description
name varchar(500) varchar(500) Name of the submitter ClassAd.
scheddname varchar(4000) varchar(4000) Name of the schedd where the submitter ad is from.
lastreportedtime ts(3) w tz ts(3) w tz Last time a submitter ClassAd was sent to Quill.
idlejobs integer integer Number of idle jobs of the submitter.
runningjobs integer integer Number of running jobs of the submitter.
heldjobs integer integer Number of held jobs of the submitter.
flockedjobs integer integer Number of flocked jobs of the submitter.
endtime ts(3) w tz ts(3) w tz End of when the ClassAd is valid.



3.12.4.3 Files Tables

Attributes of files Table
Name O. Type P. Type Description
file_id int int Unique numeric identifier of the file.
name varchar(4000) varchar(4000) File name.
host varchar(4000) varchar(4000) Name of machine where the file is located.
path varchar(4000) varchar(4000) Directory path to the file.
acl_id integer integer Not yet used, null.
lastmodified ts(3) w tz ts(3) w tz Timestamp of the file.
filesize numeric(38) numeric(38) Size of the file in bytes.
checksum varchar(32) varchar(32) MD5 checksum of the file.
PRIMARY KEY: file_id
NOT NULL: file_id cannot be null



Attributes of fileusages Table
Name O. Type P. Type Description
globaljobid varchar(4000) varchar(4000) Global identifier of the job that used the file.
file_id int int Numeric identifier of the file.
usagetype varchar(4000) varchar(4000) Type of use of the file by the job, e.g., input, output, command.
REFERENCE: file_id references files(file_id)



Attributes of transfers Table
Name O. Type P. Type Description
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
src_name varchar(4000) varchar(4000) Name of the file on the source machine.
src_host varchar(4000) varchar(4000) Name of the source machine.
src_port integer integer Source port number used for the transfer.
src_path varchar(4000) varchar(4000) Path to the file on the source machine.
src_daemon varchar(30) varchar(30) Condor demon performing the transfer on the source machine.
src_protocol varchar(30) varchar(30) The protocol used on the source machine.
src_credential_id integer integer Not yet used, null.
src_acl_id integer integer Not yet used, null.
dst_name varchar(4000) varchar(4000) Name of the file on the destination machine.
dst_host varchar(4000) varchar(4000) Name of the destination machine.
dst_port integer integer Destination port number used for the transfer.
dst_path varchar(4000) varchar(4000) Path to the file on the destination machine.
dst_daemon varchar(30) varchar(30) Condor daemon receiving the transfer on the destination machine.
dst_protocol varchar(30) varchar(30) The protocol used on the destination machine.
dst_credential_id integer integer Not yet used, null.
dst_acl_id integer integer Not yet used, null.
transfer_intermediary_id integer integer Not yet used, null; will use someday if a proxy is used.
transfer_size_bytes numeric(38) numeric (38) Size of the data transfered in bytes.
elapsed numeric(38) numeric(38) Number of seconds that elapsed during the transfer.
checksum varchar(256) varchar(256) Checksum of the file.
transfer_time ts(3) w tz ts(3) w tz Time when the transfer took place.
last_modified ts(3) w tz ts(3) w tz Last modified time for the file that was transfered.
is_encrypted varchar(5) varchar(5) (boolean) True if the file is encrypted.
delegation_method_id integer integer Not yet used, null.
completion_code integer integer Indicates whether the transfer failed or succeeded.



3.12.4.4 Interface Tables

Attributes of cdb_users Table
Name O. Type P. Type Description
userid varchar(30) varchar(30) Unique identifier of the user
password character(32) character(32) Encrypted password
admin varchar(5) varchar(5) (boolean) True if the user has administrator privileges




Attributes of l_eventtype Table
Name O. Type P. Type Description
eventtype integer integer Numeric type code of the event.
description varchar(4000) varchar(4000) Description of the type of event associated with the eventtype code.



Attributes of l_jobstatus Table
Name O. Type P. Type Description
jobstatus integer integer Numeric code for job status.
abbrev char(1) char(1) Single letter code for job status.
description varchar(4000) varchar(4000) Description of job status.
PRIMARY KEY: jobstatus
NOT NULL: jobstatus cannot be null



3.12.4.5 Jobs Tables

Attributes of clusterads_horizontal Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd the job is submitted to.
cluster_id integer integer Cluster identifier for the job.
owner varchar(30) varchar(30) User who submitted the job.
jobstatus integer integer Current status of the job.
jobprio integer integer Priority for this job.
imagesize numeric(38) numeric(38) Estimate of memory image size of the job in kilobytes.
qdate ts(3) w tz ts(3) w tz Time the job was submitted to the job queue.
remoteusercpu numeric(38) numeric(38) Total number of seconds of user CPU time the job used on remote machines.
remotewallclocktime numeric(38) numeric(38) Committed cumulative number of seconds the job has been allocated to a machine.
cmd clob text Path to and filename of the job to be executed.
args clob text Arguments passed to the job.
jobuniverse integer integer The Condor universe used by the job.
PRIMARY KEY: (scheddname, cluster_id)
NOT NULL: scheddname and cluster_id cannot be null



Attributes of clusterads_vertical Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that the job is submitted to.
cluster_id integer integer Cluster identifier for the job.
attr varchar(2000) varchar(2000) Attribute name.
val clob text Attribute value.
PRIMARY KEY: (scheddname, cluster_id, attr)



Attributes of jobs_horizontal_history Table - Part 1 of 3
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
scheddbirthdate integer integer The birth date of the schedd where the job is submitted.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
qdate ts(3) w tz ts(3) w tz Time the job was submitted to the job queue.
owner varchar(30) varchar(30) User who submitted the job.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
numckpts integer integer Number of checkpoints written by the job during its lifetime.
numrestarts integer integer Number of restarts from a checkpoint attempted by the job in its lifetime.
numsystemholds integer integer Number of times Condor-G placed the job on hold.
condorversion varchar(4000) varchar(4000) Version of Condor that ran the job.
condorplatform varchar(4000) varchar(4000) Platform of the computer where the schedd runs.
rootdir varchar(4000) varchar(4000) Root directory on the system where the job is submitted from.
iwd varchar(4000) varchar(4000) Initial working directory of the job.
jobuniverse integer integer The Condor universe used by the job.
cmd clob text Path to and filename of the job to be executed.
minhosts integer integer Minimum number of hosts that must be in the claimed state for this job, before the job may enter the running state.
maxhosts integer integer Maximum number of hosts this job would like to claim.
jobprio integer integer Priority for this job.
negotiation_user_name varchar(4000) varchar(4000) User name in which the job is negotiated.
env clob text Environment under which the job ran.
userlog varchar(4000) varchar(4000) User log where the job events are written to.
coresize numeric(38) numeric(38) Maximum allowed size of the core file.
Table Continues on Next Page



Attributes of jobs_horizontal_history Table - Part 2 of 3
Name O. Type P. Type Description
killsig varchar(4000) varchar(4000) Signal to be sent if the job is put on hold.
stdin varchar(4000) varchar(4000) The file used as stdin.
transferin varchar(5) varchar(5) (boolean) For globus universe jobs. True if input should be transferred to the remote machine.
stdout varchar(4000) varchar(4000) The file used as stdout.
transferout varchar(5) varchar(5) (boolean) For globus universe jobs. True if output should be transferred back to the submit machine.
stderr varchar(4000) varchar(4000) The file used as stderr.
transfererr varchar(5) varchar(5) (boolean) For globus universe jobs. True if error output should be transferred back to the submit machine.
shouldtransferfiles varchar (4000) varchar(4000) Whether Condor should transfer files to and from the machine where the job runs.
transferfiles varchar(4000) varchar(4000) Depreciated. Similar to shouldtransferfiles.
executablesize numeric(38) numeric(38) Size of the executable in kilobytes.
diskusage integer integer Size of the executable and input files to be transferred.
filesystemdomain varchar(4000) varchar(4000) Name of the networked file system used by the job.
args clob text Arguments passed to the job.
lastmatchtime ts(3) w tz ts(3) w tz Time when the job was last successfully matched with a resource.
numjobmatches integer integer Number of times the negotiator matches the job with a resource.
jobstartdate ts(3) w tz ts(3) w tz Time when the job first began running.
jobcurrentstartdate ts(3) w tz ts(3) w tz Time when the job's current run started.
jobruncount integer integer Number of times a shadow has been started for the job.
filereadcount numeric(38) numeric(38) Number of read(2) calls the job made (only standard universe).
filereadbytes numeric(38) numeric(38) Number of bytes read by the job (only standard universe).
filewritecount numeric(38) numeric(38) Number of write calls the job made (only standard universe).
filewritebytes numeric(38) numeric(38) Number of bytes written by the job (only standard universe).
Table Continues on Next Page



Attributes of jobs_horizontal_history Table - Part 3 of 3
Name O. Type P. Type Description
fileseekcount numeric(38) numeric(38) Number of seek calls that this job made (only standard universe).
totalsuspensions integer integer Number of times the job has been suspended during its lifetime
imagesize numeric(38) numeric(38) Estimate of memory image size of the job in kilobytes.
exitstatus integer integer No longer used by Condor.
localusercpu numeric(38) numeric(38) Number of seconds of user CPU time the job used on the submit machine.
localsyscpu numeric(38) numeric(38) Number of seconds of system CPU time the job used on the submit machine.
remoteusercpu numeric(38) numeric(38) Number of seconds of user CPU time the job used on remote machines.
remotesyscpu numeric(38) numeric(38) Number of seconds of system CPU time the job used on remote machines.
bytessent numeric(38) numeric(38) Number of bytes sent to the job.
bytesrecvd numeric(38) numeric(38) Number of bytes received by the job.
rscbytessent numeric(38) numeric(38) Number of remote system call bytes sent to the job.
rscbytesrecvd numeric(38) numeric(38) Number of remote system call bytes received by the job.
exitcode integer integer Exit return code of the user job. Used when a job exits by means other than a signal.
jobstatus integer integer Current status of the job.
enteredcurrentstatus ts(3) w tz ts(3) w tz Time the job entered into its current status.
remotewallclocktime numeric(38) numeric(38) Cumulative number of seconds the job has been allocated to a machine.
lastremotehost varchar(4000) varchar(4000) The remote host for the last run of the job.
completiondate ts(3) w tz ts(3) w tz Time when the job completed; 0 if job has not yet completed.
enteredhistorytable ts(3) w tz ts(3) w tz Time when the job entered the history table.
PRIMARY KEY: (scheddname, scheddbirthdate, cluster_id, proc_id)
NOT NULL: scheddname, scheddbirthdate, cluster_id, and proc_id cannot be null
INDEX: Index named hist_h_i_owner on owner



Attributes of jobs_vertical_history Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
scheddbirthdate integer integer The birth date of the schedd where the job is submitted.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
attr varchar(2000) varchar(2000) Attribute name.
val clob text Attribute value.
PRIMARY KEY: (scheddname, scheddbirthdate, cluster_id, proc_id, attr)
NOT NULL: scheddname, scheddbirthdate, cluster_id, proc_id, and attr cannot be null



Attributes of procads_horizontal Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
jobstatus integer integer Current status of the job.
imagesize numeric(38) numeric(38) Estimate of memory image size of the job in kilobytes.
remoteusercpu numeric(38) numeric(38) Total number of seconds of user CPU time the job used on remote machines.
remotewallclocktime numeric(38) numeric(38) Cumulative number of seconds the job has been allocated to a machine.
remotehost varchar(4000) varchar(4000) Name of the machine running the job.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
jobprio integer integer Priority of the job.
args clob text Arguments passed to the job.
shadowbday ts(3) w tz ts(3) w tz The time when the shadow was started.
enteredcurrentstatus ts(3) w tz ts(3) w tz Time the job entered its current status.
numrestarts integer integer Number of times the job has restarted.
PRIMARY KEY: (scheddname, cluster_id, proc_id)
NOT NULL: scheddname, cluster_id, and proc_id cannot be null



Attributes of procads_vertical Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
attr varchar(2000) varchar(2000) Attribute name.
val clob text Attribute value.



3.12.4.6 Machines Tables

Attributes of machines_horizontal Table - Part 1 of 2
Name O. Type P. Type Description
machine_id varchar(4000) varchar(4000) Unique identifier of the machine.
opsys varchar(4000) varchar(4000) Operating system running on the machine.
arch varchar(4000) varchar(4000) Architecture of the machine.
state varchar(4000) varchar(4000) Condor state of the machine.
activity varchar(4000) varchar(4000) Condor job activity on the machine.
keyboardidle integer integer Number of seconds since activity has been detected on any keyboard or mouse associated with the machine.
consoleidle integer integer Number of seconds since activity has been detected on the console keyboard or mouse.
loadavg real real Current load average of the machine.
condorloadavg real real Portion of load average generated by Condor
totalloadavg real real  
virtualmemory integer integer Amount of currently available virtual memory in kilobytes.
memory integer integer Amount of RAM in megabytes.
totalvirtualmemory integer integer  
cpubusytime integer integer Time in seconds since cpuisbusy became true.
cpuisbusy varchar(5) varchar(5) (boolean) True when the CPU is busy.
currentrank real real The machine owner's affinity for running the Condor job which it is currently hosting.
clockmin integer integer Number of minutes passed since midnight.
clockday integer integer The day of the week.
lastreportedtime ts(3) w tz ts(3) w tz Time when the Condor central manager last received a status update from this machine.
enteredcurrentactivity ts(3) w tz ts(3) w tz Time when the machine entered the current activity.
enteredcurrentstate ts(3) w tz ts(3) w tz Time when the machine entered the current state.
updatesequencenumber integer integer Each update includes a sequence number.
Table Continues on Next Page



Attributes of machines_horizontal Table - Part 2 of 2
updatestotal integer integer The number of updates received from the daemon.
updatessequenced integer integer The number of updates that were in order.
updateslost integer integer The number of updates that were lost.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
lastreportedtime_epoch integer integer The equivalent epoch time of lastreportedtime.
PRIMARY KEY: machine_id



Attributes of machines_horizontal_history Table - Part 1 of 2
Name O. Type P. Type Description
machine_id varchar(4000) varchar(4000) Unique identifier of the machine.
opsys varchar(4000) varchar(4000) Operating system running on the machine.
arch varchar(4000) varchar(4000) Architecture of the machine.
state varchar(4000) varchar(4000) Condor state of the machine.
activity varchar(4000) varchar(4000) Condor job activity on the machine.
keyboardidle integer integer Number of seconds since activity has been detected on any keyboard or mouse associated with the machine.
consoleidle integer integer Number of seconds since activity has been detected on the console keyboard or mouse.
loadavg real real Current load average of the machine.
condorloadavg real real Portion of load average generated by Condor
totalloadavg real real  
virtualmemory integer integer Amount of currently available virtual memory in kilobytes.
memory integer integer Amount of RAM in megabytes.
totalvirtualmemory integer integer  
cpubusytime integer integer Time in seconds since cpuisbusy became true.
cpuisbusy varchar(5) varchar(5) (boolean) True when the CPU is busy.
currentrank real real The machine owner's affinity for running the Condor job which it is currently hosting.
clockmin integer integer Number of minutes passed since midnight.
clockday integer integer The day of the week.
lastreportedtime ts(3) w tz ts(3) w tz Time when the Condor central manager last received a status update from this machine.
enteredcurrentactivity ts(3) w tz ts(3) w tz Time when the machine entered the current activity.
enteredcurrentstate ts(3) w tz ts(3) w tz Time when the machine entered the current state.
updatesequencenumber integer integer Each update includes a sequence number.
Table Continues on Next Page



Attributes of machines_horizontal_history Table - Part 2 of 2
Name O. Type P. Type Description
updatestotal integer integer The number of updates received from the daemon.
updatessequenced integer integer The number of updates that were in order.
updateslost integer integer The number of updates that were lost.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
end_time ts(3) w tz ts(3) w tz The end of when the ClassAd is valid.



Attributes of machines_vertical Table
Name O. Type P. Type Description
machine_id varchar(4000) varchar(4000) Unique identifier of the machine.
attr varchar(2000) varchar(2000) Attribute name.
val clob text Attribute value.
start_time ts(3) w tz ts(3) w tz Time when this attribute-value pair became valid.
PRIMARY KEY: (machine_id, attr)
NOT NULL: machine_id and attr cannot be null



Attributes of machines_vertical_history Table
Name O. Type P. Type Description
machine_id varchar(4000) varchar(4000) Unique identifier of the machine.
attr varchar(4000) varchar(4000) Attribute name.
val clob text Attribute value.
start_time ts(3) w tz ts(3) w tz Time when this attribute-value pair became valid.
end_time ts(3) w tz ts(3) w tz Time when this attribute-value pair became invalid.



3.12.4.7 Matchmaking Tables

Attributes of matches Table
Name O. Type P. Type Description
match_time ts(3) w tz ts(3) w tz Time the match was made.
username varchar(4000) varchar(4000) User who submitted the job.
scheddname varchar(4000) varchar(4000) Name of the schedd that the job is submitted to.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.
machine_id varchar(4000) varchar(4000) Identifier of the machine the job matched with.
remote_user varchar(4000) varchar(4000) User that was preempted.
remote_priority real real The preempted user's priority.



Attributes of rejects Table
Name O. Type P. Type Description
reject_time ts(3) w tz ts(3) w tz Time when the job was rejected.
username varchar(4000) varchar(4000) User who submitted the job.
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.



3.12.4.8 Runtime Tables

Attributes of events Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
globaljobid varchar(4000) varchar(4000) Global identifier of the job that generated the event.
run_id numeric(12,0) numeric(12,0) Identifier of the run that the event is associated with.
eventtype integer integer Numeric type code of the event.
eventtime ts(3) w tz ts(3) w tz Time the event occurred.
description varchar(4000) varchar(4000) Description of the event.



Attributes of generic_messages Table
Name O. Type P. Type Description
eventtype varchar(4000) varchar(4000) The type of event.
eventkey varchar(4000) varchar(4000) The key of the event.
eventtime ts(3) w tz ts(3) w tz The time of the event.
eventloc varchar(4000) varchar(4000) The location of the event.
attname varchar(4000) varchar(4000) The attribute name.
attval clob text The attribute value.
attrtype varchar(4000) varchar(4000) The attribute type.



Attributes of runs Table
Name O. Type P. Type Description
run_id numeric(12) numeric(12) Unique identifier of the run.
machine_id varchar(4000) varchar(4000) Identifier of the machine where the job ran.
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
spid integer integer Subprocess identifier for the job.
globaljobid varchar(4000) varchar(4000) Identifier of the job that was run.
startts ts(3) w tz ts(3) w tz Time when the job started.
endts ts(3) w tz ts(3) w tz Time when the job ended.
endtype smallint smallint The type of ending event.
endmessage varchar(4000) varchar(4000) The ending message.
wascheckpointed varchar(7) varchar(7) Whether the run was checkpointed.
imagesize numeric(38) numeric(38) The image size of the executable.
runlocalusageuser integer integer The time the job spent in usermode on execute machines (only standard universe).
runlocalusagesystem integer integer The time the job was in system calls.
runremoteusageuser integer integer The time the shadow spent working for the job.
runremoteusagesystem integer integer The time the shadow spent in system calls for the job.
runbytessent numeric(38) numeric(38) Number of bytes sent to the run.
runbytesreceived numeric(38) numeric(38) Number of bytes received from the run.
PRIMARY KEY: run_id
NOT NULL: run_id cannot be null



3.12.4.9 System Tables

Attributes of dummy_single_row_table Table
Name O. Type P. Type Description
a varchar(1) varchar(1) A dummy column.



Attributes of history_jobs_to_purge Table
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
cluster_id integer integer Cluster identifier for the job.
proc_id integer integer Process identifier for the job.
globaljobid varchar(4000) varchar(4000) Unique global identifier for the job.



Attributes of jobqueuepollinginfo Table
Name O. Type P. Type Description
scheddname varchar(4000) varchar(4000) Name of the schedd that submitted the job.
last_file_mtime integer integer The last modification time of the file.
last_file_size numeric(38) numeric(38) The last size of the file in bytes.
last_next_cmd_offset integer integer The last offset for the next command.
last_cmd_offset integer integer The last offset of the current command.
last_cmd_type smallint smallint The last type of command.
last_cmd_key varchar(4000) varchar(4000) The last key of the command.
last_cmd_mytype varchar(4000) varchar(4000) The last my ClassAd type of the command.
last_cmd_targettype varchar(4000) varchar(4000) The last target ClassAd type.
last_cmd_name varchar(4000) varchar(4000) The attribute name of the command.
last_cmd_value varchar(4000) varchar(4000) The attribute value of the command.


next up previous contents index
Next: 3.13 Setting Up for Up: 3. Administrators' Manual Previous: 3.11 The High Availability   Contents   Index
condor-admin@cs.wisc.edu