Query Syntax

Transforms

Transforms are applied in order. The last entry should be a chart type. If no chart is specified, a table view of the data is shown.

Normally transforms will be one per line, but when used with HTCondorView.simple, they should be concatenated with ampersands ("&").

limit=n

Discard all data after n rows.

filter=key>=value1,value2,...

Show only rows where the column named key has a value >= value1 or value2 etc. The operator (>= in this example) can be one of =, <, >, <=, >=, <>, or !=. If you specify more than one value, they are combined in an OR configuration (ie. a row matches if any of the values match). If you provide more than one filter= clause, they are combined successively in an AND configuration (ie. a row matches only if all the filters are true).

q=value1,value2,...

Show only rows where any of the columns contain the substring value1 or value2 etc. If more than one value is specified, they are combined in an OR configuration. If you provide more than one q= clause, they are combined successively in an AND configuration.

order=[-]key1,[-]key2

Sort the table in order of key1 and then (if key1 is the same) by key2 etc. If a key starts with '-' that means to sort that key in descending order (largest to smallest).

extract_regexp=key=regexp(match)regexp

Search for the given regular expression in each row in the column named key. If it's found, replace the column with the substring in the (match) part of the regexp. For example, extract_regexp=ver=version-(.*) would replace a string version-25.4 in column ver with the string 25.4.

group=key1,key2,...;val1,val2,...

Summarize the table data by summing and counting. This operation works like a simplified version of SQL's "group by" clause. In the resulting output, the order of the columns will be key1,key2,...,val1,val2... and there will only be at most one row with any particular combination of key1,key2,... fields. The val columns will be summed (if they were already numerical) or counted (if they were non-numeric).

A clause like group=a,b;x,y (if x is a string and y is a number) would be equivalent to this in SQL: select a, b, count(x) x, sum(y) y from MyTable group by a, b

If you apply multiple group= clauses, it works like using multiple nested subqueries in SQL. (That is, the output of one group= clause becomes the MyTable of the next one.)

If you leave out the ';' and the val column names, the default is to automatically include all columns other than the key columns.

If you include the ';' but leave out the val column names, that means you don't want any value columns (so only the key fields will be included, and nothing will be summed or counted at that step). So group=a,b; (with a trailing semicolon) is equivalent to this in SQL: select a, b from MyTable group by a, b.

treegroup=key1,key2,...;[val1,[val2]]

Like group=, but produces an output table arranged hierarchically by each key1..keyn, so you can drill down. There can be zero, one, or two val columns; the first value is the size of each box in the tree view (if omitted, they are all the same size), and the second value is the colour of each box (if omitted, the colour varies with the size). treegroup= isn't really useful unless you also use chart=tree.

pivot=rowkeys...;colkeys...;valkeys...

A pivot= clause works like a group= clause, but creates a pivot table. Pivot tables are a bit complicated; the easiest way to learn about them is to play with an example.

The simplest way to think of a pivot table is like this: the values originally in the columns named by rowkeys end up down the left of the output table, one column per rowkey. The values originally in the columns named by colkeys end up as headings across the top of the output table, the values concatenated together with a space. The values originally in the columns named by valkeys end up as values in the body section of the output table. A pivot table is very handy when you have raw data in SQL-like format and you want to rearrange it to be suitable for charting (where each line in a line chart, say, is usually one column of the table).

If the rowkeys section is empty, the output will have exactly one row (with all the value fields counted or summed into that one row). If the colkeys section is empty, the pivot= operation is essentially equivalent to a group=rowkeys...;valkeys... operation. If the valkeys section is empty, there are no values used to calculate the table body, so it is equivalent to an group=rowkeys...; operation.

For example, given this data:

color	highlight	shape	size	count
green	black	cube	2	1
red	white	cube	1	4
red	white	sphere	2	4
green	black	sphere	2	2
green	white	sphere	1	5
red	white	cube	1	4
red	black	sphere	2	3
red	white	cube	1	3
green	white	cube	1	2
green	black	sphere	2	5
red	black	cube	2	2
red	white	cube	2	5
red	white	sphere	1	1
green	black	cube	1	4
red	black	cube	1	4
red	black	cube	2	2
green	white	sphere	1	3
red	white	cube	2	2
green	black	cube	2	5
red	white	cube	2	1

The pivot

                            pivot=color,highlight;shape,size;count

yields this table:

color	highlight	cube 2	cube 1	sphere 2	sphere 1
green	black	6	4	7	0
red	white	8	11	4	1
green	white	0	2	0	8
red	black	4	4	3	0

rename=keysrc=keydst

Change the column header from keysrc to keydst.

rename=col1,col2,...

Transpose numeric data, with new column headers given by the col. Only works with numeric data.

chart=charttype

Instead of showing a table of values, show a chart. The available charttypes are currently: stacked (a stacked area chart), line, spark, column, bar, pie, tree (see treegroup), candle, timeline, dygraph, dygraph+errors.

delta=key1,key2...

Replace entries for the named columns with the difference from the previous row.

unselect=key1,key2...

Remove the named columns.

yspread

For each field in a row that is a number, divide it by the sum of all numbers in that row.

finishtree

inverttree

cracktree

quantize

Functions

avg(l): Returns average (mean) of all entries in a list.
cat(l): Returns a string of all entries in a list joined together with spaces.
color(l): Returns a unique integer for each unique entry in the list.
count(l): Returns number of entries in a list
count_distinct: Returns number of distinct values in a list.
count_nz(l): Returns number of entries in a list that are not null or 0.
first(l): Returns first element from a list.
last(l): Returns last element from a list.
max(l): Returns largest entry from the list.
median(l): Returns median (middle-most) value from the list.
min(l): Returns smallest entry from the list.
only(l): If a list has only one item, returns the item. If the list is empty, returns null. Otherwise throws an error.
stddev(l): Returns standard deviation for all entries in a list.
sum(l): Returns sum of all entries in a list.

HTCondorView.simple

The following options are only used with HTCondorView.simple. In normal usage they are separate fields.

url=url: URL where data should be loaded from.
title=title: Add a title to the chart. Not used for tables.

Transforms
- limit
- filter
- q
- order
- extract_regexp
- group
- treegroup
- pivot
- rename
- transpose
- chart
- delta
- unselect
- yspread
- finishtree
- inverttree
- cracktree
- quantize
Functions
- avg
- cat
- color
- count
- count_distinct
- count_nz
- first
- last
- max
- median
- min
- only
- stddev
- sum
HTCondorView.simple