Operators for Niagara


Introduction

     With the new execution engine in Niagara, the line between
Query Engine and Search Engine is blurred considerably, perhaps
to the point of nonexistence.  Streams can contain QE
information such as XNodes, SE information such as Postings, as
well as inline values.  Many of the operations which can be
performed upon the contained data are generic database
operations which are independent of the type of data being
processed.  Note, though, that there are operations which make
sense only in the context of one type of data; for example,
contains only works with two Postings.

Random Notes


Outer Variants
     We  have  discovered that it is important to be able to
     perform outer joins (and other outer-operators too)  on
     streams  of  XML  data.   This  is required both in the
     query engine and in search engine processing.  The need
     for  this  arises from the semantics of XQuery and XML,
     where results may be required from many stages of query
     processing  -- even if no match occurs at that process-
     ing stage.  So far we have identified  such  things  as
     outer  contains,  and  outer  join as useful idioms for
     query processing.  We should be able to  support  outer
     joins on any specified number of inputs to an operator.
     Note that we may want to restrict which  operators  are
     capable  of  doing outer processing to keep implementa-
     tion and execution costs reasonable.
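
The outer-join behavior described above can be sketched as follows.  This is a minimal illustration, not Niagara code; the nested-loops strategy and null padding are assumptions chosen for clarity.

```python
# Sketch of outer-join semantics on tuple streams (illustrative only).
# When a left tuple finds no match, it is still emitted, padded with
# None, so later stages of the query receive a result from this stage
# even though no match occurred here.

def outer_join(left, right, pred):
    """Nested-loops outer join: every left tuple survives."""
    out = []
    for l in left:
        matched = False
        for r in right:
            if pred(l, r):
                out.append(l + r)
                matched = True
        if not matched:
            # Pad with nulls so downstream operators see a full tuple.
            width = len(right[0]) if right else 1
            out.append(l + (None,) * width)
    return out
```

An outer-contains would follow the same shape, with the containment test as the predicate.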

Stream Order
     Most if not all data produced from  the  Index  Manager
     and  the  Data Manager are presented as sorted in docu-
     ment order.  When multiple documents are retrieved from
     the  IM,  the  documents  are  presented in document id
     order.  These well-known properties can be used to
     advantage during query processing to allow simpler, more
     efficient algorithms to be used in a great number of
     cases.  These algorithms also produce results in the same
     order as the inputs, and maintain that property for
     future operators.  However, the most efficient algorithms
     for some cases may be ones that produce unordered or
     partially ordered results.  When an execution plan
     decides to use such an operator, the output stream must
     be marked as unordered.  The plan generator will then
     need to use other algorithms which can
     operate correctly on unordered streams, or will need to
     insert a sort operation to reorder the tuples properly,
     allowing the use of order-assuming operators.  Of course
     some operator algorithms may work correctly with either
     type of stream, and may even preserve the order which was
     present on the inputs.

Another thing to note about order is that it may only apply to
some fields of a tuple.  For example, dereferencing an IDREF
and extracting information from it may add non-document-order
fields to a tuple, since the order in which IDs are referenced
relative to IDREFs is indeterminate.

Group and Aggregate
     It is useful to perform  both  grouping  and  aggregate
     operations  on  SE  as  well as QE data.  Note that the
     Search Engine requirements for grouping are for  simple
     numerical  grouping  and aggregation as in a relational
     system, not the more complex structural  grouping  that
     is possible at the XML or QE level.

Operators in General

Select

Project

Join

Order

Group

Select
     Select evaluates predicates against the values or
     subtrees which are present in the input tuple.  The
     complexity of the select operator has been greatly
     reduced from the current Niagara: it no longer unnests,
     it just performs predicate evaluation.  This reduction in
     complexity allows select to be piggy-backed on top of
     other operators if desired.  The predicate that select
     evaluates may contain various non-predicate functions to
     perform data conversion as needed for the query
     evaluation.

The above being said, there is one unnest-like task which must
be done: comparing the textual representation of document
subtrees for predicate evaluation.  This is something which
XML and XQuery rely on from time to time, and must be
executed.  This task will be performed by a special
incremental comparison operator which will realize only as
much of a subtree as is necessary to evaluate the predicate in
question.  The result constructed for one comparison may be
cached and reused (or realized to a greater length) to perform
additional comparisons.  If the subtree is large enough, it
will be necessary to read and write it from disk so that the
operator can control its memory use while processing the
query.

Adding subtree hash values to the DM will allow equality
comparisons of subtrees to be pre-filtered.  In other words,
only potential matches must be instantiated for actual
comparison, instead of starting guaranteed-fail
instantiations for all possible matches.
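
The hash-filtering idea can be sketched as below.  The Subtree class and its fields are hypothetical stand-ins for DM structures; the point is only that differing hashes prove inequality without realizing any subtree text.

```python
# Sketch: precomputed subtree hashes filter equality comparisons.
# Only subtrees whose hashes collide are actually instantiated and
# compared; a hash mismatch is a guaranteed-fail case skipped cheaply.

class Subtree:
    """Hypothetical handle: a precomputed hash plus the textual form."""
    def __init__(self, hash_, text):
        self.hash = hash_
        self.text = text

def equal_subtrees(a, b):
    if a.hash != b.hash:
        return False          # cannot be equal; skip instantiation
    # Potential match: only now realize and compare the text.
    return a.text == b.text
```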

Project
     As is expected, project projects values from the input
     tuple to the output tuple.  One interesting need we have
     discovered is the need for a predicate-based project.  In
     other words, project may evaluate a predicate and decide
     what fields from the input tuple to project to the output
     tuple based upon the results.

Conceptually  project  can also be used to execute functions
on values from input tuples.
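
A predicate-based project might look like the following sketch.  The field-index lists and the predicate are illustrative assumptions, not a proposed interface.

```python
# Sketch of a predicate-based project: which fields are copied to the
# output tuple depends on the result of a predicate over the input.

def project(tuples, pred, fields_true, fields_false):
    out = []
    for t in tuples:
        # Choose the projection list based on the predicate result.
        fields = fields_true if pred(t) else fields_false
        out.append(tuple(t[i] for i in fields))
    return out
```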

Join Conceptually, join takes a join predicate and an
     arbitrary number of inputs and performs the appropriate
     cartesian product.  There can be many join algorithms in
     Niagara, such as hash join, symmetric hash join, nested
     loops, merge, sort-merge, and Grace hash join; whichever
     are needed can be chosen.



Operations

     Operations are actions which are performed on individual
tuples or data items.  They are often the building blocks used
to construct an operator.  Operations know nothing about
streams; they just know about data, and how to manipulate it.

Search Engine

     Here is a brief list and explanation of the Search Engine
operators we have discovered while generating execution plans
from our query corpus.  These operations can be performed on
any appropriate Posting field in a tuple.

All of these SE operators are expressed as full-blown
operators.  In execution it may often be possible to plumb the
functionality onto an existing operator.  There may be
limitations on how that plumbing may occur, however, so there
still may be a need for a full-fledged version of the
operators.

Contains
     This has been there all along.  The L input is an Element
     or an Attribute, something which can contain something
     else.  The R input for Elements can be Elements, Payload
     Words, Attributes, or Attribute Words.  The R input for
     Attributes can be Attribute Words.  It finds all
     occurrences of an R input which is nested within the
     structure of the L input.

It is possible to negate the condition of a contains operator,
which converts it to not-contains.  For both contains and
not-contains, the semantics of containment of null values must
be considered; given the variability in null handling between
queries, it seems best to leave that decision to the optimizer
and allow the operator to produce all variant results.

There is an orthogonal component to contains operators: a
contains may be executed by any join.  The choice of join
algorithm is possible based upon input streams, expected data,
statistics, and other such criteria.  An appropriate contains
expression, in the guise of a predicate, can be supplied and
performed.  With this methodology, contains is nothing but an
ordinary join, and not a dedicated operator.
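
As a sketch of the containment predicate itself: assuming each Posting carries the (start, end) word numbers of its extent, as the start/end numbers mentioned later for filter-construct suggest, containment reduces to interval nesting.  The tuple layout below is an assumption for illustration.

```python
# Sketch: contains as a join whose predicate is interval nesting.
# Each posting is (start, end); L contains R iff R's extent nests
# strictly inside L's extent.

def contains(l_postings, r_postings):
    out = []
    for l in l_postings:
        for r in r_postings:
            if l[0] < r[0] and r[1] < l[1]:   # R nested within L
                out.append((l, r))
    return out
```

With document-order streams, a merge-style algorithm could replace the nested loops; the predicate stays the same.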

Direct-Contains
     Again, this operator is already present.  The inputs are
     the same as for the contains operator.  It finds all
     occurrences of an R input which is directly nested within
     the structure of the L input.

Only-Contains
     This is a variant of Direct-Contains.  I believe its
     current name in the system is "is"; however, it really is
     a form of contains.  Inputs are the same as contains.  It
     returns all occurrences of an R input being the only
     thing directly nested within the L input.

Exists-Contains
     This is another variant of Contains, though it can be a
     variant on any contains variant.  The above contains
     operators return a complete list of matching pairs.
     However, there can be cases where that information is not
     needed by the rest of the query; the match merely
     qualifies the node for other purposes.  One example of
     this is unnest for other values.  This singleton variant
     provides the L input if the qualifying condition is true.
     As in all the other operators, the R input may be
     projected if needed.  Note that Exists-contains is the
     behavior of the current contains code, as it qualifies a
     node on just the first instance of a match, allowing
     unnest to do the bulk of the work.  In the new IM and SE,
     contains does much of the work which unnest previously
     performed.

Range
     It is often necessary to extract an arbitrary range of
     matches from an input stream.  Ordering of the stream is
     not the responsibility of the range operator; it is
     responsible for selecting a subset of whatever ordering
     the stream has.  Examples of range use include the first
     match, the first several matches, a particular match, a
     range of matches, or all matches after N.  Rank, Top and
     Range can all be represented by a two-argument range
     expression which takes a begin and end specification of a
     range.  Begin values include start-of-document (otherwise
     known as rank 1), and end values include end-of-document.
     When end-of-document is specified, the start argument may
     be negative to specify distance from the end.
     Translations of common requests are shown here:

               +----------------------------------+
               |Top(5)            Range(Start, 5) |
               |Rank(3)           Range(3,3)      |
               |Between(10,15)    Range(10,15)    |
               |Between(9, End)   Range(9, End)   |
               |Last(5)           Range(-5, End)  |
               +----------------------------------+

The last form need not be implemented; I mention it to be
complete.  It is the only form which requires the operator to
store an arbitrary, but known in advance, number of tuples
before it generates its result.  As such it is also, of
necessity, a blocking operator.
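
The translation table above can be sketched directly.  The Start/End markers and the 1-based slicing policy are assumptions for illustration; a real implementation would stream rather than slice lists.

```python
# Sketch: every variant reduces to a two-argument Range over the rank
# order of matches.  Rank 1 is start-of-document; None marks
# end-of-document, and a negative begin with End counts from the end.

START, END = 1, None

def range_op(matches, begin, end=END):
    if end is END and begin < 0:       # Last(n): distance from the end
        return matches[begin:]
    stop = len(matches) if end is END else end
    return matches[begin - 1:stop]     # ranks are 1-based

def top(matches, n):  return range_op(matches, START, n)
def rank(matches, n): return range_op(matches, n, n)
def last(matches, n): return range_op(matches, -n)
```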

Range is trivial to implement since it has nothing to do with
ordering.  When performing multi-document queries it is
necessary to combine range with grouping or aggregation
operators to provide per-document results in the typical case.
As long as the input stream obeys the assumed
document_id-order model, this is trivial to implement in range
itself.  If a non-document_id-order stream is used, it is
probably best to use the grouping operator and provide rank
as an aggregation predicate to it.  Note that we may need to
upgrade the contents of the IDREF table to provide type
information for destination nodes to allow for proper
indirection-based queries, as the destination may be an
arbitrary element.

Deref
     The deref operator is used to provide access to the
     IDREF-to-ID table.  In other words, it is a probe-join
     used to provide dereference information for IDREF
     information.  It is inappropriate to use a normal join in
     this case, as IDREFs do not have any order in the
     requesting document.  Information returned by deref is
     not in sorted order; IDREFs do not reference IDs in
     document order.  If an order is desired on that field, it
     must be applied via an operator.
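
A minimal sketch of deref as a probe join follows.  The table layout (a plain mapping from ID to target node) and the field position are assumptions; note that, as stated above, the appended field carries no order.

```python
# Sketch: deref probes the ID table for each tuple's IDREF field and
# appends the target node.  Tuples whose IDREF resolves to nothing are
# dropped here (inner-join behavior); an outer variant would pad with
# a null instead.

def deref(tuples, idref_field, id_table):
    out = []
    for t in tuples:
        target = id_table.get(t[idref_field])
        if target is not None:
            out.append(t + (target,))
    return out
```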

Ref-By
     The ref-by operator is something we are considering for
     inclusion to aid queries which require such information.
     It finds the IDREFs which correspond to an ID.  One
     implementation of this operator is to perform a full
     join, instead of a probe join, of the input against the
     IDREF index.  Another possible implementation is to
     provide both forward (IDREFs->ID) and backward
     (ID->IDREFs) indices on the IDREF table.  Most notes
     about deref apply equally to ref-by.

Is  it  possible  to  use a multi-dimensional index for such
mapping?

Reinject
     Experimental: the intent of the reinject operator is to
     allow a query stream to re-enter the SE after leaving it.
     I thought I had good reason for its existence, but
     perhaps it is too complex to implement.  It seems that it
     is necessary for executing correlated sub-queries.  For
     example, the reinject operator contains a simple SE
     expression, into which it can insert values or Postings
     from the input tuple.  The simple SE query may then be
     executed with those parameters, the results output, and
     the next tuple processed.  If the expression is complex,
     it is better to try to integrate it into the overall
     query plan.  However some plans, with the right
     statistics, could use this to great advantage to reduce
     the work needed for some queries.  We need to find a good
     example of why this is important, to decide if it should
     be tossed or not.

Merge
     The  merge  operator  takes two streams of postings and
     merges them.  If the streams are in document and  docu-
     ment_id order the output will have the same order.  For
     example, using an earlier query:

                (A contains B) or (A contains C)

     One plan is to query

                     A contains (B merge C)

     It may be desirable for merge to project an additional
     field into the output which reflects which input stream
     the merged value originated from.  This would be used by
     later select predicates to discriminate between different
     matches.
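
The order-preserving merge with an origin field might be sketched as below; the 'L'/'R' tags and the comparable-posting assumption are illustrative.

```python
# Sketch: merge two sorted posting streams, tagging each output row
# with the input it came from so later selects can discriminate
# between matches.  Postings are assumed directly comparable, which
# stands in for document/document_id order.

def merge_tagged(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append((left[i], 'L'))
            i += 1
        else:
            out.append((right[j], 'R'))
            j += 1
    out += [(p, 'L') for p in left[i:]]    # drain whichever remains
    out += [(p, 'R') for p in right[j:]]
    return out
```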

Except
     The except operator outputs all  Postings  from  the  L
     input,  except  those  which  exist  in the R input.  I
     believe this operator may also be known as  difference.
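
Assuming both inputs are sorted posting streams, except can be done in a single merge pass, sketched here with directly comparable postings as a stand-in.

```python
# Sketch: difference on sorted posting streams -- emit every L posting
# that has no equal posting in R, advancing a single cursor over R.

def except_op(left, right):
    out, j = [], 0
    for p in left:
        while j < len(right) and right[j] < p:
            j += 1                      # skip R postings below p
        if j >= len(right) or right[j] != p:
            out.append(p)               # no match in R: keep it
    return out
```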

Project
     For all intents and purposes, we can consider all the SE
     operators to always add more fields to the tuples.
     However, some of those values will no longer be needed
     for further query processing and should be eliminated.
     The project operator takes a single input stream and
     projects desired fields from that stream to the output
     stream.  It is interesting to observe that project can be
     considered a special case of a generic sorted merge
     operator with project capabilities: a single-input merge.

Query Engine

     The following are the Query Engine operators we have
found a need for while creating execution plans for our query
corpus.  As noted in the Search Engine section, it may be
possible to plumb together operators directly for execution,
so that individual operators are not required at runtime.  The
purpose of each operator is described on its own, as each
stands alone and can be optimized the same way.  For example,
an operator which adds fields to a tuple has an implicit
project functionality built into it.

Simple-Unnest
     From one point of view, Unnest is the translator between
     the SE stream and the QE stream.  From another point of
     view, unnest just adds values to a tuple.  In the current
     Niagara, unnest always performs an exhaustive search,
     using the DM, to unnest the path expression in question.
     This can be an expensive operation.  However, the new
     Niagara can reduce the amount of work required by unnest,
     as it provides the exact element_ids of elements to be
     unnested.  With this information unnest need only extract
     the required values from the document and add them to the
     tuple.  If an expression to be unnested is found to be
     large, or to
     be an XML subtree, the value is not added to the tuple.
     Instead a flag is placed in the extracted value field of
     the tuple which indicates that the value is large and
     that an expensive evaluation must be performed upon
     demand.  Note: this implies that the XNode corresponding
     to that value is always kept in the tuple by the
     execution plan generator.  If the expected atomic value
     is a small subtree, falling below some size threshold, it
     can be converted to a textual representation and used for
     further queries.  Note that the controlling XNode will
     always be used for node identity, as unnested values are
     just that: anonymous values.  If something to be unnested
     is not found, there may be either an error or the value
     may be treated as a null value.  Such a decision should
     be made by the plan generator, not the runtime.  The
     runtime will either generate an error, or put a flag in
     the value slot indicating that the value is null.
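
The large-value policy above can be sketched in a few lines.  The LARGE marker and the threshold value are illustrative names, not part of any real interface.

```python
# Sketch of the large-value policy in simple-unnest: small atomic
# values are inlined into the tuple; anything over a threshold is
# replaced by a marker, so evaluation happens on demand through the
# retained XNode.

LARGE = object()      # flag stored in the extracted value slot
THRESHOLD = 64        # bytes of text we are willing to inline

def unnest_value(text):
    return text if len(text) <= THRESHOLD else LARGE
```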

Unnest
     This version of unnest is the full-blown expensive unnest
     that is currently available in the system.  There are
     some cases where results cannot be supplied from the SE
     to allow use of Simple-unnest; in that case the
     exhaustive search of the document in the DM must be
     performed.  Other than that detail, Unnest does
     everything which is mentioned for simple-unnest.

Merge
     The merge operator takes multiple streams of tuples and
     merges them together.  It should be possible to specify
     predicates for how the tuples should be merged, though I
     don't know how useful that may ultimately be.

Construct
     Construct is a family of operators which are directly
     related to the algebra.  Given input tuples, new
     documents or document fragments are constructed from the
     values available in the tuple.

Filter-Construct
     The filter-construct operator is a special construction
     operator which implements the filter semantics of XQuery.
     The input to filter-construct is a stream of candidate
     Postings or XNodes.  A state stack is maintained to
     connect the XNodes together, using their start and end
     numbers to determine the closest-match document structure
     as mentioned in the XQuery paper.  The resulting document
     may be a single cohesive document or an ordered forest.
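
The state-stack idea can be sketched as follows.  The (start, end, label) node layout is an assumption standing in for Postings or XNodes; the stack attaches each node under the closest enclosing candidate still open, yielding an ordered forest.

```python
# Sketch of the filter-construct state stack: candidate nodes arrive
# in document order with (start, end) numbers; a node nests under the
# closest open candidate whose extent still encloses it.

def filter_construct(nodes):
    """nodes: [(start, end, label)] in document order -> ordered forest."""
    forest, stack = [], []   # stack holds (end, children-list) entries
    for start, end, label in nodes:
        while stack and stack[-1][0] < start:
            stack.pop()                    # pop candidates already closed
        entry = (label, [])
        if stack:
            stack[-1][1].append(entry)     # attach to closest open node
        else:
            forest.append(entry)           # a new root in the forest
        stack.append((end, entry[1]))
    return forest
```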

Rank See the discussion of rank under the Range operator in
     the Search Engine section for details.

Sort Yup ... Sorting ..  we all  know  what  it  is.   'Nuff
     said.

Group-By
     Strictly speaking, group-by in XQuery and XML means
     structural aggregation.  Due to that necessary evil, the
     grouping subsystem must be capable of handling both a
     large number of groups and a large amount of data in
     individual groups.  This is considerably different from
     standard relational grouping, which only requires a large
     group count, storage for the aggregate result, and
     temporary space used to develop that result.

Note this only discusses grouping and aggregation at a
conceptual level; see the vignette on grouping and aggregation
for the details.  Conceptually the group-by operator takes a
stream as an input and has a known, or potentially unknown,
number of streams as an output.  Each of the output streams
corresponds to a unique grouping value.  Given this, group-by
is a simple switch which routes tuples to an appropriate
output stream.  It has nothing to do with aggregation,
storage, or any other issues.
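
The routing-switch view can be sketched directly; the dictionary of lists stands in for the per-group output streams, which are created on demand since the number of groups may be unknown in advance.

```python
# Sketch of group-by as a routing switch: each input tuple is sent to
# the output stream for its grouping value, and nothing else happens
# here -- no aggregation, no storage management.

def group_by(tuples, key):
    streams = {}
    for t in tuples:
        streams.setdefault(key(t), []).append(t)
    return streams
```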

To extract the output of this conceptual

Aggregation
     Aggregate operations can be used both with the group-by
     operator and as a standalone operator.  With the
     above-mentioned group-by operator, aggregate handling is
     conceptually performed by simple, stream-based aggregate
     operators in either case.  Almost anything can be
     implemented as an aggregate function, and it will be
     possible to perform multiple aggregate operations, as
     well as to filter (via the select operator) the
     aggregates which are produced.  I'll mention some
     aggregate operators here as an example, but again, the
     framework allows most anything to be implemented.  See
     the aggregate vignette for details.