###############################################################################
##
## Quickstep*
##
## Class:
##   CS 764
##   Spring 2014
##
## Authors:
##   Anand Mundada
##   Dale Willis
##
## Description:
##   This project integrates quickstep into the Hadoop YARN architecture.
##
###############################################################################


== Setup YARN/Hadoop cluster ==

To run Quickstep*, you first need a working Hadoop cluster.
Specifically, it has to support YARN (hadoop-2.2.0).

We are including a snapshot of the hadoop-2.2.0 repo to guarantee this will work.
It can be found under @hadoop-2.2.0@.


Our cluster was set up by following the steps at this link:
http://thecodeway.blogspot.com/2013/11/hadoop-220-installation-steps-for.html

We set up a new user "hduser" and set the Hadoop working directory to:
/usr/local/hadoop/

The configuration files are important, so our config files are included at:
@hadoop-config@

These files should go under /usr/local/hadoop/etc/hadoop/*
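
The copy step above can be sketched in Python. This is only a minimal sketch: the function name install_config is hypothetical, and it assumes the config files sit directly inside the source directory (subdirectories are skipped) and that the user running it can write to the destination.

```python
import shutil
from pathlib import Path


def install_config(src_dir, dest_dir="/usr/local/hadoop/etc/hadoop"):
    """Copy every regular file in src_dir into dest_dir; return the copied names.

    install_config is a hypothetical helper, not part of this project.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"config source directory not found: {src}")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file():  # skip subdirectories
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Calling install_config("hadoop-config") from the directory that contains @hadoop-config@ would copy the files into the default destination.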


== Setup Quickstep* ==

Two versions of quickstep need to be built: one that supports a TCP server, and one
"vanilla" version whose storage locations (qsstor and catalog) have unique names.
For the first version I used the "yarn" branch, and you should be able to apply the
patch file located here: 000-yarn_network_server.patch

For the other "vanilla" quickstep, just change a few things in the QuickstepCli.cpp file:

#ifdef QUICKSTEP_OS_WINDOWS
const char *STORAGE_PATH = "shellqsstor\\";
#else
const char *STORAGE_PATH = "shellqsstor/";
#endif

query_processor.reset(new QueryProcessor("shellcatalog.json", STORAGE_PATH));

Note that the node that runs the YARN Client process can also be a slave.
In that case both versions of quickstep (quickstep_cli_shell and quickstep_cli_net)
run on the same node in the same directory (/home/hduser), which could cause
a collision between their catalog and qsstor blocks. This is why I change the names
for the shell version.

Also note that the Python code that runs quickstep expects the names to be:
shellcatalog.json
and
shellqsstor/
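
A quick way to confirm that the names match is a small check like the one below. It is a sketch under assumptions: the helper missing_shell_paths is hypothetical, and this README does not say whether the catalog and storage directory must exist before the first run or are created by quickstep itself, so treat a non-empty result as a hint rather than an error.

```python
from pathlib import Path

# Names taken from the text above; the Python driver expects exactly these.
SHELL_CATALOG = "shellcatalog.json"
SHELL_STORAGE = "shellqsstor"


def missing_shell_paths(workdir):
    """Return the expected shell-version paths that do not exist under workdir.

    missing_shell_paths is a hypothetical helper, not part of this project.
    """
    base = Path(workdir)
    missing = []
    if not (base / SHELL_CATALOG).is_file():
        missing.append(SHELL_CATALOG)
    if not (base / SHELL_STORAGE).is_dir():
        missing.append(SHELL_STORAGE + "/")
    return missing
```

For example, missing_shell_paths("/home/hduser") returns an empty list when both the catalog file and the storage directory are present.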

To run Quickstep*, the quickstep binaries and library files have to be
in the right locations.

We simply copied everything in the @scripts@ directory into the home
directory of the hadoop user (in our case @/home/hduser@).


== Run Quickstep* ==

Once you have a full Hadoop cluster up and running, start the YARN client
with:

yarn jar distributedDB.jar distributeddb.Client -db quickstep

The only important argument is "-db <quickstep|sqlite3>"
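
If you want to start the client from a script, the command line above can be built and validated first. This is a minimal sketch: build_client_command and launch_client are hypothetical names, and it assumes that "yarn" is on the PATH and that distributedDB.jar is in the current directory.

```python
import subprocess

# The two backends named in the "-db" argument above.
SUPPORTED_DBS = ("quickstep", "sqlite3")


def build_client_command(db="quickstep", jar="distributedDB.jar"):
    """Return the argument list for the YARN client, rejecting unknown backends."""
    if db not in SUPPORTED_DBS:
        raise ValueError(f"-db must be one of {SUPPORTED_DBS}, got {db!r}")
    return ["yarn", "jar", jar, "distributeddb.Client", "-db", db]


def launch_client(db="quickstep"):
    """Start the client daemon in the background and return the process handle."""
    return subprocess.Popen(build_client_command(db))
```

Passing the arguments as a list avoids shell quoting problems, and checking the "-db" value up front catches a typo before YARN is invoked.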

This launches the client daemon.

The Client listens on port 23456, so in another terminal launch:
telnet <hostname> 23456


From there you can submit queries directly, like "select * from test",
or you can see the syntax and help information by typing "!help".

There are a series of helpful commands; all of them start with "!".
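
The same session can be scripted instead of typed into telnet. The sketch below rests on assumptions this README does not confirm: that the Client accepts plain newline-terminated text lines (as a telnet session would suggest) and that a reply is complete once the server closes the connection or goes quiet for the timeout period. The function name send_line is hypothetical.

```python
import socket


def send_line(host, line, port=23456, timeout=2.0):
    """Send one newline-terminated line and return whatever text comes back.

    The wire format is assumed, not documented: one text line in, text out.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("utf-8") + b"\n")
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            # No more data within the timeout; treat the reply as complete.
            pass
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, send_line("<hostname>", "!help") would request the help text, and send_line("<hostname>", "select * from test") would submit a query, each over a fresh connection.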





