class Driver_Fib : public MWDriver

The driver class derived from the MWDriver class for this application

Public Methods

Driver_Fib()
Constructor
~Driver_Fib()
Destructor

Public

Checkpointing Methods
void write_master_state( FILE *fp )
Write out the state of the master to an fp
void read_master_state( FILE *fp )
Read the state from an fp
MWTask* gimme_a_task()
That simple annoying function
Implemented Methods
MWReturn get_userinfo( int argc, char *argv[] )
Get the info from the user. Don't forget to get the worker_executable!
MWReturn setup_initial_tasks( int *, MWTask *** )
Set up an array of tasks here
MWReturn act_on_completed_task( MWTask * )
What to do when a task finishes:
MWReturn pack_worker_init_data( void )
Put things in the send buffer here that go to a worker
void printresults()
OK, this one doesn't *have* to be...but you want to be able to tell the world the results, don't you? :-)

Inherited from MWDriver:

Public Fields

static MWRMComm* RMC

Public Methods

void go()
void go( int argc, char *argv[] )
virtual void printresults()

Protected

A. Pure Virtual Methods

virtual MWReturn get_userinfo( int argc, char *argv[] )
This function is called to read in all information specific to a user's application and do any initialization on this information
virtual MWReturn setup_initial_tasks( int *n, MWTask ***task )
This function must return a number n > 0 of pointers to Tasks to "jump start" the application
virtual MWReturn act_on_completed_task( MWTask * )
This function performs actions that happen once the Driver receives notification of a completed task
virtual MWReturn pack_worker_init_data( void )
A common theme of Master-Worker applications is that there is a base amount of "initial" data defining the problem, and then just incremental data defining "Tasks" to be done by the Workers
virtual void unpack_worker_initinfo( MWWorkerID *w )
This one unpacks the "initial" information sent to the driver once the worker initializes
virtual void pack_driver_task_data( void )
OK, This one is not pure virtual either, but if you have some "driver" data that is conceptually part of the task and you wish not to replicate the data in each task, you can pack it in a message buffer by implementing this function
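The `int *n, MWTask ***task` signature of setup_initial_tasks() is an out-parameter pattern: the callee allocates an array of task pointers and returns both the array and its length to the caller. The following is a minimal sketch of that pattern only. `Task`, `Status`, `make_initial_tasks` and `free_tasks` are hypothetical stand-ins, not MW types or functions, and the sketch does not say who owns the tasks in MW itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the real MWTask and MWReturn are defined by MW.
struct Task { int first; int second; };
enum Status { STATUS_OK, STATUS_ABORT };

// Mirrors the shape of setup_initial_tasks( int *n, MWTask ***task ):
// the callee allocates an array of task pointers and hands it back
// through the two out-parameters.
Status make_initial_tasks(int *n, Task ***tasks) {
    assert(n != NULL && tasks != NULL);
    const int count = 3;              // must be > 0 to "jump start" the run
    *tasks = new Task*[count];
    for (int i = 0; i < count; ++i) {
        (*tasks)[i] = new Task;
        (*tasks)[i]->first = i;       // illustrative starting pair
        (*tasks)[i]->second = i + 1;
    }
    *n = count;
    return STATUS_OK;
}

// Frees what make_initial_tasks allocated. This cleanup exists only so
// the sketch is leak-free; it is an assumption, not MW behaviour.
void free_tasks(int n, Task **tasks) {
    for (int i = 0; i < n; ++i) delete tasks[i];
    delete[] tasks;
}
```

A caller would declare `int n; Task **tasks;` and pass `&n, &tasks`, which is where the third level of indirection comes from.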

B. Task List Management

void addTask( MWTask * )
Add a task to the list
void addTasks( int, MWTask ** )
Add a bunch of tasks to the list
void set_task_key_function( MWKey (*)( MWTask * ) )
Sets the function that MWDriver uses to get the "key" for a task
int set_task_add_mode( MWTaskAdditionMode )
Set the mode you wish for task addition.
int set_task_retrieve_mode( MWTaskRetrievalMode )
Set the mode you wish for task retrieval.
int set_machine_ordering_policy( MWMachineOrderingPolicy )
Sets the machine ordering policy.
int print_task_keys( void )
(Mostly for debugging) -- Prints the task keys in the todo list
int sort_task_list( void )
This sorts the task list by the key that is set
int delete_tasks_worse_than( MWKey )
This deletes all tasks in the task list with a key worse than the one specified
int get_number_tasks()
returns the number of tasks on the todo list.
int get_number_running_tasks()
returns the number of running tasks.
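The key-based operations above (setting a key function, sorting by it, and deleting tasks worse than a given key) can be illustrated with a small sketch. It makes several assumptions that this page does not confirm: `Task` and `Key` are hypothetical stand-ins for MWTask and MWKey, a `std::vector` is swapped in for MWDriver's internal list, and a larger key is taken to mean "worse".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for MWTask and MWKey.
struct Task { int id; double bound; };
typedef double Key;

// A user-written key function, analogous in shape to what would be
// passed to set_task_key_function().
Key bound_key(Task *t) { return t->bound; }

// Comparator that orders tasks by the key a function pointer returns.
struct ByKey {
    Key (*key)(Task *);
    explicit ByKey(Key (*k)(Task *)) : key(k) {}
    bool operator()(Task *a, Task *b) const { return key(a) < key(b); }
};

// Sorts the todo list so the smallest key comes first.
void sort_by_key(std::vector<Task*> &todo, Key (*key)(Task *)) {
    assert(key != NULL);
    std::stable_sort(todo.begin(), todo.end(), ByKey(key));
}

// Removes every task whose key is strictly greater than cutoff and
// returns how many were removed. ASSUMPTION: larger key means worse.
// The tasks themselves are not freed; the list does not own them here.
int drop_worse_than(std::vector<Task*> &todo, Key (*key)(Task *), Key cutoff) {
    assert(key != NULL);
    std::size_t kept = 0;
    for (std::size_t i = 0; i < todo.size(); ++i) {
        if (key(todo[i]) <= cutoff) todo[kept++] = todo[i];
    }
    int removed = static_cast<int>(todo.size() - kept);
    todo.resize(kept);
    return removed;
}
```

In a branch-and-bound style application, pruning like this would typically follow the discovery of a better incumbent, with the incumbent's value as the cutoff.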

Benchmarking

void register_benchmark_task( MWTask *t )
register the task that will be used for benchmarking
MWTask* get_benchmark_task()
get the benchmark task

C. Worker Policy Management

void set_target_num_workers( int iWantThisMany )
Sets the desired number of workers
void set_target_num_workers( int *iWantThisany, int num_arches )
Sets the desired number of workers
void set_suspension_policy( MWSuspensionPolicy )
Set the policy to use when suspending
int worker_timeout
If 0: workers never time out and can potentially work forever on a task. If 1: workers time out after worker_timeout_limit seconds.
double worker_timeout_limit
Limit, in seconds, after which workers are considered timed out and their tasks are reassigned
int worker_timeout_check_frequency
frequency at which we check if there are timed out workers
int next_worker_timeout_check
The time of the next timeout check, based on the timeout check frequency
void set_worker_timeout_limit(double timeout_limit, int timeout_frequency)
Sets the timeout_limit and turns worker_timeout to 1
void reassign_tasks_timedout_workers()
Go through the list of timed out WORKING workers and reschedule tasks
double timeval_to_double( struct timeval t )
A helper function
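The timeout fields above suggest a simple test: when worker_timeout is 1, a worker whose task has run longer than worker_timeout_limit seconds is treated as timed out. The sketch below shows one plausible reading of that logic. `to_seconds` and `has_timed_out` are hypothetical names, and the conversion shown is only an assumption about what a helper such as timeval_to_double() might do.

```cpp
#include <cassert>
#include <sys/time.h>

// Converts a timeval to seconds as a double. ASSUMPTION: this is one
// plausible reading of a helper like timeval_to_double().
double to_seconds(struct timeval t) {
    return static_cast<double>(t.tv_sec)
         + static_cast<double>(t.tv_usec) / 1.0e6;
}

// Returns true when timeouts are enabled (timeout_enabled != 0) and the
// time elapsed since the task started exceeds limit_secs.
bool has_timed_out(int timeout_enabled, double limit_secs,
                   struct timeval started, struct timeval now) {
    assert(limit_secs >= 0.0);
    if (timeout_enabled == 0) return false;   // workers never time out
    return to_seconds(now) - to_seconds(started) > limit_secs;
}
```

A periodic check would call `has_timed_out` for each working worker and reschedule the tasks of those for which it returns true.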

D. Event Handling Methods

virtual MWReturn handle_benchmark( MWWorkerID *w )
Here, we get back the benchmarking results, which tell us something about the worker we've got
virtual void handle_hostdel()
This is what gets called when a host goes away
virtual void handle_hostsuspend()
Implements a suspension policy
virtual void handle_hostresume()
Here's where you go when a host gets resumed
virtual void handle_taskexit()
We do basically the same thing as handle_hostdel()
virtual void handle_checksum()
Routine to handle when the communication layer says that a checksum error happened

E. Checkpoint Handling Functions

void checkpoint()
This function writes the current state of the job to disk
void restart_from_ckpt()
This function does the inverse of checkpoint
int set_checkpoint_frequency( int freq )
This function sets the frequency with which checkpoints are done
int set_checkpoint_time( int secs )
Set a time-based frequency for checkpoints
virtual void write_master_state( FILE *fp )
Here you write out all 'state' of the driver to fp
virtual void read_master_state( FILE *fp )
Here, you read in the 'state' of the driver from fp
virtual MWTask* gimme_a_task()
It's really annoying that the user has to do this, but they do

Private Fields

MWStatistics stats

Private Methods

double get_instant_pool_perf()

Private

Checkpoint internal helpers...

int checkpoint_frequency
How often to checkpoint? Task frequency based
int checkpoint_time_freq
How often to checkpoint? Time based
long next_ckpt_time
Time to do next checkpoint
int num_completed_tasks
The number of tasks acted upon up to now
MWTask* bench_task
The benchmark task
const char* ckpt_filename
The name of the checkpoint file
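The fields above describe two checkpoint triggers: one counted in completed tasks and one based on elapsed time. The following sketch shows how such triggers could be evaluated. `CkptPolicy`, `should_checkpoint` and `schedule_next` are hypothetical, and two behaviours are assumed rather than taken from this page: a frequency of 0 disables that trigger, and both triggers may be active at once.

```cpp
#include <cassert>

// Hypothetical mirror of checkpoint_frequency, checkpoint_time_freq
// and next_ckpt_time.
struct CkptPolicy {
    int task_freq;   // checkpoint every task_freq completed tasks; 0 = off (assumed)
    int time_freq;   // checkpoint every time_freq seconds; 0 = off (assumed)
    long next_time;  // absolute time of the next time-based checkpoint
};

// Returns true if either trigger says a checkpoint is due.
bool should_checkpoint(const CkptPolicy &p, int completed_tasks, long now) {
    assert(p.task_freq >= 0 && p.time_freq >= 0);
    if (p.task_freq > 0 && completed_tasks > 0 &&
        completed_tasks % p.task_freq == 0) {
        return true;                       // task-count trigger
    }
    if (p.time_freq > 0 && now >= p.next_time) {
        return true;                       // time-based trigger
    }
    return false;
}

// After a checkpoint, pushes the time-based deadline forward.
void schedule_next(CkptPolicy &p, long now) {
    if (p.time_freq > 0) p.next_time = now + p.time_freq;
}
```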

Internal Task List Routines

void pushTask( MWTask * )
This puts a (generally failed) task at the beginning of the list
MWTask* getNextTask()
Get a Task.
void putOnRunQ( MWTask *t )
This puts a task at the end of the list
MWTask* rmFromRunQ( int jobnum )
Removes a task from the queue of things to run
void printRunQ()
Print the tasks in the list of tasks to do
void ckpt_addTask( MWTask * )
Add one task to the todo list; do NOT set the 'number' of the task - useful in restarting from a checkpoint
MWWorkerID* task_assigned( MWTask *t )
returns the worker this task is assigned to, NULL if none.
bool task_in_todo_list( MWTask *t )
Returns true if "t" is still in the todo list
void ControlPanel( )
The control panel that controls the execution of the independent mode
MWKey (*task_key)( MWTask * )
A pointer to a (user written) function that takes an MWTask and returns the "key" for this task
MWTaskAdditionMode addmode
Where should tasks be added to the list?
MWTaskRetrievalMode getmode
Where should tasks be retrieved from the list?
MWKey (*worker_key)( MWWorkerID * )
A pointer to the function that returns the "key" by which machines are ranked
int task_counter
MWDriver keeps a unique identifier for each task -- here's the counter
bool listsorted
Is the list sorted by the current key function
MWTask* todo
The head of the list of tasks to do
MWTask* get_todo_head()
This is Jeff's nasty addition so that he can get access to the tasks on the master
MWTask* todoend
The tail of the list of tasks to do
MWTask* running
The head of the list of tasks that are actually running
MWTask* runningend
The tail of the list of tasks that are actually running

Main Internal Handling Routines

MWReturn master_setup( int argc, char *argv[] )
This method is called before master_mainloop() is
MWReturn master_mainloop()
This is the main controlling routine of the master
MWReturn worker_init( MWWorkerID *w )
Unpacks the initial worker information, and sends the application startup information (by calling pure virtual pack_worker_init_data())

The return value is taken as the return value from the user's pack_worker_init_data() function

MWReturn create_initial_tasks()
This routine sets up the list of initial tasks to do on the todo list
MWReturn handle_worker_results( MWWorkerID *w )
Act on a "completed task" message from a worker
void send_task_to_worker( MWWorkerID *w )
We grab the next task off the todo list, make a work message, and send it to a worker
void rematch_tasks_to_workers( MWWorkerID *nosend )
After each result message is processed, we try to match up tasks with workers
void call_hostaddlogic()
A wrapper around the lower level's hostaddlogic
void kill_workers()
Kill all the workers
void hostPostmortem( MWWorkerID *w )
This is called in both handle_hostdel() and handle_taskexit()

Worker management methods

void addWorker( MWWorkerID *w )
Adds a worker to the list of available workers
MWWorkerID* lookupWorker( int tid )
Looks up information about a worker given its task ID
MWWorkerID* rmWorker( int tid )
Removes a worker from the list of available workers
void printWorkers()
Prints the available workers
MWWorkerID* get_workers_head()
Another terrible addition so that Jeff can print out the worker list in his own format
MWWorkerID* workers
The head of the list of workers.
MWSuspensionPolicy suspensionPolicy
Here's where we store what should happen on a suspension...
void sort_worker_list()
Based on the ordering policy, place w in the worker list appropriately
int numWorkers()
Counts the existing workers
int numWorkers( int arch )
Counts the number of workers in the given arch class
int numWorkersInState( int ThisState )
Counts the number of workers in the given state
MWKey return_best_todo_keyval( void )
Returns the value (only) of the best key in the Todo list
MWKey return_best_running_keyval( void )
Returns the value (only) of the best key in the Running list

XML and Status Methods.

void write_XML_status()
virtual char* get_XML_results_status(void )
If you want to display status information about some result variables of your solver, return a string in ASCII, HTML or XML format from this method
char* get_XML_status()
char* get_XML_job_information()
char* get_XML_problem_description()
char* get_XML_interface_remote_files()
char* get_XML_resources_status()
const char* xml_filename
const char* xml_menus_filename
const char* xml_jobinfo_filename
const char* xml_pbdescrib_filename
void get_machine_info()
Set the current machine information
char* get_Arch()
Returns a pointer to the machine's Arch
char* get_OpSys()
Returns a pointer to the machine's OpSys
char* get_IPAddress()
Returns a pointer to the machine's IPAddress
double get_CondorLoadAvg()
double get_LoadAvg()
int get_Memory()
int get_Cpus()
int get_VirtualMemory()
int get_Disk()
int get_KFlops()
int get_Mips()
char Arch[64]
char OpSys[64]
char IPAddress[64]
double CondorLoadAvg
double LoadAvg
int Memory
int Cpus
int VirtualMemory
int Disk
int KFlops
int Mips
int check_for_int_val(char* name, char* key, char* value)
Utility function used by get_machine_info()
char mach_name[64]
The name of the machine the worker is running on.

Documentation

The driver class derived from the MWDriver class for this application.

In particular, this application is a very simple one that calculates the Fibonacci sequence for different pairs of starting numbers.

This simple app will not need any special math packages to run, and is designed with the non-math-specialist in mind (like me!).
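As an illustration of the computation described above, here is a minimal sketch of a Fibonacci-style sequence generated from an arbitrary pair of starting numbers. `fib_from` is a hypothetical function written for this page. It is not taken from the application's worker code, which this page does not show.

```cpp
#include <cassert>
#include <vector>

// Returns the first `length` terms of the sequence that starts with
// (first, second) and in which every later term is the sum of the two
// terms before it.
std::vector<long> fib_from(long first, long second, int length) {
    assert(length >= 0);
    std::vector<long> seq;
    if (length > 0) seq.push_back(first);
    if (length > 1) seq.push_back(second);
    for (int i = 2; i < length; ++i) {
        seq.push_back(seq[i - 1] + seq[i - 2]);
    }
    return seq;
}
```

With the starting pair (1, 1) and a length of 6 this yields 1, 1, 2, 3, 5, 8. Each distinct starting pair is an independent computation, which is what makes the problem easy to split into tasks.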

Driver_Fib()
Constructor

~Driver_Fib()
Destructor

Implemented Methods
These methods are the ones that *must* be implemented in order to create an application

MWReturn get_userinfo( int argc, char *argv[] )
Get the info from the user. Don't forget to get the worker_executable!

MWReturn setup_initial_tasks( int *, MWTask *** )
Set up an array of tasks here

MWReturn act_on_completed_task( MWTask * )
What to do when a task finishes:

MWReturn pack_worker_init_data( void )
Put things in the send buffer here that go to a worker

void printresults()
OK, this one doesn't *have* to be...but you want to be able to tell the world the results, don't you? :-)

Checkpointing Methods

void write_master_state( FILE *fp )
Write out the state of the master to an fp

void read_master_state( FILE *fp )
Read the state from an fp. This is the reverse of write_master_state().

MWTask* gimme_a_task()
That simple annoying function
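Since read_master_state() is the reverse of write_master_state(), the two must agree on the order and format of what goes through the `FILE *`. The sketch below shows such a round trip with plain stdio calls. `MasterState`, `write_state` and `read_state` are hypothetical, and the two fields are invented for illustration; the state that Driver_Fib actually saves is not listed on this page.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical master state; the fields are placeholders.
struct MasterState { int tasks_done; long best_value; };

// Writes the state as one line of text.
void write_state(const MasterState &s, FILE *fp) {
    assert(fp != NULL);
    std::fprintf(fp, "%d %ld\n", s.tasks_done, s.best_value);
}

// Reads the state back in the same order it was written.
// Returns false if the file does not contain both fields.
bool read_state(MasterState &s, FILE *fp) {
    assert(fp != NULL);
    return std::fscanf(fp, "%d %ld", &s.tasks_done, &s.best_value) == 2;
}
```

Keeping the two functions side by side and symmetric makes it harder for a field to be added to one and forgotten in the other.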


This class has no child classes.
Author:
Mike Yoder
