How to use Condor

Workflow

  • Prepare the scripts on your local computer and handle random seeds explicitly. (CHTC seems to use different seeds for different R jobs but the same seed for different Matlab jobs.)
  • Upload the scripts to submit-1.chtc.wisc.edu and run many jobs simultaneously there. Each job writes its output to a file named simu#.out.
  • Collect the outputs from all jobs and download them to your local computer.
  • For more information, see http://chtc.cs.wisc.edu/howto_overview.shtml.
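
The upload and download steps can be done with scp, for example. The sketch below only prints the commands so they can be inspected before running them. The account name your_account and the remote directory project_name are placeholders chosen for illustration, not values supplied by CHTC.

```shell
#!/bin/bash
# Placeholder values (assumptions): replace with your own account and directory.
CHTC_USER="your_account"
CHTC_HOST="submit-1.chtc.wisc.edu"
REMOTE_DIR="project_name"

# Print the command that uploads a local folder to the submit server.
upload_cmd() {
    printf 'scp -r %s %s@%s:%s/\n' "$1" "$CHTC_USER" "$CHTC_HOST" "$REMOTE_DIR"
}

# Print the command that downloads a remote folder into the current directory.
download_cmd() {
    printf 'scp -r %s@%s:%s/%s .\n' "$CHTC_USER" "$CHTC_HOST" "$REMOTE_DIR" "$1"
}

upload_cmd source
download_cmd output
```

Copy a printed line into the terminal to run it once the placeholders are filled in.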

Matlab

  • On your local computer, gather all Matlab scripts into a folder named source.
    |--source
    |  |-- main_program.m
    |  |-- additional_program_1.m
    |  |-- additional_program_2.m
    |  |-- .....
    
  • At the top of main_program.m, add the following code to seed the random number generator. The seed is taken from the trailing digits of the current directory path and is also written into the name of the output file.
    dr = pwd;
    % keep the trailing digits of the current directory path
    [ig1,ig2,ig3,tmp]= regexp(dr, '(\d)*?(\d)*$');
    seed=char(tmp);
    s = RandStream('mcg16807','Seed',str2num(seed));
    % newer Matlab releases use RandStream.setGlobalStream instead
    RandStream.setDefaultStream(s);
    filename = strcat('simu',seed,'.out');
    
  • At the bottom of main_program.m, add the following code to write the output to a file named simu*****.out, where ***** is the seed. Results may be stored in several files, but every file name should start with "simu", e.g., simu*****_1.out and simu*****_2.out.
    save(char(filename),'output','-ASCII');   % assume output is the variable of interest
    
  • Make a working directory, say /project_name/, on submit-1.chtc.wisc.edu. Copy the Matlab template matlab.zip to that directory and unzip it. The template is based on one provided by CHTC, plus several shell scripts shown below. CHTC's template is available at http://chtc.cs.wisc.edu/DAGenv.shtml.
  • Upload the source folder from your local computer to /project_name/ on the server.
  • Log into submit-1.chtc.wisc.edu, go to the directory /project_name/, and invoke
    ./Mjob.sh main_program # additional_program_1.m,additional_program_2.m,main_program.m
    
    to submit jobs to CHTC, where # should be replaced with the number of jobs needed. List all Matlab scripts, including the main script, separated by commas. For example, the following command runs 1000 simulations.
    ./Mjob.sh main_program 1000 additional_program_1.m,additional_program_2.m,main_program.m

Mjob.sh first compiles main_program.m into an executable (main_program) and copies it to the shared folder so that every job can use it. It then creates one numbered directory per job inside source and calls mkdag to build the dag folder, which stores the outputs of each job.

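#---------- Mjob.sh ----------
#!/bin/bash
## usage: ./Mjob.sh MainProgram NumOfJobs file1.m,file2.m,...

## compile the .m source files into an executable named $1
cd source || exit 1
chtc_mcc --mtargets="$1.m" --mfiles="$3"
mkdir -p shared
cp "$1" shared/

## create one directory per job, numbered 0 to NumOfJobs-1
n=$(( $2 - 1 ))
for i in $(seq 0 "$n")
do
    mkdir -p "$i"
done

## build the dag folder
cd ..
./mkdag --data=source --dagdir=dag --pattern='simu*' --cmdtorun="$1" --type=Matlab

## submit dag
cd dag || exit 1
condor_submit_dag mydag.dag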

For more information on mkdag, invoke ./mkdag --help.
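
Before submitting many jobs, it can help to check which seed main_program.m would parse from a given directory. The regexp at the top of main_program.m keeps the trailing digits of the working directory path. A rough shell equivalent is sketched below; the function name seed_from_path and the example paths are made up for illustration and do not describe the actual directories used by CHTC.

```shell
#!/bin/bash
# Print the trailing digits of a path (an empty line if the path does not end in a digit).
# ${1##*[!0-9]} removes the longest prefix that ends in a non-digit character.
seed_from_path() {
    printf '%s\n' "${1##*[!0-9]}"
}

# Made-up example paths, for illustration only.
seed_from_path "/scratch/job_12345"
seed_from_path "/scratch/run7/42"
seed_from_path "/scratch/source"
```

A path that yields an empty seed would leave the Matlab code with no number to pass to RandStream, so it is worth checking that the job directories end in digits.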

  • After all jobs are done, invoke ./collect.sh # to collect all outputs into a folder named output, where # should be replaced by the number of jobs that were run.
    #---------- collect.sh ----------
    #!/bin/bash
    ## usage: ./collect.sh NumOfJobs
    mkdir -p output
    
    ## copy the outputs of jobs 0 to NumOfJobs-1 into output
    n=$(( $1 - 1 ))
    for i in $(seq 0 "$n")
    do
        cp dag/"$i"/simu*.out output
    done
    
  • Download the output folder to your local computer.
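
Before downloading, it may be worth checking that every job produced a result. The following is a minimal sketch, assuming each job writes exactly one simu*.out file; the function names count_outputs and check_outputs are hypothetical and are not part of the CHTC template.

```shell
#!/bin/bash
# Count the simu*.out files in a folder (default: output).
count_outputs() {
    dir=${1:-output}
    count=0
    for f in "$dir"/simu*.out; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Compare the number of collected files with the expected number of jobs.
# Returns a non-zero status when the two numbers differ.
check_outputs() {
    expected=$1
    found=$(count_outputs "${2:-output}")
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found of $expected output files found"
    else
        echo "WARNING: $found of $expected output files found"
        return 1
    fi
}

# Example: check_outputs 1000
```

If jobs write several files each (simu*****_1.out, simu*****_2.out), the expected count has to be scaled accordingly.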

R

  • Build a directory source that contains another directory, shared. Put all R scripts in shared.
    |--source
    |  |--shared
    |  |  |-- main_program.R
    |  |  |-- additional_program_1.R
    |  |  |-- additional_program_2.R
    |  |  |-- .....
    
  • At the top of main_program.R, add the following to set the random seed and the name of the output file.
    seed <- as.integer(abs(rnorm(1)*100000))
    set.seed(seed)
    filename <- paste("simu",seed,".out",sep="")
    
  • If the program requires R packages, download them from CRAN and compile them as described at http://chtc.cs.wisc.edu/MATLABandR.shtml. Copy the compiled R libraries, RLIBS.tar.gz, into the shared folder.
  • Make a working directory on submit-1.chtc.wisc.edu, say /project_name/. Copy the R template R.zip to that directory and unzip it. Like the Matlab template, the R template is based on CHTC's template plus several shell scripts.
  • Upload the source folder from your local computer to /project_name/ on the server.
  • In the working directory, invoke ./Rjob.sh main_program.R # to submit jobs, where # should be replaced with the number of jobs needed.
    #---------- Rjob.sh ----------
    #!/bin/bash
    ## usage: ./Rjob.sh  RscriptName.R NumOfJobs
    
    ## create one directory per job, numbered 0 to NumOfJobs-1
    cd source
    
    n=`expr $2 - 1`
    for i in `seq 0 $n`
    do 
    mkdir $i
    done
    
    cd ..
    ./mkdag --data=source --dagdir=dag --pattern=simu* --cmdtorun=$1 --type=R --version=sl5-R-2.13.1
    
    ## submit dag
    cd dag
    condor_submit_dag mydag.dag
    
  • After all jobs are done, invoke ./collect.sh # to collect all outputs into the folder output, where # should be replaced by the number of jobs that were run.
  • Download the output folder to your local computer.
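
Because each R job draws its own seed, two jobs can end up with the same seed and therefore the same file name, in which case collect.sh would overwrite one file with the other. The following is a minimal sketch of a check to run in the working directory before collecting. It assumes the dag/<job number>/ layout that collect.sh reads from, and the function name duplicate_outputs is made up for illustration.

```shell
#!/bin/bash
# List output file names that occur in more than one job directory
# under the given folder (default: dag).
duplicate_outputs() {
    for f in "${1:-dag}"/*/simu*.out; do
        if [ -f "$f" ]; then
            basename "$f"
        fi
    done | sort | uniq -d
}

# Example: duplicate_outputs dag
```

An empty result means every file name is unique. Any name that is printed belongs to at least two jobs, and those files would need to be renamed or copied by hand.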

Common Commands

Function               Command
check Condor status    condor_status
submit jobs            condor_submit
submit dag jobs        condor_submit_dag
inquire running jobs   condor_q YourAccount
kill jobs              condor_rm jobID

Acknowledgment

Thanks to Bill Taylor, Hao Zheng, and Jie Zhang for their help.

