How to use Condor
Workflow
- Prepare the scripts on a local computer and take care of random seed handling. (CHTC seems to use different seeds for different R jobs, but the same seed for different Matlab jobs.)
- Upload the scripts to submit-1.chtc.wisc.edu and run many jobs simultaneously there. Each job writes its output into a file simu#.out.
- Collect the outputs from the different jobs and download them to the local computer.
- For more information, see http://chtc.cs.wisc.edu/howto_overview.shtml.
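
The upload and download steps can be sketched with scp. This is only a sketch and not part of the CHTC templates: the user name netid, the remote path, and the helper name transfer_cmds are hypothetical. The function just prints the commands, so they can be reviewed before running them by hand.

```shell
#!/bin/bash
# Hypothetical helper: print the scp commands for one project.
# Arguments: user name on the submit server, remote working directory.
transfer_cmds() {
  user=$1
  remote_dir=$2
  host=submit-1.chtc.wisc.edu
  # Upload the local source folder to the working directory on the server.
  echo "scp -r source ${user}@${host}:${remote_dir}/"
  # Download the collected outputs once all jobs have finished.
  echo "scp -r ${user}@${host}:${remote_dir}/output ."
}
```

For example, transfer_cmds netid /project_name prints one upload command and one download command for the working directory /project_name/.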
Matlab
- On a local computer, gather all Matlab scripts into a folder named source.

      |--source
      |   |-- main_program.m
      |   |-- additional_program_1.m
      |   |-- additional_program_2.m
      |   |-- .....

- At the top of main_program.m, add the following code to handle the seed of the random number generator. The seed is taken from the digits at the end of the current directory name and is also written into the name of the output file.

      dr = pwd;                                        % current working directory
      [ig1,ig2,ig3,tmp] = regexp(dr, '(\d)*?(\d)*$');  % tmp holds the trailing digits
      seed = char(tmp);
      s = RandStream('mcg16807','Seed',str2num(seed));
      RandStream.setDefaultStream(s);
      filename = strcat('simu',seed,'.out');

- At the bottom of main_program.m, add the following code to write the output into a file named simu*****.out. It is fine to store multiple results in multiple files, but every file name should start with "simu", e.g. simu*****_1.out, simu*****_2.out.

      save(char(filename),'output','-ASCII'); % assume output is the variable of interest

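
The seed rule can be checked from the shell before submitting. The sketch below mirrors the Matlab snippet under the assumption that each job runs in a directory whose name ends with its job index, as the numbered directories created by Mjob.sh suggest. The helper names job_seed and output_name are hypothetical.

```shell
#!/bin/bash
# Print the trailing digits of a directory path (empty if there are none).
job_seed() {
  # Strip the longest prefix that ends in a non-digit, leaving the trailing digits.
  printf '%s\n' "${1##*[!0-9]}"
}

# Print the output file name that the Matlab snippet would build for that path.
output_name() {
  printf 'simu%s.out\n' "$(job_seed "$1")"
}
```

Calling output_name "$PWD" inside a job directory shows which file name to expect from that job.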
- Make a working directory, say /project_name/, on submit-1.chtc.wisc.edu. Unzip the Matlab template matlab.zip and copy its contents to that directory. This template is based on a template provided by CHTC, plus the shell scripts shown below. CHTC's template is available at http://chtc.cs.wisc.edu/DAGenv.shtml.
- Upload the source folder from the local computer to /project_name/ on the server.
- Log into submit-1.chtc.wisc.edu, go to the directory /project_name/, and invoke

      ./Mjob.sh main_program # additional_program_1.m,additional_program_2.m,main_program.m

  to submit jobs to CHTC, where # should be replaced with the number of jobs needed. All Matlab scripts, including the main script, should be listed and separated by commas. For example, the following command runs 1000 simulations.

      ./Mjob.sh main_program 1000 additional_program_1.m,additional_program_2.m,main_program.m

  Mjob.sh first compiles main_program.m into an executable program (main_program) and copies it to the shared folder so that every job can use it. It then creates one directory per job, builds the dag folder that stores the output of each job, and submits the DAG.

      #!/bin/bash
      #---------- Mjob.sh ----------
      ## usage: ./Mjob.sh MainProgramName NumOfJobs MatlabFiles
      ## compile the .m source files
      cd source
      chtc_mcc --mtargets=$1.m --mfiles=$3
      ## put the executable where every job can use it
      mkdir shared
      cp $1 shared/
      ## create one directory per job: 0, 1, ..., NumOfJobs-1
      n=`expr $2 - 1`
      for i in `seq 0 $n`
      do
        mkdir $i
      done
      cd ..
      ./mkdag --data=source --dagdir=dag --pattern=simu* --cmdtorun=$1 --type=Matlab
      ## submit dag
      cd dag
      condor_submit_dag mydag.dag

  For more information about mkdag, invoke ./mkdag --help.
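
Mjob.sh does not validate its arguments, so a mistyped job count or file name only shows up after compilation has started. A pre-flight check could look like the sketch below. The function check_mjob_args is hypothetical and not part of the template; it assumes it is run from the working directory that contains source.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the three arguments of Mjob.sh.
# Returns 0 if the arguments look usable, 1 otherwise.
check_mjob_args() {
  main=$1
  njobs=$2
  mfiles=$3
  # The job count must consist of digits only and be at least 1.
  case $njobs in
    ''|*[!0-9]*) echo "number of jobs must be a positive integer" >&2; return 1 ;;
  esac
  [ "$njobs" -gt 0 ] || { echo "number of jobs must be at least 1" >&2; return 1; }
  # The main script must exist in the source folder.
  [ -f "source/$main.m" ] || { echo "missing source/$main.m" >&2; return 1; }
  # Every file in the comma-separated list must exist as well.
  old_ifs=$IFS
  IFS=,
  for f in $mfiles; do
    if [ ! -f "source/$f" ]; then
      IFS=$old_ifs
      echo "missing source/$f" >&2
      return 1
    fi
  done
  IFS=$old_ifs
  return 0
}
```

One way to use it is check_mjob_args main_program 1000 additional_program_1.m,main_program.m && ./Mjob.sh main_program 1000 additional_program_1.m,main_program.m, so that nothing is submitted when the check fails.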
- After all jobs are done, invoke

      ./collect.sh #

  to collect all outputs into a folder named output, where # should be replaced by the number of jobs that have run.

      #!/bin/bash
      #---------- collect.sh ----------
      ## usage: ./collect.sh NumOfJobs
      mkdir output
      n=`expr $1 - 1`
      for i in `seq 0 $n`
      do
        cp dag/$i/simu*.out output
      done

- Download output to the local computer.
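
collect.sh copies whatever it finds and does not say which jobs produced nothing. A variant that also reports the missing job indices could be sketched as follows. The function name collect_outputs is hypothetical; the directory layout dag/0, dag/1, ... is the one used by collect.sh above.

```shell
#!/bin/bash
# Copy simu*.out from dag/0 .. dag/(n-1) into output/ and print the
# index of every job directory that contains no such file.
collect_outputs() {
  n=$1
  mkdir -p output
  i=0
  while [ "$i" -lt "$n" ]; do
    found=0
    for f in dag/"$i"/simu*.out; do
      # If the glob matched nothing, $f is the literal pattern and -e fails.
      [ -e "$f" ] || continue
      cp "$f" output/
      found=1
    done
    [ "$found" -eq 1 ] || echo "$i"
    i=$((i + 1))
  done
}
```

Running collect_outputs 1000 from the working directory fills output and lists the jobs that may need to be rerun.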
R
- Build a directory source that contains another directory shared. Put all R scripts into shared.

      |--source
      |   |--shared
      |   |   |-- main_program.R
      |   |   |-- additional_program_1.R
      |   |   |-- additional_program_2.R
      |   |   |-- .....

- At the top of main_program.R, add the following to handle the random seed and the name of the output file.

      seed <- as.integer(abs(rnorm(1)*100000))
      set.seed(seed)
      filename <- paste("simu",seed,".out",sep="")

- If the program requires R packages, download them from CRAN and compile them as described at http://chtc.cs.wisc.edu/MATLABandR.shtml. Copy the compiled R libraries, called RLIBS.tar.gz, into the shared folder.
- Make a working directory on submit-1.chtc.wisc.edu, say /project_name/. Unzip the R template R.zip and copy its contents to that directory. Again, the R template is based on CHTC's template plus several other shell scripts.
- Upload the source folder from the local computer to the server, under /project_name/.
- In the working directory, invoke

      ./Rjob.sh main_program.R #

  to submit jobs, where # should be replaced with the number of jobs needed.

      #!/bin/bash
      #---------- Rjob.sh ----------
      ## usage: ./Rjob.sh RscriptName.R NumOfJobs
      ## create one directory per job: 0, 1, ..., NumOfJobs-1
      cd source
      n=`expr $2 - 1`
      for i in `seq 0 $n`
      do
        mkdir $i
      done
      cd ..
      ./mkdag --data=source --dagdir=dag --pattern=simu* --cmdtorun=$1 --type=R --version=sl5-R-2.13.1
      ## submit dag
      cd dag
      condor_submit_dag mydag.dag

- After all jobs are done, invoke

      ./collect.sh #

  to collect all outputs into the folder output, where # is again the number of jobs.
- Download output to the local computer.
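
Because each R job draws its seed at random, two jobs could end up with the same seed and therefore the same file name, in which case copying everything into one flat output folder would overwrite one result with the other. A quick check before collecting could be sketched like this. The function name duplicate_outputs is hypothetical and the dag/<job index>/ layout is assumed from collect.sh.

```shell
#!/bin/bash
# Print every output file name that occurs in more than one job directory.
duplicate_outputs() {
  for f in dag/*/simu*.out; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$f" ] || continue
    basename "$f"
  done | sort | uniq -d
}
```

If duplicate_outputs prints nothing, no two jobs share an output file name and collecting into a single folder loses no results.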
Common Commands
| Function | Command |
|---|---|
| check Condor status | condor_status |
| submit jobs | condor_submit |
| submit DAG jobs | condor_submit_dag |
| list your jobs in the queue | condor_q YourAccount |
| kill jobs | condor_rm jobID |
Acknowledgment
Thanks to Bill Taylor, Hao Zheng, and Jie Zhang for their help.