Homework 6 // Due at Lecture Wednesday, October 28, 2009
You will perform this assignment on the x86-64 Nehalem-based systems you used for previous homeworks:
ale-01.cs.wisc.edu and ale-02.cs.wisc.edu.
You should do this assignment alone. No late assignments.
Purpose
The purpose of this assignment is to give you some experience
converting lock-based synchronization into transactions, using
Intel's prototype C++ STM Compiler.
Programming Environment: POSIX Threads + Software
Transactional Memory (STM)
Once again, threads in this homework are of the POSIX flavor.
As in HW2, the
orchestration and creation/destruction of threads have been done for
you, because you will be re-using most of your code from HW2.
You will be modifying your concurrent tree code to use
transactions to perform the synchronization. You will be using
Intel's prototype C++ STM compiler. The programming interface for
Intel's STM is described in this
document.
Programming Task: Concurrent Binary Tree, Reloaded
This homework re-uses your lock-based implementation of a concurrent
binary tree from HW2. You
will modify this code to use STM for synchronization instead of locks
in many cases.
The previous code had several bugs that have since been corrected. An
updated version of the code is available here.
Problem 1: Compiler Setup
Download and install the Intel
C++ STM Compiler. (You probably want to store the tarball in
/scratch or /tmp and install to AFS from there.) At the first prompt,
you should choose "Install as current user to limit access to user
level", rather than the default, and specify somewhere in your CS
account to install the compiler. To save storage space, you may want
to choose a custom installation. You will not need the Math Kernel
Library, TBB, or the IPP library.
When you are prompted for a license file, specify the following location:
/s/intel_cc-11.0/common/licenses/NCOM_L_CMP_CPP_NRGF-WJJ5F3HH.lic
If anyone has trouble getting the license to work, please contact the TA.
Next, modify the Makefile of the ctree benchmark to use the
Intel C++ STM compiler, which should be located at:
<install_dir>/intel/Compiler/11.0/606/bin/intel64/icpc
Note that to use TM, you will need to specify the -Qtm_enabled flag for both the compile and link steps.
To execute the resulting program, you will also need to set the LD_LIBRARY_PATH environment variable to point to:
<install_dir>/intel/Compiler/11.0/606/lib/intel64
NOTE: Students have reported issues installing the compiler on the ale machines, and possibly on other machines, but installations performed on the clover machines have worked correctly. Please contact the TA if you have an issue with an installation performed on a clover node.
Problem 2: Transactionalize your Concurrent Tree Operations
The goal of problem 2 is to make a version of the concurrent tree
microbenchmark that uses transactions instead of locks to provide
atomic tree operations.
As described in the Intel
documentation, use atomic blocks to synchronize the
implementations of Lookup(), Set(), and Remove(). To do this, wrap the
synchronized code in the __tm_atomic construct and
annotate any method or function called within a transaction with
__attribute__((tm_callable)). You may
find other useful constructs in the documentation. Feel free to
experiment.
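As a rough illustration of the pattern described above, the sketch below wraps a lookup and an insert in atomic blocks and annotates the helper they call. Everything except __tm_atomic and __attribute__((tm_callable)) is an assumption made for the example: the Node layout, the FindSlot helper, the function signatures, and the USE_INTEL_STM macro (which you would define yourself when building with the STM compiler) are all hypothetical and will not match your CTree code. When USE_INTEL_STM is not defined, a single global lock stands in for the STM so the sketch also builds with an ordinary compiler. Remove() is omitted for brevity but would follow the same shape.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: USE_INTEL_STM is a hypothetical macro you define yourself
// (e.g. -DUSE_INTEL_STM) when compiling with the Intel STM compiler.
#ifdef USE_INTEL_STM
  #define TM_CALLABLE __attribute__((tm_callable))
  #define TM_ATOMIC   __tm_atomic
#else
  // Fallback for ordinary compilers: one global lock stands in for the STM.
  #include <mutex>
  static std::mutex g_fallback_lock;
  struct FallbackGuard {
      bool first;
      FallbackGuard() : first(true) { g_fallback_lock.lock(); }
      ~FallbackGuard() { g_fallback_lock.unlock(); }
  };
  #define TM_CALLABLE
  // Runs the following block exactly once while holding the lock.
  #define TM_ATOMIC for (FallbackGuard tm_guard; tm_guard.first; tm_guard.first = false)
#endif

// Hypothetical node layout; the real CTree node will differ.
struct Node {
    int key;
    int value;
    Node* left;
    Node* right;
};

// Called from inside atomic blocks, so it carries the tm_callable annotation.
// Returns the link that holds the node with this key, or the empty link
// where such a node would be attached.
TM_CALLABLE static Node** FindSlot(Node** root, int key) {
    Node** slot = root;
    while (*slot != NULL && (*slot)->key != key) {
        slot = (key < (*slot)->key) ? &(*slot)->left : &(*slot)->right;
    }
    return slot;
}

bool Lookup(Node** root, int key, int* value_out) {
    assert(value_out != NULL);
    bool found = false;
    TM_ATOMIC {
        Node* n = *FindSlot(root, key);
        if (n != NULL) { *value_out = n->value; found = true; }
    }
    return found;  // return after the block rather than from inside it
}

void Set(Node** root, int key, int value) {
    // Allocate before the atomic block so no allocator call runs inside it.
    Node* fresh = new Node;
    fresh->key = key; fresh->value = value;
    fresh->left = NULL; fresh->right = NULL;
    bool used = false;
    TM_ATOMIC {
        Node** slot = FindSlot(root, key);
        if (*slot == NULL) { *slot = fresh; used = true; }
        else               { (*slot)->value = value; }
    }
    if (!used) delete fresh;  // key already existed; only the value changed
}
```

Keeping the return statements and the allocation outside the atomic blocks is a conservative choice for the sketch, not a requirement stated in the assignment; check the Intel documentation for what is actually permitted inside a transaction.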
You can get statistics on your transactions by setting the environment
variable ITM_STATISTICS to "simple" or "verbose" (without the quotes). The
statistics will be written out to a file called itm.log.
For this step, you should disable the transactional and throughput
tests by commenting out the defines at the top of main.C. When your
implementation passes the parallel tests, proceed to the next problem.
Problem 3: Use STM for Concurrent Tree Transactions
The next step is to synchronize the transactional calls to the
concurrent tree. You will do this by modifying the Transactions.C
file to use atomic blocks and your implementations of Lookup, Set, and
Remove (not the TransactionalLookup, TransactionalSet, and
TransactionalRemove). Using atomic blocks will greatly simplify this
code. For example, you can remove the logging and undo functionality,
since this is provided by the STM.
You should leave the calls to usleep
exactly as they are for the throughput tests. The compiler complains
about them, but you can indicate that they do not affect the
transaction by using the __tm_waiver
annotation (this was your TA's best-guess approach to this problem, but
if you think of something better, let me know).
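To make the idea concrete, here is a minimal sketch of a multi-operation transaction built from plain Lookup, Set, and Remove calls inside one atomic block, with the usleep call waived. It is not the code from Transactions.C: the fixed-size array standing in for the tree, the MoveKey operation, the function signatures, and the USE_INTEL_STM macro are all hypothetical. The sketch also assumes that __tm_waiver can prefix a block; consult the Intel documentation for the exact form. Without USE_INTEL_STM, a recursive global lock stands in for the STM so that the nested atomic blocks still work with an ordinary compiler.

```cpp
#include <cassert>
#include <unistd.h>  // usleep

// Assumption: USE_INTEL_STM is a hypothetical macro you define yourself
// when compiling with the Intel STM compiler.
#ifdef USE_INTEL_STM
  #define TM_CALLABLE __attribute__((tm_callable))
  #define TM_ATOMIC   __tm_atomic
  #define TM_WAIVER   __tm_waiver   // assumed usable as a block prefix
#else
  // Fallback: a recursive lock, because atomic blocks nest in this sketch.
  #include <mutex>
  static std::recursive_mutex g_fallback_lock;
  struct FallbackGuard {
      bool first;
      FallbackGuard() : first(true) { g_fallback_lock.lock(); }
      ~FallbackGuard() { g_fallback_lock.unlock(); }
  };
  #define TM_CALLABLE
  #define TM_ATOMIC for (FallbackGuard tm_guard; tm_guard.first; tm_guard.first = false)
  #define TM_WAIVER
#endif

// Hypothetical stand-in for the tree: keys must lie in [0, kSlots).
static const int kSlots = 16;
static int  g_value[kSlots];
static bool g_present[kSlots];

TM_CALLABLE bool Lookup(int key, int* value_out) {
    bool found = false;
    TM_ATOMIC {
        if (g_present[key]) { *value_out = g_value[key]; found = true; }
    }
    return found;
}

TM_CALLABLE void Set(int key, int value) {
    TM_ATOMIC { g_value[key] = value; g_present[key] = true; }
}

TM_CALLABLE bool Remove(int key) {
    bool removed = false;
    TM_ATOMIC { removed = g_present[key]; g_present[key] = false; }
    return removed;
}

// Composite operation: move a value from one key to another as one unit.
// No explicit log or undo code is needed; if the transaction aborts,
// the STM is expected to roll back the partial updates.
bool MoveKey(int from, int to) {
    bool moved = false;
    TM_ATOMIC {
        int v = 0;
        if (Lookup(from, &v)) {
            TM_WAIVER { usleep(10); }  // delay kept in place, marked as waived
            Remove(from);
            Set(to, v);
            moved = true;
        }
    }
    return moved;
}
```

For example, after Set(1, 42), a call to MoveKey(1, 2) leaves key 1 absent and key 2 holding 42, and other threads should never observe the intermediate state in which the value is in neither place.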
Problem 4: Description of Synchronization Strategies
Describe where you used transactions to synchronize your
code. Describe where you felt transactions weren't appropriate.
Problem 5: Evaluation
Evaluate the throughput of both the transactional and torture tests on
your implementation, and compare them with the single lock
implementation and, if you got it working, your program from HW2.
Also, set the ITM_STATISTICS variable to verbose and collect the
transaction statistics for the parallel test, and for the
transactional tests separately. Include a print-out of the "GRAND
TOTAL" section of each with your report.
Problem 6: Questions (Submission Credit)
- Is transactional memory the best thing since sliced bread?
- Describe how you used atomic blocks and the purpose of each.
- What did you observe in the transactional statistics? What sort of differences did you see between the fine-grained parallelism of the parallel tree operations (parallel tests), and the more coarse-grained parallelism of the transactional tests?
- Describe any restructuring you did to the program to improve performance.
- Which implementation -- lock-based or transaction-based -- performed best?
- Which was easier overall -- writing the transactional version, or writing the fine-grained locking version? Include
estimates of coding and debugging time.
Tips and Tricks
Start early.
What to Hand In
Please turn this homework in on paper at the beginning of
lecture. Please STAPLE the pages together (paper clips are
better than nothing, but staples are preferred).
Your code from CTree.[hC] and Transactions.C, as well as any other
modifications you make.
Your results from Problem 5.
Answers to questions in Problem 6.
Important: Include your name on EVERY page.