# CS 368-4 (2012 Fall) — Day 14 Homework

Due Thursday, December 13, at the start of class.

## Goal

Play a simple card game… a lot, to estimate the likelihoods of the possible outcomes.

## Background Information

There is a very simple single-deck solitaire card game that I know. I cannot find a name for this game — if you happen to know a name for it, let me know!

Anyway, it is not much of a “game”, in that the outcome is completely determined by the order of cards and by the rules. Call it a pastime.

Here is how it works. Shuffle a complete 52-card deck. Play one card at a time, in order. We will call the most recently played card the “top”. Examine the top-most four cards (if there are that many yet). If the first (top) and fourth cards have the same face value, remove all four cards. If the first and fourth cards have the same suit, remove only the intervening two cards. If there are still four or more cards left, repeat this removal procedure with the new top-most four cards, until either there are fewer than four cards left or until it is not possible to remove any more cards. Then, play the next card and repeat the whole procedure. The game is won if, after dealing all 52 cards and removing ones as above, there are no more cards left in the pile.
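The rules above can be sketched in Python as follows. This is only an illustration, not the C program used in the assignment: the function name `play` and the `(face, suit)` tuple representation of a card are assumptions made for the example.

```python
def play(deck):
    """Play one game of the pastime.

    `deck` is a sequence of (face, suit) tuples in the order they are dealt.
    Returns the cards left in the pile, oldest first.
    """
    pile = []
    for card in deck:
        pile.append(card)  # the newest card is the "top" (end of the list)
        # Keep removing cards for as long as the top four allow it.
        while len(pile) >= 4:
            top, fourth = pile[-1], pile[-4]
            if top[0] == fourth[0]:
                # Same face value: remove all four cards.
                del pile[-4:]
            elif top[1] == fourth[1]:
                # Same suit: remove only the two cards in between.
                del pile[-3:-1]
            else:
                break  # nothing to remove; deal the next card
    return pile
```

The game is won when `play(deck)` returns an empty list. In a single deck, two different cards cannot match in both face value and suit, so the order of the two tests does not matter.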

### Sample of Play

In the examples below, cards are shown as played from left to right. The “top” or first card is always the furthest to the right. You may need special fonts to see the suit characters.

Suppose the first four cards to be played are:

3♠ 7 J♠ 2♠

Comparing the first (“top”) card, 2♠, with the fourth card, 3♠, we see that they are not the same face value, but they are the same suit. Hence, we can remove the middle two cards, leaving the pile as follows:

3♠ 2♠

Playing two more cards, suppose we have:

3♠ 2♠ 8♣ 8

The first and fourth cards do not have the same face nor suit, so we keep playing. Nothing happens until we get here:

3♠ 2♠ 8♣ 8 K 3 2

Now, the first (2) and fourth (8) cards match suit, so we remove the middle cards, leaving:

3♠ 2♠ 8♣ 8 2

And then, without playing another card yet, we see that the new first (2) and fourth (2♠) cards have the same face value, so we can remove all four top cards, leaving just one:

3♠

And so on, until we have played all 52 cards.

For what it’s worth, this game can be played with real cards in a way that does not require a table or anything like that. Great for long car, bus, train, airplane rides! See me sometime if you want a quick demo.

### The Software

I wrote a little program in C that plays this card game. It creates a random shuffling of the card deck, plays through the whole game, and reports on the outcome. Here is a sample of running the program:

```
% /usr/local/bin/cards
HTyfzbDLEKqhvGrJgmlZMiCoAVatWBSXIdkYOwcjRsepxFUnuNPQ  4 HTPQ
```

(The `%` is the shell prompt, not something I typed.)

What is going on in the output? Well, there are 52 cards in a deck, and there are 52 letters in the alphabet, if you count upper- and lowercase separately. So, for compactness, I mapped each card to a letter: A = A♣, B = 2♣, …, N = A, …, a = A, …, n = A♠. When finished, the program prints out the entire shuffled deck as a sequence of letters. Then comes the total number of cards remaining (4 in this case), and then the exact remaining cards. One could imagine analyzing the shuffled sequences for repeats or randomness, but all we care about for this exercise is the number of cards remaining.
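To make the letter mapping and the output layout concrete, here is a small Python sketch. The function names are hypothetical, and two things are assumed rather than documented: that faces run A, 2, 3, …, K within each block of 13 letters, and that the three output fields are separated by whitespace, as in the sample above.

```python
import string

# Assumption: cards are lettered A-Z then a-z, 13 letters per suit,
# so suit 0 starts at 'A' and suit 3 starts at 'n'.
LETTERS = string.ascii_uppercase + string.ascii_lowercase
FACES = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def decode(letter):
    """Return (face, suit_index) for one letter; suit_index is 0 through 3."""
    suit_index, face_index = divmod(LETTERS.index(letter), 13)
    return FACES[face_index], suit_index


def parse_line(line):
    """Split one line of `cards` output into (deck, count, remaining)."""
    fields = line.split()
    deck = fields[0]
    count = int(fields[1])
    # A won game has no remaining cards, so the third field is absent.
    remaining = fields[2] if len(fields) > 2 else ""
    return deck, count, remaining
```

For the tallying in this assignment, only the second value returned by `parse_line` is needed.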

The program takes a single optional argument, which is an integer that is used to seed the random-number generator. If you run the program with the same seed, you get the same results:

```
% /usr/local/bin/cards 42
xSYRiwgHCMpuWrkIDhKjnVXcvmObFdsLAoNQtaqlGeJTzBUEZfPy 12 xSYRiwgHCMpy
% /usr/local/bin/cards 42
xSYRiwgHCMpuWrkIDhKjnVXcvmObFdsLAoNQtaqlGeJTzBUEZfPy 12 xSYRiwgHCMpy
```

It is fine to use the cards program without the seed. By default, it initializes the random sequence using the computer’s built-in clock, which should have very fine-grained resolution. But, we can use the seed to make sure that we get a certain number of unique random-number sequences.

For fun, you can run this program a bunch of times from the shell (not Python!). The following command (just copy and paste it verbatim into the command-line prompt on `submit-368`) will run the card game 1000 times and just show the winning games:

`for i in {0..999}; do /usr/local/bin/cards; done | grep '  0 '`

Be kind to your submit machine! Do not run the program more than 1000 times in a row on the submit machine itself. For more runs, let’s use our execute machines!

This program is very fast, but we want to run it lots of times. Think of it this way: There are 52! ways to shuffle a card deck, which is something like 8×10⁶⁷. Even factoring out symmetric deals, there are still a lot of unique games. So we want to play 10–100 million games or so, to get a good feel for the overall statistics.
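If you want to check that number yourself, a couple of lines of Python will do it (the variable names are just for illustration):

```python
import math

orderings = math.factorial(52)   # number of distinct shuffles of 52 cards
print(f"{orderings:.2e}")        # prints 8.07e+67

games = 100_000_000              # the upper end of what we plan to play
print(f"{games / orderings:.1e}")  # fraction of all shuffles we would sample
```

Even 100 million games is a vanishingly small sample of all possible shuffles, which is why the results are estimates.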

### Part I: Design

Obviously, this is a case for a batching wrapper script. Consider the following requirements:

• The wrapper script should run the `cards` program for a given range of seed numbers. Across all runs, we want to start at 0 and go up to 9,999,999 or even 99,999,999. For the sake of the pool, it is best not to exceed 100 million total runs, and certainly do not let your seed integer go above 4,294,967,295. And in my tests, I found that 1 million runs per job is a good number — it takes about 20–30 minutes to run that many games from a single wrapper.

So, how will your wrapper script know which seed numbers to use? Think about the exact command-line argument(s) that each wrapper will take and what it will do with it/them.

• Think about the output: If we were to save all of the output, we would have a lot. The output averages around 70 bytes per run, which means that 100 million runs would produce 7 billion bytes (~6.5 GB). While that is certainly workable these days, we really do not need all of that output. In fact, all we need to know is the number of games that resulted in each possible outcome. That is, how many games did we win (0 remaining cards), how many had 2 remaining cards, 4, 6, and so on. (Yes, the number of remaining cards is always even. Think about it.) With this data from each job, we could combine the results to get overall counts from all 10–100 million runs (but you are not required to do so for the assignment).
• How will you design your submit file? Will you use DAGMan? How many nodes/jobs? Which files will you create by hand and which will you script? Remember that, ultimately, each job will run the wrapper script, which in turn will run the card program. Be sure to transfer the card program itself, because it is not installed on the execute machines:
`transfer_input_files = /usr/local/bin/cards`

And remember that the `cards` program will be in the same directory as your wrapper script on the execute machine, not in `/usr/local/bin` as it is on the submit machine.
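As one possible answer to these design questions (certainly not the only one), the sketch below splits the seeds into per-job ranges and scripts a submit file with one `queue` statement per job. Everything named here is an assumption for illustration: the wrapper name `wrapper.py`, its two arguments (first seed and number of runs), and the `tally-N.out` file names.

```python
MAX_SEED = 4_294_967_295  # largest seed allowed by the assignment


def seed_ranges(total_runs, runs_per_job):
    """Return (first_seed, count) pairs covering seeds 0 .. total_runs - 1."""
    if total_runs - 1 > MAX_SEED:
        raise ValueError("seed numbers would exceed the allowed maximum")
    return [
        (first, min(runs_per_job, total_runs - first))
        for first in range(0, total_runs, runs_per_job)
    ]


def submit_text(total_runs, runs_per_job, wrapper="wrapper.py"):
    """Build the text of an HTCondor submit file, one job per seed range."""
    lines = [
        "universe = vanilla",
        f"executable = {wrapper}",  # hypothetical wrapper script name
        "transfer_input_files = /usr/local/bin/cards",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "log = cards.log",
    ]
    for job, (first, count) in enumerate(seed_ranges(total_runs, runs_per_job)):
        lines += [
            f"arguments = {first} {count}",
            f"output = tally-{job}.out",
            f"error = tally-{job}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"
```

Calling `submit_text(100_000_000, 1_000_000)` would describe 100 jobs of 1 million games each. A simpler alternative is a single `queue 100` statement with `$(Process)` passed to the wrapper, which then computes its own seed range.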

### Part II: Implementation

Implementation should be fairly routine, by this point. I suggest building up features in this order:

1. Start your Python wrapper script with code that can run the card game once, with a fixed seed value, and collect and print its output. Make sure this works before going on.
2. Add the ability to run the card game more than once, one time for each seed number in a fixed range (say, 1 through 10).
3. Add one or more command-line arguments to specify the range of seed numbers to use, and make the main loop use those numbers. Again, on the submit machine, limit yourself to 10 to 100 seed numbers at a time.
4. Now, instead of printing the output, tally the number of games per number of remaining cards. Print out the tallies instead of the actual output from `cards`.
5. Your wrapper is done! Test it for a few different seed number ranges, make sure it is OK, and then you are ready to move on.
6. Write a single submit file that uses the wrapper to run `cards` maybe 1,000 to 10,000 times. Make sure you can get a single job with the wrapper to work correctly before scaling up.
7. Now you should be ready to implement the full set of jobs.
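To show roughly where steps 1 through 4 end up, here is a minimal sketch of such a wrapper. It assumes Python 3.7 or later, a `cards` executable in the current directory, two command-line arguments (first seed and number of runs), and the whitespace-separated output layout shown earlier. The function names are hypothetical, and your own design may reasonably differ.

```python
import subprocess
import sys
from collections import Counter


def run_game(seed, program="./cards"):
    """Run the card program once with the given seed; return its output line."""
    result = subprocess.run(
        [program, str(seed)], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def remaining_cards(line):
    """Extract the number of remaining cards (the second field) from a line."""
    return int(line.split()[1])


def tally_games(first_seed, count, run=run_game):
    """Play `count` games with consecutive seeds; tally games per outcome."""
    tallies = Counter()
    for seed in range(first_seed, first_seed + count):
        tallies[remaining_cards(run(seed))] += 1
    return tallies


def main(first_seed, count):
    # Print one "remaining games" pair per line, smallest remainder first.
    for remaining, games in sorted(tally_games(first_seed, count).items()):
        print(remaining, games)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(int(sys.argv[1]), int(sys.argv[2]))
    else:
        print("usage: wrapper.py FIRST_SEED COUNT", file=sys.stderr)
```

The `run` parameter of `tally_games` exists only so that the tallying logic can be tested with a stand-in function, without the real `cards` program.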

This is your last assignment, and so it combines lots of elements that we have done previously. In many ways, a wrapper script is nothing new (see, e.g., Homework #8 and Homework #10). Nor is running lots of jobs with precise arguments and output files and so forth. Nonetheless, start early and give yourself plenty of time to make a few mistakes before finishing! And keep in mind that the final run will take a little while — my final run of 100 million total card games took about 32 minutes, start to finish.

## Extra Challenges

Some ideas for extra learning:

• Use DAGMan to add an extra step or two at the end to combine all individual results into a single (short) data file, then graph the results using your favorite graphing tool (in this course, we have seen `gnuplot` and `R`).
• Also in the DAGMan realm, clean up unnecessary output files automatically. But, do not clean up files if there are any failures, because the files might be helpful to diagnose what went wrong.
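For the combining step, a sketch along the following lines might serve as a starting point. It assumes each job wrote its tallies as lines of the form `remaining games` (two integers separated by a space) into files matching `tally-*.out`; both the format and the file names are hypothetical.

```python
import glob
from collections import Counter


def read_tally(text):
    """Parse lines of 'remaining games' into a Counter."""
    tallies = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        remaining, games = line.split()
        tallies[int(remaining)] += int(games)
    return tallies


def combine(texts):
    """Add up the tallies from several jobs."""
    total = Counter()
    for text in texts:
        total.update(read_tally(text))  # Counter.update adds counts
    return total


def combine_files(pattern="tally-*.out"):
    """Read every file matching `pattern` and combine the tallies."""
    texts = []
    for name in sorted(glob.glob(pattern)):
        with open(name) as handle:
            texts.append(handle.read())
    return combine(texts)
```

Printing the combined counts one pair per line gives a short data file that `gnuplot` or `R` can read directly.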

## Reminders

Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.

## Hand In

All homework assignments must be turned in by email! See the email page for more details about formatting, etc.

For this assignment, submit your Python script and any required support files (templates, HTCondor submit files, whatever).