
CS 368-3 (2012 Summer) — Day 12 Homework

Due Monday, July 30, at the start of class.

Weather Forecast Analysis, Part I

The purpose of this three-part project is to compare weather forecasts against actual weather observations. In this first part, the goal is to save information about today’s weather forecast. After running this script several days in a row, we will have forecast data against which we can compare actual data. So, this first script will download today’s weather forecast, extract key data, and save them in a simple text format.

Tasks

A text weather forecast for Madison is posted at the following URL every day, sometimes several times a day:

http://www.ssec.wisc.edu/cgi-bin/lc?mad_for

The forecast generally covers the next several days, but we are interested only in the forecast for today and tonight. Here is a sample fragment of the actual HTML that includes the forecast for one day:

<TD>
<H1>Madison Forecast</H1>
Local Madison Forecast
300 AM CDT THU JUL 29 2010
<br><br><font size=+1><B>TODAY...</B></font>SUNNY. HIGHS IN THE LOWER 80S. NORTHWEST WINDS UP TO
5 MPH.
<br><br><font size=+1><B>TONIGHT...</B></font>PARTLY CLOUDY. LOWS AROUND 60. NORTHWEST WINDS UP TO
5 MPH THROUGH AROUND MIDNIGHT BECOMING CALM.

Note: The sample above does not include most of the actual HTML page downloaded from the URL above. There are many more lines in the real file!

The goal for today is to parse enough of the downloaded forecast to get the forecast time as a Unix timestamp, plus the forecast text for the predicted high and low temperatures. Then, we save the data for later analysis.
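To make the parsing step concrete, here is a minimal sketch in Python, used purely for illustration; do the assignment itself in the language of the starter script. The function names and regular expressions are assumptions of this sketch, written against the sample fragment above only. The real page may vary in ways the sample does not show.

```python
import re

# Timestamp line such as "300 AM CDT THU JUL 29 2010" on a line by itself.
TIMESTAMP_RE = re.compile(
    r"^\s*(\d{3,4} [AP]M [A-Z]{3} [A-Z]{3} [A-Z]{3} \d{1,2} \d{4})\s*$",
    re.MULTILINE,
)

def section_text(html, label):
    """Return the text after the bold label, up to the next <br> or end of input."""
    pattern = r"<B>" + re.escape(label) + r"\.\.\.</B></font>(.*?)(?=<br>|\Z)"
    match = re.search(pattern, html, re.DOTALL | re.IGNORECASE)
    # Collapse line breaks so a sentence split across lines reads as one.
    return " ".join(match.group(1).split()) if match else None

def temperature_phrase(text, keyword):
    """Return the words after HIGHS/LOWS (minus "IN THE"), up to the period."""
    if text is None:
        return None
    match = re.search(r"\b" + keyword + r"S?\s+(?:IN THE\s+)?([^.]+)\.", text)
    return match.group(1) if match else None

def extract_forecast(html):
    """Return (timestamp_text, high_text, low_text); any item may be None."""
    stamp = TIMESTAMP_RE.search(html)
    high = temperature_phrase(section_text(html, "TODAY"), "HIGH")
    low = temperature_phrase(section_text(html, "TONIGHT"), "LOW")
    return (stamp.group(1) if stamp else None, high, low)

SAMPLE = """<TD>
<H1>Madison Forecast</H1>
Local Madison Forecast
300 AM CDT THU JUL 29 2010
<br><br><font size=+1><B>TODAY...</B></font>SUNNY. HIGHS IN THE LOWER 80S. NORTHWEST WINDS UP TO
5 MPH.
<br><br><font size=+1><B>TONIGHT...</B></font>PARTLY CLOUDY. LOWS AROUND 60. NORTHWEST WINDS UP TO
5 MPH THROUGH AROUND MIDNIGHT BECOMING CALM.
"""

print(extract_forecast(SAMPLE))
# → ('300 AM CDT THU JUL 29 2010', 'LOWER 80S', 'AROUND 60')
```

Isolating each section before looking for the temperature keeps a missing HIGHS phrase under TODAY from accidentally matching text that belongs to a later day.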

Required Work

If you like, you need only implement two subroutines in the larger script. That is, I have written the rest of the script (available here), and you can just fill in two critical parts related to today’s lecture. BUT please consider doing the whole script yourself!!! I am much more willing to overlook minor flaws if you do the whole script; if you start with mine and make many mistakes, you risk not getting full credit.

Subroutine #1: convert_timestamp. This function converts a timestamp, in the text format used in the forecasts, to a standard Unix (integer) timestamp. As described in the comment before the function itself, the function should return the Unix timestamp; if something goes wrong, it should return undef. For example:

convert_timestamp('1034 AM CDT THU JUL 29 2010') => 1280417640

For testing and validation, you can use the Epoch Converter website to convert a regular date and time to a Unix timestamp. Beware: The main conversion form on that website expects times in GMT (UTC), not U.S. Central Daylight Time. So be sure to enter your times correctly adjusted for UTC, or use the later form that accepts an RFC 2822 formatted date and specify “CDT” as the timezone!
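The conversion logic can be sketched as follows. This is in Python for illustration only and is not the starter script's code. It returns None where the assignment asks for undef. The zone table is an assumption of this sketch: it covers only CDT (UTC-5) and CST (UTC-6), and the weekday field is matched but never validated.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

# Assumption: Madison forecasts only ever use U.S. Central time.
UTC_OFFSET_HOURS = {"CDT": -5, "CST": -6}

# e.g. "1034 AM CDT THU JUL 29 2010"; the weekday is matched but not checked.
FORECAST_TIME_RE = re.compile(
    r"^(\d{1,2})(\d{2}) (AM|PM) ([A-Z]{3}) [A-Z]{3} ([A-Z]{3}) (\d{1,2}) (\d{4})$")

def convert_timestamp(text):
    """Return the Unix timestamp for a forecast time string, or None on failure."""
    match = FORECAST_TIME_RE.match(text.strip())
    if not match:
        return None
    hour, minute, ampm, zone, month, day, year = match.groups()
    if month not in MONTHS or zone not in UTC_OFFSET_HOURS or not 1 <= int(hour) <= 12:
        return None
    # 12 AM is hour 0 and 12 PM is hour 12 on a 24-hour clock.
    hour24 = int(hour) % 12 + (12 if ampm == "PM" else 0)
    tz = timezone(timedelta(hours=UTC_OFFSET_HOURS[zone]))
    try:
        moment = datetime(int(year), MONTHS[month], int(day), hour24, int(minute), tzinfo=tz)
    except ValueError:  # e.g. day 31 in a 30-day month, or minute 60
        return None
    return int(moment.timestamp())

print(convert_timestamp("1034 AM CDT THU JUL 29 2010"))  # → 1280417640
```

As a hand check: 10:34 AM CDT is 15:34 UTC, which is the time to enter in a form that expects UTC.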

Subroutine #2: write_file. This subroutine safely writes the given string contents to the given file. It returns true upon success or false otherwise. So basically, this is an extension of the pattern shown in class. However, to get full credit, you must use the tempfile() function to create and open your temporary output file, then rename the temporary file to its final name as shown. Also, do a good job of checking for errors throughout the subroutine.
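The write-to-a-temporary-file-then-rename pattern looks roughly like this. It is a Python sketch for illustration, using Python's tempfile.mkstemp in place of the tempfile() function the assignment requires. Creating the temporary file in the destination directory is a choice of this sketch, made so that the rename stays on one filesystem.

```python
import os
import tempfile

def write_file(path, contents):
    """Write contents to path via a temporary file; return True on success."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = None
    try:
        # Create and open the temporary file next to its final location.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(contents)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        # Readers see either the old file or the complete new one, never a partial write.
        os.replace(tmp_path, path)
        return True
    except OSError:
        # Any failure (create, write, sync, rename): remove the leftover temp file.
        if tmp_path is not None:
            try:
                os.unlink(tmp_path)
            except OSError:
                pass
        return False
```

Every step that can fail sits inside the try block, so the caller gets a single true/false answer and no stray temporary files.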

Note: The starter script includes the --test option and a few test cases; if you write your own script, you should add the unit testing pattern, too. Be sure your convert_timestamp subroutine works with the given tests. Consider adding more (see below).
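For those writing their own script, one possible shape for a --test option is a table of cases plus a small runner. The sketch below is in Python for illustration and is not the starter script's pattern. The helper to_24_hour is a hypothetical stand-in for whatever subroutine you are testing.

```python
import sys

def to_24_hour(hour, ampm):
    """Hypothetical helper: convert a 12-hour clock hour to 0-23, or None if invalid."""
    if ampm not in ("AM", "PM") or not 1 <= hour <= 12:
        return None
    return hour % 12 + (12 if ampm == "PM" else 0)

# Each case is (arguments, expected result).
TEST_CASES = [
    ((10, "AM"), 10),
    ((12, "AM"), 0),
    ((12, "PM"), 12),
    ((3, "PM"), 15),
    ((13, "PM"), None),
]

def run_tests(func, cases):
    """Run every case, report failures, and return the number that failed."""
    failures = 0
    for args, expected in cases:
        actual = func(*args)
        if actual != expected:
            failures += 1
            print(f"FAIL {func.__name__}{args}: expected {expected!r}, got {actual!r}")
    print(f"{len(cases) - failures} of {len(cases)} tests passed")
    return failures

def main(argv):
    """Return a process exit status: nonzero if --test was given and a test failed."""
    if "--test" in argv[1:]:
        return 1 if run_tests(to_24_hour, TEST_CASES) else 0
    # Normal operation (download, parse, save) would go here.
    return 0

# A script would end with: sys.exit(main(sys.argv))
```

Keeping the cases in a table makes it cheap to add one more line each time the live data breaks something.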

Optional Work

If you want to make the assignment a bit more challenging, simply delete the contents of any other subroutines in the starter script, and try writing them yourself. Use the unit tests to make sure you get back to success.

But again, if you feel up to it (and I think many of you are!), try to write the entire script on your own. If you do, think about what the entire script needs to do. It should download and parse the weather forecast, and write a text file with the extracted data. The file should be named:

wx-YYYY-MM-DD.txt

Here, YYYY is the year, MM the month, and DD the day of the forecast timestamp in the downloaded data.

The file consists of a single line of text with four data fields, separated by tab characters. For example:

2011-07-29\t06:47\tUPPER 80S\tLOWER 60S\n

The date and time fields are from the forecast timestamp, and the high and low predictions are from the “TODAY” and “TONIGHT” sections of the forecast text.
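Building the filename and the output line from the parsed values might look like this. It is a Python sketch for illustration, and the function name is hypothetical. It assumes the date and time should be shown in the forecast's own time zone, taken here as CDT (UTC-5) by default.

```python
from datetime import datetime, timedelta, timezone

def forecast_record(unix_time, high, low, utc_offset_hours=-5):
    """Return (filename, line) for one forecast, using the forecast's local time."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.fromtimestamp(unix_time, zone)
    filename = local.strftime("wx-%Y-%m-%d.txt")
    # Four tab-separated fields followed by a newline.
    fields = [local.strftime("%Y-%m-%d"), local.strftime("%H:%M"), high, low]
    return filename, "\t".join(fields) + "\n"

name, line = forecast_record(1280417640, "LOWER 80S", "AROUND 60")
print(name)        # → wx-2010-07-29.txt
print(repr(line))  # → '2010-07-29\t10:34\tLOWER 80S\tAROUND 60\n'
```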

Also, you could add more unit tests to the script. Try lots of different cases. And anytime that you find something in the real, live downloaded data that breaks your script, reproduce the failure in a unit test first (before fixing the bug).

Reminders

Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.

Hand In

A printout of your code, ideally on a single sheet of paper. Be sure to put your own name in the initial comment block. Identifying your work is important, or you may not receive appropriate credit.