CS 368-1 (2011 Summer) — Day 8 Homework

Due Monday, July 25, at the start of class.

Description

Practice writing regular expressions.

Details

You are not writing a script this time! Instead, I provide the script (same one we used in class):

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', $ARGV[0])
    or die "Could not open file: $!\n";

while (<INPUT>) {
    print if /cat/;
}

close INPUT;

Your assignment is to write regular expressions for the following patterns. Use the script to test your expressions. To help with testing, here are the two data files that I used in class:

For some of the patterns, you may wish to create your own input file(s). That is fine.
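For quick experiments, editing the regular expression inside the script each time can get tedious. The following is only a sketch of one possible variant, not the script used in class: it takes the pattern as a command-line argument. The helper name matching_lines and the file name try_pattern.pl are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines read from $fh that match $regex.
sub matching_lines {
    my ($regex, $fh) = @_;
    my @matches;
    while (my $line = <$fh>) {
        push @matches, $line if $line =~ $regex;
    }
    return @matches;
}

# Assumed usage: perl try_pattern.pl 'PATTERN' FILE
if (@ARGV == 2) {
    my ($pattern, $file) = @ARGV;
    my $regex = qr/$pattern/;   # dies here if the pattern is malformed
    open(my $input, '<', $file)
        or die "Could not open file: $!\n";
    print matching_lines($regex, $input);
    close $input;
}
```

Quote the pattern in single quotes on the command line so that the shell does not interpret characters such as $ or *.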

Most of the patterns below are narrowly defined, but some permit more freedom of interpretation. If you think a pattern is not clear, your answer must include a description of how you interpreted it; include example matches and non-matches to support your interpretation. If your pattern does not match my interpretation, or another fairly obvious one, it will not count.

Patterns

You must get at least 7 of the 10 patterns below correct in order to get full credit for the assignment. Give yourself time, and be sure to check both matches and possible non-matches!
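One way to check both matches and non-matches is a small test harness. The sketch below uses the in-class /cat/ pattern as a stand-in; the helper name check_pattern and the sample strings are hypothetical and not part of the assignment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of strings on which the pattern
# behaves unexpectedly (should match but does not, or the reverse).
sub check_pattern {
    my ($regex, $should_match, $should_not_match) = @_;
    my @failures;
    for my $text (@$should_match) {
        push @failures, "missed: $text" unless $text =~ $regex;
    }
    for my $text (@$should_not_match) {
        push @failures, "false match: $text" if $text =~ $regex;
    }
    return @failures;
}

my @problems = check_pattern(
    qr/cat/,
    [ 'cat', 'concatenate' ],   # strings the pattern should match
    [ 'dog', 'Cat' ],           # strings the pattern should not match
);
print @problems ? join("\n", @problems) . "\n" : "all checks passed\n";
```

Listing the expected non-matches explicitly makes it easier to notice a pattern that is too permissive.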

Think carefully about whether letter case matters in each pattern!
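As a reminder of how letter case affects matching, here is a short sketch that again uses the in-class /cat/ pattern; the sample strings are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ('cat', 'Cat', 'CATALOG', 'dog');

my @sensitive   = grep { /cat/ }  @lines;   # default: case-sensitive
my @insensitive = grep { /cat/i } @lines;   # /i modifier ignores case

print "sensitive: @sensitive\n";       # prints "sensitive: cat"
print "insensitive: @insensitive\n";   # prints "insensitive: cat Cat CATALOG"
```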

Some patterns are prefixed with “[words]” to indicate that the pattern should work on the words data file, and some are prefixed with “[Henry]” to indicate that the pattern should work on the King Henry V data file. Other patterns are for other data.

  1. [words] I am trying to solve a crossword puzzle clue; it contains exactly five letters, the first letter is “A” and the middle (3rd) letter is “P”. What are the possible words that could go there?
  2. [words] Find all single words, of any length, that contain exactly one vowel letter (a, e, i, o, u).
  3. [words] Match the word “hello” and nothing else.
  4. [words] Find all words that begin with the letter “o” and end with the letter “n”; case does NOT matter.
  5. [words] Find all words that contain “time”.
  6. [Henry] Find the first line of each part spoken by the Chorus. Look at the text carefully to figure out how to identify such lines.
  7. [Henry] Find every line where “France” is the last word of a sentence; sentences end with “.” in this text.
  8. [Henry] I am looking for the famous line from this play, but all I can remember is that it contains the words “band” and “brothers”. Please help me find that line!
  9. Search through a Perl script (use one of your own) and print out all lines that are comments. If a line contains real code and then a comment, do not print it — just whole lines that are a comment and nothing else. Hint: Do not forget about indented lines.
  10. Match valid 10-digit North American Numbering Plan telephone numbers. [You knew this was coming, didn’t you?] Do the best you can, and clearly state your assumptions.

Reminders

Do the work yourself, consulting reasonable reference materials as needed; any reference material that gives you a complete or nearly complete solution to this problem or a similar one is not OK to use. Asking the instructors for help is OK; asking other students for help is not.

Hand In

A printout of your regular expressions, clearly labeled, on a single sheet of paper. Provide any necessary qualifications for your expressions, including example matches and non-matches. Be sure to put your own name in the initial comment block of the code. Identifying your work is important, or you may not receive appropriate credit.