CS 368-3 (2012 Summer) — Day 15 Homework

Due Thursday, August 9, at the start of class.

Goal

I have written a script. There is a chance that it contains security, performance, or correctness issues. Oh, let’s be realistic… it is full of horrible problems! Your job is to look at the script and fix every problem that you find. Consider this your final exam: there is a little bit of everything in here.

Tasks

The script is included below. It is another word frequency script, similar to the first part of the homework from Day 4 — it tallies the frequencies of words found in a file and writes to a new file the ones with frequency greater than 5.

#!/usr/bin/perl

use Text::Wrap qw/wrap/;
$Text::Wrap::columns = 80;

my $filename = $ARGV[0];
my $safename = $filename;
$safename =~ s/^[<> ]+//g;    # Make filename safe

open(my $fh, $filename);
@lines = <$fh>;
@wordlist = ();
foreach (@lines) { push @wordlist, $_; }

my @words;
foreach $line (@wordlist) {
    chomp($line);
      $line = lc($line);
        my $found = 0;
    foreach my $word_ref (@words) {
    if ($word_ref->[0] eq $line) {
        $found = 1;
        $word_ref->[1]++;
    }
    }
    if (not $found) {
        push(@words, [$line, 0]);
    }
}

my ($s, $m, $h, $mday, $mon, $year) = localtime();
my $outname = sprintf('%04d-%02d-%02d-%s', $year + 1900, $mon + 1, $mday, $filename);

open(my $out, ">$outname.new");
foreach my $word_ref (sort {$a->[0] cmp $b->[0]} @words) {
    my $freq = $word_ref->[1];
    next if $freq < 5;
        printf $out "%4d %s\n", $freq, $word_ref->[0];
}

# Safe file write
system("mv $outname $outname.bak");
system("mv $outname.new $outname");

print "Done!\n";
exit 0;

Download the input file here. Run the script like this:

perl homework-15-start.pl homework-15-input.txt

There is no output. Instead, the script writes to an output file (see the script for details).

Analyze the script and rewrite it to fix ALL of the problems that you find. Make it as good as you know how. Change whatever you need to, as long as the script still performs the same task (only better). Turn in your rewritten script.

Optional Extra

Prove that you made the script better. Different fixes might require different kinds of evidence.
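
One possible way to gather such evidence is sketched here, using the core Benchmark module. The two subroutines are hypothetical stand-ins for a "before" and an "after" version of some piece of code, not part of the homework script. The sketch first checks that both versions return the same answer, then compares their speed. Treat it as an illustration of the idea, not as the expected approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical "before" version: counts distinct items by scanning a list.
sub distinct_slow {
    my @seen;
    ITEM: foreach my $item (@_) {
        foreach my $s (@seen) {
            next ITEM if $s eq $item;    # already seen, skip to next item
        }
        push @seen, $item;
    }
    return scalar @seen;
}

# Hypothetical "after" version: counts distinct items with a hash.
sub distinct_fast {
    my %seen;
    $seen{$_} = 1 foreach @_;
    return scalar keys %seen;
}

# Made-up test data: 5000 values drawn from 500 distinct ones.
my @data = map { $_ % 500 } 1 .. 5000;

# Correctness evidence: timings mean nothing unless both versions agree.
die "Results differ\n" unless distinct_slow(@data) == distinct_fast(@data);

# Performance evidence: run each version for at least 1 CPU second
# and print a table comparing their rates.
cmpthese(-1, {
    slow => sub { distinct_slow(@data) },
    fast => sub { distinct_fast(@data) },
});
```

Other kinds of fixes call for other evidence. For a correctness fix, for example, a small input file whose expected output you have worked out by hand may be more convincing than any timing.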

Reminders

Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.

Hand In

A printout of your code, ideally on a single sheet of paper. Be sure to put your own name in the initial comment block. Identifying your work is important, or you may not receive appropriate credit.