# CS 368-2 (2012 Spring) — Day 3 Homework

Due Thursday, March 22, at the start of class.

## Goal

Write a Python script that gathers input data and calculates statistics.

Use the Python that you have learned to write a simple data analysis tool.

The script will ask the user for some data observations, each of which consists of one text label and its associated numeric value. For example, I entered U.S. states (labels) and their 2010 populations (values). After some input, the script might have the following data (populations are in thousands):

| Label | Value |
|-------|-------|
| Wisconsin | 5687 |
| Illinois | 12831 |
| Michigan | 9884 |
| Iowa | 3046 |

The script should cycle through the following steps:

1. Display a list of the current labels and values
2. Display some summary statistics about the data
3. Ask the user for another item and then the corresponding value
4. Store the label and value
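As a rough illustration of step 2, the summary statistics can be computed from the stored values with a few built-in functions. This is only a sketch, not the required design: the function name `summarize` is made up for this example, and it assumes the values are kept somewhere you can loop over (here, the values of a dictionary). The `print` call is written for Python 3.

```python
def summarize(values):
    """Return (count, total, mean) for a sequence of numbers.

    The mean of an empty sequence is reported as None, since there is
    nothing to average.
    """
    values = [float(v) for v in values]  # floats avoid integer division
    count = len(values)
    total = sum(values)
    mean = total / count if count else None
    return count, total, mean


# Example data taken from the table above (populations in thousands).
observations = {"Wisconsin": 5687, "Illinois": 12831}
count, total, mean = summarize(observations.values())
print(count, total, mean)
```

Returning `None` for the mean of no data is one possible choice; the script could just as well skip the statistics entirely until there is at least one observation.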

A typical interaction might look like the following. The exact formatting is not required. In this sample interaction, the text following each "Label?" and "Value?" prompt is what the user typed.

```
OBSERVATIONS
<none>

Label? Wisconsin
Value? 5687

----------------------------------------

OBSERVATIONS
Wisconsin: 5687.00

STATISTICS
Count:       1
Sum:      5687.0
Mean:     5687.0

Label? Illinois
Value? 12831

----------------------------------------

OBSERVATIONS
Illinois: 12831.00
Wisconsin: 5687.00

STATISTICS
Count:       2
Sum:     18518.0
Mean:     9259.0

Label?
```
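One way to produce output similar to the sample is to build the report as a list of lines using `str.format` with field widths. The sketch below is an assumption about how you might organize it: the name `format_report` is hypothetical, the observations are assumed to be in a dictionary, and sorting the labels is just one way to get a predictable order.

```python
def format_report(observations):
    """Build the OBSERVATIONS / STATISTICS text for a dict of label -> value."""
    lines = ["OBSERVATIONS"]
    if not observations:
        lines.append("<none>")
        return "\n".join(lines)

    # Sorting the labels gives a predictable display order.
    for label in sorted(observations):
        lines.append("{0}: {1:.2f}".format(label, float(observations[label])))

    values = [float(v) for v in observations.values()]
    count = len(values)
    total = sum(values)

    lines.append("")
    lines.append("STATISTICS")
    # The numbers after the colon in each field are minimum widths,
    # which right-align the values into a column.
    lines.append("Count: {0:7d}".format(count))
    lines.append("Sum:   {0:9.1f}".format(total))
    lines.append("Mean:  {0:9.1f}".format(total / count))
    return "\n".join(lines)


print(format_report({"Wisconsin": 5687, "Illinois": 12831}))
```

Returning a string instead of printing inside the function makes the formatting easy to check on its own, but printing directly would work too.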

How will the user quit the script? Maybe if the label input is the empty string? Maybe a special label or value that tells the script to quit? Pick something and implement it.
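Here is a minimal sketch of the first option, quitting on an empty label. It is one possibility among several, and it leaves out the display steps to stay short. The function name `collect` and its `read_line` parameter are made up for this example; passing the reader in as an argument lets the loop be tried with scripted answers instead of a live user. For real use you would pass the built-in `input` (Python 3) or `raw_input` (Python 2).

```python
def collect(read_line):
    """Gather observations until the user enters an empty label.

    read_line is any function that takes a prompt string and returns
    the text the user typed.
    """
    observations = {}
    while True:
        label = read_line("Label? ").strip()
        if label == "":
            break  # an empty label is the signal to quit
        value = float(read_line("Value? "))
        observations[label] = value
    return observations


# Scripted answers stand in for a real user here.
answers = iter(["Wisconsin", "5687", "Illinois", "12831", ""])
data = collect(lambda prompt: next(answers))
print(sorted(data.items()))
```

Note that this sketch does not handle a value that is not a number; `float` would raise a `ValueError` in that case.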

The program should not attempt to save its state. That is, when you quit the program and run it again later, it will start with no observations.

Tip: If the user enters the same item a second time, the new value should replace the original one; that is, you do not need to check whether an item already exists, just store the data.

Tip: If the user tries to enter the same item more than once, but spells it differently, it will end up being a separate item. That is, your script does not need to be clever or fancy about item names — just accept what the user types.
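If the observations are kept in a dictionary (an assumption; the assignment does not require it), both tips come for free, because assigning to an existing key replaces its value and keys that are spelled differently are simply different keys. The second Iowa value below is made up for illustration.

```python
observations = {}

observations["Iowa"] = 3046
observations["Iowa"] = 3050   # same label: the new value replaces the old one
observations["iowa"] = 3046   # different spelling: stored as a separate item

print(len(observations))
print(observations["Iowa"])
```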

## Extra Challenges

If the requirements above were easy, try one or more of the following challenges. No extra credit, just extra learning!

1. Calculate and display additional statistics: minimum value, maximum value, standard deviation from the mean, and so on.
2. Add the ability to delete items from the observations. How does the user indicate that an item should be deleted? Maybe just a blank value? What if that item is not in the data already?
3. [Hard:] Display a simple text histogram for your data. It is easiest to use a horizontal bar chart, perhaps something like this:
   ```
   RANGE    FREQUENCY
   -------  ---------+---------+---------+
    0M- 5M  XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    5M-10M  XXXXXXXXXXXXXXX
   10M-15M  XXX
   15M-20M  XX
   20M-25M
   25M-30M  X
   30M-35M
   35M-40M  X
   ```
4. [Hard:] How could you display the observations in the order that the user entered them (assuming that it does not do so already)?
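For the histogram challenge, one possible approach is to count how many values fall into each fixed-width bin and then repeat a character that many times. The sketch below is simplified on purpose: the function name `text_histogram` and its parameters are hypothetical, the bin width and number of bins are supplied by the caller, the range labels are plain numbers instead of the "5M" style shown above, and the header rows are omitted.

```python
def text_histogram(values, bin_width, bin_count):
    """Return the lines of a horizontal bar chart, one X per observation.

    Bins are [0, w), [w, 2w), and so on; values outside the covered
    range are simply left out of the chart.
    """
    counts = [0] * bin_count
    for value in values:
        index = int(value // bin_width)  # which bin this value belongs to
        if 0 <= index < bin_count:
            counts[index] += 1

    lines = []
    for i, count in enumerate(counts):
        low = i * bin_width
        high = low + bin_width
        label = "{0:>5}-{1:<5}".format(low, high)
        # rstrip removes trailing spaces on rows whose bin is empty
        lines.append((label + "  " + "X" * count).rstrip())
    return lines


# Populations in thousands, from the example table earlier.
for line in text_histogram([5687, 12831, 9884, 3046], 5000, 3):
    print(line)
```

Using one `X` per observation works for small data sets; for larger ones you would probably want to scale the bars so that they fit on one line.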

## Reminders

Start your script the right way! Here is a suggestion:

```
#!/usr/bin/env python

"""Homework for CS 368-2 (2012 Spring)
Assigned on Day 03, 2012-03-20