CS537 Spring 2020, Project 1a

Updates

Administrivia

Unix utilities

In this assignment, you will build a set of Linux utilities that are much simpler versions of commonly used commands such as ls, cat, and grep. We say simpler because, to put it mildly, the originals are quite complicated! To avoid confusion, we will give each utility a slightly different name: wis-grep, wis-tar, and wis-untar.

Learning Objectives:

Summary of what gets turned in:

Before beginning: Read this lab tutorial; it has some useful tips for programming in the C environment. Also read the hints document to help you get started.

wis-grep

The first utility you will build is called wis-grep, a variant of the UNIX tool grep. This tool looks through a file, line by line, trying to find a user-specified search term in each line. If a line contains the search term, the line is printed; otherwise it is not.

Here is how a user would look for the term foo in the file bar.txt:

	prompt> ./wis-grep foo bar.txt
	this line has foo in it
	so does this foolish line; do you see where?
	even this line, which has barfood in it, will be printed.

To get started, read about how to open a file and read from it in the hints document.
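
The matching rule described above can be sketched in a few lines of C. This is only a minimal illustration, not the required implementation: the helper name grep_stream is hypothetical, and it assumes the POSIX getline() function is available so that lines of any length can be read.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: needed for getline() on strict compilers */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy every line of `in` that contains `term`
 * to `out`. Returns the number of matching lines. */
int grep_stream(FILE *in, const char *term, FILE *out)
{
    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int matches = 0;

    while (getline(&line, &cap, in) != -1) {
        /* strstr() looks for a substring, so "foo" also matches
         * "foolish" and "barfood" */
        if (strstr(line, term) != NULL) {
            fputs(line, out);
            matches++;
        }
    }
    free(line);
    return matches;
}
```

A main function could open the file named on the command line with fopen, then call grep_stream with the search term and stdout.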

Details

wis-tar and wis-untar

The next two utilities you will build are simpler versions of tar and untar, which are commonly used UNIX utilities that combine a collection of files into one file (or expand one file back into a collection of files). This functionality is useful in a number of scenarios, e.g., offering a single file to download for a piece of software. (If you’ve heard the phrase tarball, that comes from using tar!)

The input to your wis-tar program will be the name of the tar file followed by a list of files to archive. (Fun fact: the name tar comes from tape archive!) Example:

  prompt> echo abcd > a.txt # creates the file a.txt
  prompt> echo efgh > b.txt # creates the file b.txt
  prompt> ./wis-tar test.tar a.txt b.txt

wis-tar format

For the purpose of this assignment we will use a simple file format for our tar file. Our format will be

  file1 name [100 bytes in ASCII] 
  file1 size [8 bytes as binary]
  contents of file1 [in ASCII]
  file2 name [100 bytes]
  file2 size [8 bytes]
  contents of file2 [in ASCII]
  ...

Here are a few points that will help you with your implementation

  1. You can assume that the files provided as an input exist in the directory where the program is run from (no need to handle ways to store pesky path names).
  2. You can also assume the files provided as inputs only contain ASCII characters.
  3. You can read more about how to find the size of a file in the hints document.
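
To make the format concrete, here is a rough sketch of writing one entry to an archive. The helper name append_entry and the constant NAME_FIELD are hypothetical. The sketch assumes that stat() is an acceptable way to find the file size, that names longer than 100 bytes are simply truncated, and that the 8-byte size is written in the machine's native byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define NAME_FIELD 100  /* bytes reserved for the file name */

/* Hypothetical helper: append one entry (name, size, contents) for
 * `path` to an already-open archive. Returns 0 on success, -1 on error. */
int append_entry(FILE *archive, const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    char name[NAME_FIELD] = {0};  /* zero-filled, so the padding is \0 */
    size_t len = strlen(path);
    if (len > NAME_FIELD)
        len = NAME_FIELD;         /* assumption: long names are truncated */
    memcpy(name, path, len);

    uint64_t size = (uint64_t)st.st_size;  /* 8 bytes, native byte order */

    int ok = fwrite(name, 1, NAME_FIELD, archive) == NAME_FIELD &&
             fwrite(&size, sizeof size, 1, archive) == 1;

    /* Copy the file contents right after the size field. */
    char buf[4096];
    size_t n;
    while (ok && (n = fread(buf, 1, sizeof buf, in)) > 0)
        ok = fwrite(buf, 1, n, archive) == n;

    fclose(in);
    return ok ? 0 : -1;
}
```

A main function could open the archive named by its first argument with fopen in "wb" mode and call append_entry once for each remaining argument.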

Details

EXAMPLE: Let's look at a complete example to make sure we understand the format and how to read the contents of a valid tar file. In the following example we first create a text file that contains the string hey, so its size is 3 bytes (remember this!).

Next we run wis-tar to create a.tar as shown below. Finally we print the contents of a.tar using hexdump, a utility that prints the contents of a binary file. The comments to the right explain the output of hexdump. Remember that the bytes are shown in hexadecimal, so a handy table like this will help you look up the ASCII values for strings. Try to see if you can decode the contents based on the comments on the right!

➜  p1a cat a.txt
hey%
➜  p1a ./wis-tar a.tar a.txt
➜  p1a hexdump -v a.tar
0000000 2e61 7874 0074 0000 0000 0000 0000 0000  --> The first five bytes here contain the file name a.txt
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
0000030 0000 0000 0000 0000 0000 0000 0000 0000
0000040 0000 0000 0000 0000 0000 0000 0000 0000
0000050 0000 0000 0000 0000 0000 0000 0000 0000  ---> We have padded using \0 till we hit 100 bytes
0000060 0000 0000 0003 0000 0000 0000 6568 0079  ---> The byte containing 03 indicates the file size is 3. Note that this is not in ASCII!
----------------------------------------------------> The last three bytes contain the string hey, the contents of the file

As you can see from the above description, the fields in our tar file are packed together, and there are no newlines or other separators between them.
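
In the dump above, the name occupies offsets 0x00 to 0x63, the size occupies 0x64 to 0x6b, and the contents start at 0x6c. Note that hexdump's default output groups bytes into 16-bit words in the machine's byte order, which is why the bytes 61 2e ("a.") appear as 2e61 on a little-endian machine. As a sketch of how the same decoding might look in C, the hypothetical helper decode_header below splits a raw 108-byte header into its two fields. It assumes the size was stored in the machine's native byte order.

```c
#include <stdint.h>
#include <string.h>

#define NAME_FIELD 100  /* bytes reserved for the file name */
#define SIZE_FIELD 8    /* bytes reserved for the file size */

/* Hypothetical helper: split a raw header (NAME_FIELD + SIZE_FIELD
 * bytes) into a NUL-terminated name and a numeric size. */
void decode_header(const unsigned char *hdr,
                   char name[NAME_FIELD + 1], uint64_t *size)
{
    memcpy(name, hdr, NAME_FIELD);
    name[NAME_FIELD] = '\0';  /* a full 100-byte name has no terminator */

    /* The size field starts immediately after the name, with no
     * separator; copy it out in native byte order. */
    memcpy(size, hdr + NAME_FIELD, SIZE_FIELD);
}
```

Using memcpy for the size avoids reading a uint64_t through a pointer that may not be suitably aligned.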

wis-untar

As the name suggests, the wis-untar program does the reverse of the wis-tar program. It takes the name of an archive created with wis-tar and recreates the archived files in the directory where the program is run. For example:

  prompt> ./wis-untar test.tar
  prompt> ls # should contain a.txt and b.txt
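
One possible shape for the extraction loop is sketched below. The helper name extract_all is hypothetical, and the sketch assumes the archive follows the wis-tar format exactly, with sizes stored in the machine's native byte order. It treats a partial header as an error, which is an assumption rather than a stated requirement.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NAME_FIELD 100  /* bytes reserved for the file name */

/* Hypothetical helper: recreate every file stored in `archive`.
 * Returns the number of files written, or -1 if the archive is
 * truncated or a file cannot be created. */
int extract_all(FILE *archive)
{
    char name[NAME_FIELD + 1];
    int count = 0;

    for (;;) {
        size_t got = fread(name, 1, NAME_FIELD, archive);
        if (got == 0)
            break;                 /* nothing left to read */
        if (got != NAME_FIELD)
            return -1;             /* truncated name field */
        name[NAME_FIELD] = '\0';

        uint64_t remaining;        /* bytes of content still to copy */
        if (fread(&remaining, sizeof remaining, 1, archive) != 1)
            return -1;

        FILE *out = fopen(name, "wb");
        if (out == NULL)
            return -1;

        char buf[4096];
        while (remaining > 0) {
            size_t want = remaining < sizeof buf ? (size_t)remaining
                                                 : sizeof buf;
            size_t n = fread(buf, 1, want, archive);
            if (n == 0 || fwrite(buf, 1, n, out) != n) {
                fclose(out);
                return -1;
            }
            remaining -= n;
        }
        fclose(out);
        count++;
    }
    return count;
}
```

Because the size field says exactly how many content bytes follow, the loop never needs a separator to find the start of the next entry.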

Details

Acknowledgments

The assignment borrows content from assignment 1 of Prof. Remzi Arpaci-Dusseau’s course in Spring 2018.