Project 1a: Sorting within a RangeYou will write a simple sorting program. This program should be invoked as follows: shell% ./rangesort -i inputfile -o outputfile -l lowvalue -h highvalue The above line means the users typed in the name of the sorting program
Input files are generated by a program we give you called generate.c (good name, huh?). After running
kkRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Your goal: to build a sorting program called Some DetailsUsing shell% gcc -o generate generate.c -Wall -Werror Note: you will also need the header file sort.h to compile this program. Then you run it: shell% ./generate -s 0 -n 100 -o /tmp/outfile There are three flags to The format of the file generated by the Another useful tool is dump.c . This program can be used to
dump the contents of a file generated by HintsIn your sorting program, you should just use If you want to figure out how big in the input file is before reading it
in, use the To sort the data, use any old sort that you'd like to use. An easy way to
go is to use the library routine To exit, call The routine If you don't know how to use these functions, use the man pages. For
example, typing You will need to decide when you want to discard keys that fall
outside of the specified range (i.e., discard a key if its Remember to keep keys that exactly match the low or highvalue (i.e., the range for selected keys is inclusive). Also remember that the specified range might include zero, one, or all of the keys in the original input set. Be sure that you handle the case where the lowvalue is equal to the highvalue (i.e., the sort should output all keys with their records that are exactly equal to lowvalue and highvalue). Assumptions and Errors
32-bit integer range. You may assume that the keys are unsigned 32-bit integers. File length: May be pretty long! However, there is no need to implement a fancy two-pass sort or anything like that; the data set will fit into memory. Selected records: Depending upon the input set and the specified low and high values, the sort may end up sorting zero, one, a few, many, or all of the keys in the original input data set. Do not make any assumptions about the number of keys that might fall within the low and high values (inclusive). Invalid files: If the user specifies an input or output file that you
cannot open (for whatever reason), the sort should EXACTLY print: Invalid range: If the user specifies a high value
that is less than the low value (or specifies values that are not
unsigned 32-bit integers), the sort should EXACTLY print: Too few or many arguments passed to program: If the user runs rangesort
without enough arguments, or in some other way passes incorrect flags and such to
rangesort, print Important: On any error code, you should print the error to the screen
using
History and an Optional Performance ContestThis sorting assignment derives from a yearly competition to make the fastest disk-to-disk sort in the world. See the sort home page for details. If you look closely, you will see that your professor was once -- yes, wait for it -- the fastest sorter in the world. To continue in this tradition, we will also be holding a sorting competition. The range sort that you need to implement for this assignment is a little trickier than the straight-forward sort, since you may need to discard keys (and need to decide if you should discard keys before or after the actual sort). For the contest, we will average together your program's runtime on three input sets:
Whoever wins the performance contest will win a soft-cover copy of the OSTEP text book. Read more about sorting, including perhaps the NOW-Sort paper , for some hints on how to make a sort run really fast. Or just use your common sense! Hint: you'll have to think a bit about hardware caches. Note: Your grade will not be impacted by the performance of your sorting program. Your grade will depend only on the correctness of your program. We will only measure the performance of assignments that sort correctly for all cases! General AdviceStart small, and get things working incrementally. For example, first get a program that simply reads in the input file, one line at a time, and prints out what it reads in. Then, slowly add features and test them as you go. Don't worry about performance until you have all of the functionality working correctly. Testing is critical. Testing your code to make sure it works is crucial. Write tests to see if your code handles all the cases you think it should. Be as comprehensive as you can be. Of course, when grading your projects, we will be. Thus, it is better if you find your bugs first, before we do. Keep old versions around. Keep copies of older versions of your program
around, as you may introduce bugs and not be able to easily undo them. A
simple way to do this is to keep copies around, by explicitly making copies of
the file at various points during development. For example, let's say you get
a simple version of Keep your source code in a private directory. An easy way to do this is
to log into your account and first change directories into |