COMP 171 Data Structures and Algorithms
2002 Fall Term,
MWF 12:00-12:50, Rm 2503
Dr. Dekai WU, Rm 3539, 2358-6989,
dekai@cs.ust.hk
In this programming assignment you will implement a number of different
sorting algorithms, and experimentally compare their performance. You will
perform timing experiments to measure the run time. In addition, you will record
the number of comparison operations and data movement operations required by
each algorithm.
You will have to hand in a physical printout of each of your programs along
with running time statistics. We will also electronically check your
programs.
As usual, please remember to submit your work before the deadline. If you
have any questions about this assignment, please send emails to shenyh@ust.hk.
DUE DATE
The deadline of the assignment is 12:00 noon, Wed 30 Oct 2002. Late
assignments will not be accepted, since ACS will not collect them.
GRADING
Each function written should be adequately documented so that a reader will
be able to understand what is being done. Your programs will be run on a
standard test data set.
This assignment will account for 80 points in your final score. The general
grading criteria for a program is as follows:
- 80% - Correctness
- 10% - Programming style
- 10% - Documentation
When your program does not run correctly on the test data, then your program
will be graded by visual examination and the maximum score you can then get for
Correctness is 30%.
ACS SUBMISSION
To avoid receiving zero credit for this assignment,
please pay careful attention to the following instructions. We will
electronically copy the programs from your account to ours for testing. To
ensure that the copying routine performs properly you must do the following.
- You must use your Computer Science department UNIX
account to submit your homework (do not use other campus account from
ITSC or other departments). If for some reason this is not possible please
come see one of the instructors.
- You should have already activated this account for Assignment 0. If you
have not, you can do it in the CSSystem home page (https://cssu3.cs.ust.hk:8443/pass.html).
- Your program must compile properly on your CS department account, using
the g++ compiler. Other C++ compilers or computers cannot be
used for your final submission.
- Create your homework directory "~/comp171/a1/" under your
CS department UNIX account.
- Put the source code for your functions in a single file
"sortlib.cpp" under your homework directory before the
deadline. Make sure you name your file exactly as specified.
- Make sure that there is no main() function in your
sortlib.cpp. As its name implies, sortlib.cpp is intended to
be used as a library of functions that some other main program can call.
- Make sure there are no other files in this directory. You
are encouraged to compile and run your program to check if it gives the right
result, but use another directory for your implementation and testing.
IMPLEMENTATION (60% of Correctness)
You are required to program four
sorting functions, each with the same arguments:
- qsort(int keys[], int n) // quicksort
- hsort(int keys[], int n) // heapsort
- msort(int keys[], int n) // mergesort
- rsort(int keys[], int n) // radix sort
keys is an array of n elements containing the input list to
be sorted. You may assume that each key is an integer 0 < k < 9999999.
(Remember that in C++, an array argument only passes the pointer to the base
address of the array, rather than the whole contents of the array.)
After calling the sort function, the elements in keys should be
sorted. Note that for quicksort, heapsort, and radix sort, you may sort the list
directly in the keys array (i.e., perform the sort in place). For
mergesort, however, you will need to copy keys to a separate auxiliary
array.
(Although you may want to keep these functions in separate files while
implementing and testing, remember to store them all in the same file
sortlib.cpp before turning in the assignment. Test them again
after combining into one file, to make sure nothing has been
accidentally broken!) Make sure there is no main() function in
sortlib.cpp!
ANALYSIS (40% of Correctness)
You will measure the performance of each
sorting algorithm in three ways.
- Run time --- you will time how long the algorithms take to run,
using the UNIX time command. To get accurate measurements, you will
need to call the same routine with the same data 10 or 100 or 1000 times, and
then divide. Here's an example of how to use the time command:
cssu31:dekai:158> time isortprog
4.260u 0.060s
0:04.33 99.7% 0+105k 0+0io 0pf+0w
The running time of the program is the user time + the system time, which
in this example equals 4.26 + 060 = 4.32 seconds.
- Comparisons --- you will insert code in your functions to
increment a counter every time a comparison is performed between any two
elements of the data.
- Data movements --- you will insert code in your functions to
increment a counter every time a data element is moved.
For each of these criteria, you should obtain measurements for n
being 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, and 16384. Use the standard C
random number generator rand() to produce unsorted lists (you will need
to #include <stdlib.h>). For these experiments we are not
interested in seeing the output.
Record the measurements in tables. Also create three graphs,
one for each criterion, plotting the measurements for each type of sort against
the input size to compare their performance.
TEST YOUR ALGORITHMS WITH THE TEST DRIVER
We will give you the following files to help you with your implementation and analysis:
The files containing the test arrays
We provide you with files containing arrays of different length to do your analysis. The length of an array is always an exponent of 2, ranging from 1024 to 524288. The files have the filename with the
style "array****.txt" in which "****" represents the length of the array, i. e. the file "array1024.txt" contains an array with the
length of 1024 and the file "array65536.txt" contains an array with the length of 65536.
"sortlib.hpp"
The file "sortlib.hpp" gives the name and parameters of the functions implementing the four required algorithms. You will use exactly the names and parameters for your functions defined in "sortlib.hpp".
"sort.cpp"
The file "sort.cpp" contains the "main" function you will use to test your sorting algorithms. It uses two arguments:
- Argument 1: Use argument 1 to select your sorting algorithm. Type 1 for quick sort, 2 for heap sort, 3 for merge sort and 4 for radix sort.
- Argument 2: The length of the array you want to test with. The length you can select ranges from 1 to 524288. If you want to test your algorithm with a longer length than 524288 you have to create the array yourself. We strongly suggest you select
lengths which are exponents of 2, so that when you select 1024 the test driver will read the array from file "array1024.txt" and when you select a length of 131072, the test driver will read the array from file "array131072.txt", etc. If you select
a length which is not an exponent of 2, the test driver will read an array with the length from the file which contains an array of the least length from all files than contain an array longer than the length you select, i. e. if you select a length
of 1099, the test driver will read numbers from the file "array2048.txt".
The "main" function works as follows:
- Check the two arguments and read in an array with the length you select.
- Sort the array with the sorting algorithm you select. For an array with a length less than 16384, the test driver will sort the array for 100 times to give you meaningful running time.
- Check if the array is sorted.
How to do the test?
Please follow the following steps to go on with the test:
- Write the code "#include "sortlib.hpp"" in your file "sortlib.cpp".
- Put the files "sort.cpp", "sortlib.hpp" and "sortlib.cpp" under the same directory.
- Use the command "g++ sort.cpp sortlib.cpp -o filename" to create the executable output file with the filename designated by you, i. e. for "g++ sort.cpp sortlib.cpp -o sort" you will get the executable file "sort".
- Use the command "time filename arg1 arg2" to calculate the running time for your algorithm, i. e. if you have the executable file "sort" and you want to use quick sort to sort an array with the length 1024, use the command "time sort 1 1024".
Calculate the running time
Use the command "time filename arg1 arg2" to calculate the running time of the program T1. Some factors affecting the final running time of your algorithm T:
- The time T2 it takes to read arrays from the files should be deducted from the running time.
- 1024 0.04s It takes the test driver 0.04 seconds to read an array with length 1024 from a file.
- 2048 0.06s
- 4096 0.07s
- 8192 0.08s
- 16384 0.09s
- 32768 0.13s
- 65536 0.20s
- 131072 0.34s
- 262144 0.67s
- 524288 1.28s
- If the array you select has a length less than 16384, the test driver sorts the array for 100 times.
- The final running time T for your algorithm sorting an array with length L.
- If L>16384, then T=T1-T2
- If L<16384, then T=(T1-T2)/100
RECAP
Write the four functions msort, qsort, hsort, and
rsort and store them as directed in comp171/a1/sortlib.cpp in
your CS UNIX account. Hand in hard copies of each of the programs along with
tables and graphs describing the timing data collected.
Dekai WU, SHEN Yihai
dekai@cs.ust.hk, shenyh@ust.hk
Last update:
2002.10.15