CSC 270
Homework 7

Assigned Nov. 13, due Dec. 1

What I want you to do

Write a C definition of a struct named StringList that holds a list of strings (i.e. character arrays) and can have elements added and deleted easily.
Write a main program that reads in a sequence of strings (i.e. character arrays) into a StringList (note that you do not know in advance how many there will be, and cannot assume any fixed limit in advance). It then computes the average length of the strings, and prints it out; it then sorts the list into alphabetical order, prints it out, finds the median element in alphabetical order, deletes it from the list, and prints out the new list along with its average length.

The assignment may be done in teams of two students, working together at one workstation much or all of the time. Each team should turn in one copy of the homework with both names on it. If more than two people turn in essentially the same program, I will consider it cheating and they will all be penalized, as stated in the syllabus.

Since this version of the program uses character arrays rather than C++'s string class, you'll have to be very careful with memory allocation: make sure that every piece of memory you allocate is deallocated, once; make sure that you don't change the contents of a string buffer if some other part of the program still thinks it's the same string as before; etc.

What's a StringList?

The tasks of reading in the list values, sorting them, and printing them should each be done by a function that takes a StringList parameter. Other functions dealing with StringLists should include

This still leaves you with a lot of possible implementations. The one I'd like you to try for this assignment is a shrinking-and-growing array:

Shrinking and growing array, naive version
In this approach, the StringList class actually contains an array, but the array is dynamically allocated. When an element is added to the StringList, it allocates a new array one element longer, copies the elements of the old array into the new one (with the new element inserted at the appropriate position), and de-allocates the old one. Likewise, when an element is deleted, it allocates a new array one element shorter, copies the elements of the old array (except the element being deleted) into the new one, and de-allocates the old one.
This approach makes it very easy to jump around in the list and swap elements (adjacent or not), but adding and deleting elements is inefficient.
Shrinking and growing array, smarter version
As above, except that rather than allocating a new array one element longer or shorter, and copying the whole thing, every time an element is added or removed, it allocates a new array several elements longer or shorter so that it doesn't have to do it as often. The disadvantage is that you need to keep track of how many elements are really in the array, as well as how large the array is right now.
For example, if the increment were 10, it would need to allocate and copy when the first element was added, but not again until the 11th (21st, etc.) element was added, and any insertions and deletions in between would require no dynamic allocation of memory. Another neat way to do it is to always grow the array by a factor of 2, and only shrink it when less than half of it is in use.
This approach makes inserting and deleting elements more complex, but significantly more efficient. Since the underlying data structure is still an array, you can still jump around in the list and swap elements (adjacent or not) very easily.

Although C structs don't enforce data-hiding, and you can't hide the implementation as thoroughly as you could in C++ or Java, you can still draw a clear line between the functions that deal with StringLists and the functions (including the main function) that deal with the user interface. The latter collection of functions should be able to use the StringList structure and its associated functions without needing to know how it's implemented. The former collection of functions, ideally, should do little or no I/O, but just deal with the data structure representing a list of strings.

How to sort the list?

There are lots of sorting algorithms in the world. The bubble-sort we used in homework 3 is fairly simple, but not very efficient. Get your program to work correctly using the bubble-sort from homework 3, and then, for extra credit, try Quicksort and/or Mergesort.

Quicksort

The idea here is to pick an element of the list -- called the "pivot" -- and divide the list into three categories: the elements less than the pivot, the elements equal to the pivot, and the elements greater than the pivot. Then you recursively sort the elements-less and the elements-greater and concatenate the three together.

In practice, the most common way to do this is a little different. You'll divide the list into three categories: the elements less than the pivot, the pivot itself, and the other elements greater than or equal to the pivot (which may include other copies of the pivot). This can be done most efficiently not by creating two new lists, but rather by swapping elements within the list so that all the elements less than the pivot come before all the elements greater than or equal to the pivot. The function that does this can simply return the dividing line, which will be the index of the pivot in the list.

To "recursively sort the elements-less", your sorting function has to be able to work on not just a whole list, but rather a portion of a list, specified by its starting and ending points (rather like a binary search function). You'll also need to "recursively sort the elements-greater", which is tricky because you don't have an "elements-greater" list, only an "elements-greater-or-equal". If you recursively sort this, you might get unlucky and find that it's the entire list, so you go into an infinite recursion. But you know where the pivot is, so you can recursively sort the elements after it and be guaranteed of shortening the list by at least one, thus avoiding infinite recursion.

Mergesort

Mergesort, too, involves dividing the whole list into smaller pieces, sorting the pieces recursively, and putting them back together. But rather than dividing the list into those less than, equal to, and greater than a pivot, we'll just divide the list into the first half and the second half. Sort each of them recursively (which again can be done most efficiently in place, by just keeping track of starting and ending points, rather than by creating several whole new lists). Now you have two sorted lists, and need to put them together. Imagine each one is a sorted deck of cards: to combine two sorted decks into one, you look at the top card in each deck and take the smaller of the two, then do that again, and again, until one or the other deck becomes empty, at which point you take the entire remaining deck.


Last modified:
Stephen Bloch / sbloch@adelphi.edu