CSC 270
Homework 4

C++ or Java version

What I want you to do

Write a definition of a class named StringList that holds a list of strings and can have elements added and deleted easily.
Write a main program that reads in a sequence of strings into a StringList (note that you do not know in advance how many there will be, and cannot assume any fixed limit in advance; the user will indicate that the list is done by typing a blank line). It then computes the average length of the strings, and prints it out; it then sorts the list into alphabetical order, prints it out, finds the median element in alphabetical order, deletes it from the list, and prints out the new list along with its average length.

What's a StringList?

The tasks of reading in the list values, sorting them, and printing them should each be done by a method of the StringList class. Other methods of the StringList class should include

This still leaves you with a lot of possible implementations (listed below); I want you to write your class definition in such a way as to hide the implementation, so the person using your class shouldn't need to know whether you're using singly-linked lists, doubly-linked lists, shrinking-and-growing arrays, etc.

Fixed-size array
This is tricky, since we don't know how big to make it until the user has finished typing in all the strings.
Shrinking and growing array, naive version
In this approach, the StringList class actually contains an array, but the array is dynamically allocated. When an element is added to the StringList, it allocates a new array one element longer, copies the elements of the old array into the new one (with the new element inserted at the appropriate position), and de-allocates the old one. Likewise, when an element is deleted, it allocates a new array one element shorter, copies the elements of the old array (except the element being deleted) into the new one, and de-allocates the old one.
This approach makes it very easy to jump around in the list and swap elements (adjacent or not), but adding and deleting elements is inefficient.
Shrinking and growing array, smarter version
As above, except that rather than allocating a new array one element longer or shorter, and copying the whole thing, every time an element is added or removed, it allocates a new array several elements longer or shorter so that it doesn't have to do it as often. The disadvantage is that you need to keep track of how many elements are really in the array, as well as how large the array is right now.
For example, if the increment were 10, it would need to allocate and copy when the first element was added, but not again until the 11th (21st, etc.) element was added, and any insertions and deletions in between would require no dynamic allocation of memory. Another neat way to do it is to always grow the array by a factor of 2, and only shrink it when less than half of it is in use.
This approach makes inserting and deleting elements more complex, but significantly more efficient. Since the underlying data structure is still an array, you can still jump around in the list and swap elements (adjacent or not) very easily.
Shrinking and growing array, even smarter
It turns out that the data structure becomes significantly more efficient if you always shrink or grow the array by a factor of 2. For example, if the array currently has a size of 10, 6 of whose slots are full, you can add 4 items before you need to grow it, at which point you grow it to 20. You can then add another 10 items before you need to grow it again, at which point you grow it to 40. In the other direction, let's say whenever the array is less than 1/4 full, you shrink it by half: thus if it's grown to 40, but then the user deletes a bunch of elements so only 10 elements are in use, you shrink it to 20.
Singly-linked list
either using the null pointer to indicate the end of the list, or using two subclasses EmptyList and NonEmptyList (aka Cons).
Note: a functional program can easily delete an element out of the middle, by rebuilding the rest of the list without that element, but this is more inefficient. Splicing an element out is more efficient, but difficult unless you know the predecessor of the element you're deleting. So...
Doubly-linked list
Rather than each element "knowing" only its own value and the rest of the list to its right, each element knows its own value, the part of the list to its right, and the part of the list to its left.
Somewhat more complex, but allows you to scan through the list in either direction, and in particular, you can delete any element by splicing together the list to its right and the list to its left. Similarly, you can easily swap two adjacent elements, because each one knows where the other one is.
In Java, ArrayList
The Java standard libraries include a class named ArrayList that basically implements the "shrinking and growing array, even smarter" described above. You can save a lot of work by using it rather than writing your own implementation. The StringList class then serves only as a "front end" for ArrayList, enforcing that all the things in the list are in fact strings, allowing only the specified operations, returning strings, and implementing a sort.

How to sort the list?

There are lots of sorting algorithms in the world. The bubble-sort we used in the previous problem is fairly simple, but not very efficient. Get your program to work correctly using the bubble-sort from the previous homework, and then try Quicksort and/or Mergesort.

Quicksort

The idea here is to pick an element of the list -- called the "pivot" -- and divide the list into three categories: the elements less than the pivot, the elements equal to the pivot, and the elements greater than the pivot. To do this, you may want to write a categorize method that "returns" three lists (by taking in three List & parameters and assigning them). Once this is done, recursively sort the elements-less list and the elements-greater list (the elements-equal list, having only one value in it, is already sorted). Then you can simply concatenate the sorted elements-less list, the elements-equal list, and the sorted elements-greater list and get the final answer.

Mergesort

Mergesort, too, involves dividing the whole list into smaller pieces, sorting the pieces recursively, and putting them back together. But rather than dividing the list into those less than, equal to, and greater than a pivot, we'll just divide the list into the first half and the second half. Sort each of them recursively. Now you have two sorted lists, and need to put them together. Imagine each one is a sorted deck of cards: to combine two sorted decks into one, you look at the top card in each deck and take the smaller of the two, then do that again, and again, until one or the other deck becomes empty, at which point you take the entire remaining deck.

C version:

You can't write a StringList class, because C doesn't have classes. However, you can define a struct to hold all the information you need to keep track of, and you can write a bunch of functions that take in such a struct as a parameter. Put all these functions in one source file (say, stringlist.c), with a header file (stringlist.h) that declares the struct and the functions; the main program can be written in a separate file, include-ing the header file, and linked together with the separately-compiled stringlist object file. Any implementation you could do in C++ or Java (except ArrayList), you should still be able to do in C; the implementation just won't be as thoroughly hidden.

Note: One way to implement the lists is with a global array. I don't recommend this, as it has the weird property that a main program can't have two different StringLists at the same time.

Scheme version:

The assignment described above is really antithetical to the "functional programming" paradigm, but if you want to play with Scheme vectors, try it that way. Scheme vectors are roughly equivalent to the Java ArrayList class, in that they already know how to grow and shrink.

Prolog version:

I dunno.


Last modified: 
Stephen Bloch / sbloch@adelphi.edu