CSC 270
Homework 4
C++ or Java version
What I want you to do
Write a definition of a class named StringList
that holds a list of strings and can have elements added and deleted
easily.
Write a main program that reads in a sequence of
strings into a StringList (note that you do not
know in advance how many there will be, and cannot assume any
fixed limit in advance; the user will indicate that the list is done by
typing a blank line). It then computes the average length of the
strings, and prints it out;
it then sorts the list into alphabetical order, prints it out,
finds the median element in alphabetical order,
deletes it from the list, and prints out the new list along with
its average length.
What's a StringList?
The tasks of reading in the list values, sorting them, and printing
them should each be done by a method of the StringList class. Other
methods of the StringList class should include
- a way to build an empty StringList
- a function that tells how many elements are in the StringList
- a way to add a specified element to a StringList (thus increasing its
size by 1)
- a way to find the n-th element of a StringList
- a way to delete a specified element of a StringList
- a way to concatenate two StringLists producing a third
This still leaves you with a lot of possible implementations (listed
below); I want you to write
your class definition in such a way as to hide the
implementation, so the person using your class shouldn't need
to know whether you're using singly-linked lists, doubly-linked lists,
shrinking-and-growing arrays, etc.
- Fixed-size array
- This is tricky, since we don't know how big to make it until the
user has finished typing in all the strings.
- Shrinking and growing array, naive version
- In this approach, the StringList class actually contains an array, but
the array is dynamically allocated. When an element is added to the
StringList, it allocates a new array one element longer, copies the
elements of the old array into the new one (with the new element
inserted at the appropriate position), and de-allocates the old
one. Likewise, when an element is deleted, it allocates a new array
one element shorter, copies the elements of the old array (except the
element being deleted) into the new one, and de-allocates the old one.
This approach makes it very easy to jump around in the list and
swap elements (adjacent or not), but adding and deleting elements is
inefficient.
- Shrinking and growing array, smarter version
- As above, except that rather than allocating a new array
one element longer or shorter, and copying the whole thing,
every time an element is added or removed,
it allocates a new array several elements
longer or shorter so that it doesn't have to do it as often. The
disadvantage is that you need to keep track of how many elements are
really in the array, as well as how large the array is
right now.
For example, if the increment were 10, it would need to allocate and copy
when the first element was added, but not again until the 11th (21st,
etc.) element was added, and any insertions and deletions in between would
require no dynamic allocation of memory. Another neat way to do it is to
always grow the array by a factor of 2, and only shrink it when less than
half of it is in use.
This approach makes inserting and deleting elements more complex,
but significantly more efficient. Since the underlying data structure
is still an array, you can still jump around in the list and swap
elements (adjacent or not) very easily.
- Shrinking and growing array, even smarter
- It turns out that the data structure becomes significantly more
efficient if you always shrink or grow the array by a factor of 2.
For example, if the array currently has a size of 10, 6 of whose slots
are full, you can add 4 items before you need to grow it, at which
point you grow it to 20. You can then add another 10 items before you
need to grow it again, at which point you grow it to 40. In the other
direction, let's say whenever the array is less than 1/4 full, you
shrink it by half: thus if it's grown to 40, but then the user
deletes a bunch of elements so only 10 elements are in use, you shrink
it to 20.
- Singly-linked list
- either using the null pointer to indicate the end of the list, or
using two subclasses EmptyList and NonEmptyList (aka Cons).
Note: a functional program can easily delete an
element out of the middle, by rebuilding the rest of the list without
that element, but this is more inefficient. Splicing an element out is
more efficient, but difficult unless you know the predecessor of the
element you're deleting. So...
- Doubly-linked list
- Rather than each element "knowing" only its own value and the rest
of the list to its right, each element knows its own value, the part of
the list to its right, and the part of the list to its left.
Somewhat more complex, but allows you to scan through the list in
either direction, and in particular, you can delete any element by
splicing together the list to its right and the list to its left.
Similarly, you can easily swap two adjacent elements, because each one
knows where the other one is.
- In Java,
ArrayList
- The Java standard libraries include a class named
ArrayList
that basically implements the "shrinking and
growing array, even smarter" described above. You can save a lot of
work by using it rather than writing your own implementation. The
StringList
class then serves only as a "front end" for
ArrayList
, enforcing that all the things in the list are
in fact strings, allowing only the specified operations, returning
strings, and implementing a sort.
How to sort the list?
There are lots of sorting algorithms in the world. The bubble-sort we
used in the previous problem is fairly simple, but not very efficient.
Get your program to work correctly using the bubble-sort from the
previous homework, and then try Quicksort and/or Mergesort.
Quicksort
The idea here is to pick an element of the list -- called the "pivot" --
and divide the list into three categories: the elements less than the pivot,
the elements equal to the pivot, and the elements greater than
the pivot. To do this, you may want to write a categorize
method that "returns" three lists (by taking in three List &
parameters and assigning them). Once this is done, recursively sort
the elements-less list and the elements-greater list (the elements-equal
list, having only one value in it, is already sorted). Then you can
simply concatenate the sorted elements-less list, the elements-equal
list, and the sorted elements-greater list and get the final answer.
Mergesort
Mergesort, too, involves dividing the whole list into smaller pieces,
sorting the pieces recursively, and putting them back together.
But rather than dividing the list into those less than, equal to, and
greater than a pivot, we'll just divide the list into the first half and
the second half. Sort each of them recursively. Now you have two
sorted lists, and need to put them together. Imagine each one is a
sorted deck of cards: to combine two sorted decks into one, you look at
the top card in each deck and take the smaller of the two, then do that
again, and again, until one or the other deck becomes empty, at which
point you take the entire remaining deck.
C version:
You can't write a StringList
class, because C doesn't
have classes. However, you can define a struct
to hold all
the information you need to keep track of, and you can write a bunch of
functions that take in such a struct
as a parameter. Put
all these functions in one source file (say, stringlist.c
),
with a header file (stringlist.h
) that declares the
struct and the functions; the main program can be written in a separate
file, include
-ing the header file, and linked together with
the separately-compiled stringlist object file. Any implementation you
could do in C++ or Java (except ArrayList
), you should
still be able to do in C; the implementation just won't be as thoroughly
hidden.
Note: One way to implement the lists is with a
global array. I don't recommend this, as it has the weird property that
a main program can't have two different StringLists at the same time.
Scheme version:
The assignment described above is really antithetical to the
"functional programming" paradigm, but if you want to play with Scheme
vectors, try it that way. Scheme vectors are roughly equivalent to the
Java ArrayList
class, in that they already know how to grow
and shrink.
Prolog version:
I dunno.
Last modified:
Stephen Bloch / sbloch@adelphi.edu