CSC 270
Homework 6

In this assignment, we'll implement a number of operations on mathematical sets: membership, union, intersection, difference, Cartesian product, etc.

C, C++, or Java version

What I want you to do

Define a data type StringSet that represents a set of strings. It should support the following operations:

There are a lot of possible implementations (listed below); I want you to write your class definition in such a way as to hide the implementation, so the person using your class shouldn't need to know whether you're using singly-linked lists, doubly-linked lists, etc.

Unsorted array of strings
Simple but not terribly efficient
Sorted array of strings
More hassle to maintain, but much more efficient to search (e.g. using binary search)
Unsorted linked list of strings
Sorted linked list of strings
In a linked list, sorting isn't so helpful because you can't really do a binary search on it efficiently anyway. Instead, you might try...
Binary search tree
A binary tree in which each non-leaf node has a "key", and all the items alphabetically less than the key are descended from the left child, while all the items alphabetically greater than the key are descended from the right child.
Bit-vector
If there are 100 possible set elements, a set is represented by an array of 100 booleans, each true if the corresponding element is in the set and false if not. This implementation can be extremely efficient, but it only works when you have a limited collection of possible elements, e.g. single characters. In this implementation, you won't be able to do the Cartesian product or the set of numerals up to a specified integer, but you can do the rest of the operations.
Hash table
This is more hassle to implement, but it can be an efficient way to handle sets whose "possible elements" don't come from a nice limited collection. Java provides a built in HashTable class, so you could start with that, and then perhaps try implementing your own.

Scheme version:

In Scheme, it's easier to allow anything to be in the set, and for Cartesian products to simply be two-element lists, or cons-pairs, rather than strings with parentheses and commas in them. Other than that, the task is basically the same. However, if you restrict yourself to strings, you can investigate the various sorted representations for efficiency.

Prolog version:

I would start by using Prolog's built-in list capabilities, storing symbols and numbers rather than strings. You'll need a two-place predicate contains that succeeds iff the list which is its first argument contains its second argument as an element; a three-place predicate intersect that unifies its third argument with the intersection of its first and second; and likewise for union, difference, and cartesian_product. I would use a two-place pair structure, or perhaps simply two-element lists, to represent the ordered pairs in a Cartesian product. I think this would be a fun assignment to do in Prolog.

If you wanted to make it more efficient, you could store things as a binary search tree instead of an unsorted list.


Last modified: 
Stephen Bloch / sbloch@adelphi.edu