Types of collections

Three basic kinds of collections are: sets, dictionaries, and sequences. Figure 1 shows the inheritance hierarchy for the primary collection classes.


Sequences

One type of sequence familiar to most programmers is an array. Collections can be thought of as polymorphic containers in that collections can contain different types of elements. CommonPoint collections are parameterized and must be instantiated with an argument indicating the general class of elements the collection is intended to manage. Collection elements must be instances of some class defined either by you or by the CommonPoint application system.

Although collections known as polymorphic arrays bear a similarity to arrays used in C and Pascal, they are different in subtle and substantial ways. For example polymorphic arrays are dynamic; they can grow in size automatically to accommodate new elements when they become full.

Dynamic arrays also provide bounds checking and other safety features which protect clients that try to access or change a collection element with an index value too small or too large for the collection. As with C arrays, indices for dynamic arrays start at 0. Array indices can be thought of as keys used to access values contained inside an array.

Dictionaries store key-value pairs

Arrays are only one of six concrete classes of collections provided by the CommonPoint application system. Associative arrays, often called dictionaries, allow clients to access elements by keys other than integers. Technically, arrays in C are keyed collections. You use an integer as a key to retrieve an element at a specified index. Dictionaries allow look up by keys other than integers. Often strings or symbols are used as keys for dictionaries. As an example of a dictionary, a compiler symbol table can be represented as a collection of pairs called key-value pairs. Pairs are two-element objects containing a lookup key and a value. In a symbol table, the key could be a string, and the value could be a structure containing type information and data associated with the key.

Dictionaries as lookup tables

For dictionary collections, the analogy with an English language dictionary is obvious. You can look up the definition of a word (or symbol) if you know how to spell the word. The definition can be thought of as the value associated with the word. Values in a symbol table might be functions, variable addresses, constant values, or text macros. The awk programming language makes use of associative arrays (dictionaries) to maintain collections of values associated with key strings.

One major difference between an English language dictionary and a dictionary collection is that there can be no duplicate keys in a dictionary collection. Allowing no duplicate keys in a symbol table makes sense as this avoids the ambiguity for the definition of a word. Ambiguity is tolerable in spoken language but would cause chaos in a programming language. Note that the idea of duplicate may also be ambiguous. Are two different variables which contain the same integer values considered to be duplicates? The default answer is no. Unless you specify otherwise, duplicate items are items which have the same pointer addresses in memory. As a programmer you can override this definition of duplicate for a particular collection by defining a comparison mechanism (a comparator) used by a specific collection instance.

Sets

The CommonPoint application system also provides a type of collection called a set. Sets like dictionaries, do not allow duplicate elements. In fact the primary definition of a set is a collection that does not allow duplicate members. Sets do not have keys, only values. If you wanted to make a list of unique words used in a text document, for example, you could iterate over the words in the document and add each word to a set collection. If the same word is placed in a set twice, the second word replaces the first. Thus when you are through processing each word in the document, each word in the original document appears only once in the set.

Another characteristic of sets is that they are unordered. When you scan through--or iterate over--a set, you are not likely to retrieve elements in the same order they are added to the set. The following sections will clarify the different characteristics of sets, dictionaries and sequences. You will also see sequences like queues, deques, and sorted sequences which are ordered collections. The ordering of elements in queues, deques (including stacks), and sorted sequences is different than the ordering of elements in arrays.

Order of collection elements

You can look at collections from a variety of viewpoints. It's important to keep in mind that a diagram of a class hierarchy does not always translate directly into a conceptual breakdown of similarities and differences in the behavior of classes. Often one class inherits from another because they share common protocol (compatible client interfaces) even though a child class might be used for a very different purpose than its parent. Conceptual breakdowns categorize classes by how they are used instead of by the interface they inherit from ancestors. Such categorical breakdowns provide an alternate way of looking at collections.

Figure 2 shows a breakdown of the conceptual behavior of different kinds of collection protocols supported by the CommonPoint application system. The diagram uses the general names of collection objects commonly associated with traditional data structures. You can follow the lines in this diagram as a decision tree to categorize collections according to use.


Deque protocol supports stacks and queues

Note that the familiar data structures, stack, queue, and deque (double-ended queues) are all implemented as deques. The definition of a double-ended queue is general enough that it can function as either a stack, a single-ended queue or a double-ended queue. Deques provide protocol appropriate to all three of these types of ordered sequences. When you are using deques as stacks you can use Push and Pop to add and remove elements. When you are using a deque as a queue you can use Add and Remove; elements are added at one end of the queue and removed from the other.

Ordered and unordered collections

Stacks, queues, and deques are considered externally ordered (sequenced) collections because they maintain internal elements according to the order in which they are added from an external point of view. Sets and dictionaries, however, are types of unordered collections. This distinction leads to the first branch of categories in the diagram's decision tree for determining the kind of collection suitable to a given task.

Figure 2 shows, that from a conceptual point of view, collections break down into two categories; they are either ordered, or unordered. In the CommonPoint application system ordered collections are called sequences and are more complex in nature than unordered collections. Once you have determined that you want a collection with a sense of order, the question is do you want the collection be ordered or sorted automatically according to some lexical or numerical comparison? If so, you want a sorted collection. If, on the other hand, you want a collection that is ordered according to how you put elements in the collection, you want either a queue, or an array. You would use a queue only if you need to retrieve elements from the beginning or end of the collection.

For example a stack is a special form of queue (often called a LIFO queue because the item which is the Last In to the queue will be the First Out when a request to retrieve the next element is made). Event queues and message queues are often implemented as FIFO queues because the order in which events are processed usually depends on the time of their occurrence. The item which is First In to a queue will be the First Out of the queue when a request is made to retrieve the next element.

Arrays and dictionaries are keyed collections

If you require random access to all collection elements, an array is your best choice. An array is considered a keyed collection because random access to array elements is made through an numeric key. Dictionaries are keyed collections too. But the keys in dictionaries are usually strings or symbols. Dictionaries are collections whose elements are maintained in unspecified order, however lookup of values in dictionaries is very fast because of the way elements are maintained internally.

Priority queues: internal order and external order

There is yet another kind of ordered collection not shown in the collection behavior diagram. This kind of collection is ordered partly by an internal (automatic) sorting mechanism, and partly by external manipulation (that is according to the order in which an element is added). This collection is called a priority queue.

In event processing systems, it is sometimes useful to prioritize events according to their urgency. For example a disk access may be given higher priority than a screen repaint. A system timer event would probably be given higher priority than a user menu command or keystroke. The events for such a prioritized queue would be sorted primarily by an internal mechanism (the rank of the event) and secondarily by external ordering (events of the same priority rank are processed in the order they arrive in the queue--that is the order in which the events occur in time).

Each of the CommonPoint classes implementing the behaviors described in the behavior diagram are covered in turn. First the general semantics for creating collections and for interacting with collections are discussed. The general protocol for collection management applies to all collection classes.

For an overview of general collection behavior, the sorting, ordering, and duplication characteristics of CommonPoint collection classes are listed in the following table. Only concrete classes are included.

Type of collection Ordered Allows Duplicates
TDequeOf by insertion sequence yes
TSortedSequenceOf automatically yes
TArrayOf by numeric index yes
TSetOf no no
TDictionaryOf no keys no, values yes
TPriorityQueueOf partially yes


[Contents] [Previous] [Next]
Click the icon to mail questions or corrections about this material to Taligent personnel.
Copyright©1995 Taligent,Inc. All rights reserved.

Generated with WebMaker