Peter Müller
Globewide Network Academy (GNA)
pmueller@uu-gna.mit.edu
In C++ generic data types are called class templates or just templates for short. A class template looks like a normal class definition, where some aspects are represented by placeholders. In the forthcoming list example we use this mechanism to generate lists for various data types:
template <class T> class List : ... { public: ... void append(const T data); ... };
In the first line we introduce the keyword template which starts every template declaration. The arguments of a template are enclosed in angle brackets.
Each argument specifies a placeholder in the following class definition. In our example, we want class List to be defined for various data types. One could say, that we want to define a class of lists. In this case the class of lists is defined by the type of objects they contain. We use the name T for the placeholder. We now use T at any place where normally the type of the actual objects are expected. For example, each list provides a method to append an element to it. We can now define this method as shown above with use of T.
An actual list definition must now specify the type of the list. If we stick to the class expression used before, we have to create a class instance. From this class instance we can then create ``real'' object instances:
List<int> integerList;
Here we create a class instance of a List which takes integers as its data elements. We specify the type enclosed in angle brackets. The compiler now applies the provided argument ``int'' and automatically generates a class definition where the placeholder T is replaced by int, for example, it generates the following method declaration for append():
void append(const int data);
Templates can take more than one argument to provide more placeholders. For example, to declare a dictionary class which provides access to its data elements by a key, one can think of the following declaration:
template <class K, class T> class Dictionary { ... public: ... K getKey(const T from); T getData(const K key); ... };
Here we use two placeholders to be able to use dictionaries for various key and data types.
Template arguments can also be used to generate parameterized class definitions. For example, a stack might be implemented by an array of data elements. The size of the array could be specified dynamically:
template <class T, int size> class Stack { T _store[size]; public: ... }; Stack<int,128> mystack;
In this example, mystack is a stack of integers using an array of 128 elements. However, in the following we will not use parameterized classes.
In the following discussion we distinguish between a data structure's shape and its traversing strategies. The first is the ``look'', which already provides plenty information about the building blocks of the data structure.
A traversing strategy defines the order in which elements of the data structure are to be visited. It makes sense to separate the shape from traversing strategies, because some data structures can be traversed using various strategies.
Traversing of a data structure is implemented using iterators. Iterators guarantee to visit each data item of their associated data structure in a well defined order. They must provide at least the following properties:
When doing something object-oriented, the first question to ask is
What are the basic building blocks of the item to implement?
Have a look at Figure 10.1, which shows a list consisting of four rectangles. Each rectangle has a bullet in its middle, the first three point to their right neighbour. Since the last rectangle have no right neighbour, there is no pointer.
Figure 10.1: Basic building blocks of a singly linked list.
First let's choose names for these building blocks. Talking of rectangles is not appropriate, because one can think of a figure using circles or triangles. Within the scope of graphs the name node is used. A node contains a pointer to its successor. Thus, the list in the figure consists of four nodes, each of which has exactly one pointer associated with it.
Three types of nodes can be distinguished:
Note that the nodes do not carry any content. This is because the bare data structure list consists only of nodes, which are strung together. Of course real applications need nodes, carrying some content. But in the sense of object-orientation this is a specialization of the nodes.
From the figure, we can see, that a list can only be used with one traversing strategy: forward cursor. Initially, the head will be the first current element. The successor function simply follows the pointer of the current node. The termination function checks the current element to be the tail.
Note that it is not possible to go back nor to start in the middle of the list. The latter would contradict the requirement, that each element must be visited.
The next question is, what are the operations offered by a list? A list only defines two well known nodes head and tail. Let's have a deeper look to them.
A new node can be put-in-front of the list such that:
Similarly, a new node can easily be appended to the tail:
The inverse function to put in front is delete-from-front:
You should be able to figure out why there is no cheap inverse append function.
Finally, there exist three other cheap primitives, whose meaning is straight forward. Thus, we will not examine them any further. However, we present them here for completeness:
The basic building block of a list is the node. Thus, let's first declare a class for it. A node has nothing more than a pointer to another node. Let's assume, that this neighbour is always on the right side.
Have a look at the following declaration of class Node.
class Node { Node *_right; public: Node(Node *right = NULL) : _right(right) {} Node(const Node &val) : _right(val._right) {} const Node *right() const { return _right; } Node *&right() { return _right; } Node &operator =(const Node &val) { _right = val._right; return *this; } const int operator ==(const Node &val) const { return _right == val._right; } const int operator !=(const Node &val) const { return !(*this == val); } };
A look to the first version of method right() contains a const just before the method body. When used in this position, const declares the method to be constant regarding the elements of the invoking object. Consequently, you are only allowed to use this mechanism in method declarations or definitions, respectively.
This type of const modifier is also used to check for overloading. Thus,
class Foo { ... int foo() const; int foo(); };
declare two different methods. The former is used in constant contexts whereas the second is used in variable contexts.
Although template class Node implements a simple node it seems to define plenty of functionality. We do this, because it is good practice to offer at least the following functionality for each defined data type:
The unequality operator ``!='' is implemented by using the definition of the equality operator. Recall, that this points to the invoking object, thus,
Node a, b; ... if (a != b) ...
would result in a call to operator !=() with this set to the address of a. We dereference this using the standard dereference operator ``*''. Now, *this is an object of class Node which is compared to another object using operator ==(). Consequently, the definition of operator ==() of class Node is used. Using the standard boolean NOT operator ``!'' we negate the result and obtain the truth value of operator !=().
The above methods should be available for each class you define. This ensures that you can use your objects as you would use any other objects, for example integers. If some of these methods make no sense for whatever reason, you should declare them in a private section of the class to explicitly mark them as not for public use. Otherwise the C++ compiler would substitute standard operators.
Obviously, real applications require the nodes to carry data. As mentioned above, this means to specialize the nodes. Data can be of any type, hence, we are using the template construct.
template <class T> class DataNode : public Node { T _data; public: DataNode(const T data, DataNode *right = NULL) : Node(right), _data(data) {} DataNode(const DataNode &val) : Node(val), _data(val._data) {} const DataNode *right() const { return((DataNode *) Node::right()); } DataNode *&right() { return((DataNode *&) Node::right()); } const T &data() const { return _data; } T &data() { return _data; } DataNode &operator =(const DataNode &val) { Node::operator =(val); _data = val._data; return *this; } const int operator ==(const DataNode &val) const { return( Node::operator ==(val) && _data == val._data); } const int operator !=(const DataNode &val) const { return !(*this == val); } };
The above template DataNode simply specializes class Node to carry data of any type. It adds functionality to access its data element and also offers the same set of standard functionality: Copy Constructor, operator =() and operator ==(). Note, how we reuse functionality already defined by class Node.
Now we are able to declare the list template. We also use the template mechanism here, because we want the list to carry data of arbitrary type. For example, we want to be able to define a list of integers. We start with an abstract class template ListBase which functions as the base class of all other lists. For example, doubly linked lists obviously share the same properties like singly linked lists.
template <class T> class ListBase { public: virtual ~ListBase() {} // Force destructor to be // virtual virtual void flush() = 0; virtual void putInFront(const T data) = 0; virtual void append(const T data) = 0; virtual void delFromFront() = 0; virtual const T &getFirst() const = 0; virtual T &getFirst() = 0; virtual const T &getLast() const = 0; virtual T &getLast() = 0; virtual const int isEmpty() const = 0; };
What we actually do is to describe the interface of every list by specifying the prototypes of required methods. We do that for every operation we have identified in section 10.3. Additionally, we also include a method flush() which allows us to delete all elements of a list.
For operations get-first and get-last we have declared two versions. One is for use in a constant context and the other in a variable context.
With this abstract class template we are able to actually define our list class template:
template <class T> class List : public ListBase<T> { DataNode<T> *_head, *_tail; public: List() : _head(NULL), _tail(NULL) {} List(const List &val) : _head(NULL), _tail(NULL) { *this = val; } virtual ~List() { flush(); } virtual void flush(); virtual void putInFront(const T data); virtual void append(const T data); virtual void delFromFront(); virtual const T &getFirst() const { return _head->data(); } virtual T &getFirst() { return _head->data(); } virtual const T &getLast() const { return _tail->data(); } virtual T &getLast() { return _tail->data(); } virtual int isEmpty() { return _head == NULL; } List &operator =(const List &val) { flush(); DataNode<T> *walkp = val._head; while (walkp) append(walkp->data()); return *this; } const int operator ==(const List &val) const { if (isEmpty() && val.isEmpty()) return 1; DataNode<T> *thisp = _head, *valp = val._head; while (thisp && valp) { if (thisp->data() != valp->data()) return 0; thisp = thisp->right(); valp = valp->right(); } return 1; } const int operator !=(const List &val) const { return !(*this == val); } friend class ListIterator<T>; };
The constructors initialize the list's elements _head and _tail to NULL which is the NUL pointer in C and C++. You should know how to implement the other methods from your programming experience. Here we only present the implementation of method putInFront():
template <class T> void List<T>::putInFront(const T data) { _head = new DataNode<T>(data, _head); if (!_tail) _tail = _head; } /* putInFront */
If we define methods of a class template outside of its declaration, we must also specify the template keyword. Again we use the new operator to create a new data node dynamically. This operator allows initialization of its created object with arguments enclosed in parenthesis. In the above example, new creates a new instance of class DataNode<T>. Consequently, the corresponding constructor is called.
Also notice how we use placeholder T. If we would create a class instance of class template List, say, List<int> this would also cause creation of a class instance of class template DataNode, viz DataNode<int>.
The last line of the class template declaration declares class template ListIterator to be a friend of List. We want to separately define the list's iterator. However, it is closely related, thus, we allow it to be a friend.
In section 10.2 we have introduced the concept of iterators to traverse through a data structure. Iterators must implement three properties:
Generally speaking, the iterator successively returns data associated with the current element. Obviously, there will be a method, say, current() which implements this functionality. The return type of this method depends on the type of data stored in the particular data structure. For example, when iterating over List<int> the return type should be int.
The successor function, say, succ(), uses additional information which is stored in structural elements of the data structure. In our list example, these are the nodes which carry the data and a pointer to their right neighbour. The type of the structural elements usually differs from that of the raw data. Consider again our List<int> where succ() must use DataNode<int> as structural elements.
Again we want to specify an abstract iterator class which defines properties of every iterator. The thoughts above lead to the following declaration:
template <class Data, class Element> class Iterator { protected: Element _start, _current; public: Iterator(const Element start) : _start(start), _current(start) {} Iterator(const Iterator &val) : _start(val._start), _current(val._current) {} virtual ~Iterator() {} virtual const Data current() const = 0; virtual void succ() = 0; virtual const int terminate() const = 0; virtual void rewind() { _current = _start; } Iterator &operator =(const Iterator &val) { _start = val._start; _current = val._current; return *this; } const int operator ==(const Iterator &val) const { return(_start == val._start && _current == val._current); } const int operator !=(const Iterator &val) const { return !(*this == val); } };
Again we use the template mechanism to allow the use of the iterator for any data structure which stores data of type Data and which uses structural elements of type Element. Each iterator ``knows'' a starting (structural) element and the current element. We make both accessible from derived classes because derived iterators need access to them to implement the following iterator properties. You should already understand how the constructors operate and why we force the destructor to be virtual.
Subsequently we specify three methods which should implement the three properties of an iterator. We also add a method rewind() which simply sets the current element to the start element. However, complex data structures (for example hash tables) might require more sophisticated rewind algorithms. For that reason we also specify this method to be virtual, allowing derived iterators to redefine it for their associated data structure.
The last step in the iterator implementation process is the declaration of the list iterator. This iterator is highly related to our class template List, for example, it is clear that the structural elements are class templates DataNode. The only ``open'' type is the one for the data. Once again, we use the template mechanism to provide list iterators for the different list types:
template <class T> class ListIterator : public Iterator<T, DataNode<T> *> { public: ListIterator(const List<T> &list) : Iterator<T, DataNode<T> *>(list._head) {} ListIterator(const ListIterator &val) : Iterator<T, DataNode<T> *>(val) {} virtual const T current() const { return _current->data(); } virtual void succ() { _current = _current->right; } virtual const int terminate() const { return _current == NULL; } T &operator ++(int) { T &tmp = _current->data(); succ(); return tmp; } ListIterator &operator =(const ListIterator &val) { Iterator<T, DataNode<T> *>::operator =(val); return *this; } };
The class template ListIterator is derived from Iterator. The type of data is, of course, the type for which the list iterator is declared, hence, we insert placeholder T for the iterator's data type Data. The iteration process is achieved with help of the structural elements of type DataNode. Obviously the starting element is the head of the list _head which is of type DataNode<T> *. We choose this type for the element type Element.
Note that the list iterator actually hides the details about the structural elements. This type highly depends on the implementation of the list. For example, if we would have chosen an array implementation, we may have used integers as structural elements where the current element is indicated by an array index.
The first constructor takes the list to traverse as its argument and initializes its iterator portion accordingly. As each ListIterator<T> is a friend of List<T> it has access to the list's private members. We use this to initialize the iterator to point to the head of the list.
We omit the destructor because we do not have any additional data members for the list iterator. Consequently, we do nothing special for it. However, the destructor of class template Iterator is called. Recall that we have to define this destructor to force derived classes to also have a virtual one.
The next methods just define the required three properties. Now that we have structural elements defined as DataNode<T> * we use them as follows:
We have also included an overloaded postincrement operator ``++''. To distinguish this operator from the preincrement operator, it takes an additional (anonymous) integer argument. As we only use this argument to declare a correct operator prototype and because we do not use the value of the argument, we omit the name of the argument.
The last method is the overloaded assignment operator for list iterators. Similar to previous assignment operators, we just reuse already defined assignments of superclasses; Iterator<T>::operator =() in this case.
The other methods and operators, namely rewind(), operator ==() and operator !=() are all inherited from class template Iterator.
The list template as introduced in previous sections can be used as follows:
int main() { List<int> list; int ix; for (ix = 0; ix < 10; ix++) list.append(ix); ListIterator<int> iter(list); while (!iter.terminate()) { printf("%d ", iter.current()); iter.succ(); } puts(""); return 0; }
As we have defined a postincrement operator for the list iterator, the loop can also be written as:
while (!iter.terminate()) print("%d ", iter++);
T &operator ++() { succ(); return _current->data(); }
What problems occur?
int remove(const T &data);to class template List. The method should delete the first occurrence of data in the list. The method should return 1 if it removed an element or 0 (zero) otherwise.
What functionality must data provide? Remember that it can be of any type, especially user defined classes!