Tree Container Library: Implementation: unique_tree

The Tree Container Library

A C++ generic tree container library which, not only works much like the STL, but is also compatible with the STL algorithms. The Tree Container Library is used worldwide in thousands of industries and organizations. It is completely free to use, and is licensed under the zlib/libpng license.

unique_tree differs slightly from the other two associative trees, in it's implementation. While unique_tree uses the many of the base class operations provided by associative_tree and basic_tree, it also overloads many of associative_tree's public operations. For example, unique_tree can not simply use the insert implementation provided by associative_tree, because unique_tree must check the entire tree structure for a duplication before performing the insertion. unique_tree must also maintain an additional member container, descendants for every node, which tracks all of it's descendant nodes. In addition to overloading many of associative_tree's operations, unique_tree provides some additional operations of it's own, like find_deep().

In unique_tree.h, a function object is first defined for the ordered iterators. More info on node iterators will be presented further below.

Like tree and multitree, unique_tree then provides a forward declaration of itself for the sake of the following dereference comparison function object. This comparison operator will be used for the internal container which will store child nodes. This container is a member of basic_tree, called children. The functor overloads the call operator, passing two arguments which are pointers to the tree_type. The call operator simply dereference's the pointers, and compares the stored_type object in the nodes. This function object is used in the third template parameter of the base class (basic_tree) derivation declaration, discussed further below.

The unique_tree accepts three template parameters, rather than two which are accepted by tree and multitree. The first two template parameters are used the same as with tree and multitree.

  • stored_type
    The stored_type template parameter represents the object type which will be stored in every node.
  • node_compare_type
    The node_compare_type provides the comparison operation for the stored_type.
  • node_order_compare_type
    The node_order_compare_type represents the alternate comparison operation for the stored_type, which is used by unique_tree's ordered_iterators.

The derivation declaration in the class declaration passes the same three template parameters to basic_tree as tree and multitree. basic_tree takes three template parameters, supplied by tree and multitree.

  • stored_type
    The stored type which is the first template parameter for unique_tree is passed on as the first template parameter for associative_tree, and then to basic_tree.
  • unique_tree<stored_type, node_compare_type, node_order_compare_type>
    This is the actual fully specified tree/multitree, to inform associative_tree and basic_tree the type of class that is deriving from them.
  • std::set<unique_tree<stored_type, node_compare_type, node_order_compare_type>*, unique_tree_deref_less<stored_type, node_compare_type, node_order_compare_type> >
    This third template parameter for associative_tree. It defines the container used to store the child nodes. Note that a second template parameter for std::set is used to provide the comparison operator for the set. This is the function object comparison object described further above.

Before the constructors and destructor are defined, there are a number of typedefs declared. The first typedef declares the type of itself, the fully specified unique_tree type, tree_type. The second and third typedef declare key_compare and value_compare, needed for STL container compatibility. The forth typedef declares the final base tree type, basic_tree_type. The fifth typedef declares the fully specified base associative_tree type. The sixth typedef declares the container type used by basic_tree, container_type. Next, basic_tree is declared to be a friend of unique_tree. The next typedef is a fully specialized dereference comparison function object, deref_key_compare. Next, two typedefs declare the ordered iterators for unique_tree, const_ordered_iterator and ordered_iterator

The first constructor defined is the default constructor. It delegates responsibility to associative_tree's default constructor. It also initializes the orphan pointer to null, and the allow_orphan flag to false. The second constructor accepts a stored_type object, which it also passes to an associative_tree's constructor, and performs the other initializations as the default constructor. The third constructor declares the copy constructor, which is implemented and commented in the inl file. The destructor is then defined inline.

The public interface is declared/defined next. Most of the operations here are implemented and commented in the inl file, and will be discussed briefly. The first insert() operation is the default insert() operation used also by tree and multitree, which accepts a single parameter, a stored_type object. Like the following insert and set operations, unique_tree must go through some extra work before inserting the node, to insure that the inserted node is not already somewhere in the entire tree structure. unique_tree uses the node_compare_type comparison operation to determine this. Then, if it is OK to insert the new node, unique_tree must do some more work to update the descendant container of the parent, and all the grandparent nodes, up to the root, with the pointer to the newly inserted child node. The descendant containers in each node are used to find any node anywhere in the tree structure, if searched from the root. This is explained further below in the discussion of the find_deep() operation.

The second insert operation is much like the first, but uses the first iterator parameter pos, as a hint to where to insert the node.

The third insert() operation accepts a tree_type parameter. It's used to insert a subtree into a node. Again, unique_tree must insure that no node in the subtree will be a duplicate of an existing node that's currently in the tree. It must also insure that all parents and grandparents to the inserted subtree have their descendant container updated with pointers to all the newly inserted nodes in the subtree. The forth insert operation is much like the third, but uses pos as a hint for inserting the node and it's descendants.

The swap operation is declared next, and is implemented in the inl file.

The fifth insert() operation is particular to the unique_tree. It allows the specification of the parent node in which to insert the child node. unique_tree will attempt to find the parent node, which will be found if it's a descendant of the node in which the insert operation was called. If found, the node will be inserted in the parent. If not found, there are two possibilities:

  • The node will not be inserted if allow_orphans is disabled.
  • The node will be inserted in an 'orphan state' if allow_orphans is enabled.

In the second case, the parent node is inserted in a member tree container, pOrphans, until a later possible insertion of it's parent node would bring it back out of orphans and place it (and it's children) in the normal tree structure. For this reason, there is a lot of overhead to the insert(parent, child) operation. It is very handy, though, for use in situations where the order of node insertion can not guarantee that all parent nodes will be inserted before their child nodes. To understand how the operation of orphans is implemented, refer to the insert(parent, child) implementation and comments in the inl file.

The last insert operation accepts a range of input iterators, which will be inserted. These input iterators can be iterators of other TCL containers, or iterators of STL containers. If the iterators are of TCL containers, they can be either child iterators or descendant iterators.

Next, the find_deep() operations are declared. One is const and the other non-const. These operations find a node if it exists as any of the descendants of the node in which the operation is called. Like the insert operations, these operations are implemented and commented in the inl file.

The next group of operations return ordered_iterators. There is even a find_ordered() operation, which works like the find() operation, but returns an ordered_iterator rather than an iterator. The erase() and clear() operations work much like those in tree and multitree, but again, these operations must insure the integrity of the descendant and ordered_children containers of the affected parent and grandparent nodes.

allow_orphans() can be used for getting or setting the allow_orphan status. Note that this flag is applied to the whole tree structure, not just to a particular node. I'll discuss below how this is implemented. get_orphans() allows access to a read only pointer to the orphan tree structure. is_orphan() indicates if a given node is in an orphan state, or 'normal' state. It determines this with a little trickery. All nodes in the orphan tree contain elements in the children container, but no elements in the ordered_children container. So, if a given node is a node in the orphan container, the root element will have at least one element in the children container, and no elements in the ordered_children container.

There are four private functions in unique_tree used for implementation purposes, which are discussed below.
insert(stored_type*) inserts a node from a stored_type pointer rather a reference to a stored type.
inform_grandparents(tree_type* pNew_child, tree_type* pParent) adds pNew_child to pParent's descendant container, and to all the ancestor node's descendant containers.
bool check_for_duplicate(const stored_type& element, const tree_type* pParent) checks the whole tree structure for a node which already contains element, by going to the root node using pParent, and doing a find_node(). It must perform this test before inserting or setting any node with a stored_type object.
const tree_type* get_root() simply returns the root node pointer of the node that it's called from.

Finally, we come to the data members of unique_tree. There are four data members.

  • mutable std::set<tree_type*, deref_key_compare > descendents;
    descendants contains all pointers to a nodes children, their children, and so on. This member allows unique_tree to narrow down and find any node which is a descendant of itself. The root node, by nature, will contain a pointer to every other node in the whole tree structure, in it's descendants member. Every time a node is added or removed from a parent node, that parent node, and all the parents node's ancestor nodes must have their descendants member updated.
  • std::multiset<tree_type*, deref_ordered_compare> ordered_children;
    ordered_children contains the same child pointers as it's base class, basic_tree's, children container, but the container may use a different comparison operator, hence the elements may have a different order. This order is dictated by unique_tree's third template parameter, which defaults to unique_tree's second template parameter if not explicitly supplied.
  • mutable tree_type* pOrphans;
    pOrphans is the container for orphan nodes (if used) for the tree structure. This pointer is only used in the current root node of the tree structure, so there is a small wasted overhead for each node. The pointer could have been made static, to remove the problem of wasted memory, but then all unique_tree's of the declared type would share the same orphans, and confusion would result for any application using more than one unique_tree of the same type. This member uses lazy instantiation, and is not created on the heap unless/until it's needed.
  • mutable bool allowing_orphans;
    allow_orphans is also only used for the root node of a unique_tree container, which means there is a little more wasted space the size of an int for each node in the tree. The reason this member is not made static is the same reason as mentioned above. This member defaults to false, so if orphan behavior is desired, this member must be explicitly set with allow_orphans(bool).