Basic Tree Terminologies


WHY TREES?
There are many basic data structures that can be used to solve application problems. The array is a good static data structure that can be accessed randomly and is fairly easy to implement. Linked Lists on the other hand is dynamic and is ideal for an application that requires frequent operations such as add, delete, and update. One drawback of the linked list is that data access is sequential. Then there are other specialized data structures like, stacks and queues that allow us to solve complicated problems (eg: Maze traversal) using these restricted data structures. One other data structure is the hash table that allows users to program applications that require frequent search and updates. They can be done in O(1) in a hash table.
One of the disadvantages of using an array or linked list to store data is the time necessary to search for an item. Since both the arrays and Linked Lists are linear structures the time required to search a “linear” list is proportional to the size of the data set. For example, if the size of the data set is n, then the number of comparisons needed to find (or not find) an item may be as bad as some multiple of n. So imagine doing the search on a linked list (or array) with n = 106 nodes. Even on a machine that can do a million comparisons per second, searching form items will take roughly m seconds. This not acceptable in today’s world where the speed at which we complete operations is extremely important. Time is money. Therefore it seems that better (more efficient) data structures are needed to store and search data. 
Divide and Conquer yields algorithms whose execution has a tree structure. Sometimes we build data structures that are also trees. It is probably not surprising that divide and conquer is the natural way to build algorithms that use such trees as inputs.

DEFINITION:
A tree is a hierarchical collection of nodes(also called vertices or points). One of the nodes, known as the root, is at the top of the hierarchy. Each node can have at most one link coming into it. The node where the link originates is called the parent node. The root node has no parent. The links leaving(under) a node (any number of links are allowed) point to child nodes. Trees are recursive structures. Each child node is itself the root node of a subtree. At the bottom of the tree are leaf nodes, which have no children. A tree can be empty with no nodes or a tree is a structure consisting of one node called the root and zero or one or more subtrees. 
More formally, a tree can be de ned as either the empty tree or a node with a list of successor trees.

PECULIARITIES:

  • Root - the top node in a tree (node A in the above example).
  • Node - the item of information.
  • Parent - the converse notion of the child.
  • Siblings - nodes with the same parent (B, C, D are siblings of A, and E, F are the siblings of B).
  • Children nodes – roots of the subtrees of a node, X, are the children of X.
  • Descendant - a node reachable by repeated proceeding from parent to child.
  • Ancestor - a node reachable by repeated proceeding from child to parent A, C, and G are the ancestors of K.
  • Leaf or Terminal node or External node - a node with no children (degree zero)(E, J, K, H, and I).
  • Nonterminal nodes – nodes other than terminal nodes.
  • Internal node - a node with at least one child.
  • Degree - number of subtrees of a node.
  • Edge - the connection between one node to another (all links in the figure).
  • Path - a sequence of nodes and edges connecting a node with a descendant.
  • Level - The level of a node is defined by 1 + the number of connections between the node and the root. The maximum number of nodes at any level is 2^n. (B, C, and D are the same level). The root node is at level zero.
  • Height - The height of a node is the length of the maximum downward path(length) between the node and a leaf. The height of B is 2 (B-F-J).
  • The height of the tree - It is the maximum height among all the nodes in the tree and depth of the tree is the maximum depth among all the nodes in the tree. For a given tree, depth and height return the same value. But for individual nodes, we may get different results.
  • The depth of a node - It is the length of the path from the root to the node (depth of G is 2, A-C-G).
  • The size of a node is the number of descendants it has including itself (the size of the
    subtree C is 3).
  • Forest - A forest is a set of n ≥ 0 disjoint trees. If we remove the root of a tree we get a forest.
SKEW TREES:
If every node in a tree has only one child (except leaf nodes) then we call such trees skew trees. If every node has an only left child then we call them left skew trees. Similarly, if every node has only the right child then we call them right skew trees.

PROPERTIES:
  • One node is distinguished as a root;
  • Every node (exclude a root) is connected by a directed edge from exactly one other node; A direction is: parent ~> children


A is a parent of B, C, D,
B is called the child of A.
on the other hand, B is a parent of E, F, K

GENERAL TREE:
A general tree is a tree where each node may have zero or more children (a binary tree is a specialized case of a general tree). General trees are used to model applications such as file systems.



PROTOTYPE CODE FOR GENERAL TREE:
Since each node in a tree can have an arbitrary number of children, and that number is not known in advance, the general tree can be implemented using a first child/next sibling method. Each node will have TWO pointers: one to the leftmost child, and one to the rightmost sibling. The following code illustrates this
public class TNode
{
private Object  data;
private MyLinkedList siblings;
private TNode my_left_child;
public TNode(Object n){data=n; siblings=NULL;myLeftChild=NULL;}
}


 

Post a Comment

0 Comments