Understand and implement trees

So far, our exploration of data structures has only touched on the linear part. Whether we use arrays, linked lists, stacks, or queues, all are linear data structures. We have seen the complexity of linear data structure operations, and most of the time, the complexity of inserts and deletes can be expressed as O(1). Search is a little bit complicated, order n complexity. The only exception is PHP arrays, which are actually hash tables, which can reach O(1) complexity if the indexes or keys are managed in such a way. To solve this problem, we can use hierarchical data structures instead of linear data structures. Hierarchical data can solve many problems that are difficult to solve with linear data structures.

Whenever we talk about family trees, organizational structures, and network connectivity graphs, we’re really talking about layered data. A tree is a special abstract data type (ADT). Unlike a linked list, a tree is hierarchical.

Definition and characteristics of trees

A tree is a hierarchical collection of nodes or vertices connected by edges. The tree cannot have loops, and only edges exist between a node and its descendents or children. Two children of the same parent cannot have any edges between them. Each node can have a parent unless it is the top node, also known as the root node. Each tree can have only one root node. Each node can have zero or more children. In the figure below, A is the root node and B, C, and D are the children of A. We can also say that A is the parent of B, C, and D. B, C, and D are called siblings because they are from the same parent node A.

A node without any children is called a leaf. In the previous figure, K, L, F, G, M, I, and J are leaf nodes. Leaf nodes are also called external nodes or terminal nodes. Nodes other than the root have at least one child and are called internal nodes. Here, B, C, D, E, and H are internal nodes. Here are some other common terms used to describe tree data structures:

Descendant: This is a node that can be reached from the parent by a repeating program. For example, M is a descendant of C in the previous figure.

Ancestor: This is a node that can be repeated from a child node to a parent node. For example, B is the ancestor of L.

Degree: The total number of children of a particular parent node is called its degree. In our example, A has 3 degrees, B has 1 degree, C has 3 degrees, and D has 2 degrees.

Path: The sequence of nodes and edges from the source node to the target node is called the path between two nodes. The length of a path is the number of nodes in the path. The path from A to M is A-C-H-m, and the length of the path is 4.

Node height: The height of a node is determined by the number of edges between the node and the deepest node. For example, node B has a height of 2.

Hierarchy: Hierarchies represent the generation of nodes. If the parent is at level N, the children will be at level N+ 1. Therefore, the level is defined by the number of edges between the node and the root.

A. in the zero layer

B, C and D are layer 1

E, F, G, H, I, J are two layers

K, L, and M are on the third floor.

Tree height: The height of a tree is defined by the height of its root node. The height of the tree is 3.

Subtree: In a tree structure, each child recursively forms a subtree. In other words, a tree is made up of many subtrees. For example, B and E, K and L form a subtree, E, K and L form a subtree, and they are identified in each different shadow.

Depth: The depth of a node is determined by the number of edges between the node and the root node. For example, the depth of H is 2 and the depth of L is 3.

Forest: A forest is a group or more of disjointed trees.

Traversal: This represents the process of accessing nodes in a particular order.

Key: used for searching and represents the value of the node.

Use PHP to implement the tree

So far, we’ve looked at the different properties of trees. If we compare trees with real-life examples, we find that organizational structures or genealogical trees can be represented by numbers. For an organizational structure, one root node can be the CEO of the company, followed by EMPLOYEES at the CXO level, followed by employees at other levels. Here, we do not restrict any degree to a particular node. This means that a node can have more than one child. Thus, here is a node structure where we can define the node properties, its parent node, and its children:

class TreeNode
{
    public $data = null;

    public $children = [];

    public function __construct(string $data = null)
    {
        $this->data = $data;
    }

    public function addChildren(TreeNode $treeNode)
    {
        $this->children[] = $treeNode; }}Copy the code

We can see that we have declared two public properties: data and children. We also have a method to add children to a specific node. Here, we’re just adding new child nodes to the end of the array. A tree is a recursive structure, and it will help us build the tree recursively and traverse the tree recursively.

Now that we have the nodes, let’s build a tree structure that will define the root node of the tree as well as traverse the entire tree. Therefore, the basic tree structure will look like this:

class Tree
{
    public $root = null;

    public function __construct(TreeNode $treeNode)
    {
        $this->root = $treeNode;
    }

    public function traverse(TreeNode $treeNode, int $level = 0)
    {
        if ($treeNode) {
            echo str_repeat(The '-', $level) . $treeNode->data . PHP_EOL;

            foreach ($treeNode->children as $child) {
                $this->traverse($child, $level + 1); }}}}Copy the code

The preceding code shows a simple tree class that allows us to store a reference to the root node or to traverse the tree from any node. In the traversal section, we visit each child node, and then immediately recursively call the traversal method to get the children of the current node. We print a dash (-) at the beginning of the node name with a level so that we can easily understand the child data.

require './TreeNode.php';

$ceo = new TreeNode('ceo');

$tree = new Tree($ceo);

$cfo = new TreeNode('cfo');
$cto = new TreeNode('cto');
$cmo = new TreeNode('cmo');
$coo = new TreeNode('coo');

$ceo->addChildren($cfo);
$ceo->addChildren($cto);
$ceo->addChildren($cmo);
$ceo->addChildren($coo);

$seniorArchitect = new TreeNode("Senior Architect");
$softwareEngineer = new TreeNode("SoftwareEngineer");

$userInterfaceDesigner = new TreeNode("userInterface Designer");
$qualityAssuranceEngineer = new TreeNode("qualityAssurance Engineer");


$cto->addChildren($seniorArchitect);
$seniorArchitect->addChildren($softwareEngineer);

$cto->addChildren($userInterfaceDesigner);
$cto->addChildren($qualityAssuranceEngineer);

$tree->traverse($tree->root);
Copy the code

The final output looks something like this, and you can see the full code here

Different types of trees

There are many different types of trees in the code world, so let’s look at them.

Binary tree

A binary tree is a basic tree structure in which each node has at most two children.

Binary search tree

A binary search tree (BST) is a special type of binary tree in which nodes are stored in a sort manner, that is, at any given point, the node value must be greater than or equal to the left child value and less than the right child value. Every node has to satisfy this property, and this is a binary search tree. The binary search tree algorithm is always better than the linear search algorithm, and its time complexity is O(n), which we will explain in detail later.

Self-balancing binary tree

Self-balanced binary search tree or height-balanced binary search tree is a special type of binary search tree, which tries to keep the height or hierarchy of the tree as small as possible through automatic adjustment. The image below shows a binary search tree on the left and a self-balancing binary search tree on the right:

A highly balanced binary tree is always a better choice because it helps search operations faster than regular BST. There are different implementations of self-balanced or highly balanced binary search trees. Some common ones are as follows:

  • AA tree

  • AVL tree

  • Red and black tree

  • Scapegoat tree

  • octree

  • 2-3 tree

  • Treap

We’ll introduce them later in the series, so stay tuned.

More content

PHP basic data structures topic series directory address: github.com/… PHP syntax is used to summarize basic data structures and algorithms. There are also some basic knowledge that is easily overlooked in our daily PHP development and some practical suggestions on specification, deployment and optimization in modern PHP development, as well as in-depth research on the characteristics of the Javascript language.