Today’s special topic introduces an advanced data structure: segment trees. This is something that came to mind when I was writing the Tracing GC thinking about partner systems in memory management. We learn to have power to go to school, if do not understand calculate, need not die knock this kind of thing with not very frequent.

In programming practice, we often encounter some queries on the interval, modify the requirements. To support these operations, a data structure called a line segment tree is introduced. Line segment trees have the following characteristics:

  1. A line segment tree is a highly balanced binary tree. It may be full or complete, but it is not necessary.
  2. Each node of a line segment tree represents an interval. The interval represented by the parent is the sum of two children. The intervals represented by siblings do not overlap each other. The root represents the entire interval

As shown in FIG. 5.1, the root node represents the interval [0,7], its left child represents the interval [0,3], and its right child represents the interval [4,7]. This is a direct dichotomy of the interval represented by the parent node. Although this kind of line segment tree is the most common, the interval length of the two children can actually be changed at will. The nice thing about this change is that the tree is highly balanced, the height of the tree is log(n), which means that the length of the path from the root to any leaf is log(n).


What are the benefits of such a data structure? Let’s look at a problem like this. Given an array of N numbers of data[N] and Q queries. Each query’s input (a, b) is a pair of integers that print out the difference between the maximum and minimum values of data[a] through data[b]. For example, N = 6, Q = 3, an input sample is


The outputs are 6,3,0.

If you do this directly, each query will find the maximum and minimum values in the interval, and then compute the difference. Obviously this approach is very inefficient. Now that we have a line segment tree, we can record the maximum and minimum values of the interval represented by each node in the line segment tree. Never mind how the tree is constructed. If you have a constructed line segment tree, and you want to find the maximum and minimum values of a certain interval, you only need O(logn) complexity to get the result.

Taking the interval [0,4] as an example, we only need to compare the maximum and minimum values on the node [0,3] and the two nodes [4,4] to get the maximum and minimum values on the whole interval [0,4]. If the query result is the maximum and minimum values of the entire interval, return the maximum and minimum values of the root node. As shown in figure 5.2. It can be seen that line segment tree greatly improves the efficiency of query. So how do you build such a line segment tree?


A line segment tree is essentially a binary tree, so the method of building a line segment tree is the same as the method of building a binary tree, which adopts the recursive method to build the tree. The only difference is the data of nodes. Each node stores the interval it represents and the maximum and minimum values on this interval. In fact, the interval represented by itself can be omitted, as we will see later, by cleverly setting the parameters of the recursive program, the interval represented by each node is omitted.

After recursively constructing two children of a node, we can update the maximum and minimum values of the node by comparing the maximum and minimum values of the two children when we go back to that node. If this node is a leaf node, then there’s only one number in the interval that this node corresponds to, so the maximum and minimum are both going to be this number. Build a line segment tree for the example shown in the problem, and the final result is shown in Figure 5.3.


struct node
{
	int max, min;
	node * pleft, * pright;
};

struct node line_tree[N * 2 + 10];
int data[N + 10];

int n;

int max, min;

int nodeCnt = 0;

node * getNewNode()
{
	return &line_tree[nodeCnt++];
}

node * buildTree(int left, int right)
{
	node * root = getNewNode();
	root->max = -1;
	root->min = MAX;

	if (left < right)
	{
		int mid = (left + right) >> 1;
		root->pleft = buildTree(left, mid);
		root->pright = buildTree(mid + 1, right);

		if (root->pleft->max > root->max)
			root->max = root->pleft->max;
		if (root->pright->max > root->max)
			root->max = root->pright->max;
		if (root->pleft->min < root->min)
			root->min = root->pleft->min;
		if (root->pright->min < root->min)
			root->min = root->pright->min;
	}
	else
	{
		root->min = data[left];
		root->max = data[left];
	}
	return root;
}

void question(node * root, int left, int right, int lleft, int lright)
{
	int mid;
	if (max > root->max && min < root->min)
		return;

	if (lleft == left && lright == right)
	{
		if (root->max > max)
			max = root->max;
		if (root->min < min)
			min = root->min;

		return;
	}

	mid = (left + right) >> 1;
	if (mid >= lleft)
	{
		if (mid >= lright)
			question(root->pleft, left, mid, lleft, lright);
		else
		{
			question(root->pleft, left, mid, lleft, mid);
			question(root->pright, mid + 1, right, mid + 1, lright);
		}
	}
	else
		question(root->pright, mid + 1, right, lleft, lright);
}

int main()
{
	int tmp, Q;
	int a, b;

	scanf_s("%d %d", &n, &Q);
	for (int i = 1; i <= n; i++)
		scanf_s("%d", data + i);

	node * root = buildTree(1, n);

	for (int i = 0; i < Q; i++)
	{
		scanf_s("%d %d", &a, &b);
		max = -1;
		min = MAX;
		question(root, 1, n, a, b);
		printf("%d\n", max - min);
	}

	return 0;
}
Copy the code

BuildTree is the same function that builds a binary tree recursively. The only difference is that we update the Max and min values of this node after going back to this node.

Here we focus on the implementation of question. Root represents the node queried previously, lleft and lright represent the left and right ends of the query interval respectively, and left and right represent the left and right ends of the current node interval respectively. If lleft is equal to left and lright is equal to right, it means that the interval being queried matches exactly the interval represented by the current node. Otherwise, the range to be queried depends on whether the child falls to the left or right of the current node.

When falling on the right child, the query interval should be completely in the right half of the current node interval, that is, lleft is greater than mid. At this point, the recursive query on the right child can be done. When it falls on the left child, the interval to be queried is completely in the left half of the current node interval, that is, lright is less than mid, and the left child is recursively queried. If the query interval spans two children, split the query interval between mid and two children.

Program analysis, it can be seen that although the line segment tree is complex, its root is still a binary tree. And that’s why, in the section on binary trees, we talked about binary trees being kind of the building blocks of data structures. If you have a deep understanding of the structure of binary trees, it is easy to master line segment trees.

A lesson:

The lesson:

Course Catalogue: Course catalogue