I'm going to talk about a problem that most data structures and algorithms don't cover in textbooks

Most books on data structures and algorithms, when they talk about some kind of data structure or algorithm, they use integers as the type of data they’re dealing with. In fact, in real software development, the data stored in data structures and processed by algorithms are often “objects” rather than simple integers. The “object” here is well understood as the “class and object” object in the programming language. For example, in the following lines of Java code, we use a red-black tree to store the Order object.

// The order is stored in a red-black tree, where key is the order ID and value is the order object
private TreeMap<Long, Order> id2OrderMap = new TreeMap<>(); 
 
public void add(Order order) { // Add an order
 id2OrderMap.put(order.id, order);
}
 
private class Order { / / order
 public long id;
 public long userId;
 public long createTime;
 / /... more fields...
}
Copy the code

You may wonder, however, if in real software development, data structures and algorithms deal with objects, why the book is all about integers instead of objects?

We know that data structures and algorithms are designed to solve “faster” and “less expensive” problems. Faster means faster code execution and less storage.

“Faster,” you can think of it as “faster find,” of course, add, delete, change, and some statistical operations, but it’s pretty much the same, you understand “find,” and everything else. When we look for the desired data in a set of data, we always look for it by a certain key.

What do you think of the key here? Let’s take the example of the order we just ordered. We want to quickly find orders in this group according to the order ID. Then we take the ID of the order as a key and organize it into red-black tree data structure according to the key. Red-black trees store the key value and the memory address of the order object, which itself is stored in a separate memory space (as in Java, in the heap).

For your convenience, I drew a picture, you can compare it with the text description.

So you might say, well, what if I want to look up an order by its user ID? What to do at this point?

You can create another red-black tree with the user ID as the key. In fact, this is the multiple index structure we mentioned in our previous article. If translated into Java code, it looks like this.

// Two indexes
private TreeMap<Long, Order> id2OrderMap = new TreeMap<>();
private TreeMap<Long, Order> userId2OrderMap = new TreeMap<>();
 
public void add(Order order) { // Order is created externally with only one copy in memory
 id2OrderMap.put(order.id, order);
 userId2OrderMap.put(order.id, order);
}
 
private class Order {
 public long id;
 public long userId;
 public long createTime;
 / /... more fields...
}
Copy the code

If I were to graph it, it would look something like this. You can see that there is only one order object and two index structures, one based on the order ID as the key and the other based on the order user ID as the key.

You might say, isn’t having two indexes a waste of memory? Actually, no.

As you already know from my previous lecture, indexes store only the key value and the memory address of the object. If the key value is a small integer and the object is a large object (the data itself), the memory used by the index is much smaller than the memory used by the object.

Once again, if an order object contains many fields and the size is 1KB, the size of each node (corresponding to an order object) in a red-black tree index is much smaller than 1KB (you can do the calculation yourself). As a result, the memory footprint is much smaller than the memory required to store the order data itself.

That is to say, in real software development, if you store and work with large objects, the amount of memory the index takes up is negligible. Of course, this has to weigh the actual situation, not mechanically mechanically.

Going back to the question at the beginning, if data structures and algorithms are objects in real software development, why are most data structures and algorithms taught as integer types?

Because when we build the data structure, we build it by the key, and when the algorithm executes, it processes the data by the key. Key values are generally integers (and sometimes strings, which are treated the same), so let’s use the integer type as an example, which is straightforward enough.

Zheng Wang, a former Google engineer, is the author of “The Beauty of Data Structures and Algorithms” and “The Beauty of Design Patterns”. Wechat official account: Xiao Zhengge, follow wechat official account reply PDF, get 100+ pages of Google engineers’ algorithm learning and interview experience sharing.

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

I’m going to talk about a problem that most data structures and algorithms don’t cover in textbooks

I’m going to talk about a problem that most data structures and algorithms don’t cover in textbooks

Related Posts

Programmers need to know the core knowledge of compression algorithms

Half a month after the interview, I wrote down these summaries

Styled – components used