In previous articles we explored the class structure, covering isa and class_data_bits_t. In today's article we turn to cache_t: what does that structure store, and what is the system designed to do with it? With that in mind, let's begin today's exploration.

Knowledge supplement

  • Array: An array is a collection that stores multiple elements of the same type in contiguous memory. It has the following advantages and disadvantages:

Advantages: Accessing an element by subscript is convenient and fast

Disadvantages: Inserting into and deleting from an array is time-consuming, because elements must be shifted

  • Linked list: A linked list is a storage structure whose nodes are non-contiguous and non-sequential in physical memory; the logical order of the elements is realized by the pointer links between nodes. It has the following advantages and disadvantages:

Advantages: Inserting or deleting a node is easy

Disadvantages: Finding the node at a given position takes time, since the list must be traversed from the head

  • Hash table: A data structure that is accessed directly based on key values. It has the following advantages and disadvantages:

Advantages: 1. Accessing an element is fast 2. Insert and delete operations are also very convenient

Disadvantages: The hash function and collision handling make the underlying operations more complex

Cache_t analysis

1. First, let's print the class we have been working with and see what its cache_t stores. The result is as follows:

2. It is difficult to tell from the figure above which data we need to study, so what should we do? Let's look at the definition of the cache_t structure:

3. cache_t has a member _bucketsAndMaybeMask, followed by a union with two members: _originalPreoptCache and an anonymous structure containing _maybeMask, _flags, and _occupied. None of these obviously holds the cached methods we are looking for, so what should we do?

4. In addition to its members, cache_t also has a number of functions. Let's look through them and see whether we can find what we need.

There is a lot going on and it is a bit of a puzzle, but reading calmly we notice that the functions keep revolving around a type called bucket_t when processing data. A bold idea comes to mind: is this what we need? Curious, we click into its definition and find:

5. bucket_t contains _imp and _sel. We all know that IMP and SEL come in pairs: the method selector and the method implementation. So is this what we need? To find out, let's take a look at the runtime optimizations session from WWDC 2020, and then get back to our problem.

6. From Apple's video explanation and some simple analysis we obtained this structure diagram. How do we verify its correctness? We previously shifted 32 bytes from the class pointer to reach class_data_bits_t, so we only need to shift 16 bytes to reach cache_t. The procedure is as follows:

7. The output is not what we expected, so we can be sure this is not the right way to read the value. When we explored method lists, property lists, and so on, we read the data by calling functions instead. With a bit of hope, we went back to the cache_t structure and found the following code:

8. So shall we run the code again and call those functions to see whether the return value is what we want? Here are the results:

9. It seems right, but why is there no data? Remember that our breakpoint was hit when the class was created, before any method had been called, so let's move the breakpoint and step through the process again to see what happens:

10. Why is it still the same? Recall the hash-table knowledge from the start of this article: buckets stores its entries as a hash table, so the slot we read can be empty, because an entry is not necessarily stored in the first position. Let's keep looking at the other slots to confirm that it really is stored as a hash table. Here is my LLDB debugging process:

11. After a series of attempts, I see data in the fourth slot, but what does this data represent, and how do we get it? Going back to bucket_t, we find that it provides the sel() and imp() functions.

12. Let’s first try the effect of sel(), as follows:

13. The above results show that there is no problem with our exploration process, so let’s continue to search for our sayHello method, and the results are as follows:

14. After repeated attempts, we finally found our sayHello method, so let's fetch its imp and take a look. The results are as follows:

15. At this point we are done with the LLDB validation. Next we will validate the cache ourselves by writing some similar code using a custom structure.

Validating cache_t with custom code

1. First, following the process above, we reproduce part of the runtime's structures in our own code, as shown in the figure below:

2. Then we run the code and look at the printed output, as shown in the figure:

3. By force-casting into the custom structure above and printing the memory of the cache, we prove that we are on the right track. Back to the original problem: the cache stores the methods called on an object, saving the mapping from each SEL to its IMP.

Conclusion

Through this article we can see the system's design logic for a caching mechanism, which opens up a new way of thinking about building custom caches or other architecture in our own development.