RadixTree data structure is used in Redis, it is an improved version of trie tree, this article will not parse all methods. Only core method annotations are provided. And an introduction to the structure itself.

RadixTree itself is also a K/V storage structure. Compared with hash buckets (corresponding to Java hashMap, Redis dict), inserting and reading takes more time. O(logn) is the same as a normal tree structure. The hash bucket’s insert/read time complexity can simply be considered O(1)(without considering hash conflicts). But it saves memory overhead. In contrast to a dictionary tree some shared strings are compressed into a node. The dictionary tree stores only one character per node and cannot be shared. The number of addresses and memory overhead of the radix tree is smaller than that of the dictionary tree.

Lazy, no drawing. Here to borrow another blog chart to display base tree structure mysql.taobao.org/monthly/201…

In fact, the fastest way to look at a data structure is to understand it graphically, why each field is there and what it means. Now let’s sort out the structure of radix.

The core point is the raxNode->data attribute. There are two types of nodes, which are distinguished by iscompr. One type of node, which I’ll call a compression node, stores a string internally, its length indicated by data->size. The other one is bifurcation nodes, like “AB “,” AC “,” AD “, then there will be a node. Data ->size = [b, C,d] data->size = [b, C,d] The nodes starting with B, C and D (which may be bifurcation nodes or compression nodes) under the child nodes exist two special bifurcation nodes, one is empty node and the other is single-character bifurcation node. (In fact, the single-character node can be either a bifurcation node or a compression node 2 with a length of 1 character. RaxNode ->iskey means that all strings from the head node to the parent node are one key. This is important for understanding the algorithm. Why iskey can only represent the parent. In the case of bifurcation nodes, if the iskey of the node represents the string formed above the node is a key. You can’t really tell which string you’re referring to in the case of multiple branches. For example, if [b,c,d]. Iskey is true, it cannot be determined which key is ab, AC, or AD.

Figure out what happens to insert and remove in normal situations, and then work with code comments.

Insert: 1. After normal insert, an empty node is created and iskey is set to true, mainly to indicate that the string from the parent node to the head node is a key. 2. If there is no shared string at all, a bifurcation node is generated under the head node, and the string itself is split. The first unshared string is pulled out and placed in the bifurcation node, because a single character can represent a different situation. The rest of the string becomes a compressed node if it is used only by itself, with only one node to store the rest of the characters, followed by an empty node (same as the first point). 3. If you have any part before with a string of sharing, then find the first string is not Shared, and will be used in match before compression node split (matching parts can continue to share, on the first part does not match the merged into one bifurcation nodes, the remainder of the merged into a compressed node, insert a blank nodes) finally

Removed: 1. When a key has been removed, if a node is compressed, it represents the compression part of this node exclusive, can be removed together, if I found the parent node is a bifurcation nodes, the need to judge the branch number, if the quantity is 1, so after removing the node bifurcation node itself also don’t have to exist, to continue upward to remove… 2. If the last node is a bifurcation node, reduce the number of bifurcation nodes. If the number of branches is 1, remove the whole bifurcation node and continue to judge whether to touch the node upward. It’s the logic of repeating 1,2.

The above information is based on recollection only, may differ from the actual code. A comment on the core method is posted below.

int raxGenericInsert(rax *rax, unsigned char *s, size_t len, void *data, void **old, int overwrite) {
    size_t i;
    int j = 0; /* Split position. If raxLowWalk() stops in a compressed
                  node, the index 'j' represents the char we stopped within the
                  compressed node, that is, the position where to split the
                  node for insertion. */

    raxNode *h, **parentlink;

    debugf("### Insert %.*s with value %p\n", (int)len, s, data);
    // &parentlink是一个三级指针
    // 这里寻找匹配的字符串前缀,并返回下标
    i = raxLowWalk(rax,s,len,&h,&parentlink,&j,NULL);

    /* If i == len we walked following the whole string. If we are not
     * in the middle of a compressed node, the string is either already
     * inserted or this middle node is currently not a key, but can represent
     * our key. We have just to reallocate the node and make space for the
     * data pointer.
     * 代表字符串完全匹配
     * 此时h指向的是子节点
     * rax中key是否存在需要通过判断子节点的 iskey标识 下面会有相关的判断逻辑
     * !h->iscompr || j == 0 这段逻辑的意思是 如果子节点不是压缩节点(是一个分岔节点 那么iskey只能表达之前所有链路上的字符是一个key)
     * 如果子节点是一个压缩节点 同时j==0 就代表压缩节点不需要分裂(此时字符已经完全匹配,子节点不受影响),iskey属性就是描述之前所有链路上的字符是一个key
     * 当字符串与某个压缩节点的部分前缀匹配时就需要进行分裂
     * */
    if (i == len && (!h->iscompr || j == 0 /* not in the middle if j is 0 */)) {
        debugf("### Insert: node representing key exists\n");
        /* Make space for the value pointer if needed.
         * 如果子节点iskey == false 代表i作为key还没有存储到rax中
         * 或者key存在 但是之前未设置数据
         * */
        if (!h->iskey || (h->isnull && overwrite)) {
            // 分配更多的内存呢 以便存储data
            h = raxReallocForData(h,data);
            // 更新parentlink指针对应的值
            if (h) memcpy(parentlink,&h,sizeof(h));
        }
        // 对应raxReallocForData 无法分配足够内存的情况
        if (h == NULL) {
            errno = ENOMEM;
            return 0;
        }

        /*
         * Update the existing key if there is already one.
         * 此时字符串完全匹配 并且子节点对应的iskey为true 代表在rax中该字符串已经以key的形式存在了
         * */
        if (h->iskey) {
            // 如果传入了 old指针 就获取该key原先绑定的值
            if (old) *old = raxGetData(h);
            // 如果允许覆盖数据 进行覆盖
            if (overwrite) raxSetData(h,data);
            errno = 0;
            return 0; /* Element already exists. */
        }

        /* Otherwise set the node as a key. Note that raxSetData()
         * will set h->iskey.
         * 此时字符串完全匹配 但是字符串还没有成为key 需要修改子节点的标识位 并设置value(data存在的情况下)
         * */
        raxSetData(h,data);
        // 因为有一个新的key产生 所以element+1
        rax->numele++;
        return 1; /* Element inserted. */
    }

    /* ------------------------- ALGORITHM 1 ---------------------------
     * 代表本次与某个压缩前缀不匹配 需要分裂
     * 这种情况代表会产生一个分岔节点 同时将新字符串包装成节点连接到分岔节点上 (压缩节点被分裂后的部分字符串也会变成新的节点并连接到分岔节点)
     * */
    if (h->iscompr && i != len) {
        debugf("ALGO 1: Stopped at compressed node %.*s (%p)\n",
            h->size, h->data, (void*)h);
        debugf("Still to insert: %.*s\n", (int)(len-i), s+i);
        debugf("Splitting at %d: '%c'\n", j, ((char*)h->data)[j]);
        debugf("Other (key) letter is '%c'\n", s[i]);

        /* 1: Save next pointer. */
        // 找到该节点下最后一个子节点 压缩节点实际上只有一个子节点
        raxNode **childfield = raxNodeLastChildPtr(h);
        // 将子节点数据暂存在这里
        raxNode *next;
        memcpy(&next,childfield,sizeof(next));
        debugf("Next is %p\n", (void*)next);
        debugf("iskey %d\n", h->iskey);
        if (h->iskey) {
            debugf("key value is %p\n", raxGetData(h));
        }

        /* Set the length of the additional nodes we will need. */
        // 记录从哪里开始分裂
        size_t trimmedlen = j;
        // 剩下多少长度 需要向下沉淀
        size_t postfixlen = h->size - j - 1;
        // 代表压缩节点对应共享前缀的第一个字符就不匹配 这里应该要生成一个分岔节点
        int split_node_is_key = !trimmedlen && h->iskey && !h->isnull;
        size_t nodesize;

        /* 2: Create the split node. Also allocate the other nodes we'll need
         *    ASAP, so that it will be simpler to handle OOM.
         *    开始生成分岔节点
         *    */
        raxNode *splitnode = raxNewNode(1, split_node_is_key);
        raxNode *trimmed = NULL;
        raxNode *postfix = NULL;

        // 分配足够的空间
        if (trimmedlen) {
            nodesize = sizeof(raxNode)+trimmedlen+raxPadding(trimmedlen)+
                       sizeof(raxNode*);
            if (h->iskey && !h->isnull) nodesize += sizeof(void*);
            trimmed = rax_malloc(nodesize);
        }

        if (postfixlen) {
            nodesize = sizeof(raxNode)+postfixlen+raxPadding(postfixlen)+
                       sizeof(raxNode*);
            postfix = rax_malloc(nodesize);
        }

        /* OOM? Abort now that the tree is untouched. */
        // 处理内存不够的情况
        if (splitnode == NULL ||
            (trimmedlen && trimmed == NULL) ||
            (postfixlen && postfix == NULL))
        {
            rax_free(splitnode);
            rax_free(trimmed);
            rax_free(postfix);
            errno = ENOMEM;
            return 0;
        }

        // 可以看到从j开始的数据被赋值到了新的node上
        // 此时原来的节点下游就应该连接一个分岔节点  分岔节点下面一个关联新的字符串生成的节点 另一个关联splitnode
        splitnode->data[0] = h->data[j];

        // 当j为0时 比较简单 将0的字符与新插入的未匹配上的第一位字符组合成一个分岔节点
        if (j == 0) {
            /* 3a: Replace the old node with the split node. */
            // 将节点上原本维护的数据转移到子节点  iskey属性不用修改 因为这个属性是表示上个节点整条链路上的字符是一个key
            if (h->iskey) {
                void *ndata = raxGetData(h);
                raxSetData(splitnode,ndata);
            }
            memcpy(parentlink,&splitnode,sizeof(splitnode));
        } else {
            /* 3b: Trim the compressed node. */
            // 这里要分裂成3个节点
            // 第一个节点是与新字符串还有部分共享前缀 作为一个缩小的压缩节点
            // 第二个节点是分岔节点 从2个字符串不匹配的地方开始
            // 第三个节点是原节点分割后挂载在分岔节点下的子节点
            trimmed->size = j;
            // 将第一部分的数据取出来 设置到trimmed上
            memcpy(trimmed->data,h->data,j);
            // 如果共享长度为1 会被看作一个单岔路分岔节点 而不是一个压缩节点
            trimmed->iscompr = j > 1 ? 1 : 0;
            // 这些属性都是为了表示父节点的 直接继承就好
            trimmed->iskey = h->iskey;
            trimmed->isnull = h->isnull;
            // 将父节点key对应的value 转移到第一个节点上
            if (h->iskey && !h->isnull) {
                void *ndata = raxGetData(h);
                raxSetData(trimmed,ndata);
            }
            // 将第一个节点的尾部连接到 第二个节点 也就是分岔节点上
            raxNode **cp = raxNodeLastChildPtr(trimmed);
            memcpy(cp,&splitnode,sizeof(splitnode));
            // 更新parentlink指向的内存数据
            memcpy(parentlink,&trimmed,sizeof(trimmed));
            parentlink = cp; /* Set parentlink to splitnode parent. */
            rax->numnodes++;
        }

        // 在分岔节点上追加子节点 以及将本次传入的key剩余部分连接到分岔节点的逻辑可以共用 所以没有放在上面的if else中
        /* 4: Create the postfix node: what remains of the original
         * compressed node after the split. */
        // 代表有后缀数据
        if (postfixlen) {
            /* 4a: create a postfix node. */
            // 因为上层节点继承了原来的iskey属性 这里就不需要设置了(该节点的iskey对应分岔节点中某一分支是否是一个key)
            postfix->iskey = 0;
            postfix->isnull = 0;
            postfix->size = postfixlen;
            // 同样当剩余长度=1时简化成一个分岔节点 
            postfix->iscompr = postfixlen > 1;
            // 将后续的字符串拷贝到节点上
            memcpy(postfix->data,h->data+j+1,postfixlen);
            // 将原本共享前缀的子节点信息挂载到新的后缀节点上
            raxNode **cp = raxNodeLastChildPtr(postfix);
            memcpy(cp,&next,sizeof(next));
            rax->numnodes++;
        } else {
            /* 4b: just use next as postfix node. */
            // 刚好在共享前缀的最后一个字符没匹配上 就不需要分裂出新的节点了
            postfix = next;
        }

        /* 5: Set splitnode first child as the postfix node. */
        // 将下个节点转移到分岔节点下
        raxNode **splitchild = raxNodeLastChildPtr(splitnode);
        memcpy(splitchild,&postfix,sizeof(postfix));

        /* 6. Continue insertion: this will cause the splitnode to
         * get a new child (the non common character at the currently
         * inserted key). */
        // 此时原来的压缩节点就可以被释放了
        rax_free(h);
        // h此时指向分岔节点 此时本次要插入的key还没有挂载到分岔节点上
        h = splitnode;
        // 代表本次新插入的字符串已经用完 只要分裂原来的压缩节点就可以
    } else if (h->iscompr && i == len) {
    /* ------------------------- ALGORITHM 2 --------------------------- */
        debugf("ALGO 2: Stopped at compressed node %.*s (%p) j = %d\n",
            h->size, h->data, (void*)h, j);

        /* Allocate postfix & trimmed nodes ASAP to fail for OOM gracefully. */

        // 生成后缀节点
        size_t postfixlen = h->size - j;
        size_t nodesize = sizeof(raxNode)+postfixlen+raxPadding(postfixlen)+
                          sizeof(raxNode*);
        if (data != NULL) nodesize += sizeof(void*);
        raxNode *postfix = rax_malloc(nodesize);

        // 修改原来节点的大小
        nodesize = sizeof(raxNode)+j+raxPadding(j)+sizeof(raxNode*);
        if (h->iskey && !h->isnull) nodesize += sizeof(void*);
        raxNode *trimmed = rax_malloc(nodesize);

        if (postfix == NULL || trimmed == NULL) {
            rax_free(postfix);
            rax_free(trimmed);
            errno = ENOMEM;
            return 0;
        }

        /* 1: Save next pointer. */
        // 先存储当前压缩节点的下个节点
        raxNode **childfield = raxNodeLastChildPtr(h);
        raxNode *next;
        memcpy(&next,childfield,sizeof(next));

        /* 2: Create the postfix node. */
        // 将相关属性赋值到分裂出来的节点上
        // postfix->iskey = 1 代表上个节点的整条字符串链路是一个key (也就是本次插入的字符串)
        postfix->size = postfixlen;
        postfix->iscompr = postfixlen > 1;
        postfix->iskey = 1;
        postfix->isnull = 0;
        memcpy(postfix->data,h->data+j,postfixlen);
        raxSetData(postfix,data);
        raxNode **cp = raxNodeLastChildPtr(postfix);
        memcpy(cp,&next,sizeof(next));
        rax->numnodes++;

        /* 3: Trim the compressed node. */
        // 当前压缩节点要变小
        trimmed->size = j;
        trimmed->iscompr = j > 1;
        trimmed->iskey = 0;
        trimmed->isnull = 0;
        memcpy(trimmed->data,h->data,j);
        memcpy(parentlink,&trimmed,sizeof(trimmed));
        // 将原压缩节点携带的数据转移到 新压缩节点上
        if (h->iskey) {
            void *aux = raxGetData(h);
            raxSetData(trimmed,aux);
        }

        /* Fix the trimmed node child pointer to point to
         * the postfix node. */
        // 将分裂出来的节点挂载到新压缩节点上
        cp = raxNodeLastChildPtr(trimmed);
        memcpy(cp,&postfix,sizeof(postfix));

        /* Finish! We don't need to continue with the insertion
         * algorithm for ALGO 2. The key is already inserted. */
        rax->numele++;
        rax_free(h);
        return 1; /* Key inserted. */
    }

    /* We walked the radix tree as far as we could, but still there are left
     * chars in our string. We need to insert the missing nodes. */
    // 这里是尝试将多出来的部分包装成节点 并挂载到rax树下
    while(i < len) {
        raxNode *child;

        /* If this node is going to have a single child, and there
         * are other characters, so that that would result in a chain
         * of single-childed nodes, turn it into a compressed node. */
        // 此时h已经确保是一个分岔节点(这里空节点也被认为是一个分岔节点 只存在分岔节点和压缩节点2种) 详见ALGORITHM 1
        // 代表分岔节点此时还没有挂载子节点
        if (h->size == 0 && len-i > 1) {
            debugf("Inserting compressed node\n");
            size_t comprsize = len-i;
            if (comprsize > RAX_NODE_MAX_SIZE)
                comprsize = RAX_NODE_MAX_SIZE;
            // 如果此时分岔节点的长度为0 代表下面还没有挂载任何子节点 那么直接将分岔节点转换成压缩节点
            // 并且同时会生成该压缩节点的子节点
            raxNode *newh = raxCompressNode(h,s+i,comprsize,&child);
            if (newh == NULL) goto oom;
            h = newh;
            // 将最新的节点挂载到父节点下
            memcpy(parentlink,&h,sizeof(h));
            parentlink = raxNodeLastChildPtr(h);
            i += comprsize;
            // 需要将当前字符串拆解 将第一个字符设置到分岔节点上 剩余字符串包装成新节点 并挂载到分岔节点下
        } else {
            debugf("Inserting normal node\n");
            raxNode **new_parentlink;
            // 此时已经将新的第一个字符插入到分岔节点下了
            raxNode *newh = raxAddChild(h,s[i],&child,&new_parentlink);
            if (newh == NULL) goto oom;
            h = newh;
            // 更新原父节点的数据
            memcpy(parentlink,&h,sizeof(h));
            parentlink = new_parentlink;
            i++;
        }
        rax->numnodes++;
        h = child;
    }
    // 将本次的data数据插入到最后的节点上
    raxNode *newh = raxReallocForData(h,data);
    if (newh == NULL) goto oom;
    h = newh;
    if (!h->iskey) rax->numele++;
    // 在set过程中 会间接将h.iskey设置成true
    raxSetData(h,data);
    // 更新子节点信息
    memcpy(parentlink,&h,sizeof(h));
    return 1; /* Element inserted. */
Copy the code

Among the insert methods is a core method, raxLowWalk, which compares the string passed in to characters existing in the current tree and returns the matching length. Also point h to the last matching node. Note: in the case of a perfect match, h will automatically point to the next node. You can know whether the key corresponding to the string already exists by judging the iskey attribute of the node.

Let’s look at the logic of the match

static inline size_t raxLowWalk(rax *rax, unsigned char *s, size_t len, raxNode **stopnode, raxNode ***plink, Int *splitpos, raxStack *ts) {raxNode *h = rax->head; RaxNode **parentlink = &rax->head; Size_t I = 0; Size_t j = 0; size_t j = 0; /* Position in the node children (or bytes if compressed).*/ // h->size Share prefix length or number of children of this node // I < len While (h->size && I < len) {debugNode ("Lookup current node",h); Unsigned char *v = h->data; unsigned char *v = h->data; If (h->iscompr) {if (h->iscompr) {for (j = 0; j < h->size && i < len; J ++, I ++) {// if (v[j]! = s[i]) break; } // Split the prefix if (j! = h->size) break; } else { /* Even when h->size is large, linear scan provides good * performances compared to other approaches that are in theory * more sounding, Like performing a binary search. * If the node is uncompressed the bifurcation corresponds to [A,b,c], which continues by matching a character and finding the corresponding child pointer * when there are no elements inside [] * */ for (j = 0; j < h->size; j++) { if (v[j] == s[i]) break; If (j == h->size) break; if (j == h->size) break; i++; } // This means that the prefix is matched and the next node needs to be found. Also, if the match fails on a node, the pointer passed in from the external node points to the last node to be matched (the successful match points to the child node) If (ts) raxStackPush(ts,h); if (ts) raxStackPush(ts,h); RaxNode **children = raxNodeFirstChildPtr(h); /* Save stack of parent nodes. If (h->iscompr) j = 0; if (h->iscompr) j = 0; /* children+j memcpy(&h,children+j,sizeof(h)); /* child is at index 0. Parentlink = children+j; parentlink = children+j; // start matching the next node j = 0; /* If the new node is non compressed and we do not iterate again (since i == len) set the split position to 0 to signal this node represents the searched key. */ } debugnode("Lookup stop node is",h); If (stopNode) * stopNode = h; if (stopNode) * stopNode = h; if (plink) *plink = parentlink; If (splitpos && h->iscompr) *splitpos = j; if (splitpos && h->iscompr) *splitpos = j; return i; }Copy the code

Here is a TS object, which is a stack structure that stores the nodes traversed during the matching process, so that the parent node can be traced backwards and used for removal.

Finally, look at the remove method

int raxRemove(rax *Remove(rax, unsigned char *s, size_t len, void **old) {
    raxNode *h;
    raxStack ts;

    debugf("### Delete: %.*s\n", (int)len, s);
    // 删除操作需要借助 stack对象
    raxStackInit(&ts);
    int splitpos = 0;
    // 首先确保当前字符对应的key存在于rax中
    size_t i = raxLowWalk(rax,s,len,&h,NULL,&splitpos,&ts);

    // 代表key不存在 与之前get的判断逻辑一样
    if (i != len || (h->iscompr && splitpos != 0) || !h->iskey) {
        // 此时释放之前存储的节点
        raxStackFree(&ts);
        return 0;
    }
    // 代表需要获取旧数据
    if (old) *old = raxGetData(h);
    // 当成功找到key时 返回的h节点指向的是子节点 清除iskey标识
    h->iskey = 0;
    rax->numele--;

    /* If this node has no children, the deletion needs to reclaim the
     * no longer used nodes. This is an iterative process that needs to
     * walk the three upward, deleting all the nodes with just one child
     * that are not keys, until the head of the rax is reached or the first
     * node with more than one child is found. */

    int trycompress = 0; /* Will be set to 1 if we should try to optimize the
                            tree resulting from the deletion. */

    // 检测是否需要对rax结构进行变形
    // 子节点长度为0 此时子节点已经没有存在的必要了(既没有存储共享前缀信息 也没有存储分岔信息)
    if (h->size == 0) {
        debugf("Key deleted in node without children. Cleanup needed.\n");
        raxNode *child = NULL;
        // 循环检测是否满足移除条件  直到回到header节点
        while(h != rax->head) {
            child = h;
            debugf("Freeing child %p [%.*s] key:%d\n", (void*)child,
                (int)child->size, (char*)child->data, child->iskey);
            rax_free(child);
            rax->numnodes--;
            // 弹出之前访问过的节点  raxNode之间是一个单向结构 所以需要借助一个额外的栈对象
            h = raxStackPop(&ts);
             /* If this node has more then one child, or actually holds
              * a key, stop here. */
             // 当遇到了iskey为true 此时本节点不能移除 需要作为父节点的标记节点  或者本节点是一个分岔节点 并且还有其他分支
            if (h->iskey || (!h->iscompr && h->size != 1)) break;
        }
        // 代表至少移除了一个节点
        if (child) {
            debugf("Unlinking child %p from parent %p\n",
                (void*)child, (void*)h);
            // 此时需要将本节点从rax结构中剥离 也就是做一些清理工作 比如此时节点是一个分岔节点 需要减少size 以及缩小data
            raxNode *new = raxRemoveChild(h,child);
            // 如果节点本身没有变化 代表内存重分配失败
            if (new != h) {
                raxNode *parent = raxStackPeek(&ts);
                raxNode **parentlink;
                if (parent == NULL) {
                    parentlink = &rax->head;
                } else {
                    parentlink = raxFindParentLink(parent,h);
                }
                // 更新父节点下指针信息
                memcpy(parentlink,&new,sizeof(new));
            }

            /* If after the removal the node has just a single child
             * and is not a key, we need to try to compress it.
             * 此时分岔节点下如果只有一条分支了 可以尝试将它与父节点进行合并 注意这里要确保iskey为0
             * */
            if (new->size == 1 && new->iskey == 0) {
                trycompress = 1;
                h = new;
            }
        }
        // 子节点只有一条分支或者是压缩节点时 可以尝试与父节点进行合并
    } else if (h->size == 1) {
        /* If the node had just one child, after the removal of the key
         * further compression with adjacent nodes is potentially possible. */
        trycompress = 1;
    }

 
    // 尝试将本节点与父节点压缩
    if (trycompress) {
        debugf("After removing %.*s:\n", (int)len, s);
        debugnode("Compression may be needed",h);
        debugf("Seek start node\n");

        /* Try to reach the upper node that is compressible.
         * At the end of the loop 'h' will point to the first node we
         * can try to compress and 'parent' to its parent. */
        raxNode *parent;
        // 不断向上寻找 直到发现一个无法被合并的节点
        while(1) {
            parent = raxStackPop(&ts);
            if (!parent || parent->iskey ||
                (!parent->iscompr && parent->size != 1)) break;
            h = parent;
            debugnode("Going up to",h);
        }
        raxNode *start = h; /* Compression starting node. */

        /* Scan chain of nodes we can compress. */
        // 此时h可能是一个单分支的分岔节点 也可能是一个压缩节点  但是都能确保此时h只有一个子节点
        size_t comprsize = h->size;
        int nodes = 1;
        while(h->size != 0) {
            raxNode **cp = raxNodeLastChildPtr(h);
            // 使用子节点数据覆盖h此时的数据
            memcpy(&h,cp,sizeof(h));
            if (h->iskey || (!h->iscompr && h->size != 1)) break;
            /* Stop here if going to the next node would result into
             * a compressed node larger than h->size can hold. */
            if (comprsize + h->size > RAX_NODE_MAX_SIZE) break;
            nodes++;
            // 每当发现有一个满足条件的node 就将size累加到comprsize  代表总计压缩了多少长度
            comprsize += h->size;
        }
        // 代表至少有一个节点满足合并条件
        if (nodes > 1) {
            /* If we can compress, create the new node and populate it. */
            // 计算最新的node长度
            size_t nodesize =
                sizeof(raxNode)+comprsize+raxPadding(comprsize)+sizeof(raxNode*);
            raxNode *new = rax_malloc(nodesize);
            /* An out of memory here just means we cannot optimize this
             * node, but the tree is left in a consistent state. */
            if (new == NULL) {
                raxStackFree(&ts);
                return 1;
            }
            new->iskey = 0;
            new->isnull = 0;
            new->iscompr = 1;
            new->size = comprsize;
            // 这里先假定会插入这个新的节点 将nodes+1 在下面会开始拷贝数据 同时减少nodes
            rax->numnodes++;

            /* Scan again, this time to populate the new node content and
             * to fix the new node child pointer. At the same time we free
             * all the nodes that we'll no longer use. */
            // 主要用来跟踪data的偏移量
            comprsize = 0;
            h = start;
            while(h->size != 0) {
                memcpy(new->data+comprsize,h->data,h->size);
                comprsize += h->size;
                raxNode **cp = raxNodeLastChildPtr(h);
                raxNode *tofree = h;
                memcpy(&h,cp,sizeof(h));
                rax_free(tofree); rax->numnodes--;
                if (h->iskey || (!h->iscompr && h->size != 1)) break;
            }
            debugnode("New node",new);

            /* Now 'h' points to the first node that we still need to use,
             * so our new node child pointer will point to it. */
            // 此时h指向的是未参与合并的节点 挂载到新节点下
            raxNode **cp = raxNodeLastChildPtr(new);
            memcpy(cp,&h,sizeof(h));

            /* Fix parent link. */
            // 将新节点挂载到未参与合并的父节点下
            if (parent) {
                raxNode **parentlink = raxFindParentLink(parent,start);
                memcpy(parentlink,&new,sizeof(new));
            } else {
                rax->head = new;
            }

            debugf("Compressed %d nodes, %d total bytes\n",
                nodes, (int)comprsize);
        }
    }
    // 释放整个stack结构的内存
    raxStackFree(&ts);
    return 1;
}
Copy the code

These are the core methods, and the rest of them, like traversal,get, and set, you can look at them yourself. Some comments may have errors, because there is no modification, but only for reference. If there is any mistake, please discuss peacefully and forgive me.

For the full note :github.com/a137872798/…