Based on TS implementation, mainly because you can write fewer comments.

The preparatory work

Data structure definition

// Raw data units, similar to rows in a database table.
type TItem = { pid: number.id: number, [key: string] :any }
// A set of raw data, similar to a relational database table.
type TItemList = TItem[]
// The target data structure is hierarchical with children.
type TItemMap = TItem & { children: TItemMap[] }
// Target data set.
type TItemMapList = TItemMap[]
Copy the code

The Mock data

Write a data generator that generates a specified amount of data for later performance testing:

const genItems = (count: number = 1000, group: number = 4) = > {
    const arr: TItem[] = []
    let i = 0
    while(count >= ++i) {
        arr.push({
            id: i,
            pid: (i - 1) / group >> 0.ts: Date.now(),
            hash: Math.random()
        })
    }
    return arr.sort(() = > Math.random() - 0.5)}Copy the code
// Generate 11 data, each data up to 3 child nodes, sorted out of order
// genItems(11, 3)[{"id": 5."pid": 1."ts": 1626177721826."hash": 0.7183450458733935
  },
  {
    "id": 3."pid": 0."ts": 1626177721826."hash": 0.24706757764501863
  },
  {
    "id": 2."pid": 0."ts": 1626177721826."hash": 0.7232386518898348
  },
  // ...
]
Copy the code

The recursive implementation

In general, when developing for JS, recursion should be a priority if you can abstract out certain operating rules.

The implementation code

const mapping_r = (items: TItemList, startId: number = 0) :TItemMapList= > 
    items
        .filter(({ pid }) = > pid === startId)
        .map(({ id, ... rest }) = >({ id, ... rest,children: mapping_r(items, id)
        }))
Copy the code

In this scenario, you can see that every time you call mapping_r, you need to filter the full data once, so the overall complexity is O(n^2).

Performance test method

const timer = (
    name: string,
    total: number,
    count: number,
    mapping: (items: TItemList) => TItemMapList
) = > {
    console.log('='.repeat(16), name, '='.repeat(16))
    for(let i = 0; i < count; i++) {
        const items = genItems(total)
        console.time(`${name}_${total}_ the first${i + 1}Wheel `)
        const result = mapping(items)
        console.timeEnd(`${name}_${total}_ the first${i + 1}Wheel `)}}Copy the code

performance

timer('Recursive implementation'.1000.4, mapping_r)
timer('Recursive implementation'.2000.4, mapping_r)
timer('Recursive implementation'.5000.4, mapping_r)
timer('Recursive implementation'.10000.4, mapping_r)
Copy the code

The results

= = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _1000_ round 1:24.209 ms recursive implementation _1000_ round 2:19.346 ms recursive implementation _1000_ round 3: 17.443 ms recursive implementation _1000_ round 4:6.561 ms = = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _2000_ round 1:18.954 ms recursive implementation _2000_ second round: 19.262 ms recursive implementation _2000_ round 3:18.627 ms recursive implementation _2000_ round 4:17.186 ms = = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _5000_ round 1: 85.696 ms recursive implementation _5000_ round 2:84.699 ms recursive implementation _5000_ round 3:82.379 ms recursive implementation _5000_ round 4: 86.283 ms = = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _10000_ round 1:297.386 ms recursive implementation _10000_ round 2:478.334 ms recursive implementation _10000_ round 3: 356.48ms Recursive implementation _10000_ Round 4:356.114msCopy the code

At 5000 data items, performance degrades significantly; At 10,000, this performance becomes unacceptable.

Here’s the thing: Performance problems are not recursive.

Hash table implementation

The implementation code

const mapping_l = (items: TItem[], startId: number = 0): TItemMap[] => {
    if(items.length < 1) {
        return[]}else if(items.length === 1) {
        return [{ ...items[0].children] : []}}const mapper = {} as { [key: string]: TItemMap | { children: TItemMap[] } }
    const tmp: number[] = []
    items.forEach(item= > {
        constm = { ... item,children: []}if(m.id in mapper) {
            // placeholder mergemapper[m.id] = { ... item,children: [ ...mapper[m.id].children ]}
        } else {
            mapper[m.id] = m
        }
        if(! (m.pidin mapper)) {
            // The parent node is occupied
            mapper[m.pid] = { children: [] }
        }
        mapper[m.pid].children.push(m)
        if(m.pid === startId) {
            tmp.push(m.id)
        }
    })
    return tmp.map(id= > mapper[id]) as TItemMap[]
}
Copy the code

The whole process is only done once. The key value of mapper can be regarded as O(1), and the overall complexity is O(n).

Here’s a trick: When traversing items in an orderly fashion, it’s possible that the child node comes before the parent; At this point, a structure is created

mapper[m.pid] = { children: []}Copy the code

The key of the parent node is occupied; When the parent node appears, the two are merged.

performance

= = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash _1000_ first1Round:0.759ms hash implementation _1000_th2Round:0.601ms hash implementation _1000_ the first3Round:0.585MS hash implementation _1000_first4Round:0.851 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash _2000_ first1Round:1.032MS hash implementation _2000_ first2Round:0.727MS hash implementation _2000_ first3Round:2.025MS hash implementation _2000_ first4Round:1.348 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash _5000_ first1Round:2.14ms hash implementation _5000_first2Round:2.304ms hash implementation _5000_first3Round:1.588MS hash implementation _5000_first4Round:1.547 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash _10000_ first1Round:4.215MS hash implementation _10000_ the first2Round:3.57ms hash implementation _10000_ the first3Round:2.712ms hash implementation _10000_ the first4Round:5.879ms
Copy the code

Compared to the recursive implementation above, the performance difference is really quite large; Moreover, as the amount of data increases, the time consumption increases linearly.

extension

Ts compilation and running

When it came to performance testing, I initially used TS-Node directly for testing, but it felt very unreliable.

So NPM initializes the directory where the file (flat.ts) resides, installs typescript, and executes:

npx tsc flat.ts && node flat.js
Copy the code

NPX TSC flat.ts generates compiled JS files in the current directory and runs the monitor on Node.

Parametric operation

To facilitate comparison, we can run with parameters each time. In the code, we try to get the argument:

const [total = 1000, count = 4] = process.argv.slice(2)
timer('Recursive implementation'.Number(total), Number(count), mapping_r)
timer(Hash implementation.Number(total), Number(count), mapping_l)
Copy the code

Change the running script:

% npx tsc flat.ts && node flat.js 1000= = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _1000_ round 1:18.47 ms recursive implementation _1000_ round 2:16.042 ms recursive implementation _1000_ round 3: 16.605 ms recursive implementation _1000_ round 4:7.061 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash implementation _1000_ round 1:0.701 ms hash implementation _1000_ second round: 0.687ms hash implementation _1000_ round 3:0.561ms hash implementation _1000_ round 4:0.895ms
% npx tsc flat.ts && node flat.js 2000= = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _2000_ round 1:58.678 ms recursive implementation _2000_ round 2:21.475 ms recursive implementation _2000_ round 3: 20.842 ms recursive implementation _2000_ round 4:14.497 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash implementation _2000_ round 1:1.181 ms hash implementation _2000_ second round: 1.256ms hash implementation _2000_ round 3:1.877ms hash implementation _2000_ round 4:2.078ms
% npx tsc flat.ts && node flat.js 5000 2= = = = = = = = = = = = = = = = recursive implementation = = = = = = = = = = = = = = = = recursive implementation _5000_ round 1:177.188 ms recursive implementation _5000_ second round: 71.924 ms = = = = = = = = = = = = = = = = hash implementation = = = = = = = = = = = = = = = = hash implementation _5000_ round 1:4.273 ms hash implementation _5000_ round 2:6.698 msCopy the code

This way, the time ratio is much clearer, and it feels cool.

Thank you

After interviewing more than a dozen advanced front ends, I couldn’t even write a flat data structure to turn Tree.

The hash implementation in this article references the code above.

But I have to make fun of the title of this article… : (

The above.