preface

This article is 2895 words and takes about 12 minutes to read.

Summary: This article summarizes 10 common methods for array deduplication and compares them.

  • Public account: “front-end advanced learning”, reply to “666”, get a package of front-end technology books

Smoke past all forget, heart selfless heaven and earth wide.

The body of the

Array de-duplication is not a common requirement for the front end, and the back end is generally done, but it is an interesting question, and often appears in the interview to test the interviewer’s mastery of JS. In this paper, from the perspective of data types to think about the problem of array de-duplication, the first solution is to solve the array only the base data type, and then the object de-duplication. First, our test data:

var meta = [
    0.'0'.true.false.'true'.'false'.null.undefined.Infinity,
    {},
    [],
    function(){},
    { a: 1.b: 2 },
    { b: 2.a: 1},];var meta2 = [
    NaN.NaN.Infinity,
    {},
    [],
    function(){},
    { a: 1.b: 2 },
    { b: 2.a: 1},];var sourceArr = [...meta, ... Array(1000000)
    .fill({})
    .map(() = > meta[Math.floor(Math.random() * meta.length)]), ... meta2];Copy the code

All sourceArr references in the following are above variables. The sourceArr contains 100,008 pieces of data. Note NaN, which is the only value in JS that is not strictly equal to itself.

Then our goal is to get the sourceArr array from above:

// Array of length 14
[false."true".Infinity.true.0, [] {},"false"."0".null.undefined, {a: 1.b: 2}, NaN.function(){}]
Copy the code

Underlying data types

1. ES6 in the Set

This is a common method in ES6. For simple base data type deduplication, you can use this method directly by extending the + Set operator:

console.time(ES6 Set time:);
var res = [...new Set(sourceArr)];
console.timeEnd(ES6 Set time:);
// Set time in ES6:28.736328125ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

Or use array. from + Set:

console.time(ES6 Set time:);
var res = Array.from(new Set(sourceArr));
console.timeEnd(ES6 Set time:);
// Set time in ES6:28.538818359375ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

** Advantages: ** simple and convenient, can distinguish NaN;

** Disadvantages: ** cannot recognize the same object and array;

This method is recommended for simple scenarios.

2. Use the indexOf

Use the built-in indexOf method to find:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    for (var i = 0; i < arr.length; i++) {
        if (array.indexOf(arr[i]) === -1) {
            result.push(arr[i])
        }
    }
    return result;
}
console.time(IndexOf method time:);
var res = unique(sourceArr);
console.timeEnd(IndexOf method time:);
// indexOf method time: 23.376953125ms
console.log(res);
/ / print the array length 21: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN,NaN, function(){}, function(){}]
Copy the code

Advantages: ES5 common methods, high compatibility, easy to understand;

Disadvantages: Inability to distinguish nans; Need special treatment;

It can be used in an ES6 environment or below.

3. Use the includes method

Similar to indexOf, but includes is a new API for ES7:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    for (var i = 0; i < arr.length; i++) {
        if(! result.includes(arr[i])) { result.push(arr[i]) } }return result;
}
console.time('includes method time: ');
var res = unique(sourceArr);
console.timeEnd('includes method time: ');
// Includes method time-consuming: 32.412841796875ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

Advantages: Nans can be distinguished;

Disadvantages: The ES version is more demanding and time-consuming than the indexOf method;

4. Use filter and indexOf methods

Filter elements by checking whether the current index is equal to the found index:

function unique(arr) {
   	if (!Array.isArray(arr)) return;
    return arr.filter(function(item, index, arr) {
        // Current element, the first index in the original array == current index value, otherwise return current element
        return arr.indexOf(item, 0) === index;
    });
}
console.time('Filter and indexOf method time:');
var res = unique(sourceArr);
console.timeEnd('Filter and indexOf method time:');
// Includes method time-consuming: 24.135009765625ms
console.log(res);
/ / print the array length 19: [false, "true", Infinity, true, 0, [], [] and {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, function(){}, function(){}]
Copy the code

Advantages: the use of high order function code greatly shortened;

Disadvantages: NaN is ignored because indexOf cannot find NaN.

This approach is elegant and requires very little code, but it’s still a fly in the face of using the Set structure for de-duplication.

5. Use the reduce + includes

Also clever use of two higher-order functions:

var unique = (arr) = >  {
   if (!Array.isArray(arr)) return;
   return arr.reduce((prev,cur) = > prev.includes(cur) ? prev : [...prev,cur],[]);
}
var res = unique(sourceArr);
console.time('Reduce and includes methods take:');
var res = unique(sourceArr);
console.timeEnd('Reduce and includes methods take:');
// The reduce and includes methods take 100.47802734375ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

Advantages: the use of high order function code greatly shortened;

Disadvantages: THE ES version is demanding and slow;

It’s also very elegant, but if you can use this method, you can also use the Set structure.

6. Use the Map structure

Using map:

function unique(arr) {
  if (!Array.isArray(arr)) return;
  let map = new Map(a);let result = [];
  for (let i = 0; i < arr.length; i++) {
    if(map .has(arr[i])) {
      map.set(arr[i], true); 
    } else { 
      map.set(arr[i], false); result.push(arr[i]); }}return result;
}
console.time('Map structure time: ');
var res = unique(sourceArr);
console.timeEnd('Map structure time: ');
// Map structure time: : 41.483154296875ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

It is not recommended to use the Set structure because it takes longer to remove weight.

7. Double nesting, use splice to delete duplicate elements

This is also a common way to double-traverse an array and pick out duplicate elements:

function unique(arr){    
    if (!Array.isArray(arr)) return;        
    for(var i = 0; i < arr.length; i++) {
        for(var j = i + 1; j<  arr.length; j++) {
            if(Object.is(arr[i], arr[j])) {// The first is the same as the second, and the splice method deletes the second
                arr.splice(j,1); j--; }}}return arr;
}
console.time('Double nested method time:');
var res = unique(sourceArr);
console.timeEnd('Double nested method time:');
// The double-layer nesting method takes: 41500.452880859375ms
console.log(res);
/ / print the array length 20: [false, "true", Infinity, true, 0, [], [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]
Copy the code

Advantages: High compatibility.

Disadvantages: Low performance and high time complexity.

Not recommended.

8. Use the sort method

Sort the array, then sort through the array and sort out the elements that are not the same as their neighbors:

 function unique(arr) {
   if (!Array.isArray(arr)) return;
   arr = arr.sort((a, b) = > a - b);
   var result = [arr[0]].for (var i = 1; i < arr.length; i++) {
     if(arr[i] ! == arr[i-1]) { result.push(arr[i]); }}return result;
 }
console.time('sort method time: ');
var res = unique(sourceArr);
console.timeEnd('sort method time: ');
// Sort method takes: 936.071044921875ms
console.log(res);
// The array length is 357770, the rest is omitted
// print :(357770) [Array(0), Array(0), 0...
Copy the code

Advantages: None;

Disadvantages: Time-consuming, not controllable data after sorting;

It is not recommended because sorting by sort cannot sort numeric type 0 and string type ‘0’, resulting in a large amount of redundant data.

The above method is only for the base data type, for the object array function is not considered, let’s look at how to duplicate the same object.

Object

The following implementation is similar to using the Map structure in that the key of the object is not duplicated

9. Use hasOwnProperty and Filter

Using the filter and hasOwnProperty methods:

function unique(arr) {
  	if (!Array.isArray(arr)) return;
    var obj = {};
    return arr.filter(function(item, index, arr) {
        return obj.hasOwnProperty(typeof item + item) ? false : (obj[typeof item + item] = true)})}console.time('hasOwnProperty method time: ');
var res = unique(sourceArr);
console.timeEnd('hasOwnProperty method time: ');
// hasOwnProperty method time: : 258.528076171875ms
console.log(res);
// Print array length 13: [false, "true", Infinity, true, 0, [], {}, "false", "0", null, undefined, NaN, function(){}]
Copy the code

Advantages: simple code, can distinguish the same object array function;

Cons: High versioning requirements, low performance because you need to find the entire prototype chain;

{a: 1, b: 2} and {} are treated as the same data. {a: 1, b: 2} are treated as the same data. Therefore, this method also has disadvantages.

10. Take advantage of non-duplicate object keys

This approach is similar to using the Map structure, but the key composition is different:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
     var  obj = {};
    for (var i = 0; i < arr.length; i++) {
        var key = typeof arr[i] + JSON.stringify(arr[i]) + arr[i];
        if(! obj[key]) { result.push(arr[i]); obj[key] =1;
        } else{ obj[key]++; }}return result;
}
console.time('Object method time:');
var res = unique(sourceArr);
console.timeEnd('Object method time:');
// Object method time: : 585.744873046875ms
console.log(res);
/ / print the array length 15: [false, "true", Infinity, true, 0, [] and {b: 2, a: 1}, {}, "false", "0", null, and undefined, {a: 1, b: 2}, NaN, function(){}]
Copy the code

This method is mature and removes duplicate arrays and duplicate objects, but for objects like {a: 1, b: 2} and {b: 2, a: Json.stringify () yields {“a”:1,”b”:2} and {“b”:2,”a”:1}. Add a method to determine whether objects are equal or not, rewrite as follows:

function isObject(obj) {
    return Object.prototype.toString.call(obj) === '[object Object]';
}
function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
     var  obj = {};
    for (var i = 0; i < arr.length; i++) {
      	// Add objects and arrays here
        if (Array.isArray(arr[i])) {
            arr[i] = arr[i].sort((a, b) = > a - b);
        }
        if (isObject(arr[i])) {
            let newObj = {}
            Object.keys(arr[i]).sort().map(key= > {
                newObj[key]= arr[i][key];
            });
            arr[i] = newObj;
        }
        var key = typeof arr[i] + JSON.stringify(arr[i]) + arr[i];
        if(! obj[key]) { result.push(arr[i]); obj[key] =1;
        } else{ obj[key]++; }}return result;
}
console.time('Object method time:');
var res = unique(sourceArr);
console.timeEnd('Object method time:');
// Object method time: 793.142822265625ms
console.log(res);
/ / print the array length 14: [false, "true", Infinity, true, 0, [] and {b: 2, a: 1}, {}, "false", "0", null, and undefined, NaN, function () {}]
Copy the code

conclusion

methods advantages disadvantages
Set in the ES6 Simple and elegant, fast Basic types are recommended. Version requirements are high, object arrays and are not supportedNaN
Use the indexOf ES5 The following common methods are highly compatible and easy to understand Unable to distinguish betweenNaN; Require special treatment
Use the includes method Can distinguish betweenNaN The ES version is demanding, andindexOfThe method is time-consuming
Use the filter and indexOf methods Greatly shortened using higher-order function code; Due to theindexOfUnable to findNaN, soNaNBe ignored.
By using the reduce + includes Greatly shortened using higher-order function code; ES7 can only be used above, the speed is slow;
Using Map structure No obvious advantages ES6 above,
Double nesting, using splice to remove duplicate elements Compatibility of high Low performance, high time complexity, if not usedObject.isYou need to be rightNaNSpecial processing, extremely slow speed.
Using sort method There is no It takes a long time, and the data is not controllable after sorting.
Use hasOwnProperty and Filter : simple code, can distinguish the same object array function Versioning is high because you have to find the entire prototype chain and therefore performance is low;
Take advantage of the non-repetitive nature of object keys Elegant, wide range of data Object Recommended. The code is complicated.

The ability is limited, the level is general, welcome to erratum, greatly appreciated.

To subscribe for more articles, follow the public account “Front-end Advanced Learning” and reply to “666” for a package of front-end technology books