
1. Problem Description

You are given a list paths of directory information, where each item contains a directory path and all the files in that directory together with their contents. Return all groups of duplicate files in the file system, expressed as file paths. The answer can be returned in any order.

A set of duplicate files consists of at least two files that have exactly the same content.

Information about a single directory in the input list has the following format:

“root/d1/d2/…/dm f1.txt(f1_content) f2.txt(f2_content) … fn.txt(fn_content)”. This means that the directory root/d1/d2/…/dm contains n files (f1.txt, f2.txt … fn.txt) whose contents are (f1_content, f2_content … fn_content) respectively. Note: n >= 1 and m >= 0. If m = 0, the directory is the root directory.
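To make the format concrete, here is a minimal sketch that parses a single directory entry. The helper name parseDirectory is hypothetical and is not part of the solution below; it assumes that items are separated by single spaces and that file contents contain no spaces or parentheses.

```javascript
// Hypothetical helper (not from the original solution): parse one directory entry.
// Assumes single-space separators and contents without spaces or parentheses.
function parseDirectory(entry) {
    // The first token is the directory; the rest are "name(content)" items.
    const [dir, ...files] = entry.split(' ');
    return files.map((f) => {
        const open = f.indexOf('(');
        return {
            path: dir + '/' + f.slice(0, open),
            content: f.slice(open + 1, -1), // drop the trailing ')'
        };
    });
}

console.log(parseDirectory('root/a 1.txt(abcd) 2.txt(efgh)'));
// [ { path: 'root/a/1.txt', content: 'abcd' },
//   { path: 'root/a/2.txt', content: 'efgh' } ]
```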

The output is a list of groups of duplicate file paths. Each group contains the paths of all files that share the same content. A file path is a string with the following format:

“directory_path/file_name.txt”

Example 1

Input: paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)","root 4.txt(efgh)"]
Output: [["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]

Example 2

Input: paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)"]
Output: [["root/a/2.txt","root/c/d/4.txt"],["root/a/1.txt","root/c/3.txt"]]

2. Approach Analysis

In this problem, the input is an array paths. Each item in the array contains a directory path, followed by the files in that directory and their contents. For example, ["root/a 1.txt(abcd) 2.txt(efgh)", "root/c 3.txt(abcd)", "root/c/d 4.txt(efgh)", "root 4.txt(efgh)"] means:

  • The directory root/a contains two files, 1.txt and 2.txt. The content of 1.txt is ‘abcd’ and the content of 2.txt is ‘efgh’;
  • The directory root/c contains a file 3.txt with the content ‘abcd’;
  • The directory root/c/d contains a file 4.txt with the content ‘efgh’;
  • The root directory contains a file 4.txt with the content ‘efgh’.

What we need is the list of files that share the same content: for each content that occurs in more than one file, we collect the full paths of those files into one list. Since this is a grouping by content, a hash table keyed by file content solves the problem.

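  • 1. Define a hash table contentMap to store the file lists; the key is the file content and the value is the list of file paths;
// No prototype, so a content such as "constructor" cannot hit an inherited property
let contentMap = Object.create(null);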
  • 2. Iterate over the input array paths and read the files in each directory;
for(let i = 0; i < paths.length; i++){
    // path[0] is the directory; the remaining items are "name(content)" entries
    let path = paths[i].split(' ');
    for(let j = 1; j < path.length; j++){
        // file[0] is the file name, file[1] is the file content
        let file = path[j].split(/\(|\)/g);
        const arr = contentMap[file[1]] || [];
        arr.push(path[0] + '/' + file[0]);
        contentMap[file[1]] = arr;
    }
}
  • 3. Put the files of each directory into the contentMap hash table.

Splitting each entry with path[j].split(/\(|\)/g) breaks it at the parentheses, which separates the file name from the file content.

let path = paths[i].split(' ');
for(let j = 1; j < path.length; j++){
    let file = path[j].split(/\(|\)/g);
    // Reuse the list for this content if it exists, otherwise start a new one
    const arr = contentMap[file[1]] || [];
    arr.push(path[0] + '/' + file[0]);
    contentMap[file[1]] = arr;
}
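As a small illustration of the split step, the sketch below shows what the regular expression returns for one entry. The sample string is taken from Example 1; the variable names are only for illustration.

```javascript
// Splitting on '(' or ')' leaves a trailing empty string after the closing ')'.
const entry = '1.txt(abcd)';
const parts = entry.split(/\(|\)/g);
console.log(parts); // [ '1.txt', 'abcd', '' ]

// Only the first two items are needed: the file name and the file content.
const [name, content] = parts;
console.log(name, content); // 1.txt abcd
```

The trailing empty string is harmless here because the solution only reads indexes 0 and 1.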
  • 4. Iterate over the hash table contentMap and add every list that contains at least 2 file paths to the returned result.
let res = [];
for(let k in contentMap){
    // A group counts as duplicates only if two or more files share the content
    if(contentMap[k].length > 1){
        res.push(contentMap[k]);
    }
}
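The same filtering step can also be sketched with Object.values and Array.prototype.filter. This is an equivalent alternative rather than the form used in the AC code, and the sample contentMap below is made up for the illustration.

```javascript
// Sample data for illustration only: content -> list of file paths.
const contentMap = {
    abcd: ['root/a/1.txt', 'root/c/3.txt'],
    efgh: ['root/a/2.txt'],
};

// Keep only the lists with two or more paths, i.e. the real duplicate groups.
const res = Object.values(contentMap).filter((list) => list.length > 1);
console.log(res); // [ [ 'root/a/1.txt', 'root/c/3.txt' ] ]
```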

3. AC code

/**
 * @param {string[]} paths
 * @return {string[][]}
 */
var findDuplicate = function(paths) {
    // content -> list of full file paths; no prototype, so a content such as
    // "constructor" cannot hit an inherited property
    let contentMap = Object.create(null);
    for(let i = 0; i < paths.length; i++){
        let path = paths[i].split(' ');
        for(let j = 1; j < path.length; j++){
            let file = path[j].split(/\(|\)/g);
            const arr = contentMap[file[1]] || [];
            arr.push(path[0] + '/' + file[0]);
            contentMap[file[1]] = arr;
        }
    }
    let res = [];
    for(let k in contentMap){
        if(contentMap[k].length > 1){
            res.push(contentMap[k]);
        }
    }
    return res;
};
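For comparison, here is a sketch of the same idea written with a Map instead of a plain object, run on the input of Example 1. The function name findDuplicateWithMap is hypothetical and this variant is not the submitted AC code. A Map avoids any clash between file contents and built-in object properties.

```javascript
// Variant sketch (an assumption, not the original AC code): same idea with a Map.
function findDuplicateWithMap(paths) {
    const contentMap = new Map(); // file content -> list of full file paths
    for (const entry of paths) {
        const [dir, ...files] = entry.split(' ');
        for (const f of files) {
            const [name, content] = f.split(/\(|\)/g);
            if (!contentMap.has(content)) contentMap.set(content, []);
            contentMap.get(content).push(dir + '/' + name);
        }
    }
    // Keep only the contents shared by two or more files.
    return [...contentMap.values()].filter((list) => list.length > 1);
}

const example1 = [
    'root/a 1.txt(abcd) 2.txt(efgh)',
    'root/c 3.txt(abcd)',
    'root/c/d 4.txt(efgh)',
    'root 4.txt(efgh)',
];
const result = findDuplicateWithMap(example1);
console.log(result);
// [ [ 'root/a/1.txt', 'root/c/3.txt' ],
//   [ 'root/a/2.txt', 'root/c/d/4.txt', 'root/4.txt' ] ]
```

The groups come out in the order in which each content is first seen, which differs from the order shown in Example 1; this is fine because the answer may be returned in any order.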