The whole process

1. Select the file, slice it, calculate the file hash, and ask the server whether some slices of this file have already been uploaded.

2. If so, the server returns the list of uploaded slices; compare it against the full slice list to get the slices still missing, and upload only those. Otherwise, upload all slices.

3. For each received slice, the server creates a folder named with the file hash (if it does not exist yet). Slice files are stored in that folder and named by their slice serial number.

4. When all slices have been uploaded successfully, send a request to merge them; the server merges the slices into the final file and deletes the hash folder.

File directory:

  • routes: route handlers
  • static: pages and static resource files; uploaded and downloaded files are stored in static/files
  • app.js: project entry file

Uploading file slices

Selecting a file returns an array of File objects. The File class inherits from Blob, so it also has a slice method: file.slice(start, end) returns a new Blob containing the data in the specified range of the original Blob. We use this method to slice large files and upload the pieces. A browser can typically send up to six concurrent HTTP requests to the same origin, which means up to six slices can be in flight at once. In theory this should be faster than sending one large file (in practice it turns out to be only a little faster, but being able to resume after an interruption is the real win).

  • 1. Generate slices
// CHUNK_SIZE is a constant defined elsewhere, e.g. 5 * 1024 * 1024 (5MB)
function createFileChunk(file, size = CHUNK_SIZE) {
    var fileChunkList = []
    var cur = 0, i = 0
    while (cur < file.size) {
        fileChunkList.push({
            file: file.slice(cur, cur + size),
            idx: i
        })
        cur += size
        i++
    }
    return fileChunkList
}
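For context, here is a minimal sketch of how slicing might be wired to the file input's change event; the element id, the CHUNK_SIZE value, and the variable names are assumptions, not shown in the original:

var CHUNK_SIZE = 5 * 1024 * 1024 // assumed slice size: 5MB
var chunkList, file

// start slicing (and hashing, see below) as soon as a file is picked
document.querySelector('#fileInput').addEventListener('change', function (e) {
    file = e.target.files[0]
    if (!file) return
    chunkList = createFileChunk(file)
})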
  • 2. Compute the file hash

A hash is the unique identifier of a file; you can use it to query the file's upload status. The spark-md5 library is used to generate the hash. Large files must be read in chunks for the hash calculation: reading a large file in one go may freeze or even crash the page.

Append each chunk's content to the incremental spark-md5 calculation until the whole file has been read, then return the final hash.

function genHash() {
    return new Promise(function (resolve, reject) {
        var chunkList = createFileChunk(file),
            chunks = chunkList.length,
            currentChunk = 0,
            spark = new SparkMD5.ArrayBuffer(),
            fileReader = new FileReader()

        function loadNext() {
            fileReader.readAsArrayBuffer(chunkList[currentChunk].file);
        }
        fileReader.onload = function (e) {
            spark.append(e.target.result)
            currentChunk++
            if (currentChunk < chunks) {
                loadNext()
            } else {
                var md5 = spark.end()
                resolve(md5)
            }
        }
        fileReader.onerror = function (err) {
            console.warn('oops, something went wrong.')
            reject(err)
        }
        loadNext()
    })
}
  • 3. Query the file status

After obtaining the file hash, ask the server whether a folder named with that hash exists. If it does, list all files in the folder to get the list of uploaded slices; if not, the list of uploaded slices is empty.

Ajax('POST','http://localhost:9000/checkFile',{
    name: FileMd5
}).then(function(res){
    var chunks = res.data.chunks
    ...
})
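Ajax here is a small Promise-based XMLHttpRequest wrapper that is not shown in this excerpt; a minimal sketch of what it might look like (the JSON body and response handling are assumptions):

function Ajax(method, url, data) {
    return new Promise(function (resolve, reject) {
        var xhr = new XMLHttpRequest()
        xhr.open(method, url)
        xhr.setRequestHeader('Content-Type', 'application/json')
        xhr.onreadystatechange = function () {
            if (xhr.readyState == 4) {
                if (xhr.status == 200) {
                    resolve(JSON.parse(xhr.responseText)) // { code, data } payload
                } else {
                    reject(new Error('request failed: ' + xhr.status))
                }
            }
        }
        xhr.send(data ? JSON.stringify(data) : null)
    })
}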

  • 4. Send slices

Compare the uploaded slices against the full slice list to determine which slices still need to be uploaded.

var chunks = res.data.chunks // slices that already exist on the server
var list = chunkList // default: upload all slices
if (chunks.length > 0) {
    var newList = []
    // chunks holds the slice file names, i.e. the slice serial numbers
    chunks.sort(function (a, b) { return a - b })
    for (var i = 0, j = 0; i < chunkList.length; i++) {
        if (chunks[j] == chunkList[i].idx) {
            j++
        } else {
            newList.push(chunkList[i])
        }
    }
    list = newList
}

The Promise.all method is used to upload the slices: if all slices upload successfully, the success callback runs; if any slice fails, the failure callback runs.

function addRequest(chunkList, hash) {
    return chunkList.map(function (chunk) {
        var formData = new FormData()
        formData.append('data', chunk.file)
        formData.append('index', chunk.idx)
        formData.append('hash', hash)
        return uploadAjax('POST', 'http://localhost:9000/uploadBigFile', formData, {
            progressCallback: function (loaded, total) { chunk.loaded = loaded }
        })
    })
}
Promise.all(addRequest(list, FileMd5)).then(function (res) {
    mergeRequest(FileMd5, fileName)
}, function (err) {})
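Promise.all above fires every request at once and lets the browser queue them six at a time. If you'd rather cap concurrency yourself (for instance to keep connections free for other traffic), a small promise pool can replace the plain map; this is an optional variation, not part of the original code:

function uploadWithPool(chunkList, hash, limit) {
    var queue = chunkList.slice() // pending slices
    function worker() {
        if (!queue.length) return Promise.resolve()
        var chunk = queue.shift()
        var formData = new FormData()
        formData.append('data', chunk.file)
        formData.append('index', chunk.idx)
        formData.append('hash', hash)
        return uploadAjax('POST', 'http://localhost:9000/uploadBigFile', formData, {
            progressCallback: function (loaded, total) { chunk.loaded = loaded }
        }).then(worker) // start the next slice when this one finishes
    }
    // start `limit` chains; each chain uploads its slices one after another
    var workers = []
    for (var k = 0; k < Math.min(limit, queue.length); k++) {
        workers.push(worker())
    }
    return Promise.all(workers)
}

Calling uploadWithPool(list, FileMd5, 4) would keep at most four slice uploads in flight at any moment.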
  • 5. After all slices are uploaded successfully, send a request to merge the file. On the server side, after the slices are merged into one file, the slice files are deleted.
function mergeRequest(hash, name) {
    Ajax('POST', 'http://localhost:9000/mergeFile', {
        hash: hash,
        name: name
    }).then(function (res) {
        // ...
    })
}

The server side

The server uses Koa2 and koa-router.

app.js

const Koa = require('koa');
const path = require('path');
const koaBody = require('koa-body');
const static = require('koa-static');
const home = require('./routes/home.js')
const app = new Koa();

app.use(async (ctx, next) => {
    fileFilter(ctx)
    await next()
})
app.use(koaBody({
    // ctx.request.body holds the body of a POST request;
    // ctx.query holds the parameters of a GET request;
    // ctx.request.files gives access to uploaded files
    multipart: true, // enable file upload
    formidable: {
        maxFileSize: 2 * 1024 * 1024 * 1024, // max size of a single uploaded file: 2GB
        multipart: true
    }
}));
app.use(static(path.join(__dirname, './static')));
app.use(home.routes(), home.allowedMethods())
app.on("error", (err, ctx) => { // catch and log errors
    console.log(new Date(), ":", err);
});
app.listen(9000, () => {
    console.log('server is listening on 9000');
});

function fileFilter(ctx) {
    const url = ctx.request.url
    const p = /^\/files\//
    if (p.test(url)) {
        ctx.set('Accept-Ranges', 'bytes')
        ctx.set('Content-Disposition', 'attachment')
    }
}

The uploaded file slices are stored in the files/temp folder, and the merged files are stored in the files folder.
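The route handlers below rely on two path helpers, resolveFilePathTemp and resolveFilePath, whose definitions are not shown in this excerpt; a plausible implementation, assuming home.js lives in routes and the files directory under static:

const path = require('path')

// resolves a name inside static/files/temp (slice storage)
function resolveFilePathTemp(name) {
    return path.resolve(__dirname, '../static/files/temp', name)
}
// resolves a name inside static/files (merged files)
function resolveFilePath(name) {
    return path.resolve(__dirname, '../static/files', name)
}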

  • Query the file upload status. Check whether a folder named with the hash exists under files/temp. If it does, return the slices already uploaded into that folder.
router.post('/checkFile', async (ctx) => {
    var { name } = ctx.request.body // name is the file hash
    var chunks = []
    var files = fs.readdirSync(resolveFilePathTemp(''))
    files.forEach(file => {
        if (file == name) {
            var filePath = resolveFilePathTemp(file);
            var stat = fs.statSync(filePath);
            if (stat.isDirectory()) {
                chunks = fs.readdirSync(resolveFilePathTemp(file))
            }
        }
    })
    ctx.body = {
        code: 400001,
        data: {
            chunks: chunks
        }
    }
});
  • Receive slices. Check whether a folder named with the hash exists; create it if not, then store the slice into the folder.
router.post('/uploadBigFile', async (ctx) => {
    var { index, hash } = ctx.request.body
    var file = ctx.request.files.data
    await saveFragmentFile(file, hash, index)
    ctx.body = {
        code: 400001,
        message: 'upload successful'
    }
});
// handle an uploaded slice file
async function saveFragmentFile(file, hash, index) {
    return new Promise((resolve, reject) => {
        var exist = fs.existsSync(resolveFilePathTemp(hash))
        if (!exist) {
            fs.mkdirSync(resolveFilePathTemp(hash))
        }
        var writeFilePath = resolveFilePathTemp(`${hash}/${index}`)
        var readStream = fs.createReadStream(file.path);
        var writeStream = fs.createWriteStream(writeFilePath);
        readStream.pipe(writeStream);
        readStream.on("end", () => {
            resolve()
        })
        readStream.on("error", () => {
            reject()
        })
    })
};

  • Merge slices. Use the hash to find the folder to merge. fs.readdirSync does not guarantee the slice files in numeric order, so sort by file name (the slice serial number) first, then write each slice to the destination file in turn. Delete the hash folder after the merge.
router.post('/mergeFile', async (ctx) => {
    var { hash, name } = ctx.request.body
    // sort by file name, i.e. by slice serial number
    var files = fs.readdirSync(resolveFilePathTemp(hash)).sort(function (a, b) { return a - b })
    var dirs = files.map((item) => {
        return resolveFilePathTemp(`${hash}/${item}`)
    })
    await mergeFile(dirs, resolveFilePath(name))
    deleteDir(resolveFilePathTemp(hash))
    ctx.body = {
        code: 400001,
        message: 'upload successful'
    }
});
async function mergeFile(dirs, writePath) {
    const fileWriteStream = fs.createWriteStream(writePath);
    return new Promise((resolve) => {
        mergeFileRecursive(dirs, fileWriteStream, resolve)
    })
}
function mergeFileRecursive(dirs, fileWriteStream, resolve) {
    if (!dirs.length) {
        fileWriteStream.end()
        resolve()
        return
    }
    const currentFile = dirs.shift()
    const currentReadStream = fs.createReadStream(currentFile)
    // keep the write stream open across slices
    currentReadStream.pipe(fileWriteStream, { end: false });
    currentReadStream.on('end', function () {
        mergeFileRecursive(dirs, fileWriteStream, resolve);
    });
}
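deleteDir is also not shown in the excerpt; a minimal recursive sketch (newer Node versions could simply use fs.rmSync(dir, { recursive: true }) instead):

function deleteDir(dir) {
    var files = fs.readdirSync(dir)
    files.forEach(function (file) {
        var current = path.join(dir, file)
        if (fs.statSync(current).isDirectory()) {
            deleteDir(current) // recurse into subfolders
        } else {
            fs.unlinkSync(current) // remove slice file
        }
    })
    fs.rmdirSync(dir) // remove the now-empty folder
}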

Upload progress

Because multiple slices upload concurrently, the xhr.upload.onprogress listener of a single slice cannot directly give the upload progress of the whole file. Instead, we listen to xhr.upload.onprogress for each slice and store the bytes uploaded so far in a loaded property on that slice's object. Every second we read loaded from each slice object, sum the values, and divide by the size of the whole file to get the overall upload progress.

var interval
// fileSize is the whole file's size; progressBar is the DOM node displaying progress
function setProgress(){
    interval = setInterval(function(){
        var loadedSum = 0
        chunkList.forEach(item => {
           if(item.loaded){
               loadedSum += item.loaded
           }
        });
        var percent = loadedSum/fileSize >= 1 ? 1: loadedSum/fileSize
        progressBar.innerText = (percent * 100).toFixed(2) + '%'
    },1000)
}
function clearProgress(){
    clearInterval(interval)
}
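For reference, a sketch of the uploadAjax helper assumed earlier, showing where progressCallback hooks into xhr.upload.onprogress (the response handling is an assumption):

function uploadAjax(method, url, formData, option) {
    return new Promise(function (resolve, reject) {
        var xhr = new XMLHttpRequest()
        xhr.open(method, url)
        xhr.upload.onprogress = function (ev) {
            if (ev.lengthComputable && option.progressCallback) {
                option.progressCallback(ev.loaded, ev.total)
            }
        }
        xhr.onreadystatechange = function () {
            if (xhr.readyState == 4) {
                if (xhr.status == 200) {
                    resolve(JSON.parse(xhr.responseText))
                } else {
                    reject(new Error('upload failed: ' + xhr.status))
                }
            }
        }
        xhr.send(formData) // the browser sets the multipart boundary itself
    })
}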

Generating the hash in a Web Worker

Generating the file hash is a time-consuming operation that blocks the JS thread, and even when the hash is computed chunk by chunk, the larger the chunks, the more noticeable the blocking. So we use a Web Worker to do the work on another thread. Because hashing a large file takes a long time, we don't wait for the upload button: we start the hash calculation in the onchange event of the file input. This means that if the file in the input changes, the current hash calculation must be aborted and restarted.

importScripts("./spark-md5.min.js")
var fileReader = new FileReader(),
    chunkList,chunks,currentChunk,spark

function loadNext() {
    fileReader.readAsArrayBuffer(chunkList[currentChunk].file);
}
fileReader.onload = function(e){
    console.log('load')
    spark.append(e.target.result)
    currentChunk++
    if (currentChunk < chunks) {
        loadNext()
    }else {
        var md5 = spark.end()
        self.postMessage(md5)
    }
}
fileReader.onerror = function (err) {
    console.warn('oops, something went wrong.')
    self.postMessage(null);
}
fileReader.onabort = function (err) {
    console.log('abort')
}
self.addEventListener('message', function (e) {
    if (e.data.changeChunk) {
        // the file changed: abort any read in progress and start over
        fileReader.abort()
    }
    chunkList = e.data.chunkList
    chunks = chunkList.length
    currentChunk = 0
    spark = new SparkMD5.ArrayBuffer()
    loadNext()
}, false);
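On the main thread, the worker is created once and handed the chunk list after each file selection; a sketch of that wiring (the worker file name hash.worker.js is an assumption):

var hashWorker = new Worker('./hash.worker.js')

function genHashInWorker(chunkList, fileChanged) {
    return new Promise(function (resolve) {
        hashWorker.onmessage = function (e) {
            resolve(e.data) // the final md5, or null if reading failed
        }
        // changeChunk asks the worker to abort a calculation already in progress
        hashWorker.postMessage({ chunkList: chunkList, changeChunk: fileChanged })
    })
}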

Download

  • Method 1

When a file link is clicked, media types such as .html, .jpg, and .mp4 are rendered in the tab, while .rar and .zip files are downloaded directly.

If you want all resources to be downloaded directly regardless of media type, add Content-Disposition: attachment to the response header.

app.use(async (ctx, next) => {
  fileFilter(ctx)
  await next()
})
function fileFilter(ctx){
  const url = ctx.request.url
  const p = /^\/files\//
  if(p.test(url)){
    ctx.set('Accept-Ranges', 'bytes')
    ctx.set('Content-Disposition', 'attachment')
  }
}

Clicking an a tag then downloads the file directly.

  • Method 2

If the user wants to monitor the download progress and customize behavior while it runs, the above approach won't work. Use Ajax instead and listen to the xhr.onprogress event.

var xhr = new XMLHttpRequest()
// assumes xhr.responseType has been set to 'blob' (or 'arraybuffer') before send
xhr.onreadystatechange = function () {
    if (xhr.readyState == 4 && xhr.status == 200) {
        // xhr.response holds the binary data
        var blob = new Blob([xhr.response], { type: downloadOption.fileType });
        var href = window.URL.createObjectURL(blob);
        var link = document.createElement('a');
        link.href = href;
        link.download = downloadOption.fileName;
        link.click();
        if (downloadOption.successCallback) {
            downloadOption.successCallback()
        }
    }
}
xhr.onprogress = function (ev) {
    if (ev.lengthComputable && downloadOption.progressCallback) {
        downloadOption.progressCallback(ev.loaded, ev.total)
    }
}
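The snippet above references a downloadOption object and omits the request setup; a sketch of a complete helper built around it (the option names mirror those used above, the URL and file names are examples):

function downloadFile(url, downloadOption) {
    var xhr = new XMLHttpRequest()
    xhr.open('GET', url)
    xhr.responseType = 'blob' // receive the body as binary
    xhr.onload = function () {
        if (xhr.status == 200) {
            var href = window.URL.createObjectURL(xhr.response) // already a Blob
            var link = document.createElement('a')
            link.href = href
            link.download = downloadOption.fileName
            link.click()
            window.URL.revokeObjectURL(href) // free the object URL
            if (downloadOption.successCallback) downloadOption.successCallback()
        }
    }
    xhr.onprogress = function (ev) {
        if (ev.lengthComputable && downloadOption.progressCallback) {
            downloadOption.progressCallback(ev.loaded, ev.total)
        }
    }
    xhr.send()
}

// usage: download a zip and log progress
downloadFile('http://localhost:9000/files/demo.zip', {
    fileName: 'demo.zip',
    progressCallback: function (loaded, total) {
        console.log((loaded / total * 100).toFixed(2) + '%')
    },
    successCallback: function () { console.log('done') }
})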

With this second approach, as shown above, the download stops if the page is accidentally closed mid-transfer. So for large files we choose the first option.

Project address: github.com/alasolala/f…