The fourth installment of the Koa source code reading series covers serving file data to the requester of an interface.


Dealing with static files is tedious: static files live on the server, and you can't give an interface unrestricted permission to read them. Validating paths and checking permissions all need to be taken into account. koa-send and koa-static are the middleware that handle these tedious tasks for us. koa-send is the foundation of koa-static; on the npm page you can see that koa-send is listed among koa-static's dependencies.

koa-send is primarily used to make static files easier to handle. Unlike middleware such as koa-router, it is not passed directly to app.use as a function. Instead, you call it inside your own middleware, passing in the context of the current request and the location of the file, and it does the rest.

GitHub address of koa-send

Native file reading and transmission

In Node, transferring file data with the native fs module looks something like this:

const fs      = require('fs')
const Koa     = require('koa')
const Router  = require('koa-router')

const app     = new Koa()
const router  = new Router()
const file    = './test.log'
const port    = 12306

router.get('/log', ctx => {
  const data = fs.readFileSync(file).toString()
  ctx.body = data
})

app.use(router.routes())
app.listen(port, () => console.log(` Server run as http://127.0.0.1:${port}`))

Using createReadStream instead of readFileSync also works; the difference is discussed below.

This simple example only handles a single file. If we need to read multiple files, the file name may even be passed in as a request parameter. It then becomes hard to guarantee that the file actually exists, and we may also need some permission checks to prevent sensitive files from being returned by the interface.

const path = require('path')

router.get('/file', ctx => {
  const { fileName } = ctx.query
  // Use a different variable name so the `path` module is not shadowed
  const filePath = path.resolve('./XXX', fileName)

  // Filter hidden files (names starting with a dot)
  if (path.basename(filePath).startsWith('.')) {
    ctx.status = 404
    return
  }

  // Check whether the file exists
  if (!fs.existsSync(filePath)) {
    ctx.status = 404
    return
  }

  // ...other checks

  const rs = fs.createReadStream(filePath)
  ctx.body = rs // Koa handles stream bodies specially; see the previous Koa article for details
})

With these checks added, reading static files is much safer, but all of this lives inside a single route handler. If several interfaces read static files, a lot of this logic is repeated, so it is a good idea to extract it into a shared function.
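One way such a shared function might look is sketched below. The name resolveSafePath and its exact rules are assumptions made for illustration; they are not part of Koa or koa-send. The sketch only uses Node's path module: it resolves the requested name against a root directory and rejects anything that escapes the root or touches a dot-file.

```javascript
const path = require('path')

// Hypothetical helper: returns an absolute path inside `root`,
// or null when the request should be rejected.
function resolveSafePath (root, fileName) {
  if (typeof fileName !== 'string' || fileName === '') return null

  const absRoot = path.resolve(root)
  const target = path.resolve(absRoot, fileName)

  // Reject the root itself and anything that resolves outside of it
  // (for example "../secret.log" or an absolute path such as "/etc/passwd")
  const relative = path.relative(absRoot, target)
  if (relative === '' || relative.startsWith('..') || path.isAbsolute(relative)) {
    return null
  }

  // Reject hidden files and files inside hidden directories
  if (relative.split(path.sep).some(part => part.startsWith('.'))) return null

  return target
}
```

A route handler could then answer with a 404 whenever the helper returns null, and only touch the file system otherwise.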

The koa-send way

That is exactly what koa-send does: it provides a well-packaged middleware for working with static files. Here are two basic examples:

const path = require('path')
const send = require('koa-send')

// Get files in a path
router.get('/file', async ctx => {
  await send(ctx, ctx.query.path, {
    root: path.resolve(__dirname, './public')
  })
})

// Fetch a specific file
router.get('/index', async ctx => {
  await send(ctx, './public/index.log')
})

Suppose our directory structure looks like this, with simple-send.js as the file being executed:

.
├── public
│   ├── a.log
│   ├── b.log
│   └── index.log
└── simple-send.js

With /file?path=XXX you can easily access the files under public, and /index returns the contents of the public/index.log file.

Functionality provided by koa-send

koa-send provides a number of convenient options. Besides the commonly used root, about ten others are available:

| options | type | default | desc |
| --- | --- | --- | --- |
| maxage | Number | 0 | Sets how long the browser may cache the file, in milliseconds. Corresponding header: Cache-Control: max-age=XXX |
| immutable | Boolean | false | Tells the browser that the resource behind the URL is immutable and can be cached indefinitely. Corresponding header: Cache-Control: max-age=XXX, immutable |
| hidden | Boolean | false | Whether reading hidden files is supported. Files whose names start with . are called hidden files |
| root | String | | Sets the root directory for static file paths. Access to any file outside this directory is forbidden |
| index | String | | Sets a default file name that takes effect when a directory is accessed; it is appended to the end of the path automatically (there is a small Easter egg here) |
| gzip | Boolean | true | If the client supports gzip and a file with the same name plus a .gz suffix exists, the .gz file is sent |
| brotli | Boolean | true | Same logic as above: applies if the client supports brotli and a file with the same name plus a .br suffix exists |
| format | Boolean | true | When enabled, a trailing / is not required: /path and /path/ refer to the same path (only takes effect when path is a directory) |
| extensions | Array | false | If an array is passed, each item is tried in turn as the file suffix, and the first file that matches is read |
| setHeaders | Function | | Used to set some headers manually; rarely useful |

How the options behave in practice

Some of these options have surprisingly useful effects, some affect the response headers, and some, such as gzip and brotli, improve performance.

The main logic of koa-send can be broken down into these pieces:

  1. Checking the validity of path
  2. Applying the gzip (and other compression) logic
  3. Matching the file suffix and the default entry file
  4. Reading the file data

At the beginning of the function there is logic like this:

const resolvePath = require('resolve-path')
const {
  parse,
  sep
} = require('path')

async function send (ctx, path, opts = {}) {
  const trailingSlash = path[path.length - 1] === '/'
  const index = opts.index

  // The initial values of the other options (root, hidden, ...) are omitted here

  path = path.substr(parse(path).root.length)

  // ...

  // normalize path
  path = decode(path) // Calls `decodeURIComponent` internally
  // This means that passing in an escaped path is also acceptable

  if (index && trailingSlash) path += index

  path = resolvePath(root, path)

  // hidden file support, ignore
  if (!hidden && isHidden(root, path)) return
}

function isHidden (root, path) {
  path = path.substr(root.length).split(sep)
  for (let i = 0; i < path.length; i++) {
    if (path[i][0] === '.') return true
  }
  return false
}

Path checking

The first step is to determine whether the path passed in is a directory (a path ending with / is treated as a directory). If it is a directory and a valid index option is given, index is appended to path. For example:

send(ctx, './public/', {
  index: 'index.js'
})

// ./public/index.js

resolve-path is a path-handling package that filters out abnormal paths, such as path//file and /etc/xxx, and returns an absolute path after processing.

isHidden is used to decide whether hidden files need to be filtered out. A file whose name starts with . is considered hidden, and the same applies to directories: a directory whose name starts with . is also considered hidden. That is why the isHidden function exists.

Personally, I think this could be solved with a regular expression. Why split the path into an array?

function isHidden (root, path) {
  path = path.substr(root.length)

  return new RegExp(`${sep}\\.`).test(path)
}

This has been submitted to the community as a PR.
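To compare the two approaches side by side, here is a small sketch that can be run with plain Node. The names isHiddenLoop and isHiddenRegExp are made up for this comparison. One assumption beyond the snippets above: the separator is escaped before it is placed into the regular expression, because on Windows sep is a backslash and would otherwise be read as an escape character.

```javascript
const { sep } = require('path')

// Loop-based check, following the koa-send excerpt above
function isHiddenLoop (root, path) {
  const parts = path.substr(root.length).split(sep)
  for (let i = 0; i < parts.length; i++) {
    if (parts[i][0] === '.') return true
  }
  return false
}

// Regex-based check; the separator is escaped so that "\" stays a literal on Windows
function isHiddenRegExp (root, path) {
  const escapedSep = sep.replace(/[\\/]/g, '\\$&')
  return new RegExp(`${escapedSep}\\.`).test(path.substr(root.length))
}

// Build sample paths with the platform separator
const root = ['', 'srv', 'public'].join(sep)
const hiddenDir = [root, '.git', 'config'].join(sep)
const hiddenFile = [root, 'logs', '.env'].join(sep)
const visible = [root, 'logs', 'a.log'].join(sep)
```

For these sample paths both functions give the same answer. They can differ when the part after root does not begin with a separator, since the regular expression only matches a dot that follows a separator.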

Start compression and handle folders

After the code above runs, we have a valid path (resolvePath throws an exception if the path is invalid). The next steps are to check whether a compressed version of the file is available and, if so, to set Content-Encoding accordingly (which turns on compression).

Suffix matching:

if (extensions && !/\.[^/]*$/.exec(path)) {
  const list = [].concat(extensions)
  for (let i = 0; i < list.length; i++) {
    let ext = list[i]
    if (typeof ext !== 'string') {
      throw new TypeError('option extensions must be array of strings or false')
    }
    if (!/^\./.exec(ext)) ext = '.' + ext
    if (await fs.exists(path + ext)) {
      path = path + ext
      break
    }
  }
}

You can see that the traversal follows exactly the order of the array we passed when calling send, and that a missing leading . is added for compatibility. That is, a call like this is valid:

await send(ctx, 'path', {
  extensions: ['.js', 'ts', '.tsx']
})

If it matches the real file after suffixing, it is considered a valid path and breaks, which is what the documentation says: First found is served.
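The order of the lookups can be made visible with a small sketch that lists the candidate paths without touching the file system. candidatePaths is a hypothetical helper written for this article; it reuses the two regular expressions from the excerpt above.

```javascript
// Hypothetical helper: lists the paths that would be probed, in order
function candidatePaths (path, extensions) {
  // Nothing is probed when the option is off or the path already has a suffix
  if (!extensions || /\.[^/]*$/.test(path)) return []

  return [].concat(extensions).map(ext => {
    if (typeof ext !== 'string') {
      throw new TypeError('option extensions must be array of strings or false')
    }
    // Accept both ".js" and "js"
    return path + (/^\./.test(ext) ? ext : '.' + ext)
  })
}

const candidates = candidatePaths('/public/app', ['.js', 'ts', '.tsx'])
```

The first entry of the list that exists on disk would be the one that is served.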

After this part of the operation is complete, the directory is checked to determine whether the current path is a directory:

let stats
try {
  stats = await fs.stat(path)

  if (stats.isDirectory()) {
    if (format && index) {
      path += '/' + index
      stats = await fs.stat(path)
    } else {
      return
    }
  }
} catch (err) {
  const notfound = ['ENOENT', 'ENAMETOOLONG', 'ENOTDIR']
  if (notfound.includes(err.code)) {
    throw createError(404, err)
  }
  err.status = 500
  throw err
}

A small Easter egg

One interesting detail: if the current path turns out to be a directory and format is enabled, index is concatenated once more. This is the Easter egg. Suppose our public directory looks like this:

public
└── index
    └── index   # the actual file, containing "hello"

We can get the data of the innermost file in a simple way:

router.get('/surprises', async ctx => {
  await send(ctx, '/', {
    root: './public',
    index: 'index'
  })
})

// > curl http://127.0.0.1:12306/surprises
// hello

If the path ends with /, trailingSlash causes index to be appended, and if the resulting path turns out to be a directory, index is appended once more. So a plain / together with the index option can reach /index/index directly. It is a small Easter egg that is rarely used in real development.
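The two concatenation steps can be traced with a simplified sketch. resolveIndex is a hypothetical function, the set of directories stands in for the fs.stat call, and plain relative strings are used instead of paths resolved against root, so this only illustrates the order of the steps.

```javascript
// Simulates only the two index concatenations; `directories` replaces fs.stat
function resolveIndex (path, { index, format = true }, directories) {
  const trailingSlash = path[path.length - 1] === '/'

  // Step 1: a trailing slash plus an index option appends the index name
  if (index && trailingSlash) path += index

  // Step 2: if the result is itself a directory, the index name is appended again
  if (directories.has(path)) {
    if (format && index) {
      path += '/' + index
    } else {
      return undefined // mirrors the early return in the excerpt above
    }
  }

  return path
}

const dirs = new Set(['public/index'])
const resolved = resolveIndex('public/', { index: 'index' }, dirs)
```

With format turned off, the same call stops at the directory and resolves to nothing.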

The final read file operation

Finally we reach the logic for reading the file. The first step is the call to setHeaders.

Because of the processing above, the path at this point may differ from the path you passed in when calling send. It is not strictly necessary to handle this inside setHeaders, though, because the function returns the actual path at the end, so headers can also be set in your own middleware after awaiting send. The official README warns that "doing it after is too late because the headers are already sent", but this is not a concern: Koa puts the response data in ctx.body, and the body is not processed until all middleware has executed. In other words, the HTTP response body is not sent until all of the middleware has finished, and setting a header is valid until then.

if (setHeaders) setHeaders(ctx.res, path, stats)

// stream
ctx.set('Content-Length', stats.size)
if (!ctx.response.get('Last-Modified')) ctx.set('Last-Modified', stats.mtime.toUTCString())
if (!ctx.response.get('Cache-Control')) {
  const directives = ['max-age=' + (maxage / 1000 | 0)]
  if (immutable) {
    directives.push('immutable')
  }
  ctx.set('Cache-Control', directives.join(', '))
}
// The response type; by default it is derived from the file suffix
if (!ctx.type) ctx.type = type(path, encodingExt)
ctx.body = fs.createReadStream(path)

return path

The maxage and immutable options described above take effect here, but note that koa-send does not override the Cache-Control value if one has already been set.
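The header value itself is easy to reproduce in isolation. The sketch below mirrors the directive-building lines from the excerpt; cacheControl is a hypothetical name, and the existing header is passed in as a plain string instead of being read from ctx.

```javascript
// Hypothetical helper mirroring the Cache-Control logic shown above
function cacheControl (existing, { maxage = 0, immutable = false } = {}) {
  // A value that has already been set is left untouched
  if (existing) return existing

  // maxage is given in milliseconds, max-age is expressed in whole seconds
  const directives = ['max-age=' + (maxage / 1000 | 0)]
  if (immutable) directives.push('immutable')
  return directives.join(', ')
}

const oneHour = cacheControl('', { maxage: 60 * 60 * 1000, immutable: true })
// oneHour is 'max-age=3600, immutable'
```

Note that the division truncates, so a maxage below 1000 milliseconds ends up as max-age=0.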

The difference between using Stream and readFile

As you can see at the end of the body assignment, Stream is used instead of readFile. Using Stream for transmission has at least two benefits:

  1. With the first approach, a large file is held entirely in memory after being read, and toString has a length limit: for a huge file, the toString call throws an exception.
  2. With the first approach, the data is only returned to the caller after the whole file has been read. While the file is being read, the request stays in the Waiting state and no data is returned.

You can do a Demo like this:

const http      = require('http')
const fs        = require('fs')
const filePath  = './test.log'

http.createServer((req, res) => {
  if (req.url === '/') {
    res.end('<html></html>')
  } else if (req.url === '/sync') {
    const data = fs.readFileSync(filePath).toString()

    res.end(data)
  } else if (req.url === '/pipe') {
    const rs = fs.createReadStream(filePath)

    rs.pipe(res)
  } else {
    res.end('404')
  }
}).listen(12306, () => console.log('server run as http://127.0.0.1:12306'))

First visit http://127.0.0.1:12306/ to get an empty page (mainly to avoid having to deal with CORS), then issue two fetch calls in the console to compare the results:

It can be seen that while the download time is almost the same, the readFileSync approach adds a certain amount of Waiting time, which is the time the server spends reading the file. Its length depends on the size of the file and the performance of the machine.

koa-static

koa-static is a thin wrapper around koa-send. As the examples above show, send has to be called inside your own middleware, and specifying parameters such as path by hand each time is repetitive. koa-static encapsulates this logic, so that we can register a single middleware to handle static files without worrying about reading parameters and so on:

const Koa = require('koa')
const app = new Koa()
app.use(require('koa-static')(root, opts))

opts is passed through to koa-send, except that the root in opts is overridden by the first argument, root. A few details are also added:

  • index.html is added as the default index
if (opts.index !== false) opts.index = opts.index || 'index.html'
  • By default, only the HEAD and GET methods are handled
if (ctx.method === 'HEAD' || ctx.method === 'GET') {
  // ...
}
  • A defer option is added to decide whether other middleware should run first.

If defer is false, send is executed first, so static files are matched first. Otherwise, koa-static waits for the rest of the middleware to run and only looks for a matching static resource once it is sure that no other middleware has handled the request. Just specify root and let koa-static do the rest; we don't have to worry about how static resources should be handled.
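The option handling described above can be sketched without the middleware itself. normalizeStaticOptions and shouldServe are hypothetical names chosen for this illustration, and the sketch leaves out defer and the actual call to send.

```javascript
const { resolve } = require('path')

// Hypothetical helper: the option handling described above, without the middleware
function normalizeStaticOptions (root, opts) {
  opts = Object.assign({}, opts)

  // The first argument always wins over opts.root
  opts.root = resolve(root)

  // index defaults to 'index.html' unless it is explicitly disabled with false
  if (opts.index !== false) opts.index = opts.index || 'index.html'

  return opts
}

// Only HEAD and GET requests are considered for static files
function shouldServe (method) {
  return method === 'HEAD' || method === 'GET'
}

const normalized = normalizeStaticOptions('public', { root: 'other', maxage: 1000 })
```

Here normalized.root points at the resolved public directory even though opts.root said otherwise, and the remaining options are passed along unchanged.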

Summary

koa-send and koa-static are two very lightweight pieces of middleware. They contain no complex logic of their own; they simply extract some frequently repeated logic into middleware. Even so, they take a lot of routine work out of daily development and let people focus on the business logic rather than on peripheral features.