Preface

After years of using Express.js, I finally had time to study Express in depth and compare it with Koa2's implementation.

To be honest, before reading the source code I thought Express.js was quite good, both in API design and in day-to-day use. After reading the Express code this time, I may have changed my mind.

Express.js has a sophisticated middleware design, but by current JavaScript standards that design is overly complex. The layers of callbacks and recursion are hard to follow without spending a fair amount of time on them. And the Koa2 code? It can be summed up in two words: lean and sturdy. It is just a few files that use modern JavaScript features to implement middleware well, and the code can be understood at a glance.

As always, this article comes with a simple demo: Express-VS-KOA

1. A quick look at Express and Koa usage

If you use express.js to start a simple server, it looks like this:

const express = require('express')

const app = express()
const router = express.Router()

app.use(async (req, res, next) => {
  console.log('I am the first middleware')
  next()
  console.log('first middleware end calling')
})
app.use((req, res, next) => {
  console.log('I am the second middleware')
  next()
  console.log('second middleware end calling')
})

router.get('/api/test1', async(req, res, next) => {
  console.log('I am the router middleware => /api/test1')
  res.status(200).send('hello')
})

router.get('/api/testerror', (req, res, next) => {
  console.log('I am the router middleware => /api/testerror')
  throw new Error('I am error.')
})

app.use('/', router)

app.use(async(err, req, res, next) => {
  if (err) {
    console.log('last middleware catch error', err)
    res.status(500).send('server Error')
    return
  }
  console.log('I am the last middleware')
  next()
  console.log('last middleware end calling')
})

app.listen(3000)
console.log('server listening at port 3000')

The equivalent in Koa2 looks like this:

const koa = require('koa')
const Router = require('koa-router')

const app = new koa()
const router = Router()

app.use(async(ctx, next) => {
  console.log('I am the first middleware')
  await next()
  console.log('first middleware end calling')
})

app.use(async (ctx, next) => {
  console.log('I am the second middleware')
  await next()
  console.log('second middleware end calling')
})

router.get('/api/test1', async(ctx, next) => {
  console.log('I am the router middleware => /api/test1')
  ctx.body = 'hello'
})

router.get('/api/testerror', async(ctx, next) => {
  throw new Error('I am error.')
})

app.use(router.routes())

app.listen(3000)
console.log('server listening at port 3000')


If you are also interested in how to start a server with native Node.js, see this file in the demo: Node.js

The differences between the two are shown in the table below:

Step                        Koa (Router = require('koa-router'))    Express (assuming you don't use methods like app.get)
Initialization              const app = new koa()                   const app = express()
Route instantiation         const router = Router()                 const router = express.Router()
App-level middleware        app.use                                 app.use
Route-level middleware      router.get                              router.get
Mounting route middleware   app.use(router.routes())                app.use('/', router)
Listening on a port         app.listen(3000)                        app.listen(3000)

The table above shows the differences in usage. Koa uses the newer JavaScript syntax right from initialization. Mounting route middleware also differs slightly because the internal mechanisms are different. Everything else is pretty much the same.

Next, we focus on how each framework implements its middleware.

2. How Express.js middleware is implemented

Let’s start with a demo that shows the weakness of express.js middleware in dealing with certain issues. The demo code is as follows:

const express = require('express')

const app = express()

const sleep = (mseconds) => new Promise((resolve) => setTimeout(() => {
  console.log('sleep timeout... ')
  resolve()
}, mseconds))

app.use(async (req, res, next) => {
  console.log('I am the first middleware')
  const startTime = Date.now()
  console.log(`================ start ${req.method} ${req.url}`, { query: req.query, body: req.body });
  next()
  const cost = Date.now() - startTime
  console.log(`================ end ${req.method} ${req.url} ${res.statusCode} - ${cost} ms`)
})
app.use((req, res, next) => {
  console.log('I am the second middleware')
  next()
  console.log('second middleware end calling')
})

app.get('/api/test1', async(req, res, next) => {
  console.log('I am the router middleware => /api/test1')
  await sleep(2000)
  res.status(200).send('hello')
})

app.use(async(err, req, res, next) => {
  if (err) {
    console.log('last middleware catch error', err)
    res.status(500).send('server Error')
    return
  }
  console.log('I am the last middleware')
  await sleep(2000)
  next()
  console.log('last middleware end calling')
})

app.listen(3000)
console.log('server listening at port 3000')


What will the demo print when /api/test1 is requested?

I am the first middleware
================ start GET /api/test1
I am the second middleware
I am the router middleware => /api/test1
second middleware end calling
================ end GET /api/test1 200 - 3 ms
sleep timeout...

If you already know why it prints this, you already understand how Express.js middleware is implemented.

Let’s take a look at the printout of the first demo:

I am the first middleware
I am the second middleware
I am the router middleware => /api/test1
second middleware end calling
first middleware end calling

This output is as expected, so why isn't the second demo's? The only difference is that the second demo adds asynchronous processing, and with it the whole flow gets out of order. The execution flow we expected looks like this:

I am the first middleware
================ start GET /api/test1
I am the second middleware
I am the router middleware => /api/test1
sleep timeout...
second middleware end calling
================ end GET /api/test1 200 - 3 ms

So what causes this? We can find the answer in the following analysis.

2.1 How Express mounts middleware

To understand the implementation, we first need to know the ways Express.js lets you mount middleware. If you are familiar with Express.js, try listing them in your head first.

HTTP Method refers to HTTP request methods, such as GET, POST, PUT, etc.

  • app.use
  • app.[HTTP Method]
  • app.all
  • app.param
  • router.all
  • router.use
  • router.param
  • router.[HTTP Method]

2.2 Initializing Express middleware

The Express code relies on several kinds of instances: app, router, layer, and route. The relationships between these instances determine the data model that is formed once the middleware is initialized, as shown in the following figure:

In this figure there are two kinds of Layer instances, mounted in different places. Taking express.js from the demo as an example, debugging gives us a more concrete picture:

Let's walk through Express middleware initialization using both figures. For convenience, we call Figure 1 the initialization model diagram and Figure 2 the initialization instance diagram.

Looking at the two figures raises the following questions. Answering them is what understanding initialization amounts to.

  • Why are there two kinds of Layer instances in the initialization model diagram?
  • When does the route field exist on a Layer instance in the initialization model diagram?
  • Why are there 7 middleware mounted in the initialization instance diagram?
  • Why do the route field and the name differ between circle 2 and circle 3?
  • There is also a Layer instance in circle 4 of the diagram. Is that Layer instance different from the ones above?

First, let's establish one concept: a Layer instance is a mapping between a path and a handle, and each Layer is one middleware.
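To make the idea concrete, here is a minimal sketch of a Layer-like object. It is not Express's actual Layer class: MiniLayer and its naive matching rule are assumptions made for illustration, whereas the real Layer compiles its path into a regular expression.

```javascript
// Hypothetical, simplified stand-in for Express's Layer: it only pairs a
// path with a handler. The matching rule below is an assumption.
class MiniLayer {
  constructor(path, options, handle) {
    this.path = path
    this.end = Boolean(options.end)   // true: match the whole path; false: match as a prefix
    this.handle = handle
    this.name = handle.name || '<anonymous>'
    this.route = undefined            // only set for layers created through router.route
  }

  match(url) {
    if (this.end) return url === this.path
    if (this.path === '/') return true
    return url === this.path || url.startsWith(this.path + '/')
  }
}

const logger = new MiniLayer('/', { end: false }, function logger(req, res, next) { next() })
const test1 = new MiniLayer('/api/test1', { end: true }, (req, res) => res.send('hello'))

console.log(logger.name, logger.match('/api/test1'))  // logger true
console.log(test1.name, test1.match('/api/test2'))    // <anonymous> false
```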

Given this, middleware can be nested inside other middleware, and in every case what Express manipulates is a Layer. Middleware is mounted in one of two ways:

  1. Mounting with app.use or router.use
    • After some processing, app.use ends up calling router.use
  2. Mounting with app.all, app.[HTTP Method], app.route, router.all, router.[HTTP Method], or router.route
    • After some processing, app.all, app.[HTTP Method], app.route, router.all, and router.[HTTP Method] end up calling router.route

So we focus on the router.use and router.route methods.

2.2.1 router.use

The core of this method is:

for (var i = 0; i < callbacks.length; i++) {
  var fn = callbacks[i];

  if (typeof fn !== 'function') {
    throw new TypeError('Router.use() requires a middleware function but got a ' + gettype(fn))
  }

  // add the middleware
  debug('use %o %s', path, fn.name || '<anonymous>')

  var layer = new Layer(path, {
    sensitive: this.caseSensitive,
    strict: false,
    end: false
  }, fn);

  // Notice that the route field is set to undefined
  layer.route = undefined;

  this.stack.push(layer);
}

The Layer instances generated here correspond to the multiple Layer instances shown in the initialization model diagram (Figure 1). Taking express.js as an example again, if we look at all the Layer instances in circle 1 of the initialization instance diagram, we find that besides our own middleware (5 in total) there are two built-in ones, whose Layer names in the diagram are query and expressInit. Both are initialized in the lazyrouter method in application.js:

app.lazyrouter = function lazyrouter() {
  if (!this._router) {
    this._router = new Router({
      caseSensitive: this.enabled('case sensitive routing'),
      strict: this.enabled('strict routing')
    });

    // Both built-in middleware are ultimately mounted with router.use
    this._router.use(query(this.get('query parser fn')));
    this._router.use(middleware.init(this));
  }
};

That answers our third question: 7 middleware in total, made up of 2 built-in middleware, 3 app-level middleware, and 2 route-level middleware.

2.2.2 router.route

app.all, app.[HTTP Method], app.route, router.all, and router.[HTTP Method] all end up calling router.route. So in our demo, express.js uses app.get twice, and both calls end up in router.route:

proto.route = function route(path) {
  var route = new Route(path);

  var layer = new Layer(path, {
    sensitive: this.caseSensitive,
    strict: this.strict,
    end: true
  }, route.dispatch.bind(route));

  layer.route = route;

  this.stack.push(layer);
  return route;
};

This implementation is just as simple. The only difference from the previous method is the extra new Route. Comparing the two, we can answer several of the questions above:

  • Why are there two kinds of Layer instances in the initialization model diagram? Because the mounting method determines which kind is created: the second kind of Layer instance is mounted under a Route instance.
  • When does the route field exist on a Layer instance in the initialization model diagram? When the Layer is created through router.route.
  • Why do the route field and the name differ between circle 2 and circle 3? The Layer in circle 2 was mounted with an arrow function, which has no function name, so its name is anonymous. The Layer in circle 3 was created through router.route, whose callback is always route.dispatch bound to the route, so its name is always bound dispatch. Whether the route field is assigned in each case is clear at a glance.
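The two names in the last answer follow from JavaScript's own function-naming rules, which the short sketch below illustrates. The Route class and the nameOf helper are hypothetical stand-ins; the fallback to '<anonymous>' mirrors what the router.use excerpt above passes to debug.

```javascript
// Hypothetical helper: report a handler's name the way the debug() call
// in the router.use excerpt does (fn.name || '<anonymous>').
function nameOf(fn) {
  return fn.name || '<anonymous>'
}

// Stand-in for Express's Route; only the method name matters here.
class Route {
  dispatch(req, res, done) {}
}
const route = new Route()

console.log(nameOf((req, res, next) => next()))          // <anonymous>
console.log(nameOf(route.dispatch.bind(route)))          // bound dispatch
console.log(nameOf(function query(req, res, next) {}))   // query
```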

One last question: a Route instance has its own Layers once it is instantiated, so where are they initialized? The core code is:

// router/route.js/Route.prototype[method]
for (var i = 0; i < handles.length; i++) {
    var handle = handles[i];

    if (typeof handle !== 'function') {
      var type = toString.call(handle);
      var msg = 'Route.' + method + '() requires a callback function but got a ' + type
      throw new Error(msg);
    }

    debug('%s %o', method, this.path)

    var layer = Layer('/', {}, handle);
    layer.method = method;

    this.methods[method] = true;
    this.stack.push(layer);
  }

As you can see, the newly created Route instance maintains its own stack of handles for one path across multiple methods. Each handle is wrapped in a Layer whose path is always / and whose method field records the HTTP method. That answers the last question.
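The two-level structure described above can be sketched in a few lines. This is a miniature under stated assumptions, not Express's code: MiniRouter, MiniRoute, and makeLayer are hypothetical names, and path matching and dispatching are left out entirely.

```javascript
// Hypothetical miniature of the Router / Route / Layer data model.
function makeLayer(path, handle) {
  return { path, handle, route: undefined, method: undefined }
}

class MiniRoute {
  constructor(path) {
    this.path = path
    this.stack = []       // inner layers, one per handler
    this.methods = {}
  }
  dispatch(req, res, done) {}   // placeholder: the real one walks this.stack
  get(...handles) {
    for (const handle of handles) {
      const layer = makeLayer('/', handle)   // inner layers always use '/'
      layer.method = 'get'
      this.methods.get = true
      this.stack.push(layer)
    }
    return this
  }
}

class MiniRouter {
  constructor() {
    this.stack = []                          // outer layers
  }
  use(path, fn) {
    this.stack.push(makeLayer(path, fn))     // layer.route stays undefined
    return this
  }
  route(path) {
    const route = new MiniRoute(path)
    const layer = makeLayer(path, route.dispatch.bind(route))
    layer.route = route                      // marks this layer as a route layer
    this.stack.push(layer)
    return route
  }
}

const router = new MiniRouter()
router.use('/', (req, res, next) => next())
router.route('/api/test1').get((req, res) => res.send('hello'))

// Summarize the outer stack and, for route layers, the inner stack
const summary = router.stack.map((l) => ({
  path: l.path,
  hasRoute: l.route !== undefined,
  inner: l.route ? l.route.stack.map((s) => `${s.method} ${s.path}`) : []
}))
console.log(summary)
```

The outer stack holds one plain layer and one route layer, and only the route layer carries an inner stack whose entries are keyed by method with the path /.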

At this point, go back to the initialization model diagram; it should now make sense.

2.3 Execution logic of Express middleware

The entire middleware execution logic, for both the outer Layers and the Layers inside a Route instance, takes the form of recursive calls. One very important function, next(), makes all of this work. Here is a flow chart to help you understand it:

Let's rewrite the Express.js code in a different form so that you can fully understand the process.

For simplicity, we leave out the two built-in middleware and one route middleware. The result is equivalent to:

((req, res) => {
  console.log('I am the first middleware');
  ((req, res) => {
    console.log('I am the second middleware');
    (async(req, res) => {
      console.log('I am the router middleware => /api/test1');
      await sleep(2000)
      res.status(200).send('hello')
    })(req, res)
    console.log('second middleware end calling');
  })(req, res)
  console.log('first middleware end calling')
})(req, res)
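The nested calls above are what a next()-driven dispatcher produces at run time. The following is a minimal sketch of such a dispatcher, assuming no path matching and only the simplest error forwarding; the handle function is a hypothetical model and not Express's router.handle.

```javascript
// Simplified model of a next()-driven dispatcher.
function handle(stack, req, res, done) {
  let idx = 0
  function next(err) {
    if (err) return done(err)
    const fn = stack[idx++]
    if (!fn) return done()
    try {
      fn(req, res, next)      // the return value (even a promise) is ignored
    } catch (e) {
      next(e)
    }
  }
  next()
}

const log = []
handle([
  (req, res, next) => { log.push('first in'); next(); log.push('first out') },
  (req, res, next) => { log.push('second in'); next(); log.push('second out') },
  (req, res, next) => { log.push('handler') }
], {}, {}, () => log.push('done'))

console.log(log)
// [ 'first in', 'second in', 'handler', 'second out', 'first out' ]
```

Each call to next() synchronously invokes the following middleware, so the code after next() runs as soon as the downstream function returns, whether or not that function has finished its asynchronous work.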

Because next() does nothing with await or promises, when a middleware contains an asynchronous function the chain does not wait for it to resolve. That is why the sleep function's output appears last, and why the request time that the first middleware wanted to record is no longer accurate.
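One possible way to still record an accurate duration is to measure when the response finishes instead of when next() returns. The sketch below assumes that approach: timing and report are hypothetical names, and a plain EventEmitter stands in for the response object so the example runs without Express. Node's http.ServerResponse does emit a finish event once the response has been handed off for sending.

```javascript
const { EventEmitter } = require('events')

// Hypothetical timing middleware: reports the duration when the response
// finishes, rather than right after next() returns.
function timing(report) {
  return (req, res, next) => {
    const start = Date.now()
    res.once('finish', () => report(`${req.method} ${req.url}`, Date.now() - start))
    next()
  }
}

// Stand-ins for the real request and response objects
const res = new EventEmitter()
const req = { method: 'GET', url: '/api/test1' }
const reports = []

const finished = new Promise((resolve) => {
  const middleware = timing((label, ms) => {
    reports.push({ label, ms })
    resolve()
  })
  middleware(req, res, () => {
    // Simulated async route handler that responds after 50 ms
    setTimeout(() => res.emit('finish'), 50)
  })
})

finished.then(() => console.log(reports))
```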

One thing to be clear about, though: although the log order looks odd, the request itself is unaffected, because the response is sent after the await. Whether the request ends depends on whether we call res.send.

At this point, I hope you are familiar with how the whole Express middleware flow works. For more detail I suggest reading the source code; one article cannot do this intricate design justice. The aim here is to make sure you have something to say in an interview.

Next, we look at Koa2, which needs far less space because it is really easy to understand.

3. Koa2 middleware

The main processing logic of Koa2 middleware lives in koa-compose, which is just a single function:

function compose (middleware) {
  if (!Array.isArray(middleware)) throw new TypeError('Middleware stack must be an array!')
  for (const fn of middleware) {
    if (typeof fn !== 'function') throw new TypeError('Middleware must be composed of functions!')
  }

  /**
   * @param {Object} context
   * @return {Promise}
   * @api public
   */

  return function (context, next) {
    // last called middleware #
    let index = -1
    return dispatch(0)
    function dispatch (i) {
      if (i <= index) return Promise.reject(new Error('next() called multiple times'))
      index = i
      let fn = middleware[i]
      if (i === middleware.length) fn = next
      if (!fn) return Promise.resolve()
      try {
        return Promise.resolve(fn(context, dispatch.bind(null, i + 1)));
      } catch (err) {
        return Promise.reject(err)
      }
    }
  }
}

Every middleware call to next() is actually this:

dispatch.bind(null, i + 1)

Again using closures and recursion, the middleware are executed one by one, and every call returns a promise, so the output ends up exactly as we expect. Whether route middleware gets called is not Koa2's responsibility; that job is left to koa-router, which lets Koa2 keep its lean and sturdy style.
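To see the effect without installing Koa, here is a condensed copy of the compose function shown earlier (argument checks omitted) driving three async middleware. The sleep helper, the log array, and the plain object used as ctx are assumptions for illustration only.

```javascript
// Condensed version of the compose function shown above
function compose(middleware) {
  return function (context, next) {
    let index = -1
    function dispatch(i) {
      if (i <= index) return Promise.reject(new Error('next() called multiple times'))
      index = i
      let fn = middleware[i]
      if (i === middleware.length) fn = next
      if (!fn) return Promise.resolve()
      try {
        return Promise.resolve(fn(context, dispatch.bind(null, i + 1)))
      } catch (err) {
        return Promise.reject(err)
      }
    }
    return dispatch(0)
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
const log = []

const run = compose([
  async (ctx, next) => { log.push('first in'); await next(); log.push('first out') },
  async (ctx, next) => { log.push('second in'); await next(); log.push('second out') },
  async (ctx) => { log.push('handler start'); await sleep(50); ctx.body = 'hello'; log.push('handler end') }
])

const ctx = {}
const finished = run(ctx).then(() => console.log(log, ctx.body))
```

Because each await next() waits on the promise returned by the downstream dispatch call, both "out" entries are logged only after the handler has finished sleeping, which is the ordering the Express demo could not produce.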

Here is the execution flow of Koa middleware as well:

Conclusion

With this article, you no longer need to be afraid of being asked about the difference between Express and Koa.
