Building a signaling server with Koa2: JS can handle video calls too!


Hello, I’m Yang Chenggong.

The previous article introduced what WebRTC is, walked through the steps of its communication process, and built a local communication demo. It closed with an approach to one-to-many communication. Original article: “audio and video communication plus food: WebRTC to explore the end”

That article covered communication between two ends on a LAN, using a signaling server to transmit the SDP. It did not go into detail about the signaling server and simply used two variables to simulate the connection.

In real application scenarios, a signaling server is essentially a WebSocket server. Two clients must each establish a WebSocket connection with this server before they can send messages to each other.

However, the signaling server does more than just relay SDP. In multi-party communication we usually talk to one person or a few people, so all connections need to be grouped. This is the concept of a “room” in audio and video communication. Another job of the signaling server is therefore to maintain the binding between client connections and rooms.

So in this article, we will implement a signaling server together, based on the Node.js Koa2 framework.

Outline

The content of this article includes the following aspects:

Talk about signaling
Koa met ws
How do I maintain connection objects?
Originator implementation
Receiving end implementation
Ready, messengers on the move!
Join a study group

Talk about signaling

In the last article, two clients on a LAN needed to exchange information several times to establish a WebRTC peer connection. Each end sends messages on its own initiative while the other end listens for them, so WebSocket is the natural implementation choice.

The process of exchanging SDP remotely based on WebSocket is called signaling.

In fact, WebRTC does not specify how signaling should be implemented. That is, signaling is not part of the WebRTC communication specification. For example, if we create two RTCPeerConnection instances on the same page, the whole connection process needs no signaling: both SDPs are defined on the same page, so we can simply read the variables.

However, in the case of multiple clients, both parties need to obtain each other’s SDP, hence the signaling.

Koa met ws

When we build a signaling server with Node.js, there are two key parts:

Framework: Koa2
Module: ws

Node.js development calls for a suitable framework. I used Express before, so this time I am trying Koa2. The two are not very different: some APIs and npm packages differ, but the basic structure is almost the same.

The ws module is a simple, pure WebSocket implementation that contains both a client and a server. I covered how to use the ws module and how to integrate it with the Express framework in detail in my article “NodeJS Landing WebSocket practice”. If you are not familiar with the ws module, read that article first.

Here we go straight to building the Koa2 structure and introducing the WS module.

Koa project structure setup

First, initialize the project and install:

$ npm init && yarn add koa ws

This generates a package.json file. Next, add three folders in the same directory:

routers: stores the individual route files
utils: stores utility functions
config: stores configuration files

Then write the most important file, the entry file app.js. Its basic structure is as follows:

const Koa = require('koa')
const app = new Koa()

app.use(ctx => {
  ctx.body = 'Hello World'
})

app.listen(9800, () => {
  console.log(`listen to http://localhost:9800`)
})

See, it’s basically the same as Express. After instantiation, set up a route, listen on a port, and a simple Web server starts up.

The big difference lies in their middleware functions. Middleware functions are the callbacks passed to app.use or to a route method such as Express’s app.get.

The parameters of a middleware function carry two key pieces of information: the request and the response. In Express, the two objects are passed as two separate parameters. In Koa, they are combined into a single parameter.

Express is expressed as follows:

app.get('/test', (req, res, next) => {
  // req is the request object, used to read request information
  // res is the response object, used to send data back
  // next hands control to the next middleware
  let { query } = req
  res.status(200).send(query)
})

Koa looks like this:

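// Koa core has no app.get; routing comes from a router package.
// This assumes @koa/router is installed.
const Router = require('@koa/router')
const router = new Router()

router.get('/test', (ctx, next) => {
  // ctx.request is the request object, used to read request information
  // ctx.response is the response object, used to send data back
  // next hands control to the next middleware
  let { query } = ctx
  ctx.status = 200
  ctx.body = query
})
app.use(router.routes())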

Although ctx.request represents the request object and ctx.response represents the response object, Koa also attaches some commonly used properties directly to ctx. For example, ctx.body is the body of the response. What about the body of the request? That is ctx.request.body, while URL parameters are read from ctx.query. It is a bit confusing at first.
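To make the single-ctx model concrete, here is a minimal sketch of how a Koa-style middleware chain can be driven. It is not Koa’s actual source: compose and the fake ctx object below are simplified stand-ins written only for illustration.

```javascript
// Simplified stand-in for a Koa-style middleware chain (illustration only).
// Each middleware receives one shared ctx plus a next() function.
function compose(middlewares) {
  return function run(ctx) {
    function dispatch(i) {
      if (i >= middlewares.length) return Promise.resolve()
      const fn = middlewares[i]
      try {
        // next() simply dispatches the following middleware
        return Promise.resolve(fn(ctx, () => dispatch(i + 1)))
      } catch (err) {
        return Promise.reject(err)
      }
    }
    return dispatch(0)
  }
}

const trace = []

const logger = async (ctx, next) => {
  trace.push('logger:before')
  await next() // hand control to the next middleware
  trace.push('logger:after')
}

const handler = async ctx => {
  trace.push('handler')
  // Request data is read from ctx, and the response is written to ctx
  ctx.status = 200
  ctx.body = ctx.query
}

// Hypothetical ctx carrying only the fields this sketch needs
const ctx = { query: { id: '1' }, status: 404, body: undefined }
const done = compose([logger, handler])(ctx)
done.then(() => console.log(ctx.status, ctx.body, trace))
```

The point of the sketch is that both middlewares read from and write to the same ctx object, and that code placed after await next() runs once the downstream middleware has finished.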

With the basic structure in place, two more things are needed:

Cross-origin handling
Request body parsing

Cross-origin requests need no introduction; every front-end developer has run into them. As for the request body, Node.js receives it as a stream, so it cannot be read directly. It has to be parsed first, after which it is available as ctx.request.body.
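As a rough sketch of what a body parser has to do under the hood, the helper below collects the chunks of a readable stream and parses them as JSON. readJsonBody is a hypothetical name, and Readable.from stands in for a real incoming request; koa-bodyparser itself handles more cases, such as form bodies and size limits.

```javascript
const { Readable } = require('stream')

// Collect every chunk of a readable stream, then parse the result as JSON.
// readJsonBody is a hypothetical helper, not part of Koa or koa-bodyparser.
async function readJsonBody(stream) {
  const chunks = []
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
  }
  const text = Buffer.concat(chunks).toString('utf8')
  return text ? JSON.parse(text) : {}
}

// Simulate a request body that arrives in two pieces
const fakeRequest = Readable.from(['{"room', 'id":"42"}'])
const parsed = readJsonBody(fakeRequest)
parsed.then(body => console.log(body.roomid))
```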

Start by installing two NPM packages:

$ yarn add @koa/cors koa-bodyparser

Then configure it in app.js:

const cors = require('@koa/cors')
const bodyParser = require('koa-bodyparser')

app.use(cors())
app.use(bodyParser())

ws module integration

In essence, WebSocket and HTTP are two separate services. Both are integrated into the same Koa application, but they actually run independently of each other.

Because they live in the same Koa application, we want WebSocket and HTTP to share one port, so that starting, stopping, and restarting are controlled in a single place.

To share the port, first make some changes to the entry file app.js:

const http = require('http')
const Koa = require('koa')

const app = new Koa()
const server = http.createServer(app.callback())

// server.listen replaces app.listen
server.listen(9800, () => {
  console.log(`listen to http://localhost:9800`)
})

Then we create ws.js under the utils directory:

// utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    console.log('Connection successful')
  })
}

module.exports = WebSocketApi

Import this file in app.js and add the following code:

// app.js
const WebSocket = require('ws')
const WebSocketApi = require('./utils/ws')

const server = http.createServer(app.callback())
const wss = new WebSocket.Server({ server })

WebSocketApi(wss, app)

At this point, re-run node app.js, open the browser console, and enter one line of code:

var ws = new WebSocket('ws://localhost:9800')

If everything works, inspecting ws in the console shows readyState = 1, which indicates that the WebSocket connection succeeded.
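For reference, the WebSocket API defines four readyState values (0 to 3). The small helper below, a hypothetical describeReadyState that is not part of any library, maps them to their names:

```javascript
// readyState values defined by the WebSocket API, indexed by their number
const READY_STATES = ['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED']

// describeReadyState is a hypothetical helper for logging
function describeReadyState(socket) {
  return READY_STATES[socket.readyState] || 'UNKNOWN'
}

// A plain object stands in for a real WebSocket here
console.log(describeReadyState({ readyState: 1 }))
```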

How do I maintain connection objects?

In the previous step, we integrated the ws module and tested the connection. All of the WebSocket logic lives in the WebSocketApi function, so let’s continue with that function.

// utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    console.log('Connection successful')
  })
}

The function takes two arguments: wss is an instance of the WebSocket server, and app is an instance of the Koa application. You might ask what the app is for here. Its job is simple: holding global variables.

The main job of a signaling server is to find the two parties of a connection and pass data between them. When many clients are connected to the server, we have to pick out the two parties that are talking to each other, so every client connection needs to be identified and classified.

In the callback that listens for the connection event, the first parameter ws represents one connected client. ws is a WebSocket instance object, and calling ws.send() sends a message to that client.

ws.send('hello') // send a message to this client
wss.clients // all ws connection instances

Identifying a ws instance is simply a matter of adding attributes to tell instances apart, such as user_id and room_id. These values can be passed as parameters when the client connects and then read from the req parameter in the code above.

Once the identity is set, save the now-named ws client so that it can be found later.

But how should connection objects be maintained? This question deserves careful thought. A WebSocket connection object lives in memory and is tied to a live connection with the client, so the ws objects must be kept in memory. One way to do that is to store them in the global variables of the Koa application, which is the purpose of the app argument mentioned at the beginning.

The Koa application’s global variables are added to app.context, so we create two global variables there, one for the “originator” group and one for the “receiver” group:

cusSender: array that holds all the ws objects on the originating end
cusReader: array that holds all the ws objects on the receiving end

Then read these two variables and the request parameters:

// utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    let { url } = req // request parameters are parsed from the URL
    let { cusSender, cusReader } = app.context
    console.log('Connection successful')
  })
}

Request parameters are parsed from the URL. cusSender and cusReader are two arrays that hold ws instances, and all subsequent connection lookups and state maintenance operate on these two arrays.

Originator implementation

The originator is the end that initiates a connection. The originator needs to carry two parameters when connecting to the WebSocket:

rule: the role
roomid: the ID of the room

The role of the originator is always sender, which simply marks this WebSocket as an originator. roomid is the unique ID of the current connection. In one-to-one communication it can be the current user ID. In one-to-many communication there is a concept similar to a “live room”, and roomid is the room ID.

First on the client side, the URL that initiates the connection is as follows:

var rule = 'sender',
  roomid = '354682913546354'
var socket_url = `ws://localhost:9800/webrtc/${rule}/${roomid}`
var socket = new WebSocket(socket_url)

Here we give the WebSocket connection the URL prefix /webrtc to mark it as WebRTC signaling, and we put the parameters directly in the URL, because the browser WebSocket API does not support custom headers and parameters can only be carried in the URL.
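Since everything the server knows about a client comes from this URL, it can help to isolate the parsing in one small function. The sketch below assumes the /webrtc/role/id shape used in this article and the two role values sender and reader; parseSignalUrl is a hypothetical name.

```javascript
// Parse a connection URL of the assumed shape /webrtc/<role>/<id>.
// Returns null when the URL does not match, so the caller can close the socket.
function parseSignalUrl(url) {
  const path = url.split('?')[0] // ignore any query string
  const [prefix, role, id] = path.split('/').filter(Boolean)
  if (prefix !== 'webrtc') return null
  if (role !== 'sender' && role !== 'reader') return null
  if (!id) return null
  return { role, id }
}

console.log(parseSignalUrl('/webrtc/sender/354682913546354'))
console.log(parseSignalUrl('/chat/sender/1'))
```

Returning null for anything unexpected keeps the connection handler short: a null result means the socket should be closed.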

The server-side code that handles a sender connection is as follows:

wss.on('connection', (ws, req) => {
  let { url } = req // the url value is /webrtc/$role/$uniId
  let { cusSender, cusReader } = app.context
  if (!url.startsWith('/webrtc')) {
    return ws.close() // close connections whose URL prefix is not /webrtc
  }
  let [_, role, uniId] = url.slice(1).split('/')
  if (!uniId) {
    console.log('Missing arguments')
    return ws.close()
  }
  console.log('Number of connected clients:', wss.clients.size)
  // Check whether this connection is an originator
  if (role == 'sender') {
    // uniId is the roomid
    ws.roomid = uniId
    let index = (cusSender = cusSender || []).findIndex(
      row => row.roomid == ws.roomid
    )
    // If a sender for this room already exists, replace it; otherwise add it
    if (index >= 0) {
      cusSender[index] = ws
    } else {
      cusSender.push(ws)
    }
    app.context.cusSender = [...cusSender]
  }
})

In the code above, we determine from the sender value parsed out of the URL that the current connection belongs to the originator, bind the roomid to the ws instance, and then update the cusSender array conditionally. This way the instance is not added twice even if the client connects several times (for example after a page refresh).
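The update-or-add step can be sketched on its own, independent of ws. In the snippet below, upsertConnection is a hypothetical helper and plain objects stand in for ws instances; the tag field exists only to show which object survived.

```javascript
// Insert a connection, or replace the stored one that has the same key value.
// upsertConnection is a hypothetical helper; plain objects stand in for ws instances.
function upsertConnection(list, conn, key) {
  const index = list.findIndex(row => row[key] === conn[key])
  if (index >= 0) {
    list[index] = conn // same room reconnected: keep only the newest socket
  } else {
    list.push(conn)
  }
  return list
}

const cusSender = []
upsertConnection(cusSender, { roomid: 'room-1', tag: 'first' }, 'roomid')
upsertConnection(cusSender, { roomid: 'room-1', tag: 'after refresh' }, 'roomid')
upsertConnection(cusSender, { roomid: 'room-2', tag: 'other' }, 'roomid')
console.log(cusSender.length)
```

The same helper would work for the receiving end by passing 'userid' as the key.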

That is the logic for an originator connecting. We also need to handle the case where the connection is closed, in which the ws instance must be removed:

wss.on('connection', (ws, req) => {
  // ... role is parsed from the URL as shown above
  ws.on('close', () => {
    if (role == 'sender') {
      // Remove the originator (guard against a missing entry)
      let index = app.context.cusSender.findIndex(row => row == ws)
      if (index >= 0) {
        app.context.cusSender.splice(index, 1)
      }
      // Unbind the receivers that were in this room
      if (app.context.cusReader && app.context.cusReader.length > 0) {
        app.context.cusReader
          .filter(row => row.roomid == ws.roomid)
          .forEach(row => {
            row.roomid = null
            row.send('leaveline')
          })
      }
    }
  })
})
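To check the cleanup logic without opening real sockets, the same steps can be run against fake socket objects. This is only a sketch: removeSender and fakeSocket are hypothetical names, and it assumes each reader carries the roomid of the room it joined.

```javascript
// Remove a closed sender and notify every reader that was in its room.
// removeSender is a hypothetical helper; the objects below are fake sockets.
function removeSender(senders, readers, ws) {
  const index = senders.indexOf(ws)
  if (index >= 0) senders.splice(index, 1)
  readers
    .filter(row => row.roomid === ws.roomid)
    .forEach(row => {
      row.roomid = null // unbind from the closed room
      row.send('leaveline')
    })
}

// A fake socket records what was sent instead of using the network
const fakeSocket = props => ({ ...props, sent: [], send(msg) { this.sent.push(msg) } })
const host = fakeSocket({ roomid: 'room-1' })
const viewerA = fakeSocket({ userid: 'a', roomid: 'room-1' })
const viewerB = fakeSocket({ userid: 'b', roomid: 'room-2' })
const senders = [host]
const readers = [viewerA, viewerB]

removeSender(senders, readers, host)
console.log(senders.length, viewerA.sent, viewerB.sent)
```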

Receiving end implementation

The receiving end is the client that receives and plays the media stream from the originating end. The receiving end needs to carry two parameters when connecting to the WebSocket:

rule: the role
userid: the user ID

The role parameter works the same way as on the originating end, with its value fixed as reader. The connecting side can be thought of as a user, so when it initiates a connection it passes the current user’s userid as the unique identifier bound to the connection.

On the client side, the URL for the receiver connection is as follows:

var rule = 'reader',
  userid = '6143e8603246123ce2e7b687'
var socket_url = `ws://localhost:9800/webrtc/${rule}/${userid}`
var socket = new WebSocket(socket_url)

The server-side code that handles a reader connection is as follows:

wss.on('connection', (ws, req) => {
  // ... omitted
  if (role == 'reader') {
    // The receiver is connected
    ws.userid = uniId
    let index = (cusReader = cusReader || []).findIndex(
      row => row.userid == ws.userid
    )
    // If this user is already stored, replace it; otherwise add it
    if (index >= 0) {
      cusReader[index] = ws
    } else {
      cusReader.push(ws)
    }
    app.context.cusReader = [...cusReader]
  }
})

The update logic for cusReader is the same as for cusSender, which ensures that only one instance per user is stored in the array. Closing the connection is handled the same way:

wss.on('connection', (ws, req) => {
  ws.on('close', () => {
    if (role == 'reader') {
      // The receiver closed its connection
      let index = app.context.cusReader.findIndex(row => row == ws)
      if (index >= 0) {
        app.context.cusReader.splice(index, 1)
      }
    }
  })
})

Ready, messengers on the move!

In the first two steps we bound identity information to each client WebSocket instance and set up the maintenance of connected instances. Now we can receive a message from one client and deliver it to the target client, so let our “messenger” run with the message!

On the client side, we pick up the LAN communication and one-to-many logic from the last article and walk through the complete communication flow.

Firstly, both the initiating end peerA and the receiving end peerB have connected to the signaling server:

// peerA
var socketA = new WebSocket('ws://localhost:9800/webrtc/sender/xxxxxxxxxx')
// peerB
var socketB = new WebSocket('ws://localhost:9800/webrtc/reader/xxxxxxxxxx')

The server side then listens for the sent message and defines a method eventHandel to handle the logic of message forwarding:

wss.on('connection', (ws, req) => {
  ws.on('message', msg => {
    if (typeof msg != 'string') {
      // ws may deliver the message as a Buffer, so convert it to a string
      msg = msg.toString()
    }
    let { cusSender, cusReader } = app.context
    eventHandel(msg, ws, role, cusSender, cusReader)
  })
})

At this point, peerA has already acquired the video stream, stored it in the localStream variable, and started broadcasting. Let’s walk through the steps of connecting peerB to peerA.

Step 1: Client peerB enters the live broadcast room and sends a message to join the connection:

// peerB
var roomid = 'xxx'
socketB.send(`join|${roomid}`)

Note that a WebSocket message cannot carry a plain JavaScript object directly. Convert all required parameters into a string, separated by |
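One detail of this string format is worth a sketch. Calling split('|') with a limit drops everything after the limit instead of keeping it in the last field, so a payload that happens to contain | would be cut off. The hypothetical encodeMessage and decodeMessage helpers below keep the remainder intact; they illustrate the idea and are not the code this article uses.

```javascript
// Build a message such as "offer|<id>|<payload>"
function encodeMessage(type, ...parts) {
  return [type, ...parts].join('|')
}

// Split into at most `fields` pieces; anything after the last separator
// stays in the final piece, so a payload containing "|" is not cut apart.
function decodeMessage(raw, fields = 3) {
  const out = []
  let rest = raw
  while (out.length < fields - 1) {
    const at = rest.indexOf('|')
    if (at < 0) break
    out.push(rest.slice(0, at))
    rest = rest.slice(at + 1)
  }
  out.push(rest)
  return out
}

console.log(decodeMessage(encodeMessage('join', 'room-1')))
console.log(decodeMessage('offer|user-1|v=0|rest'))
```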

Then on the signaling server, listen for the message sent by peerB, find peerA, and forward the join request:

const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'reader') {
    let arrval = message.split('|')
    let [type, roomid] = arrval
    if (type == 'join') {
      let sender = cusSender.find(row => row.roomid == roomid)
      if (sender) {
        // Remember the room so the close handler can find this reader later
        ws.roomid = roomid
        sender.send(`${type}|${ws.userid}`)
      }
    }
  }
}

Step 2: The originator peerA receives the join message, creates an offer, and sends it to peerB:

// peerA
socketA.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'join') {
    peerInit(value[1])
  }
}

var offer, peer
const peerInit = async usid => {
  // 1. Create a connection
  peer = new RTCPeerConnection()
  // 2. Add the video stream tracks
  localStream.getTracks().forEach(track => {
    peer.addTrack(track, localStream)
  })
  // 3. Create the SDP
  offer = await peer.createOffer()
  // 4. Send the SDP
  socketA.send(`offer|${usid}|${offer.sdp}`)
}

The server listens to peerA’s message, finds peerB, and sends the offer message:

// ws.js
const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'sender') {
    let arrval = message.split('|')
    let [type, userid, val] = arrval
    // Note that type and val are generic values and are passed to the reader as is
    if (type == 'offer') {
      let reader = cusReader.find(row => row.userid == userid)
      if (reader) {
        reader.send(`${type}|${ws.roomid}|${val}`)
      }
    }
  }
}

Step 3: Client peerB receives the offer message, then creates an answer and sends it to peerA:

// peerB
socketB.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'offer') {
    transMedia(value)
  }
}

var answer, peer
const transMedia = async arr => {
  let [_, roomid, sdp] = arr
  let offer = new RTCSessionDescription({ type: 'offer', sdp })
  peer = new RTCPeerConnection()
  await peer.setRemoteDescription(offer)
  answer = await peer.createAnswer()
  await peer.setLocalDescription(answer)
  socketB.send(`answer|${roomid}|${answer.sdp}`)
}

The server listens to the message sent by peerB, finds peerA, and sends the answer message:

// ws.js
const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'reader') {
    let arrval = message.split('|')
    let [type, roomid, val] = arrval
    if (type == 'answer') {
      let sender = cusSender.find(row => row.roomid == roomid)
      if (sender) {
        sender.send(`${type}|${ws.userid}|${val}`)
      }
    }
  }
}
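The eventHandel fragments shown so far each cover one message type. As a sketch of how they might be folded into a single function, the version below routes join, offer, and answer with fake sockets in place of ws instances. routeMessage and fake are hypothetical names, and the candid branch is an assumption modeled on the offer branch, since the server-side forwarding of candidates is not shown in this article.

```javascript
// Consolidated routing sketch. routeMessage is a hypothetical name;
// the 'candid' branch is an assumption modeled on the 'offer' branch.
function routeMessage(message, ws, role, cusSender, cusReader) {
  const first = message.indexOf('|')
  if (first < 0) return false
  const second = message.indexOf('|', first + 1)
  const type = message.slice(0, first)
  const target = second < 0 ? message.slice(first + 1) : message.slice(first + 1, second)
  const payload = second < 0 ? '' : message.slice(second + 1)

  if (role === 'reader' && (type === 'join' || type === 'answer')) {
    // Readers address a room, so look the sender up by roomid
    const sender = cusSender.find(row => row.roomid === target)
    if (!sender) return false
    if (type === 'join') ws.roomid = target
    sender.send(type === 'join' ? `join|${ws.userid}` : `answer|${ws.userid}|${payload}`)
    return true
  }
  if (role === 'sender' && (type === 'offer' || type === 'candid')) {
    // Senders address one user, so look the reader up by userid
    const reader = cusReader.find(row => row.userid === target)
    if (!reader) return false
    reader.send(`${type}|${ws.roomid}|${payload}`)
    return true
  }
  return false
}

// A fake socket records what was sent instead of using the network
const fake = props => ({ ...props, sent: [], send(msg) { this.sent.push(msg) } })
const peerA = fake({ roomid: 'room-1' })
const peerB = fake({ userid: 'user-1' })

routeMessage('join|room-1', peerB, 'reader', [peerA], [peerB])
routeMessage('offer|user-1|v=0', peerA, 'sender', [peerA], [peerB])
console.log(peerA.sent, peerB.sent)
```

Returning a boolean makes it easy to log or reject messages that could not be delivered, for example when the target room no longer exists.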

Step 4: The originator peerA receives the answer message and sets its local and remote descriptions.

// peerA
// In practice this branch belongs in the same socketA.onmessage handler as 'join'
socketA.onmessage = async evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'answer') {
    let answer = new RTCSessionDescription({
      type: 'answer',
      sdp: value[2]
    })
    await peer.setLocalDescription(offer)
    await peer.setRemoteDescription(answer)
  }
}

Step 5: peerA listens for the icecandidate event and sends the candidate data. This event fires after peer.setLocalDescription is executed in the previous step:

// peerA
// usid is the receiver's userid that was passed to peerInit
peer.onicecandidate = event => {
  if (event.candidate) {
    let candid = event.candidate.toJSON()
    socketA.send(`candid|${usid}|${JSON.stringify(candid)}`)
  }
}

Then listen on peerB and add the candidate:

// peerB
socketB.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'candid') {
    // Assumes the server forwards candid as candid|roomid|json, like offer
    let json = JSON.parse(value[2])
    let candid = new RTCIceCandidate(json)
    peer.addIceCandidate(candid)
  }
}

Ok, that’s it!!

This article covers a lot, so I strongly recommend writing the code yourself; that is how you will really understand the one-to-many communication flow. Note that this chapter does not implement NAT traversal (hole punching). The signaling server can be deployed on a remote server, but the WebRTC clients must still be on the same LAN.

In the next article, the third in the WebRTC series, we implement the ICE server.

Join a study group

This article comes from the public account: programmer success. It mainly shares technical knowledge about front-end engineering and architecture. You are welcome to follow the public account and click “add group” to join the study group, where we can learn and improve together!