Moment For Technology

A step-by-step guide to implementing a load balancer

Posted on Dec. 2, 2022, 4:07 a.m. by Kenneth Bruce
Category: Front end Tags: Front end, Load balancing

Writing is not easy. Reprinting in any form without the author's permission is prohibited. If you find the article useful, feel free to follow, like, and share.

Introduction

Load balancing means distributing load (work tasks) across multiple operating units, such as web servers, enterprise core application servers, and other major task servers, according to some algorithm, so that the units complete the work cooperatively. Built on top of the existing network structure, load balancing provides a transparent and effective way to expand the bandwidth of servers and network devices, strengthen network data processing capacity, increase throughput, improve network availability and flexibility, and withstand higher levels of concurrency.

Simply put, it forwards a large number of concurrent requests to multiple back-end nodes for processing, which reduces response time.

  • Avoid wasting resources
  • Avoid service unavailability

I. Classification

Layer 4 (Transport Layer)

Layer 4 is the transport layer in the OSI seven-layer model. It includes TCP and UDP. Packets of both protocols carry source and destination IP addresses as well as source and destination port numbers. After receiving a request from a client, a layer 4 load balancer rewrites the address information (IP + PORT) of the packet and forwards it to the application server.

Layer 7 (Application Layer)

Proxy load balancing

Layer 7 is the application layer in the OSI seven-layer model. There are many application-layer protocols, of which HTTP and HTTPS are the most commonly used, and layer 7 load balancing can balance traffic for these protocols. Application-layer protocols carry a lot of meaningful content. For example, when balancing load across the same group of web servers, in addition to IP + PORT, the decision can also be based on layer 7 information such as the URL, cookies, browser type, language, and request type.

The essence of layer 4 load balancing is forwarding, while the essence of layer 7 load balancing is content exchange and proxy.
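To make the difference concrete, here is a minimal sketch of the two kinds of decision. The pool addresses, the `backend` cookie, and the function names `pickLayer4` and `pickLayer7` are all hypothetical and exist only for illustration. A layer 4 balancer can only look at the connection tuple, while a layer 7 balancer can look inside the parsed request.

```javascript
// Hypothetical back-end pools; the addresses are placeholders.
const pools = {
  static: ["10.0.0.11:8080"],
  api: ["10.0.0.21:8080", "10.0.0.22:8080"],
};

// Layer 4: only the connection tuple is visible, so the choice depends on IP + port.
function pickLayer4(conn, backends) {
  const key = `${conn.srcIp}:${conn.srcPort}`;
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return backends[sum % backends.length];
}

// Layer 7: the parsed request is visible, so the URL and cookies can drive the choice.
function pickLayer7(req) {
  if (req.url.startsWith("/static/")) return pools.static[0];
  // A hypothetical "backend" cookie pins the client to one API node.
  const sticky = /(?:^|;\s*)backend=(\d+)/.exec(req.headers.cookie || "");
  if (sticky) return pools.api[Number(sticky[1]) % pools.api.length];
  return pools.api[0];
}
```

The layer 4 function never needs to parse the payload, which is why it is cheaper. The layer 7 function has to terminate and parse the request before it can route it.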

                       | Layer 4 load balancing                | Layer 7 load balancing
-----------------------+---------------------------------------+-------------------------------------------------------------------
Based on               | IP + PORT                             | URL or host IP address
Similar to             | Router                                | Proxy server
Complexity             | Low                                   | High
Performance            | High, no need to parse the content    | The algorithm is required to identify URLs, headers, cookies, etc.
Security               | Low, unable to identify DDoS attacks  | High, it can defend against SYN flood attacks
Extended functionality | None                                  | Content caching, image hotlink protection, and so on

II. Common algorithms

Prerequisite data structures

interface urlObj {
  url: string;
  weight: number; // Only used by weighted round robin
}
urlDesc: urlObj[]
interface urlCollectObj {
  count: number; // Number of connections
  costTime: number; // Response time
  connection: number; // Number of real-time connections
}
// Keyed by URL (see the DataBase class in section V)
urlCollect: Record<string, urlCollectObj>



Random

const Random = (urlDesc) => {
  let urlCollect = [];
  // Collect the URLs
  urlDesc.forEach((val) => {
    urlCollect.push(val.url);
  });
  return () => {
    // Generate a random index and return the corresponding URL
    const pos = Math.floor(Math.random() * urlCollect.length);
    return urlCollect[pos];
  };
};
module.exports = Random;

Weighted Round Robin

Weighted round-robin algorithm

const WeiRoundRobin = (urlDesc) => {
  let pos = 0,
    urlCollect = [],
    copyUrlDesc = JSON.parse(JSON.stringify(urlDesc));
  // Collect URLs by weight: each pass adds every URL that still has weight left
  while (copyUrlDesc.length > 0) {
    for (let i = 0; i < copyUrlDesc.length; i++) {
      if (copyUrlDesc[i].weight === 0) {
        // Weight used up: remove the entry and stay on the same index
        copyUrlDesc.splice(i, 1);
        i--;
      } else {
        urlCollect.push(copyUrlDesc[i].url);
        copyUrlDesc[i].weight--;
      }
    }
  }
  // Return a function that polls through the collected URLs
  return () => {
    const res = urlCollect[pos++];
    if (pos === urlCollect.length) {
      pos = 0;
    }
    return res;
  };
};
module.exports = WeiRoundRobin;
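As a worked example of the weight expansion, the sketch below builds the polling sequence for three placeholder URLs. The helper name `expandByWeight` and the sample weights are assumptions for illustration, not part of the original repository.

```javascript
// Expand weights into a polling sequence: each pass appends every URL that still has weight left.
function expandByWeight(urlDesc) {
  const remaining = urlDesc.map(({ url, weight }) => ({ url, weight }));
  const sequence = [];
  while (remaining.some((item) => item.weight > 0)) {
    for (const item of remaining) {
      if (item.weight > 0) {
        sequence.push(item.url);
        item.weight--;
      }
    }
  }
  return sequence;
}

const sequence = expandByWeight([
  { url: "a", weight: 3 },
  { url: "b", weight: 1 },
  { url: "c", weight: 2 },
]);
console.log(sequence.join(" ")); // a b c a c a
```

Because the URLs are interleaved pass by pass, the heaviest node is spread through the cycle instead of receiving all of its requests in one burst.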

IP Hash / URL Hash

Hash of the source IP or URL

const { Hash } = require("../util");
const IpHash = (urlDesc) => {
  let urlCollect = [];
  for (const key in urlDesc) {
    // Collect the URLs
    urlCollect.push(urlDesc[key].url);
  }
  return (sourceInfo) => {
    // Hash the source information to a decimal value
    const hashInfo = Hash(sourceInfo);
    // Use the value modulo the number of URLs as the index
    const urlPos = Math.abs(hashInfo) % urlCollect.length;
    return urlCollect[urlPos];
  };
};
module.exports = IpHash;
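The Hash helper imported from util is not shown in the article. As a rough sketch of what such a helper could look like, the code below uses a simple 31-multiplier string hash. The names `hashString` and `pickByHash` are hypothetical, and the hash function is an assumption rather than the repository's actual implementation.

```javascript
// Hypothetical string hash (31-multiplier, truncated to a signed 32-bit integer).
function hashString(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = (hash * 31 + str.charCodeAt(i)) | 0;
  }
  return hash;
}

// Map a source (an IP or a URL) to one of the back-end URLs.
function pickByHash(source, urls) {
  // The hash can be negative, so take the absolute value before the modulo.
  return urls[Math.abs(hashString(source)) % urls.length];
}

const urls = ["http://localhost:16666", "http://localhost:16667", "http://localhost:16668"];
// The same source always lands on the same node.
console.log(pickByHash("192.168.1.10", urls) === pickByHash("192.168.1.10", urls)); // true
```

The trade-off of a plain modulo is that changing the number of nodes remaps most sources, which is the problem consistent hashing addresses.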

Consistent Hash

Consistent hashing

const { Hash } = require("../util");
const ConsistentHash = (urlDesc) => {
  let urlHashMap = {},
    hashCollect = [];
  for (const key in urlDesc) {
    // Collect each URL hash into the array and build the hash-to-URL map
    const { url } = urlDesc[key];
    const hash = Hash(url);
    urlHashMap[hash] = url;
    hashCollect.push(hash);
  }
  // Sort the hash array in ascending order
  hashCollect = hashCollect.sort((a, b) => a - b);
  return (sourceInfo) => {
    // Hash the source information to a decimal value
    const hashInfo = Hash(sourceInfo);
    // Find the first node hash that is not smaller than the source hash and return its URL.
    // A for...of loop is used because a return inside forEach would not exit this function.
    for (const val of hashCollect) {
      if (val >= hashInfo) {
        return urlHashMap[val];
      }
    }
    // If none is found, return the URL with the largest hash
    return urlHashMap[hashCollect[hashCollect.length - 1]];
  };
};
module.exports = ConsistentHash;
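The point of consistent hashing is that removing a node only remaps the keys that were on that node. The sketch below demonstrates this with the common wrap-around variant of the ring, which differs slightly from the fallback used above. The names `hashString`, `buildRing`, and `lookup`, as well as the hash function itself, are assumptions for illustration.

```javascript
// Hypothetical string hash, truncated to an unsigned 32-bit integer.
function hashString(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = (hash * 31 + str.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Place every URL on the ring, sorted by hash.
function buildRing(urls) {
  return urls
    .map((url) => ({ hash: hashString(url), url }))
    .sort((a, b) => a.hash - b.hash);
}

// Walk clockwise to the first node at or after the key, wrapping to the first node.
function lookup(ring, key) {
  const h = hashString(key);
  const node = ring.find((n) => n.hash >= h);
  return (node || ring[0]).url;
}

const nodes = ["http://localhost:16666", "http://localhost:16667", "http://localhost:16668"];
const removed = nodes[1];
const before = buildRing(nodes);
const after = buildRing(nodes.filter((url) => url !== removed));

// Count keys that changed node even though they were not on the removed node.
let moved = 0;
for (let i = 0; i < 200; i++) {
  const key = `192.168.0.${i}`;
  const owner = lookup(before, key);
  if (owner !== removed && lookup(after, key) !== owner) moved++;
}
console.log(moved); // 0
```

A production ring would usually add virtual nodes so that keys are spread more evenly. That is left out here to keep the sketch short.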

Least Connections

Least number of connections

const leastConnections = () => {
  return (urlCollect) => {
    let min = Number.POSITIVE_INFINITY,
      url = "";
    // Iterate over the statistics object to find the address with the fewest connections
    for (let key in urlCollect) {
      const val = urlCollect[key].connection;
      if (val < min) {
        min = val;
        url = key;
      }
    }
    return url;
  };
};
module.exports = leastConnections;

Note: urlCollect is the load balancing statistics object. Each entry has the following properties:

  • connection: number of real-time connections
  • count: number of requests processed
  • costTime: response time


Fair

Minimum response time

const Fair = () => {
  return (urlCollect) => {
    let min = Number.POSITIVE_INFINITY,
      url = "";
    // Find the URL with the shortest response time
    for (const key in urlCollect) {
      const urlObj = urlCollect[key];
      if (urlObj.costTime < min) {
        min = urlObj.costTime;
        url = key;
      }
    }
    return url;
  };
};
module.exports = Fair;
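Least Connections and Fair are the same selection over different fields of the statistics object. The sketch below shows this with made-up sample data. The helper name `pickMin` and all the numbers are assumptions for illustration.

```javascript
// Sample statistics object keyed by URL; the values are invented.
const urlCollect = {
  "http://localhost:16666": { count: 12, costTime: 40, connection: 3 },
  "http://localhost:16667": { count: 9, costTime: 55, connection: 1 },
  "http://localhost:16668": { count: 15, costTime: 30, connection: 2 },
};

// Return the URL whose `field` value is the smallest.
function pickMin(stats, field) {
  let min = Number.POSITIVE_INFINITY;
  let url = "";
  for (const key of Object.keys(stats)) {
    if (stats[key][field] < min) {
      min = stats[key][field];
      url = key;
    }
  }
  return url;
}

console.log(pickMin(urlCollect, "connection")); // http://localhost:16667
console.log(pickMin(urlCollect, "costTime")); // http://localhost:16668
```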

Having seen these, don't the algorithms feel quite simple?

Looking forward to the implementation in section V?

III. Health monitoring

Health monitoring checks the health of the application servers. A health monitoring policy prevents requests from being forwarded to an application server that is not working properly. You can adjust the policy and the check frequency according to how sensitive the service is.

HTTP/HTTPS health check steps (Layer 7)

  1. The load balancing node sends a HEAD request to the application server.
  2. After receiving the HEAD request, the application server returns the corresponding status code.
  3. If no status code is received within the timeout period, the request is considered to have timed out and the health check fails.
  4. If the status code is received within the timeout period, the load balancing node compares it with the expected status codes to determine whether the health check succeeded.
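The steps above can be sketched with Node's built-in http module. This is only an illustration. The function names `isHealthyStatus` and `httpHealthCheck` are hypothetical, and treating 2xx and 3xx as healthy is an assumption, since the article does not say which status codes count as success.

```javascript
const http = require("http");

// Step 4: compare the status code with the codes considered healthy (assumed: 2xx and 3xx).
function isHealthyStatus(statusCode) {
  return statusCode >= 200 && statusCode < 400;
}

// Steps 1 to 3: send a HEAD request and treat a timeout or network error as a failed check.
function httpHealthCheck(url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const req = http.request(url, { method: "HEAD", timeout: timeoutMs }, (res) => {
      res.resume(); // discard any body so the socket is released
      resolve(isHealthyStatus(res.statusCode));
    });
    // The timeout event does not abort the request by itself, so destroy it explicitly.
    req.on("timeout", () => req.destroy(new Error("health check timed out")));
    req.on("error", () => resolve(false));
    req.end();
  });
}
```

A caller could run `httpHealthCheck("http://localhost:16666/")` on a timer and remove the node from the URL list while the result is false.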

TCP Health Check Steps (Layer 4)

  1. The load balancing node sends a TCP SYN request packet to the IP address + PORT of the Intranet application server.
  2. After receiving the request, the Intranet application server returns a SYN + ACK packet if it is listening normally.
  3. If no data packet is received within the timeout period, the system determines that the service does not respond, the health check fails, and sends an RST packet to the Intranet application server to interrupt the TCP connection.
  4. If the packet returned is received within the timeout period, the system determines that the service is running healthily and sends an RST packet to interrupt the TCP connection.
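A user-space Node program cannot craft SYN or RST packets itself, but the operating system performs the same handshake when a socket connects. The sketch below approximates the TCP check that way. The function name `tcpHealthCheck` and the default timeout are assumptions for illustration.

```javascript
const net = require("net");

// Resolves to true if a TCP connection to host:port is established within timeoutMs.
function tcpHealthCheck(host, port, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.createConnection({ host, port });
    let settled = false;
    const finish = (healthy) => {
      if (settled) return;
      settled = true;
      socket.destroy(); // close the probe connection right away
      resolve(healthy);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true)); // handshake completed: the port is listening
    socket.once("timeout", () => finish(false)); // no reply within the timeout
    socket.once("error", () => finish(false)); // for example, connection refused
  });
}
```

The check never sends application data. It only verifies that something accepts connections on the port, which is why it is cheaper but less precise than the HTTP check.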

UDP Health Check Procedure (Layer 4)

  1. The load balancing node sends UDP packets to the IP address + PORT of the Intranet application server.
  2. If the Intranet application server is not listening on the port, the system returns an ICMP error message such as "port XX unreachable". Otherwise, the server is considered normal.
  3. If an error message is received within the timeout period, the service is abnormal and the health check fails.
  4. If no error message is received within the timeout period, the service is running healthily.

IV. VIP technology

Virtual IP

  • Under TCP/IP, every computer that wants to connect to the Internet, in whatever form, needs a unique IP address. The IP address is actually an abstraction of the physical address of the host hardware.

  • Simply put, there are two kinds of addresses

    • MAC Physical Address
    • IP Logical address
  • A virtual IP address is an IP address that is not tied to a specific real host. That is, a server that provides services externally has both a real IP address and a virtual IP address, and the host can be reached through either of them.

    • The virtual IP address maps to the MAC address of the real host
  • Virtual IP addresses are generally used for high availability. For example, the database connection configuration in every project uses the virtual IP address. When the primary server fails and can no longer provide service, the virtual IP address is dynamically switched to the standby server.

Virtual IP Principles

  1. ARP (Address Resolution Protocol) translates an IP address into a MAC address.
  2. Each host has an ARP cache that stores the mapping between IP addresses and MAC addresses on the same network. Before sending data, the host looks up the MAC address for the target IP address in the ARP cache and sends the data to that MAC address. The operating system maintains this cache automatically.
  3. On Linux, the arp command can be used to inspect and modify the ARP cache.
  • For example, suppose there are two hosts, A and B. A serves as the primary server and B serves as the backup server. The two servers communicate with each other through the HeartBeat service.

  • That is, the primary server periodically sends packets to the backup server to tell it that the primary server is working normally. If the backup server does not receive a heartbeat from the primary server within a specified period of time, it considers the primary server to be down.

  • At this point, the backup server is promoted to primary server.

    • Server B broadcasts its own ARP information, telling the router to update its table so that the virtual IP address points to server B.

    • When an external client accesses the virtual IP again, machine B is the primary server and machine A is the backup server.

    • This completes the switch between primary and backup. The whole process is transparent and imperceptible to the outside.
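The heartbeat logic on the backup server can be sketched as a small state machine. This is a simplified model under stated assumptions. The class name `BackupMonitor`, the timeout value, and the injectable clock are hypothetical, and the actual ARP announcement is only marked with a comment.

```javascript
// Tracks heartbeats from the primary and decides when the backup should take over.
class BackupMonitor {
  constructor(timeoutMs, now = Date.now) {
    this.timeoutMs = timeoutMs;
    this.now = now; // injectable clock, which makes the logic easy to test
    this.lastHeartbeat = now();
    this.role = "backup";
  }

  // Called whenever a heartbeat packet arrives from the primary.
  onHeartbeat() {
    this.lastHeartbeat = this.now();
  }

  // Called periodically; promotes this node if the primary has been silent for too long.
  check() {
    if (this.role === "backup" && this.now() - this.lastHeartbeat > this.timeoutMs) {
      this.role = "primary";
      // A real system would now announce that the virtual IP maps to this machine.
    }
    return this.role;
  }
}

// Usage with a fake clock.
let clock = 0;
const monitor = new BackupMonitor(3000, () => clock);
clock = 2000;
monitor.onHeartbeat();
clock = 4000;
console.log(monitor.check()); // backup
clock = 5001;
console.log(monitor.check()); // primary
```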

V. Implementing a simple load balancer with Node.js

Those who want to implement the load balancer by hand or read the source code can check out the code repository.

Expected result

After editing config.js, run npm run start to start the balancer and the back-end service nodes.

  • urlDesc: back-end service node configuration. weight is used only by the WeightRoundRobin algorithm
  • port: the port the balancer listens on
  • algorithm: name of the algorithm (all the algorithms in section II are implemented)
  • workerNum: number of processes started per back-end service port, to provide concurrency
  • balancerNum: number of processes started on the balancer port, to provide concurrency
  • workerFilePath: the file executed by the back-end service nodes. An absolute path is recommended.
const path = require("path");
const {ALGORITHM, BASE_URL} = require("./constant");
module.exports = {
    urlDesc: [
        {url: `${BASE_URL}:${16666}`, weight: 6},
        {url: `${BASE_URL}:${16667}`, weight: 1},
        {url: `${BASE_URL}:${16668}`, weight: 1},
        {url: `${BASE_URL}:${16669}`, weight: 1},
        {url: `${BASE_URL}:${16670}`, weight: 2},
        {url: `${BASE_URL}:${16671}`, weight: 1},
        {url: `${BASE_URL}:${16672}`, weight: 4},
    ],
    port: 8080,
    algorithm: ALGORITHM.RANDOM,
    workerNum: 5,
    balancerNum: 5,
    workerFilePath: path.resolve(__dirname, "./worker.js")
}

Architecture diagram

Let's start with the main process, main.js

  1. Initialize the load balancing statistics object balanceDataBase

    • balanceDataBase is an instance of the DataBase class, which collects load balancing data (described later).
  2. Run the balancer

    • Multi-process model to provide concurrency.
  3. Run the back-end service nodes

    • Multi-threaded + multi-process model, which runs multiple service nodes and provides concurrency.
const {urlDesc, balancerNum} = require("./config")
const cluster = require("cluster");
const path = require("path");
const cpusLen = require("os").cpus().length;
const {DataBase} = require("./util");
const {Worker} = require('worker_threads');
const runWorker = () => {
    // Prevent the number of listening ports from exceeding the number of CPU cores
    const urlObjArr = urlDesc.slice(0, cpusLen);
    // Create the child threads
    for (let i = 0; i < urlObjArr.length; i++) {
        createWorkerThread(urlObjArr[i].url);
    }
}
const runBalancer = () => {
    // Set the file executed by the child processes
    cluster.setupMaster({exec: path.resolve(__dirname, "./balancer.js")});
    // Create the child processes, capped at the number of CPU cores
    let max
    if (balancerNum) {
        max = balancerNum > cpusLen ? cpusLen : balancerNum
    } else {
        max = 1
    }
    for (let i = 0; i < max; i++) {
        createBalancer();
    }
}
// Initialize the load balancing statistics object
const balanceDataBase = new DataBase(urlDesc);
// Run the balancer
runBalancer();
// Run the back-end service nodes
runWorker();

Create the balancer (createBalancer function)

  1. Create a process

  2. Listen for process communication messages

    • Listen for response-time update events and run the update function

      • Used by the FAIR algorithm (minimum response time).
    • Listen for requests for the statistics object and return it

  3. Listen for abnormal exits and recreate the process (process daemon).

const createBalancer = () => {
    // Create the process
    const worker = cluster.fork();
    worker.on("message", (msg) => {
        // Handle response-time update events
        if (msg.type === "updateCostTime") {
            balanceDataBase.updateCostTime(msg.URL, msg.costTime)
        }
        // Handle requests for the URL statistics object and reply with it
        if (msg.type === "getUrlCollect") {
            worker.send({type: "getUrlCollect", urlCollect: balanceDataBase.urlCollect})
        }
    });
    // Recreate the process when it exits abnormally
    worker.on("exit", () => {
        createBalancer();
    });
}

Create a back-end service node (createWorkerThread function)

  1. Create a thread

  2. Parse the port to listen on

  3. Send the port to listen on to the child thread

  4. Listen for child thread events through thread communication

    • Listen for connection events and fire the handler.
    • Listen for disconnection events and fire the handler.
    • These events are used to collect statistics on load distribution and the number of real-time connections.
  5. Listen for abnormal exits and recreate the thread (thread daemon).

const createWorkerThread = (listenUrl) => {
    // Create a thread
    const worker = new Worker(path.resolve(__dirname, "./workerThread.js"));
    // Get the port to listen on
    const listenPort = listenUrl.split(":")[2];
    // Send the child thread the port number to listen on
    worker.postMessage({type: "port", port: listenPort});
    // Receive messages from the child thread to count how often the node is accessed
    worker.on("message", (msg) => {
        // Connection event: increase the counters
        if (msg.type === "connect") {
            balanceDataBase.add(msg.port);
        }
        // Disconnection event: decrease the real-time connection count
        else if (msg.type === "disconnect") {
            balanceDataBase.sub(msg.port);
        }
    });
    // Recreate the thread when it exits abnormally
    worker.on("exit", () => {
        createWorkerThread(listenUrl);
    });
}

Now take a look at the balancer workflow, balancer.js

  1. Get the getURL utility function

  2. Listen for requests and proxy them

    • Get the parameters to pass to the getURL utility function.
    • Obtain the destination URL for the proxied request through the getURL function.
    • Record the request start time.
    • Handle cross-origin access.
    • Return the response.
    • Trigger a response-time update event through process communication.

Note 1: The LoadBalance function returns a different getURL utility function depending on the algorithm name. See section II, Common algorithms, for the implementation of each algorithm.

Note 2: The getSource function prepares the parameters that each algorithm needs and returns them. getURL is the URL utility function mentioned above.

const cpusLen = require("os").cpus().length;
const LoadBalance = require("./algorithm");
const express = require("express");
const axios = require("axios");
const app = express();
const {urlFormat, ipFormat} = require("./util");
const {ALGORITHM, BASE_URL} = require("./constant");
const {urlDesc, algorithm, port} = require("./config");
const run = () => {
    // Get the utility function that returns the forwarding URL
    const getURL = LoadBalance(urlDesc.slice(0, cpusLen), algorithm);
    // Listen for requests and proxy them
    app.get("/", async (req, res) => {
        // Get the parameters to pass in
        const source = await getSource(req);
        // Get the target URL
        const URL = getURL(source);
        // res.redirect(302, URL)
        // Record the request start time (Date.now() is assumed; the original expression was lost)
        const start = Date.now();
        // Proxy the request
        axios.get(URL).then(async (response) => {
            // Obtain the load balancing statistics object and attach it to the response
            const urlCollect = await getUrlCollect();
            // Handle cross-origin access (header assumed; the original line was lost)
            res.setHeader("Access-Control-Allow-Origin", "*");
            // Left-hand side assumed; only "= urlCollect" survives in the original
            response.data.urlCollect = urlCollect;
            // Return the data (call assumed; the original line was lost)
            res.send(response.data);
            // Record the elapsed time and report it
            const costTime = Date.now() - start;
            process.send({type: "updateCostTime", costTime, URL})
        });
    });
    // The load balancing server starts listening for requests
    app.listen(port, () => {
        console.log(`Load Balance Server Running at ${BASE_URL}:${port}`);
    });
};
const getSource = async (req) => {
    switch (algorithm) {
        case ALGORITHM.IP_HASH:
            return ipFormat(req);
        case ALGORITHM.URL_HASH:
            return urlFormat(req);
        case ALGORITHM.CONSISTENT_HASH: // constant name assumed; the original case label was lost
            return urlFormat(req);
        case ALGORITHM.LEAST_CONNECTIONS: // constant name assumed; the original case label was lost
            return await getUrlCollect();
        case ALGORITHM.FAIR:
            return await getUrlCollect();
        default:
            return null;
    }
};
run();

How the balancer obtains the load balancing statistics object: getUrlCollect

  1. Send a request message to the parent process through process communication.
  2. At the same time, start listening for messages from the parent process, and resolve the Promise once the reply is received.
// Obtain the load balancing statistics object
const getUrlCollect = () => {
    return new Promise((resolve, reject) => {
        try {
            process.send({type: "getUrlCollect"})
            process.on("message", msg => {
                if (msg.type === "getUrlCollect") {
                    resolve(msg.urlCollect);
                }
            })
        } catch (e) {
            reject(e);
        }
    })
}
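One thing to watch with this request and reply pattern is that every call registers another "message" listener. The sketch below shows one possible way to remove the listener once the reply arrives. It uses an EventEmitter as a hypothetical stand-in for the IPC channel so that it can run in a single process. The names `createFakeChannel` and `requestUrlCollect` are assumptions, not part of the repository.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for the IPC channel: send() delivers a reply as a "message" event.
function createFakeChannel(urlCollect) {
  const channel = new EventEmitter();
  channel.send = (msg) => {
    if (msg.type === "getUrlCollect") {
      channel.emit("message", { type: "getUrlCollect", urlCollect });
    }
  };
  return channel;
}

// Request the statistics object and remove the listener as soon as the reply arrives.
function requestUrlCollect(channel) {
  return new Promise((resolve) => {
    const onMessage = (msg) => {
      if (msg.type !== "getUrlCollect") return; // ignore unrelated messages
      channel.off("message", onMessage); // avoid piling up listeners across calls
      resolve(msg.urlCollect);
    };
    channel.on("message", onMessage);
    channel.send({ type: "getUrlCollect" });
  });
}
```

In the balancer, `process` would play the role of `channel`, since it exposes the same `on`, `off`, and `send` methods when an IPC channel exists.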

How service node concurrency is implemented: workerThread.js

A multi-threaded + multi-process model provides concurrency for each service node.

Main process flow

  1. Create the number of service nodes specified in the configuration file.

    • Create a process
    • Listen for messages from the parent thread (the port the service node listens on) and forward them to the child process.
    • Listen for messages from the child process (connection and disconnection events) and forward them to the parent thread.
    • Listen for abnormal exits and recreate the process.
const cluster = require("cluster");
const cpusLen = require("os").cpus().length;
const {parentPort} = require('worker_threads');
const {workerNum, workerFilePath} = require("./config")
if (cluster.isMaster) {
    // Function that creates a worker process
    const createWorker = () => {
        // Create the process
        const worker = cluster.fork();
        // Listen for messages from the parent thread and forward them to the child process
        parentPort.on("message", msg => {
            if (msg.type === "port") {
                worker.send({type: "port", port: msg.port})
            }
        })
        // Listen for messages from the child process and forward them to the parent thread
        worker.on("message", msg => {
            parentPort.postMessage(msg);
        })
        // Recreate the process when it exits unexpectedly
        worker.on("exit", () => {
            createWorker();
        })
    }
    // Create processes based on the configuration, but no more than the number of CPU cores
    let max
    if (workerNum) {
        max = workerNum > cpusLen ? cpusLen : workerNum
    } else {
        max = 1
    }
    for (let i = 0; i < max; i++) {
        createWorker();
    }
} else {
    // The back-end service executes the configured file
    require(workerFilePath)
}

Child process worker.js (config.workerFilePath)

  1. Send a message to the parent process through inter-process communication to trigger the connection event.
  2. Return the response.
  3. Send a message to the parent process through inter-process communication to trigger the disconnection event.
var express = require("express");
var app = express();
let port = null;
app.get("/", (req, res) => {
    // Trigger the connection event
    process.send({type: "connect", port});
    // Print the information
    console.log("HTTP Version: " + req.httpVersion);
    console.log("Connection PORT Is " + port);
    const msg = "Hello My PORT is " + port;
    // Return a response (call and response shape assumed; the original line was lost)
    res.send({msg});
    // Trigger the disconnection event
    process.send({type: "disconnect", port});
});
// Receive the port from the parent's message and start listening
process.on("message", (msg) => {
    if (msg.type === "port") {
        port = msg.port;
        app.listen(port, () => {
            console.log("Worker Listening " + port);
        });
    }
});

Finally, let's look at the DataBase class

  • Members:
  1. status: the task queue status

  2. urlCollect: the statistics object (provides data for the algorithms and for display)

    • count: number of requests processed
    • costTime: response time
    • connection: number of real-time connections
  3. add method

    • Increases the number of connections and the number of real-time connections
  4. sub method

    • Decreases the number of real-time connections
  5. updateCostTime method

    • Updates the response time
const {BASE_URL} = require("./constant");
class DataBase {
    urlCollect = {};
    // Initialization
    constructor (urlObj) {
        urlObj.forEach((val) => {
            this.urlCollect[val.url] = {
                count: 0,
                costTime: 0,
                connection: 0,
            };
        });
    }
    // Increase the number of connections and real-time connections
    add (port) {
        const url = `${BASE_URL}:${port}`;
        this.urlCollect[url].count++;
        this.urlCollect[url].connection++;
    }
    // Reduce the number of real-time connections
    sub (port) {
        const url = `${BASE_URL}:${port}`;
        this.urlCollect[url].connection--;
    }
    // Update the response time
    updateCostTime (url, time) {
        this.urlCollect[url].costTime = time;
    }
}

The final result

I made a visualization to show the balancing effect (Random) ✔️

Looks like a pretty good balancer, doesn't it?

A little homework

Those who want to implement the load balancer by hand or read the source code can check out the code repository.

VI. Knowledge expansion

Why can cluster processes listen on a single port?

  1. cluster.isMaster is used to determine whether the current process is the main process. The main process does not handle tasks itself. It is only responsible for managing and scheduling the worker child processes.

  2. The master process starts the TCP server that actually listens on the port. When a request triggers the connection event of this TCP server, the connection is passed via a handle (over IPC) to a worker process for processing.

    1. Handle passing can forward TCP servers, TCP sockets, UDP sockets, and IPC pipes.
    2. IPC only transports strings, not objects, so messages must be serializable.
    3. Forwarding process: parent process sends -> stringify and send(fd) -> IPC -> get(fd) and parse -> child process receives
    4. fd is the file descriptor of the handle.
  3. How is a worker process selected?

    1. The cluster module has a built-in round-robin algorithm that selects worker processes by polling.
  4. Why not use cluster for load balancing?

    1. Implementing it by hand lets you choose different load balancing algorithms for different scenarios.
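The stringify and parse steps in the forwarding process above have a practical consequence: only JSON-serializable data survives the trip between processes. The sketch below imitates that round trip with JSON.stringify and JSON.parse in a single process. The message fields are made up for illustration.

```javascript
// A message as the sender sees it; the field names are hypothetical.
const message = {
  type: "updateCostTime",
  costTime: 12,
  sentAt: new Date(0),
  onDone: () => {}, // functions cannot be serialized
};

// What travels over the channel is a string.
const wire = JSON.stringify(message);

// What the receiver gets back after parsing.
const received = JSON.parse(wire);

console.log(typeof wire); // string
console.log(received.sentAt); // 1970-01-01T00:00:00.000Z
console.log("onDone" in received); // false
```

This is why the messages in this article carry only plain values such as type, port, costTime, and URL.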

How does Node implement inter-process communication?

  1. Common inter-process communication mechanisms

    • Pipes

      • Anonymous pipes
      • Named pipes
    • Semaphores

    • Shared memory

    • Sockets

    • Message queues

  2. The IPC channel in Node is implemented on top of libuv. On Windows it is implemented with named pipes, and on *nix systems with Unix domain sockets.

  3. At the application layer, inter-process communication is exposed as simple message events and the send() method. The interface is concise and message-oriented.

  4. How is the IPC channel established?

    • Before creating the child process, the parent process creates the IPC channel and passes its file descriptor to the child through an environment variable.
    • The parent process creates the child process.
    • The child process starts and connects to the existing IPC channel through the file descriptor, establishing the connection with the parent process.

Multi-process vs. multi-threaded

Multiple processes

  1. Data sharing is complex and requires IPC. Data is separate, so synchronization is simple.
  2. High memory usage and low CPU utilization.
  3. Creation and destruction are complex and slow.
  4. Processes run independently and do not affect each other.
  5. Can be distributed across multiple machines and multiple cores, so it is easy to scale.

Multiple threads

  1. Threads share the data of their process, so data sharing is simple but synchronization is complex.
  2. Low memory usage and high CPU utilization.
  3. Creation and destruction are simple and fast.
  4. Threads live and die together: if one thread crashes, the whole process goes down.
  5. Can only be distributed across multiple cores of one machine.

VII. Some thoughts prompted by this sharing

Please leave a comment to discuss.

  1. Node.js non-blocking asynchronous I/O is fast. Should front-end developers expand into server-side work?

  2. In enterprise practice, is Node reliable?

    • Alibaba's Node middle-platform architecture
    • Tencent CloudBase cloud development with Node
    • Lots of Node.js full-stack engineer jobs
  3. Is Node unfriendly to computation-intensive work?

    • With serverless, computation-intensive parts can be written in C++/Go/Java and called via FaaS/RPC.
  4. The Node ecosystem is less mature than those of other established languages

    • Alibaba has built and exported a Java ecosystem
    • Can we follow the trend and build up the Node ecosystem to increase the team's influence?
  5. Discussion

    • What advantages does Node.js have as a web back end?

