Written at the beginning

I recently published the book "Massive Data Processing and Big Data Technology in Practice". For details, follow the "Glacier Technology" WeChat official account and see the article announcing that "Massive Data Processing and Big Data Technology in Practice" has been published!

An Nginx+Keepalived dual-machine hot-standby environment based on master/slave mode? That has to be covered! Without further ado, let's get right into it.

Load balancing technology

Load balancing technology is very important for a website, especially for a large web server cluster! A good load-balancing architecture provides failover and a highly available environment, avoids single points of failure, and keeps the website running healthily and continuously.

As the business expands, site traffic keeps growing and the load keeps rising. We now need to place Nginx as a load balancer at the web front end and combine it with Keepalived to make the front-end Nginx highly available (HA).

1) The Nginx process model is a master + worker (slave) multi-process model, with a very stable worker-process management mechanism. In this model the master process does not handle requests itself but distributes tasks to the workers, which keeps the master process highly reliable. All service signals for the worker processes are sent by the master process, and all timed-out worker tasks are terminated by the master; it is a non-blocking task model.

2) Keepalived is a highly reliable tool for implementing VRRP backup routing on Linux. A service designed around Keepalived can achieve a truly seamless IP switchover between the master and backup servers when a failure occurs. Combining the two produces a fairly stable software load-balancing (LB) solution.

Introduction to Keepalived

Keepalived is a high-availability solution based on VRRP that avoids a single point of failure for an IP address. Similar tools include Heartbeat, Corosync, and Pacemaker. However, Keepalived generally does not work on its own; it is combined with load-balancing technologies such as LVS, HAProxy, and Nginx to make a cluster highly available.

VRRP protocol

VRRP stands for Virtual Router Redundancy Protocol. It can be thought of as a fault-tolerance protocol for making routers highly available. N routers providing the same function form a router group with one master and several backups, but the group looks like a single router from the outside, forming a virtual router with a virtual IP address (the VIP, which is the default route of the other machines in the router's LAN). The master that owns this IP address is the one that actually answers ARP requests and forwards IP packets, while the other routers in the group stand by as backups. The master sends multicast advertisements; if the backups fail to receive VRRP packets within the timeout period, the master is considered down, and a new master is elected from the backups according to VRRP priority, which keeps the router highly available.

In VRRP, the virtual router uses 00-00-5E-00-01-XX as its virtual MAC address, where XX is the unique Virtual Router IDentifier (VRID); only one physical router occupies this address at a time. Within the virtual router, the physical routers in the group periodically send advertisements to the multicast IP address 224.0.0.18. Each router has a priority between 1 and 255, and the one with the highest priority becomes the master. By lowering the master's priority, a router in the backup state can preempt the master role. When backups have the same priority, the one with the larger IP address becomes the master and takes over the virtual IP address.
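If you want to see this mechanism in action, you can simply watch the advertisement traffic on the multicast address. This is only a quick illustration (assuming the NIC is em1, as in the environment described later in this article):

[root@master-node ~]# tcpdump -i em1 -nn host 224.0.0.18      # VRRP advertisements from the current master, roughly one per advert_int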

Keepalived compared to Heartbeat/Corosync etc

Heartbeat, Corosync, Keepalived which cluster component should we choose?

First of all, Heartbeat and Corosync are the same kind of tool, while Keepalived and Corosync are not the same kind at all: Keepalived is based on the Virtual Router Redundancy Protocol (VRRP), whereas Heartbeat and Corosync provide high availability for hosts or network services.

Simply put, Keepalived is intended to mimic the high availability of routers, and Heartbeat or Corosync is intended to make services highly available. So Keepalived is generally used for front-end high availability; common front-end combinations are LVS+Keepalived, Nginx+Keepalived, and HAProxy+Keepalived. Heartbeat or Corosync, on the other hand, is used to make services highly available; common combinations include Heartbeat v3 (Corosync)+Pacemaker+NFS+Httpd for highly available web servers and Heartbeat v3 (Corosync)+Pacemaker+NFS+MySQL for highly available MySQL servers.

To sum up, Keepalived is a lightweight high-availability tool, generally used for front-end high availability on two nodes, and it does not require shared storage. Heartbeat (or Corosync) is typically used for service-level high availability, usually across multiple nodes, and typically requires shared storage. With that clear, let's move on.

Heartbeat or Corosync?

Corosync is generally preferred because it works better than Heartbeat; even Pacemaker, which split off from Heartbeat, has said it will favor Corosync in future development. So, for now, Corosync + Pacemaker is the best combination.

Two-node high availability (HA) is generally implemented with a virtual IP (floating IP), based on the IP alias feature of Linux/Unix.
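To make the idea concrete, a floating IP is just an extra address that can be attached to or detached from a NIC. The following is only a hand-run sketch of what the failover tooling does on your behalf (the addresses and NIC name are taken from the environment described later in this article; the prefix length depends on your subnet, and arping here is the iputils version):

[root@master-node ~]# ip addr add 103.110.98.20/26 dev em1       # bind the VIP to the active node
[root@master-node ~]# arping -c 3 -A -I em1 103.110.98.20        # send gratuitous ARP so the LAN learns the new owner of the VIP
[root@master-node ~]# ip addr del 103.110.98.20/26 dev em1       # release the VIP when failing over to the other node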

There are two types of two-node high availability methods:

1) Dual-node active/standby mode: two front-end servers are used, one active and one hot standby. Normally the active server is bound to the public virtual IP address and provides the load-balancing service, while the hot standby sits idle. When the active server fails, the hot standby takes over its public virtual IP address and provides the load-balancing service. The downside is that the hot standby stays idle and wasted as long as the active server does not fail, so this scheme is not very economical for sites with few servers.

2) Dual-node active/active mode: two load-balancing servers are used at the front end, both in the active state, and each is bound to a public virtual IP address and provides the load-balancing service. When one fails, the other takes over the failed server's public virtual IP address (the surviving machine then handles all requests). This approach is cost-effective and fits the current architecture well.

Nginx+Keepalived high-availability load balancing in master/slave mode

Keepalived can be thought of as the implementation of VRRP on Linux. It consists of three main modules: core, check, and vrrp.

  • The core module is the core of Keepalived, which is responsible for the startup and maintenance of the main process as well as the loading and parsing of the global configuration file.
  • The check module is responsible for health checks, including the commonly used check methods such as TCP and HTTP checks.
  • The VRRP module implements the VRRP protocol.

The environment

Operating system: CentOS 6.8, 64-bit
Master machine (master-node): 103.110.98.14 / 192.168.1.14
Slave machine (slave-node): 103.110.98.24 / 192.168.1.24
Public virtual IP (VIP): 103.110.98.20 (the domain names served by the load balancer resolve to this VIP)

Application environment:

Environment installation

Install the Nginx and Keepalived services. (The installation procedure is the same on the master-node and slave-node servers.)

Install dependencies

[root@master-node ~]# yum -y install gcc pcre-devel zlib-devel openssl-devel

You can go to the link: download.csdn.net/download/l1…

[root@master-node ~]# cd /usr/local/src/
[root@master-node src]# wget http://nginx.org/download/nginx-1.9.7.tar.gz
[root@master-node src]# wget http://www.keepalived.org/software/keepalived-1.3.2.tar.gz

Install nginx

[root@master-node src]# tar -zvxf nginx-1.9.7.tar.gz
[root@master-node src]# cd nginx-1.9.7

Add the www user. The -M option means do not create a home directory for the user, and the -s option specifies the shell type.

[root@master-node nginx-1.9.7]# useradd www -M -s /sbin/nologin
[root@master-node nginx-1.9.7]# vim auto/cc/gcc
# Disable debug compile mode, at around line 179
#CFLAGS="$CFLAGS -g"
[root@master-node nginx-1.9.7]# ./configure --prefix=/usr/local/nginx --user=www --group=www --with-http_ssl_module --with-http_flv_module --with-http_stub_status_module --with-http_gzip_static_module --with-pcre
[root@master-node nginx-1.9.7]# make && make install

Install keepalived

[root@master-node src]# tar -zvxf keepalived-1.3.2.tar.gz
[root@master-node src]# cd keepalived-1.3.2
[root@master-node keepalived-1.3.2]# ./configure
[root@master-node keepalived-1.3.2]# make && make install
[root@master-node keepalived-1.3.2]# cp /usr/local/src/keepalived-1.3.2/keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@master-node keepalived-1.3.2]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
[root@master-node keepalived-1.3.2]# mkdir /etc/keepalived
[root@master-node keepalived-1.3.2]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
[root@master-node keepalived-1.3.2]# cp /usr/local/sbin/keepalived /usr/sbin/

Add the nginx and keepalived services to system startup

[root@master-node keepalived-1.3.2]# echo "/usr/local/nginx/sbin/nginx" >> /etc/rc.local
[root@master-node keepalived-1.3.2]# echo "/etc/init.d/keepalived start" >> /etc/rc.local

Configure the service

Disable SELinux and configure the firewall (this is required on both the master and slave load balancers).

[root@master-node ~]# vim /etc/sysconfig/selinux
#SELINUX=enforcing                 # comment out
#SELINUXTYPE=targeted
SELINUX=disabled                   # add this line
[root@master-node ~]# setenforce 0                 # make the configuration take effect immediately

[root@master-node ~]# vim /etc/sysconfig/iptables
-A INPUT -s 103.110.98.0/24 -d 224.0.0.18 -j ACCEPT                    # allow multicast address communication
-A INPUT -s 192.168.1.0/24 -d 224.0.0.18 -j ACCEPT
-A INPUT -s 103.110.98.0/24 -p vrrp -j ACCEPT                          # allow VRRP (Virtual Router Redundancy Protocol) communication
-A INPUT -s 192.168.1.0/24 -p vrrp -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT       # open port 80

[root@master-node ~]# /etc/init.d/iptables restart

Configure nginx

The nginx configuration on the master-node and slave-node servers is exactly the same. The main work is configuring the http section of /usr/local/nginx/conf/nginx.conf; of course you can also set up a virtual-host (vhost) directory and then put the vhost configuration in a file such as lb.conf.

Here, pointing multiple domain names at the balancer is done with virtual hosts (server blocks configured under http); different virtual directories of the same domain are implemented with different location blocks inside each server; and traffic reaches the back-end servers by configuring upstream blocks in vhosts/lb.conf and referencing them with proxy_pass in a server or location block.

To implement the planned access, configure lb.conf as follows (proxy_cache_path and proxy_temp_path are added to enable nginx caching):

[root@master-node ~]# vim /usr/local/nginx/conf/nginx.conf
user  www;
worker_processes  8;
 
#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;
 
#pid logs/nginx.pid;
 
 
events {
    worker_connections  65535;
}
 
 
http {
    include       mime.types;
    default_type  application/octet-stream;
    charset utf-8;
       
    ######
    ## set access log format
    ######
    log_format  main  '$http_x_forwarded_for $remote_addr $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_cookie" $host $request_time';
 
    #######
    ## http setting
    #######
    sendfile       on;
    tcp_nopush     on;
    tcp_nodelay    on;
    keepalive_timeout  65;
    proxy_cache_path /var/www/cache levels=1:2 keys_zone=mycache:20m max_size=2048m inactive=60m;
    proxy_temp_path /var/www/cache/tmp;
 
    fastcgi_connect_timeout 3000;
    fastcgi_send_timeout 3000;
    fastcgi_read_timeout 3000;
    fastcgi_buffer_size 256k;
    fastcgi_buffers 8 256k;
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
    fastcgi_intercept_errors on;
 
    #
    client_header_timeout 600s;
    client_body_timeout 600s;
   # client_max_body_size 50m;
    client_max_body_size 100m;               # Maximum number of bytes per file that a client is allowed to request
    client_body_buffer_size 256k;            # Maximum number of bytes of the request body that are buffered; it can be understood as being saved locally before being passed on
 
    gzip  on;
    gzip_min_length  1k;
    gzip_buffers     4 16k;
    gzip_http_version 1.1;
    gzip_comp_level 9;
    gzip_types       text/plain application/x-javascript text/css application/xml text/javascript application/x-httpd-php;
    gzip_vary on;
 
    ## includes vhosts
    include vhosts/*.conf;
}
[root@master-node ~]# mkdir /usr/local/nginx/conf/vhosts
[root@master-node ~]# mkdir /var/www/cache
[root@master-node ~]# ulimit -n 65535         # raise the max number of open files for the current shell
[root@master-node ~]# vim /usr/local/nginx/conf/vhosts/LB.conf
upstream LB-WWW {
    ip_hash;
    server 192.168.1.101:80 max_fails=3 fail_timeout=30s;    # max_fails=3 is the number of allowed failures; the default is 1
    server 192.168.1.102:80 max_fails=3 fail_timeout=30s;    # fail_timeout=30s is how long requests to this back-end server are suspended after max_fails failures
    server 192.168.1.118:80 max_fails=3 fail_timeout=30s;
}

upstream LB-OA {
    ip_hash;
    server 192.168.1.101:8080 max_fails=3 fail_timeout=30s;
    server 192.168.1.102:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen       80;
    server_name  dev.wangshibo.com;

      access_log  /usr/local/nginx/logs/dev-access.log main;
      error_log  /usr/local/nginx/logs/dev-error.log;

      location /svn {
         proxy_pass http://192.168.1.108/svn/;
         proxy_redirect off;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header REMOTE-HOST $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_connect_timeout 300;             # timeout for connecting to the back-end server (handshake and waiting for a response)
         proxy_send_timeout 300;                # the back-end server must send back all its data within this time
         proxy_read_timeout 600;                # response time of the back-end server after the connection succeeds, i.e. how long the request may wait in the back-end queue for processing
         proxy_buffer_size 256k;                # proxy buffer that holds the user's header information for nginx to process
         proxy_buffers 4 256k;                  # tells nginx how much space to use for buffering a single response
         proxy_busy_buffers_size 256k;          # maximum size of busy proxy_buffers when the system is under load
         proxy_temp_file_write_size 256k;       # size of the proxy's temporary cache files written at a time
         proxy_next_upstream error timeout invalid_header http_500 http_503 http_404;
         proxy_max_temp_file_size 128m;
         proxy_cache mycache;
         proxy_cache_valid 200 302 60m;
         proxy_cache_valid 404 1m;
        }

      location /submin {
         proxy_pass http://192.168.1.108/submin/;
         proxy_redirect off;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header REMOTE-HOST $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_connect_timeout 300;
         proxy_send_timeout 300;
         proxy_read_timeout 600;
         proxy_buffer_size 256k;
         proxy_buffers 4 256k;
         proxy_busy_buffers_size 256k;
         proxy_temp_file_write_size 256k;
         proxy_next_upstream error timeout invalid_header http_500 http_503 http_404;
         proxy_max_temp_file_size 128m;
         proxy_cache mycache;        
         proxy_cache_valid 200 302 60m;
         proxy_cache_valid 404 1m;
        }
    }
    
server {
     listen       80;
     server_name  www.wangshibo.com;
  
      access_log  /usr/local/nginx/logs/www-access.log main;
      error_log  /usr/local/nginx/logs/www-error.log;
  
     location / {
         proxy_pass http://LB-WWW;
         proxy_redirect off ;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header REMOTE-HOST $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_connect_timeout 300;
         proxy_send_timeout 300;
         proxy_read_timeout 600;
         proxy_buffer_size 256k;
         proxy_buffers 4 256k;
         proxy_busy_buffers_size 256k;
         proxy_temp_file_write_size 256k;
         proxy_next_upstream error timeout invalid_header http_500 http_503 http_404;
         proxy_max_temp_file_size 128m;
         proxy_cache mycache;                                
         proxy_cache_valid 200 302 60m;                      
         proxy_cache_valid 404 1m;
        }
}
   
 server {
       listen       80;
       server_name  oa.wangshibo.com;
  
      access_log  /usr/local/nginx/logs/oa-access.log main;
      error_log  /usr/local/nginx/logs/oa-error.log;
  
       location / {
         proxy_pass http://LB-OA;
         proxy_redirect off ;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header REMOTE-HOST $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_connect_timeout 300;
         proxy_send_timeout 300;
         proxy_read_timeout 600;
         proxy_buffer_size 256k;
         proxy_buffers 4 256k;
         proxy_busy_buffers_size 256k;
         proxy_temp_file_write_size 256k;
         proxy_next_upstream error timeout invalid_header http_500 http_503 http_404;
         proxy_max_temp_file_size 128m;
         proxy_cache mycache;
         proxy_cache_valid 200 302 60m;
         proxy_cache_valid 404 1m;
        }
 }

Verify the Nginx configuration

Verification method (make sure the load balancer host can communicate normally with the back-end real servers): 1) Use IP addresses to access the URLs of the back-end real servers configured in lb.conf. 2) Use domain names and paths to access the domains and virtual paths of the back-end real servers configured in lb.conf.
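A quick way to exercise the second check without touching DNS is to let curl send the Host header itself. This is only a sketch using the domain and addresses from this article; once keepalived is up, the same command can also be run against the VIP:

[root@master-node ~]# curl -H "Host: dev.wangshibo.com" http://103.110.98.14/svn/      # domain + virtual path via the master's own IP
[root@master-node ~]# curl -H "Host: dev.wangshibo.com" http://103.110.98.20/submin/   # the same test through the VIP, once it is bound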

Nginx configuration on the back-end application server; 192.168.1.108 is used as an example.

Because 192.168.1.108 is an OpenStack VM, it has no external IP address and cannot resolve the domain name, so the IP address is added to server_name to make it accessible by IP.

[root@108-server ~]# cat /usr/local/nginx/conf/vhosts/svn.conf
server {
listen 80;
#server_name dev.wangshibo.com;
server_name dev.wangshibo.com 192.168.1.108;
access_log /usr/local/nginx/logs/dev.wangshibo-access.log main;
error_log /usr/local/nginx/logs/dev.wangshibo-error.log;

location / {
root /var/www/html;
index index.html index.php index.htm;
}
}

[root@108-server ~]# ll /var/www/html/
drwxr-xr-x. 2 www www 4096 Dec 7 01:46 submin
drwxr-xr-x. 2 www www 4096 Dec 7 01:45 svn
[root@108-server ~]# cat /var/www/html/svn/index.html
this is the page of svn/192.168.1.108
[root@108-server ~]# cat /var/www/html/submin/index.html
this is the page of submin/192.168.1.108
[root@108-server ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.108    dev.wangshibo.com
[root@108-server ~]# curl http://dev.wangshibo.com        # this intranet machine has no Internet access and cannot resolve the domain name, so domain access gets no response; only IP access works
[root@ops-server4 vhosts]# curl http://192.168.1.108
this is 192.168.1.108 page!!!
[root@ops-server4 vhosts]# curl http://192.168.1.108/svn/         # the trailing / must be included, otherwise the page cannot be accessed
this is the page of svn/192.168.1.108
[root@ops-server4 vhosts]# curl http://192.168.1.108/submin/
this is the page of submin/192.168.1.108

Then run the test on the master-node and slave-node load balancers (port 80 must be open in the iptables firewall):

[root@master-node ~]# curl http://192.168.1.108/svn/
this is the page of svn/192.168.1.108
[root@master-node ~]# curl http://192.168.1.108/submin/
this is the page of submin/192.168.1.108

Browser access: bind dev.wangshibo.com in your local hosts file as shown below, that is, bind it to the public IP addresses of the master and slave machines, and test whether it can be accessed normally (once the nginx+keepalived environment is officially finished, the domain name should be resolved to the VIP instead):
103.110.98.14 dev.wangshibo.com
103.110.98.24 dev.wangshibo.com

Keepalived configuration

1) Keepalived configuration on the master-node load balancer

[root@master-node ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
[root@master-node ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived     # global definitions

global_defs {
   notification_email {     # email addresses keepalived notifies when an event (such as a switchover) occurs
     [email protected]   # alarm email addresses; you can set several, one per line; the local sendmail service must be enabled
     [email protected]
   }

   notification_email_from [email protected]   # the sender address keepalived uses when it emails notifications about switchovers and similar events
   smtp_server 127.0.0.1      # the SMTP server used to send the email
   smtp_connect_timeout 30    # timeout for connecting to the SMTP server
   router_id master-node      # an identifier for the machine running keepalived, usually set to the hostname; it appears in the subject of the email sent when a fault occurs
}

vrrp_script chk_http_port {      # checks whether the nginx service is running; there are many ways to do this, for example by process, by script, and so on
    script "/opt/chk_nginx.sh"   # here the monitoring is done with a script
    interval 2                   # the check script runs every 2 seconds
    weight -5                    # priority change caused by the script result; on failure (non-zero return) the priority drops by 5
    fall 2                       # only after 2 consecutive failures is the check considered failed, and the priority is lowered by weight (priority range 1-255)
    rise 1                       # one successful check is enough to be considered healthy; the priority is not changed
}

vrrp_instance VI_1 {    # within the same virtual_router_id, the keepalived node with the highest priority (0-255) becomes master; if the highest-priority host fails, the next priority takes over
    state MASTER    # the keepalived role: MASTER means this host is the primary server, BACKUP means it is the standby. Note that state only sets the initial state of the instance, i.e. the state right after configuration. Even if it is set to MASTER, if its priority is lower than the other node's, then when it advertises its own priority the other node will see that its own priority is higher and will preempt the MASTER role back
    interface em1          # the interface used for HA monitoring, i.e. the NIC the instance is bound to, because the virtual IP must be added to an existing NIC
    mcast_src_ip 103.110.98.14    # source IP for sending multicast packets; pick a stable NIC port here. This is effectively the heartbeat port. If not set, the default IP of the bound NIC (the interface specified above) is used
    virtual_router_id 51         # virtual router ID, a unique number per VRRP instance; MASTER and BACKUP of the same vrrp_instance must use the same value
    priority 101                 # priority; the larger the number, the higher the priority. Within the same vrrp_instance, the MASTER's priority must be higher than the BACKUP's
    advert_int 1                 # interval, in seconds, of the synchronization checks between the MASTER and BACKUP load balancers
    authentication {             # authentication type and password; master and backup must be identical
        auth_type PASS           # VRRP authentication type, PASS or AH
        auth_pass 1111           # within the same vrrp_instance, MASTER and BACKUP must use the same password to communicate
    }
    virtual_ipaddress {          # the VRRP HA virtual address; if there are multiple VIPs, add them on new lines
        103.110.98.20
    }
    track_script {               # the checks to run. Note that this block must not be written immediately after the vrrp_script block, otherwise the nginx monitoring will not take effect!
       chk_http_port             # references the vrrp_script by the name defined above; it is run periodically to change the priority and eventually trigger the master/slave switchover
    }
}

2) Keepalived configuration on the slave-node load balancer

[root@slave-node ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
[root@slave-node ~]# vim /etc/keepalived/keepalived.conf
Copy the code
! Configuration File for keepalived

global_defs {
   notification_email {
     [email protected]
     [email protected]
   }

   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id slave-node
}

vrrp_script chk_http_port {
    script "/opt/chk_nginx.sh"
    interval 2
    weight -5
    fall 2
    rise 1
}

vrrp_instance VI_1 {
    state BACKUP
    interface em1
    mcast_src_ip 103.110.98.24
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        103.110.98.20
    }
    track_script {
       chk_http_port
    }
}

Letting Keepalived monitor the Nginx state:

1) With the configuration above, if keepalived on the master server stops, the slave server automatically takes over the VIP and continues serving; once keepalived on the master server recovers, the master takes the VIP back. But that is not quite what we need: we need the switchover to happen automatically when the Nginx service goes down. 2) Keepalived supports configuring a monitoring script. We can use a script to monitor the state of Nginx: if the state is abnormal, perform a series of recovery actions, and if Nginx still cannot be restored, kill keepalived so that the slave server can take over the service.

The simplest way to monitor the Nginx state is to monitor the Nginx process; a more reliable way is to check the Nginx port; and the most reliable way is to check whether several URLs actually return pages.
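As a sketch of the "most reliable" option (this is not the script this article actually uses below, which only checks the process; the URLs here are placeholders you would replace with your own):

#!/bin/bash
# hypothetical URL-based health check: succeed only if every URL returns HTTP 200
urls="http://127.0.0.1/svn/ http://127.0.0.1/submin/"
for u in $urls; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "$u")
    if [ "$code" != "200" ]; then
        exit 1     # a non-zero exit lets keepalived lower the priority via the weight setting in vrrp_script
    fi
done
exit 0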

Two ways of monitoring with vrrp_script in keepalived.conf:

1) Change the priority based on the script's exit code; keepalived keeps sending advertisements, and the backup decides from the priority whether to take over. This is the approach used when monitoring the Nginx process directly. 2) If the script detects an abnormality, shut down the keepalived process directly; once the backup machine stops receiving advertisements, it seizes the IP. This is the approach used when checking the Nginx port.

In the script configurations above, "killall -0 nginx" belongs to the first case and "/opt/chk_nginx.sh" belongs to the second. Personally, I prefer to decide with a shell script: exit 1 when something is abnormal and exit 0 when everything is normal, and let keepalived decide whether to seize the VIP through the election based on the dynamically adjusted vrrp_instance priority:

  • If the script execution result is 0 and the weight configuration value is greater than 0, the priority is increased accordingly
  • If the script execution result is non-zero and the weight configuration value is less than 0, the priority is reduced accordingly
  • In other cases, the priority remains unchanged, that is, the value of priority in the configuration file (see the worked example below).
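Plugging in the values used later in this article (MASTER priority 101, BACKUP priority 99, weight -5), the election arithmetic works out roughly like this:

master check passes:    101          > 99   -> the master keeps the VIP
master check fails:     101 - 5 = 96 < 99   -> the backup wins the election and takes over the VIP
master check recovers:  96 + 5 = 101 > 99   -> the master preempts the VIP back (unless nopreempt is set, see below)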

Tip: the priority does not keep increasing or decreasing indefinitely.

You can write multiple scripts and set different weights for each script (just list them in the configuration)

If you do not want the MASTER to take the VIP back after it recovers, set nopreempt in the vrrp_instance on the MASTER node; then it will not preempt even if its priority is higher, which avoids unnecessary switchovers under normal circumstances.
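A sketch of what that looks like on the node that should not preempt (note that keepalived only honors nopreempt when the instance's initial state is BACKUP, so in this setup both nodes would be declared BACKUP and the priorities alone decide who becomes master first):

vrrp_instance VI_1 {
    state BACKUP              # nopreempt requires the initial state to be BACKUP
    nopreempt                 # do not seize the VIP back after this node recovers
    interface em1
    mcast_src_ip 103.110.98.14
    virtual_router_id 51
    priority 101              # the higher-priority node still becomes master on a clean start
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        103.110.98.20
    }
    track_script {
        chk_http_port
    }
}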

Scripts can be used to detect the status of service processes and dynamically adjust priorities to achieve active/standby switchover.

Also: the default keepalived.conf contains virtual_server and real_server configurations. We do not need them; they are for LVS.

How do we try to restore the service? Keepalived only checks whether keepalived on this machine and on the other machine is alive and drifts the VIP accordingly; it will not drift the VIP if nginx on this machine fails. So we write a script that checks whether the local nginx is working properly; if nginx is found to be abnormal, the script restarts it, waits a few seconds and checks again, and if it still fails it gives up, shuts down keepalived, and lets the other host take over the VIP.

Based on the strategy above, the monitoring script is easy to write. Note that this script only works while the keepalived service is running: if keepalived is stopped first, nginx will not be restarted automatically after it goes down.

The script below checks the running state of Nginx and tries to restart it if the nginx process does not exist. If the restart fails, it stops keepalived so that the other machine can take over. The monitoring script is as follows (both master and slave must have this script):

[root@master-node ~]# vim /opt/chk_nginx.sh

#!/bin/bash
counter=$(ps -C nginx --no-heading|wc -l)
if [ "${counter}" = "0" ]; then
    /usr/local/nginx/sbin/nginx
    sleep 2
    counter=$(ps -C nginx --no-heading|wc -l)
    if [ "${counter}" = "0" ]; then
        /etc/init.d/keepalived stop
    fi
fi
[root@master-node ~]# chmod 755 /opt/chk_nginx.sh
[root@master-node ~]# sh /opt/chk_nginx.sh
80/tcp open http

2) If the master goes down, the slave seizes the VIP, and the nginx service on the slave serves the traffic.
3) If the nginx service on the master dies, nginx is restarted automatically; if the restart fails, keepalived is shut down automatically, so the VIP resources are transferred to the slave.
4) Check the health status of the back-end servers.
5) The nginx service is enabled on both master and slave. When one keepalived service stops, the VIP migrates to the node where keepalived is still running. If you want the VIP to drift to the other node when the nginx service dies, this must be controlled with a script or shell commands in the configuration file (after nginx goes down it is restarted automatically, and if the restart fails keepalived is forcibly shut down, causing the VIP resources to drift to the other machine).

Final validation: Turn off Keepalived or Nginx on the master server and the VIP will automatically float to the secondary server.

1) Start nginx and Keepalived on master and slave servers to ensure that the two services are enabled properly:

[root@master-node ~]# /usr/local/nginx/sbin/nginx
[root@master-node ~]# /etc/init.d/keepalived start
[root@slave-node ~]# /usr/local/nginx/sbin/nginx
[root@slave-node ~]# /etc/init.d/keepalived start

2) Check whether the virtual IP address is bound on the primary server

[root@master-node ~]# ip addr
......
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 44:a8:42:17:3d:dd brd ff:ff:ff:ff:ff:ff
    inet 103.110.98.14/26 brd 103.10.86.63 scope global em1
       valid_lft forever preferred_lft forever
    inet 103.110.98.20/32 scope global em1
       valid_lft forever preferred_lft forever
    inet 103.110.98.20/26 brd 103.10.86.63 scope global secondary em1:0
       valid_lft forever preferred_lft forever
    inet6 fe80::46a8:42ff:fe17:3ddd/64 scope link
       valid_lft forever preferred_lft forever

3) Stop Keepalived on the main server:

[root@master-node ~]# /etc/init.d/keepalived stop
Stopping keepalived (via systemctl): [ OK ]
[root@master-node ~]# /etc/init.d/keepalived status
[root@master-node ~]# ps -ef|grep keepalived
root 26952 24348 0 17:49 pts/0 00:00:00 grep --color=auto keepalived
[root@master-node ~]# 

4) Then check on the secondary server and find that the VIP has been taken over:

[root@slave-node ~]# ip addr
......
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 44:a8:42:17:3c:a5 brd ff:ff:ff:ff:ff:ff
    inet 103.110.98.24/26 brd 103.10.86.63 scope global em1
    inet 103.110.98.20/32 scope global em1
    inet6 fe80::46a8:42ff:fe17:3ca5/64 scope link
       valid_lft forever preferred_lft forever
......

After keepalived on the master dies, the VIP resources are automatically transferred to the slave and the website remains accessible, completely unaffected!

5) Restart keepalived on the master server, and you will see that the master takes over the VIP again; at this point the VIP is no longer on the slave machine:

[root@master-node ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl): [ OK ]
[root@master-node ~]# ip addr
......
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 44:a8:42:17:3d:dd brd ff:ff:ff:ff:ff:ff
    inet 103.110.98.14/26 brd 103.10.86.63 scope global em1
       valid_lft forever preferred_lft forever
    inet 103.110.98.20/32 scope global em1
       valid_lft forever preferred_lft forever
    inet 103.110.98.20/26 brd 103.10.86.63 scope global secondary em1:0
       valid_lft forever preferred_lft forever
    inet6 fe80::46a8:42ff:fe17:3ddd/64 scope link
       valid_lft forever preferred_lft forever
......
[root@slave-node ~]# ip addr
......
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 44:a8:42:17:3c:a5 brd ff:ff:ff:ff:ff:ff
    inet 103.110.98.24/26 brd 103.10.86.63 scope global em1
    inet6 fe80::46a8:42ff:fe17:3ca5/64 scope link
       valid_lft forever preferred_lft forever

Next, verify what happens when the nginx service fails and whether the keepalived script that monitors nginx works. If you manually stop the nginx service on the master machine, it will be started again automatically after at most 2 seconds (because the keepalived monitoring script for the nginx state runs every 2 seconds). Domain access is almost unaffected!

[root@master-node ~]# /usr/local/nginx/sbin/nginx -s stop
[root@master-node ~]# ps -ef|grep nginx
root 28401 24826 0 19:43 pts/1 00:00:00 grep --color=auto nginx
[root@master-node ~]# ps -ef|grep nginx
root 28871 28870 0 19:47 ? 00:00:00 /bin/sh /opt/chk_nginx.sh
root 28875 24826 0 19:47 pts/1 00:00:00 grep --color=auto nginx
[root@master-node ~]# ps -ef|grep nginx
root 28408 1 0 19:43 ? 00:00:00 nginx: master process /usr/local/nginx/sbin/nginx
www 28410 28408 0 19:43 ? 00:00:00 nginx: worker process
www 28411 28408 0 19:43 ? 00:00:00 nginx: worker process
www 28412 28408 0 19:43 ? 00:00:00 nginx: worker process
www 28413 28408 0 19:43 ? 00:00:00 nginx: worker process

Finally, you can check /var/log/messages on both servers to see the VRRP log entries for the VIP drift.
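For example (the exact log wording differs between keepalived versions, so treat this as a sketch):

[root@master-node ~]# grep -iE 'keepalived|vrrp' /var/log/messages | tail -n 20     # recent VRRP state transitions
[root@master-node ~]# tail -f /var/log/messages                                     # watch the switchover live while stopping keepalived or nginx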

Possible problems

1) Possible causes of VIP binding failure: -> iptables is enabled but no policy allowing VRRP communication was added (this can also lead to split brain); you can simply disable iptables. -> keepalived.conf is misconfigured, for example the interface is bound to the wrong device.

2) Possible reasons why the VIP is bound but cannot be pinged from outside: -> a network failure; check whether the downstream gateway is normal. -> the upstream gateway's ARP cache has not been refreshed; refresh it with: arping -I <NIC name> -c 5 -s <VIP> <gateway>
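Using this article's environment as an example (a sketch with the iputils arping, where -I names the interface and -s the source address; 103.110.98.1 is only an assumed gateway address, replace it with your real gateway):

[root@master-node ~]# arping -I em1 -c 5 -s 103.110.98.20 103.110.98.1      # refresh the upstream gateway's ARP entry for the VIP (gateway IP is assumed)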

Bonus

Follow the "Glacier Technology" WeChat official account and reply with the keyword "Design Mode" in the background to get the PDF of "23 Design Patterns in Simple Java". Reply with the keyword "Java8" to get the PDF tutorial on the new features of Java 8, and you can also get the PDF "Distributed Traffic-Limiting Solutions for 100-Million-Level Traffic". All three PDFs are super hard-core tutorials created and compiled by Glacier, essential for interviews!

OK, that's it for today! Don't forget to like, view, and share, so more people can see it and we can all learn and improve together!

Written at the end

If you think Glacier's writing is good, please search for and follow the "Glacier Technology" WeChat official account, and learn about high concurrency, distributed systems, microservices, big data, Internet, and cloud-native technologies with Glacier. The "Glacier Technology" account is updated with a large number of technical topics, and every technical article is packed with useful material! Many readers have read the articles on the account and successfully job-hopped to big companies; many others have made a technical leap and become the technical backbone of their companies! If you also want to improve your abilities like they did, make a leap in technical skill, join a big company, and get promoted with a raise, then follow the "Glacier Technology" official account, which publishes hard-core technical content every day, so that you will no longer be confused about how to improve your technical ability!