The requirement

Common scenarios during O&M:

  • The server or application is faulty and needs to be redeployed.
  • Configurations such as middleware and application services are lost and need to be recovered.
  • The database data is lost or abnormal and needs to be restored.
  • New servers added during system expansion need the existing configuration files.

A reliable solution to all of the above scenarios is to restore from backup, so backup management is our last line of defense.

Local backups are kept for a short time (for example, one month) to avoid consuming excessive disk space, while remote backups are kept for a long time (for example, one year). Where regulations apply, retention periods can follow the supervisory requirements.
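As a minimal sketch of the local retention policy (assuming daily backups land in date-named directories under a configurable root; the paths and the 30-day window are illustrative defaults, not taken from the article):

```shell
#!/bin/bash
# Sketch: prune local backup directories older than the retention window.
# BACKUP_ROOT and KEEP_DAYS are assumed defaults; adjust to your environment.
BACKUP_ROOT="${BACKUP_ROOT:-/App/backup}"
KEEP_DAYS="${KEEP_DAYS:-30}"

prune_old_backups() {
    # Each daily backup lives in a YYYYMMDD-named directory directly under
    # BACKUP_ROOT; remove those whose mtime is older than KEEP_DAYS days.
    find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +"$KEEP_DAYS" \
        -exec rm -rf {} +
}
```

Run from cron, this keeps local disk usage bounded while the remote backup center holds the long-term copies.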

According to different requirements, we can divide the backup content into the following types:

  • System-level configuration files

Kernel parameters, hosts resolution, crontab scheduled tasks, environment variables, firewall rules, etc.

  • Application-level configuration files

Nginx, Java applications, middleware, DNS, database, etc.

  • Log-level data

Binlog, application logs, nginx logs, etc.

  • Database backup

The problem

By handling the backup locations and backup types above, we can cover about 80% of the backup requirements, but the remaining 20% may be tricky:

  1. Backup control is too diffuse

In general, we use shell + crontab to perform periodic backups on each server. Once requirements change, managing the scattered scripts becomes difficult.

  2. Configuration diversity

For different applications, requirements, and scenarios, our configurations are complex and diverse, so our scripts need to adapt to many cases and are difficult to make fully compatible.

  3. Repeated backups

Backup efficiency deteriorates because the same data is backed up again and again.

  4. The backup network adapter carries heavy traffic, affecting the network

  5. The disk I/O of the remote backup center is insufficient, so remote backup takes a long time


All of the above are problems we may encounter when doing backups. Besides providing sufficient disk space, disk I/O, and gigabit or faster network capacity, what we need to do on the operations side is to design a backup process control scheme that suits us.
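For problem 3 in particular, repeatedly backing up unchanged data can be avoided with rsync's hard-link snapshots: each day's copy looks complete, but files unchanged since the previous snapshot are hard links rather than new copies, so they cost no extra space or transfer. A command-line sketch (the paths are illustrative, not from the article):

```shell
# Sketch: incremental daily snapshots with rsync --link-dest.
# /App/backup/20240102 will contain a full tree, but files unchanged
# since the 20240101 snapshot are hard links, not new copies.
rsync -a --delete \
      --link-dest=/App/backup/20240101/ \
      /data/to/backup/ \
      /App/backup/20240102/
```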

The solution

Considering Ansible's SSH-based control and natural idempotence, we use "Ansible playbook + rsync" to realize centralized control:

  • Because Ansible controls all servers in a unified manner, it must judge the backup type on each host: if a backup type matches, the corresponding backup is performed; otherwise it is skipped.
  • Ansible first backs up data into a unified local directory by type, and then uses rsync to copy that directory to the remote backup center.
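The second step could look like the following task (a sketch only — the article does not show its actual rsync.yml, and the ssh trust to the backup center and the target path are assumptions):

```yaml
# tasks/rsync.yml (sketch): push today's local backup directory to the
# remote backup center defined in vars/main.yml.
- name: sync local backup to the remote backup center
  command: rsync -az {{ backup_path }}/{{ today_date }} root@{{ data_backup_center }}:{{ backup_path }}/
  ignore_errors: True
```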

With Ansible's centralized management, we can manage all scripts on the control node. For the diversified configurations, however, we still need to standardize them according to the relevant configuration management specifications; otherwise the diversity problem will continue to affect us.

The specific implementation

1. Directory structure

[root@test ansible]# tree /etc/ansible
├── ansible.cfg
├── hosts
├── data_backup.yml
└── roles
    └── databak
        ├── tasks
        │   ├── create_dir.yml
        │   ├── del_30_days_ago_dir.yml
        │   ├── http_conf.yml
        │   ├── keepalived.yml
        │   ├── nginx_conf.yml
        │   ├── mysql_conf.yml
        │   ├── sys_conf.yml
        │   ├── rsync.yml
        │   └── main.yml
        └── vars
            └── main.yml

We organize the directory structure by Playbook, where:

  • tasks: backs up the different types of files and handles the local and remote copy steps.
  • vars: defines the variables used by this backup.
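Based on the file names in the tree above, the role's tasks/main.yml presumably just wires the task files together; a sketch (the ordering is an assumption):

```yaml
# roles/databak/tasks/main.yml (sketch; ordering is an assumption)
- include_tasks: create_dir.yml
- include_tasks: sys_conf.yml
- include_tasks: nginx_conf.yml
- include_tasks: http_conf.yml
- include_tasks: keepalived.yml
- include_tasks: mysql_conf.yml
- include_tasks: rsync.yml
- include_tasks: del_30_days_ago_dir.yml
```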

2. Create a data backup role file

vim data_backup.yml
- hosts: "{{ host_ip }}"
  remote_user: root
  gather_facts: False
  roles:
    - databak

The target server's IP address is passed in through the host_ip parameter, which lets a wrapper script loop over hosts to implement batch backup; combined with crontab, this gives us periodic backups.
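For example, a crontab entry that runs the batch backup every day at 02:00 (the script path and schedule are illustrative assumptions):

```
# /etc/crontab (illustrative path and schedule)
0 2 * * * root /bin/bash /App/scripts/ansible_all_data_backup.sh
```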

The following script is available for reference:

vim ansible_all_data_backup.sh
#!/bin/bash

today_date=`date "+%Y%m%d"`
yesterday_date=`date -d yesterday "+%Y%m%d"`
old_date=`date -d "30 days ago" "+%Y%m%d"`

for host in `cat host_list | grep -v "^#"`
do
    echo "`date "+%Y-%m-%d %H:%M:%S"`---${host} start backup." >> /App/logs/ansible_all_data_backup.log
    ansible-playbook /etc/ansible/data_backup.yml --extra-vars="host_ip=${host} today_date=${today_date} yesterday_date=${yesterday_date} old_date=${old_date}" >> /App/logs/ansible_all_data_backup.log 2>&1
    if [ $? -eq 0 ]
    then
        echo "`date "+%Y-%m-%d %H:%M:%S"`---${host} backup finished, start backup next host." >> /App/logs/ansible_all_data_backup.log
    else
        echo "`date "+%Y-%m-%d %H:%M:%S"`---${host} backup failed." >> /App/logs/ansible_all_data_backup.log
        echo "${host} data backup failed\n`tail -10 /App/logs/ansible_all_data_backup.log`" | mail -s "[ansible] ${host} data backup failed" "[email protected]"
        break
    fi
done

3. Create a variable file

vim vars/main.yml
backup_path: /App/backup
data_backup_center: "192.168.3.119"
nginx_logrotate_path: /data/nginx/nginx_logrotate
nginx_logrotate_path_json: /App/logs/nginx_logrotate
# The following three variables are default values. Variables passed via --extra-vars have the highest priority, so the --extra-vars values prevail; the defaults here only prevent errors when --extra-vars is empty.
today_date: "19700101"
yesterday_date: "19700101"
old_date: "19700101"

Among them:

  • backup_path is our unified local backup directory;

  • data_backup_center is our remote backup center;

  • nginx_logrotate_path and nginx_logrotate_path_json are our nginx-related log directories;

  • the remaining date variables control how long local backups are kept.

4. Create task files

# 1. Operating system-level configuration files
vim tasks/sys_conf.yml
# Back up configuration files related to each system
- name: backup system config file to {{ backup_path }}/{{ today_date }}/config
  command: chdir={{ backup_path }}/{{ today_date }}/config cp -rfv {{ item }} .
  ignore_errors: True
  with_items:
    - /etc/hosts
    - /etc/rc.d/rc.local
    - /etc/crontab
    - /var/spool/cron
    - /App/scripts
    - /etc/zabbix/zabbix_agentd.d
    - /opt/shell
    - /etc/profile
    
# 2. Application-level configuration files
vim tasks/nginx_conf.yml
# Check whether nginx is installed on this machine; if so, back up its configuration directory to the local backup directory. /usr/local/openresty/nginx is the fixed installation path.
- name: Nginx installed or not
  stat: path=/usr/local/openresty/nginx/sbin/nginx
  register: nginx_results
- name: backup nginx config file to {{ backup_path }}/{{ today_date }}/config
  command: chdir={{ backup_path }}/{{ today_date }}/config cp -rfv /usr/local/openresty/nginx/conf ./nginx_conf
  when: nginx_results.stat.exists 

Above:

  • Since our operating systems are standardized, operating-system-level configuration files can be backed up directly;
  • Different applications depend on different services, so application-level configuration files vary. We therefore first check whether the application exists based on the backup type, and only then perform the backup.
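The same check-then-backup pattern extends to the other applications in the role; for example, tasks/mysql_conf.yml might look like this (a sketch only — the mysqld path and the config file location are assumptions, not from the article):

```yaml
# Check whether MySQL is installed; back up my.cnf only when it is.
- name: MySQL installed or not
  stat: path=/usr/sbin/mysqld
  register: mysql_results
- name: backup mysql config file to {{ backup_path }}/{{ today_date }}/config
  command: chdir={{ backup_path }}/{{ today_date }}/config cp -rfv /etc/my.cnf .
  when: mysql_results.stat.exists
```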

5. Implementation

# Check mode (dry run)
[root@test ansible]# ansible-playbook -C data_backup.yml
# Execute the playbook, single-host backup
[root@test ansible]# ansible-playbook -e host_ip=10.10.2.1 data_backup.yml
# Run the script, batch backup
[root@test ansible]# bash ansible_all_data_backup.sh

Conclusion

As backup types and sizes increase, a single backup center may become a backup bottleneck due to disk I/O and space problems. Therefore, backup needs to be planned in advance.