A, in this paper,

Project delivery may encounter mixed business scenarios including core transaction (OLTP) and report analysis (OLAP), in which report analysis services have high complexity and consume a large amount of system resources, but have low real-time requirements, while core transaction services have large concurrency and are mostly simple transactions, which have high real-time requirements. When the system is at the service peak, concurrent operations of report analysis services will increase the system load and occupy resources for a long time, which may eventually lead to overall performance cracking. Core transaction services with high real-time requirements cannot be responded due to resource contention, thus affecting the overall customer experience.

The purpose of resource management and control is to properly manage resources and concurrency based on service scenarios and available resources to ensure that databases can run properly in heavy load scenarios and do not become stuck due to resource contention and exhaustion, thus improving overall system throughput.

Ii. Scenario analysis

As shown in the preceding figure, business scenarios are mainly divided into two categories: core transaction (OLTP) and Report analysis (OLAP). Report service has a lower priority and ensures the normal operation of the service system when it is reasonable.

The SQL running in the service system is divided into simple SQL and complex SQL. The concurrent execution of a large number of complex SQL leads to resource contention of the database server. The concurrent execution of simple SQL does not pose continuous pressure on the server and can be completed within a short period of time without causing service accumulation. The SQL running in the report service is mostly complex SQL, and the overall business logic is relatively complex. In the database level, it is necessary to carry out reasonable resource control on the core transaction and report service respectively to ensure the normal operation of the business system.

Iii. Program planning

(1) Static resource pool planning

A static resource pool controls the upper limit of server resources that can be used by a database. To ensure proper running of a server operating system, a certain amount of server resources are reserved. Static resource pool configuration is recommended: The database allocates 93% OF CPU resources and 70% of memory resources. This ensures that the server can properly respond to system requests.

  • Static resource pools allocate 93% of CPU resources and 70% of memory resources.

(2) Transaction users and report users are separated

Report analysis services have lower priorities and lower real-time performance but higher complexity. To effectively control resources, report analysis and core transaction services are separated from each other by database users. For example, the budget_config_user database is used for core transaction services and report_user database is used for report analysis services. For transaction users and report users, CPU resources and concurrency are controlled to ensure the stable operation of the database.

In combination with the report analysis service load survey, daily monitoring, and test verification, complex SQL reports with less than 20 concurrent requests do not cause server resource contention or slow down of the service system. Therefore, report users are configured to use a maximum of 20% of THE CPU resources.

Combined with the load survey, daily monitoring and test verification of the core transaction business, the complex SQL within 50 concurrent will not cause continuous pressure on the system, and the overall CPU load is less than 60%.

  • Transaction users are allocated 60% of the CPU quota and 50 concurrent requests.
  • Report user allocates 20% CPU quota and 20 concurrency.

CPU quota refers to the percentage of occupied CPU time slices. If the CPU quota allocated to a user is not used, the system automatically shares the CPU quota to other users. The CPU quota is the percentage of CPU cores that a user can use. The system converts percentages to specific cores for users to use, and the CPU quota resources available to users do not exceed the range of cores through percentage conversion.

(3) Setting of concurrent control threshold

The concurrency control of resource control is evaluated based on the SQL cost (SQL execution cost). In combination with customer scenarios, hardware configuration, and SQL test analysis, if the SQL cost is less than 1000, SQL concurrency does not pose continuous pressure on the server and can be executed within a short period of time without causing service accumulation. When the SQL cost value is greater than 1000, a large number of concurrent requests compete for server resources and slow down the system.

Therefore, set the cost threshold for controlled SQL to 1000. When the SQL cost value is greater than 1000, the concurrency is controlled by resource control. When the SQL cost value is less than 1000, the concurrency is not controlled by resource control.

  • The cost value that distinguishes SQL from simple is set to 1000

Iv. Implementation Plan

(1) Configure static resource pools

Log in to the O&M management page, configure the static service pool, and set the CPU to 93% and memory to 70%

(2) Separation of database users

Create a transaction user (budget_config_user) and a report user (report_user).

(3) Configure cGroup

Log in to the data server as user omm and run the following command to set the CPU quota:

source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
gs_ssh -c "gs_cgroup -c -S class1 -s 60"
gs_ssh -c "gs_cgroup -c -S class1 -G wg1 -g 99"
gs_ssh -c "gs_cgroup -c -S class2 -s 20 "
gs_ssh -c "gs_cgroup -u -S class2 -s 20 --fixed"
gs_ssh -c "gs_cgroup -c -S class2 -G wg2 -g 99 "Copy the code

(4) Create resource pools and bind them to Cgroups

Log in to the database server as user omm and run the following command to configure concurrent management:

source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
gSQL -d postgres -p 25308 -c "Create resource pool rp1 with (mem_percent=0,active_statements=50,control_group='class1:wg1');" gSQL -d postgres -p 25308 -c "create   resource pool rp2 with (mem_percent=0,active_statements=20,control_group='class2:wg2');"Copy the code

(5) Bind a resource pool to a user

Log in to the database server as user omm and run the following command to bind the user to the resource pool:

source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
gSQL -d postgres -p 25308 -c "alter user budget_config_user resource pool 'rp1';"
gSQL -d postgres -p 25308 -c "alter user report_user resource pool 'rp2';"Copy the code

(6) Modify database parameters and restart the database to take effect

Log in to the database server as user omm and run the following command to modify database parameters:

source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
gs_guc reload -Z coordinator -Z datanode -N all   -I all -c "parctl_min_cost=1000"
gs_guc set -Z coordinator -Z datanode -N all -I   all -c "enable_dynamic_workload=off"
cm_ctl stop
cm_ctl startCopy the code

5. Test and verification of resource management and control

(a) Test SQL sample

select count(1) from p#fasp_t_glctrl122299 a,p#fasp_t_glctrl122299 b;Copy the code

Print the execution plan as follows. If the cost value is greater than 1000, the concurrency control threshold of resource control has been set to 1000 based on the scheme:

(II) Concurrent verification of transaction users

  • Use the transaction user budget_config_user
  • Using test SQL sample (cost value > 1000)
  • Start 100 concurrent tests

Use budget_config_user to validate 100 concurrent SQL samples. If the number of concurrent SQL tests exceeds 50, the remaining SQL is queued in the pipeline.

(iii) Concurrent verification of report users

  • Use report user report_user
  • Using test SQL sample (cost value > 1000)
  • Start 100 concurrent tests

Verify 100 concurrent SQL samples with Report_user. When the number of concurrent SQL queries reaches 20, the remaining SQL queries are queued for execution in the pipeline.

(4) Simultaneous verification of report users and transaction users

  • Use the transaction user budget_config_user and report user report_user respectively
  • Using test SQL sample (cost value > 1000)
  • Start 100 concurrent tests separately

Use the budget_config_user and report_user to perform 100 concurrent SQL validation, 50 concurrent transactions for transaction users and 20 concurrent reports for report users.

(5) report user quota CPU verification

  • Use report user report_user
  • Using test SQL sample (cost value > 1000)
  • Start 100 concurrent tests

Set the CPU quota to 20%, run Report_user to verify the 100 concurrent SQL sample, and manage resources when the CPU usage reaches 20%.

Set the CPU quota to 30%, run Report_user to verify the 100 concurrent SQL sample, and manage resources when the CPU usage reaches 30%.

(6) TRANSACTION user quota CPU verification

  • Use the transaction user budget_config_user
  • Using test SQL sample (cost value > 1000)
  • Start 100 concurrent tests

When the CPU quota is 60%, the CPU usage can exceed 60%. There is no mandatory CPU limit (different from the CPU quota), and the CPU usage can be flexibly expanded during service peak hours.


Click to follow, the first time to learn about Huawei cloud fresh technology ~