Longtao Jiang, SphereEx Middleware R&D Engineer, Apache ShardingSphere Committer. Mainly responsible for DistSQL and security related feature innovation and development.

LAN Chengxiang, SphereEx Middleware R&d Engineer, Apache ShardingSphere Committer. Currently focusing on DistSQL design and development.

background

Since the release of Apache ShardingSphere 5.0.0-Beta, DistSQL has rapidly gained popularity among developers and operations personnel due to its “dynamic execution with no restart” and beautiful syntax that is close to standard SQL. With iterations of versions 5.0.0 and 5.1.0, the ShardingSphere community has once again added a lot of syntax to DistSQL, bringing many useful features.

This article explains DistSQL’s latest features from a “cluster governance” perspective.

ShardingSphere cluster

A typical cluster composed of ShardingSphere-Proxy generally contains multiple computing nodes and storage nodes, as shown in the figure:

To facilitate communication, in ShardingSphere, Proxy is called a computing node and distributed database resources (such as DS) managed by Proxy are treated

0, ds

1), called resources or storage nodes.

Multiple Proxy compute nodes connect to the same registry, share configurations and rules, and sense each other’s online status. At the same time, these compute nodes share the underlying storage nodes and can simultaneously read and write to the storage nodes. In this case, the user’s application can perform equivalent operations when connected to any compute node.

In this cluster architecture, when computing resources are insufficient, users can quickly scale the Proxy, reduce the risk of single point of failure to some extent, and improve system availability.

You can also add a load balancing mechanism between applications and compute nodes.

Compute Node Management

Compute node governance: applies to Cluster Mode. For more information about the Cluster Mode, see ShardingSphere Mode.

The cluster to prepare

In this example, three Proxy compute nodes are simulated in a single machine. The configuration mode is as follows:

mode:
  type: Cluster
  repository:
    type: ZooKeeper
    props:
      namespace: governance_ds
      server-lists: localhost:2181
      retryIntervalMilliseconds: 500
      timeToLiveSeconds: 60
      maxRetries: 3
      operationTimeoutMilliseconds: 500
  overwrite: false
Copy the code

Execute the startup commands respectively:

sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3307
sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3308
sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3309
Copy the code

With three Proxy instances successfully started, the compute node cluster is ready.

SHOW INSTANCE LIST

Connect to any compute node using a client, for example, 3307:

Mysql -h 127.0.0.1 -p 3307 -u root -pCopy the code

View the list of instances:

mysql> SHOW INSTANCE LIST; +----------------+-----------+------+---------+ | instance_id | host | port | status | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + | | 10.7.5.35 @ 3309 10.7.5.35 | 3309 | enabled | | | 10.7.5.35 @ 3308 10.7.5.35 | 3308 | enabled | | 3307 | 10.7.5.35 @ 10.7.5.35 | 3307 | enabled | +----------------+-----------+------+---------+Copy the code

The meanings of the parameters are as follows:

Instance_id: indicates the id of an instance. Currently, it consists of host and port.

Host: indicates the IP address of a host.

Port: indicates the port number.

Status: indicates the status of the instance. Enabled and disabled indicate the status of the instance.

DISABLE INSTANCE

The DISABLE INSTANCE statement is used to DISABLE a specified compute node. Note that this directive does not terminate the process of the target instance, just logically deactivate it. DISABLE INSTANCE supports the following syntax:

DISABLE the INSTANCE 10.7.5.35 @ 3308; # or DISABLE INSTANCE IP=10.7.5.35, PORT=3308;Copy the code

Example:

Mysql > DISABLE INSTANCE 10.7.5.35 @ 3308; Query OK, 0 rows affected (0.02sec) mysql> SHOW INSTANCE LIST; +----------------+-----------+------+----------+ | instance_id | host | port | status | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | | 10.7.5.35 @ 3309 10.7.5.35 | 3309 | enabled | | | 10.7.5.35 @ 3308 10.7.5.35 | 3308 | disabled | | 3307 | 10.7.5.35 @ 10.7.5.35 | 3307 | enabled | +----------------+-----------+------+----------+Copy the code

After the DISABLE INSTANCE statement is executed and the INSTANCE status of port 3308 is disabled, the compute node is disabled.

If there is a client connected to 10.7.5.35@3308, an exception will be displayed when any SQL is executed:

1000 - Circuit break mode is ON.
Copy the code

💡 Warning: Do not DISABLE the compute node currently receiving the command. If the command is sent to 10.7.5.35@3309, DISABLE INSTANCE 10.7.5.35@3309. You will receive an exception message.

ENABLE INSTANCE

The ENABLE INSTANCE statement is used to ENABLE a specified compute node. ENABLE INSTANCE also supports the following syntax:

The ENABLE INSTANCE statement is used to ENABLE a specified compute node. ENABLE INSTANCE also supports the following syntax:

Example:

mysql> SHOW INSTANCE LIST; +----------------+-----------+------+----------+ | instance_id | host | port | status | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | | 10.7.5.35 @ 3309 10.7.5.35 | 3309 | enabled | | | 10.7.5.35 @ 3308 10.7.5.35 | 3308 | disabled | | 3307 | 10.7.5.35 @ 10.7.5.35 | 3307 | enabled | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + mysql > ENABLE INSTANCE 10.7.5.35 @ 3308; Query OK, 0 rows affected (0.01sec) mysql> SHOW INSTANCE LIST; +----------------+-----------+------+----------+ | instance_id | host | port | status | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | | 10.7.5.35 @ 3309 10.7.5.35 | 3309 | enabled | | | 10.7.5.35 @ 3308 10.7.5.35 | 3308 | enabled | | 3307 | 10.7.5.35 @ 10.7.5.35 | 3307 | enabled | +----------------+-----------+------+----------+Copy the code

After the ENABLE INSTANCE statement is executed, you can view that the INSTANCE status of port 3308 is enabled.

Compute node parameter management

In a previous article, SCTL reborn: Into the Arms of RAL, we explain the evolution of SCTL (ShardingSphere Control Language) to RAL (Resource & Rule Administration Language). It also brings new SHOW VARIABLE and SET VARIABLE syntax. However, in 5.0.0-beta, the DistSQL RAL for the VARIABLE category contains only the following three statements:

SET VARIABLE TRANSACTION_TYPE = xx; (LOCAL, XA, BASE)
SHOW VARIABLE TRANSACTION_TYPE;
SHOW VARIABLE CACHED_CONNECTIONS;
Copy the code

In a lot of customer feedback, we found that querying and modifying the props configuration of the Proxy (in server.yaml) was also a frequent operation. As a result, starting with the 5.0.0 GA version, DistSQL RAL has added support for props configuration.

SHOW VARIABLE

Let’s review the props configuration first:

props:
  max-connections-size-per-query: 1
  kernel-executor-size: 16  # Infinite by default.
  proxy-frontend-flush-threshold: 128  # The default value is 128.
  proxy-opentracing-enabled: false
  proxy-hint-enabled: false
  sql-show: false
  check-table-metadata-enabled: false
  show-process-list-enabled: false
    # Proxy backend query fetch size. A larger value may increase the memory usage of ShardingSphere Proxy.
    # The default value is -1, which means set the minimum value for different JDBC drivers.
  proxy-backend-query-fetch-size: -1
  check-duplicate-table-enabled: false
  proxy-frontend-executor-size: 0 # Proxy frontend executor size. The default value is 0, which means let Netty decide.
    # Available options of proxy backend executor suitable: OLAP(default), OLTP. The OLTP option may reduce time cost of writing packets to client, but it may increase the latency of SQL execution
    # and block other clients if client connections are more than `proxy-frontend-executor-size`, especially executing slow SQL.
  proxy-backend-executor-suitable: OLAP
  proxy-frontend-max-connections: 0 # Less than or equal to 0 means no limitation.
  sql-federation-enabled: false
    # Available proxy backend driver type: JDBC (default), ExperimentalVertx
  proxy-backend-driver-type: JDBC
Copy the code

Users can now make interactive queries using the following syntax:

SHOW VARIABLE PROXY_PROPERTY_NAME;
Copy the code

Such as:

mysql> SHOW VARIABLE MAX_CONNECTIONS_SIZE_PER_QUERY; +--------------------------------+ | max_connections_size_per_query | +--------------------------------+ | 1 | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + 1 row in the set (0.00 SEC) mysql > SHOW VARIABLE SQL_SHOW; + -- -- -- -- -- -- -- -- -- -- + | sql_show | + -- -- -- -- -- -- -- -- -- -- + | false | + -- -- -- -- -- -- -- -- -- -- + 1 row in the set (0.00 SEC)...Copy the code

💡 Tip: In DistSQL syntax, parameter names are separated by underscores, unlike in files.

SHOW ALL VARIABLES

Because the Proxy has a large number of parameters, you can also query ALL parameter values by running SHOW ALL VARIABLES:

mysql> SHOW ALL VARIABLES; +---------------------------------------+----------------+ | variable_name | variable_value | +---------------------------------------+----------------+ | sql_show | false | | sql_simple | false | | kernel_executor_size | 0 | | max_connections_size_per_query | 1 | | check_table_metadata_enabled | false | | proxy_frontend_database_protocol_type | | | proxy_frontend_flush_threshold | 128 | | proxy_opentracing_enabled | false |  | proxy_hint_enabled | false | | show_process_list_enabled | false | | lock_wait_timeout_milliseconds | 50000 | | proxy_backend_query_fetch_size | -1 | | check_duplicate_table_enabled | false | | proxy_frontend_executor_size | 0 | | proxy_backend_executor_suitable | OLAP | | proxy_frontend_max_connections | 0 | | sql_federation_enabled | false | | proxy_backend_driver_type | JDBC | | agent_plugins_enabled | false | | cached_connections | 0 | | transaction_type | The LOCAL | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + 21 rows in the set (0.01 SEC)Copy the code

SET VARIABLE

Dynamic management of resources and rules is DistSQL’s unique advantage. Now you can also update the props parameters dynamically using the SET VARIABLE statement. Such as:

SET VARIABLE SQL_SHOW = true; SET VARIABLE PROXY_HINT_ENABLED = true; SET VARIABLE SQL_FEDERATION_ENABLED = true; ...Copy the code

💡 tip:

  • The following parameters can be modified using the SET VARIABLE statement, but the new values take effect only after the Proxy restarts:

    kernel

    executor

    size

    proxy

    frontend

    executor_size

    proxy

    backend

    driver_type

  • The following parameters are read-only and cannot be modified:

cached_connections

  • Other parameters that are not specified take effect immediately after modification.

Storage Node Management

In ShardingSphere, storage nodes are not directly bound to compute nodes. A storage node may play different roles in different schemas to implement different service logic. Therefore, a storage node is always associated with a logical library.

In DistSQL, storage nodes are managed using resource-related statements, including:

  • ADD the RESOURCE;

  • ALTER the RESOURCE;

  • DROP the RESOURCE;

  • SHOW the SCHEMA of RESOURCES.

Logical library Preparation

RESOURCE statements only apply to logical libraries, so you need to create and successfully select a logical library using the USE command before operating:

DROP DATABASE IF EXISTS sharding_db;

CREATE DATABASE sharding_db;

USE sharding_db;
Copy the code

ADD RESOURCE

The ADD RESOURCE syntax supports the following forms:

  • Specify HOST, PORT, and DB

    ADD RESOURCE resource_0 (HOST=127.0.0.1, PORT=3306, DB=db0, USER=root, PASSWORD=root);

  • The specified URL

    ADD the RESOURCE resource_1 [URL = “JDBC: mysql: / / 127.0.0.1:3306 / db1? serverTimezone=UTC&useSSL=false”, USER=root, PASSWORD=root );

Also, both syntax forms support the extended parameter PROPERTIES. This parameter is used to configure the properties of the connection pool between Proxy and storage node. For example:

ADD RESOURCE resource_2 (HOST=127.0.0.1, PORT=3306, DB=db2, USER=root, PASSWORD=root, The PROPERTIES (" maximumPoolSize "= 10)), resource_3 [URL =" JDBC: mysql: / / 127.0.0.1:3306 / db3? serverTimezone=UTC&useSSL=false", USER=root, PASSWORD=root, PROPERTIES("maximumPoolSize"=10,"idleTimeout"="30000") );Copy the code

💡 Tip: JDBC connection parameters, such as useSSL, can be specified only in URL format.

ALTER RESOURCE

The ALTER RESOURCE command is used to modify connection information of existing storage nodes, such as changing the connection pool size or modifying JDBC connection parameters.

Syntactically, ALTER RESOURCE is identical to ADD RESOURCE, for example:

ALTER RESOURCE resource_2 (HOST=127.0.0.1, PORT=3306, DB=db2, USER=root, The PROPERTIES (" maximumPoolSize "= 50)), resource_3 [URL =" JDBC: mysql: / / 127.0.0.1:3306 / db3? serverTimezone=GMT&useSSL=false", USER=root, PASSWORD=root, PROPERTIES("maximumPoolSize"=50,"idleTimeout"="30000") );Copy the code

💡 Warning: Metadata may be changed or application data may be abnormal if a storage node is modified. ALTER RESOURCE cannot modify the target DB of a connection, but only the following information can be modified:

  • The user name

  • The user password

  • PROPERTIES Connection pool parameters

  • JDBC parameters

DROP RESOURCE

DROP RESOURCE Is used to DROP a storage node from a logical library without deleting any data from the storage node. The syntax is as follows:

DROP RESOURCE resource_0, resource_1;
Copy the code

💡 Warning: To protect correct data, storage nodes referenced by rules cannot be deleted.

Such as t

Order is a sharded table whose actual tables are distributed over resource

0 and the resource

1, then resource

0 and the resource

1 即被 t

The sharding rule reference of order cannot be deleted.

SHOW SCHEMA RESOURCES

DROP RESOURCE resource_0, resource_1; SQL > select * from 'SHOW SCHEMA RESOURCES'; SHOW SCHEMA RESOURCES FROM sharding_db;Copy the code

Example: After adding four storage nodes using the previous ADD RESOURCE command, run the query operation:

Because the query result has a large number of columns, only part of the query result is intercepted.

This is how you can manage storage nodes dynamically using DistSQL. In contrast to modifying YAML files, DistSQL executes in real time without restarting the Proxy compute node, making online services more secure.

At the same time, changes executed through DistSQL are synchronized to other compute nodes in the cluster in real time through the registry, and clients connected to any compute node can see changes to the storage node in real time.

This is the charm of cluster governance.

conclusion

That’s all for this sharing. If you have any questions or suggestions about Apache ShardingSphere, please feel free to post them on the GitHub Issue list, or go to the Chinese community for discussion. Making issue:

Github.com/apache/shar…

Contribution Guidelines:

Shardingsphere.apache.org/community/c…

Chinese Community:

community.sphere-ex.com/

Reference Information:

1. ShardingSphere – Proxy quick start: shardingsphere.apache.org/document/5….

2. DistSQL RDL:shardingsphere.apache.org/document/cu…

3. DistSQL RQL:shardingsphere.apache.org/document/cu…

4. DistSQL RAL:shardingsphere.apache.org/document/cu…