Preface

The previous article covered the concept of RPC services, the basic use of gRPC, and a tutorial on proto syntax. However, when we actually deploy a gRPC service to production we run into many problems. The first thing to consider is authentication and encryption of the protocol data; in addition, gRPC supports streaming communication modes. This article introduces both.

RPC authentication

RPC services are generally used on a service intranet, but some RPC services are also exposed to the Internet. Either way, and regardless of whether the transport is HTTP/2 or another TCP-based socket protocol, an identity and encryption mechanism needs to be added when deploying to production, so that the client and server can establish trust in each other and the data stays secure.

Common encryption and authentication methods for RPC services

Channel encryption based on SSL/TLS. This mechanism encrypts the transport channel with SSL/TLS to prevent sensitive data in request and response messages from leaking. It is mainly used in three scenarios:

  • Back-end microservices are exposed directly to the end side, such as mobile apps, TVs, and multi-screen clients, with no unified API gateway/SLB in front to handle secure access and authentication.
  • Back-end microservices are exposed directly to a management or O&M portal deployed in the DMZ.
  • Back-end microservices are exposed directly to third-party partners or channels.

Beyond these cross-network scenarios, some services with high security requirements mandate that transport channels be encrypted whenever communication crosses hosts, VMs, or containers, even on the intranet. In that case SSL/TLS is required even if only intranet RPC calls between modules exist.


At present, the most widely used SSL/TLS toolkit is OpenSSL, an open-source library that implements the SSL/TLS protocols to provide confidentiality and data integrity for network communication. It includes the major cryptographic algorithms, common key and certificate management functions, and an implementation of the SSL/TLS protocol itself.

Most sites that use SSL encryption rely on OpenSSL. Because SSL/TLS is the most widely used secure transport mechanism on the Internet, it is used by online banking, online payment, e-commerce sites, portals, email services, and other important websites.

Separate encryption of sensitive data. Some RPC calls do not carry sensitive data, or sensitive fields make up only a small portion of the message. To maximize throughput and reduce call latency, plain HTTP/TCP plus separate encryption of the sensitive fields is often used. This protects the sensitive information in transit while avoiding the performance cost of a full SSL/TLS channel, which is especially significant for the JDK's native SSL implementation.

It works as follows:

Generally, a handler/interceptor mechanism is used to uniformly intercept request and response messages and to encrypt and decrypt sensitive fields according to annotations or other encryption markers, so that business code does not have to be modified.
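As a rough sketch of this interception idea, here is a minimal unary-unary client interceptor that encrypts a configurable list of string fields before the request leaves the process. The field names and the toy_encrypt function are assumptions made only for illustration; a real deployment would plug in a proper cipher and key management.

import base64
import grpc

def toy_encrypt(plaintext):
    # Placeholder only; a real implementation would use a real cipher such as AES-GCM
    return base64.b64encode(plaintext.encode('utf-8')).decode('ascii')

class SensitiveFieldInterceptor(grpc.UnaryUnaryClientInterceptor):
    def __init__(self, sensitive_fields):
        self._sensitive_fields = sensitive_fields   # e.g. ['id_card_no', 'phone']

    def intercept_unary_unary(self, continuation, client_call_details, request):
        # Encrypt only the fields marked as sensitive; everything else stays in plain text
        for field in self._sensitive_fields:
            value = getattr(request, field, '')
            if value:
                setattr(request, field, toy_encrypt(value))
        return continuation(client_call_details, request)

# channel = grpc.intercept_channel(grpc.insecure_channel('localhost:50051'),
#                                  SensitiveFieldInterceptor(['id_card_no']))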

There are two main disadvantages of adopting this scheme:

  • Identifying sensitive information is error-prone: fields are easily missed or over-protected. Data and privacy regulations have to be interpreted, and different countries define sensitive data differently, which makes identification difficult.
  • Interface upgrades easily introduce omissions, for example adding a new field and forgetting to mark it as sensitive.

gRPC encryption and authentication

For gRPC, SSL/TLS is likewise the basic mechanism for encryption and identity authentication. SSL/TLS uses public-key cryptography: the client obtains the server's public key and uses it to encrypt the information it sends.

SSL/TLS authentication comes in one-way and two-way (mutual) forms. In practice, one-way authentication is most common: the client authenticates the server, but the server does not authenticate the client. The one-way handshake proceeds as follows:

  • The client sends its SSL protocol version number, the cipher suites it supports, a random number, and other optional information to the server.
  • The server replies with the agreed SSL protocol version, the chosen cipher suite, its own random number, and other related information.
  • The server sends its certificate, which contains its public key, to the client.
  • The client validates the server certificate: whether it has expired, whether the CA that issued it is trusted, whether the issuer's public key correctly verifies the certificate's digital signature, and whether the domain name in the certificate matches the server's actual domain name.
  • The client randomly generates a "pre-master secret" for the subsequent communication, encrypts it with the server's public key, and sends the encrypted pre-master secret to the server.
  • The server decrypts the pre-master secret with its own private key and then performs a series of steps to derive the master secret.
  • The client sends a message telling the server that subsequent data will be encrypted with the master secret as the symmetric key, and notifies the server that the client side of the handshake is finished.
  • The server sends a message telling the client that subsequent data will be encrypted with the master secret as the symmetric key, and notifies the client that the server side of the handshake is finished.
  • The SSL handshake is complete and the secure channel is established; the client and server encrypt data with the same symmetric key and transmit it over the socket.

Practice gRPC TLS authentication

Let's walk through a simple example of the TLS authentication mechanism between a gRPC server and client, covering certificate generation and server/client initialization.

Generate a certificate

openssl req -newkey rsa:2048 -nodes -keyout server.key -x509 -days 3650 -out server.crt

During certificate generation you will be prompted for Country Name, State or Province Name, Locality Name, Organization Name, Organizational Unit Name, Common Name, Email Address, and so on. Most of these can be filled in as needed or left blank. Note: the Common Name must be set to a value you define, because the client specifies this name when it connects; if it is left blank the client may not be able to verify the server name.

Here we define Common Name as rpc_service and press Enter through the remaining prompts, which generates the server.key and server.crt files.

Deploying the gRPC service

Following the proto file definition from the previous article (blog.csdn.net/dream_succe…), implement the server-side code server.py:

from concurrent import futures
import time
import grpc
import test_pb2
import test_pb2_grpc

# Implement the SearchService defined in the proto file
class RequestRpc(test_pb2_grpc.SearchServiceServicer):
    # Implement the RPC call defined in the proto file
    def doRequest(self, request, context):
        # The returned data uses the SearchResponse format defined in the proto file
        return test_pb2.SearchResponse(query='hello {msg}'.format(msg=request.query))

def serve():
    # The maximum receive and send message sizes can be set via options (the default is only 4 MB)
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10), options=[
        ('grpc.max_send_message_length', 100 * 1024 * 1024),
        ('grpc.max_receive_message_length', 100 * 1024 * 1024)])

    test_pb2_grpc.add_SearchServiceServicer_to_server(RequestRpc(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    try:
        while True:
            time.sleep(60 * 60 * 24)  # one day in seconds
    except KeyboardInterrupt:
        server.stop(0)

if __name__ == '__main__':
    serve()

Client code client.py:

import grpc
import test_pb2
import test_pb2_grpc

def run():
    # Connect to the RPC server
    channel = grpc.insecure_channel('localhost:50051')
    # Call the RPC service
    stub = test_pb2_grpc.SearchServiceStub(channel)
    response = stub.doRequest(test_pb2.SearchRequest(query='henry'))
    print("client received: ", response)

if __name__ == '__main__':
    run()

Adding TLS Authentication

Copy the generated server.key and server.crt files into the same directory as server.py, then modify server.py as follows:

from concurrent import futures
import time
import grpc
import test_pb2
import test_pb2_grpc

# Implement the SearchService defined in the proto file
class RequestRpc(test_pb2_grpc.SearchServiceServicer):
    # Implement the RPC call defined in the proto file
    def doRequest(self, request, context):
        # The returned data uses the SearchResponse format defined in the proto file
        return test_pb2.SearchResponse(query='hello {msg}'.format(msg=request.query))

def serve():
    # The maximum receive and send message sizes can be set via options (the default is only 4 MB)
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10), options=[
        ('grpc.max_send_message_length', 100 * 1024 * 1024),
        ('grpc.max_receive_message_length', 100 * 1024 * 1024)])

    test_pb2_grpc.add_SearchServiceServicer_to_server(RequestRpc(), server)

    # server.add_insecure_port('[::]:50051')  # comment out the insecure way of starting the service and add the following:
    # read in key and certificate
    with open('server.key', 'rb') as f:
        private_key = f.read()
    with open('server.crt', 'rb') as f:
        certificate_chain = f.read()

    # create server credentials
    server_credentials = grpc.ssl_server_credentials(
        ((private_key, certificate_chain),))
    server.add_secure_port('[::]:50051', server_credentials)
    
    server.start()
    try:
        while True:
            time.sleep(60*60*24) # one day in seconds
    except KeyboardInterrupt:
        server.stop(0)

if __name__ == '__main__':
    serve()

The client logic is as follows:

import grpc
import test_pb2
import test_pb2_grpc

def run():
    # Read the certificate
    with open('server.crt', 'rb') as f:
        trusted_certs = f.read()
    credentials = grpc.ssl_channel_credentials(
        root_certificates=trusted_certs)

    # The Common Name of the certificate is rpc_service
    channel = grpc.secure_channel('{}:{}'.format('localhost', 50051), credentials,
        options=(('grpc.ssl_target_name_override', 'rpc_service'),
                 ('grpc.max_send_message_length', 100 * 1024 * 1024),
                 ('grpc.max_receive_message_length', 100 * 1024 * 1024)))
    # Call the RPC service
    stub = test_pb2_grpc.SearchServiceStub(channel)
    response = stub.doRequest(test_pb2.SearchRequest(query='henry'))
    print("client received: ", response)

if __name__ == '__main__':
    run()

With that, a gRPC service with TLS authentication is in place.

Streaming communication in gRPC

Streaming communication modes

Like HTTP, gRPC communication is based on a request/response model. Depending on the business scenario, it can be divided into the following modes:

  1. Unary: the client sends a single request and the server sends a single response.
  2. Server streaming: the client sends a single request and the server responds with a stream (essentially sending multiple pieces of data back to the client).
  3. Client streaming: the client sends a stream of requests and the server responds once.
  4. Bidirectional streaming: the client streams requests and the server streams responses, similar to establishing a socket connection for a chat-style session. The server can also push large amounts of data to the client; for example, on its initial connection the client sends a short message or a message ID to request synchronization, and the server streams the data to be synchronized back to the client (see the sketch after this list).
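As a sketch of the fourth mode, the server-side handler of a bidirectional-streaming call in Python is just a generator that iterates over the incoming request stream and yields responses. The ChatService servicer and the chat_pb2 / chat_pb2_grpc modules below are hypothetical, assumed only for illustration; they are not part of the proto file from the previous article.

import chat_pb2        # hypothetical generated module containing ChatMessage
import chat_pb2_grpc   # hypothetical generated module containing ChatServiceServicer

class ChatService(chat_pb2_grpc.ChatServiceServicer):
    def Chat(self, request_iterator, context):
        # Lazily read the client's stream; for every incoming message,
        # push one (or several) messages back to the client
        for message in request_iterator:
            yield chat_pb2.ChatMessage(text='ack: ' + message.text)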

Implementing streaming communication

For example, suppose the client wants to fetch from the server all of the features whose latitude and longitude fall inside a given rectangle. We can do this with a server-streaming call by adding the following streaming method on the server side:

def ListFeatures(self, request, context):
  left = min(request.lo.longitude, request.hi.longitude)
  right = max(request.lo.longitude, request.hi.longitude)
  top = max(request.lo.latitude, request.hi.latitude)
  bottom = min(request.lo.latitude, request.hi.latitude)
  for feature in self.db:
    if (feature.location.longitude >= left and
        feature.location.longitude <= right and
        feature.location.latitude >= bottom and
        feature.location.latitude <= top):
      yield feature

In Python, the streaming response is implemented as a generator using the yield keyword. The client then consumes the server's streaming response incrementally, for example with a for loop over the returned iterator:

for feature in stub.ListFeatures(rectangle):
    print(feature)  # handle each feature as it arrives

When to use Streaming RPC

  1. Large payloads, for example transferring a big file in chunks (see the sketch after this list)
  2. Real-time scenarios, for example pushing messages to clients as they arrive
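For the large-payload case, a client-streaming call lets the client feed the data to the server in chunks through a generator, so nothing has to be held in memory at once. The Upload RPC, the UploadServiceStub, and the file_pb2.Chunk message below are assumptions made only for this sketch.

import file_pb2        # hypothetical generated module containing a Chunk message
import file_pb2_grpc   # hypothetical generated module containing UploadServiceStub

def chunk_generator(path, chunk_size=1024 * 1024):
    # Yield the file in 1 MB pieces instead of loading it all at once
    with open(path, 'rb') as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield file_pb2.Chunk(content=data)

# stub = file_pb2_grpc.UploadServiceStub(channel)
# summary = stub.Upload(chunk_generator('big_file.bin'))  # client-streaming call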

gRPC exception handling

With the rapid growth of the Internet, a service is no longer a single application but a set of microservices; each module can be scaled out or in and deployed independently, and modules communicate with each other over the network. Our applications therefore have to handle network errors properly, for example network jitter, or the peer machine we are talking to having just come back online.

gRPC error types

gRPC has its own set of status codes, similar to HTTP status codes; each one is a string such as INTERNAL, ABORTED, or UNAVAILABLE. An error we often run into is "StatusCode=Unavailable, Detail=failed to connect to all addresses". Its common causes are:

  1. The RPC client's request never reaches the server, for example because of network/DNS resolution jitter or a change of the cloud host's IP address
  2. The server instance failed to initialize properly and cannot handle client requests
  3. The connection between the RPC client and the server is broken or was never fully established

For these situations we need to add reconnection and retry mechanisms to keep the service reliable.
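For reference, here is a minimal sketch of how a client can catch these failures and inspect the status code, reusing the stub from the earlier example; grpc.RpcError, code() and details() are part of the Python gRPC API, while the 3-second timeout is just an arbitrary value for illustration.

try:
    response = stub.doRequest(test_pb2.SearchRequest(query='henry'), timeout=3)
except grpc.RpcError as e:
    # e.code() returns a grpc.StatusCode member, e.details() the error text
    if e.code() == grpc.StatusCode.UNAVAILABLE:
        print('server unavailable, reconnect/retry needed:', e.details())
    else:
        raise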

Reconnection mechanism

We can wrap gRPC calls in try/except to catch exceptions when talking to the gRPC service, and retry the connection once a connection failure is detected. In the Flask or Tornado frameworks, the connection can be written as a singleton, and the reconnection logic can be hooked into the framework's own exception-handling mechanism. In Flask, for example, you can define:

@app.errorhandler(Exception)
def handle_exception(e):
    # If the RPC connection failed, execute the reconnection code here
    ...

In Tornado, you can override log_exception:

    def log_exception(self, typ, value, tb):
        if issubclass(typ, RpcConnectError):
            # Add the code that reconnects the RPC here
            ...

This way, RPC connection exceptions anywhere in the system trigger a reconnect, which meets the reliability requirement.
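A minimal sketch of the "connection as a singleton" idea mentioned above: one shared channel/stub per process, which the framework's exception handler can reset to force a reconnect on the next call. The RpcClient class and the reset() method are names assumed only for illustration.

import grpc
import test_pb2_grpc

class RpcClient:
    _stub = None

    @classmethod
    def stub(cls):
        # Lazily create one shared channel/stub for the whole process
        if cls._stub is None:
            channel = grpc.insecure_channel('localhost:50051')
            cls._stub = test_pb2_grpc.SearchServiceStub(channel)
        return cls._stub

    @classmethod
    def reset(cls):
        # Called from handle_exception / log_exception above to force a reconnect
        cls._stub = None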

Retry mechanism

The gRPC channel can be initialized with the options parameter; earlier we used it to set the maximum send and receive sizes. It can also carry automatic retry configuration (see the proposal at github.com/grpc/propos…). Transparent retry means gRPC automatically retries a request that never reached the application logic on the server, which covers the first and second causes listed above. We can additionally configure retries through the retryPolicy of the service config.

Example:

options = [('grpc.max_send_message_length', 100 * 1024 * 1024),
           ('grpc.max_receive_message_length', 100 * 1024 * 1024),
           ('grpc.enable_retries', 1),
           # retryPolicy lives under methodConfig in the service config JSON;
           # "SearchService" should be the (package-qualified) service name from the proto
           ('grpc.service_config',
            '{"methodConfig": [{"name": [{"service": "SearchService"}], "retryPolicy": {"maxAttempts": 4, "initialBackoff": "0.1s", "maxBackoff": "1s", "backoffMultiplier": 2, "retryableStatusCodes": ["UNAVAILABLE"]}}]}')]
channel = grpc.insecure_channel('localhost:50051', options=options)

The code above turns on grpc.enable_retries (it is enabled by default; set it to 0 to turn transparent retry off). grpc.service_config is a JSON service config in which we can configure the retry policy (for the parameters see grpc.github.io/grpc/core/g…):

{
    "retryPolicy": {
        "maxAttempts": 4,
        "initialBackoff": "0.1s",
        "maxBackoff": "1s",
        "backoffMultiplier": 2,
        "retryableStatusCodes": ["UNAVAILABLE"]
    }
}

For retryable status codes such as UNAVAILABLE you can specify the number of attempts and related parameters; see the official documentation for details:

  • maxAttempts must be an integer greater than 1; values greater than 5 are treated as 5
  • initialBackoff and maxBackoff must be specified and must be greater than 0
  • backoffMultiplier must be specified and must be greater than 0
  • retryableStatusCodes must be given as a non-empty array of status codes; each code must be a valid gRPC status code, written either as an integer or as a case-insensitive status name

Hedging strategy

Hedging means actively sending multiple copies of a single request without waiting for a response. If a method uses a hedging policy, the first request is sent as a normal RPC call; if no response arrives within the configured delay, a second request is sent, and so on, until maxAttempts requests have been sent.

Example:

options = [('grpc.max_send_message_length', 100 * 1024 * 1024),
           ('grpc.max_receive_message_length', 100 * 1024 * 1024),
           ('grpc.enable_retries', 1),
           # hedgingPolicy also lives under methodConfig;
           # "SearchService" should be the (package-qualified) service name from the proto
           ('grpc.service_config',
            '{"methodConfig": [{"name": [{"service": "SearchService"}], "hedgingPolicy": {"maxAttempts": 4, "hedgingDelay": "0.1s", "nonFatalStatusCodes": ["UNAVAILABLE", "INTERNAL", "ABORTED"]}}]}')]
channel = grpc.insecure_channel('localhost:50051', options=options)

Note: with hedging, the duplicate requests may hit different back ends (if load balancing is configured), so the method must be safe to execute multiple times, i.e. idempotent.

Retry throttling

When the ratio of client failures to successes exceeds a threshold, gRPC disables the retry policies to prevent retries from overloading the server. This is retry throttling, and it can also be configured in the service config:

"retryThrottling": {"maxTokens": 10,
    "tokenRatio": 0.1}Copy the code

For each server, the gRPC client maintains a token_count variable, initialized to maxTokens; its value ranges from 0 to maxTokens.

Every RPC request affects token_count:

  • Each failed RPC request decrements token_count by 1
  • Each successful RPC request increments token_count by tokenRatio

When token_count <= (maxTokens / 2), the retry policy is turned off: retry attempts are not made and the failure's status code is returned directly to the caller. Retries resume once token_count climbs back above maxTokens / 2.
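A toy illustration of this bookkeeping in plain Python (not the gRPC API): with maxTokens 10 and tokenRatio 0.1, five consecutive failures push token_count down to 5 and switch retries off, and successful calls only raise it back slowly.

max_tokens, token_ratio = 10, 0.1
token_count = max_tokens                      # starts at maxTokens
for outcome in ['fail'] * 5 + ['ok'] * 3:
    if outcome == 'fail':
        token_count = max(0, token_count - 1)                         # failure: -1
    else:
        token_count = min(max_tokens, token_count + token_ratio)      # success: +tokenRatio
    print(outcome, round(token_count, 1), 'retries enabled:', token_count > max_tokens / 2)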

The actual throttling parameters should be chosen according to the server's performance and resources.
