In the previous installment, Game Development Experience (Part 1): Five Hidden Pitfalls in Game Architecture and How to Deal with Them, we explained some hidden pitfalls in game architecture design and how to handle them. If you missed it, you can go back and read it first. This installment focuses on the design ideas and technical implementation of global server games.

Battle game design ideas

Protocol selection

At the beginning of game design, you need to decide which protocol to choose for communication. For battle games, the first thing to recommend is UDP.

UDP places higher demands on development: developers must implement delivery confirmation, retransmission, and reliability guarantees themselves. But compared with TCP, whose main appeal is its low development cost, UDP has a great advantage in efficiency and timeliness. It can retransmit at short intervals tuned to the characteristics of the business, or simply send several copies of a packet, to improve the success rate on weak connections.
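A minimal sketch of the short-interval retransmission idea, assuming each datagram carries a sequence number and the peer acknowledges it. The class names, the 30 ms interval, and the give-up threshold are illustrative assumptions, not values from any particular game.

```python
from dataclasses import dataclass, field


@dataclass
class PendingPacket:
    seq: int
    payload: bytes
    next_retry_at: float  # seconds on the caller's clock
    attempts: int = 1     # the initial send counts as attempt 1


@dataclass
class RetransmitQueue:
    """Tracks unacknowledged datagrams and decides when to resend them."""
    retry_interval: float = 0.03  # assumed 30 ms; tune per game
    max_attempts: int = 5         # assumed give-up threshold
    pending: dict = field(default_factory=dict)

    def on_send(self, seq, payload, now):
        # Remember the packet until the peer acknowledges it.
        self.pending[seq] = PendingPacket(seq, payload, now + self.retry_interval)

    def on_ack(self, seq):
        # An ACK (even a duplicate one) clears the packet.
        self.pending.pop(seq, None)

    def due(self, now):
        """Return packets to resend now; drop those out of attempts."""
        resend = []
        for seq in sorted(self.pending):
            pkt = self.pending[seq]
            if pkt.next_retry_at > now:
                continue
            if pkt.attempts >= self.max_attempts:
                del self.pending[seq]  # give up; the caller may disconnect
                continue
            pkt.attempts += 1
            pkt.next_retry_at = now + self.retry_interval
            resend.append(pkt)
        return resend
```

In a real client, the send loop would call `due(time.monotonic())` every tick and pass each returned payload to `socket.sendto`. Sending duplicate copies up front is the same idea with the first retry scheduled immediately.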

In addition, compared with UDP’s low latency, TCP’s retransmission time is on the order of seconds. For battle games, TCP’s retransmission speed will undoubtedly lead to a very bad player experience. Although BBR and similar technologies are available, they mainly achieve more efficient bandwidth utilization; actual business response time does not improve much.
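To see why exponential backoff pushes retransmission delay toward seconds, here is a small worked calculation. The function name and the timeout values are assumptions chosen for illustration; real TCP stacks derive their timeout from measured round-trip time.

```python
def cumulative_retransmit_delay(base_timeout, losses, backoff=2.0):
    """Total time spent waiting when the same packet is lost `losses` times in a row.

    Each loss costs one timeout; the timeout is multiplied by `backoff`
    after every loss (2.0 models exponential backoff, 1.0 a fixed interval).
    """
    total = 0.0
    timeout = base_timeout
    for _ in range(losses):
        total += timeout
        timeout *= backoff
    return total


# Assumed numbers: a 200 ms initial timeout that doubles (TCP-like)
# versus a fixed 30 ms application-level retry over UDP.
tcp_like = cumulative_retransmit_delay(0.2, losses=3)
udp_like = cumulative_retransmit_delay(0.03, losses=3, backoff=1.0)
print(f"TCP-like: {tcp_like:.2f} s, fixed-interval UDP: {udp_like:.2f} s")
```

With these assumed values, three consecutive losses cost roughly 1.4 seconds with doubling timeouts and roughly 0.09 seconds with a fixed short interval.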

Of course, besides technical analysis, the company’s actual situation is also an important factor. Generally speaking, TCP is fine for map-based games with less interaction, such as COC and KOA (King of Avalon), while for battle games with strong networking requirements, such as CR (Clash Royale), Battle of Liberty, national gunfight, bullets and bullets, and MOBA and FPS titles, UDP is basically necessary.

Synchronization mechanism

Frame synchronization: Frame synchronization is the best choice for fast-paced mobile battle games that want to avoid unnecessary load on the server. Its advantage is that the uploaded information can be kept minimal: each client reports only a brief summary of its data, so very little bandwidth is required.

State synchronization: State synchronization offers high security, but it requires much more bandwidth than frame synchronization. Global server games already face a very harsh global network environment, and in many cases even need dedicated lines to solve network problems, so the network overhead becomes a relatively large cost.

In a nutshell: if you want to make global games, learn frame synchronization.
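The following is a minimal lockstep sketch of frame synchronization, assuming a server that only relays player commands and clients that run the same deterministic simulation. `FrameServer`, `Client`, and the "left"/"right" commands are hypothetical names used only for illustration.

```python
from collections import defaultdict


class FrameServer:
    """Collects per-frame inputs and releases a frame once every player has reported."""

    def __init__(self, player_ids):
        self.player_ids = set(player_ids)
        self.inputs = defaultdict(dict)  # frame number -> {player_id: command}

    def submit(self, frame, player_id, command):
        self.inputs[frame][player_id] = command
        if set(self.inputs[frame]) == self.player_ids:
            # Sort so every client applies commands in the same order.
            return sorted(self.inputs.pop(frame).items())
        return None  # still waiting for other players


class Client:
    """Deterministic simulation: the same frames in produce the same state out."""

    def __init__(self, player_ids):
        self.positions = {pid: 0 for pid in player_ids}

    def apply(self, frame_inputs):
        for player_id, command in frame_inputs:
            self.positions[player_id] += {"left": -1, "right": 1}.get(command, 0)


server = FrameServer(["a", "b"])
server.submit(0, "a", "right")
frame0 = server.submit(0, "b", "left")  # the last input completes frame 0

client_a, client_b = Client(["a", "b"]), Client(["a", "b"])
client_a.apply(frame0)
client_b.apply(frame0)
print(client_a.positions == client_b.positions)
```

Only short commands cross the network, which is where the bandwidth saving comes from. As a simplification, this sketch waits for every player; production servers usually close a frame on a timer and fill in empty input for late players.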

Finally, a brief summary of the characteristics of battle games: 1) because UDP and frame synchronization are used, the volume of packet exchange is huge; 2) storing player snapshots and frame data requires high-performance, large-capacity in-memory storage; 3) network stability requirements are high; 4) the architecture is mainly modular, and some deployments even use two-site, three-center disaster recovery, which requires agile deployment.

In general, battle games place high demands on system architecture. Cloud platform products are designed to help users solve these problems.

Network Architecture Design

So how do we provide network support for battle games with strong networking requirements? First, in terms of network quality, we use self-built BGP and our own AS number to interconnect routes directly with carriers such as China Unicom and China Mobile. Compared with going through an intermediary, this approach gives the network better coverage quality, disaster tolerance, and availability.

In addition, we separate the data centers from the network. The network exits are independent and reach different exit POP points over different physical paths. The POP points are where UCloud builds its own BGP and establishes connections with carriers. When users deploy on UCloud, they take advantage of the availability zone architecture.

In this way, all data centers in a region can pool their resources. They are interconnected through a private network, provided to users free of charge, for communication between data centers in the same region, which makes it convenient to build highly available, disaster-tolerant architectures across data centers. At the same time, having multiple POP points avoids problems with a single physical path or a fault in a local carrier’s metropolitan area network (for example, in Beijing), ensuring continuously high availability of the data center exits.

Game architecture example based on the UCloud cloud platform

The figure below shows a global server game architecture based on the UCloud cloud platform. First, the data storage layer directly uses high-availability database and Redis products to reduce operation and maintenance costs and ensure business availability. On top of the native products’ logic, the backend enhances features such as semi-synchronous replication, improving data consistency, security, and query performance without changing any user habits.

In addition, the server tier is built as a cluster fronted by high-performance load balancing, which forwards packets at TCP layer 4 to ensure reliable performance under high concurrency. The battle servers use a registration mechanism, and the room management servers run in a primary/standby high-availability setup. Because battles use UDP plus frame synchronization, the packet volume may reach 50,000 to 80,000 or even more, so network-enhanced cloud hosts are used directly to carry this load.
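One way the registration mechanism could look, as a sketch under stated assumptions: battle servers announce themselves to the room manager, send periodic heartbeats with their current load, and the room manager places each new room on the least-loaded live server. The class, its methods, and the 5-second timeout are hypothetical.

```python
class BattleServerRegistry:
    """Room manager's view of battle servers that have registered themselves."""

    def __init__(self, heartbeat_timeout=5.0):
        self.heartbeat_timeout = heartbeat_timeout  # assumed value, in seconds
        self.servers = {}  # address -> {"rooms": int, "capacity": int, "seen": float}

    def register(self, address, capacity, now):
        self.servers[address] = {"rooms": 0, "capacity": capacity, "seen": now}

    def heartbeat(self, address, rooms, now):
        # Heartbeats from unknown servers are ignored; they must register first.
        if address in self.servers:
            self.servers[address].update(rooms=rooms, seen=now)

    def allocate_room(self, now):
        """Pick the least-loaded live server with free capacity, or None."""
        live = [
            (info["rooms"] / info["capacity"], address)
            for address, info in self.servers.items()
            if now - info["seen"] <= self.heartbeat_timeout
            and info["rooms"] < info["capacity"]
        ]
        if not live:
            return None
        _, address = min(live)
        self.servers[address]["rooms"] += 1
        return address
```

Because servers that stop sending heartbeats simply fall out of the candidate list, battle servers can be added or removed without reconfiguring the room manager.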

At the global level, the database and load balancer use UCloud high-availability products and clusters for cross-availability-zone disaster recovery (DR). Business servers are split evenly between availability zones D and E to ensure data-center-level DR capability for the service.

Given the characteristics of battle games, the following points deserve attention in architecture design: 1) Service clustering. If you want to make a global battle game, the servers need to be clustered for high availability; with a rolling-server model (continually opening new servers), operation and maintenance costs and the cost of cross-server battles for players will be very high. 2) Service modularization. Splitting functions into modules makes them easier to manage and scale, maximizing efficiency and saving cost. 3) Business automation. This helps reduce operation and maintenance costs and enables rapid, agile scaling in case of business emergencies.

Global service technology implementation

Global server games amplify the theme of player-versus-player competition. With the addition of nation-versus-nation wars and other gameplay built around national identity, global server battle games can strongly stimulate players’ sense of belonging and increase their engagement and willingness to fight. On the technical side, global server battle games are much more difficult to implement than partitioned battle games, mainly for the following three reasons:

Network: Different modes of battle games have different requirements for network performance, but to ensure transmission performance, the UDP protocol is generally used, with reliable transmission implemented on top of it.

Code: The core architecture may not be that different from a partitioned battle game, but there are more considerations in network design, deployment architecture patterns, network latency, and so on.

Architecture: Whether deployed centrally or in a distributed way, the architecture’s local carrying capacity and modular design must be considered to cope with the global influx of players.

Network

China’s international public-network egress mainly consists of China Telecom 163, China Unicom, China Mobile, and China Telecom CN2. In terms of total bandwidth, China Telecom is the largest, China Unicom is second, and China Mobile is the smallest. In terms of actual utilization, the China Telecom 163 egress is congested all year round, with availability below 80%. China Unicom and China Mobile are a little better, but because their egress capacity is small, their ability to cope with network fluctuations is not encouraging.

CN2 is the international egress that China Telecom opened specifically for enterprise customers, and it is currently the best international egress network in China, but even CN2 is not completely stable. According to monitoring records, the CN2 link still experiences several long periods of jitter every month, which remains a risk if one of them coincides with a game’s promotion.

The safest solution is to use private lines, which come with an SLA, guaranteed availability, high stability, low latency, zero packet loss, and other features. However, the cost is much higher than that of public networks.

UCloud also uses its own BGP technology in overseas markets, combining BGP with overseas private lines to ensure optimal access quality. Its route-based location accuracy is much higher than that of CDN-style intelligent DNS.

In addition, in terms of operation and product security, we have replicated the domestic model overseas and optimized it for the conditions of each data center and the characteristics of each region. The overseas data center architecture and cloud product architecture are kept in sync globally with those in China to ensure a consistent customer experience and service standard.

Code

Previously, a customer approached us wanting a guaranteed network optimization and acceleration solution for a global server deployment, and required the entire acceleration process to be transparent and non-intrusive to the business. In short, network acceleration had to be achieved without changing any code. To this end, we carried out a series of technical studies and solution designs, and the PathX solution was born.

In the preliminary design and implementation, we studied two approaches in depth: layer 3/4 network forwarding and proxying. During this research we found that the proxy approach interrupts TCP connections and, in use, can leave services unable to forward traffic, the so-called “fake death”. After comparing the two, we chose the layer 3 network forwarding scheme and built an architecture with relatively broad protocol support.

Subsequently, we also iterated on the UDP battle requirements of CR and, building on the original design, integrated DPDK and high-packet-rate technology to create a load-forwarding mode for the high-packet-volume traffic of UDP plus frame synchronization. With the arrival of the global server era, we folded these features into the PathX product, which is now at version 2.0.

Architecture

With a global server, massive amounts of user data need centralized access, processing, and analysis. In the big data field, Hadoop is undoubtedly the most economical solution, but its barrier to entry is very high: it requires a maintenance team of at least 7-8 people. Simpler general-purpose databases such as a MySQL cluster, on the other hand, are not ideal in terms of performance or cost-effectiveness.

To meet users’ requirements for high-performance big data analysis, we developed UDW, a data warehouse solution based on PostgreSQL with high-performance, PB-level data storage capacity. In addition, we offer storage-intensive and compute-intensive variants according to users’ different requirements, so it can be used in scenarios with large amounts of data or with high demands on real-time computation.

Below is a fairly standard global server game architecture. First, users deploy the core business servers in the United States, including databases, player nodes, big data, login services, and so on. Then a global acceleration solution is used to provide players with stable, high-quality game service. Some users, such as FPS game makers, deploy additional small micro-nodes overseas to ensure minimal latency and stability for players.

Another important point in architecture design is the use of keyframes: keyframes and game prediction strongly affect the latency the game requires of its players. For example, if the game requires a delay of 60 milliseconds or less, it cannot cover any area where the delay exceeds 60 milliseconds, and players there will simply be dropped.

When deploying a global game, besides the effect of network latency on players, you also have to consider questions such as the impact of the player influx on the core node and how to size it to meet the game’s requirements, the best access path for players, and which servers players in different regions can connect to. These questions need to be planned early in game design.
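The access-path planning described above can be sketched as a simple lookup: for each player region, pick the lowest-latency node, and flag regions where even the best node exceeds the game’s latency budget. The function name, region and node names, and latency figures below are made up for illustration; real planning would use measured data.

```python
def plan_access(latency_ms, budget_ms):
    """Map each player region to its lowest-latency node within the budget.

    latency_ms: {region: {node: round-trip time in milliseconds}}
    Returns (assignment, uncovered_regions).
    """
    assignment, uncovered = {}, []
    for region, nodes in latency_ms.items():
        if not nodes:
            uncovered.append(region)
            continue
        node, rtt = min(nodes.items(), key=lambda item: item[1])
        if rtt <= budget_ms:
            assignment[region] = node
        else:
            uncovered.append(region)  # needs a new node or acceleration
    return assignment, uncovered


# Hypothetical measurements for a US core node plus one overseas micro-node.
measured = {
    "us-west": {"us-core": 20, "sg-edge": 180},
    "southeast-asia": {"us-core": 190, "sg-edge": 35},
    "europe": {"us-core": 95, "sg-edge": 160},
}
assignment, uncovered = plan_access(measured, budget_ms=60)
print(assignment, uncovered)
```

With a 60 ms budget and these made-up numbers, the "europe" region is left uncovered, which is the kind of gap that should be discovered at design time rather than after launch.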

Some thoughts and suggestions on global server game design

Although cloud providers, R&D, and operations have different responsibilities, they are all indispensable parts of the project team. In my experience, there is usually very little communication between operations and R&D in the early stages of a project. As a result, people often blame each other when something goes wrong: ops thinks the problem is the code, and dev thinks ops is not doing its job, which is bad for a game project.

From the project’s point of view, cloud providers, R&D, and operations should cooperate closely. Cloud providers can offer the most appropriate products and solutions for the game’s needs. Operations staff are the core personnel who keep the whole game running stably over the long term; they are good at making the business highly available and easy to maintain later on. R&D is the cornerstone of the entire project, and the way the code is implemented largely fixes how a game will be operated and maintained.

In the early stage of a project, the three parties should not confine themselves to their own fields but should cooperate and be open with each other. Where the project allows, R&D can bring in operations staff and the public cloud provider’s architects when designing the framework, so that together they assess the game’s architecture and consider system availability, stability, and load resistance as early as possible. This avoids many unnecessary detours and technical errors, such as the single-point problems mentioned in the article, and supports long-term business development.
