Introduction | Becoming an architect is an indispensable step on a programmer’s path to advancement, and today’s increasingly intelligent society places new demands on every programmer’s architecture skills. This article is compiled from talks given at the online Cloud+ Community Salon by Ma Wenshuang, director of storage and virtualization at Tencent Cloud; Wang Chao, general manager of the basic platform at Shell Housing and a Tencent Cloud Most Valuable Expert (“TVP”); and Wang Xiaobo, CTO of the ticket business group at Tongcheng Yilong and a Tencent Cloud Most Valuable Expert (“TVP”). We hope it is a useful starting point for discussion.

[Click to watch the complete live playback](https://cloud.tencent.com/developer/salon/live-1266?channel=salonbanner)

01 On the Evolution of Architecture

**Wang Xiaobo:** For me, the evolution of architecture is really a summary of my own career. I used to work on infrastructure and middleware, but in recent years I have focused on business architecture and application architecture.

From my perspective, the history of architecture evolution has two parts: application architecture and infrastructure architecture. Their evolution paths and key milestones are not quite the same, but application architecture is built on top of the evolution of infrastructure.

The evolution of architecture can be divided into three stages. Around 2006, the role of architect was not clearly defined in China, and many people did not know what an architect was. In the beginning, we were simply engineers who were good enough to plan frameworks and development specifications.

The first generation centered on programming frameworks: planning, decoupling, and model building around a technical framework. Most of the prominent open source projects at that time were programming frameworks, so the work was closer to pure programming.

The second generation arrived with high concurrency, distributed systems, and heavy traffic. Architecture work then paid more attention to the surrounding infrastructure: whether the relational database could cope, whether the cache could cope, whether the load could be handled, whether the system could scale horizontally, whether it was distributed enough, whether flow control was sufficient, and whether there were enough controls for stability and high availability.

This generation is characterized by its emphasis on infrastructure: a large share of resources went into infrastructure, and applications iterated better on that foundation.

The third stage is data-driven application architecture. Today’s application architecture is increasingly data-driven: more and more new applications emerge from mining data.

To put it another way: when technologies like “machine learning” are no longer just advertising slogans but are actually present in every corner of the system, new architectural challenges arrive.

This generation of data-driven architecture focuses on mining massive amounts of data and applying the results in real time, with fast computation over large data sets. We are also required to develop and deploy applications even faster. With more features and more data-driven work, new data-driven architecture concepts keep emerging.

**Wang Chao:** From another perspective, architecture develops along with the whole industry and the needs of society.

First came the portal and social era. Around 2000, the PC Internet was booming. There were four major portals, and the Internet mainly delivered news content, so content delivery networks (CDNs) flourished in those days. A few years later, social networking (SNS) products appeared to solve the problem of passing information between people.

Technology gradually shifted from compiled languages to the wide use of dynamic, interpreted languages, which made it easier to write complex, relationship-heavy business logic. At the same time, relational databases came into wide use.

In addition, because relationship networks are very complex, caching was applied heavily to meet performance requirements and to compensate for the relational database’s insufficient access capacity in some scenarios.

Another technology that developed rapidly at that time was the search engine. General-purpose search solves the need to find information quickly, but because it searches the whole web, it places very high demands on storage and computing. This period saw the first prototypes of distributed systems and NLP, which further improved engineering capability.

The second stage is the mobile Internet: from PCs to mobile phones and then to a variety of terminals. People’s understanding of the Internet deepened, and traffic began to grow exponentially.

More storage and computing power was needed, so cloud computing was widely adopted and many large-scale clusters emerged.

The last stage is the rapid development of the consumer Internet and the industrial Internet. Engineering capability was largely solved in the previous period. Until IoT is widely applied, there will be no further exponential growth in networked terminal devices, so basic engineering capability is no longer the bottleneck, and the main technological development will focus on big data and AI architecture. For example, GPU clusters, elastic computing, and machine learning frameworks are becoming more and more important, as are graph databases for solving complex relationship graphs.

So in my opinion, architecture technologies keep evolving because society and user needs change.

**Ma Wenshuang:** Teacher Xiaobo and Teacher Wang Chao have given a very good summary of the evolution of Internet products and technology architecture. I’d like to talk about the evolution of cloud hard disk technology and some of our thoughts and lessons learned.

When we designed the cloud hard disk in 2013, we regarded it as a distributed storage system. Tencent had accumulated years of experience in distributed systems, and products such as the distributed file system TFS and the KV store CKV were very mature. We reasoned that building on mature, stable products would give the cloud disk good reliability and let it go online quickly, so we brought together three of Tencent’s star products, TFS, TSSD and CKV, and designed cloud disk 1.0.

TFS, TSSD and CKV all have very good availability, so we assumed that a cloud hard disk built on them would be available too. Cold data sinks into TFS, which is built on HDDs, and hot data floats up into the TSSD cluster, which is built on SSDs. This was meant to ensure performance, take care of throughput, allow unlimited expansion, and offer a cost advantage.

However, by the end of 2014, less than a year and a half into operation, there had been three relatively large outages as well as a number of minor problems. In 2014, the service was unavailable for more than 12 hours.

Why did the 1.0 architecture have so many problems? In our opinion, the core problem was that we did not design the architecture around customer needs, which is what Teacher Wang Chao mentioned just now. We built the architecture and designed product features around existing technologies and overemphasized the reuse of existing systems, which made the architecture complex and availability hard to guarantee.

A typical example: the underlying storage used a KV storage system, and a characteristic of KV storage is that larger values suit it better, giving better cloud disk performance and lower cost.

A high proportion of our online cloud hosts run Linux. If we designed cloud disks with 4096-byte sectors for Linux cloud hosts, users would get better performance at lower cost. So we designed two sector sizes, 512 for Windows cloud hosts and 4096 for Linux. After going online, we were met with furious complaints from users: after a Linux cloud host was reinstalled as Windows, the data on the disk could not be used.

In this example, we made the system architecture more complex in order to support different sector sizes. We thought it would bring users better performance, but users did not value that feature at all. They could not switch operating systems, and a disk was locked to one system, which was a very bad experience.

After reflection, we decided to start from user needs, which we summarized as three core requirements: data reliability, availability, and stable performance.

Based on these three core requirements, we designed the second-generation architecture. It is characterized by a uniform sector size, consistent storage media, a fixed cluster size, static data routing, and a simple design.

The availability of the second-generation architecture improved substantially, and it supported the rapid growth of cloud hosts in 2015 and 2016.

Next, in the third-generation architecture we removed the access layer, and it then became natural to mix SSDs and HDDs to reduce costs. After that we introduced SPDK and RDMA to reduce latency.

The lesson we learned is that a product’s architecture must be designed around user needs. Grasp the core pain points of users at each stage and evolve the architecture to solve them. If you instead imagine product features based on existing technical capabilities and then push them to users, users will vote with their feet. No matter how clever the technical architecture behind it, the business is doomed to fail.

**Wang Xiaobo:** In my view, sound evolution should come from real requirements. When requirements change, when the experience changes, and when newer and better experiences are needed, the architecture begins its next evolution.

02 How to Implement a High Availability Architecture?

**Ma Wenshuang:** Generally speaking, the availability of a system is measured by a single number: available time divided by the total time in the month, such as 99.99% or 99.999%, which looks very high.

The problem is that a 10-minute failure during peak hours may have a very different impact from a 10-minute failure during off-peak hours. If the calculated availability is exactly the same in both cases, that number cannot reflect the real state of the system.

Tencent internally calculates availability differently, by looking at requests rather than at system uptime. Since the system exists to serve requests, we add the number of rejected requests to the number of timed-out requests and divide by the total number of requests, which gives the failure ratio; availability is one minus that ratio. If availability computed this way is close to 100%, the system really is highly available.
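The difference between the two measures can be made concrete with a minimal sketch. The traffic figures below (requests per minute at peak and off-peak, and the monthly request total) are hypothetical values chosen only for illustration; the two formulas follow the description above.

```python
def time_based_availability(downtime_minutes: float, total_minutes: float) -> float:
    """Classic measure: the share of wall-clock time the system was up."""
    return 1.0 - downtime_minutes / total_minutes


def request_based_availability(rejected: int, timed_out: int, total: int) -> float:
    """Request-based measure: one minus the share of failed requests."""
    if total == 0:
        return 1.0  # no traffic means nothing failed
    return 1.0 - (rejected + timed_out) / total


MINUTES_IN_MONTH = 30 * 24 * 60

# Hypothetical traffic profile, for illustration only.
PEAK_REQUESTS_PER_MINUTE = 600_000
OFF_PEAK_REQUESTS_PER_MINUTE = 6_000
TOTAL_REQUESTS_IN_MONTH = 5_000_000_000

# A 10-minute outage scores the same by time, whenever it happens.
time_based = time_based_availability(10, MINUTES_IN_MONTH)

# By requests, the same outage costs far more at peak than off-peak.
peak_outage = request_based_availability(
    rejected=10 * PEAK_REQUESTS_PER_MINUTE, timed_out=0, total=TOTAL_REQUESTS_IN_MONTH
)
off_peak_outage = request_based_availability(
    rejected=10 * OFF_PEAK_REQUESTS_PER_MINUTE, timed_out=0, total=TOTAL_REQUESTS_IN_MONTH
)

print(f"time-based:              {time_based:.6%}")
print(f"request-based, peak:     {peak_outage:.6%}")
print(f"request-based, off-peak: {off_peak_outage:.6%}")
```

Under these assumed numbers, the time-based figure is identical for both outages, while the request-based figure is roughly 99.88% for the peak outage and 99.9988% for the off-peak one.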

When designing and operating a system, be alert to a drop in availability, or even outright unavailability, caused by a steep decline in the service capacity of a single server.

For example, suppose we design a system with a load balancer that distributes requests to many business servers behind it, and each server handles 10,000 QPS. With ten servers, the system’s capacity is 100,000 QPS. If one server fails, capacity drops to 90,000 QPS. The processing power of the system can be estimated.

When designing an architecture, we tend to assume that each server is either normal or faulty: normal servers stay online, and faulty servers are quickly removed by the load balancer. What we often fail to consider is that a single server can suddenly enter a sub-healthy state while the system is running, and this is very common.

For example, when a switch malfunctions and its forwarding capability drops, the network capability of the servers behind it declines significantly. If the CPU frequency drops or ECC error correction occurs, the computing capability of the whole server plummets.

Hard disks are another common source of problems. A disk fault can cause I/O latency to rise sharply, or even cause I/O to hang so that the disk becomes unavailable. Once a single server enters a sub-healthy state, its capacity to process requests drops sharply or is blocked completely.
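A small model shows why a sub-healthy server can hurt more than a dead one. It assumes a balancer that splits traffic evenly across every server still in rotation; the function name and the 500 QPS degraded capacity are hypothetical, while the ten servers at 10,000 QPS each come from the example above.

```python
from typing import Sequence


def served_qps(offered_qps: float,
               capacities: Sequence[float],
               in_rotation: Sequence[bool]) -> float:
    """Requests per second actually served when an even-split balancer
    sends an equal share to every server that is still in rotation."""
    active = [cap for cap, on in zip(capacities, in_rotation) if on]
    if not active:
        return 0.0
    share = offered_qps / len(active)
    # A server cannot serve more than its capacity; the excess fails.
    return sum(min(share, cap) for cap in active)


OFFERED = 90_000.0
ALL_ON = [True] * 10

healthy = served_qps(OFFERED, [10_000.0] * 10, ALL_ON)

# One server drops to a hypothetical 500 QPS but still looks "online".
degraded_caps = [10_000.0] * 9 + [500.0]
sub_healthy = served_qps(OFFERED, degraded_caps, ALL_ON)

# The same server removed from rotation: the other nine absorb the load.
ejected = served_qps(OFFERED, degraded_caps, [True] * 9 + [False])

print(healthy, sub_healthy, ejected)
```

In this model the sub-healthy server keeps receiving a tenth of the traffic and fails most of it, whereas removing it lets the remaining nine servers carry the full 90,000 QPS.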

In many open source systems, such as HDFS, when the processing capacity of a single node deteriorates, the entire system can become unable to serve.

Therefore, in high availability design we must ask: when a single server enters a sub-healthy state and its processing capacity degrades, is the system still highly available? From this point of view, there are several things to address:

First, if the design can avoid dependencies between business requests, things are relatively easy. That is, each piece of business logic is completed entirely on a single server, so even if one server’s processing capacity drops, the impact on system availability is controllable.

If the business chain is long and there are dependencies between services, it is more troublesome. For example, when HDFS writes data, three replicas are written to three data nodes, and the writes are serial. If any one of the three nodes is slow, the client’s write is slow or blocked. The longer the service chain, the greater the impact of sub-healthy or degraded nodes.
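The effect of chain length can be sketched with two simple formulas. This is a simplified model rather than a description of how HDFS is implemented: it assumes each hop is strictly serial and that each node is independently degraded with probability `p`. The latencies and the 1% figure are hypothetical.

```python
from typing import Sequence


def serial_write_latency_ms(node_latencies_ms: Sequence[float]) -> float:
    """In a strictly serial chain, the total latency is the sum of the hops,
    so a single slow node dominates the result."""
    return sum(node_latencies_ms)


def chance_of_degraded_node(p: float, chain_length: int) -> float:
    """Probability that at least one node in the chain is degraded,
    assuming each node is independently degraded with probability p."""
    return 1.0 - (1.0 - p) ** chain_length


all_healthy = serial_write_latency_ms([2.0, 2.0, 2.0])
one_slow = serial_write_latency_ms([2.0, 2.0, 500.0])

three_hops = chance_of_degraded_node(0.01, 3)
thirty_hops = chance_of_degraded_node(0.01, 30)

print(all_healthy, one_slow)
print(f"{three_hops:.4f} {thirty_hops:.4f}")
```

With three hops and a 1% chance per node, about 3% of writes touch a degraded node; with thirty hops the figure rises to roughly a quarter, which is why long chains need active detection of sub-healthy nodes.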

In this case, full-link detection and monitoring can quickly identify and remove these abnormal, sub-healthy nodes so that system availability recovers quickly.

Two or three years ago, the Tencent Cloud team invested a lot of effort in full-link monitoring and the removal of sub-healthy nodes. Avoiding misjudgment matters here: when a large-scale network fault occurs, nodes should not be removed, but when the network adapter of a single server is abnormal or the performance of one access switch is degraded, the affected nodes should be removed quickly.
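One way to encode that rule is an outlier check with a guard against mass removal. The sketch below is an assumption about how such a policy might look, not Tencent Cloud’s implementation; the function name, the `slow_factor` of 5 and the 20% cap are all hypothetical.

```python
from statistics import median
from typing import Dict, List


def nodes_to_eject(latency_ms: Dict[str, float],
                   slow_factor: float = 5.0,
                   max_eject_fraction: float = 0.2) -> List[str]:
    """Return nodes whose latency is far above the fleet median.

    If too many nodes look slow at once, the cause is probably shared
    (for example a wider network fault), so nothing is removed.
    """
    if not latency_ms:
        return []
    baseline = median(latency_ms.values())
    suspects = [node for node, value in latency_ms.items()
                if value > slow_factor * baseline]
    if len(suspects) > max_eject_fraction * len(latency_ms):
        return []  # widespread degradation: removal would be a misjudgment
    return sorted(suspects)


fleet = {f"node-{i}": 10.0 for i in range(10)}

single_fault = dict(fleet, **{"node-3": 400.0})
mass_fault = dict(fleet, **{f"node-{i}": 400.0 for i in range(4)})

print(nodes_to_eject(single_fault))
print(nodes_to_eject(mass_fault))
```

A single outlier is removed, while four slow nodes out of ten exceed the cap and are left in place for a human or a higher-level system to diagnose.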

**Wang Chao:** As an architect, you really need to get deep into the details, see the problems, and solve them in a targeted way. For example, the definition of high availability differs depending on whether it is request-based or runtime-based, and the boundaries need to be very clear. Architects need both depth and breadth.

For example, I once found that a team’s code was prone to bugs that needed emergency fixes. The fixes introduced new problems, each of which took a long time to resolve, forming a vicious circle. In addition, problems were often discovered first by the operations and product teams, while the engineering team could not detect them promptly and was always notified passively.

Digging into this, we found that many teams released multiple projects every day, and dependency problems between projects were unresolved. Many teams were developing at the same time, so there were code conflicts; there was no good release system, and releases could not be rolled back immediately. In summary, monitoring was not in place, the systems were immature, and the mechanisms had gaps.

As an architect, you need to have basic solutions: how do you eliminate single points of failure? How do you design a release process? All of this general knowledge can be found online, but solving it in the ideal way often takes longer than the business can wait.

In my opinion, long-term plans are necessary, but short-term plans are also very important. You do not have to use the most ideal, most technical solution; you can borrow ideas from architecture.

For example, could the code changes of these teams be funneled through one team, or even one person, in the same way that a single writer resolves database write conflicts? If you release a lot, split the releases into several time windows. This is similar to a log-structured merge: merging in batches is simple and efficient.

I think architectural thinking can be applied everywhere. Production architecture is also, in essence, a kind of architecture; this is the meaning of architecture as I see it from another perspective.

Another approach is to analyze the characteristics of the company. For example, Shell’s business runs in the daytime, and few people view houses at night. So we do chaos engineering at night: even if the system goes wrong for a short time, the impact is small and there is still time to repair it. You have to learn to use such advantages to improve the robustness of the system.

**Wang Xiaobo:** Does complete high availability exist? This question needs to be looked at dialectically. In what situations do we need high availability, and to what degree? In other words, what are our goals for high availability, and at what granularity do we need it?

For me, the first thing that comes to mind is cost. For example, if outages are very costly and you deploy active-active across multiple remote sites, the high availability cost of the overall architecture can be very high. If the system can recover after it goes down, the way to reduce cost is to degrade.

High availability measures such as circuit breaking and degradation may be cheaper, and mastering them is an advanced requirement for an architect, especially an application system architect. The key is to understand what degradation means here: under what conditions should we trip the circuit breaker so that the damage to us is smallest? This reflects whether an architect deeply understands the technology, including the underlying technology.

To build a high availability architecture, the technology must be good in the first place. But is the reverse true: as long as the technology is good, is the problem completely solved? I don’t think so. If you don’t understand the business process or the business scenario, a big question arises: what should degradation look like? Does it mean no impact on the business, no impact on the user experience, or no impact on the overall transaction flow or revenue?

These kinds of degradation are entirely different. Which one applies depends on the architect’s understanding of the business: modeling the business process into a system architecture and then implementing degradation within that architecture.
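The mechanics of circuit breaking with a degraded fallback can be sketched as follows. This is a minimal, single-threaded illustration under assumed thresholds, not a production implementation; the class, its parameters, and the fallback are hypothetical. What the fallback should return is exactly the business decision discussed above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after consecutive failures and serves a degraded fallback
    until a cool-down has passed, then lets one trial call through."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        return self._opened_at is not None

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                return fallback()  # still cooling down: degrade immediately
            # Cool-down elapsed: fall through and try the primary once.
        try:
            result = primary()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            return fallback()
        self._failures = 0
        self._opened_at = None
        return result
```

A caller might wrap a dependency as `breaker.call(fetch_live_price, lambda: cached_price)`, where both names are placeholders: whether a cached price is an acceptable degradation depends on whether it affects the transaction flow or revenue.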

So you not only have to be good at technology and logic and pick the right moment; ultimately you have to know the business better than the business side does.

03 How to Quickly Take Control of a Project’s Architecture?

**Wang Xiaobo:** For a new project, the challenge to system architecture design falls into two categories. One is a completely new project started from zero. Such projects are built from scratch and need to be done step by step. The harder one is taking over a business system in a field that is entirely new to you, which can be very challenging.

Most architects are more likely to meet the second case in their actual work: taking over an existing system and being appalled on first reading its architecture or code.

Faced with such a situation, the first thing I would think of is to review the boundaries of the entire system and its current logical flow, then sort out the business, and finally design a completely new structure for business and technology based on my own ideas.

This new architecture must at first be kept in mind only, because it takes time to move the system you have taken over from what you consider a mess to what you consider good. The best state should be reached over time and in steps. Trying to get there in one step will not fail 100 percent of the time, but it is very easy to capsize.

The distance between your ideal and reality depends not only on the quality of the technology but also on your understanding of the business and of the team, including your consideration of time and business cost. So I would design a completely new architecture myself but keep it in my head, constantly benchmarking the goal of each step against that ideal.

So what is best? The premise is to first make the system usable and achieve the business goals; that is the quickest thing to do. The rest is the dream you keep in mind and approach over time.

**Ma Wenshuang:** When taking on a new project, the first thing I do is find great people to join the team. After all, any architecture ultimately needs programmers to implement it.

First of all, any architecture should be simple and controllable. If a colleague told me that an architecture would take at least a year to build, for example one or two quarters to research a technology and then six months to implement it, I would consider it infeasible.

I think you should pick a mature architecture that can be built quickly, build a prototype first, and do quick trial and error. See whether it is what users want and which of high availability, high concurrency, and high performance matters most, then invest resources there and evolve accordingly.

**Wang Chao:** If I take over a project that is new to me, I first find out the state of the existing system: I list and draw the key logic and layered structure of the whole codebase, and figure out which modules exist, how they communicate, and which middleware and third-party services are used. Then I analyze and list the risk points, current problems, and past failures, and only then design a reasonable plan.

Don’t take a generic solution and implement it right away. Analyze your business and see if you really need high concurrency and low latency.

For example, in the traditional Internet and e-commerce, operations are purely online and cheap to perform, so a large number of requests can arrive at the same time, which places very high demands on concurrency and flow control. Your architecture should consider how to use queues for decoupling and how to apply circuit breaking.

In the industrial Internet, offline and online actions are linked, and every request is accompanied by an offline physical action, so concurrency will not be very high and there is no need to pursue a capacity of hundreds of thousands of QPS.

However, latency matters more. Take Shell’s business: real estate transactions have very long call chains, and completing one may pass through three or four hundred services. If each service’s latency is high, the request will not return in a short time, and the experience will be very poor. So consider how to shorten and merge the chain and how to cache.
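The arithmetic behind shortening, merging, and caching can be sketched briefly. The numbers are hypothetical (300 services at 5 ms each, grouped into 30 stages of 10 independent calls, and a 90% cache hit ratio); only the order of magnitude of the chain length comes from the text above.

```python
from typing import Sequence


def sequential_latency_ms(stages: Sequence[Sequence[float]]) -> float:
    """Every call waits for the previous one: latencies simply add up."""
    return sum(sum(stage) for stage in stages)


def merged_latency_ms(stages: Sequence[Sequence[float]]) -> float:
    """Independent calls within a stage run concurrently, so each stage
    costs only as much as its slowest call."""
    return sum(max(stage) for stage in stages if stage)


def expected_latency_with_cache_ms(hit_ratio: float, cache_ms: float,
                                   backend_ms: float) -> float:
    """Average latency when a fraction of requests is served from a cache."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms


# Hypothetical chain: 30 stages, each with 10 independent 5 ms calls.
chain = [[5.0] * 10 for _ in range(30)]

one_by_one = sequential_latency_ms(chain)
merged = merged_latency_ms(chain)
cached = expected_latency_with_cache_ms(0.9, cache_ms=1.0, backend_ms=merged)

print(one_by_one, merged, round(cached, 1))
```

Under these assumptions, calling 300 services one by one costs 1,500 ms, merging independent calls brings it to 150 ms, and a cache in front of the whole chain lowers the average to about 16 ms.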

In addition, such business characteristics make problems difficult to locate and trace. When something goes wrong, it is often unclear which service or services are at fault, and the cost of investigation is very high. One problem is often investigated by several teams, which jolts the whole organization.

So how do we locate problems quickly? First, trace each request and manage the life cycle of each piece of data. Second, be aware of platform releases and of configuration changes to the network environment, middleware, and so on. Third, use data analysis and AI to assist in locating the fault.
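The first point, tracing a request, can be illustrated with a toy in-memory trace. This is a sketch of the idea only; real systems propagate the trace ID across processes and store spans centrally. The class names, service names, and durations are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    service: str
    duration_ms: float
    ok: bool = True


@dataclass
class Trace:
    """All spans recorded for one request share a single trace ID."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: List[Span] = field(default_factory=list)

    def record(self, service: str, duration_ms: float, ok: bool = True) -> None:
        self.spans.append(Span(service, duration_ms, ok))

    def slowest(self) -> Span:
        return max(self.spans, key=lambda span: span.duration_ms)

    def failed_services(self) -> List[str]:
        return [span.service for span in self.spans if not span.ok]


# Hypothetical services along one request.
trace = Trace()
trace.record("listing-service", 12.0)
trace.record("pricing-service", 480.0)
trace.record("contract-service", 20.0, ok=False)

print(trace.trace_id, trace.slowest().service, trace.failed_services())
```

With every hop tagged by the same ID, the question of which service is slow or failing becomes a lookup rather than a multi-team investigation.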

The architect needs to understand the current state of the system, the business, and the team’s capabilities, and then adapt to local conditions. Think about the real architecture based on your current system, business, and team organization. Only then will the picture in your head that Teacher Xiaobo described appear, so that you can make a long-term plan and move toward that picture along an evolution path.

04 How to Improve Architecture Capability Comprehensively?

**Ma Wenshuang:** Members of my team have asked similar questions about how to improve architectural capability. I think the most effective way is to work on different types of projects and solve business problems; in other words, practice more.

However, some people may say: I only have the capacity to finish my own work in this team, I have no opportunity to participate in other projects, and I am not exposed to many Internet technologies, so what do I do about my slow growth?

In that case, first read some technical books on architecture, or watch the architecture livestreams that teachers share in the Cloud+ Community, to learn the ideas and methodology of high availability architecture. But this only gives you the theoretical basis. What matters more is to practice.

For example, how do you achieve high availability and scalability? You could buy a load balancer on Tencent Cloud, attach logic servers behind it, and build a simple system that can scale horizontally.

Even if you have no chance to participate in such a project, you can still build the environment yourself, build a service yourself, simulate a real business scenario, and then practice.

For example, to practice module decoupling, you can set up Kafka and use it. These things are actually very simple, especially in today’s cloud environment. The key is to practice and practice again.

**Wang Chao:** When I first got into technology, no one knew what a technical architect was. As technology evolved, I drew some conclusions of my own. First, I suggest that engineers and architects take the initiative to take on and solve horizontal, cross-cutting technical problems. Step outside your own technical boundaries, think about and get involved in cross-cutting work and push it forward, and gradually more opportunities will come.

The other thing is to keep an open mind. Even if a new technology is immature or still at the concept or POC stage, pay attention to it and learn about it rather than rejecting it. Of course, technology develops rapidly in many directions, and it is difficult to learn every direction deeply.

So it’s important to find something that suits you, that combines and develops with your current field, that you’re interested in doing, and then go deeper. Find a good open source project to look at the code and understand how it’s designed. Because architecture is everywhere and there are many good architectural ideas in the code, it is important to understand them.

Take Linux, which Teacher Ma is familiar with, as an example. After so many years of development, the essence of its architecture is still applied in many places. The underlying core is unlikely to change, so we must understand it deeply.

Last but not least, once you have accumulated enough experience and built something, try to open source it and put it on the Internet for more people to use, verify, and give feedback on. This feedback is very important; it is hard to improve your architecture if you always work behind closed doors.

**Ma Wenshuang:** Proactive people must have goals and know what they want. If you want to build high availability architecture, know what you are missing and then set yourself targets: for example, this month I will work on the design of server state, and next month on scalable design. Decide what to achieve by when. Initiative is particularly important.

**Wang Xiaobo:** Programmer and architect are two words, but I think code is the essence. Architect is just a title for one stage, a segment of a programmer’s career. Everyone is essentially part of the division of labor in the technology world. Whether you are a programmer or an architect, in testing or in operations, you are essentially a programmer, because we all work for code.

To become a good architect, you must first define what kind of architect you want to be, which is very important throughout your growth. If the goal is to know Linux as well as Teacher Ma does, then you need to master storage technology as a whole.

That kind of architecture is about understanding the principles of technology and really getting into the principles of computing: what a computer is, what storage is, why today’s technologies exist, and then reasoning about and exploring these things again and again.

If the goal is to make business decisions and solve industry-chain problems like Teacher Wang Chao, then in addition to understanding the principles of technology, you also need to add some business principles. Since a business is essentially a commercial operating system, the deeper understanding required is of a different kind.

Teacher Wang Chao and I have similar growth experiences. We both started with technical architecture and then moved to application architecture. Along the way we both faced the same question: how do you broaden your technical range? My own strength seems to have changed from a point to a surface.

What I want to say is that this whole process is roughly the transition from programmer to good architect. Application architects should broaden their scope, because the essential difference between architects and programmers is global thinking. If you only write lines of code to accomplish one function, your thinking stays in that corner.

To think like an architect, we need to go higher: look at the whole project and the whole system from a global perspective, and ask where the boundaries of the task are, where the difficulties are, and where things should be decoupled, so that we can design the architecture well.

In addition, if we only stand at that height without deep, hands-on technical skill, we become technical product managers, who can design good products but need to find someone else to build them.

So architects must do a lot of the technical work themselves and be able to implement it themselves. There is too much technology to master everything. Broaden your surface, break through at certain points, and go deep on them; if you have both breadth and depth, you will be an excellent architect.

So what is the difference between a point architect and a surface architect? A colleague once encountered a strange phenomenon: a running process would suddenly disappear, which paralyzed the business on top of it. After it was restarted, it disappeared again, which was very scary.

A point architect may first check whether the application itself has a bug, because it is the application that suddenly disappears. A colleague with broader knowledge would suggest looking at the database code to see whether a database bug is the cause. When that turns out not to be the case, they try rebooting. While he was at a loss, an old hand reminded him to look at the kernel and check whether that kernel version had a problem, and that finally solved it.

If your knowledge is incomplete, your vantage point is not high enough, and your depth is insufficient, the technical architecture you design will only ever exist in slides. Today the basic technologies of architecture are flourishing, but the essence of architecture is to master both technical breadth and technical depth, one horizontal and one vertical, in order to get better results. This is also the core of high availability and of architecture design.

** Marvin Frost: ** I strongly agree with the saying “the ability to build is the ability to solve problems”. In my opinion, outstanding students must have a characteristic that they are curious about technology. When confronted with a problem, they will be very excited to think about how it is produced, how to solve it and what solutions are available.

This is also being problem-driven. Whether the problem was assigned by the boss or is one that others are working on, they are interested and eager to solve it. That drive helps their ability develop faster.

** Technology is a field, programming is a profession, and architecture is an ability; these are three different levels. Architecture ability cuts across many activities, such as writing code with extensibility and robustness in mind, plug-in design, and so on. Architecture is everywhere: system architecture, product architecture, organizational architecture, and business architecture.

One great benefit of a technical background is that you practice this kind of thinking from the very beginning. It is both a way of thinking and an ability, and it grows over time. Whatever you go on to do, you will need this structured thinking and ability, which is why it is called transferable.

We must cultivate and cherish architectural thinking and ability over the long term, in both depth and breadth. The process will surely bring new understanding and new growth, which in turn pushes our ability higher, forming a virtuous cycle.

05 Q&A

Q: In the monodb case, the system was crashing frequently. Apart from digging into the kernel to find the cause, what did you do?

** Wang Xiaobo: ** The system was crashing frequently and the problem could not be solved quickly, but the business still had to be kept running. I was in charge of the entire technical architecture, which included the operations and maintenance team, so the first thing we did when the failure occurred was to stop the loss.

Why stop the loss first? We had checked the usual suspects, such as whether traffic had increased, and nothing had changed, so we were facing an unknown phenomenon. In that situation the first step is to downgrade in order to stop the loss.

This is hard to do at the moment of the breakdown unless it has been prepared beforehand. That means knowing the whole system before it fails: architects and operations staff need to know in which scenarios each component they depend on can become unavailable.

At that time, we decided to downgrade the affected business, which kept the loss to the business at a minimum. The second thing, after downgrading, is to fix the problem quickly.

To sum up, preparation is very important in such a situation. For an infrastructure and operations team, the most important thing is to fully understand its own business, to have a contingency plan and a corresponding downgrade plan for every suspicious point, and then to rehearse them.
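The idea of preparing a downgrade plan per feature and rehearsing it can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system described in the talk: the `DegradeSwitch` class, its method names, and the `recommendations` feature are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class DegradeSwitch:
    """Hypothetical registry of prepared downgrade plans, one per feature."""

    fallbacks: Dict[str, Callable[[], str]] = field(default_factory=dict)
    degraded: Set[str] = field(default_factory=set)

    def register(self, feature: str, fallback: Callable[[], str]) -> None:
        # Prepared in advance: every feature gets a cheap fallback.
        self.fallbacks[feature] = fallback

    def degrade(self, feature: str) -> None:
        # Refuse to downgrade a feature that has no prepared plan.
        if feature not in self.fallbacks:
            raise KeyError(f"no downgrade plan registered for {feature!r}")
        self.degraded.add(feature)

    def restore(self, feature: str) -> None:
        self.degraded.discard(feature)

    def call(self, feature: str, primary: Callable[[], str]) -> str:
        # While degraded, skip the primary path and serve the fallback.
        if feature in self.degraded:
            return self.fallbacks[feature]()
        return primary()


switch = DegradeSwitch()
switch.register("recommendations", lambda: "static list")
switch.degrade("recommendations")  # stop the loss during the incident
print(switch.call("recommendations", lambda: "personalized list"))
```

Rehearsing the plan then amounts to flipping the switch in a drill and checking that the fallback path really serves traffic.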

Q: How do you ensure the high availability of load balancing in a cloud environment?

** Marvin Frost: ** Although I do not work on load balancing myself, I know something about it, so I can share what I know with this student.

For load balancing on the external network, there is BGP scheduling, which provides an end-to-end availability guarantee. For example, suppose your load balancer is located in Shanghai. If you access it from Guangzhou, traffic takes the nearest BGP link from Guangzhou to Shanghai. If the Guangzhou-Shanghai link breaks, Tencent Cloud adjusts the BGP routing and announces the route via Beijing, so traffic from Guangzhou goes to Beijing first and then on to your load balancer in Shanghai. The latency increases slightly, but availability is still guaranteed even when a link is broken.

The load balancers sit at the egress gateway of our cloud data centers and usually provide service as clusters of multiple machines. Tencent Cloud load balancing now supports migration: when a cluster has a fault or a latent problem on some machines, such as a memory issue, we migrate the load balancer from that cluster to another one. External services do not notice the migration, which keeps the load balancer highly available.

Q: Where can I learn real architecture techniques?

** Wang Chao: ** Learning architecture today is actually much easier than it was a decade ago. Tencent Cloud + community provides many materials, articles, and courses, and the courses are systematically organized for teaching.

Twenty years ago, there was almost no information available, and the only way to learn architecture was through open source projects. So the early architectural lessons were learned from the Linux kernel.

Now the bar is really low. There are many books that describe the principles, but you should also be able to drill down into the code logic: for example, the election strategy of a distributed system and why that strategy is used, the data consistency strategy, how strong or weak consistency is implemented, how data synchronization is implemented, and so on.

Q: If the system is faulty, how can I quickly locate the fault?

** Wang Chao: ** Locating faults quickly is an ability you have to train. If a process is killed, you should be able to trace it to a line of code through the core dump and source mapping, which is a basic skill of a good engineer.

Q: The dictionary contains personal address information. How do you prevent the database from being dumped and stolen?

** Wang Chao: ** The methods for preventing a database dump are all similar. Desensitize the data, and store personal information and addresses separately; when the data is used, fetch a key to join the pieces back together. The key should also be updated dynamically, so that each piece read on its own is only an unidentifiable data fragment.
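One possible reading of "store separately and join with a key" is sketched below. It is an assumption-laden illustration, not the speaker's implementation: the `SplitStore` class is hypothetical, the two dictionaries stand in for separate databases, and the link between them is an HMAC-SHA256 token derived from a secret key that can be rotated.

```python
import hashlib
import hmac
import secrets


class SplitStore:
    """Hypothetical sketch: names and addresses live in separate stores."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # assumed to live in a key service
        self.people = {}      # user_id -> name
        self.addresses = {}   # link token -> address (no user_id stored here)

    def _token(self, user_id, key=None):
        # Without the key, an address row cannot be tied back to a person.
        digest = hmac.new(key or self._key, user_id.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()

    def put(self, user_id, name, address):
        self.people[user_id] = name
        self.addresses[self._token(user_id)] = address

    def get(self, user_id):
        # Joining the fragments requires both stores and the current key.
        return self.people[user_id], self.addresses[self._token(user_id)]

    def rotate_key(self):
        # "Dynamic update": re-key every address row under a fresh secret.
        new_key = secrets.token_bytes(32)
        self.addresses = {
            self._token(uid, new_key): self.addresses[self._token(uid)]
            for uid in self.people
        }
        self._key = new_key


store = SplitStore()
store.put("u1", "Alice", "1 Example Road")
store.rotate_key()
print(store.get("u1"))
```

In this sketch, someone who dumps only the address store sees opaque tokens and addresses, and tokens captured before a rotation no longer match the stored rows.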

Q: What is the future of architecture evolution? Will it be any different in 10 years?

** Wang Xiaobo: ** This question is the reverse of our theme. We are talking about the evolution of architecture, which means summarizing the past from where we stand today. Essentially we do not know, because that time has not come yet. We can hold another meeting in 10 years to summarize it then.

In essence, the evolution of architecture will never stop, and it is hard to predict where it will go. The principles of architecture, however, will not change over the next decade.

It is difficult to say what architecture will look like in ten years, but the thinking behind it will not change in that time. It is still based on computer technology: know more technologies, understand them more deeply, and use newer technology to rework today's architecture so that it develops into the next generation. Think more about the business and its logic, then solve it by technical means. That will certainly not change.

Q: When the task is urgent and time is short, how do you do the architecture?

** Wang Xiaobo: ** As we mentioned earlier, architecture design depends on the right time and the right place; the whole context matters. For example, the boss thinks it should not take too long, because he needs to solve a business problem and just wants the code written. What should the architect do?

When I first started working as an architect, I really hated this, because I thought any technology or system should be well designed before work began.

As time went on, I found that the matter had to be reconsidered. For the business, tomorrow is the deadline. If we do not seize the market tomorrow, there may be no need to build it at all. Doing architecture well means being seriously responsible for the technology, observing the business carefully, and then determining the business logic.

So should we rush to meet the deadline, or spend the time on getting the technology right? In this case your own judgment is not necessarily correct. After all, you are a technical person and technology is your strength, while your boss may be a businessman who is more sensitive to the business, so here you should listen to the boss. That does not mean we simply pile up code, though. While meeting the business conditions, we should record the technical debt we take on and pay it off later. That may be a relatively acceptable outcome.

Technical people always feel torn on this question: when the schedule and the technology conflict, which should you choose? On more mature consideration, deferring to the business usually deserves more weight, because that is what actually solves the problem of survival. Code by itself does not print money. But once the survival problem is solved, the technical debt must be repaid.

Q: What technical literacy does an architect need?

** Marvin Frost: ** I think an architect should be a programmer first and foremost. An architect should not be someone who only draws up architecture and cannot write code. If the team is short of people when the architecture has to be implemented, the architect should be able to join in.

First you need to be able to write code across the whole technology stack, from bottom to top: the principles and implementation of the underlying operating system and kernel, then up to networking, database design, and database optimization. These are all things we run into as programmers.

Then there are distributed systems, big data systems, network communication frameworks, RPC frameworks, and message middleware, all of which are well worth mastering.

In addition, the currently popular cloud native and container technologies should also be mastered. The architect's primary ability is to solve problems: when your team has a problem, whether it is a technical problem, a business problem, or a problem of collaboration with another team, the architect should be able to stand on the front line.

I think having ideas, opening up the team's thinking, and getting the team to try solving problems are also part of an architect's technical literacy.

Q: If you build your own system, how can you expose its problems without users?

** Marvin Frost: ** Earlier I suggested that students build a system themselves for practice. If there are no users, you can be your own system's user: write a number of robot test clients that send requests to your system. That works fine.

We all have plenty of testing tools at our disposal. They differ from real users, but for putting traffic on your system and testing its availability, they are good enough.
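A robot test client of this kind can be very small. The sketch below is a hedged illustration only: `handle_request` is a hypothetical stand-in for a call into your own system (in practice it would be an HTTP or RPC request), and `run_load` is an assumed helper name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(n):
    """Hypothetical stand-in for one request to the system under test."""
    if n % 10 == 0:
        raise RuntimeError("simulated failure")  # every tenth request fails
    return n * 2


def run_load(target, total_requests, workers):
    """Send total_requests calls to target from a pool of robot clients."""

    def one_call(n):
        start = time.perf_counter()
        try:
            target(n)
            ok = True
        except Exception:
            ok = False  # count the failure instead of stopping the run
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(one_call, range(total_requests)))

    latencies = [elapsed for _, elapsed in results]
    return {
        "sent": len(results),
        "failed": sum(1 for ok, _ in results if not ok),
        "max_latency_s": max(latencies) if latencies else 0.0,
    }


print(run_load(handle_request, total_requests=100, workers=8))
```

Raising `workers` and `total_requests` increases the pressure, and the failure count and latency figures show where the system starts to struggle.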

Q: What competencies should be developed to become an architect?

** Marvin Frost: ** Here I will answer from another angle. I mentioned the technical literacy and problem-solving ability of architects earlier. The other angle is that an architect does not fight alone, but works with a group of people to solve a business problem.

This requires you to make your team effective in battle, and that is your competitiveness. Leading the team to solve business problems, persuading a sibling team to accept your plan, persuading the boss to invest the resources you need to complete the project: these are soft skills, and I think they are very important.

Q: Are the three teachers still hiring?

** Wang Chao: ** We look forward to technical partners from across the industry joining us and contributing to the new residential industry together. You are welcome to send your resume to [email protected]

** Wang Xiaobo: ** We are always looking for excellent programmers of all kinds, meaning programmers in the broad sense: architects, operations and maintenance, testing, and even product managers.

We have plenty of base locations to choose from. You can come to Beijing to enjoy the haze, or to Chengdu, the land of abundance. Our headquarters is in Suzhou, which is also a good place, where you can eat hairy crabs while writing code. Search for our jobs page on our domain, which is easy to remember: Ly.com.

** Marvin Frost: ** On my side, we need people for everything from low-level technology to upper-layer applications and backend development. When recruiting I look at a few things: if you have strong initiative, strong learning ability, and outstanding ability in some area, you are welcome to send me your resume: [email protected].