Abstract:

Chen Kangxian (also known as Long Long), a technical expert in the Taobao Technology Department, is the author of the book “Design and Practice of Large Distributed Website Architecture”. He has accumulated rich practical experience in distributed system architecture design, high-concurrency system design, system stability assurance and other fields.

“Design and Practice of Large Distributed Website Architecture”, written by Chen Kangxian, introduces the technical details involved in large distributed website architecture, including the implementation of SOA architecture, Internet security architecture, the infrastructure that distributed websites rely on, system stability, and massive data analysis. The book describes in depth the core principles of large-scale distributed website architecture design, and uses typical architectural design cases to help readers understand common scenarios and problems encountered in large-scale distributed website design.

The following is the text of the interview:

CSDN: Please first tell us about your current work and what technical areas you focus on.

Chen Kangxian: At present, I am in charge of the Ali Live broadcast platform within Taobao Game, including its overall technical architecture and business promotion. The Ali Live platform aims to provide a one-stop solution for live broadcasting, covering business function modules such as large-scale audio and video live broadcast, chat in large live rooms, bullet comments (danmaku), PPT teaching, and rewards for anchors. Large-scale live broadcasting is a very challenging business scenario. It needs to solve not only the problems around audio and video encoding and decoding, slicing, distribution, stability and playback quality monitoring, but also problems such as message sending, signaling and heartbeats under ultra-high concurrency, as well as UGC content filtering (images, text), multi-terminal compatibility, digital copyright protection and so on.

Because of my previous work, the things I have learned and come into contact with are fairly varied. I once worked on a cloud mobile phone mall. Because it was a heterogeneous system that needed cross-platform deployment, I studied application communication, routing, deployment, upgrade and migration under a heterogeneous SOA architecture. After coming to Taobao, I found that Taobao had corresponding middleware for all sorts of scenarios, so I spent a lot of energy getting familiar with how the various middleware works in distributed scenarios and where it applies, in order to improve the reliability of the system architecture and reduce workload. I spent a few months with the shop team, learning how a complex website system works, how pages are modularized and rendered, and how to improve performance through static content. Later I worked on a card product on the Alipay side. Because the data analysis platform was not so mature at that time, I studied online and offline data analysis in order to see the running state of the system and the business data. And because the page sat on the first screen of the Alipay client, the requirements for system reliability were very high, so I went on to study the principles, techniques and tools of system stability assurance. Most of the time, I am driven by demand to learn new knowledge, so that I can use it immediately after learning it. When I encounter a new technology, I first learn what it does and what scenarios it is suitable for, and then study it further once I have a specific business scenario. My current focus is mainly on the following areas: technical development and architecture in the audio and video field (such as digital copyright protection, codecs and video protocols, point-to-point communication), high-performance duplex web communication (communication protocol performance), and reliability monitoring of audio and video applications.

CSDN: Can you talk about your technical path?

Chen Kangxian: I majored in computer science in college, and through various coincidences ended up at Taobao. I am very interested in technology and enjoy the sense of achievement and satisfaction that comes from practice, so I chose this line of work quite naturally. The fun of being a “code farmer” is expressing your understanding of the world through code. Of course, the main reason is personal interest: I like tinkering with all kinds of things.

CSDN: Have you been working at Ali since graduation? Is it because of the technical culture or other reasons that you have stayed? Please also talk about what you have gained and experienced at work in the years since graduation.

Chen Kangxian: Ali faces some of the biggest challenges in China's e-commerce industry, and even in the global e-commerce industry. In both business scale and user scale it is an order of magnitude apart from other competitors, and behind this are thousands upon thousands of code farmers silently supporting it. It can provide scenarios and challenges that other places cannot, and this is perhaps the biggest reason I have stayed. Colleagues who join Ali must be skilled in certain areas, so working with each colleague is also a learning process. Working every day alongside industry experts, you come to look at problems more and more comprehensively and maturely. What I have gained is not only technical but also a way of thinking, along with great changes and growth in my outlook on life and the world. Large companies can gather talent, provide lots of learning opportunities and an atmosphere of communication, and encourage experience sharing and personal accumulation. In the long run, such an atmosphere also benefits people.

CSDN: You have done many projects at Ali. Which one impressed you the most or taught you the most? Why is that?

Chen Kangxian: I have actually gone through many stages and roles, and the experience and feelings differ in each. At the beginning I watched and did more; later I thought and designed more. The focus of each stage is also different, from researching difficult technical details to controlling the overall solution and its risks. A few things left a relatively deep impression. I remember several times receiving an alarm message at around two o'clock at night saying that the system was down, then getting out of bed to track down the problem and digging through logs and code. Once the problem was found it still had to be fixed, so I called the corresponding PE colleague, who had already gone to sleep and who also got up and stayed until the problem was repaired, without complaint. Ali colleagues are like this: online problems are dealt with immediately, and that professionalism definitely deserves admiration.

I am actually a rather lazy person who wants anything that can be automated to be automated. Once, a Q&A feature had just gone online, and Taobao sellers, being very clever, flooded it with all sorts of small advertisements for clothes and mobile phones. The end result was that the whole page became unreadable. The operations staff had to delete the posts manually one by one, and no matter how fast they deleted, they could not keep up with the posting. I am the kind of person who likes to meddle, and it occurred to me that a Bayesian algorithm could solve the problem, so I spent one or two weeks of spare time and weekends developing an anti-spam system. The point is not how impressive the algorithm is, since Bayesian anti-spam is nothing new. The point is that I did it on impulse and it turned out to be unexpectedly effective. I have actually done many things like this, but this one is the most representative.
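To illustrate the kind of Bayesian filtering mentioned above, here is a minimal naive Bayes sketch. It is not the system described in the interview: the class name, the whitespace tokenizer, the Laplace smoothing and the training phrases are all assumptions made for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Hypothetical word-level naive Bayes classifier with Laplace smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label must be "spam" or "ham"; tokenization is a plain whitespace split.
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def _log_score(self, words, label):
        # Assumes at least one training example exists for each label.
        total_docs = sum(self.doc_counts.values())
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)  # log prior
        for word in words:
            # Add-one smoothing so unseen words do not zero out the probability.
            count = self.word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        return score

    def is_spam(self, text):
        words = text.lower().split()
        return self._log_score(words, "spam") > self._log_score(words, "ham")


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap phone sale click here", "spam")
spam_filter.train("buy cheap clothes sale", "spam")
spam_filter.train("does this size fit well", "ham")
spam_filter.train("how long is shipping", "ham")
```

A real filter would need proper word segmentation for Chinese text and a much larger labeled corpus, but the scoring logic stays the same: compare the log probability of the message under each class.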

Another impressive project was the live broadcast for last year's Double 12 (12-12) promotion. Because of an earlier job change, I had been on a new team for only a few days when I was given a task: design a live broadcast system that could support ten thousand people online at the same time, with XXX XXX anchors pushing streams at the same time. It was the core gameplay for the games business on Double 12, but nothing like it had been done for previous Double 12 events, so there was no experience to draw on, and the event was only one or two months away. So we simply started: solution design, resource coordination, capacity assessment, stress testing. It involved the cooperation and coordination of N teams, and we ran into all kinds of pitfalls in capacity and technology. We worked overtime for one or two months in a row and everyone was very tired. It can be said to have been the most challenging project I have done: everything had to start from scratch, and the launch date could not be pushed back. Thankfully, we finally solved all the problems, and the overall performance on the day was relatively stable, though not perfect. Of course, none of this came down to luck. It was inseparable from the early planning and the efforts of my partners, and I gained a lot from it.

Of course, whatever your role, doing the job at hand well is your duty, but don't be limited by your current role. Learn about what the people around you are doing, understand the design ideas behind the systems, and ask a few more whys: why is it designed this way, what is the advantage of this design, is there a better solution? If you learn actively, you gain more.

CSDN: The Internet is developing with each passing day, and technology is changing constantly. As a technician, what learning methods or skills can you share when new technologies come?

Chen Kangxian: Technology always goes from nothing to something, and from something to excellent. Even a technology that subverts an entire industry is a baby in swaddling clothes at birth and needs constant improvement. When studying a technology, therefore, you must first grasp its context and ideas. Learning from the source code is also very important: as the saying goes, in front of the source code there are no secrets. You should know not only the what but also the why, and then things become easier. The development of a technology is usually evolutionary: the initial concept is put forward, then comes a prototype, then industrialization, and finally it is widely recognized by the industry and used on a large scale. So once you understand the operating mechanism of the technology, namely its principle, value and application scenarios, it is easy to learn it feature by feature. Of course, many things come from the other side of the ocean. From the introduction and application of a technology to the appearance of related articles and books there is a certain lag, and domestic translations of books take even longer, so you have to get used to reading English documentation.

Of course, the most important thing is to understand the core essence of a technology: its principle, what problem it solves, and what kinds of scenarios it suits. It also matters how active the relevant technical community is and whether the technology is likely to become mainstream in the future. This is very important because, in general, there may be many solutions to problems in the same field, and the option you select will affect you and your team for a long time. If you choose a technology that is not mature, or one with a less active community, you will have to spend more time solving problems in the production environment. Once you understand these things, learning actually becomes easier. After a technology comes out it is constantly improved and given new features, but these all add bricks and tiles on the original foundation. When you understand the nature of the technology, understanding these improvements and new features is relatively easy.

In addition, do not blindly chase what is new. Popular is not necessarily good; what suits you is best. Don't learn A because A is hot, then B because B is hot, and then decide that C looks good. Learning has a cost, so spend time in the right place and persist. Introducing a new technology also requires considering its surrounding ecosystem and whether its community is mature. Otherwise, developing the various middleware and tools yourself will be more than enough trouble, and when the fad fades you are left with a mess.

CSDN: What other interests do you have besides programming/development? What is your current daily rhythm?

Chen Kangxian: Apart from system design, programming, development and various meetings, my spare time is very limited, and that limited time is generally split up. Writing and preparing PPTs take up part of it, because the experience gained and the pitfalls encountered at work are a valuable asset in life. As time passes, it becomes hard to recall what project you did three years ago, what code you wrote, and what you learned. Therefore, daily review and reflection are important.

Another part goes to looking after my health. After all, the body is the capital of the revolution. The body is our own, and once there is a problem with our health, any success becomes meaningless. Another part goes to reading and studying. Technology develops very quickly, and if you don't keep learning you may fall behind, and outdated ideas will eventually be reflected directly in the systems you design. Reading also makes your knowledge more comprehensive and your vision broader, which in turn widens your thinking when solving problems and makes you more creative. Reading is a very good way of learning, because everyday fast-food learning easily gets lost in details, misses the overall picture, and leaves knowledge unsystematic, so you can also use reading time to organize your knowledge. Because I like music, I spend part of my time searching for various popular songs, light music, piano, violin and so on. Music puts my brain in a relaxed state. The rest goes to accompanying my family, traveling and so on.

Master the three levels of knowledge and technology

CSDN: You previously published the book “Design and Practice of Large Distributed Website Architecture”. Could you share the reasons, process, difficulties and insights of writing the book? Please also introduce the features of the book.

Chen Kangxian: Actually, it was a coincidence. I remember that in April 2013, Ms. Dong Ying from Broadview (博文视点) came to me and asked whether I was interested in writing a book on distributed systems. I had already foreseen that it would be difficult, but I still accepted the project without hesitation. Mainly I thought that writing a book is a noble thing and a modest contribution to the development of Internet technology. From another point of view, it was a rare opportunity for a comprehensive review of what I had learned before. For some knowledge points and details, I had learned the what without knowing the why, or had never verified them personally, so I could take this opportunity to understand them deeply and dig into them.

Personally, I think there are three levels of mastering knowledge or a technology. The first is being able to apply it in a project; the second is being able to write up the experience for everyone's benefit; and the third is being able to explain it clearly on any occasion, at any time and place.

In fact, the process of writing was far more difficult than I had imagined. Working in an Internet company is itself quite tiring. Sometimes I had to work overtime until 9 or 10 o'clock at night before going home, and then still set aside one or two hours to write. The most painful part was the context switching, from work to writing and from writing to life. Writing also needs inspiration: sometimes it took ages to squeeze out a few words, while on the way to work the ideas might come gushing out. Most of the time I had to wait until late at night before I could completely settle down to write.

The more than a year spent writing this book can fairly be described as an ordeal. I thought of giving up many times, and doubted and struggled with myself. I can only say thank heavens, and thank all the people who accompanied and supported me. From the beginning, the book was not positioned as something lofty. The hope was that readers in different positions could all learn something from it, so writing the content was also a process of summarizing for me. Of course, every time I look back, the feeling of writing timidly, as if treading on thin ice, is still there. Writing is a serious matter, and every time I wrote I worried about whether I would mislead readers because of my own misunderstanding. From the current perspective this book is far from perfect, but it is impossible to keep writing a book endlessly. Imperfections are inevitable, and accepting flaws is sometimes painful.

Avoiding failure is at the heart of all engineering

CSDN: What is your personal understanding of architecture/software architecture?

Chen Kangxian: The following is just my own limited understanding. Architecture is a fusion of ideas: not only a fusion of technology but also a fusion of art. A good architecture does not just stay in the technical documentation; it is constantly revised and adjusted in practice. This places higher demands on the architect, because staying at the abstract and conceptual stage is of little value. The devil is in the details. For architectures that look simple at the abstract level, the biggest challenge often comes from the details. These include the visual and interaction details of the product features, the logic of business rules such as risk control, and unpredictable technical pitfalls. When a specific technical solution is implemented, it may take a lot of time and effort to solve and avoid the problems that may occur in extreme situations.

An architecture should meet the needs of business development for a period of time, but how long that period should be is debatable: some say three months, some half a year, some a year, and some three years. Different people in different environments may understand the question differently. For a startup or an experimental business, which is precarious and may well not survive, the priority is the business model rather than the technical architecture, so the architecture should be as simple and as easy to build as possible. It is hard to say whether the business will even exist in three months, so thinking about the architecture for three years from now is basically fantasy. For a more mature business, or when refactoring a stable business system, you need to take a longer view and avoid some problems that may be faced in the medium to long term, such as the number of database shards and tables, ID length, and the sharding dimension.

Another point is that the system needs to be extensible. The design should include certain extension points, so that a slight change does not require rebuilding the system: open for extension but closed for modification. This is easy to understand: modifying the original system, rather than extending it, is more likely to introduce new problems and also brings more testing effort. Over a period of time an architecture often evolves from clear, to fuzzy and chaotic, to clear again after refactoring, and then blurs once more, because the market environment is always changing rapidly. Designing the system to be open for extension but closed for modification therefore lets new processes be added conveniently and promptly without affecting existing ones.
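One common way to provide such extension points is a handler registry, sketched below. The order types, function names and return strings are hypothetical and only illustrate the principle: a new process is added by registering a new handler, while the dispatching code stays untouched.

```python
from typing import Callable, Dict

# Hypothetical extension point: handlers keyed by order type.
_handlers: Dict[str, Callable[[dict], str]] = {}


def register(order_type: str):
    """Decorator that plugs a handler into the extension point."""
    def decorator(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        _handlers[order_type] = fn
        return fn
    return decorator


def process(order: dict) -> str:
    """Core flow: closed for modification, it only dispatches."""
    handler = _handlers.get(order["type"])
    if handler is None:
        raise ValueError(f"no handler registered for {order['type']!r}")
    return handler(order)


@register("physical")
def ship(order: dict) -> str:
    return f"ship order {order['id']}"


# Added later as an extension, without editing process() or ship().
@register("virtual")
def deliver_code(order: dict) -> str:
    return f"email code for order {order['id']}"
```

Because existing handlers are never edited when a new order type arrives, regression testing can concentrate on the newly registered handler.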

From a macro point of view, systems should not stand apart from each other like isolated chimneys (silos), but should be like the high-rise buildings in a city, which are connected by roads. Therefore, to build faster, existing infrastructure and existing middleware should be fully used to reduce the cost and risk of system construction.

Architectural design has several levels: having no architecture is itself an architecture, focusing on solving existing problems is an architecture, and a good architecture is one that both constrains developers and frees them to focus on functional design. Try to make complex things simple, and do not make simple things complicated. Technology is never for showing off, but for solving practical problems. Avoiding failure is at the core of all engineering, and architecture is also a technique for mitigating risk.

Distributed architecture vs. centralized architecture systems, and thinking

CSDN: Distributed system architecture is a very broad concept. What are its characteristics, and when should a website adopt a distributed architecture? What other scenarios does it apply to?

Chen Kangxian: Distributed architecture actually addresses the bottlenecks that a centralized architecture meets when its capacity needs to scale further. These bottlenecks involve resources, operations, development, maintenance and so on. Single-machine hardware is limited by the technology of the day, and when scaling up, cost may grow not linearly but exponentially. Distributed architecture instead scales out through clusters of a large number of cheap PC servers. In addition, as the development team grows and the business becomes more complex, a distributed, service-oriented approach lets the system be decomposed better, so that more teams can work together more efficiently.

However, from another perspective, distributed architecture is a complex kind of architecture. Many problems that are minor under a traditional architecture become prominent in a distributed environment, or even critical: data consistency; network communication, serialization and latency; and how to cope with failure. Under a traditional architecture, data consistency is largely handled by database transactions, whereas in a distributed environment it becomes a very complicated problem. Distributed architecture also makes network communication inside the cluster more frequent, so communication protocols, serialization, latency, fault tolerance and performance all become more complicated. Failure becomes the norm in a distributed environment, and how to deal with failures is a very complex problem. A mature distributed architecture depends on a lot of infrastructure, from the various middleware to automated operations, monitoring and disaster recovery systems, all of which take time to accumulate and need continued investment. So when considering a distributed architecture, you also need to weigh input, output and return comprehensively. A startup first needs to know what problem it is solving, and then think about what kind of architecture to adopt. Blindly copying the architecture of large companies may make the system overly complex, and the company may lose the quick, flexible responsiveness of its early days and with it its competitiveness.

Website architecture as I understand it

CSDN: What are the ideas and principles of large website architecture design?

Chen Kangxian: In fact, it is difficult to say that there is a unified idea or design principle that is universally applicable, because everyone's understanding and concept of design are different. Personally, I think designing a complex large-scale website is actually a process of divide and conquer:

First of all, we need to fully understand the business, understand the requirements, understand the primary problems to be solved at present and what the possible risks are, and then break down the goals to carry out specific technology selection, model design and architecture design. If the core problem is high concurrency, you can use various caches (local cache, distributed cache) to increase query throughput, although to some extent this sacrifices data consistency, moving from strong consistency to eventual consistency. If data consistency is not the core problem to be solved, it can be given lower priority for the time being.

If, conversely, the core problem is data consistency, as in trading systems, then consistency cannot be emphasized too much. In a distributed environment, to cope with highly concurrent writes and massive data storage, relational databases often need to be split into multiple databases and tables, which brings great challenges to data consistency. The strong consistency guaranteed by single-database transactions now has to be upgraded to distributed transactions across databases, committed through two-phase or three-phase commit protocols. Because of the cost of multiple rounds of network communication between the transaction manager and the resource managers, throughput and efficiency can hardly meet the requirements of high-concurrency scenarios. For a trading system this is a hard problem that cannot be avoided, so people have come up with many workarounds. One is to guarantee delivery through a reliable messaging system, turning synchronous calls into asynchronous ones. This introduces new problems, however: the messaging system can ensure that messages are not lost, but it can hardly guarantee message order or avoid duplicate delivery. As a result, the receiver of the messages must process them idempotently and de-duplicate them.
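The receiver-side idempotence described above can be sketched roughly as follows. The message shape (an `id` plus an `amount`) and the in-memory set of processed IDs are assumptions for illustration; a real system would keep the processed IDs in durable storage and update them in the same transaction as the business data.

```python
class IdempotentConsumer:
    """Hypothetical message receiver that tolerates duplicate delivery."""

    def __init__(self):
        self.processed_ids = set()  # assumption: stands in for a durable dedup table
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self.processed_ids:
            return False  # already applied, so redelivery has no effect
        self.balance += message["amount"]
        self.processed_ids.add(msg_id)
        return True


consumer = IdempotentConsumer()
payment = {"id": "msg-001", "amount": 100}
first = consumer.handle(payment)
second = consumer.handle(payment)  # simulated duplicate delivery
```

The unique message ID is what makes de-duplication possible, so the sender has to generate it once and reuse it on every retry.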

Personally, I advocate the KISS principle proposed by Kelly Johnson, the famous Lockheed aircraft designer: an architectural design should be simple whenever it can be rather than complicated, and any fancy design should be cut decisively. Scenarios that may not appear for three years, or may never appear at all, make the system very complex if they are built into the current design. Sometimes something that looks simple and easy to introduce causes a series of unnecessary troubles in the actual implementation.

Another point is to be careful about introducing untested new technologies and new concepts. They should be adopted at scale only after full verification. New technologies and new ideas are always tempting. Being careful does not mean being conservative: technology is always progressing, and there is nothing wrong with embracing change. But introducing immature technology may seem to bring short-term benefits while its risks or costs far outweigh them.

CSDN: When designing a large website architecture, what aspects should be considered? What do you need to know about server and storage deployment?

Chen Kangxian: Designing a large-scale website is a very complex problem, and many issues need to be considered. Take massive data storage: storage is divided into online and offline storage, relational and non-relational database storage, and persistent and in-memory storage, and the architect has to choose according to the specific scenario.

Where concurrency is high and some data loss is acceptable, in-memory storage can be used. Where the query condition is simple and data only needs to be looked up by primary key, key-value storage can be chosen. For massive numbers of documents and pictures, a distributed file system together with CDN edge nodes solves the storage problem and also separates hot and cold data, while the edge nodes improve the efficiency of user access. If complex multi-dimensional queries are required, a relational database is needed. When data volume and concurrent write requests are large, the database must be split into multiple databases and tables. This splitting restricts the dimensions along which data can be queried, because query conditions must include the sharding key. If more than one query dimension is needed, you need to use a data synchronization tool to build a copy of the data organized along another dimension, or build a vertical search engine to provide multi-dimensional queries. In many cases it is difficult to solve all problems with one storage tool, so multiple storage systems are needed to improve efficiency and user experience.
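The restriction that queries must carry the sharding key can be seen in a small sketch. The shard count, the CRC32-modulo routing and the order fields below are assumptions for illustration, not a description of any particular middleware.

```python
import zlib

NUM_SHARDS = 4  # hypothetical number of database shards


def shard_for(user_id: str) -> int:
    """Route a row to a shard by hashing the sharding key (user_id)."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


class ShardedOrders:
    """In-memory stand-in for an order table sharded by user_id."""

    def __init__(self):
        self.shards = [[] for _ in range(NUM_SHARDS)]

    def insert(self, order: dict) -> None:
        self.shards[shard_for(order["user_id"])].append(order)

    def by_user(self, user_id: str) -> list:
        # Query carries the sharding key: only one shard is touched.
        shard = self.shards[shard_for(user_id)]
        return [o for o in shard if o["user_id"] == user_id]

    def by_seller(self, seller_id: str) -> list:
        # No sharding key: every shard must be scanned (scatter-gather).
        return [o for shard in self.shards for o in shard
                if o["seller_id"] == seller_id]


orders = ShardedOrders()
orders.insert({"id": 1, "user_id": "alice", "seller_id": "shop-a"})
orders.insert({"id": 2, "user_id": "bob", "seller_id": "shop-a"})
orders.insert({"id": 3, "user_id": "alice", "seller_id": "shop-b"})
```

The cost of `by_seller` grows with the number of shards, which is why a second copy of the data keyed by seller, or a search index, is usually built for that dimension.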

Another example is application deployment, which has moved from centralized architecture to distributed architecture, to service-oriented SOA, and then to the currently popular microservice architecture. Load balancing devices at the access layer solve the scaling problem of stateless web applications, while soft load centers solve the problems of service discovery and service routing. Lightweight virtualization and the emergence of Docker make it easier for the microservices concept described by Martin Fowler to become a reality. Of course, large websites also need to consider how to ensure the availability and data integrity of the whole site in case of force majeure in a certain region, for example through same-city disaster recovery and remote disaster recovery (such as two sites with three data centers), as well as multi-site active-active deployment, which is currently a hot topic.

A large website architecture is usually not built overnight. It is driven by demand and evolves step by step over years. At different scales, in different periods and at different stages, the business faces different needs and different core problems, which leads to different architectures at different stages. The architecture evolves along with the business.

CSDN: What are the typical failures of large websites, and what are the common solutions or optimization recommendations?

Chen Kangxian: A website of a certain scale may experience faults every day; a fault may simply have been repaired before most people perceived it. Some failures are caused by changes: a change in business logic where the dependencies were not sufficiently tested, an incompatible serialization format after a version upgrade, a program bug not covered by test cases, or a conflict between classes with the same name in different versions of a jar package. Others are caused by load: too much traffic causing logs to fill the disk, excessive machine load causing many threads to block, heavy lock contention causing processes to hang, database connection pools being exhausted, overly frequent JVM GC, and so on.

Failures may also be due to the physical environment, such as a network cable being pulled out, an optical fiber being cut, a power outage in the machine room, or hardware damage. The causes of faults can be all sorts of strange things, and it is difficult to enumerate them one by one. For faults caused by changes, what you can do is make test cases cover every detail as comprehensively as possible, including dependencies, consider risks more thoroughly at the project design stage, and release according to the process. But do not give up eating for fear of choking: making the release process heavy and rigid slows the response to new business needs. Getting this balance right is actually difficult.

In addition, you need to establish a complete monitoring system, including collection and analysis of exception logs, checking of business process links, detection of the running status of all machines (load, QPS, disk, memory, network, capacity watermark), analysis of historical data, and alarms on anomalies. For a service-oriented architecture, you also need to improve service governance, including management of weak dependencies, call relationships (who calls whom, and who is called by whom), call frequency, and abnormal status. This is systematic work and also relatively basic work, but with it in place you can perceive abnormal system states in time, and locate and repair problems promptly.

CSDN: When building a website, you can choose open source or develop in-house. With the former, you may have to start over if the open source solution you chose cannot meet certain needs in the future, while the latter looks like reinventing the wheel. What do you think about that?

Chen Kangxian: Dialectically speaking, using open source can save a lot of work, but there are also potential risks, especially with technologies that have not been widely verified. Even those that are widely used, such as Struts and SSL, sometimes turn out to have shocking vulnerabilities. For small companies, open source technology makes it possible to build a usable website quickly, and even if problems are encountered while using open source software, the switching costs may not be very high. For a big company, however, switching costs may become very high: the business and its dependencies are too complicated, and once something is widely used, the range of impact can be very wide. Therefore, we have to be very careful before making a choice. Of course, there are also special needs that open source software cannot meet, or areas where we have moved ahead of open source, and there the tools and middleware have to be developed by ourselves.

In addition, some features of open source software fall far short of our expectations, and its architecture may not be easy to extend, even though those features are critical to us. Take Hadoop: its MapReduce, HDFS, and Hive provide a complete solution for massive data analysis, but its low-level permission control was weak, so we had to put a lot of effort into developing an alternative, and a lot more into migrating data and jobs from Hadoop to the new platform. When selecting open source technology, we prefer software that is relatively mature and has an active community, ideally with successful cases at considerable scale, because the risk is smaller. After all, for a mature e-commerce site, stability outweighs everything. System downtime is directly tied to transaction volume, and every minute that passes is real money.

For open source technologies introduced into core applications, we also spend a lot of effort understanding them in depth and fixing bugs, to avoid falling into pitfalls.

Another point is that many scenarios in large companies are very specific, such as MySQL row locking under high concurrency, or JVM memory reclamation when large objects stay resident in memory, and some software sacrifices performance in specific scenarios in order to meet general requirements. For us, once we understand this, there is room for optimization, whether by working around the issue in our own implementation or by modifying the open source software. The premise for doing either is understanding the open source software.

CSDN: The problem most websites face is load: when the number of users grows, slow response becomes the main problem to solve. What is your suggestion?

Chen Kangxian: Compared with traditional enterprises, most Internet companies face a major challenge: as the user base keeps growing, pressure on the system grows with it. In the start-up stage, the system architecture is often very basic, or there is no architecture at all. The team iterates rapidly and puts the business first, and the business itself is often changeable under the influence of the market. As a result, the business logic becomes highly coupled, the system is not extensible, and the code structure is bloated. At that point it has to be refactored.

“Distributed” is the core idea for handling heavy traffic. First of all, the system must be ready to scale horizontally, especially in its model and data, because splitting data, expanding capacity, and migrating data are the most troublesome and time-consuming tasks. The slightest carelessness can lead to inconsistent data, and the damage may be irreversible, so the design must be cautious. While data is being split and migrated, a steady stream of new data is still being written and old data still faces highly concurrent updates. That is why the industry often compares data splitting and expansion to changing the engine on a plane flying at high speed. It is the most complex, most technically demanding, and most challenging task in the whole scaling process.
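To make the cost of data splitting a little more concrete, here is a minimal Python sketch of hash-modulo shard routing and of which keys must move when the shard count doubles. The helper names `shard_for` and `keys_to_move`, and the use of integer user IDs as shard keys, are assumptions for illustration, not a scheme described in the interview.

```python
def shard_for(user_id, shard_count):
    """Route an integer key to a shard by simple modulo."""
    return user_id % shard_count


def keys_to_move(user_ids, old_count, new_count):
    """Return {key: (old_shard, new_shard)} for keys whose shard changes."""
    moves = {}
    for uid in user_ids:
        old, new = shard_for(uid, old_count), shard_for(uid, new_count)
        if old != new:
            moves[uid] = (old, new)
    return moves


# Doubling from 4 to 8 shards: a key either stays put or moves to
# shard (old + 4), so each old shard splits into exactly two.
moves = keys_to_move(range(16), 4, 8)
```

Doubling keeps the migration plan predictable, since each old shard only ever feeds one new shard. The hard part stressed above, copying rows while new writes and concurrent updates keep arriving, is not shown here.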

Then there is the centralized application that holds the business logic. At first the team is small and the business volume is small, so a lot of code and business logic may be piled into a few applications. As the company grows, the business develops rapidly, and the team gets bigger, maintaining a centralized application becomes very difficult, for both development and deployment. I once worked on an application where, after changing a few lines of code, compiling and packaging locally took 10 minutes and local deployment took more than ten minutes more, which greatly reduced development efficiency. Such a large project also takes up a lot of server resources, and the hardware of a single machine cannot be upgraded without limit, which becomes another problem. Moreover, coupled business logic is hard to reuse, so wheels get reinvented everywhere and resources are wasted. This leads to another area: turning capabilities into services within the enterprise.

SOA architecture, including the currently popular concept of microservices, solves the problem of reusing resources within an enterprise, avoids information islands and reinvented wheels, improves system maintainability, reduces the cost of business trial and error and of system construction, and enhances the enterprise's competitiveness. A unified communication standard for SOA, covering communication protocols and serialization and deserialization methods, simplifies the implementation of the architecture. Automatic service registration, routing, and soft load balancing reduce operations costs. As the number of services grows, governing them by hand becomes harder and harder, which gives rise to service governance systems that manage service calls, dependencies, exceptions, and other information in a unified way. Once applications are stateless, scaling out becomes very easy: with load-balancing hardware and software, or the soft load mechanism of SOA, machines can be added or removed as needed, and this scaling capability is almost linear (up to a certain scale). Of course, for most of these scenarios and technical solutions, foreign companies such as Yahoo, Google, Facebook, LinkedIn, Twitter… were the pioneers. The well-known domestic Internet companies, including BAT, have largely been later followers walking in the footsteps of their predecessors, which greatly reduces the architectural risk.
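The registration and soft load ideas above can be sketched in a few lines of Python. This is only an in-memory toy under stated assumptions: the class name `ServiceRegistry`, its methods, the service name `item-service`, and the addresses are all hypothetical, and a production registry would also need health checks, persistence, and change notification.

```python
class ServiceRegistry:
    """In-memory registry: providers register addresses, consumers look them up."""

    def __init__(self):
        self._providers = {}  # service name -> list of addresses
        self._cursors = {}    # service name -> round-robin position

    def register(self, service, address):
        providers = self._providers.setdefault(service, [])
        if address not in providers:
            providers.append(address)

    def unregister(self, service, address):
        providers = self._providers.get(service, [])
        if address in providers:
            providers.remove(address)

    def choose(self, service):
        """Pick the next provider in round-robin order (soft load balancing)."""
        providers = self._providers.get(service)
        if not providers:
            raise LookupError(f"no provider registered for {service!r}")
        index = self._cursors.get(service, 0) % len(providers)
        self._cursors[service] = index + 1
        return providers[index]


registry = ServiceRegistry()
registry.register("item-service", "10.0.0.1:8080")
registry.register("item-service", "10.0.0.2:8080")
picked = [registry.choose("item-service") for _ in range(3)]
```

Because callers ask the registry for an address on each call, adding or removing a stateless instance takes effect without any change on the caller's side, which is what makes scaling out cheap.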

Architect skills and qualities: should architects write code at all?

CSDN: What are the skills or qualities required to be an architect?

Chen Kangxian: This is only my personal view. Designing a system that meets the requirements is the architect's basic skill: functionality, availability, scalability, capacity, the team, project execution risk, and the running environment all require comprehensive consideration. An architect's ability shows most in the integrated use of technology, so the understanding of the technical details a project needs must be comprehensive. Only then can the most appropriate technology be used where it is most needed. It is also necessary to be forward-looking about technology, to discover potential risks through experience and accumulation, and to see beyond the surface of a problem. Logical thinking and abstract thinking are important qualities of an architect.

Of course, an architect also needs another very important skill: thorough communication. Completing the system design is only the first step of a long journey. The design ideas need to be fully conveyed to the team, and feedback from the team is needed to adjust the plan and keep improving it. Only after everyone on the team understands your design can the follow-up work, including the implementation of the project, go smoothly. The devil is in the details, and the subsequent execution may run into all kinds of problems. Where plans have to be adjusted, communication and coordination are inevitable, and the architect needs to be well prepared for that. A good architect can lead the team through every obstacle, while an incompetent one will ultimately leave the team mired in conflict. For collaborating teams, it is necessary to identify possible risks, including interfaces, time nodes, compatibility, and technical problems that integration may encounter. The architect must be aware of the risks that may arise, prepare in advance, and deal with them calmly.

Linus Torvalds says,

Talk is cheap, show me the code.

But what I’m trying to say is,

Talk is not cheap, talk is important too!

Many people ask whether architects should write code at all. Personally, I think architects need to write code, but their time is limited. The larger the project, the more details need to be considered, and the more time that naturally takes. On top of that, as an architect you need to spread the word: tell everyone what the architecture looks like as you understand it, how to build it, what the core goals are, and what the core risks are. Architects also need to coordinate all the related systems they depend on: tell others what you want to do, how you need them to cooperate, and why it is valuable, so that they are willing to cooperate with you. This also takes a lot of time. So, in the little time that is left, the architect may not be able to write much code. But to keep the system you design from being divorced from reality, you must write code and review key code, make sure the overall architectural approach is followed, make sure your design is easy to implement, and make sure potential risks are properly controlled. Especially when new technologies are introduced, prototyping is a necessary step. There is a view that the architect must be the one who writes the most code: if one person writes all the code, there is naturally no need to align everyone's thinking. In practice that is very hard to do. As an architect you have to remember that you are not fighting alone, and you must not let yourself become the bottleneck of the team. However, I also disagree with the idea that architects should not code. Without hands-on experience, some risks are hard to judge in advance. Besides, technology itself keeps developing, and today's experience may not be effective tomorrow. Coding is a programmer's most basic skill, and it is the most direct way for you to learn and accumulate.

The architect is also an ordinary person with only 24 hours a day. A lot of that time goes into project design, thinking through whether the technology is sound, and testing and verifying prototypes. A lot more goes into communicating the design ideas and goals to the team: why this design, what its advantages are, and what problems would arise without it. People are among the most complex of creatures, and programmers are very individual and very clever, so unifying ideas and goals is a very difficult task. There are a thousand Hamlets in a thousand people's eyes. Likewise, there may be different ways to do one thing, and too many ways is not a good thing. Sometimes, as an architect, you need to find the most appropriate method and get everyone to agree to it. This is not a process of turning personal goals into team goals. It is the solution best suited to the current business scenario, found with everyone's full participation through continuous communication, improvement, and evolution.

CSDN: What are your plans and expectations for the future?

Chen Kangxian: The path of technology is destined to be a path of learning. Most of a coder's life is dull, and this is an industry of constant learning, where technology is upgraded very quickly. The frustration of failure, the helplessness of long contemplation, the joy of success, all the bitter and the sweet: I think only real coders can experience them.

In the near term, I should continue to focus on live broadcasting. Ali has accumulated rich experience in e-commerce, but live broadcasting is still a field that has yet to mature. There is plenty of room for improvement, and the technical challenges are bigger. I hope to do something to lower the threshold of live broadcasting, reduce resource consumption, and improve service stability. As the classic line of Yuri Orlov in the movie “Lord of War” goes, people always want to do something in their lives; they just have not yet figured out what. Of course, I would not peddle munitions like Yuri. Society develops and changes quickly, and it is hard to predict what you will be focusing on five years from now. However, as a person who loves technology and likes to dig deep, I should still be doing something related to technology and engineering.

CSDN: Finally, what would you like to say to the readers of this article?

Chen Kangxian: When I was looking for work at graduation, I honestly never thought I would do one job for such a long time, and I may well continue doing it for a long time to come. The first job after graduation is actually very important, because once you start working you are no longer a blank sheet of paper. When you look for a job again later, your previous work experience will be a very important reference. Your first job will have a big impact on the overall direction of what you do next, and making a transition later can be a lot harder and more risky.

First of all, you have to know what kind of work you like to do. When you do what you like, you are more willing to put in the effort, and it does not feel hard or painful; you enjoy it. Work is a long-term thing, so it is worth thinking carefully about what you like to do.

The other thing is to look at the room for growth and future development. Put a drop of water into a glass of milk and the water becomes milk; put a drop of milk into a glass of water and the milk becomes water. Quite possibly company A offers you 1 to 2k more, while company B can give you a bigger stage on which to perform and grow, along with a systematic training and growth program, and you would be surrounded by strong industry peers. The choice you make at that point tests your wisdom.

In the long run, a short-term difference of 1-2k really does not count for much, but what you lose may be room for long-term development. Some people say, "Doesn't a higher salary show that the company values you more?" That is true. But suppose you go to a company with no room for advancement, or to one in a sunset industry. Maybe you really are very good, and maybe your level already represents the highest level in the company. Then where is your room to grow? This is really the question of whether to be the head of a dog or the tail of a lion. If you choose a company with more room and more potential, you may not stand out in the team at first, but after a few years of hard work, when you compare yourself with your peers, the difference will be very large. Such a company will also have a mature, relatively fair evaluation mechanism. Generally speaking, people who are good enough to create more value for the company will be rewarded accordingly, so there is no need to worry about the return. In fact, HR is not stupid. What you get is the sum of financial return and growth space, and with the two parts combined, most companies should offer about the same price.

Choice is very important, and persistence is sometimes just as important. Be able to settle down to your work, and do not be afraid of difficulties. Life is like riding a bicycle: to keep your balance, you have to keep moving.


The original post was published on March 1, 2018

Author: Chen Kangxian

This article is from the cloud community partner “Architecture House”. For related information, you can follow the WeChat official account of “Architecture House”.
