Author | White MANna

At the 2021 Apsara Conference in Hangzhou, Li Guoqiang, head of product for the cloud native application platform at Alibaba Cloud Intelligence, gave a comprehensive overview of Alibaba Cloud’s cloud native product innovations in a talk titled “Transformation of Enterprise Internet Architecture: Upgrade and Release of Alibaba Cloud Middleware”. Over the past year, restructuring application architectures has become a trend as industry competition intensifies. According to the data cited, more than 80% of users have adopted or plan to adopt microservices, more than 68% of organizations run containers in production, and more than 85% of users use distributed tracing, monitoring tools, and logging. These changes highlight enterprises’ strong demand for cloud native application architectures, cloud native deployment and operations, and improved stability.

As a beneficiary of cloud native technology, Alibaba Group uses it to capture the full benefits of cloud computing and has carried out the world’s largest cloud native deployment. All of its businesses run 100% on the public cloud, and its applications are 100% cloud native. Building on container optimization that integrates software and hardware, its online business is deployed across millions of containers, which has delivered a 30% increase in CPU utilization, an 80% reduction in cost per 10,000 transactions, and a 20% increase in R&D and operations efficiency. Alibaba now shares these best practices and solutions with the wider community, helping industries such as taxation, human resources, banking, insurance, petroleum and petrochemicals, retail and FMCG, automobile manufacturing, and Internet platforms create more social value. After years of technical accumulation, Alibaba Cloud offers more than 300 cloud products and nearly a thousand solutions. Among them, Message Queue (MQ), Application Real-Time Monitoring Service (ARMS), Enterprise Distributed Application Service (EDAS), and other services have become essential components of distributed Internet architectures for many enterprises. This conference is also the first public unveiling of the new features of these products.

RocketMQ 5.0 is a major upgrade

Message queues are the communication infrastructure of modern applications and a core dependency of microservice architectures. Through asynchronous decoupling, they let users build distributed, high-performance, resilient, and robust applications more efficiently. Message queues are also becoming more valuable from a data perspective: the core business data flowing through them feeds integration, transmission, analysis, computation, and processing scenarios. As they continue to evolve, message queues can be expected to generate new value and new “chemistry” in data channels, event-driven integration, and analytics and computing scenarios.
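
The asynchronous decoupling described above can be illustrated with a minimal Python sketch. It uses the standard library’s in-process queue as a stand-in for a broker; it is not RocketMQ’s API, and the names run_demo and broker and the “shipped:” prefix are hypothetical.

```python
import queue
import threading


def run_demo(orders):
    """Publish orders to a queue and let a separate consumer thread process them."""
    broker = queue.Queue()  # assumption: stands in for a topic on a real broker
    processed = []

    def consumer():
        while True:
            message = broker.get()
            if message is None:  # sentinel: the producer has finished
                break
            processed.append(f"shipped:{message}")

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer only enqueues; it never waits for the consumer's work.
    for order in orders:
        broker.put(order)
    broker.put(None)

    worker.join()
    return processed
```

The producer returns as soon as a message is enqueued, which is what lets the two sides scale and fail independently. A real broker adds persistence and network delivery on top of this idea.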

Alibaba Cloud RocketMQ has released version 5.0, upgrading it to a one-stop integrated processing platform for “messages, events, and streams”, with the following two highlights:

(1) Extension of core messaging scenarios: coverage now includes event-driven processing, message stream processing, and many other scenarios.
(2) Architecture iteration toward one-stop converged processing: a single message store supports stream computing, asynchronous delivery, integration-driven workloads, and other scenarios.

In addition to these two highlights, RocketMQ 5.0 brings three groups of new features:

(1) Upgraded RocketMQ infrastructure:
  • Opening of the lightweight SDK and an upgraded full-link observability system
  • Message-level load balancing
  • Multiple network access support
  • Massive tiered storage
(2) Lightweight message ETL for stream processing scenarios:
  • Lightweight and dependency-free
  • Low development barrier
  • Serverless elasticity
(3) EventBridge, the best practice for EDA on the cloud:
  • Unified, standardized event integration ecosystem
  • Global event interoperability network
  • Serverless, low-code development

The microservice product family upgrades again

As an important representative of today’s Internet application architectures, microservices are increasingly integrated with containers, and enterprises are becoming clearer about what they need from microservice architectures and the businesses built on them. On the architecture side, Java-based microservice frameworks such as Spring Cloud and Dubbo, together with Service Mesh technology, which has risen alongside several newer trends, have become mainstream. On the demand side, microservice-oriented business development and design, container-native software infrastructure, and upgraded production operations with a bird’s-eye view have become the core requirements. Alibaba Cloud supports both of these microservice systems through MSE and the ASM service mesh.

Under the microservice architectures of the virtualization era, businesses usually adopted a two-layer design of traffic gateway plus microservice gateway. The traffic gateway handles north-south traffic scheduling and security protection, while the microservice gateway handles east-west traffic scheduling and service governance. In the cloud native era dominated by containers and Kubernetes, however, Ingress has become the gateway standard of the Kubernetes ecosystem, giving the gateway a new mission and making it possible to merge the traffic gateway and the microservice gateway into one.

The cloud native gateway released by Alibaba Cloud MSE merges the two gateway layers into one without sacrificing capability, which saves 50% of resource costs and also reduces operations and usage costs. The MSE cloud native gateway is built on Envoy and Istio to provide a unified control plane. It connects directly to back-end services, supports Dubbo 3.0 and Nacos, integrates with Alibaba Cloud Container Service for Kubernetes (ACK), and automatically synchronizes service registration information.

The MSE cloud native gateway has long been battle-tested inside Alibaba. It is currently used in Alipay, DingTalk, Taobao, Tmall, Youku, Fliggy, Koubei, and other Alibaba business systems, and it withstood the massive request volume of Double 11 in 2020. It easily handles 100,000 requests per second, with daily request volume on the order of ten billion.

Alibaba Cloud Service Mesh (ASM) is the industry’s first fully managed, Istio-compatible service mesh product. As an Istio-compatible managed platform for unified traffic management of microservice applications, it focuses on delivering a fully managed, secure, stable, and easy-to-use service mesh. It supports unified governance of services across regions, multiple clusters, and multi-cloud and hybrid cloud environments, so that application services running anywhere can easily communicate across heterogeneous computing infrastructures. The newly released ASM Pro edition covers more application scenarios, including:

  • Support for Dubbo and other microservice frameworks and extension protocols: provides more scenario-based capabilities to meet customer needs such as grayscale release, canary release, lossless (graceful) service shutdown, and full-link grayscale.
  • Full integration with multiple service registries: integrates highly available Nacos service registries, and supports multi-language service interoperability across registries as well as high-performance, large-scale scenarios.
  • Unified service mesh capability across cloud and edge: supports unified governance of services across regions, multiple clusters, and multi-cloud and hybrid cloud environments, supports ACK Edge clusters, and explores service mesh scenarios in edge computing.
  • Modernization of existing applications: supports mixed deployment on heterogeneous computing infrastructures such as containers and VMs to ease the migration of VM-based applications; enhances dynamic enforcement of OPA policies to achieve zero-trust security without code changes; and simplifies the management of applications across multiple types of computing infrastructure.
  • Full-stack optimization: reduces service communication latency and encryption overhead through operating system and hardware integration, and improves TLS encryption and decryption efficiency as well as data performance.
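
The grayscale and canary releases mentioned in the list above generally come down to sending a fixed share of traffic to a new version. The following Python sketch shows one common way to do that, assuming users are assigned to one of 100 buckets by hashing their ID. The function name pick_version and the version labels are hypothetical, and this is not ASM’s API; in Istio-based meshes the split is normally declared as route weights in configuration rather than written in application code.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Route a user to 'canary' or 'stable' based on a stable hash bucket."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a bucket in the range 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    # Buckets below the threshold receive the new version.
    return "canary" if bucket < canary_percent else "stable"
```

Because the bucket depends only on the user ID, a given user keeps seeing the same version between requests, and raising canary_percent only ever moves users from the stable version to the canary, never back.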

Through traffic control, mesh observability, and secure service-to-service communication, ASM simplifies service governance across the board and provides unified management for services running on heterogeneous computing infrastructure. It applies to Kubernetes clusters, Serverless Kubernetes clusters, ECS virtual machines, and self-built clusters.

Finally, developing microservice applications calls for a full-stack platform that covers the whole process of application architecture design, development, testing, release, and operations. One-stop cloud native R&D support significantly improves user efficiency. To that end, ADD, the cloud native application design and development platform, was introduced to help enterprises quickly develop cloud native applications and manage them across the full life cycle from an application perspective, with the following features:

1. Application development and architecture design: supports drag-and-drop design of application architecture diagrams, and provides preset and enterprise-customized application architecture templates.
2. Cloud native asset store: provides out-of-the-box middleware services for enterprises, accumulates their common business components and common technical middleware, and enables standardization, productization, sharing, and reuse of enterprise software assets.

At the same time, EDAS 4.0, the enterprise distributed application service, rebuilds the entire process of releasing and launching user applications, provides bird’s-eye-view operations and dual-mode governance, helps modernize application operations, and accelerates the cloud native transformation of online business.

ARMS 3.0: Enterprise Observability All in One

Observability is an important part of an enterprise’s technology architecture, and different communities and organizations are increasingly converging in their views of the trends in this field:

  • Full-stack integration: when a request enters the business system and passes from the front end through the application layer to the underlying resources, the ability to connect the whole link end to end and correlate vertical link data with horizontal data becomes a key capability for the operations team.
  • Standardization of cloud native observability: as the open source projects Grafana, Prometheus, and OpenTelemetry become de facto standards in the observability field, enterprises can build cloud native observability more efficiently and with better traceability.
  • AIOps: as each enterprise’s technology footprint keeps expanding, the scale and dimensions of operations data keep growing, including massive volumes of metrics, logging, and tracing data. AI plays a major role here, helping find and resolve anomalies and problems faster and more efficiently.

To meet these trends and demands, Alibaba Cloud released ARMS 3.0 to help enterprises build an all-in-one observability system with unified access, unified metrics, unified tracing, unified metering, unified dashboards, and unified alerting.

  • Supports 50+ technical components, covering the full vertical link from access experience and service applications to the infrastructure layer;
  • Connects metrics, logging, and tracing horizontally to speed up problem diagnosis;
  • Fully supports the three open source standards of cloud native observability: Prometheus, Grafana, and OpenTelemetry;
  • Supports access from 10+ alerting and monitoring systems, manages scattered alerts in one place, and provides intelligent noise reduction and root cause analysis based on algorithms and Alibaba’s experience.

It is worth mentioning that, on the strength of ARMS, Alibaba Cloud became the only Chinese cloud vendor selected for the 2021 Gartner Magic Quadrant for APM, and its product capabilities and strategic vision were highly recognized by Gartner analysts.

High availability

AHAS, a member of the high availability product family, has also received a major upgrade. Application High Availability Service (AHAS) focuses on improving the high availability of applications and services and provides three core capabilities: traffic protection, fault drills, and multi-active disaster recovery. Every module in this upgrade substantially improves the stability and resilience of customers’ businesses.

First, for traffic protection, a new cluster protection feature helps customers solve typical cluster-level flow control problems such as uneven traffic across individual nodes and low cluster-wide traffic. In the gateway protection scenario, an Nginx plug-in based on a native C/C++ implementation is now supported. It stably supports Sentinel’s core flow control and API grouping capabilities while greatly reducing performance overhead: throughput loss stays within 5% and CPU usage within 0.8 cores. In addition, alerting, monitoring, and protection scenarios have been significantly improved and optimized in terms of service scenarios and ease of use.
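
To see why cluster-level flow control helps when traffic is uneven across nodes, consider the following minimal Python sketch of a shared request budget. It assumes a single central counter with fixed one-second windows; ClusterTokenServer and simulate are hypothetical names, and this is neither the AHAS nor the Sentinel API.

```python
class ClusterTokenServer:
    """Hypothetical central counter enforcing one QPS budget for a whole cluster."""

    def __init__(self, cluster_qps: int):
        self.cluster_qps = cluster_qps
        self.window = None  # the one-second window currently being counted
        self.used = 0       # requests admitted in that window

    def try_acquire(self, now_seconds: float) -> bool:
        window = int(now_seconds)
        if window != self.window:  # a new second starts: reset the counter
            self.window = window
            self.used = 0
        if self.used < self.cluster_qps:
            self.used += 1
            return True
        return False


def simulate(per_node_requests, cluster_qps, now_seconds=0.0):
    """Return how many requests each node gets admitted within one window."""
    server = ClusterTokenServer(cluster_qps)
    return [
        sum(server.try_acquire(now_seconds) for _ in range(count))
        for count in per_node_requests
    ]
```

With a cluster budget of 10, two nodes receiving 9 and 1 requests in the same second are both fully admitted by simulate([9, 1], 10). A static limit of 5 per node would have rejected 4 requests on the busy node even though the cluster as a whole stayed within budget.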

Chaos is a cloud native chaos engineering platform that provides large-scale, low-cost, controllable, and diverse fault drill services. It offers one-stop architecture analysis, fault inspection, fault injection, steady-state measurement, and other functions to help users improve the fault tolerance and recoverability of distributed systems and move their systems to the cloud stably. The fault drill platform has been upgraded in terms of drill scenarios, drill formats, ease of use, and open source compatibility.

  • Drill scenarios: Windows-based drill nodes are supported. One-stop disaster recovery network disconnection drills are supported, covering precheck, network disconnection, and recovery. Microservice drills have also been upgraded to 2.0 to support automated verification of strong and weak dependencies at the service level.
  • Drill formats: the newly released visual drill supports one-click drills based on the business architecture topology.
  • Open source compatibility: supports online hosting of the community edition in the enterprise edition and one-click upgrade to the enterprise edition.
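
The verification of strong and weak dependencies mentioned in the list above can be sketched in a few lines of Python, assuming the usual definition: a dependency is weak if the business call still succeeds when that dependency fails, and strong otherwise. All names here (inject_fault, classify_dependency, render_product_page, place_order) are hypothetical and are not part of the Chaos platform.

```python
import functools


def inject_fault(dependency):
    """Wrap a dependency so that every call fails, simulating an outage."""
    @functools.wraps(dependency)
    def broken(*args, **kwargs):
        raise ConnectionError("injected fault")
    return broken


def classify_dependency(business_call, dependency):
    """Return 'weak' if the business call survives the dependency failing."""
    try:
        business_call(inject_fault(dependency))
    except Exception:
        return "strong"
    return "weak"


def render_product_page(get_recommendations):
    # Recommendations are optional: fall back to an empty list on failure.
    try:
        recommendations = get_recommendations()
    except ConnectionError:
        recommendations = []
    return {"title": "demo item", "recommendations": recommendations}


def place_order(get_price):
    # No fallback: an order cannot be priced without the price service.
    return {"total": get_price() * 2}
```

Here classify_dependency(render_product_page, lambda: ["r1"]) reports a weak dependency because the page degrades gracefully, while classify_dependency(place_order, lambda: 10) reports a strong one because the failure propagates to the caller.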

The multi-active disaster recovery (MSHA) solution has been upgraded from a remote multi-active DR solution to a general multi-active DR solution that is more compatible, more stable, and simpler.

Compatible with a richer set of DR architectures and service components.

New architectures include same-city active-active and multi-active DR, remote active-active DR, and remote application active-active DR. Multi-active DR support has been added for components such as MQTT, SchedulerX, K8s, and PolarDB.

Core disaster recovery capabilities are strengthened, improving stability by more than 50%.

The multi-active DR architecture has been optimized and hardened at the access layer, service layer, message layer, task scheduling layer, and data layer. This top-down optimization of the traffic path improves overall DR stability by more than 50%.

Zero modification for same-city deployments, and more than 20% less modification work for remote disaster recovery.

In same-city scenarios, services require no modification, and same-city multi-active DR goes live within 3 hours on average. In remote container service scenarios, the agent can be integrated quickly based on Pilot, greatly reducing the cost of adopting DR.

This comprehensive upgrade gives business technology teams more choices. Through simple, rich, open, and low-cost PaaS services, it helps enterprise customers innovate on the cloud more easily and efficiently and build a technology system that better fits their business needs and team circumstances.