Abstract: Alibaba Cloud now exposes system events for ECS instances. When you receive a scheduled maintenance notice from Alibaba Cloud, you can use the ECS scheduled system event to plan the related maintenance, choosing a time for failover and other operations that suits your business, and so reduce the impact on system reliability and business continuity.
As a leading and trusted cloud computing service provider, Alibaba Cloud provides and guarantees the availability, stability, and security of compute, storage, and network resources and of the underlying infrastructure. Based on your own strategy and business requirements, you can design a highly available IT architecture on the cloud, choose suitable Alibaba Cloud products and services to build and deploy your business systems, and manage the data in them. On this basis, you can use the APIs, monitoring, and orchestration tools that Alibaba Cloud provides to build IT operations capabilities such as rapid resource allocation, multiple parallel environments, and automated deployment.


Compared with ordinary IDC facilities and server vendors, Alibaba Cloud applies stricter IDC standards, server admission standards, and operation and maintenance standards to ensure the high availability of the whole cloud infrastructure, the reliability of data, and the availability of cloud servers. On this basis, Alibaba Cloud offers multiple availability zones in each region. When you need higher availability, you can use multiple availability zones to build your own active/standby or active-active services. Industries with stricter business continuity requirements, such as finance, can build services across multiple regions and multiple availability zones to reach higher availability and stronger RTO and RPO guarantees. For a single ECS instance, Alibaba Cloud promises service availability of no less than 99.95% within a service cycle. For a deployment across multiple availability zones in a single region, Alibaba Cloud promises service availability of no less than 99.99% within a service cycle.


To maintain this level of availability, Alibaba Cloud proactively performs routine maintenance on the physical servers that host ECS instances and repairs potential hardware and software faults, which improves reliability, performance, and security. When it detects a problem on a physical server, it live-migrates the affected instances to a healthy server so that the ECS instances stay healthy.
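
To make the availability figures above concrete, the following minimal sketch converts each target into the downtime it permits. The length of a "service cycle" is not stated here, so the 30-day cycle is an assumption, and the function name `allowed_downtime_minutes` is hypothetical.

```python
from fractions import Fraction


def allowed_downtime_minutes(availability_percent: str, cycle_days: int = 30) -> float:
    """Return the maximum downtime, in minutes, that an availability target
    permits over one service cycle.

    cycle_days=30 is an assumption; check the actual service agreement for
    the real cycle length.
    """
    total_minutes = cycle_days * 24 * 60
    # Fraction parses the decimal string exactly, which avoids float rounding.
    unavailable_share = 1 - Fraction(availability_percent) / 100
    return float(total_minutes * unavailable_share)


single_instance = allowed_downtime_minutes("99.95")
multi_zone = allowed_downtime_minutes("99.99")
print(f"99.95%: {single_instance:.2f} min, 99.99%: {multi_zone:.2f} min")
# prints: 99.95%: 21.60 min, 99.99%: 4.32 min
```

Under that assumption, a single instance may be unavailable for about 21.6 minutes per cycle, while a multi-zone deployment may be unavailable for about 4.3 minutes.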


Even so, as an Alibaba Cloud user you may still receive a notification like the following: your ECS instance needs maintenance because the physical server that hosts it is at risk of failure; Alibaba Cloud has scheduled a system event to restart the instance, and the instance will be restarted and migrated to a healthy physical server in 2 days.


Why, you may wonder, are you receiving this message? It is a maintenance notice triggered automatically by the proactive operation and maintenance of the Alibaba Cloud platform. During proactive maintenance, some hardware and software faults prevent instances from being migrated online. In those cases, Alibaba Cloud sends the notification above to tell you that the system is about to restart the instance in order to migrate it. To improve your operational efficiency and your experience with ECS instances, Alibaba Cloud is releasing the ECS instance system events feature. When you receive such a notification, you can view the scheduled system event in the ECS console or through OpenAPI and, according to your business needs, choose a suitable time to carry out the event yourself (in some cases, you can only wait for the scheduled time of the system event). This removes the need to contact customer service through a support ticket, reduces risk, and provides a basis for automated failover driven by system events, which makes operation and maintenance more efficient.


So what types of system events do ECS instances have? Alibaba Cloud will first release the reboot event triggered by proactive system O&M, and will then add more event types to cover a wider range of O&M scenarios. If a scheduled system event exists, a prominent marker appears on the pending events button in the ECS console to alert you. On the Scheduled Events > System Scheduled Events page, you can view the instance information (instance ID, region, and running status), the system event to be executed, the recommended user action, and the operations available to you. You can also call the DescribeInstanceFullStatus OpenAPI operation to query the scheduled system events of an instance, either manually or by automated polling.
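
The automated polling mentioned above could look roughly like the sketch below. It does not call the real SDK: `fetch` stands for whatever function in your code invokes DescribeInstanceFullStatus and returns the decoded JSON response. The response field names (`InstanceFullStatusSet`, `ScheduledSystemEventSet`, `NotBefore`, and so on), the event type name, and all sample values are assumptions for illustration and should be checked against the API reference.

```python
import time
from typing import Callable, Dict, List, Optional


def pending_events(response: Dict) -> List[Dict]:
    """Flatten a DescribeInstanceFullStatus-style response into a list of
    scheduled events. Field names are assumed, not confirmed by this article."""
    found = []
    instances = response.get("InstanceFullStatusSet", {}).get("InstanceFullStatusType", [])
    for inst in instances:
        events = inst.get("ScheduledSystemEventSet", {}).get("ScheduledSystemEventType", [])
        for ev in events:
            found.append({
                "instance_id": inst.get("InstanceId"),
                "event_id": ev.get("EventId"),
                "event_type": ev.get("EventType", {}).get("Name"),
                "not_before": ev.get("NotBefore"),  # scheduled execution time
            })
    return found


def poll_events(fetch: Callable[[], Dict], handle: Callable[[Dict], None],
                interval_seconds: float = 300, max_polls: Optional[int] = None) -> None:
    """Call `fetch` repeatedly and pass each event to `handle` exactly once."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for ev in pending_events(fetch()):
            if ev["event_id"] not in seen:
                seen.add(ev["event_id"])
                handle(ev)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)


# Made-up sample response, used here in place of a real API call.
sample_response = {
    "InstanceFullStatusSet": {"InstanceFullStatusType": [{
        "InstanceId": "i-example0001",
        "ScheduledSystemEventSet": {"ScheduledSystemEventType": [{
            "EventId": "e-example0001",
            "EventType": {"Name": "SystemMaintenance.Reboot"},
            "NotBefore": "2018-03-03T10:00:00Z",
        }]},
    }]}
}

poll_events(lambda: sample_response, print, interval_seconds=0, max_polls=2)
```

Tracking the event IDs already seen keeps the handler from firing again on every poll, which matters if the handler starts a failover.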


As you can imagine, when an ECS instance carries critical business, any unexpected restart can threaten or seriously affect system availability and business continuity. We therefore recommend that, when you design your architecture and applications, you make full use of availability zones, load balancing, and other features and services to improve the overall availability of the architecture and the service. On top of that, Alibaba Cloud usually gives you 48 hours' notice of system events triggered by system faults. You can use this user operation window before the scheduled event time to carry out the load transfer and failover steps you have prepared and then restart the instance yourself. For example, in a clustered environment you can move the load from the affected instance to other instances in good time, back up or move the data on local disks in advance, or adjust the load balancing and elastic scaling configuration, and you can stop and start instances in the order your business logic requires. All of this minimizes the impact of the instance restart on business continuity.
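
One way to use the operation window described above is to pick the earliest low-traffic period in which the failover and restart can finish before the scheduled event time. The sketch below is only an illustration: the function name `pick_restart_time`, the idea of predefined quiet windows, the 30-minute duration, and the dates are all assumptions, not part of the ECS feature.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Window = Tuple[datetime, datetime]  # (start, end) of a low-traffic period


def pick_restart_time(now: datetime, scheduled_at: datetime,
                      quiet_windows: List[Window],
                      needed: timedelta = timedelta(minutes=30)) -> Optional[datetime]:
    """Return the earliest start time inside a quiet window that lets the
    failover and restart (taking `needed`) finish before `scheduled_at`.
    Return None if no window fits, in which case the restart happens at the
    scheduled event time."""
    for start, end in sorted(quiet_windows):
        begin = max(start, now)  # a window may already be in progress
        finish = begin + needed
        if finish <= end and finish <= scheduled_at:
            return begin
    return None


# Illustrative values: a notice received 48 hours before the event, and a
# nightly quiet period from 02:00 to 04:00 (naive local datetimes).
now = datetime(2018, 3, 1, 10, 0)
scheduled_at = now + timedelta(hours=48)
quiet = [
    (datetime(2018, 3, 2, 2, 0), datetime(2018, 3, 2, 4, 0)),
    (datetime(2018, 3, 3, 2, 0), datetime(2018, 3, 3, 4, 0)),
]
print(pick_restart_time(now, scheduled_at, quiet))
# prints: 2018-03-02 02:00:00
```

Choosing the earliest fitting window rather than the latest leaves a second window as a fallback if the first attempt has to be aborted.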


The types and scenarios of ECS system events will continue to be improved and expanded. We hope that this will steadily improve your operation and maintenance efficiency and experience on Alibaba Cloud, and that more complete interfaces and services will help you run your business on Alibaba Cloud without interruption.

