This is my fourth day of the August Challenge.

Introduction

The previous articles kept talking about “operation and maintenance management” and “operation and maintenance automation”. Maybe everyone is tired of hearing the theory and wants to ask: can we do something to put this “nonsense” into practice?

I have kept wondering whether I should share these ideas, because they are only my personal understanding formed on the job, and they are all “wild ways”, unorthodox methods I worked out on my own. On the other hand, operation and maintenance is not simple tinkering; it exists to empower the business and to let us realize our own value. So the following articles focus more on putting things into practice.

Operations framework

In the article “Operation and Maintenance Thinking: Operation and Maintenance Management and Automation”, we distilled an operations framework from day-to-day O&M work (red marks the parts that are still missing). It consists of the infrastructure layer, data layer, application layer, management layer and display layer, and together these make up our final O&M system.

Below we discuss the framework in depth, starting from the following questions.

1. Why should the operations framework be layered? In my view, for the following reasons:

  • O&M is oriented to the team rather than the individual. Layering lets everyone in the team find their own focus and makes the management ideas and goals of O&M clear.
  • Layering breaks O&M down logically and gives it a context. The operations we perform are therefore not isolated: they touch different layers and they can have a lifecycle.

For example, racking a server involves the following layers: (1) infrastructure layer: server, operating system, etc.; (2) application layer: basic components, middleware, etc.; (3) management layer: unattended installation, CMDB, monitoring, etc.; (4) display layer: Zabbix, Blue Whale, etc. Without layering, we would carry out these operations in isolation and repeat them each time, overlooking the fact that a single automated process could handle the whole job.

  • Layering helps us sort out and take stock of what we know, which feeds back into the O&M work in a healthy way.
  • You may say I am bragging, but at least in my eyes layering is very important: it helps me clarify my management thinking.
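The server example above can be sketched as one ordered pipeline that walks the layers in turn. This is only a minimal illustration: the step names and the `rack_server` function are hypothetical placeholders that record what they would do, not real calls into Zabbix, Blue Whale or a CMDB.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Layer names follow the example above; every step below is a hypothetical placeholder.
LAYER_ORDER = ["infrastructure", "application", "management", "display"]


@dataclass
class Server:
    hostname: str
    log: List[str] = field(default_factory=list)


def make_step(layer: str, action: str) -> Callable[[Server], None]:
    """Return a placeholder step that only records what it would do."""
    def step(server: Server) -> None:
        server.log.append(f"{layer}: {action}")
    return step


# One pipeline keyed by layer, instead of isolated operations repeated by hand.
PIPELINE: Dict[str, List[Callable[[Server], None]]] = {
    "infrastructure": [make_step("infrastructure", "install operating system")],
    "application": [make_step("application", "deploy basic components and middleware")],
    "management": [
        make_step("management", "register in CMDB"),
        make_step("management", "enable monitoring"),
    ],
    "display": [make_step("display", "add to dashboard")],
}


def rack_server(hostname: str) -> Server:
    """Run every step of every layer, in layer order, for one server."""
    server = Server(hostname)
    for layer in LAYER_ORDER:
        for step in PIPELINE[layer]:
            step(server)
    return server


if __name__ == "__main__":
    for line in rack_server("web-01").log:
        print(line)
```

Keeping the steps grouped by layer makes it easy to see which layer a new task belongs to and where in the lifecycle it runs.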

2. If the operations framework is so important, how is it generated?

The final operations framework was not created overnight; it evolved gradually. The initial version is as follows:

The initial version of the O&M framework was coarse-grained, but it already contained the core elements:

  • A division into infrastructure, system applications and platform services
  • Base components, business components and common components
  • A classification of the development technology stack

These elements will exist in one form or another no matter how the operational framework evolves, so we need to sort them out at this stage to lay a solid foundation for the future.

The disadvantage of this phase is that the system application services lean toward the business and are coupled to it. Although O&M exists to support the business, the purpose of the O&M framework is to sort out the O&M architecture and provide architectural support for O&M itself. In later versions, therefore, the application layer was separated out on its own, and three common categories were abstracted from the business implementations: basic services, business applications and middleware.

Operations specifications

Finally, we come to the point: how are operations specifications generated?

  • O&M specifications are never made up; they have to be built from facts extracted from fragmented O&M work.
  • Fragmented O&M work exists at every layer of the O&M framework, so O&M specifications are extracted layer by layer.

With these two points in mind, we can extract specifications for each layer of the operations framework. Because the O&M framework keeps evolving, new O&M specifications are continuously generated and added to the O&M work.

1. Infrastructure service specifications

  • Operating system installation specification
  • Directory management specification
  • System configuration (initialization) specification
  • JDK installation specification
  • Network device configuration specification
  • etc.

2. System application specifications

  • System go-live specification
  • Process management specification
  • Backup management specification
  • Hosts specification
  • etc.

3. Platform service specifications

  • Monitoring management specification
  • System inspection specification
  • Log collection specification
  • Jump server management specification
  • CI/CD specification
  • etc.
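The lists above can also be kept as a small catalog keyed by framework layer, so that new specifications land in the right place as the framework evolves. The sketch below is only an illustration: `SPEC_CATALOG`, `register_spec` and `layers_without_specs` are hypothetical names, and the entries cover only the items listed above.

```python
from typing import Dict, List

# Specifications grouped by layer, copied from the lists above (not exhaustive).
SPEC_CATALOG: Dict[str, List[str]] = {
    "infrastructure": [
        "Operating system installation",
        "Directory management",
        "System configuration (initialization)",
        "JDK installation",
        "Network device configuration",
    ],
    "system application": [
        "System go-live",
        "Process management",
        "Backup management",
        "Hosts",
    ],
    "platform service": [
        "Monitoring management",
        "System inspection",
        "Log collection",
        "Jump server management",
        "CI/CD",
    ],
}


def register_spec(catalog: Dict[str, List[str]], layer: str, name: str) -> None:
    """Add a specification to an existing layer; duplicates are ignored."""
    if layer not in catalog:
        raise KeyError(f"unknown layer: {layer}")
    if name not in catalog[layer]:
        catalog[layer].append(name)


def layers_without_specs(catalog: Dict[str, List[str]]) -> List[str]:
    """Return the layers that have no specification yet."""
    return [layer for layer, specs in catalog.items() if not specs]
```

A layer that shows up in `layers_without_specs` is a hint that fragmented work at that layer has not yet been distilled into a specification.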

The specification generation process is shown in the figure:

The specification is critical

Once you have a specification, do you apply it for a while and then stop updating it? If so, I think the main reasons are:

  1. Writing up the specification becomes a burden on top of regular work;
  2. The style is not unified: each team member uses a different format, and the result is chaotic;
  3. The specification contains too much text and takes too long to read, so it becomes mere decoration.

Specifications must therefore be sustainable. Given the problems above, when a specification is finalized the operations team needs to be clear about its purpose and keep it lightweight.

To solve this problem, I defined a specification for the specifications themselves:
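As a rough sketch of how such a rule set could be checked automatically, the snippet below tests a specification text against the three problems listed earlier. The required section names and the word limit are assumptions made up for illustration; they are not the author's actual rules.

```python
from typing import List

# Hypothetical rules for illustration only, not the author's actual meta-specification.
REQUIRED_SECTIONS = ["Purpose", "Scope", "Rules"]  # one shared format for every author
MAX_WORDS = 500                                    # keep the document light


def check_spec(text: str) -> List[str]:
    """Return a list of problems found in a specification; empty means it passes."""
    problems: List[str] = []
    lines = [line.strip() for line in text.splitlines()]
    # A section counts as present only if its title stands alone on a line.
    for section in REQUIRED_SECTIONS:
        if section not in lines:
            problems.append(f"missing section: {section}")
    word_count = len(text.split())
    if word_count > MAX_WORDS:
        problems.append(f"too long: {word_count} words (limit {MAX_WORDS})")
    return problems


if __name__ == "__main__":
    draft = "Purpose\nKeep hosts files consistent.\nScope\nAll Linux servers.\n"
    for problem in check_spec(draft):
        print(problem)
```

A check like this can run whenever a specification is edited, which keeps the format uniform without adding review work for the team.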

Conclusion

An O&M specification is only a prerequisite for automation, the first step of a long march. From here we only need to implement strictly according to the specification and keep optimizing; the rest will follow naturally.

Finally, that is my “wild way”. I hope it gives you some ideas; if you disagree, please be gentle!