Background

Custom dashboards are an important feature of Ant Financial’s monitoring products. They meet the real-time monitoring needs of Ant Financial’s businesses by letting users define custom log data sources and configure dashboards. Users can create, organize, manage, and configure a series of log data sources to assemble a multi-dimensional monitoring dashboard quickly and easily. When it was introduced, this capability was innovative, and it addressed the pain points of Ant Financial’s complex business monitoring in both functionality and product experience.

However, as Ant Financial’s monitoring products have iterated and cloud-native observability has raised expectations for dashboards, the demands on the custom monitoring experience have kept growing: more convenient interaction, richer charts, more data sources, more extension points, and so on. Upgrading the dashboard has therefore become unavoidable for the monitoring products.

This article introduces the innovative design and experiments of Ant Financial’s monitoring products in the area of monitoring dashboards. The new dashboard, Barad-Dur, aims to offer the best experience in the industry. It brings many innovations in interaction, experience, and design philosophy, and it can serve both Ant Financial’s internal and external monitoring systems.

The product experience

WYSIWYG

Today, the best monitoring products come with a “WYSIWYG” editor, which Ant Financial’s monitoring products have lacked. There, dashboards are still configured through traditional forms, which is unfriendly to users: the learning curve is steep and configuration is slow. As a result, users often hand their dashboard configuration to the monitoring team as a requirement, and the team’s “dashboard configuration experts” do the work. This has a high communication cost and puts a heavy burden on the monitoring team.

In Barad-Dur, a lot of work has gone into the interactive experience of the WYSIWYG editor, with the goal of offering the best editing experience on the market.

Experience 1: Resizing

In Barad-Dur, a chart can be resized from any side or corner, while most common dashboard products only support resizing from the lower right corner. Because a rectangle is usually defined in the coordinate system as (left, top, width, height), resizing from the lower right corner is the easiest to implement: only width and height need to change. Resizing from the upper left corner is the hardest: all four parameters change at the same time, and the relationship between them is more complicated. With a grid layout it becomes harder still, because the chart must automatically snap to nearby grid points while it is being resized.
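The following is a minimal TypeScript sketch of this idea, not Barad-Dur’s actual implementation. It converts the rectangle to its four edges so that every handle, including the upper left corner, goes through the same code path, and it snaps each moved edge to the grid. The names `Rect`, `Handle`, `snap`, and `resize`, as well as the pixel-based grid, are assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not Barad-Dur's real API.
interface Rect { left: number; top: number; width: number; height: number; }

// Which edges a resize handle moves. A corner handle moves two edges.
interface Handle { left?: boolean; top?: boolean; right?: boolean; bottom?: boolean; }

// Snap a coordinate to the nearest grid line.
function snap(value: number, grid: number): number {
  return Math.round(value / grid) * grid;
}

// Resize a rectangle by dragging one handle by (dx, dy).
function resize(
  rect: Rect, handle: Handle, dx: number, dy: number,
  grid: number, minSize: number = grid,
): Rect {
  // Work with edges rather than (left, top, width, height):
  // the edges that are not dragged stay fixed automatically.
  let x1 = rect.left;
  let y1 = rect.top;
  let x2 = rect.left + rect.width;
  let y2 = rect.top + rect.height;

  // Move and snap only the dragged edges, keeping a minimum size.
  if (handle.left) x1 = Math.min(snap(x1 + dx, grid), x2 - minSize);
  if (handle.right) x2 = Math.max(snap(x2 + dx, grid), x1 + minSize);
  if (handle.top) y1 = Math.min(snap(y1 + dy, grid), y2 - minSize);
  if (handle.bottom) y2 = Math.max(snap(y2 + dy, grid), y1 + minSize);

  // Convert back to the (left, top, width, height) representation.
  return { left: x1, top: y1, width: x2 - x1, height: y2 - y1 };
}
```

With this representation, dragging the upper left corner changes left, top, width, and height together, while the lower right corner stays where it was.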

Experience 2: Dragging

In Barad-Dur, two charts can swap positions in a single drag, whereas common dashboard products need several drags to swap two charts. Barad-Dur does not have this problem, and dragging does not disrupt the overall layout of the charts.

Experience 3: Automatic relayout

Barad-Dur’s automatic relayout supports real-time layout preview (as common dashboard products also do), and in addition it adjusts the layout according to the direction of the specific operation (resize or drag). Common dashboard products can only rearrange charts vertically, because their algorithm is very simple: it just “pushes” all the charts toward the top of the page.

Experience 4: Free placement

Barad-Dur’s layout allows charts to be placed at any position. Common dashboard products cannot do this because of the simple algorithm described above: all charts must be stacked toward the top of the page.

Experience 5: Reset the layout

When a single chart is adjusted, Barad-Dur’s automatic relayout can “push away” the other charts and, more importantly, can put the pushed charts back afterwards. The comparison here is with the open-source layout framework that common dashboard products use directly. That framework does provide the free-placement feature described above, but it has no layout reset. Once free placement is enabled, the whole dashboard layout gets disrupted during editing and the user can do nothing about it, so common dashboard products leave the feature disabled.
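One way to get this “push away, then reset” behavior can be sketched in TypeScript as follows. This is an illustrative assumption, not Barad-Dur’s actual algorithm: the layout is recomputed on every drag step from a snapshot taken when the drag started, so a chart that was pushed away returns to its original position as soon as the dragged chart no longer collides with it. The names `Item`, `overlaps`, and `layoutDuringDrag` are hypothetical, and charts are only pushed downward to keep the sketch short.

```typescript
// Hypothetical grid item; x, y, w, h are in grid units.
interface Item { id: string; x: number; y: number; w: number; h: number; }

// Axis-aligned rectangle overlap test.
function overlaps(a: Item, b: Item): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// Compute the layout for one drag step. The snapshot is never mutated,
// so every step starts again from the layout as it was before the drag.
function layoutDuringDrag(snapshot: Item[], draggedId: string, x: number, y: number): Item[] {
  const dragged = snapshot.find((it) => it.id === draggedId);
  if (!dragged) return snapshot.map((it) => ({ ...it }));

  // The dragged chart is placed first, at the pointer position.
  const placed: Item[] = [{ ...dragged, x, y }];

  // Place the remaining charts from top to bottom at their original positions.
  const others = snapshot
    .filter((it) => it.id !== draggedId)
    .map((it) => ({ ...it }))
    .sort((a, b) => a.y - b.y || a.x - b.x);

  for (const item of others) {
    // Push the chart down only while it collides with something already placed.
    let blocker = placed.find((p) => overlaps(p, item));
    while (blocker) {
      item.y = blocker.y + blocker.h;
      blocker = placed.find((p) => overlaps(p, item));
    }
    placed.push(item);
  }
  return placed;
}
```

Because charts that do not collide keep their snapshot positions, free placement is preserved, and moving the dragged chart away restores everything it had pushed.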

Experience 6: Text editing

Barad-Dur supports adding static text to the dashboard and editing it. Static text can be used for announcements, titles, instructions, and other common dashboard scenarios.

Functional comparison

Feature                    Barad-Dur    Common dashboard products
Free dragging              ✔            ✔
Free resizing              ✔            -
Multiple chart types       ✔            ✔
Real-time chart editing    ✔            ✔
Chart import and export    ✔            ✔
Free layout                ✔            -
Adding text                ✔            -

In summary, Barad-Dur’s WYSIWYG editor is ahead of common dashboard products across these features.

The controller

The word “dashboard” is defined as “(in an automobile or similar vehicle) a panel beneath the front window having various gauges and accessories for the use of the driver; instrument panel.” It originally refers to the dashboard of a car, which contains two kinds of components: monitors and controllers. From the dashboard you can not only see the current state of the car but also control it. That is the original meaning of a dashboard, yet the monitoring products currently on the market all lack the controller part, so a monitoring dashboard is in practice only a monitoring panel. If it is used only for monitoring, a dashboard has little reason to exist on its own, like a car dashboard that has a tachometer, a speedometer, and an odometer but no throttle, brake, or gear lever.

Let’s look at the dashboards of a few industrial products:

Mass-produced products aimed at the general consumer

Mass production products for professional consumers

Customized products for experts

The controller is an integral part of a dashboard, and it is even more important than the monitor. Barad-Dur lets users place control buttons on the dashboard to perform simple control actions, such as turning an alarm off or on, opening a DingTalk chat window, or starting a control plan. More powerful control functions will be added over time, to turn Ant Financial’s monitoring system into a complete monitoring and control system.

The technical implementation

Custom data source

Barad-Dur supports secondary development and custom data sources, and connecting your own data source takes little work:

  1. Extend AbstractDatasource and implement the doRequestData interface;
  2. Call registerDatasource to register the data source with Barad-Dur (if you want to use Barad-Dur’s data source editor, you can specify a custom data source editor at registration time).

Barad-Dur wraps all data sources and provides caching, incremental loading, request merging, and other features for them.
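The two steps above might look roughly like the TypeScript sketch below. Only the names `AbstractDatasource`, `doRequestData`, and `registerDatasource` come from this article; their signatures, the `Query` and `Point` types, the `requestData` wrapper, the `registry` map, and `ConstantDatasource` are assumptions made for illustration and may differ from the real Barad-Dur API. The wrapper shows one simple form of caching and request merging: identical queries share one pending request.

```typescript
// Hypothetical query and result types.
interface Query { metric: string; start: number; end: number; }
interface Point { time: number; value: number; }

abstract class AbstractDatasource {
  // Results keyed by query. Identical queries share one promise, which both
  // caches the result and merges concurrent requests. A real cache would
  // also need expiry; that is omitted here.
  private cache = new Map<string, Promise<Point[]>>();

  // Step 1: a custom data source implements only the actual fetch.
  protected abstract doRequestData(query: Query): Promise<Point[]>;

  // Wrapper that callers (charts) would use instead of doRequestData.
  requestData(query: Query): Promise<Point[]> {
    const key = JSON.stringify(query);
    let pending = this.cache.get(key);
    if (!pending) {
      pending = this.doRequestData(query);
      this.cache.set(key, pending);
    }
    return pending;
  }
}

// Step 2: a registry that maps a data source type name to an instance.
const registry = new Map<string, AbstractDatasource>();

function registerDatasource(type: string, datasource: AbstractDatasource): void {
  registry.set(type, datasource);
}

// Example custom data source: one point per minute with a fixed value.
class ConstantDatasource extends AbstractDatasource {
  calls = 0; // counts real fetches, to make the caching visible

  constructor(private readonly value: number) {
    super();
  }

  protected async doRequestData(query: Query): Promise<Point[]> {
    this.calls += 1;
    const points: Point[] = [];
    for (let t = query.start; t < query.end; t += 60000) {
      points.push({ time: t, value: this.value });
    }
    return points;
  }
}

registerDatasource("constant", new ConstantDatasource(42));
```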

Unified time-series data source

So that a custom data source can be displayed correctly in any chart, Barad-Dur defines a universal time-series data format that supports multiple keys and multiple values. All time-series data sources (and possibly non-time-series data sources in the future) convert their query results to this format, and all charts render from this format.

The advantage of a uniform data format is that charts and data sources are both implemented against the same data interface (convention), so they can vary independently. The chart type can be switched without changing the data source configuration, and the data source can be switched without changing the chart configuration. Common dashboard products cannot do this.
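The article does not give the exact format, so the TypeScript sketch below is only a guess at what a multi-key, multi-value time-series format could look like. The types `SeriesPoint`, `Series`, and `TimeSeriesData` and the two chart adapters are hypothetical. The point it illustrates is that two different chart types read the same data without knowing which data source produced it.

```typescript
// Hypothetical unified format: each series is identified by several keys
// (for example app and zone) and each point carries several named values.
interface SeriesPoint { time: number; values: Record<string, number>; }
interface Series { keys: Record<string, string>; points: SeriesPoint[]; }
type TimeSeriesData = Series[];

// Adapter for a line chart: one line per series, for one chosen value.
function toLineChart(
  data: TimeSeriesData, valueName: string,
): { name: string; data: [number, number][] }[] {
  return data.map((series) => ({
    name: Object.keys(series.keys).map((k) => `${k}=${series.keys[k]}`).join(","),
    data: series.points.map((p): [number, number] => [p.time, p.values[valueName] ?? NaN]),
  }));
}

// Adapter for a single-number chart: sum of the latest value of each series.
function toSingleValue(data: TimeSeriesData, valueName: string): number {
  return data.reduce((sum, series) => {
    const last = series.points[series.points.length - 1];
    return sum + (last ? (last.values[valueName] ?? 0) : 0);
  }, 0);
}

// The same query result can feed either chart type.
const sample: TimeSeriesData = [
  {
    keys: { app: "pay", zone: "a" },
    points: [
      { time: 0, values: { success: 9, total: 10 } },
      { time: 60000, values: { success: 18, total: 20 } },
    ],
  },
];
const lines = toLineChart(sample, "total");
const latestTotal = toSingleValue(sample, "total");
```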

Another big advantage is computation. Barad-Dur supports simple front-end computed data sources (for example, a ratio scenario needs data A divided by data B). With a unified data format, a computation is itself treated as a time-series data source whose input is a set of time-series data sources, so a computed data source can reference another computed data source. Common dashboard products cannot do this either.
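A minimal TypeScript sketch of this composition follows, under the assumption that every data source exposes the same query interface. The names `Source`, `ArraySource`, and `ComputedSource` are hypothetical, and the synchronous single-value points are a simplification for illustration.

```typescript
interface Point { time: number; value: number; }

// Every data source, computed or not, exposes the same interface.
interface Source { query(start: number, end: number): Point[]; }

// A plain data source backed by an in-memory array.
class ArraySource implements Source {
  constructor(private readonly points: Point[]) {}
  query(start: number, end: number): Point[] {
    return this.points.filter((p) => p.time >= start && p.time < end);
  }
}

// A computed data source: its inputs are other sources, and it is a source
// itself, so computed sources can be chained.
class ComputedSource implements Source {
  constructor(
    private readonly inputs: Source[],
    private readonly combine: (values: number[]) => number,
  ) {}

  query(start: number, end: number): Point[] {
    const results = this.inputs.map((input) => input.query(start, end));
    const [first, ...rest] = results;
    if (!first) return [];
    const lookups = rest.map(
      (points) => new Map(points.map((p): [number, number] => [p.time, p.value])),
    );
    const output: Point[] = [];
    for (const point of first) {
      // Keep only timestamps that are present in every input.
      const values = [point.value];
      let complete = true;
      for (const lookup of lookups) {
        const value = lookup.get(point.time);
        if (value === undefined) { complete = false; break; }
        values.push(value);
      }
      if (complete) output.push({ time: point.time, value: this.combine(values) });
    }
    return output;
  }
}

// Ratio = A / B, then a second computed source that references the first.
const success = new ArraySource([{ time: 0, value: 5 }, { time: 60000, value: 15 }]);
const total = new ArraySource([
  { time: 0, value: 10 }, { time: 60000, value: 20 }, { time: 120000, value: 30 },
]);
const ratio = new ComputedSource([success, total], ([a, b]) => a / b);
const percent = new ComputedSource([ratio], ([r]) => r * 100);
```

Here `percent` never touches the raw data directly; it only queries `ratio`, which is the chaining described above.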

Scene Graph

The Scene Graph concept is widely used by game engines to render scenes. Each node in a scene has a parent-child relationship with other nodes, and the spatial properties of a child node are usually expressed relative to its parent. A data structure is therefore needed to transform local-space quantities (translation, rotation) into global-space quantities, which can finally be converted into screen-space quantities for rendering. This parent-child relationship corresponds exactly to the relationship between the overall dashboard and the individual charts on it. Take one of the most common requirements as an example: the dashboard has a global time replay feature (a very important feature, without which the dashboard is useless for troubleshooting), and each chart also has its own settings:

  • Time span: minute-level charts and second-level charts show different ranges of data;
  • Time offset: the data behind different charts is generated with different delays.

A data structure similar to a Scene Graph can store each chart’s own timeline configuration together with the global timeline configuration, and from these compute the time parameters needed to query the data.
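As a rough illustration, the TypeScript sketch below resolves a chart’s query time range from its own settings and those of its parent. The class `TimelineNode`, its fields, and the resolution rules (offsets accumulate down the tree like translations, and the span is inherited from the nearest ancestor that defines one) are assumptions made for this sketch, not Barad-Dur’s actual design.

```typescript
// Local timeline settings of a node, expressed relative to its parent.
interface TimelineSettings {
  spanMs?: number;   // how much data to show; inherited if not set
  offsetMs?: number; // delay of data generation; added to the parent's offset
}

class TimelineNode {
  constructor(
    private readonly settings: TimelineSettings,
    private readonly parent?: TimelineNode,
  ) {}

  // Offsets accumulate from the root down, like translations in a scene graph.
  totalOffsetMs(): number {
    const own = this.settings.offsetMs ?? 0;
    return own + (this.parent ? this.parent.totalOffsetMs() : 0);
  }

  // The span comes from the nearest node (self or ancestor) that defines one.
  effectiveSpanMs(): number | undefined {
    return this.settings.spanMs ?? this.parent?.effectiveSpanMs();
  }

  // Convert the global replay time into the time range this node must query.
  resolve(replayTime: number): { start: number; end: number } {
    const span = this.effectiveSpanMs();
    if (span === undefined) {
      throw new Error("no time span configured on this node or its ancestors");
    }
    const end = replayTime - this.totalOffsetMs();
    return { start: end - span, end };
  }
}

// The dashboard shows one hour by default.
const dashboard = new TimelineNode({ spanMs: 3600000 });
// A minute-level chart whose data arrives two minutes late.
const minuteChart = new TimelineNode({ offsetMs: 120000 }, dashboard);
// A second-level chart that shows only five minutes, ten seconds late.
const secondChart = new TimelineNode({ spanMs: 300000, offsetMs: 10000 }, dashboard);
```

Moving the global replay time then only requires calling `resolve` on each chart with the new time; the per-chart settings stay untouched.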

In the future there will also be the concept of a technology stack: a predefined group of charts that can be placed directly into a custom dashboard with minimal configuration. For example, the CPU, memory, and disk monitoring charts of a physical machine can be created in one step just by changing the IP parameter of the chart group.

Barad-Dur therefore borrows the design concept of the Scene Graph and adapts it to the design requirements of the dashboard.

The overall structure is a tree, and each node has an MVC structure that separates the data source, the view, and the control data, and also separates control flow from data flow. Data sources can depend on one another, which allows Barad-Dur to optimize data queries through caching, incremental queries, merged queries, and so on.

Future

Currently, Barad-Dur has built-in support for OpenTSDB, CeresDB (a high-performance, distributed, highly reliable time-series database developed by Ant, with PromQL support), and some internal Ant Financial data sources. The plan is to support more of the data sources commonly used for monitoring, such as PromQL, InfluxDB, and MySQL. As described earlier in this article, users will be able to predefine a set of charts and a set of variables and quickly add the corresponding chart components when creating a dashboard. The plan also includes importing dashboards exported directly from other dashboard products, so that users can migrate quickly and smoothly.

We hope this article offers some ideas and inspiration for design in the cloud-native monitoring field. If you work in this field, you are welcome to follow us and exchange ideas with us.

About us

Welcome to the world of Ant intelligent operations. This public account is produced by the Ant intelligent monitoring team. For readers interested in intelligent operations technology, we will share, from time to time, Ant Financial’s thinking and practice on intelligent monitoring architecture design and innovation in the cloud-native era.

The Ant intelligent monitoring team is responsible for the monitoring needs of Ant Financial’s infrastructure and business applications. The team is working to build a one-stop, integrated monitoring product that supports clusters of millions of machines and service invocations at a scale of hundreds of millions. The product covers monitoring data such as metrics, logs, performance, and traces; spans collection, cleaning, computation, storage, dashboard display, offline analysis, alerting, and problem localization; and has intelligent AIOps capabilities. It serves Ant Financial’s many businesses and scenarios.

If there is a topic on “intelligent operations” that you would like us to cover, please leave a comment and let us know.

PS: Ant Intelligent monitoring is recruiting AIOps experts. Welcome to join us. If you are interested, contact [email protected]

Public account: Ant Intelligent operation and maintenance