Real-time data characteristics

In general, visual reports communicate the information behind the data more efficiently. With a simple bar chart, we can easily compare the sales of an item across the second quarter. With a simple line chart, it is easy to see the distribution of average hours worked by employees in a given month. These reports are more or less time-related: over time, a metric changes due to a variety of factors.

On the other hand, some areas need timelier reporting. For example, online product metrics: how many users are currently online, what the load on the main site is, how many online transactions are in flight, and so on. Likewise, a lot of operations data calls for high timeliness, such as the current load of a server, or its load over the past five minutes.

Such reports share these characteristics:

  • High timeliness
  • For fine-grained metrics, the amount of data can be large
  • After a certain period of time, the value of data plummets

For example, here’s a real-time report of CPU usage on a Mac, showing the computing load on each core over time. This information is constantly being generated and discarded; no one cares about CPU usage from an hour ago, as long as the last few minutes are shown.

Given these characteristics, how to access the data, how to analyze metrics, how to scroll through history, and so on all differ from other charts. In addition, because the visualization of real-time data is strongly time-dependent, it has to be an essentially dynamic chart, which sets it apart from other chart types. In this article we will discuss these issues and common solutions to them.

Data indicators

For real-time data, we focus on the number of times different events occur, and how long they last when they occur. We first need to define some objects:

  • Counter
  • Timer
  • Gauge (scalar)

Counter

A counter records the number of times an event occurs (usually the counter is incremented or decremented by one on each occurrence). This type of data changes in a fixed pattern, for example:

  • A response with status code 200 – response.code === 200
  • A request generated from a session – session.id === 'b1b2b3bab22123bb1a'
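The counter idea can be captured in a few lines. This is only an in-memory sketch with illustrative metric names, not a real StatsD client:

```javascript
// Minimal in-memory counter sketch; metric names are illustrative.
const counters = {};

function increment(name, delta = 1) {
  counters[name] = (counters[name] || 0) + delta;
}

// Every matched event bumps the corresponding counter:
increment('response.code.200');
increment('response.code.200');
increment('session.b1b2b3bab22123bb1a');

console.log(counters['response.code.200']); // 2
```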

Timer

A timer covers all events for which a duration should be recorded. Usually, we introduce an interval over which to compute statistics such as average, variance, standard deviation, maximum, and minimum. For example:

  • Request response time – response.time
  • Residence time – stay.time
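A rough sketch of the timer idea, with hypothetical names: durations are collected over one interval and summarized afterwards.

```javascript
// Collect durations (in ms) for one interval, then summarize.
const samples = [];
const recordTiming = (ms) => samples.push(ms);

recordTiming(120); // e.g. one request's response.time
recordTiming(80);
recordTiming(100);

const avg = samples.reduce((sum, v) => sum + v, 0) / samples.length;
const max = Math.max(...samples);
console.log(avg, max); // 100 120
```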

Gauge

There is also a commonly used kind of quantity for which we don’t care how it changes over time, only its value or state at a given moment, for example:

  • Node availability
  • The number of processes at a given moment
  • CPU load/memory usage at a certain time
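A gauge can be sketched as a map that keeps only the latest value per metric (the names here are illustrative):

```javascript
// A gauge keeps only the most recent value per metric; history is irrelevant.
const gauges = {};
const setGauge = (name, value) => { gauges[name] = value; };

setGauge('cpu.load', 0.72);
setGauge('cpu.load', 0.35); // overwrites: only the current state matters

console.log(gauges['cpu.load']); // 0.35
```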

Typical flow of data processing

In a production environment, real-time data can come either from logs or from an event database. Logs are the most common form: almost every system logs in some way, and most logging setups provide a rolling mechanism, where logs are written to a file of fixed size and older entries are rolled over into another file (usually with a companion scheduled task to clean up the oldest files). On the other hand, many event-driven software systems write events to a database, which can also serve as a source for real-time data visualization.

Raw data cannot be used directly for visual presentation. Usually, we need some pre-processing, including:

  • Raw data acquisition
  • Structuring
  • Preliminary statistics
  • Higher-order statistics

Structuring the data

There are many tools that can help with these steps. For example, with a simple configuration, Logstash can automatically write continuously generated log data to StatsD (and eventually, periodically, to the Graphite database):


```
input {
  stdin {}
}

filter {
  grok {
    match => {
      "message" => "%{DATA:time} %{DATA:status} %{NUMBER:request_time} %{DATA:campaign} %{DATA:mac} %{DATA:ap_mac} %{GREEDYDATA:session}"
    }
  }
}

output {
  stdout { codec => rubydebug }

  statsd {
    host => 'localhost'
    increment => "airport.%{session}"
  }

  statsd {
    host => 'localhost'
    increment => "airport.%{status}"
  }
}
```

Logstash is a very flexible and highly customizable tool. All we need to do is specify the input data source, the matching rule, and the output data source: any input that satisfies the matching rule is written to the output. Does this sound a bit like IFTTT (If This Then That)?


```shell
tail -f /var/logs/nginx/access.log | logstash -f log.conf
```

In this example, we use standard input as the data source. Any input line matching the pattern time status request_time campaign mac ap_mac session is considered a successful match and is written to localhost via the statsd plug-in. The increment directive bumps the counter data type mentioned above: each time a match occurs, the corresponding value increases by one.

For example, if the log content is:


```
1529242838 403 0.02 f3715a7f52d8cef53fef1f73134e487a 00:61:71:53:ff:b0 T2-CL*-49-D* 2293c8e9-8801-485b-9f1d-9e5a7f5a8965
```

Matching results are as follows:


```
{
         "campaign" => "f3715a7f52d8cef53fef1f73134e487a",
     "request_time" => "0.02",
           "status" => "403",
          "session" => "2293c8e9-8801-485b-9f1d-9e5a7f5a8965",
          "message" => "1529242838 403 0.02 f3715a7f52d8cef53fef1f73134e487a 00:61:71:53:ff:b0 T2-CL*-49-D* 2293c8e9-8801-485b-9f1d-9e5a7f5a8965",
         "@version" => "1",
             "host" => "juntao-qiu.local",
           "ap_mac" => "T2-CL*-49-D*",
             "time" => "1529242838",
              "mac" => "00:61:71:53:ff:b0",
       "@timestamp" => 2018-06-17T13:40:38.023Z
}
```

At this point, the counter airport.2293c8e9-8801-485b-9f1d-9e5a7f5a8965 is incremented by one:


```
counter["airport.2293c8e9-8801-485b-9f1d-9e5a7f5a8965"] += 1
```

Statistics

For structured data, we need to run preliminary statistics over a fixed period. For timer-type data, for example, we compute the sum, average, variance, standard deviation, median, and so on; for counters, we accumulate their values. We can do this via StatsD.
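As a sketch of the kind of per-interval statistics involved (a hand-rolled helper for illustration, not StatsD's actual implementation):

```javascript
// Summarize one interval's worth of timer samples.
function summarize(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  const mean = sum / sorted.length;
  const variance = sorted.reduce((acc, v) => acc + (v - mean) ** 2, 0) / sorted.length;
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 !== 0 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    sum,
    mean,
    variance,
    stddev: Math.sqrt(variance),
    median,
    min: sorted[0],
    max: sorted[sorted.length - 1],
  };
}

console.log(summarize([1, 2, 3, 4, 5]));
// { sum: 15, mean: 3, variance: 2, ... }
```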

StatsD is essentially a very simple UDP-based service. It uses UDP deliberately: an unreliable but cheap transport avoids TCP’s overhead as data volume grows. After receiving a client request, StatsD maintains counters and timers locally, then periodically (every 10 seconds by default) flushes the aggregated data to Graphite.

Visualization

There are many considerations when visualizing real-time data. For example, will the results be rendered on the desktop, in a Web page, or on a mobile device? For Web interfaces, is responsive design needed to fit different screen sizes? Is dynamic interaction required, or just a static presentation?

On the other hand, what accuracy does the project require of the data? Does it need to be accurate to the minute, or is a point every 15 minutes enough? Different precisions have completely different storage requirements. A common strategy is to reduce the precision of aged data and keep high precision only for the most recent data.
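That strategy can be sketched as follows, assuming data points are (timestamp, value) pairs: aged points are averaged into coarser buckets.

```javascript
// Downsample [timestampSeconds, value] points to a coarser resolution by
// averaging all points that fall into the same bucket.
function downsample(points, bucketSeconds) {
  const buckets = new Map();
  for (const [ts, value] of points) {
    const key = Math.floor(ts / bucketSeconds) * bucketSeconds;
    const bucket = buckets.get(key) || { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(key, bucket);
  }
  return [...buckets.entries()].map(([ts, b]) => [ts, b.sum / b.count]);
}

// Two 30-second points collapse into one 60-second point:
console.log(downsample([[0, 1], [30, 3], [60, 5]], 60)); // [ [ 0, 2 ], [ 60, 5 ] ]
```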

For sales reports, day-level statistics may already count as real time; online transactions need to be monitored down to the second. The real-time requirements of the data should be fully considered when designing this kind of visualization.

A typical Web-based real-time data visualization Dashboard is shown above.

Tools

To achieve real-time data visualization, we need a number of tools. In short: a database to actually hold the data, an API or client library to store and query it, and finally a library or framework for client-side rendering.

Time-series databases

For storing real-time data there is in fact a dedicated database category: time-series DBMSs. A time-series database is a specialization of a key-value database: it typically maintains tuples of timestamp, metric name, and metric value. In addition, implementations often provide query languages to make retrieving metrics easy.

Here are some common time-series databases (or databases commonly used as such):

  • Graphite
  • InfluxDB
  • Prometheus

Feeder / API

Although most time-series databases provide native APIs for data storage and retrieval, most of the time people prefer a simple HTTP API or a client library.

For example, StatsD is a Node.js service. Through its client API, we can easily create counters, timers, and other metrics.

  • StatsD
  • Graphite

Visualization of data

Grafana is a powerful, widely used client framework. It lets you easily integrate data from multiple data sources into a single Dashboard. For example, CPU/memory load information may come from Graphite, while users’ online status may come from InfluxDB or Prometheus.

If you need more customizability, such as drawing a live chart on your own page, consider using a combination of D3.js + Cubism. It can periodically fetch data from different back-end apis and render it on SVG/Canvas in real time.

  • Grafana
  • Cubism

Direct presentation of data

Logstalgia

Logstalgia can parse log files in a specific format and present them in a visually striking way, reminiscent of a classic brick-breaking arcade game.

Logstalgia expects a few required fields:

  • A UNIX timestamp
  • The hostname/IP of the request
  • The path to the requested resource
  • Response code (201, 500, etc.)
  • Size of response

```
1529206121|12.21.18.246|/dispatcher/campaigns/2de808e08dccec2c7e55e41ecbd5a421|200|20
```

If the original log is not in this format, you can write a simple converter to fit it:


```javascript
const source = '[$time_local] "$remote_addr - $remote_user" "$request" $status $body_bytes_sent "$http_referer" $request_time "$http_user_agent"';

const NginxParser = require('nginxparser');
const parser = new NginxParser(source);

const moment = require('moment');

parser.read('-', (row) => {
  const ts = moment(row.time_local, "DD/MMM/YYYY:HH:mm:ss Z").unix();
  const parsed = row.request.split(/\s+/);
  console.log(`${ts}|${row.ip_str}|${parsed[1]}|${row.status}|${row.body_bytes_sent}`);
}, (err) => {
  throw err;
});
```

Note that this script reads from standard input and writes to standard output, so it can be chained in a shell pipeline. For example:


```shell
tail -f /var/log/nginx/access.log | node adaptor.js | logstalgia
```

However, Logstalgia only runs on the Desktop, and the look and feel of the application is relatively fixed, with little customization (such as the configuration of animation effects). More often than not, we want to be able to put the rendering of real-time data on the Web side to improve customizability.

Real-time data presentation

For real-time rendering, we can read the logs directly and push them to the client via WebSocket. The benefit of this approach is immediacy: a 500 error response or a failed-transaction exception can be presented very intuitively. The disadvantages are also obvious: on the one hand, the volume of information may be too large, i.e. logs are written faster than the front end can handle; on the other hand, the raw, unaggregated records may be too crude for statistical analysis.

WebSocket + D3.js


```javascript
const { spawn } = require('child_process');
const WebSocket = require('ws');

const generator = spawn('./generator.sh');
const wss = new WebSocket.Server({ port: 8080 });

function parse(data) {
  //...
}

wss.on('connection', (ws) => {
  const output = (data) => {
    ws.send(JSON.stringify(parse(data)));
  };

  generator.stdout.on('data', output);

  ws.on('close', () => {
    generator.stdout.removeListener('data', output);
  });
});
```

We use spawn to start a shell script in a child process; the script continuously reads logs from a log file on the remote server and writes them to standard output. When a new WebSocket connection is established, we forward the data produced by the generator child process to that connection, first parsing each row-based log line into structured data the client can consume and serializing it as JSON before writing.

Finally, we need to remove the event listener function on the generator when the client disconnects voluntarily.

The generator.sh content here can be any script that reads information from the log and outputs it to the console. For example, the simplest might look like this:


```shell
tail -f /var/logs/nginx/access.log
```

If there is no access traffic locally, we can point it to the test environment:


```shell
ssh qa-env tail -f /var/logs/wifi-portal/wifi-portal-2018-06-13-access.log
```

For the client, it’s just a matter of opening a WebSocket connection when the page loads and redrawing the chart when data arrives. A D3.js real-time charting plug-in is used here.


```javascript
var ws = new WebSocket("ws://localhost:8080");

ws.onopen = function () {
  console.log('connected');
};

ws.onmessage = function (evt) {
  const event = JSON.parse(evt.data);
  categories.push(_.truncate(event.campaign, { 'length': 8 }));
  const campaigns = _.uniq(categories);

  chart.yDomain(campaigns);
  chart.yDomain().forEach(function (cat, i) {
    var now = new Date(event.date);
    var mills = event.mills * 200;

    const obj = {
      time: now,
      color: color(mills),
      opacity: 1,
      category: _.truncate(event.campaign, { 'length': 8 }),
      type: "circle",
      size: mills,
    };

    chart.datum(obj);
  });
};
```

The corresponding legend in the figure above is:

  • The horizontal axis is time
  • The vertical axis represents the specific resource being requested (such as an API, or a static image)
  • Each time a resource is requested it forms a point on the canvas
  • The size of the dots reflects the response time

Presentation of statistics

Using Graphite

Graphite comes with a visual interface in which you can display multiple metrics together:

In addition, Graphite provides a more powerful render API. Using this API, you can obtain a variety of output formats, such as CSV and JSON, which makes further development convenient. You can also use the target argument to derive more complex metrics from expressions.

Such as:


```
http://localhost/render/?format=json&target=stats.jc.airport.campaigns.1565ae2c79aee5e635e55d73354c7cd3
```

Where format specifies JSON and target specifies the name of the metric. In fact, target can be much richer:


```
http://localhost/render?format=raw&target=alias(sumSeries(stats.jc.airport.campaigns.*)%2C%27%27)&from=1529245830&until=1529245929
```

Here the value of target is alias(sumSeries(stats.jc.airport.campaigns.*), ''), which sums the values of all metrics whose names start with stats.jc.airport.campaigns. You can specify a time window with from and until to obtain all data in that period. With client-side polling, you can then display metric statistics incrementally in near real time.
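A minimal polling sketch; the host and metric name are taken from the examples above, and the render API's format/target/from/until parameters are as documented by Graphite:

```javascript
// Build a Graphite render API URL for a given target and time window.
function renderUrl(base, target, from, until) {
  const params = new URLSearchParams({
    format: 'json',
    target,
    from: String(from),
    until: String(until),
  });
  return `${base}/render/?${params.toString()}`;
}

const url = renderUrl(
  'http://localhost',
  "alias(sumSeries(stats.jc.airport.campaigns.*), '')",
  1529245830,
  1529245929
);
console.log(url);

// Poll periodically and redraw with the returned datapoints (browser, or
// Node 18+ with global fetch; redraw is whatever your chart exposes):
// setInterval(() => fetch(url).then((r) => r.json()).then(redraw), 10000);
```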

Graphite provides a rich set of functions for aggregating metrics: conventional operations such as average, variance, and extrema, plus arithmetic across two or more metrics to derive new series. Here’s the full list.

Use Horizon Chart to present real-time data

Cubism is a plug-in for D3.js that presents real-time, time-based reports. The chart type it draws, the horizon chart, is in fact backed by a fair amount of research. The chart refreshes at a fixed frequency, so the data appears to move left: the oldest data disappears off the left edge, while new data keeps arriving and being plotted on the right.

You can specify different data sources for horizon charts. In Graphite’s case, you can write:


```javascript
var graphite = context.graphite("http://localhost");

var api_metrics = [
  graphite.metric("sumSeries(stats.jc.airport.campaigns.*)").alias("Campaigns Freq")
];
```

Cubism sends periodic requests to the Graphite server:


```
http://localhost/render?format=raw&target=alias(sumSeries(stats.jc.airport.campaigns.*)%2C%27%27)&from=1529245830&until=1529245929
```

Cubism then refreshes the chart based on actual data:


```javascript
d3.select("body").selectAll(".horizon")
    .data(api_metrics)
  .enter().insert("div", ".bottom")
    .attr("class", "horizon").call(horizon.extent([0, 50]));
```

In fact, since a horizon chart takes up so little vertical space, you can easily stack multiple charts into a multi-row display.

Summary

This article introduced some typical scenarios for real-time data visualization, as well as common methods for preparing and presenting the data. With existing tools, or simple scripts, we can feed data generated in real time into a time-series database for statistics and then render it in different ways. In general, statistics over fixed time intervals are the most meaningful, such as the number of requests per unit of time or the average latency of requests. On the other hand, simply presenting raw data in real time is also valuable in some scenarios, such as the current number of online users, the proportion of login exceptions, or nodes whose load exceeds 90%.

Resources

  • Horizon Chart
  • Visualisation Papers
  • Cubism

Other information

  • Setup Graphite in Docker
  • A concrete example of using Cubism with Graphite