Author: TJ (Tang Jianfa), CTO of TapData Tiplaten Data, chairman of the MongoDB Chinese community, former chief architect of MongoDB Greater China, and lecturer of the Geek Time MongoDB video course.

“How do you build a data middle platform?” In the data processing industry, customers often ask this question.

What is a data middle platform? Is it a product, a technology, or an architecture? Now that the concept is everywhere, let’s talk about the architecture of the data middle platform, its technical implementation, and how to put it into practice in an enterprise so that it actually solves problems.

I. Modern enterprise data architecture and its pain points

  • Data silos: the root cause of inefficiency and hard-to-use data
  • Application bottlenecks: the shortcomings of the traditional solutions, data warehouses and data lakes

Let’s take an airline as an example. The airline’s marketing department plans to launch a new product or a customer campaign, and wants to know which channel a certain type of customer uses most often. When you think about it, an airline has a great many customer touchpoints.

These include PSDP itinerary orders, complaints, the luggage system, the frequent flyer system, the mobile app, and so on. The airline built these systems at different stages and in different business units. Each application was deployed only for its own business, with no consideration for how it would work with the rest of the enterprise. Because the data in these applications is not consistent, it can take days or weeks to get an answer, and it is often not even clear where to find the data. Even when the team knows where the data is, it still has to coordinate with other business departments to obtain it correctly.
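To make the marketing question concrete, here is a minimal sketch of what answering it involves when the data sits in separate systems. All system extracts, field names, and values below are invented for illustration; they do not come from any real airline system. The point is only that each source has to be located, its customer key mapped by hand, and the results merged before the question can be answered.

```python
from collections import Counter

# Hypothetical extracts from three separately built systems.
# Every field name and value here is invented for illustration.
frequent_flyer = [
    {"member_id": "M1", "tier": "gold"},
    {"member_id": "M2", "tier": "silver"},
    {"member_id": "M3", "tier": "gold"},
]
orders = [  # order system, keyed by member_id
    {"member_id": "M1", "channel": "mobile_app"},
    {"member_id": "M1", "channel": "website"},
    {"member_id": "M3", "channel": "mobile_app"},
    {"member_id": "M2", "channel": "call_center"},
]
complaints = [  # complaint system, which names the same key differently
    {"customer_no": "M3", "channel": "call_center"},
    {"customer_no": "M1", "channel": "mobile_app"},
]


def channel_usage(tier):
    """Count channel usage across systems for customers of one tier."""
    members = {row["member_id"] for row in frequent_flyer if row["tier"] == tier}
    counts = Counter()
    for row in orders:
        if row["member_id"] in members:
            counts[row["channel"]] += 1
    for row in complaints:
        # The key has a different name here, so it must be mapped by hand.
        if row["customer_no"] in members:
            counts[row["channel"]] += 1
    return counts


gold_usage = channel_usage("gold")
print(gold_usage.most_common(1))
```

With only three in-memory lists the merge is trivial. In practice each list would be a separate database owned by a different department, which is where the days or weeks go.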

Let’s look at another case, a policy loan mini program at an insurance company. When a customer applies for a cash loan through this mini program, and the customer has bought critical illness insurance, life insurance or property insurance from the company, the system can determine the appropriate type of cash loan within one minute, based on the value of the customer’s policies.

When it was time to go live, the team found that the mini program itself had been developed quickly, but the data it needed was spread across the separate life, critical illness, property insurance and other systems, and some of it also had to come from the recommendation system and the tagging system. As a result, integrating the data took a lot of time, weeks or even months, because the work involves not only the data itself but also permissions and other concerns.

Both cases illustrate the data silo problem that is common in enterprises, and as IT systems continue to be built out, the problem will only become more widespread.

Data silos arise first because business departments build IT services around their own business needs rather than around data as a shared goal. Second, the commonly used relational databases, such as Oracle, SQL Server, DB2 and Sybase, have always had a bottleneck when it comes to scaling performance. When a system grows large or the number of customers increases, the data has to be split across multiple databases and tables, because a single database cannot support that much business. This creates yet more data silos. Data silos seriously impede new businesses from reusing existing data:

  • Integrating and synchronizing the data takes a lot of time;
  • The user experience degrades, because the data is incomplete and not real-time;
  • Work is duplicated, and the reuse rate is low.
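The way splitting data across databases contributes to silos can be sketched as follows. This is a simplified illustration, not how any particular database implements it: the shard count, the hash-based routing, and the record fields are all assumptions. A lookup by the sharding key touches one shard, while any question that is not keyed by it has to visit every shard and merge the results.

```python
import zlib

NUM_SHARDS = 4  # assumed number of databases the data is split across

# Each inner list stands in for one physical database.
shards = [[] for _ in range(NUM_SHARDS)]


def shard_for(customer_id):
    """Route a customer to a shard with a stable hash of the ID."""
    return zlib.crc32(customer_id.encode("utf-8")) % NUM_SHARDS


def insert(record):
    shards[shard_for(record["customer_id"])].append(record)


def find_by_customer(customer_id):
    """Lookup by the sharding key: only one shard is touched."""
    shard = shards[shard_for(customer_id)]
    return [r for r in shard if r["customer_id"] == customer_id]


def find_by_product(product):
    """Lookup by any other field: every shard must be scanned and merged."""
    results = []
    for shard in shards:
        results.extend(r for r in shard if r["product"] == product)
    return results


# Hypothetical policy records, invented for illustration.
for cid, product in [("C1", "life"), ("C2", "property"),
                     ("C3", "life"), ("C1", "critical_illness")]:
    insert({"customer_id": cid, "product": product})

print(len(find_by_customer("C1")), len(find_by_product("life")))
```

A new application that needs a view across customers or across products has to implement the second pattern itself against every shard, which is one reason reuse of existing data becomes slow and expensive.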

To solve the data silo problem, current solutions at the application level include the enterprise service bus (ESB), message queues (MQ), and so on. From a storage perspective, there are data warehouses such as Teradata and Greenplum, as well as data lakes. All of these solutions address the problem to some degree, but they have limitations:

First of all, these solutions are oriented toward analytical scenarios, and most of them extract data in T+1 mode, which means that the data the business receives was produced by the source systems the day before. This data is processed in the data warehouse or data lake into a large number of reports and result sets, which are delivered by download, export and similar means, in a fairly crude form. Most of the big data platforms currently on the market focus on analysis, and are mainly used for BI, reports and dashboards that give insight into the operation of the enterprise and its customers.
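The delay implied by T+1 extraction can be worked out with a small sketch. It assumes a nightly batch that runs at 00:00 and loads everything produced during the previous calendar day; real schedules vary, and the function names and the example timestamp are arbitrary.

```python
from datetime import datetime, time, timedelta


def visible_at_t_plus_1(event_time):
    """Earliest time the event is available downstream.

    Assumption: a nightly batch at 00:00 loads all data produced
    during the previous calendar day.
    """
    next_day = event_time.date() + timedelta(days=1)
    return datetime.combine(next_day, time(0, 0))


def staleness(event_time):
    """How long the business waits before it can see the event."""
    return visible_at_t_plus_1(event_time) - event_time


# A hypothetical order placed at 09:30.
event = datetime(2024, 1, 15, 9, 30)
print(staleness(event))  # 14:30:00
```

Under this assumption an event is between a few seconds and nearly 24 hours old when it first becomes visible, before counting the batch's own run time. That is acceptable for reports, but not for an application that has to respond to the customer during the same interaction.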

For running an enterprise, however, the key capability is not back-end analysis but front-end interaction with customers, business and processes. The data middle platform emerged in response to this need.
