Background and problems of blockchain data governance

Data governance ensures the integrity and security of data through specific mechanisms, including the management of data quality, efficiency, and security. It is not a static state but an ongoing process.

As blockchain has gradually entered people’s daily lives, it has been applied in digital government affairs, financial services, social governance, public welfare, environmental protection, judicial arbitration, and other fields.

After the data in these fields is put on the chain, it needs to be analyzed and processed to extract its value. On-chain and off-chain data are collected into a data lake, which provides data support for upper-layer applications. The data is typically used for business analysis, large-screen dashboards, regulatory audits, business reports, and so on, and these functions in turn support the blockchain application. Through this cycle, data “flows” around the data lake.

In many traditional Internet enterprises, data governance problems arise throughout the data life cycle: production, collection, processing and storage, application, and destruction. Risks of various kinds can occur at each stage.

At present, the integration of blockchain technology and big data governance also faces new challenges:

  • Node storage is expensive. As the amount of data on a node keeps increasing, the cost of node storage rises steeply.
  • Data synchronization takes a long time. When a node holds a very large volume of data, a new node needs a long time to synchronize and cannot join the network quickly.
  • Node query performance is low, and transaction execution efficiency decreases as the business and the data volume on the node grow.
  • Big data processing is not possible on the chain. Because of the chained storage structure of the blockchain, big data processing and complex queries cannot be carried out on the chain.
  • Data export is expensive to develop. Business analysis requires parsing the data according to each smart contract, so the development cycle is long and costly.
  • Reusability and scalability are poor. When the business changes, the parsing and export of on-chain data have to be redeveloped.

As blockchain businesses develop and enterprise operations become more fine-grained, blockchain data has a growing impact on enterprises. As companies create value from these data assets, they place increasing demands on the quality, efficiency, and security of their data.

Data governance component technology architecture and solution advantages

The technical architecture of the data governance components is built around the underlying blockchain layer. It is divided into two layers, the operation and maintenance components and the development and business components, which together realize the governance of blockchain data.

The operation and maintenance layer contains Data-Stash, which is responsible for data scaling, backup, pruning, and synchronization. The development and business components mainly include Data-Export and Data-Reconcile. The data export component mainly solves the problems of complex query, analysis, and processing of blockchain big data. The reconciliation component mainly provides a reconciliation solution based on blockchain data.

The data governance component solution has the following advantages:

First, it provides full data backup and supports efficient synchronization of node data. Data export provides efficient real-time query capability and supports multi-threaded and multi-active processing to improve performance.

Second, it supports extension to different storage media. Data storage supports media such as MySQL and ES (Elasticsearch), and an extensible protocol interface at the bottom layer ensures scalability.

Third, the data service is safe, stable, reliable, and verifiable. Data is backed up from multiple nodes to ensure its integrity.

Fourth, distributed storage supports big data analysis and queries. Based on smart contracts, the data governance components export on-chain data to storage media suited to big data analysis and querying, and provide general query capability, with support for database and table sharding and master-slave backup.

Fifth, low-code development keeps the cost close to zero. The components are aimed mainly at developers, and we try to minimize the code they have to write: simple configuration is enough to complete a basic application of the components.

Sixth, a general-purpose design makes the solution reusable. Component design and development favor generality as far as possible, so there is no need to redevelop for different scenarios. At the same time, some personalized configuration is also provided.

Data governance component application scenarios and component introduction

The advantages of the data governance components are closely tied to the scenarios in which they are used.

In the operation and maintenance scenario, the front-end data service provides full backup, data pruning, fast synchronization, and cold data queries. In the business function scenario, the components mainly serve data analysis, dashboard display, supervision and audit, and business reports. In industry application scenarios, they mainly cover digital government affairs, financial services, social governance, judicial arbitration, and so on.

The following is a detailed description of the data governance components.

Data-Stash data warehouse component

Data-Stash is a data warehouse component based on FISCO BCOS. It mainly provides the ability to scale, back up, and prune blockchain data. It generates a backup of a node by parsing the node’s binlog, which lets the node separate hot and cold data, and it supports data pruning and fast synchronization.

By parsing the node binlog, Data-Stash provides full backup of the node ledger, multi-dimensional ledger verification, trusted storage of backup data, and resumable transfer from a breakpoint.
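Resuming an interrupted backup from its breakpoint can be sketched as follows. This is only an illustration of the general checkpointing idea, not Data-Stash's actual implementation: the function names, the JSON checkpoint file, and the list standing in for the backup database are all hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next entry to import (0 if no checkpoint exists)."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]


def save_checkpoint(path, next_index):
    """Write progress to a temp file, then rename, so the file is never half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)


def import_entries(entries, sink, checkpoint_path, fail_at=None):
    """Append log entries to sink, starting from the last saved checkpoint.

    fail_at (hypothetical, for demonstration) simulates an interruption
    just before the entry at that index is imported.
    """
    start = load_checkpoint(checkpoint_path)
    for i in range(start, len(entries)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated interruption")
        sink.append(entries[i])
        save_checkpoint(checkpoint_path, i + 1)
```

After an interruption, calling `import_entries` again with the same checkpoint path continues from the first entry that was not yet imported instead of starting over.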

Data-Stash mainly has the following functional features:

(1) Separation of cold and hot data

Over time, nodes accumulate more and more ledger data. If the data volume of a node grows unchecked, it will eventually overwhelm the node server and cause adverse effects.

Data separation addresses this through the data warehouse service. Start the Data-Stash service and import the node binlog into a database to back up the data. Developers can then partition the on-chain data, deleting data that is rarely used and retaining recent data. In order for the node to run undisturbed, the user needs to ensure that the node is enabled.
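As a rough illustration of hot and cold separation, the sketch below partitions ledger rows by block height. The row layout, the `keep_recent` parameter, and the block-height cutoff are assumptions made for this example; Data-Stash's own partitioning criteria may differ.

```python
def split_hot_cold(rows, latest_block, keep_recent):
    """Partition ledger rows by block height.

    Rows within the most recent `keep_recent` blocks stay hot (kept on the node).
    Older rows are cold and are candidates for pruning once they are backed up.
    """
    cutoff = latest_block - keep_recent
    hot = [r for r in rows if r["block"] > cutoff]
    cold = [r for r in rows if r["block"] <= cutoff]
    return hot, cold


# Example: ten blocks on the node, keep only the latest three hot.
rows = [{"block": n, "data": f"tx-{n}"} for n in range(1, 11)]
hot, cold = split_hot_cold(rows, latest_block=10, keep_recent=3)
```

Only rows that already exist in the backup should be deleted from the node, so a real pruning step would check the backup before removing the cold rows.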

(2) Achieve efficient node migration

While a blockchain service is running, nodes often need to be scaled out or upgraded. For example, if a server has to go offline or a disk has to be replaced because of a failure, Data-Stash can be used to quickly synchronize the node’s data.

(3) Supervision, auditing and traceability

A supervisor needs the ledger data to be complete and queryable. Because the blockchain’s own ledger database may not meet this need, a complete backup can be made through the data warehouse component, and a relational database can then be used to query the data more easily. To better meet supervision requirements, a multi-dimensional verification mechanism is adopted to prevent malicious tampering by nodes.
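One way to picture verification against several nodes is to accept a backed-up block only when enough nodes report the same digest for it. The sketch below is an assumption-laden illustration, not the component's actual mechanism: SHA-256 stands in for the chain's real hash function, and `verify_backup`, `node_views`, and `quorum` are hypothetical names.

```python
import hashlib


def digest(block_payload: bytes) -> str:
    """Digest of a block payload (SHA-256 used here purely as a stand-in)."""
    return hashlib.sha256(block_payload).hexdigest()


def verify_backup(backup, node_views, quorum):
    """Return the block heights whose backup is confirmed by fewer than `quorum` nodes.

    backup:     {height: payload bytes} read from the backup database
    node_views: list of {height: digest} dicts, one per queried node
    """
    suspicious = []
    for height, payload in sorted(backup.items()):
        expected = digest(payload)
        votes = sum(1 for view in node_views if view.get(height) == expected)
        if votes < quorum:
            suspicious.append(height)
    return suspicious
```

With three nodes and a quorum of two, a single node that reports a tampered digest does not invalidate the backup, while a tampered backup row is flagged because no honest node confirms it.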

Data-Export data export component

Data-Export is a data export tool that is also based on the FISCO BCOS platform. With almost no coding, users can export structured data to a relational database or an ES database through simple configuration, which makes subsequent business analysis and processing easier.

It also supports multi-active deployment, database and table sharding, visualization of exported data, application monitoring, and other functions, and can adapt to a variety of complex business scenarios.

Data-Export mainly has the following functional features:

(1) Support smart contract data export

Contract method and event data can be parsed and exported through Data-Export. The exported data is more intuitive and can be used for display and analysis.
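To make the idea of exporting event data concrete, the sketch below writes already-decoded contract events into a relational table. It is a hypothetical illustration rather than Data-Export's own code: the event dictionary layout, the `export_events` function, and the table and column names are invented for the example, and SQLite (from the Python standard library) stands in for MySQL.

```python
import sqlite3


def export_events(conn, table, columns, events):
    """Create the table if needed and insert one row per decoded event.

    `table` and `columns` are assumed to come from trusted configuration,
    since SQL identifiers cannot be passed as bound parameters.
    """
    col_defs = ", ".join(f"{c} TEXT" for c in columns)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"(block_height INTEGER, tx_hash TEXT, {col_defs})"
    )
    placeholders = ", ".join("?" for _ in range(len(columns) + 2))
    rows = [
        (e["block_height"], e["tx_hash"], *(str(e["args"][c]) for c in columns))
        for e in events
    ]
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


# Example: one decoded (hypothetical) Transfer event.
conn = sqlite3.connect(":memory:")
events = [
    {
        "block_height": 5,
        "tx_hash": "0xabc",
        "args": {"sender": "alice", "receiver": "bob", "amount": 10},
    }
]
exported = export_events(conn, "transfer_event", ["sender", "receiver", "amount"], events)
```

Once the rows are in a table, ordinary SQL such as `SELECT sender, amount FROM transfer_event WHERE block_height = 5` can be used for display and analysis.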

(2) Complex data query and analysis

For data storage, Data-Export currently supports MySQL and ES, and provides extension interfaces. It also supports a variety of export strategies. After the data is exported off the chain, it can be queried and analyzed further.

(3) Technical architecture to support read-write separation

Data-Export makes it possible to separate on-chain write operations from read operations. Exporting on-chain data off the chain and serving reads from the exported copy relieves the read pressure on chain nodes and realizes a read-write separation architecture.
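The read-write separation described here can be sketched with two stand-in classes: writes go to the chain, reads go to the exported copy. Everything in this example is hypothetical (the class names, `send_transaction`, `sync_from`, and `find` are not Data-Export APIs); it only illustrates the routing pattern.

```python
class InMemoryChain:
    """Stand-in for a chain node: accepts writes only."""

    def __init__(self):
        self.transactions = []

    def send_transaction(self, tx):
        self.transactions.append(tx)
        return len(self.transactions)  # position of the new transaction


class OffChainStore:
    """Stand-in for the exported database: serves reads only."""

    def __init__(self):
        self.rows = []

    def sync_from(self, chain):
        # Copy only the transactions that have not been exported yet.
        self.rows.extend(chain.transactions[len(self.rows):])

    def find(self, **filters):
        return [
            r for r in self.rows
            if all(r.get(k) == v for k, v in filters.items())
        ]


class ReadWriteRouter:
    """Routes writes to the chain and queries to the off-chain store."""

    def __init__(self, chain, store):
        self.chain = chain
        self.store = store

    def submit(self, tx):
        return self.chain.send_transaction(tx)

    def query(self, **filters):
        return self.store.find(**filters)
```

A consequence of this design is that reads are only as fresh as the last export: a transaction becomes visible to `query` after the store has synchronized, not at the moment it is submitted.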

(4) Provide monitoring and other visual capabilities

Data-Export can export on-chain data to database tables and display it through visualization, presenting the core processes and value of the data and making it possible to monitor blockchain data.

Data-Reconcile data reconciliation component

Reconciliation between traditional enterprises relies mainly on the centralized books of each party. The transitivity, immutability, and actuation of the blockchain itself offer a credible, objective basis for reconciliation.

Data-Reconcile is a data reconciliation component that provides a general reconciliation solution based on blockchain smart contract ledgers.

Data-Reconcile mainly has the following functional features:

(1) Support dynamic, extensible and customized development

On the one hand, Data-Reconcile provides general-purpose functionality. On the other hand, it supports further customized development for different business scenarios.

(2) Flexible and configurable data reconciliation rules

Reconciliation rules can be customized through configuration, and scheduling management is provided for reconciliation tasks.
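A configurable reconciliation rule can be pictured as a matching key plus a list of fields to compare. The sketch below is a generic illustration under that assumption; the `reconcile` function, its parameters, and the record layout are hypothetical and do not reflect Data-Reconcile's actual rule format.

```python
def reconcile(business_rows, chain_rows, key, compare_fields):
    """Match records by `key` and compare the configured fields.

    Returns the keys grouped into matched, mismatched, and missing on either side.
    """
    chain_by_key = {r[key]: r for r in chain_rows}
    result = {
        "matched": [],
        "mismatched": [],
        "missing_on_chain": [],
        "missing_in_business": [],
    }
    seen = set()
    for row in business_rows:
        k = row[key]
        seen.add(k)
        other = chain_by_key.get(k)
        if other is None:
            result["missing_on_chain"].append(k)
        elif all(row[f] == other[f] for f in compare_fields):
            result["matched"].append(k)
        else:
            result["mismatched"].append(k)
    result["missing_in_business"] = [k for k in chain_by_key if k not in seen]
    return result


# Example rule: match on order_id, compare amount and currency.
business = [
    {"order_id": "A1", "amount": 100, "currency": "CNY"},
    {"order_id": "A2", "amount": 50, "currency": "CNY"},
    {"order_id": "A3", "amount": 20, "currency": "CNY"},
]
on_chain = [
    {"order_id": "A1", "amount": 100, "currency": "CNY"},
    {"order_id": "A2", "amount": 55, "currency": "CNY"},
    {"order_id": "A4", "amount": 70, "currency": "CNY"},
]
report = reconcile(business, on_chain, key="order_id", compare_fields=["amount", "currency"])
```

Changing the rule then means changing only `key` and `compare_fields`, which is the kind of adjustment that can live in configuration rather than code.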

(3) The account reconciliation process is pluggable and extensible

Extension interfaces are provided, so functions and process steps can be plugged in and replaced.
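A pluggable process of this kind is often modeled as an ordered list of named steps that can be registered or swapped. The following sketch shows that general pattern only; `ReconcilePipeline` and its step names are invented for illustration and are not the component's real extension interface.

```python
class ReconcilePipeline:
    """An ordered list of named steps; each step takes and returns a context dict."""

    def __init__(self):
        self.steps = []

    def register(self, name, func):
        """Append a step to the end of the process."""
        self.steps.append((name, func))
        return self

    def replace(self, name, func):
        """Swap the implementation of an existing step, keeping its position."""
        for i, (step_name, _) in enumerate(self.steps):
            if step_name == name:
                self.steps[i] = (name, func)
                return self
        raise KeyError(name)

    def run(self, context):
        for _, func in self.steps:
            context = func(context)
        return context


def load(ctx):
    return {**ctx, "pairs": list(zip(ctx["business"], ctx["chain"]))}


def compare(ctx):
    return {**ctx, "diffs": [p for p in ctx["pairs"] if p[0] != p[1]]}


def text_report(ctx):
    return {**ctx, "report": f"{len(ctx['diffs'])} difference(s)"}


pipeline = (
    ReconcilePipeline()
    .register("load", load)
    .register("compare", compare)
    .register("report", text_report)
)
```

Replacing only the `report` step, for example with one that returns a count instead of text, changes the output format without touching the loading or comparison logic.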


Experience WeBankBlockchain-Data:

  • WeBankBlockchain-Data-Stash (data warehouse component): https://github.com/WeBankBlockchain/Data-Stash
  • WeBankBlockchain-Data-Export (data export component): https://github.com/WeBankBlockchain/Data-Export
  • WeBankBlockchain-Data-Reconcile (data reconciliation component)
