Description: Dataphin has released V2.9.4.3. This version optimizes product features and user experience to give users more complete product capabilities and to speed up the construction of the enterprise data center.

1. Product Profile

Dataphin is the productized form of Alibaba Group's OneData data governance methodology, which grew out of internal practice. It provides one-stop capabilities for data acquisition, construction, and management across the full data life cycle, helping enterprises significantly improve their data governance and build an enterprise-level data center that offers reliable quality, convenient consumption, secure production, and cost efficiency. Dataphin supports a variety of computing engines and offers extensible open capabilities to fit the platform architectures and specific needs of a wide range of industries.

2. Version Overview

In June 2021, Dataphin released version V2.9.4.3, upgrading several product capabilities:
• Computing engine and OpenAPI: broader computing engine support and wider OpenAPI coverage
• Data integration: more MySQL data source versions supported and one-click table creation extended to more targets, improving configuration efficiency
• Monitoring: alert receiving rules optimized for more flexibility across monitoring scenarios
• Asset center: logical table preview and sensitive field identification rules optimized, completing the asset link
• Data service: paging queries added for APIs, extending query scope and improving service response efficiency and link stability

This version optimizes both product functions and user experience, aiming to provide users with more complete product capabilities and thereby accelerate the construction of the enterprise data center.

3. Details of Key Features in the New Version

Feature 1: The computing engine adds support for CDH6

Dataphin has been adapted to the CDH6 computing engine to improve multi-engine compatibility. As of the current release, Dataphin supports the following computing engines: MaxCompute, CDH5, CDH6, and EMR.

Feature 2: MySQL data source supports version 8.x

MySQL 8.0 is now a mainstream, widely used version of the MySQL database. In addition to the MySQL 5.6 and 5.7 data sources already supported, Dataphin now supports MySQL 8.0. Data synchronization, data service, and other modules can be configured on top of this data source, widening the coverage of business data.

Feature 3: Data integration supports one-click table creation in Oracle target databases

One-click table creation is now supported for Oracle target databases, which simplifies the configuration process and improves the efficiency of data synchronization configuration. As of the current release, one-click table creation covers four target data sources: MaxCompute, Oracle, Hive, and AnalyticDB for PostgreSQL.

Feature 4: Task performance monitoring and quality monitoring support assigning different alert receive modes to different receivers

Before this upgrade, the same alert receive mode had to be configured for all selected recipients. After upgrading to this version, you can specify different receive modes for different receiver types and so differentiate alerts according to the actual situation. For example, the task owner needs to know the general state of the tasks they are responsible for but does not need to handle exceptions immediately, so SMS alerts are a suitable choice. The on-duty owner needs to discover and handle exceptions promptly, so a phone call can serve as a strong reminder. The project manager needs a regular overview of alerts, so email alerts can be selected to make record keeping and statistics easier.
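The differentiated routing described above can be pictured as a mapping from receiver type to receive mode. The following is a minimal sketch in Python, not Dataphin's configuration format: the names RECEIVE_MODES and route_alert, and the receiver type keys, are hypothetical and chosen only to mirror the example in the text.

```python
# Hypothetical mapping from receiver type to receive mode, mirroring the
# example in the text. These names are illustrative, not Dataphin settings.
RECEIVE_MODES = {
    "task_owner": "sms",         # needs awareness, not immediate action
    "on_duty_owner": "phone",    # must react quickly, so use a strong reminder
    "project_manager": "email",  # wants a record for later statistics
}


def route_alert(message, receivers):
    """Return one (receiver_name, mode, message) tuple per receiver.

    `receivers` is a list of (name, receiver_type) pairs. An unknown
    receiver type raises ValueError so misconfiguration is caught early.
    """
    deliveries = []
    for name, receiver_type in receivers:
        if receiver_type not in RECEIVE_MODES:
            raise ValueError(f"no receive mode configured for {receiver_type!r}")
        deliveries.append((name, RECEIVE_MODES[receiver_type], message))
    return deliveries
```

Keeping the mode on the receiver type rather than on the alert rule is what lets a single rule notify several roles through different channels.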



Feature 5: The asset map adds logical table data preview, and asset security supports manually triggering sensitive field identification

A logical table preview function has been added. For fields the user has permission on, sampled data records are shown directly; if a field has desensitization rules set, only the desensitized data is shown. For fields without permission, the preview prompts “no permissions” and provides a quick link to apply for access. With this capability, Dataphin completes the full link for logical tables, from development through asset accumulation to consumption preview, and improves the modeling experience.
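The per-field preview rule above can be sketched as follows. This is an illustration under stated assumptions, not Dataphin's implementation: preview_rows and mask_value are hypothetical names, and the masking rule (keep the first character, replace the rest with asterisks) is invented for the example.

```python
def mask_value(value):
    """Illustrative masking rule: keep the first character, star the rest."""
    text = str(value)
    return text[:1] + "*" * (len(text) - 1)


def preview_rows(rows, permitted_fields, masked_fields):
    """Build the preview of sampled rows for one user.

    rows: list of dicts holding the sampled records
    permitted_fields: set of field names the user may read
    masked_fields: set of field names that have a desensitization rule
    """
    preview = []
    for row in rows:
        shown = {}
        for field, value in row.items():
            if field not in permitted_fields:
                # No permission: show a prompt instead of the data.
                shown[field] = "no permissions"
            elif field in masked_fields:
                # Permission granted but a rule applies: show masked data only.
                shown[field] = mask_value(value)
            else:
                shown[field] = value
        preview.append(shown)
    return preview
```

The permission check comes before the masking check, so a field the user cannot read never reaches the masking step.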

By default, after sensitive data identification rules are configured in the asset security module, the scheduled daily scan starts the next day. In addition to this daily scheduled scan, this release lets users manually trigger the sensitive data identification task, so that new rules take effect immediately and records are updated promptly after ad hoc changes. This extends the scenarios that sensitive data identification can cover.

Feature 6: Data services support paging queries for APIs created on Impala data sources, extending query scope and improving query stability

In earlier versions, for query performance reasons, an API created on an Impala data source could return at most 1000 results per query. This could not serve queries over large data volumes and affected downstream business use. This release provides paging queries for APIs created on Impala data sources. Paging conditions can be set through LIMIT or OFFSET clauses, which keeps service connections stable and responses efficient while supporting large-volume query scenarios.
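On the consumer side, limit/offset paging is typically used in a loop like the one below. This is a minimal sketch in Python, not Dataphin's client API: fetch_all is a hypothetical helper, and query_page stands in for a single API call that accepts limit and offset; here it is replaced by an in-memory function so the example runs on its own.

```python
def fetch_all(query_page, page_size=1000):
    """Yield every row by calling query_page(limit=..., offset=...) repeatedly.

    `query_page` is a hypothetical callable standing in for one API call. It
    must return a list of at most `limit` rows starting at position `offset`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = query_page(limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:
            break  # a short or empty page means there are no more rows
        offset += page_size


# In-memory stand-in for the API, used only to demonstrate the loop.
DATA = list(range(2500))


def fake_query(limit, offset):
    return DATA[offset:offset + limit]


rows = list(fetch_all(fake_query, page_size=1000))
```

Each call stays within the per-query result cap while the loop as a whole covers the full result set. Offset-based paging assumes the query returns rows in a stable order from one call to the next.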

4. Summary and Outlook

In V2.9.4.3, Dataphin has been upgraded across the computing engine, data sources, data integration, monitoring and alerting, the asset center, and data services. In the next version, we will focus on adapting to the FusionInsight computing engine, upgrading data extraction, extending OpenAPI, adding data backfill capabilities in operations and maintenance, supporting multiple projects in data services, and more. Stay tuned!

This article is original content from Aliyun and may not be reproduced without permission.