Background

In a big data business system, the entire technology stack ecosystem is built around storage. Currently, the mainstream open source storage stacks fall into three types:

· HDFS: the Hadoop family, including Hive, HBase, Phoenix, etc.

· Elasticsearch: including Logstash, Elasticsearch, Kibana, etc.

· Kudu: including Impala, etc.

Whatever the storage ecosystem, the outer layer of the stack, which is responsible for data computation, is shared. For example, Spark and Flink can read and write data in almost all of these storage ecosystems. Which storage ecosystem a production environment adopts usually depends on the shape of the business and on how familiar the business team is with each technology stack.

For the many services in the big data ecosystem, Cloudera (CDH; HDFS and Kudu ecosystems) and Ambari (HDP; HDFS ecosystem) provide the ability to deploy, manage, monitor, and operate big data service components and big data nodes. However, after Cloudera merged with Hortonworks (the vendor of HDP), CDH and HDP were combined into the CDP distribution, and later releases no longer offer free functionality. This undoubtedly adds considerable development and operations costs for the big data businesses of domestic enterprises. Staying on the old free versions means giving up support for new features and timely bug fixes.

In this context, UCloud, drawing on years of big data platform development experience, recently released the free edition of USDP, a new one-stop intelligent big data platform for private deployment scenarios. The free edition of USDP supports the full HDFS, Kudu and ES ecosystems, helping enterprises improve the efficiency of big data development and operations and quickly build analysis and processing capacity for their big data business.



The one-stop intelligent big data platform with the broadest compatibility

USDP covers many open source big data components such as HDFS, Hive, HBase, Spark, Flink, Presto, Atlas and Ranger, and supports full-stack development, operations and management for them, including component operations, data middle platform construction, data development, and business visualization. USDP is delivered in a lightweight, easy-to-use, foolproof form, and its modules can be separated from one another, allowing a high degree of customization that flexibly matches the needs of various vertical industry scenarios.

At present, the services supported by the UCloud one-stop intelligent big data platform USDP are shown in the table. More open source ecosystem components are being added continuously.

Compared with Cloudera (CDH, CDP) and Ambari (HDP), USDP supports a richer set of big data services, including Flink, Kylin, Livy, Phoenix, Tez, Elasticsearch, Kibana, Azkaban, Presto, Atlas, Kafka Eagle, Zkui, etc. It covers almost all the mainstream technology frameworks, and the supported big data services are fully compatible with each other, so users can choose flexibly and use them on demand. In addition, the UCloud big data technical team keeps following the progress of the open source community and product feedback and fixes potential bugs in a timely manner, so users no longer have to deal with adapting big data services to one another. Now that CDH is offered under a subscription model, the free edition of USDP is undoubtedly an excellent choice for big data development and operations at present.

Self-developed management components for higher security and reliability

USDP is a one-stop intelligent big data platform developed independently by the UCloud big data team. Its overall architecture is shown in the figure below:

In the figure above, Manager Server is the USDP management service; it requires a MySQL instance to store cluster metadata. Agent is the USDP control service that runs on each worker node and is used to manage and operate the node and the big data services on it. BigData Service stands for the various big data services (such as HDFS, YARN, etc.).

InfluxDB, Prometheus and Grafana are monitoring services that aggregate and display the monitoring data of the entire cluster.

USDP supports clusters ranging from a minimum of 3 nodes up to thousands of nodes. It also allows Manager Server, Agent and other related services to be deployed on the same node. In this way, USDP meets the needs of large businesses while also letting small businesses cover their data analysis needs at low cost.

Core advantages of the USDP one-stop intelligent big data platform

No need to worry about vendor lock-in

The big data services and components included in USDP all comply with the Apache 2.0 open source license. After extensive compatibility testing, the UCloud big data team actively contributes back to the community and publishes the compiled compatibility packages. Because USDP keeps pace with the open source community, users can replace components, build their own stack, or migrate data and clusters independently at any time, so there is no need to worry about the big data business being locked in to a closed source service.

Foolproof deployment

To give users a minimalist solution for big data deployment, operations and management, USDP provides rich, detailed deployment and operations documentation. Users do not have to work through a lengthy installation: initializing the environment takes only a few steps, and configuration is then done automatically.

1- Environment check



2- Service deployment

Comprehensive and rich monitoring metrics

USDP's preset monitoring metrics fall into three parts:

• Full JMX metrics collection

• Common HTTP metrics collection

• Custom metrics collection

The monitoring data from these three parts is finally aggregated in USDP's Prometheus, and the most commonly used metrics are displayed on the overview page of each service. In Grafana, users can view the most detailed metrics through the official preset USDP monitoring templates (dashboards). If the preset monitoring charts do not meet business requirements, users can build the charts they need.
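To make the idea of custom metrics collection more concrete, here is a minimal sketch of how a custom metric could be rendered in the Prometheus text exposition format so that Prometheus can scrape it. It is only an illustration, not USDP's actual collector: the `Sample` class, the `render_exposition` function and the metric name `custom_pending_jobs` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One metric sample. All names in this sketch are hypothetical."""
    name: str
    value: float
    labels: dict = field(default_factory=dict)
    help_text: str = ""


def render_exposition(samples):
    """Render samples as Prometheus text exposition lines (gauges only).

    Assumes samples of the same metric are adjacent in the input and that
    label values need no escaping.
    """
    lines = []
    seen = set()
    for s in samples:
        if s.name not in seen:
            # HELP and TYPE are emitted once, before the first sample of a metric.
            seen.add(s.name)
            if s.help_text:
                lines.append(f"# HELP {s.name} {s.help_text}")
            lines.append(f"# TYPE {s.name} gauge")
        if s.labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(s.labels.items()))
            lines.append(f"{s.name}{{{label_str}}} {s.value}")
        else:
            lines.append(f"{s.name} {s.value}")
    return "\n".join(lines) + "\n"


samples = [
    Sample("custom_pending_jobs", 3.0, {"service": "yarn", "node": "node1"},
           "Jobs waiting in queue"),
    Sample("custom_pending_jobs", 0.0, {"service": "yarn", "node": "node2"}),
]
print(render_exposition(samples), end="")
```

In a real deployment this text would be served over HTTP from an endpoint that Prometheus is configured to scrape; the sketch only covers the formatting step.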

Flexible and convenient alerting service

USDP provides preset alert templates, which can be configured to send cluster metric alerts to different targets (WeChat, DingTalk, email, API calls, etc.). As with monitoring metrics, users who find that the preset alert templates do not meet business needs can modify them or add new alert rules.
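The relationship between alert rules and notification targets can be sketched as follows. This is a simplified illustration under stated assumptions, not USDP's alerting implementation: `AlertRule`, `evaluate`, the metric names and the notifier callables are all hypothetical, and real targets such as WeChat or DingTalk are replaced by plain functions that record the message.

```python
import operator
from dataclasses import dataclass

# Comparison operators an alert rule may use.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class AlertRule:
    """A threshold rule (hypothetical shape, for illustration only)."""
    name: str
    metric: str
    op: str
    threshold: float
    targets: tuple  # names of the notifiers to call when the rule fires


def evaluate(rules, metrics, notifiers):
    """Check each rule against current metric values and notify its targets.

    Returns the (rule name, target) pairs that were notified.
    """
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            continue  # metric not reported in this round; skip the rule
        if _OPS[rule.op](value, rule.threshold):
            message = f"[{rule.name}] {rule.metric}={value} {rule.op} {rule.threshold}"
            for target in rule.targets:
                notifiers[target](message)
                fired.append((rule.name, target))
    return fired


sent = []
notifiers = {
    "mail": lambda msg: sent.append(("mail", msg)),
    "webhook": lambda msg: sent.append(("webhook", msg)),
}
rules = [
    AlertRule("disk_high", "disk_used_percent", ">", 90.0, ("mail", "webhook")),
    AlertRule("nodes_low", "live_nodes", "<", 3, ("mail",)),
]
fired = evaluate(rules, {"disk_used_percent": 95.5, "live_nodes": 3}, notifiers)
print(fired)
```

Keeping the rules as data and the targets as interchangeable callables mirrors the idea described above: a new rule or a new target can be added without touching the evaluation logic.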

Professional technical support

The UCloud big data team has accumulated many years of experience in big data operations and business tuning on the public cloud. Through a continuously updated documentation knowledge base, the team provides expert technical support so that users can adopt USDP without worry.

Giving back to the open source community

The open source, fully compatible and optimized service packages used in the USDP free edition will be contributed back to the open source community, and a free download channel will be provided for developers.

USDP unlocks rich big data scenarios

With the USDP one-stop intelligent big data platform, industries of all kinds can implement the following application scenarios.

Data warehouse

At present, the data warehouse model most commonly used in China is the dimensional data warehouse, in which the data warehouse and data marts are built from fact tables and dimension tables. With the USDP one-stop intelligent big data platform, users can deploy the various services needed to build a dimensional data warehouse, helping enterprises quickly build a data middle platform.
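As a small illustration of the fact table and dimension table model, the sketch below builds a tiny star schema and runs one aggregate query over it. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere; on USDP the same modeling idea would be applied with a warehouse service such as Hive. The table names, columns and sample rows are all hypothetical.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table, then load sample rows."""
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "books"), (2, "games")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(10, "2021-01"), (11, "2021-02")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 10, 20.0), (1, 11, 30.0), (2, 10, 15.0), (2, 10, 5.0)])


def sales_by_category_and_month(conn):
    """Join the fact table to its dimensions and aggregate the measure."""
    return conn.execute("""
        SELECT p.category, d.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
for row in sales_by_category_and_month(conn):
    print(row)
```

The fact table holds only keys and measures, while descriptive attributes live in the dimension tables, which is what lets one query slice the same sales figures by category, by month, or by both.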

Machine learning

Machine learning uses algorithms to analyze large amounts of data, uncover the patterns they contain, and apply those patterns to prediction or classification, all of which is computationally demanding. With distributed computing frameworks such as Spark and Flink, which the USDP one-stop intelligent big data platform supports, machine learning applications can be developed efficiently.

Information retrieval

Quickly retrieving the required information from massive data has always been an important area of data application. The USDP one-stop intelligent big data platform integrates the distributed search and analytics engine Elasticsearch, the real-time retrieval database HBase, the data warehouse service Kylin, and more, providing efficient data retrieval. It can be used to build enterprise-class search engines, log management systems, and so on.

Finally, the key point: the free edition of the UCloud one-stop intelligent big data platform USDP is available now, and you are welcome to download and use it through the channels below.

USDP resource download address:

• US3:

https://s3-cn-bj.ufileos.com/…

• Baidu Netdisk:

Link:

https://pan.baidu.com/s/1mlic…

Extraction code: spp9
