Hu Haiyang: Hive Metastore Federation in didi practice

Product | technology drops

The author | hai-yang hu

This paper is from the offline engine group of big data architecture of Didi Basic platform, aiming at the practice of internal Hive metadata storage scheme with hundreds of millions of levels. This architecture fundamentally improves the stability and scalability of Hive metadata services.

▍ background

Apache Hive is a data warehouse based on Apache Hadoop. It provides an EASY-TO-use SQL-like query language for storing and querying large-scale data and is widely used.

Hive Metadata Contains Metadata information about databases, tables, and partitions created using Hive. Such meta information is typically stored in relational databases, such as Derby, MySQL, and so on.

In Didi, we store metadata in MySQL. With the continuous growth of the business, Hive metadata becomes larger and larger, and more than 100 million records are stored in a single table. In this case, MySQL query pressure is too large, and the CPU usage of the machine often reaches 100% during peak periods. Service stability is affected.

To this end, we began to investigate the metadata Federation scheme to achieve metadata horizontal expansion capability, decompress MySQL and improve Hive stability.

Federation program introduction

2.1 Hive Architecture Evolution

Hive prior to Federation:

The Hive client, Hivesever2 service, Spark engine, and Presto engine used by users access the unified Hive Metastore service to obtain Hive metadata.

Hive Metastore service consists of LVS and multiple Hive Metastore instances. All Hive Metastore instances share a primary and secondary MySQL environment to store the DB as Hive metadata.

The early stage of the work

In order to relieve the query pressure of Hive metadata store DB (MySQL), we did a lot of metadata governance work, including library, table, partition cleaning, and so on. Metadata access restrictions include query partition table partition restrictions and other functions, but these do not fundamentally solve the MySQL pressure problem.

Project research

The Hive metadata structure contains only one database, which contains over 100 million records in a single table. So why not consider MySQL sub library, sub table, etc.? If the MySQL database and table are used, a large number of changes to the Hive Metastore interface are required, which is risky and costly. In addition, the Hive version upgrade will bring more work.
Implement Federation (Waggle_dance) based on Hive Metastore layer, implement the idea is mainly based on Hive DB segmentation, in Hive DB level distributed metadata in multiple sets of MySQL environment storage, In this way, Hive Metastore itself does not need to be modified, and this solution is easy to maintain.

Hive Metastore Federation Hive Metastore Federation

2.2 introduce waggle_dance

Reference:

Cwiki.apache.org/confluence/…

Github.com/HotelsDotCo…

Waggle-dance is an open source project developed by Hotels.com. The project combines multiple Hive Metastore data query services and implements a unified routing interface to solve metadata sharing problems among multiple Hive Metastore environments.

2.2.1 Architecture Flow chart

From the architecture diagram, Waggle_dance plays the role of Router. Multiple Hive Metastore environments are configured on the back end to receive metadata requests from clients. Requests are forwarded to the corresponding Hive Metastore environment based on the routing relationship between the DB and Hive Metastore.

These operations are completely transparent to clients, who only access a set of Hive Metastore environments.

2.2.2 Internal Component Parsing

Waggle_dance is implemented based on the Spring-boot framework and consists of the following modules:

WaggleDance container

The service starts the class, mainly initializes the container Spring Boot, loads the Listener, and each key class is annotated and initialized.

YamlFederatedMetaStoreStorage module

Maintenance depends on the Hive Metastore environment configuration information and the route information between the Hive Database name and the Hive Metastore service.

Currently, you can configure one primary Hive Metastore and multiple secondary Hive metastores.

Access-control-type attribute is differentiated between primary Hive Metastore and secondary Hive Metastore configurations.

Primary Hive Metastore defines attribute access-Control-type: Supports the previous four policies for Hive metadata operations in the current environment.

Define attribute access-Control-type from Hive Metastore: Only the READ_ONLY policy is supported for Hive metadata operations in the current environment.

File configuration management module

WaggleDanceConfiguration: ThriftServer service configuration properties
GraphiteConfiguration: Graphite monitoring configuration

ThriftServer module

Implements the ThriftServer service and provides RPC services based on the Hive ThriftHiveMetastore API. The create_DATABASE, drop_DATABASE, and create_table interfaces are used to query summary information, such as get_all_DATABASES and set_UGi.

In this way, it is fully compatible with Hive clients and Hivesever2 service request protocols. Finally, the route is resolved to the corresponding Hive Metastore to establish a connection and invoke the method of the same name to perform operations.

A few caveats:

Interface operations are determined by the access-Control-type attribute of Hive Metastore defined in the path configuration file.

Interfaces that cannot be routed by library name are forwarded to the main Hive Metastore environment.

The Monitor module

MonitoringConfiguration, MonitoredAspect: Service monitoring logic, mainly USES the Java class library Metrics implementation (project’s official website http://metrics.dropwizard.io/), the library supports the related monitoring information through Ganglia and Graphite tools such as show, Monitors JVM memory, thread execution status, and Hive metadata operations.

FederationsAdmin Administrative module

FederationsAdminController: restful interface for the administrator use, can be dynamically completed the Federation Metastore registration, cancellation and Hive with Metastore routing information to modify the Database name.

Didi Practice

3.1 Service Transformation

The current waggle_dance function cannot fully meet our requirements and needs to be extended. The improvements are as follows:

The waggle-dance-federation.yml extension supports multiple primary Hive Metastores because metadata can be read and written to multiple Hive Metastores.
MappingEventListener Adds the MULTI_PRIMARY policy. Implement Hive Database names based on multiple primary Hive Metastore modes ->Metastore Mapping routing information classes.
Modify the processing logic of FederatedHMSHandler (compatible with Hive 1.2.1 and 2.x Hive ThriftHiveMetastore API). In this way, Metastore services of multiple back-end Hive versions can be fully compatible with clients that use Hive 1.2.1, facilitating Metastore service version upgrade.
Modify the monitoring module. Because the company uses Ganglia tools for internal monitoring, Graphite was transformed into Ganglia and the monitoring indicators were optimized.
Related performance optimization: In order to avoid the memory operation of creating a Database->Metastore Mapping routing information during each client connection, Each WagGLE_dance instance starts with a cache of Hive Database->Metastore Mapping routing information, which is shared by each client connection request thread.

For the maintenance of Database->Metastore mapping routing information, each Waggle_dance instance periodically requests multiple sets of Metastore services on the back-end to update data.

Flowchart of the modified Waggle_dance architecture:

3.2 Deployment

To distribute Hive metadata in online MySQL to multiple MySQL environments, you need to deploy a new MySQL environment (corresponding to the new Hive Metastore environment).

After internal stress testing, we concluded that a Waggle_dance instance could be plugged into a Hive Metastore environment. Considering the horizontal expansion of the Hive Metastore environment and service stability, a Waggle_dance cluster consisting of LVS plus four Waggle_dance instances is deployed.

In the future, Hive libraries and table metadata information stored in MySQL will be gradually migrated to the new MySQL environment. To reduce the impact on users during the migration, functions such as WAGgle-dance table-based routing will be developed in the future.

▍ summary

The current solution has been running stably for several months. The new architecture will support the horizontal expansion of multiple MySQL environments, fundamentally solving the performance and service stability problems caused by a single MySQL environment.

▍ END

Didi-Cloud

Hai-yang hu drops | foundation platform of senior engineer

In 2017, HE joined Didi and worked in the Big Data Architecture Department of basic platform. He has been engaged in the r&d of massive data processing platform for a long time. Focus on distributed computing, scheduling and storage systems.