Presto is well known for its fast query speed. Built on an MPP architecture, Presto can query Hive data quickly and supports connector extensions, with connectors for MySQL, MongoDB, Cassandra, Hive, and other data sources, making it a common SQL-on-Hadoop solution. So let's look today at some of the questions we need to consider when choosing Presto as our query engine.

Presto performance tuning and stability

Problems with Presto

Coordinator single point of failure (common workarounds: IP address drift, dynamic routing through an nginx proxy, etc.)

OOM (version 0.186+ supports spilling to disk; not yet verified)

No fault tolerance, no retry mechanism

The Presto deployment environment is complex, and the MPP architecture is easily affected by a single slow machine

Insufficient Presto concurrency

Tuning strategy

Deploy multiple Coordinators to avoid the single point problem, and encapsulate a query service on top to avoid direct JDBC connections

Have the query service retry when necessary (the task status needs to be checked first)

Properly configure worker-related memory parameters to avoid OOM

With Presto resource queues enabled, build query queues that match business scenarios to control concurrency and query priority, ensuring tasks complete efficiently

Build a Presto monitoring system to track cluster status and give early warnings, and dynamically adjust the Presto cluster size

Memory tuning

Presto is divided into three memory pools: GENERAL_POOL, RESERVED_POOL, and SYSTEM_POOL.

The SYSTEM_POOL is memory reserved for the system, used for worker initialization and task execution; it defaults to 0.4 * Xmx. The RESERVED_POOL is the region into which Presto promotes the current largest query when memory runs short; it defaults to 0.1 * Xmx and is configured by query.max-memory-per-node.

The GENERAL_POOL holds all other query memory, that is, the memory of every query except the promoted largest one. Its size is Xmx - SYSTEM_POOL - RESERVED_POOL (for example, with a 64G heap and the defaults, roughly 64 - 25.6 - 6.4 = 32G).

The overall memory configuration is affected by the following scenarios:

The data volume and complexity of user queries (these determine how much query memory is needed)

The concurrency of user queries (this determines how large the JVM heap should be)

Note that simply increasing the value of RESERVED_POOL does not solve Presto query problems, because the RESERVED_POOL does not take part in computation most of the time; it serves only one query at a time and is used only when both of the following hold:

A node's GENERAL_POOL is blocked, that is, its memory is insufficient

The RESERVED_POOL is not already in use

If the concurrency is high, keep the SYSTEM_POOL at its default or slightly larger, and increase the RESERVED_POOL to about 1/8 of the heap.

Presto jvm.config:

-XX:G1ReservePercent=15
-XX:InitiatingHeapOccupancyPercent=40
-XX:ConcGCThreads=8
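
Alongside the JVM flags, the pool sizes themselves come from the coordinator/worker configuration. A minimal etc/config.properties sketch for the memory-related properties (values are illustrative assumptions for a 64G heap, not recommendations, and must be sized against the observations above):

query.max-memory=200GB
query.max-memory-per-node=6GB
resources.reserved-system-memory=24GB

Here query.max-memory caps a single query's total user memory across the cluster, query.max-memory-per-node caps it per worker (and sizes the RESERVED_POOL), and resources.reserved-system-memory sizes the SYSTEM_POOL in versions that still have one.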

Presto monitoring

The Presto monitoring page shows only the current cluster status and some recent queries, which does not meet our needs, so query information must be collected (a collection sketch follows this list):

Query basic information (status, memory usage, total time, error messages, etc.)

Query performance information (time of each step, data input and output, including stage details and details of tasks under stage)

Anomaly alerting
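
A simple collection source for the basic query information is the coordinator's own system tables; a sketch (column names vary slightly between Presto versions):

SELECT query_id, state, "user", source, query, created, "end"
FROM system.runtime.queries
WHERE state IN ('FINISHED', 'FAILED');

Per-stage and per-task performance details are not in this table; they can be pulled from the coordinator REST interface (the /v1/query/<queryId> endpoint returns the full query JSON) and written into the monitoring store that drives the alerts.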

Presto follow-up optimization

Control the maximum number of partitions a query may read from a partitioned table (see the configuration sketch after this list)

Control the maximum number of splits generated by a single query, to prevent excessive consumption of computing resources

Automatically find and kill long-running queries

Presto query flow limiting (throttle queries that scan more than a threshold amount of data)

Enable Presto resource queues

Unified query engine
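
Some of the items above already have hooks in existing configuration; a sketch of two relevant properties (names and availability depend on the Presto version and connector, values are illustrative assumptions):

catalog/hive.properties — cap the partitions read by one table scan:
hive.max-partitions-per-scan=2000

config.properties — fail queries that run longer than a threshold:
query.max-run-time=1h

The split-count and data-volume limits, as well as automatically killing long-running queries, can be built on top of the query information collected by the monitoring system described earlier.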

Current version of Presto memory limits and management

Single-machine dimension

When a query's memory usage on a single worker exceeds the per-node limit, it fails with an error such as "Query exceeded local memory limit of x"; only the current query fails. Also, a node is considered a blocked node if its GENERAL_POOL has insufficient available memory and its revocable memory is 0.

The RESERVED_POOL can be thought of as holding the largest SQL query. If a query satisfies the GENERAL_POOL memory limit policy, it necessarily satisfies the RESERVED_POOL policy as well (the GENERAL_POOL policy is reused).

In the current version, the RESERVED_POOL itself cannot be memory-limited. Therefore, under high concurrency with very large scans, OOM can still occur. With Resource Groups and reasonable memory settings, however, the OOM problem is largely avoided.

Cluster dimension

Presto considers the cluster to be out of memory when both of the following occur:

The GENERAL_POOL is blocked (Block node)

The RESERVED_POOL has been used

When the cluster exceeds the cluster memory limit, there are two ways memory is managed:

Queries are iterated over one by one to check whether the total memory used by each exceeds query.max-memory (specified in config.properties); if so, that query is failed.

If query.max-memory is improperly configured, with a value so large that the first check may still not trigger after 5 seconds (the default delay), the second mechanism takes over. It kills queries according to the LowMemoryKillerPolicy, which has two variants: total-reservation and total-reservation-on-blocked-nodes. total-reservation kills the query that consumes the most memory across the cluster, while total-reservation-on-blocked-nodes kills the query that uses the most memory on the nodes that have run out of memory (blocked nodes). A configuration sketch follows.
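
The kill policy is selected in config.properties; a sketch (both values are illustrative — the 5m delay in particular is an assumption and should be checked against the defaults of the version in use):

query.low-memory-killer.policy=total-reservation-on-blocked-nodes
query.low-memory-killer.delay=5m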

Resource Groups

Resource Groups can be thought of as Presto's weak resource-limiting and isolation feature. It lets you specify the queue size, concurrency, and memory usage of each group. Setting a reasonable hardConcurrencyLimit, softMemoryLimit, and maxQueued for each group reduces interference between different workloads and avoids OOM problems. With well-planned users and some secondary development, Presto can also support multiple users sharing the same group, as well as authentication.
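
A minimal file-based Resource Group setup looks roughly like the following; the group names, limits, and selector regex are illustrative assumptions, not recommendations.

etc/resource-groups.properties:

resource-groups.configuration-manager=file
resource-groups.config-file=etc/resource_groups.json

etc/resource_groups.json:

{
  "rootGroups": [
    {
      "name": "global",
      "softMemoryLimit": "80%",
      "hardConcurrencyLimit": 100,
      "maxQueued": 1000,
      "subGroups": [
        {
          "name": "adhoc",
          "softMemoryLimit": "30%",
          "hardConcurrencyLimit": 10,
          "maxQueued": 100
        }
      ]
    }
  ],
  "selectors": [
    {
      "user": "bi_.*",
      "group": "global.adhoc"
    }
  ]
}

Queries from users matching the selector run in global.adhoc; at most 10 run concurrently, up to 100 wait in the queue, and anything beyond that is rejected, which doubles as the simplest form of the query flow limiting mentioned earlier.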