Scaling Our Analytical Processing Service: Sharding a PostgreSQL Database with Citus

Online advertisers are making more and more decisions based on performance data. Whether it’s choosing an audience or an idea to invest in, or enabling algorithmic optimization of advertising campaign budgets, decisions increasingly rely on ready-to-use data. Our development teams have built powerful tools to help our customers analyze performance data and make better decisions.

Our solution consists of highly customizable reports, including drill-down tables and charts backed by our own extremely flexible query language. The data service behind this query language processes terabytes of data. In addition to serving as the back end of our user-facing analytics tools, it supports all of our automatic optimization features and some of our internal BI systems. In this blog post, I’ll show you how we solved the scaling problem by sharding the database used by this back-end system.

Massive databases mean scaling trouble

Our analytical data processing service, called Distillery, uses a PostgreSQL database. The service safely converts JSON-formatted queries into SQL queries that eventually run at the database level. Most of the data processing takes place in the database, so the Distillery back end mainly translates our own query language into SQL. The complexity of the original API queries made some of the generated SQL queries complex and demanding at the database level. Therefore, we were not surprised when we encountered scaling problems during the development of the reporting system.

In the past, we scaled our primary-replica database architecture vertically, but it became clear that we had reached the limits of this approach. Our database had accumulated nearly 5 terabytes of data over its three years of operation and had become unwieldy. The sheer size made writes from our update-heavy application slow and made maintenance tasks difficult to perform. Finally, the biggest problem was that our data center could not provide larger servers.

Solution: shard the PostgreSQL database with Citus

When vertical scaling failed, we had to start scaling our reporting database horizontally. This meant splitting the data and its processing across multiple database servers. It also meant splitting the large statistics tables into smaller tables on each individual database instance.

This method of splitting database data into smaller units is called database sharding. Our team decided to use the Citus extension for PostgreSQL to handle the sharding. This wasn’t the only option; we also considered custom application-level sharding, but decided to use the Citus extension because:

  • We have a lot of complex queries that need to use multiple different shards at the same time. The Citus extension automatically handles these queries and distributes the processing between the shards.
  • It supports a wide range of the PostgreSQL features we need to run our complex report queries.
  • The extension makes shard management relatively easy, so we don’t have to spend much effort managing the shard tables in separate database instances.

Citus is based on coordinator and worker PostgreSQL database instances. The workers hold the shards of the database tables, and the coordinator plans SQL queries so that they can run across the shard tables on the workers. This allows large tables to be distributed across multiple servers and split into smaller, more manageable database tables. Writes to smaller tables are more efficient because the cost of maintaining database indexes is lower. In addition, the write load is parallelized and shared between the database instances. Citus addressed our two biggest pain points: write inefficiency and the imminent end of vertical scaling.
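As a minimal sketch of how such a cluster is assembled (the hostnames here are made up, and older Citus releases use master_add_node instead of citus_add_node), the extension is enabled on the coordinator and the workers are registered with it:

-- Run on the coordinator node:
CREATE EXTENSION citus;
SELECT citus_add_node('worker-1.example.com', 5432); -- register worker 1
SELECT citus_add_node('worker-2.example.com', 5432); -- register worker 2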

Citus’s database sharding brings additional benefits, because the new architecture speeds up our reporting queries. Some of our queries hit multiple worker instances and shards, and the Citus extension can optimize them to run in parallel across the different database instances. Queries that touch only a single shard on a single worker are also faster, since the table indexes are smaller and more resources are available for query processing on each worker.
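For example, an aggregation over several accounts fans out to the shards holding those accounts and runs in parallel on the workers. A hedged sketch (the ad_stats table mirrors the example below; the clicks column and account values are made up):

SELECT account_id, date, SUM(clicks) AS clicks
FROM ad_stats
WHERE account_id IN ('account-1', 'account-2', 'account-3')
GROUP BY account_id, date;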

Migrating large databases and complex report queries to this type of sharded database architecture is no easy task. It involves careful preparation and planning, which we will examine next.

Migrating to the new database

In the past, we ran report queries through our old PHP monolith. We had been building an updated reporting back end on Ruby on Rails long before the database scaling issue arose. We decided to handle the SQL query migration only in the new back end, and started phasing out the old one. This let us optimize the new report queries specifically for Citus, and it made the migration easier at the application level, because only this one service had to be migrated to work with the Citus-sharded PostgreSQL.

A sharded database imposes certain requirements on the database schema. The schema must contain a value that acts as the sharding condition; the sharding logic uses this value to determine which shard a given row lives on. In Citus-PostgreSQL, sharding is controlled through the table’s primary key. This composite primary key contains one or more columns, and the first column defined is used as the sharding value:

ALTER TABLE ad_stats ADD PRIMARY KEY (account_id, ad_id, date);
SELECT create_distributed_table('ad_stats', 'account_id'); -- Defines sharding for the Citus cluster

Here the account_id column is used as the shard key, which means we distribute data based on our customer accounts (a single customer can have multiple accounts). The data for a single account therefore lives in a single table shard. We had to ensure that all primary keys followed this format and that every table contained the account ID. We also had to change some foreign keys and uniqueness constraints, because they too had to contain the shard column. Fortunately, all of these changes could be applied safely to the running production database without performance or data-integrity issues, although we did have to run some rather extensive database index rebuilds.
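To illustrate what these constraint changes look like (a hedged sketch: the ads and campaigns tables mirror the join examples below, and the exact column names are assumptions), both uniqueness and foreign-key constraints were extended with the shard column:

-- A uniqueness constraint must include the shard column account_id:
ALTER TABLE ads ADD CONSTRAINT ads_name_unique UNIQUE (account_id, campaign_id, name);
-- A foreign key must also match on the shard column
-- (assuming campaigns has the primary key (account_id, id)):
ALTER TABLE ads ADD CONSTRAINT ads_campaign_fk
  FOREIGN KEY (account_id, campaign_id) REFERENCES campaigns (account_id, id);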

The second step was to make the SQL queries generated by our reporting back end compatible with the sharded database. First, queries must include the shard value in the SQL WHERE clause. This means, for example, that filters must take the following form:

SELECT * FROM campaigns WHERE account_id = 'xxx' AND name = 'yyy'

If we did not have the account_id condition, the Citus distributed query planner would have no information about which shard to look in for the relevant rows. Reading from all possible shards is far less efficient than reading from a single shard.

In addition, Citus imposes restrictions on the kinds of joins you can perform between sharded tables. Normally, a join requires the shard column to be included in the JOIN condition. For example, this will not work:

SELECT *
FROM
  campaigns
  LEFT JOIN ads ON campaigns.id = ads.campaign_id
WHERE
  campaigns.account_id = 'xxx'

This results in an error:

ERROR: cannot run outer join query if join is not on the partition column

This is because an SQL join requires a one-to-one match between table shards, which Citus cannot determine from this query. The query therefore needs to include the shard column in the JOIN condition, from which Citus can see that the joined rows of the ads table are within the same shard:

SELECT *
FROM
  campaigns
  LEFT JOIN ads ON campaigns.account_id = ads.account_id -- Use sharding column
    AND campaigns.id = ads.campaign_id
WHERE
  campaigns.account_id = 'xxx'

We made various other SQL query optimizations to let the Citus query planner run our complex statistical report queries efficiently. For example, we structure queries using common table expressions (CTEs), which allows the Citus query planner to select the best plan for heavy queries that read multiple shards simultaneously; a sketch of this structure follows below. Queries spanning multiple accounts are also highly parallelized across the Citus worker cluster to improve data-processing efficiency. In addition, we contributed to the Citus extension by adding support for PostgreSQL JSON(B) aggregates, which our report queries use for certain data pre-aggregation steps. You can check out the pull request on GitHub:

  • Github.com/citusdata/c…
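Returning to the CTE structure mentioned above, here is a hedged sketch of what such a query can look like (column names are illustrative): the per-account aggregation is expressed in a CTE that can be processed across the shards, and the coordinator combines the results:

WITH daily_stats AS (
  SELECT account_id, date, SUM(clicks) AS clicks
  FROM ad_stats
  WHERE account_id IN ('account-1', 'account-2')
  GROUP BY account_id, date
)
SELECT date, SUM(clicks) AS total_clicks
FROM daily_stats
GROUP BY date
ORDER BY date;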

The new database system in production

Our database system was moved completely from a single primary-replica configuration to a coordinator plus four worker servers, each replicated for high availability. This means our old database with its 5 terabytes of data was split into a cluster in which each database server holds about 1 terabyte of data. Citus lets us add more worker servers fairly easily, so we can shard the data further as the company continues to grow. We can also isolate the most demanding customers, with lots of statistics data, onto their own database servers.
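As a sketch of what this looks like operationally (the hostname and account value are made up, and shard rebalancing and tenant isolation have been enterprise features in some Citus versions):

SELECT citus_add_node('worker-5.example.com', 5432); -- add a new worker server
SELECT rebalance_table_shards('ad_stats'); -- move some shards onto it
SELECT isolate_tenant_to_new_shard('ad_stats', 'big-account-id'); -- isolate a demanding account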

Database schema before migration.

The migrated database schema.

The figures above depict the database architecture before and after the migration. We now have a total of ten database servers, compared with the two large database servers we had before. These smaller database instances are easier to manage, because most of the data lives on the separate worker servers. The coordinator holds smaller amounts of data, such as some metadata and data that is not sharded. The second figure also shows the database replicas we use to ensure quick recovery in the event of a database instance failure. Failover from a primary server to its replica is handled by the PgPool component, and the replicas also take some of the read load off the primaries.
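On the coordinator, Citus’s metadata makes it easy to check how the shards are distributed across the workers; a minimal sketch:

-- Count shard placements per worker node:
SELECT nodename, count(*) AS shard_count
FROM pg_dist_shard_placement
GROUP BY nodename
ORDER BY nodename;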

Finally, our pivot table report queries, the most demanding in terms of data processing, saw a 2-10x performance improvement with the new database system. The database queries generated by this feature are complex, because we allow users to define how the data is grouped, filtered, and aggregated, and the queries can range freely across shards because users can select any combination of accounts. The benefits of the Citus-sharded database really show in these specific queries. The database migration was necessary because our old database infrastructure was nearly overwhelmed by the complex queries this feature generated.

The figure shows the 90th-percentile durations for certain query types during the database migration project, illustrating the performance gains.

Further reading

  • Distributed PostgreSQL: Citus architecture and concepts