Reading from memory is always more efficient than reading from disk, so with any database technology you want to keep as much data as possible in memory. But if you are unsure about the configuration, or the configuration is incorrect, this can lead to high memory utilization and even out-of-memory (OOM) problems.

In this blog post, we’ll learn how to check PostgreSQL memory utilization and which parameters to consider when tuning it. To that end, let’s start with an overview of the PostgreSQL architecture.

PostgreSQL architecture

At a high level, PostgreSQL consists of three parts: processes, memory, and disk.

Memory can be divided into two categories:

Local Memory: Allocated by each backend process (one per client connection) for its own query processing. It is divided into several sub-regions:

  • Work mem: Working memory for sorting tuples in ORDER BY and DISTINCT operations, and for joining tables.

  • Maintenance work mem: Used for certain kinds of maintenance operations, for example VACUUM (if autovacuum_work_mem is not set).

  • Temp buffers: Used to store temporary tables.

Shared Memory: Allocated by the PostgreSQL server at startup and shared by all processes. It is divided into several sub-regions:

  • Shared buffer pool: PostgreSQL loads pages of tables and indexes from disk into this area and operates on the data directly in memory, reducing disk access.

  • WAL buffer: WAL data is the PostgreSQL transaction log, containing the changes made in the database. Data in the WAL buffer is written to the WAL file on disk at predefined points, called checkpoints, to avoid data loss in the event of a server failure.

  • Commit log: Stores the state of all transactions and is used for concurrency control.
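To see how these memory areas are sized on a running server, you can query the pg_settings view. A minimal sketch, assuming you can connect locally with psql as a superuser:

psql -c "SELECT name, setting, unit FROM pg_settings WHERE name IN ('shared_buffers', 'wal_buffers', 'work_mem', 'maintenance_work_mem', 'temp_buffers');"

Each of these parameters is discussed in more detail later in this post.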

How to check memory utilization

If you see high memory utilization, the first step is to identify which process is responsible for the consumption.

Using the “top” command

This command displays the processes that are consuming the most memory on the system.
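For example, on Linux you can start top already sorted by memory usage (or press Shift+M inside an interactive session):

# Sort processes by memory usage; the RES column shows resident memory per process
top -o %MEM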

Once you confirm that PostgreSQL is responsible for the consumption, the next step is to find out why.

Using the PostgreSQL logs

When memory runs out, you will see a message like the following:

Out of memory: Kill process 1161 (postgres) score 366 or sacrifice child

Or other error messages may appear when the database is behaving abnormally:

FATAL:  password authentication failed for user "username"
ERROR:  duplicate key value violates unique constraint "sbtest21_pkey"
ERROR:  deadlock detected

Logs are therefore useful for detecting these kinds of problems. You can automate monitoring by parsing the log files for keywords such as “FATAL”, “ERROR”, or “Kill” and receiving an alert when they appear.
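A minimal sketch of such a check, assuming the default log location of a Debian/Ubuntu-style installation (adjust the path to your log_directory setting):

# Look for fatal errors and other problems in the PostgreSQL log
grep -E 'FATAL|ERROR' /var/log/postgresql/postgresql-*.log
# OOM killer messages are written by the kernel, so check there as well
dmesg -T | grep -i 'out of memory'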

Using pg_top

Another tool, pg_top, can be used when you already know that the PostgreSQL process has high memory consumption but the logs do not show any problems.

This tool is similar to top, but specific to PostgreSQL. You can use it to get more detailed information about the database service, and even to kill queries or run an EXPLAIN on a query if you detect a problem.
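Invoking it looks much like psql; the connection flags below are placeholders and may vary between versions, so check pg_top --help on your system:

# Connect pg_top to a local database (host, port, user, and dbname are examples)
pg_top -h localhost -p 5432 -U postgres -d mydb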

If you still cannot find an error and the database is still using a large amount of memory, the next thing to check is the database configuration.

Which configuration parameters to check

If everything else looks fine but utilization is still high, you should check the configuration to make sure it is correct. In that case, the following parameters are the ones to consider.

shared_buffers

This is the size of the shared buffer pool used by the database server. If this value is too low, the database will do more disk I/O, which results in slower performance; if it is too high, it can cause high memory utilization. According to the documentation, if you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the system memory.
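For example, on a dedicated server with 8GB of RAM, the 25% rule suggests 2GB; the numbers here are illustrative, so size to your own hardware (and note the service name may differ on your distribution):

psql -c "ALTER SYSTEM SET shared_buffers = '2GB';"
# shared_buffers only takes effect after a full restart of the service
sudo systemctl restart postgresql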

work_mem

This parameter specifies how much memory operations such as ORDER BY, DISTINCT, and joins can use before writing to temporary files on disk. As with shared_buffers, setting it too low sends more operations to disk, while setting it too high risks exhausting memory. The default value is 4 MB.

max_connections

work_mem is also closely related to the max_connections value, because each connection may perform these operations at the same time, and each operation is allowed to use up to work_mem of memory before it starts writing data to temporary files. max_connections determines the maximum number of simultaneous connections to our database; if we configure a high number of connections without accounting for work_mem, we can run out of memory. The default value is 100.
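A quick back-of-the-envelope check makes the relationship clear. With the defaults, 100 connections each running one work_mem-sized sort could use 100 x 4MB = 400MB on top of shared_buffers, and a single complex query can allocate work_mem several times (once per sort or hash node), so the real worst case is higher. To check your current values (psql 9.6+ accepts multiple -c flags):

psql -c "SHOW work_mem;" -c "SHOW max_connections;"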

temp_buffers

Temporary buffers store the temporary tables used within each session. This parameter sets the maximum amount of memory for that task. The default value is 8 MB.

maintenance_work_mem

This is the maximum amount of memory that maintenance operations such as VACUUM, CREATE INDEX, or ALTER TABLE ADD FOREIGN KEY can consume. Only one of these operations can run at a time in a session, and it is uncommon for many of them to run concurrently, so this value can safely be set larger than work_mem. The default value is 64 MB.

autovacuum_work_mem

By default this is set to the value of maintenance_work_mem, but we can use this parameter to set a separate limit: the maximum amount of memory used by each autovacuum worker process.

wal_buffers

This is the amount of shared memory used for WAL data that has not yet been written to disk. By default, this parameter is set to about 3% (1/32) of shared_buffers, but not less than 64kB and not more than the size of one WAL segment, typically 16MB.
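Once you have chosen new values, most of the parameters above can be changed with ALTER SYSTEM followed by a configuration reload; shared_buffers, wal_buffers, and max_connections require a full restart instead. A minimal sketch, with example values rather than recommendations:

# work_mem and maintenance_work_mem only need a configuration reload
psql -c "ALTER SYSTEM SET work_mem = '8MB';"
psql -c "ALTER SYSTEM SET maintenance_work_mem = '256MB';"
psql -c "SELECT pg_reload_conf();"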

Conclusion

There are many possible reasons for high memory utilization, and finding the root cause can be a time-consuming task. In this blog post, we’ve covered different ways to check PostgreSQL memory utilization and which parameters to consider when tuning it to avoid excessive memory usage.