DolphinDB is a high-performance distributed time-series database that supports multiple users, multiple tasks, and concurrent operations. Efficient memory management for big data is one of the reasons for its excellent performance. Memory management covered in this tutorial includes the following aspects:

  • Memory management of variables: allocates and reclaims the memory that users' programming environments require.
  • Cache management of distributed tables: multiple sessions share partitioned table data to improve memory utilization.
  • Stream data cache: the publishing node maintains a persistence queue and a send queue, and the subscribing node maintains a receive queue.
  • DFS database write cache: data written to a DFS database first goes to the WAL and a write cache, and is then flushed in batches to improve throughput.

1. Memory management mechanism

DolphinDB requests memory blocks from the operating system and manages them itself. The system periodically checks the requested blocks and returns idle ones to the operating system. Currently, memory allocation for vectors, tables, and all strings is handled by DolphinDB's memory management system.

The configuration parameter maxMemSize specifies the maximum memory a node may use. If it is set too small, cluster performance is severely limited; if it is set too large, for example beyond physical memory, the operating system may forcibly kill the process. If the machine has 16GB of memory and only one node is deployed, setting this parameter to about 12GB is recommended.

When a user's query or program needs memory, DolphinDB requests it from the operating system in 512MB blocks. If the operating system cannot provide a large contiguous block, it falls back to smaller blocks such as 256MB or 128MB.

DolphinDB caches as much database partition data as possible when the node memory usage is less than maxMemSize to speed up the next access. When the memory is insufficient, the system automatically removes some caches.

Every 30 seconds, completely free memory blocks are returned to the operating system: when a user releases a variable or calls the clearAllCache function to release caches, a memory block is returned only if it is entirely free. If even a small part of a block is still in use, for example 10MB of a 512MB block, the block is not returned to the operating system.
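The allocation and release described above can be observed with the mem() and clearAllCache() functions. The following is a minimal sketch (variable names are illustrative):

// Sketch: observe DolphinDB's internal memory blocks
x = mem()                                  // one row per internally managed memory block
totalAlloc = sum(x.blockSize)              // total bytes requested from the operating system
totalUsed = sum(x.blockSize - x.freeSize)  // bytes currently occupied by variables and caches
clearAllCache()                            // drop partition caches so that fully freed blocks can be returned to the OS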

2. Variable memory management

2.1 Creating variables

On the DolphinDB node, create user user1 and log in. Create a vector of 100 million INT elements, about 400MB.

Example 1. Create a vector variable

// log in as admin, then create and switch to user1
createUser("user1","123456")
login("user1","123456")
v = 1..100000000
sum(mem().blockSize - mem().freeSize)    // display the memory usage

The result is 402,865,056, and the memory usage is about 400MB, which meets the expectation.

Create a table with 10 million rows, 5 columns, 4 bytes each, about 200MB.

Example 2. Create the table variable

n = 10000000
t = table(n:n,["tag1","tag2","tag3","tag4","tag5"],[INT,INT,INT,INT,INT])
(mem().blockSize - mem().freeSize).sum()

The result is 612,530,448, about 600MB, as expected: roughly 200MB for the table plus the 400MB vector created in Example 1.

2.2 Releasing variables

The memory of a variable can be freed through the undef function.

Example 3. Use the undef function or assign to NULL to release the variable

undef(`v)

or

v = NULL

In addition to releasing variables manually, all memory of a session is reclaimed when the session is closed, for example when the GUI or an API connection is closed. If a session performs no operation for 10 minutes, the system closes it and automatically reclaims its memory.

3. Cache management of distributed tables

DolphinDB manages distributed tables by partition. The cache of distributed tables is shared globally: in most cases, different sessions or read transactions see the same copy of the data (the version may differ), which greatly reduces memory usage.

Historical databases exist in the form of distributed tables, and users often interact directly with distributed tables in their daily query operations. Memory management for distributed tables has the following features:

  • Memory is managed by partition column.
  • Data is only loaded to the node where it resides and is not transferred between nodes.
  • When multiple users access the same partition, the same cache is used.
  • If the memory usage is less than maxMemSize, cache as much data as possible.
  • When the cache data reaches maxMemSize, the system automatically reclaims it.

The following examples are based on a cluster deployed on two data nodes with a single replica. Each partition holds 10 million rows: a single column of a partition is about 80MB (10 million rows × 8 bytes per column), and a full partition is about 800MB (10 million rows × 80 bytes per row). The entire table has 300 million rows and is 24GB in size.

The clearAllCache() function clears the cached data and is used to clear all caches on the node before each of the following tests.
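The database and table used in these examples are referred to by the placeholders dbName and tableName. The following sketch shows how a comparable table might be created; the path, column names, and types are illustrative assumptions rather than the exact schema used in this tutorial:

// Hypothetical setup: a DFS database value-partitioned by day, with several 8-byte value columns
dbName = "dfs://demoDB"
tableName = "sensors"
db = database(dbName, VALUE, 2019.01.01..2019.01.30)
schema = table(1:0, `day`tag1`tag2`tag3`tag4, [DATE,DOUBLE,DOUBLE,DOUBLE,DOUBLE])
pt = db.createPartitionedTable(schema, tableName, `day)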

3.1 Memory is managed by partition column

DolphinDB uses column storage. When users query data from distributed tables, only the required partitions and columns are loaded into memory.

Example 4. Calculate the maximum value of tag1 for partition 2019.01.01. The partition is stored on node1; you can check the partition distribution on the controller using getClusterChunksStatus(). Each column of the partition is about 80MB. Execute the following code on node1 and check the memory footprint.

select max(tag1) from loadTable(dbName, tableName) where day = 2019.01.01
sum(mem().blockSize - mem().freeSize)

The output is 84,267,136. The query touches only one column of one partition, so only that column is loaded into memory in its entirety; the other columns are not loaded.

Example 5. Query the first 100 rows of partition 2019.01.01 on node1 and check the memory usage.

select top 100 * from loadTable(dbName, tableName) where day = 2019.01.01
sum(mem().blockSize - mem().freeSize)

The output is 839,255,392. Only 100 rows are returned, but the smallest unit DolphinDB loads is a partition column, so every column of the partition must be loaded in full, that is, the entire partition of about 800MB.

Note: Partition properly to avoid "out of memory": DolphinDB manages memory by partition, so memory usage depends heavily on how the data is partitioned. If partitions are uneven and one partition holds so much data that even the machine's entire memory cannot accommodate it, any query or computation involving that partition will throw an "out of memory" exception. As a general rule, if maxMemSize is set to 8GB, keep the total size of the commonly queried columns in each partition at roughly 100-200MB. For example, if a table has 10 commonly queried columns of 8 bytes each, each partition should hold about 1 to 2 million rows.
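A back-of-envelope version of this sizing rule (the 150MB target is illustrative):

// Sizing sketch: aim for roughly 150 MB of commonly queried data per partition
bytesPerRow = 10 * 8                                    // 10 frequently queried columns, 8 bytes each
rowsPerPartition = (150 * 1024 * 1024) / bytesPerRow    // about 1.97 million rows per partition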

3.2 Data is only loaded to the node where it is located

When the amount of data is large, transferring data between nodes is time-consuming. DolphinDB data is distributed, and when a computing task is executed, the task is sent to the node where the data resides rather than the data being transferred to the computing node. This reduces data transfer between nodes and improves computing efficiency.

Example 6. Calculate the maximum value of tag1 across two partitions on node1. The 2019.01.02 data is stored on node1, and the 2019.01.03 data is stored on node2.

select max(tag1) from loadTable(dbName, tableName) where day in [2019.01.02, 2019.01.03]
sum(mem().blockSize - mem().freeSize)

The output on node1 is 84,284,096, and the memory usage on node2 is 84,250,624. Each node caches about 80MB of data: node1 holds the 2019.01.02 partition and node2 holds the 2019.01.03 partition.

Example 7. Query all data for 2019.01.02 and 2019.01.03 on node1. We expect 2019.01.02 data to be loaded on node1 and 2019.01.03 data to be loaded on node2.

select * from loadTable(dbName, tableName) where day in [2019.01.02, 2019.01.03]
sum(mem().blockSize - mem().freeSize)

The output on node1 is 839,279,968. The output on node2 is 839,246,496. The results were in line with expectations.

Note: Use "select *" without filter conditions with caution, because it loads all data into memory; be especially careful when the table has many columns. It is recommended to load only the columns you need. Even "select top 10 *" without filter conditions loads all data of the first partition into memory.

3.3 The same cache is used when multiple users access the same partition

DolphinDB supports concurrent queries for large amounts of data. For efficient memory utilization, only one copy of data in the same partition is kept in memory.

Example 8. Open two GUI clients, connect to node1 and node2 respectively, and query the data of partition 2019.01.01, which is stored on node1.

select * from loadTable(dbName, tableName) where day = 2019.01.01
sum(mem().blockSize - mem().freeSize)

No matter how many times the above code is executed, the memory shown on node1 is always 839,101,024, and there is no memory footprint on node2. Because the partition is stored only on node1, node1 loads all of its data while node2 uses no memory.

3.4 Relationship between Node Memory Usage and cache data

3.4.1 If the node memory usage does not exceed maxMemSize, cache as much data as possible

DolphinDB caches as much data as memory allows (as long as usage does not exceed the user-configured maxMemSize) to speed up subsequent access to the same data.

Example 9. maxMemSize is set to 8GB on the data node. Nine partitions, each about 800MB, are loaded in succession, for a total of about 7.2GB. Observe the memory trend.

days = chunksOfEightDays();
for(d in days){
    select * from loadTable(dbName, tableName) where day = d
    sum(mem().blockSize - mem().freeSize)
}

Memory changes as the number of loaded partitions increases, as shown in the figure below:

As each partition is traversed, its data is fully cached in memory as long as total usage does not exceed maxMemSize, so the next time the user accesses the data it can be served directly from memory rather than loaded from disk again.

3.4.2 When the node memory usage reaches maxMemSize, the system automatically reclaims the node memory

As long as total memory usage stays below maxMemSize, cached data is not reclaimed. When usage reaches maxMemSize, DolphinDB evicts caches with an LRU strategy to free enough memory for the user.

Example 10. The previous example loaded only 8 days of data. Now we continue iterating until 15 days in total have been loaded and check the memory usage once the cache reaches maxMemSize. As shown below:

As shown in the figure above, when the cached data exceeds maxMemSize, the system automatically reclaims the memory, and the total memory usage is still less than the maximum memory size of 8GB set by the user.

Example 11. When the cached data is close to the user-configured maxMemSize, allocate memory for a session variable and observe the system memory usage. First check the current usage:

sum(mem().blockSize - mem().freeSize)

The output is 7,550,138,448, that is, more than 7GB is in use while the configured maximum is 8GB. We now request another 4GB.

v = 1..1000000000
sum(mem().blockSize - mem().freeSize)

The output is 8,196,073,856, about 8GB. This shows that when a user defines a variable, reclamation of cached partition data is also triggered to ensure there is enough memory for the user.

4. Stream data message cache queue

When data enters the streaming system, it is first written to the stream table, and then to the persistence queue and the send queue (assuming the user has enabled asynchronous persistence). A background thread asynchronously writes the persistence queue to disk, and the messages in the send queue are sent to the subscribers.

When the subscriber receives data, it puts it into the receive queue, and the user-defined handler retrieves the data from the receive queue and processes it. If the handler is slow, data accumulates in the receive queue and occupies memory. As shown below:

Configuration options related to streaming data memory:

  • maxPersistenceQueueDepth: the maximum depth of the message queue used for stream table persistence. For published stream tables with asynchronous persistence, data is first put into the persistence queue and then asynchronously written to disk. The default is 10 million messages. When disk writes become a bottleneck, data accumulates in this queue.
  • maxPubQueueDepthPerSite: the maximum depth of the message publishing queue. For each subscriber node, the publishing node establishes a publishing queue, and the messages in the queue are sent to that subscriber. The default is 10 million messages. When the network is congested, data accumulates in the send queue.
  • maxSubQueueDepth: the maximum depth of the queue for received messages on the subscriber node. Subscribed messages are first placed in this queue. The default is 10 million messages. When the handler is too slow to process subscribed messages in time, data accumulates in this queue.
  • Capacity of the stream table: specified by the fourth parameter of the function enableTablePersistence(), this value is the maximum number of rows of the stream table kept in memory; when it is reached, half of the rows are removed from memory. If a streaming node hosts many stream tables, set this parameter carefully to avoid running out of memory.

At run time, the size of the stream table and the depth of each queue can be viewed with the function getStreamingStat().
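A minimal sketch of checking these queues (the exact keys of the returned dictionary may vary by version):

// Inspect streaming statistics on a publishing or subscribing node
stat = getStreamingStat()
stat.pubConns      // publish connections, including the depth of each publish queue
stat.subWorkers    // subscriber workers, including the depth of each receive queue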

5. Provide cache for writing to DFS database

To improve read/write throughput and reduce latency, DolphinDB follows the common practice of writing to the WAL and a cache first, and then flushing to disk in batches once a certain amount of data has accumulated. This reduces the number of interactions with disk files and improves write performance by more than 30%. Consequently, some memory is also needed to temporarily cache this data, as shown in the following figure:

When transactions T1, T2, and T3 have all completed, the data of the three transactions is written to the DFS database on disk in one batch. It is recommended that the size of the cache engine be 1/8 to 1/4 of maxMemSize, adjusted according to the maximum memory and the write volume. The size of the cache engine is configured with the chunkCacheEngineMemSize parameter.

  • chunkCacheEngineMemSize: specifies the capacity of the cache engine. When the cache engine is enabled, written data first goes into the cache; when the data in the cache reaches 30% of chunkCacheEngineMemSize, it is flushed to disk.
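A sketch of the related entries in a data node configuration file; the values are illustrative, chunkCacheEngineMemSize here is about 1/4 of maxMemSize, and the dataSync entry (which enables synchronous redo-log writes and is typically paired with the cache engine) is an assumption to verify against the manual for your version:

maxMemSize=8
chunkCacheEngineMemSize=2
dataSync=1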

6. Use memory efficiently

In production environments DolphinDB often serves both as a stream data center and as a historical data warehouse for business users to query and compute on. When there are many users, improper use may exhaust server memory and an "out of memory" exception may be thrown. The following suggestions help avoid such misuse.

  • Partition the database properly: DolphinDB loads data by partition, so partition size has a large impact on memory. Reasonably uniform partitioning benefits both memory usage and performance. Therefore, when creating a database, plan the partition size according to the data volume, keeping the commonly queried columns of each partition at roughly 100MB.
  • Release variables holding large amounts of data in time: if you create a large variable, for example v = 1..10000000, or assign a large query result to a variable, for example t = select * from loadTable(dbName, tableName) where date = 2010.01.01, then v and t occupy a large amount of memory in the user's session. If they are not released in time, other users may hit out-of-memory exceptions when requesting memory.
  • Query only the columns you need: avoid select *, which loads every column of the partition into memory when in practice only a few columns are usually required. To avoid wasting memory, write out the needed columns explicitly instead of using *.
  • Filter on the partitioning column: DolphinDB retrieves data by partition; if a query does not filter on the partitioning column, all partitions are scanned and memory is quickly exhausted when the data is large. If there are multiple filter conditions, write the partition filter first (see the sketch after this list).
  • Release unnecessary variables or sessions as soon as possible: as analyzed above, a user's private variables live in the session that created them, and this memory is reclaimed when the session closes. Therefore, use the undef function or close the session as early as possible to free memory.
  • Configure the stream data cache properly: the capacity of a stream table directly affects the memory usage of the publishing node. For example, if the capacity is set to 10 million, about half of the rows are removed from memory once the row count exceeds 10 million, i.e. roughly 5 million rows are retained. Therefore, design the stream table capacity according to the maximum memory of the publishing node, especially when there are multiple published stream tables.
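The following sketch illustrates the column and partition-filter recommendations above; dbName, tableName, and the columns are the placeholders used earlier, and the commented-out statement shows the pattern to avoid:

// Preferred: explicit columns, with the partition filter written first
t1 = select tag1, tag2 from loadTable(dbName, tableName) where day = 2019.01.01, tag1 > 100
// Avoid: select * with no partition filter, which scans every partition
// select * from loadTable(dbName, tableName) where tag1 > 100
undef(`t1)    // release the result once it is no longer needed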

7. Memory monitoring and FAQs

7.1 Memory Monitoring

7.1.1 Controller Monitoring the memory usage of nodes in a cluster

The function getClusterPerf() is provided on the controller to display the memory usage of each node in the cluster, including:

memAlloc: the total memory allocated within DolphinDB's memory pool on the node, approximately the total memory requested from the operating system.

memUsed: the memory in use on the node, including variables, distributed table caches, and the various cache queues.

memLimit: the maximum memory available to the node, i.e. the user-configured maxMemSize.
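For example, a quick check on the controller (a sketch; the exact column set of getClusterPerf() may differ by version):

perf = getClusterPerf()    // one row per node, with memAlloc, memUsed, and memLimit among the columns
perf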

7.1.2 The mem() function monitors the memory usage of a node

The mem() function displays the memory allocation and usage of the whole node. It outputs four columns, of which blockSize is the size of each allocated memory block and freeSize is the free size remaining in that block. sum(mem().blockSize - mem().freeSize) gives the total memory currently used by the node.

7.1.3 Monitoring memory Usage of Different Sessions on a Node

The function getSessionMemoryStat() shows the amount of memory occupied by each session on the node; it covers only variables defined within each session. If a node's memory usage is too high, this function can be used to find out which user is consuming a lot of memory.
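For instance, sessions can be ranked by memory usage as follows (the column name memSize is assumed; check the actual output columns on your version):

stat = getSessionMemoryStat()
select * from stat order by memSize desc    // sessions using the most memory appear first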

7.1.4 Checking the Memory Usage of an Object

Use the memSize function to view how much memory an object occupies, in bytes. For example:

v = 1..1000000
memSize(v)

Output: 4000000.

7.2 Common Problems

7.2.1 The Node Memory Usage is Too High

DolphinDB caches as much data as memory allows, so it is normal for a node's memory usage to appear high and close to maxMemSize as long as no memory-related errors occur. getSessionMemoryStat() can be used to check the memory usage of each session, and clearAllCache() can be used to manually release the node's cached data.

7.2.2 The value of memAlloc differs from the memory shown by the operating system

DolphinDB is a C++ program whose runtime itself needs some basic data structures and memory overhead; memAlloc does not include this part of memory, so it is somewhat smaller than the process memory reported by the operating system.

7.2.3 “Out of Memory” Is Reported During Query

This exception is usually caused by a query requiring more memory than the system has available. The possible causes are as follows:

  • The query does not have partition filtering criteria or the criteria are too wide, resulting in a large amount of data involved in a single query.
  • The partition is not uniform. A partition may be too large, and the data in the partition exceeds the maximum memory configured for the node.
  • A session holds large variables, resulting in a small amount of memory available to the node.

7.2.4 When querying data, the DolphinDB process exits without generating a coredump

This usually happens because the memory used by the node exceeds the physical memory of the machine and the operating system forcibly kills the process. On Linux, the cause can be confirmed in the operating system logs (for example, with dmesg).

7.2.5 memUsed does not decrease significantly after the clearAllCache() function is executed

You can use getSessionMemoryStat() to see how much memory each session occupies. It is possible that some session holds variables occupying a large amount of memory that have not been released, so that memory cannot be reclaimed.