Author: Tencent Cloud database kernel team

MyRocks/RocksDB — STATISTICS and background threads

0. Intro

In Facebook’s version of MySQL (hereafter MyRocks), RocksDB is an optional storage engine. One important advantage of RocksDB over the InnoDB engine is that it uses less disk space. In production systems, especially Internet applications with more than 100 million users, disk space is one of the larger costs, so this saving is attractive. However, adopting a new storage engine in production carries risk. Beyond the performance data gathered with external benchmark tools, a full set of internal metrics shows what is actually happening inside the database, which guides both performance tuning and development. MyRocks exposes comprehensive internal metrics through SHOW ENGINE ROCKSDB STATUS and several INFORMATION_SCHEMA tables.

This article describes how STATISTICS and background threads are implemented in SHOW ENGINE ROCKSDB STATUS. Once we understand how the implementation works, it is easy to extend it to suit our own needs.

The SHOW ENGINE ROCKSDB STATUS statement returns multiple rows of data, including:

  • STATISTICS: Total count/time of all operations performed by all threads of the RocksDB engine, such as rocksdb.block.cache.hit and rocksdb.db.write.micros.
  • BG_THREADS: Status of background threads.
  • DBSTATS: Statistics of database operations.
  • CF_COMPACTION: Compaction statistics for each column family.
  • MEMORY_STATS: Memory usage.

The rows returned by SHOW ENGINE ROCKSDB STATUS are not stored in any table. Instead, the rocksdb_show_status function in the rocksdb/ha_rocksdb.cc file formats the values held in memory and returns them to the user.

1. STATISTICS

According to the official RocksDB documentation, enabling STATISTICS adds about 5%-10% overhead.

STATISTICS records the total count/time of all operations performed by all threads of the RocksDB engine. The RocksDB code contains many instrumentation points for operations such as Put/Get/Delete.

Take the GetEntryFromCache function, which looks up an entry in the block cache, as an example. Note that statistics is passed as an argument to both GetEntryFromCache and block_cache->Lookup; the statistics object is threaded through the code so that data can be collected everywhere. On a cache hit, RecordTick is called three times to increment three tickers. On a cache miss, the counts for BLOCK_CACHE_MISS and block_cache_miss_ticker are incremented instead.

Cache::Handle* GetEntryFromCache(Cache* block_cache, const Slice& key,
                                 Tickers block_cache_miss_ticker,
                                 Tickers block_cache_hit_ticker,
                                 Statistics* statistics) {
  auto cache_handle = block_cache->Lookup(key, statistics);
  if (cache_handle != nullptr) {
    PERF_COUNTER_ADD(block_cache_hit_count, 1);
    // overall cache hit
    RecordTick(statistics, BLOCK_CACHE_HIT);
    // total bytes read from cache
    RecordTick(statistics, BLOCK_CACHE_BYTES_READ,
               block_cache->GetUsage(cache_handle));
    // block-type specific cache hit
    RecordTick(statistics, block_cache_hit_ticker);
  } else {
    // overall cache miss
    RecordTick(statistics, BLOCK_CACHE_MISS);
    // block-type specific cache miss
    RecordTick(statistics, block_cache_miss_ticker);
  }

  return cache_handle;
}

1.1 STATISTICS interface for RocksDB

Using STATISTICS is also simple.

Its header files are located at:

include/rocksdb/statistics.h
monitoring/statistics.h

Usage:

Options options;
options.statistics = rocksdb::CreateDBStatistics();

Optional statistics levels:

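  • kExceptDetailedTimers: collects all statistics except the time spent waiting on mutexes and the time spent on compression
  • kExceptTimeForMutex: collects all statistics except the time spent waiting on mutexes
  • kAll: collects all statistics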

There are two types of data statistics:

  • Ticker: a counter stored as a 64-bit unsigned integer. It is used for event counts (e.g. “rocksdb.block.cache.hit”), cumulative bytes (e.g. “rocksdb.bytes.written”), or cumulative time (e.g. “rocksdb.l0.slowdown.micros”).
  • Histogram: the distribution of a measured value, including maximum, minimum, mean, median, and standard deviation.

Interface of statistical function:

  • MeasureTime: the name is misleading. It actually records a value into a histogram.
  • RecordTick: increments a ticker.

Interface to get results:

  • Statistics::getTickerCount: returns the count for the specified ticker type.
  • Statistics::histogramData: fills a HistogramData structure for the specified histogram type; its members include maximum, minimum, mean, median, and standard deviation.
  • Statistics::getHistogramString: returns a readable string for the specified histogram type.
  • Statistics::ToString(): returns a readable string covering all tickers and histograms.
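To make the ticker/histogram split concrete, here is a minimal standalone sketch. It does not use the RocksDB headers: ToyStatistics and ToyHistogramData are hypothetical names, tickers and histograms are keyed by string rather than by enum, and every sample is kept in memory, whereas a production histogram would use buckets. Only the method names mirror the interface listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical summary structure; field names are illustrative only.
struct ToyHistogramData {
  double min = 0, max = 0, average = 0, median = 0, standard_deviation = 0;
};

// Hypothetical miniature of a statistics object: tickers are 64-bit counters,
// histograms keep raw samples so that summary values can be derived later.
class ToyStatistics {
 public:
  // Adds `count` to the named ticker.
  void recordTick(const std::string& ticker, uint64_t count = 1) {
    tickers_[ticker] += count;
  }
  // Records one sample into the named histogram (any value, not only time).
  void measureTime(const std::string& histogram, uint64_t value) {
    histograms_[histogram].push_back(value);
  }
  // Returns the ticker value, or 0 if the ticker was never recorded.
  uint64_t getTickerCount(const std::string& ticker) const {
    auto it = tickers_.find(ticker);
    return it == tickers_.end() ? 0 : it->second;
  }
  // Summarizes the samples of one histogram; all zeros if it is empty.
  ToyHistogramData histogramData(const std::string& histogram) const {
    ToyHistogramData data;
    auto it = histograms_.find(histogram);
    if (it == histograms_.end() || it->second.empty()) return data;
    std::vector<uint64_t> v = it->second;
    std::sort(v.begin(), v.end());
    const double n = static_cast<double>(v.size());
    double sum = 0, sum_sq = 0;
    for (uint64_t x : v) {
      sum += static_cast<double>(x);
      sum_sq += static_cast<double>(x) * static_cast<double>(x);
    }
    data.min = static_cast<double>(v.front());
    data.max = static_cast<double>(v.back());
    data.average = sum / n;
    // Median: middle sample, or the mean of the two middle samples.
    const std::size_t mid = v.size() / 2;
    data.median = (v.size() % 2 == 1)
                      ? static_cast<double>(v[mid])
                      : (static_cast<double>(v[mid - 1]) + v[mid]) / 2.0;
    // Population standard deviation: sqrt(E[x^2] - E[x]^2).
    data.standard_deviation =
        std::sqrt(std::max(0.0, sum_sq / n - data.average * data.average));
    return data;
  }

 private:
  std::map<std::string, uint64_t> tickers_;
  std::map<std::string, std::vector<uint64_t>> histograms_;
};
```

A caller would invoke recordTick at each event and measureTime with each measured value, then read the results back with getTickerCount and histogramData.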

1.2 Implementation of STATISTICS for RocksDB

RocksDB implements the StatisticsImpl class, which inherits the Statistics interface.

Main interface:

  • getTickerCount
  • histogramData
  • getHistogramString
  • getAndResetTickerCount
  • recordTick
  • measureTime
  • ToString

Member variables:

  • TickerInfo tickers_[INTERNAL_TICKER_ENUM_MAX];
  • HistogramInfo histograms_[INTERNAL_HISTOGRAM_ENUM_MAX];

The TickerInfo and HistogramInfo data structures are similar: each holds a thread-local counter or timing value, plus a non-thread-local value that accumulates the thread-local ones.

The TickerInfo type contains two members:

  • thread_value, of type ThreadLocalPtr (actual type ThreadTickerInfo), which contains:
      • value, of integer type
      • a pointer to merged_sum
  • merged_sum, of integer type

The HistogramInfo type contains three members:

  • thread_value, of type ThreadLocalPtr (actual type ThreadHistogramInfo), which contains:
      • value, of type HistogramImpl
      • a pointer to merged_hist
      • a pointer to merge_lock
  • merged_hist, of type HistogramImpl
  • merge_lock, of type Mutex

The statistics implementation is quite neat, and it is the key to keeping the overhead of STATISTICS at only 5%-10%. To avoid frequent CPU cache invalidation caused by sharing data between threads, each thread updates only its own thread-local value. Both merged_sum and merged_hist are empty when initialized, and only when a thread exits is the mergeThreadValue function called to add that thread’s local values from TickerInfo and HistogramInfo into merged_sum and merged_hist.
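The merge-on-exit idea can be sketched in standalone C++ as follows. This is an illustration under stated assumptions, not the RocksDB code: it uses the language’s thread_local keyword instead of ThreadLocalPtr, supports a single global ticker, and the names ToyTickerInfo, ToyThreadTickerInfo, RecordTick, and RunWorkers are hypothetical. The destructor of the per-thread object plays the role that the article assigns to mergeThreadValue.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical shared part of a ticker: the sum merged from exited threads.
struct ToyTickerInfo {
  std::atomic<uint64_t> merged_sum{0};
};

// A single global ticker keeps the sketch short (assumption of this example).
ToyTickerInfo g_ticker;

// Hypothetical per-thread part: a local value plus a pointer to merged_sum.
struct ToyThreadTickerInfo {
  uint64_t value = 0;
  std::atomic<uint64_t>* merged_sum;
  explicit ToyThreadTickerInfo(std::atomic<uint64_t>* sum) : merged_sum(sum) {}
  // Runs when the owning thread exits and folds the local value into the sum.
  ~ToyThreadTickerInfo() { merged_sum->fetch_add(value); }
};

// Hot path: touches only thread-local memory, so threads do not contend on a
// shared cache line while counting.
void RecordTick(uint64_t count) {
  thread_local ToyThreadTickerInfo info(&g_ticker.merged_sum);
  info.value += count;
}

// Starts worker threads that each record ticks, waits for them to exit, and
// returns the merged total.
uint64_t RunWorkers(int num_threads, int ticks_per_thread) {
  assert(num_threads >= 0 && ticks_per_thread >= 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([ticks_per_thread] {
      for (int i = 0; i < ticks_per_thread; ++i) RecordTick(1);
    });
  }
  for (std::thread& w : workers) w.join();
  return g_ticker.merged_sum.load();
}
```

The sketch reads the total only after all workers have exited. Reporting a total while threads are still running would also need to visit the live thread-local values, which this example leaves out.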

1.3 Use of MyRocks

MyRocks uses the interface provided by RocksDB to collect statistics. The variable rocksdb_stats is declared and then initialized in the rocksdb_init_func function when the RocksDB engine starts.

rocksdb_stats = rocksdb::CreateDBStatistics();
rocksdb_db_options->statistics = rocksdb_stats;

In addition to using all the statistics of the RocksDB engine layer, MyRocks also defines its own histogram:

commit_latency_stats = new rocksdb::HistogramImpl();

rocksdb_commit_by_xid and rocksdb_commit time each commit and record the elapsed time:

rocksdb::StopWatchNano timer(rocksdb::Env::Default(), true);
...
commit_latency_stats->Add(timer.ElapsedNanos() / 1000);

The rocksdb_show_status function outputs Statistics as follows:

  1. If rocksdb_stats is defined, rocksdb_stats->ToString() is called to convert the statistics into a readable string.
  2. commit_latency_stats is a histogram, so the values at the 50th, 95th, 99th, and 100th percentiles are output.
  3. If properties such as is-write-stopped or actual-delayed-write-rate are defined, they are output as well.
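The commit-latency path described above (time each commit, add the microseconds to a histogram, report percentiles) can be sketched without RocksDB as follows. ToyLatencyStats and TimedCommit are hypothetical names, std::chrono stands in for StopWatchNano, and the percentile uses the simple nearest-rank method on raw samples, which is an assumption of this sketch rather than how HistogramImpl computes it.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a latency histogram that keeps raw samples.
class ToyLatencyStats {
 public:
  // Records one latency sample in microseconds.
  void Add(uint64_t micros) { samples_.push_back(micros); }

  // Nearest-rank percentile: the smallest sample such that at least p percent
  // of all samples are less than or equal to it. Requires 0 < p <= 100.
  uint64_t Percentile(double p) const {
    assert(!samples_.empty() && p > 0.0 && p <= 100.0);
    std::vector<uint64_t> sorted = samples_;
    std::sort(sorted.begin(), sorted.end());
    const std::size_t rank = static_cast<std::size_t>(
        std::ceil(p * static_cast<double>(sorted.size()) / 100.0));
    return sorted[rank - 1];
  }

 private:
  std::vector<uint64_t> samples_;
};

// Runs `commit` and records how long it took, in microseconds.
template <typename Fn>
void TimedCommit(ToyLatencyStats& stats, Fn&& commit) {
  const auto start = std::chrono::steady_clock::now();
  commit();
  const auto elapsed = std::chrono::steady_clock::now() - start;
  stats.Add(static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()));
}
```

Wrapping each commit call in TimedCommit and then printing Percentile(50), Percentile(95), Percentile(99), and Percentile(100) gives a report of the same shape as the one in step 2.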

2. Background Threads

The result associated with BG_THREADS can be obtained by calling SHOW ENGINE ROCKSDB STATUS, which produces output similar to:

Type: BG_THREADS
Name: 140173379593984
Status:
thread_type: Low Pri
cf_name: default
operation_type: Compaction
operation_stage: CompactionJob::ProcessKeyValueCompaction
elapsed_time_ms: 6172.244 ms
BaseInputLevel: 0
BytesRead: 992806363
BytesWritten: 992071408
IsDeletion: 0
IsManual: 0
IsTrivialMove: 0
JobID: 1936
OutputLevel: 5
TotalInputBytes: 1586832446
state_type:

The output is fairly detailed: this thread is running a compaction, is in the CompactionJob::ProcessKeyValueCompaction stage, has been running for 6172.244 ms, and has so far read 992806363 bytes and written 992071408 bytes. It does not, however, include other information that might interest you, such as the source and destination files of the compaction. As mentioned at the beginning of this article, understanding how the implementation works lets us extend it.

2.1 Interface and implementation of Thread Status

The BG_THREADS section of SHOW ENGINE ROCKSDB STATUS in MyRocks is built on the RocksDB thread status interface.

Its header files are located at:

include/rocksdb/env.h
include/rocksdb/thread_status.h
util/thread_operation.h
monitoring/thread_status_updater.h
monitoring/thread_status_util.h

Key classes:

  • ThreadStatusUpdater: stores a pointer to the status of each background thread, along with the status of all background threads.
  • ThreadStatusUtil: has only static variables and static methods. Updating the status held in ThreadStatusUpdater through the methods of this class is recommended.

Usage:

  • To add a thread’s status to ThreadStatusUpdater, call ThreadStatusUtil::RegisterThread.
  • To remove a thread’s status from ThreadStatusUpdater, call ThreadStatusUtil::UnregisterThread.
  • For the other functions that modify thread status, see monitoring/thread_status_util.h.

The status of the current background threads can be obtained by calling Env’s GetThreadList() function, which stores the status values in a vector. Each entry can then be formatted for display.
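The register, update, and list flow can be modeled with a small standalone registry. This sketch does not use the RocksDB headers: ToyThreadStatus, ToyThreadStatusUpdater, SetOperation, and FormatStatus are hypothetical names, the status record carries only three fields, and a single mutex protects the list, which is a simplification chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical, much reduced status record for one background thread.
struct ToyThreadStatus {
  uint64_t thread_id;
  std::string thread_type;     // e.g. "Low Pri"
  std::string operation_type;  // e.g. "Compaction"; empty when idle
};

// Hypothetical registry: threads register on start, update their status while
// working, and unregister on exit. Readers take a snapshot of all entries.
class ToyThreadStatusUpdater {
 public:
  void RegisterThread(uint64_t id, const std::string& type) {
    std::lock_guard<std::mutex> lock(mutex_);
    threads_.push_back(ToyThreadStatus{id, type, ""});
  }
  void UnregisterThread(uint64_t id) {
    std::lock_guard<std::mutex> lock(mutex_);
    threads_.erase(
        std::remove_if(threads_.begin(), threads_.end(),
                       [id](const ToyThreadStatus& s) {
                         return s.thread_id == id;
                       }),
        threads_.end());
  }
  // Sets the current operation of a registered thread; unknown ids are ignored.
  void SetOperation(uint64_t id, const std::string& operation) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (ToyThreadStatus& s : threads_) {
      if (s.thread_id == id) s.operation_type = operation;
    }
  }
  // Copies the current statuses into *out, in the spirit of GetThreadList.
  void GetThreadList(std::vector<ToyThreadStatus>* out) const {
    assert(out != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    *out = threads_;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<ToyThreadStatus> threads_;
};

// Formats one entry in a layout similar to the BG_THREADS output.
std::string FormatStatus(const ToyThreadStatus& s) {
  return "thread_type: " + s.thread_type +
         "\noperation_type: " + s.operation_type + "\n";
}
```

Returning a copy of the list means the caller can format and print the entries without holding the lock while background threads keep updating their own status.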

As the implementation shows, thread status exists mainly to expose the running state of flush and compaction. We could also store the state of user threads in thread status and display it through the SHOW ENGINE ROCKSDB STATUS statement.

In particular, the state values specific to compaction are:

enum CompactionPropertyType : int {
  COMPACTION_JOB_ID = 0,
  COMPACTION_INPUT_OUTPUT_LEVEL,
  COMPACTION_PROP_FLAGS,
  COMPACTION_TOTAL_INPUT_BYTES,
  COMPACTION_BYTES_READ,
  COMPACTION_BYTES_WRITTEN,
  NUM_COMPACTION_PROPERTIES
};

The state values specific to flush are:

enum FlushPropertyType : int {
  FLUSH_JOB_ID = 0,
  FLUSH_BYTES_MEMTABLES,
  FLUSH_BYTES_WRITTEN,
  NUM_FLUSH_PROPERTIES
};
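These enums act as indexes into a fixed-size array of 64-bit property values, which is then turned into the named fields seen in the BG_THREADS output (JobID, BaseInputLevel, OutputLevel, IsManual, and so on). The sketch below shows one way such a decoding step could look for compaction. The function name InterpretCompactionProperties is hypothetical, and the packing is an assumption of this example: input level in the high 32 bits and output level in the low 32 bits of COMPACTION_INPUT_OUTPUT_LEVEL, with bits 1, 2, and 3 of COMPACTION_PROP_FLAGS meaning manual, deletion, and trivial move. Check the RocksDB source before relying on that layout.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

enum CompactionPropertyType : int {
  COMPACTION_JOB_ID = 0,
  COMPACTION_INPUT_OUTPUT_LEVEL,
  COMPACTION_PROP_FLAGS,
  COMPACTION_TOTAL_INPUT_BYTES,
  COMPACTION_BYTES_READ,
  COMPACTION_BYTES_WRITTEN,
  NUM_COMPACTION_PROPERTIES
};

// Hypothetical decoder: maps the enum-indexed property array to named fields.
// The bit layout used here is an assumption made for this sketch.
std::map<std::string, uint64_t> InterpretCompactionProperties(
    const uint64_t (&props)[NUM_COMPACTION_PROPERTIES]) {
  std::map<std::string, uint64_t> out;
  out["JobID"] = props[COMPACTION_JOB_ID];
  // Two levels packed into one value: high 32 bits input, low 32 bits output.
  out["BaseInputLevel"] = props[COMPACTION_INPUT_OUTPUT_LEVEL] >> 32;
  out["OutputLevel"] = props[COMPACTION_INPUT_OUTPUT_LEVEL] & 0xFFFFFFFFu;
  // Boolean properties packed as single bits.
  const uint64_t flags = props[COMPACTION_PROP_FLAGS];
  out["IsManual"] = (flags >> 1) & 1;
  out["IsDeletion"] = (flags >> 2) & 1;
  out["IsTrivialMove"] = (flags >> 3) & 1;
  out["TotalInputBytes"] = props[COMPACTION_TOTAL_INPUT_BYTES];
  out["BytesRead"] = props[COMPACTION_BYTES_READ];
  out["BytesWritten"] = props[COMPACTION_BYTES_WRITTEN];
  return out;
}
```

Adding a new field to the output, such as the source and destination files mentioned earlier, would mean adding an enum entry, setting it from the compaction code, and extending the decoding step.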

2.2 Use of MyRocks/RocksDB

In the RocksDB thread pool implementation, each background thread calls ThreadStatusUtil::RegisterThread when it starts, which adds it to the set of monitored background threads:

ThreadPoolImpl::Impl::StartBGThreads --> BGThreadWrapper --> ThreadStatusUtil::RegisterThread

The rocksdb_show_status function outputs BG_THREADS as follows:

  1. Get the set of ThreadStatus for all background threads by calling GetThreadList(&thread_list).
  2. Output the state of each background thread in turn by iterating through the set of ThreadStatus.

3. Summary

This article described how STATISTICS and BG_THREADS in the SHOW ENGINE ROCKSDB STATUS statement are implemented.

This article is published by the Tencent Cloud+ Community with the author’s authorization. Please credit the source when reposting.