Body

This post draws on the following blog post: an article on the internals of HBase

The read operation

  1. Get the location of the Region that hosts the hbase:meta table from ZooKeeper, then read the hbase:meta table, which stores the Region information for user tables
  2. Using the namespace, table name, and Rowkey of the query, locate the Region that holds the data
  3. Locate the RegionServer that hosts the Region and send the request to it
  4. On that RegionServer, find the corresponding Region
  5. Look for the data in MemStore first; if it is not there, read it from BlockCache

The memory of a RegionServer in HBase is divided into two parts: MemStore, which is used for writing data, and BlockCache, which is used for reading data.

  6. If the data is not found in BlockCache, read it from the StoreFile (HFile)

The data read from the StoreFile is not returned to the client directly. It is first written to BlockCache to speed up subsequent queries, and only then is the result returned to the client.
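The read steps above can be sketched as a small toy model. This is only an illustration under simplifying assumptions, not the HBase client or server API: the `Region` class, the `locate_region` helper, and the dictionaries standing in for MemStore, BlockCache, and the StoreFile are all hypothetical names. In particular, the real BlockCache caches HFile blocks rather than individual rows, and the real lookup goes through ZooKeeper and hbase:meta rather than an in-memory list.

```python
from bisect import bisect_right


class Region:
    """Toy model of one Region; not the real HBase API."""

    def __init__(self, start_key, store_file):
        self.start_key = start_key
        self.mem_store = {}            # recent writes, not yet flushed
        self.block_cache = {}          # read cache (per row here, for simplicity)
        self.store_file = store_file   # stands in for the HFiles on disk

    def get(self, row_key):
        # Step 5: look in MemStore first, then in BlockCache.
        if row_key in self.mem_store:
            return self.mem_store[row_key], "MemStore"
        if row_key in self.block_cache:
            return self.block_cache[row_key], "BlockCache"
        # Step 6: fall back to the StoreFile and cache the result
        # before returning it.
        if row_key in self.store_file:
            value = self.store_file[row_key]
            self.block_cache[row_key] = value
            return value, "StoreFile"
        return None, "miss"


def locate_region(regions, row_key):
    """Steps 1 to 4: pick the Region with the largest start key <= row_key.

    `regions` must be sorted by start key, like the rows of a meta table.
    """
    start_keys = [r.start_key for r in regions]
    return regions[bisect_right(start_keys, row_key) - 1]


regions = [
    Region("", {"apple": "v1"}),   # covers keys below "m"
    Region("m", {"pear": "v2"}),   # covers keys from "m" upwards
]
regions[1].mem_store["zebra"] = "v3"

first = locate_region(regions, "pear").get("pear")
second = locate_region(regions, "pear").get("pear")
print(first, second)
```

The first read of `pear` is served from the StoreFile and populates the cache; the second read of the same key is served from BlockCache, which is the speed-up described above.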

The write operation

  1. Get the location of the Region that hosts the hbase:meta table from ZooKeeper, then read the hbase:meta table, which stores the Region information for user tables
  2. Using the namespace, table name, and Rowkey, locate the Region to write to
  3. Locate the RegionServer that hosts the Region and send the request to it
  4. Write the data to both the HLog and MemStore
  5. When MemStore reaches its threshold, the data is flushed to disk and a StoreFile is generated
  6. Delete the historical data that has already been flushed from the HLog
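Steps 4 to 6 of the write path can be sketched in the same toy style. Again this is a simplified illustration and not HBase code: `ToyRegion`, its fields, and the threshold of three cells are assumptions made for the example. Real HBase measures the MemStore threshold in bytes and manages the HLog per RegionServer.

```python
class ToyRegion:
    """Toy model of the write path; names and threshold are illustrative."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical: number of cells
        self.hlog = []         # write-ahead log entries
        self.mem_store = {}    # in-memory buffer of recent writes
        self.store_files = []  # each flush produces one sorted file

    def put(self, row_key, value):
        # Step 4: record the edit in the HLog and in MemStore.
        self.hlog.append((row_key, value))
        self.mem_store[row_key] = value
        # Step 5: flush once MemStore reaches the threshold.
        if len(self.mem_store) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered rows out in Rowkey order as a new StoreFile.
        self.store_files.append(dict(sorted(self.mem_store.items())))
        self.mem_store.clear()
        # Step 6: the flushed edits are now on disk, so their log
        # entries are no longer needed for recovery.
        self.hlog.clear()


region = ToyRegion(flush_threshold=3)
for key in ["c", "a", "b", "d"]:
    region.put(key, "value-" + key)

print(len(region.store_files), region.mem_store, region.hlog)
```

After the four writes, the first three rows have been flushed into one StoreFile, while the fourth is still held in MemStore and in the HLog, so it could be replayed if the process stopped before the next flush.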