HBase architecture

The relationship between Region, Store, and ColumnFamily

Logical hierarchy: an HRegion consists of one or more Stores.

```
Table                  (HBase table)
  Region               (regions of the table)
    Store              (one Store per ColumnFamily in each Region)
      MemStore         (one MemStore per Store)
      StoreFile        (StoreFiles for each Store)
        Block          (blocks within a StoreFile)
```

Physical hierarchy: each Store holds the data of exactly one column family.

1. Write operations

Client write -> data is written to the MemStore until the MemStore is full -> flush to a StoreFile. When the number of StoreFiles reaches a threshold -> a Compact merge is triggered -> multiple StoreFiles are merged into one, with version merging and data deletion performed at the same time -> compaction gradually produces larger and larger StoreFiles -> when the size of a single StoreFile exceeds a threshold, a Split is triggered, splitting the current Region into two. The original Region goes offline, and the two child Regions created by the Split are assigned to HRegionServers by the HMaster (load balancing), so the pressure on the original Region is spread across two Regions. Throughout this process HBase only ever appends data; updates and deletes are all carried out in the Compact phase, so a user write only needs to reach memory before returning immediately, which guarantees high I/O performance.
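
A minimal sketch of the client side of this path (table, family, and column names are made up for illustration, and the exact client API varies by HBase version):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table"); // hypothetical table name

        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
        // The put returns once it reaches the server's WAL and MemStore;
        // flushing, compaction, and splits all happen asynchronously on the server.
        table.put(put);
        table.close();
    }
}
```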

2. Read operations

Client -> ZooKeeper -> -ROOT- -> .META. -> user data table. ZooKeeper records the path information of -ROOT- (-ROOT- has only one region); -ROOT- records the region information of .META. (.META. may have multiple regions); and .META. records the region information of the user tables.
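
This catalog lookup is hidden behind the client API; a minimal read looks like this (table and column names are again illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class GetExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table"); // hypothetical table name

        // The client resolves row1's region via ZooKeeper / -ROOT- / .META.
        // (caching the result), then talks to the owning HRegionServer.
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"));
        System.out.println(Bytes.toString(value));
        table.close();
    }
}
```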

In HBase, all storage files are divided into small blocks that are loaded into memory during a GET or SCAN operation, similar to the storage pages in an RDBMS. The default block size is 64 KB and can be set via void setBlocksize(int s); (note: the default HFile block size of 64 KB has nothing to do with the 64 MB block size of HDFS).

HBase reads a whole data block into the memory cache at a time, so when adjacent data is read next, it can be served from memory instead of from disk again, effectively reducing the number of disk I/Os. This behavior is controlled by void setBlockCacheEnabled(boolean blockCacheEnable); and defaults to TRUE, meaning every block that is read is cached in memory. However, if a user reads a particular column family only sequentially, it is a good idea to set this property to FALSE and disable the caching: a one-pass sequential scan gains nothing from caching its own blocks, and filling the cache with them only pushes out data that other reads still need, because the cache pays off only when adjacent data is accessed repeatedly.
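
Both settings live on the column family descriptor. A hedged sketch (the family name is illustrative, and the descriptor API differs slightly across HBase versions):

```java
import org.apache.hadoop.hbase.HColumnDescriptor;

public class FamilyConfigExample {
    public static HColumnDescriptor sequentialScanFamily() {
        // A column family that is only ever scanned sequentially:
        // keep the default 64 KB block size, but skip the block cache.
        HColumnDescriptor colDesc = new HColumnDescriptor("cf"); // illustrative name
        colDesc.setBlocksize(64 * 1024);
        colDesc.setBlockCacheEnabled(false);
        return colDesc;
    }
}
```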

Optimization tips

1: Disable auto-flush.

When we have a lot of data to insert, if auto-flush is not disabled, Put instances are sent to the Region server one by one. If the user disables auto-flush, Put operations are buffered and not sent until the write buffer is full (the corresponding calls are shown in the Write cache section below).

2: Use scan caching.

If HBase is used as the input source of a MapReduce job, it is best to use setCaching() to set the scanner caching of the Scan instance used as input to something larger than the default of 1. With the default, the Map task makes a request to the Region server for every single record; with a value of, say, 500, the server sends 500 rows to the client per request. The right value depends on your situation. Note that this caching is row-level; it is explained on page 119.
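
A minimal sketch:

```java
import org.apache.hadoop.hbase.client.Scan;

public class ScanCachingExample {
    public static Scan makeScan() {
        Scan scan = new Scan();
        // Ship 500 rows per RPC to the client instead of the default of 1,
        // so a MapReduce task does not make one round trip per record.
        scan.setCaching(500);
        return scan;
    }
}
```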

3: Limit the scan range.

This is easy to understand: when we process a large number of rows (especially as an input source for MapReduce), we should not stop at scan.addFamily(). If we only need a couple of columns from that column family, we must name them precisely with addColumn(), because retrieving unneeded columns costs efficiency.
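
For example (family and qualifier names are illustrative):

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class NarrowScanExample {
    public static Scan makeScan() {
        Scan scan = new Scan();
        // Ask only for the exact columns we need, not the whole family.
        scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"));
        scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col2"));
        return scan;
    }
}
```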

4: Close the ResultScanner

Closing the scanner does not in itself improve efficiency, but leaving it open certainly hurts, because an unclosed scanner keeps holding resources on the Region server.
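
The safe pattern is to close the scanner in a finally block:

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScannerCloseExample {
    public static void scanAll(HTable table, Scan scan) throws IOException {
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result result : scanner) {
                // process each result here ...
            }
        } finally {
            scanner.close(); // always release the server-side scanner, even on error
        }
    }
}
```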

5: Block cache usage

First there is the scan-level switch, scan.setCacheBlocks(). Rows that are accessed frequently should use the block cache, but MapReduce jobs that scan large numbers of rows should not, and should set it to false. (This scan-level switch is not the same thing as the column-family-level block cache setting discussed in the read section above.)
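
A typical Scan for such a MapReduce job combines tips 2 and 5:

```java
import org.apache.hadoop.hbase.client.Scan;

public class MapReduceScanExample {
    public static Scan makeScan() {
        Scan scan = new Scan();
        scan.setCaching(500);        // row-level scanner caching (tip 2)
        scan.setCacheBlocks(false);  // don't pollute the block cache on a full-table scan
        return scan;
    }
}
```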

6: Optimize the way row keys are fetched

Of course, this only applies when we need nothing but the row keys of a table. There are instructions on how to do it on page 411.
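
I cannot reproduce page 411 here, but the usual way to fetch only keys is a filter combination such as this hedged sketch, using FirstKeyOnlyFilter together with KeyOnlyFilter so each row returns just one cell with its value stripped:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

public class RowKeyOnlyScanExample {
    public static Scan makeScan() {
        Scan scan = new Scan();
        FilterList filters = new FilterList();
        filters.addFilter(new FirstKeyOnlyFilter()); // one KeyValue per row
        filters.addFilter(new KeyOnlyFilter());      // strip the value bytes
        scan.setFilter(filters);
        return scan;
    }
}
```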

7: Disable WAL on Put

The book says so, but I personally think it is better not to use this feature: if we turn the WAL off, the server does not write the Put to the WAL but only to the MemStore, so if the server fails, our data is lost.
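
For completeness, this is the switch in question as it looked in the classic client API (newer versions express the same thing through a durability setting):

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class NoWalPutExample {
    public static Put makePut() {
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
        // WARNING: skipping the WAL means this edit only lives in the MemStore
        // until the next flush; a server crash before then loses it.
        put.setWriteToWAL(false);
        return put;
    }
}
```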

3. The role of HLog

In a distributed system, errors and outages are unavoidable. If an HRegionServer exits unexpectedly, the in-memory data in its MemStores is lost. HLog was introduced to prevent exactly this situation.

Working mechanism: every HRegionServer has one HLog object. HLog is a class that implements a Write Ahead Log: whenever a user writes to a MemStore, a copy of the data is also appended to the HLog file. The HLog rolls regularly, and old files whose data has already been persisted to StoreFiles are deleted. When an HRegionServer terminates unexpectedly, the HMaster learns of it through ZooKeeper. It first splits the dead server's HLog, sorting the log data of the different regions into the corresponding region directories, and then reassigns the invalid regions (each now carrying its freshly split log). When an HRegionServer loads one of these regions, it discovers that there is a historical HLog to process, replays the HLog data into the MemStore, and flushes to StoreFiles, completing the recovery.

4. A Region is made up of StoreFiles; StoreFiles are backed by HFiles; HFiles are made up of HBase data blocks; and a data block contains many KeyValue pairs, each of which stores a needed value.

5. Storage format and key design

From the figure above we can see the following.

A table has two column families (one red, one yellow), and each column family has two columns. As you can see from the figure, this is the biggest feature of a column-oriented database. We also find values like r1:rowkey, cf1:column family, c1:qualifier (column), t1:versionId (version number), and value (the last picture shows where values can be stored). From this we can see that if we used shorter names for the rowkey, column family, and qualifier, we could save a lot of storage space. Looking at the second picture, we should also take away this: filtering efficiency decreases markedly from left to right in a KeyValue, so when designing keys, users should place the important filtering information on the left; this improves query performance without changing the amount of data. To put it simply, users should try to store the dimensions they query on in the key, because the key is the most efficient place to filter data.
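
A hedged illustration of that advice: if queries always filter by user first and then by time, a composite key like the one below (names invented for the example) puts the higher-value filter on the left.

```java
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyDesignExample {
    // Key layout: <userId>-<timestamp>. The userId, the dimension we always
    // filter on, sits leftmost, where scans can narrow down most efficiently.
    public static byte[] makeRowKey(String userId, long timestamp) {
        return Bytes.toBytes(userId + "-" + timestamp);
    }
}
```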

With the above understanding, we should also be aware of the following:

HBase stores data in a given region sequentially, sorted by row key. Because rows with adjacent keys are stored together, they end up in the same region, and a region can only be served by one server, so a hot spot of sequential keys degrades cluster performance. There are ways to solve this. What I can think of is, for example, with 9 servers: take the current timestamp, mod it by 9 (or reverse the timestamp), and prepend the result to the rowkey as a prefix. The rows will then be distributed evenly across different region servers. Since adjacent data is spread over several servers, users can read it in parallel with multiple threads, which improves query throughput.
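
A minimal sketch of that salting idea (the bucket count of 9 and the key format are just this example's assumptions):

```java
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKeyExample {
    private static final int BUCKETS = 9; // e.g. one bucket per server

    // Prefix the key with a deterministic salt so sequential timestamps
    // spread across BUCKETS different key ranges (and hence regions).
    public static byte[] makeSaltedKey(long timestamp) {
        long salt = timestamp % BUCKETS;
        return Bytes.toBytes(salt + "-" + timestamp);
    }
}
```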

For version control, we should either keep the time of all our servers synchronized, or simply set a client-side timestamp in the Put when inserting the data. (If we do not add one explicitly, the server stamps the data with its own server time.)
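
Setting the timestamp explicitly looks like this (illustrative names again):

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class ClientTimestampExample {
    public static Put makePut(long clientTimestamp) {
        Put put = new Put(Bytes.toBytes("row1"));
        // Pass the version timestamp explicitly instead of letting
        // the Region server assign its own clock time.
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                clientTimestamp, Bytes.toBytes("value"));
        return put;
    }
}
```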

6. Write cache

For operations on small amounts of data, every Put is in effect one RPC operation that ships the data from the client to the server and back. For large amounts of data, if an application needs to store thousands of rows per second into an HBase table, a Put per row is not appropriate. The HBase API therefore gives the client a write buffer: the buffer collects Put operations and then sends them to the server in a single RPC call. By default the client buffer is disabled; it can be activated by setting auto-flush to FALSE.

```java
table.setAutoFlush(false);   // activate the write buffer by setting auto-flush to FALSE

// Flush the buffered Put operations to the servers explicitly:
void flushCommits() throws IOException;

// Set the size of the client write buffer (default: 2 MB):
void setWriteBufferSize(long writeBufferSize) throws IOException;
```

The default write buffer size of 2 MB is moderate: it allows clients to group a meaningful amount of data into a single RPC request. Setting the write buffer for every HTable by hand is a hassle; to avoid it, users can set a larger default in hbase-site.xml:

```xml
<property>
  <name>hbase.client.write.buffer</name>
  <value>20971520</value> <!-- 20 MB -->
</property>
```

7. HBase supports a number of compression algorithms, configured at the column family level. Unless there is a special reason not to, use compression wherever possible. After some testing, we recommend SNAPPY for our HBase compression.
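
Enabling it when defining a column family might look like this (a sketch: the family name is illustrative, and the Compression enum's package has moved between HBase versions):

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CompressionExample {
    public static HColumnDescriptor compressedFamily() {
        HColumnDescriptor colDesc = new HColumnDescriptor("cf"); // illustrative name
        // SNAPPY trades a little CPU for much smaller StoreFiles;
        // the native Snappy libraries must be available on the servers.
        colDesc.setCompressionType(Compression.Algorithm.SNAPPY);
        return colDesc;
    }
}
```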