1. Origin of HBase:

HBase grew out of Google's Bigtable paper. Inspired by that paper, HBase was developed and maintained as a subproject of Hadoop to provide storage for structured data.

  • Official website: hbase.apache.org

  • Google published the Bigtable paper in 2006

  • HBase development started in 2006

  • In 2008, the year Beijing successfully hosted the Olympic Games, programmers quietly made HBase a subproject of Hadoop

  • HBase became an Apache top-level project in 2010

1.1 HBase Roles

  1. HMaster

    1. Monitors the RegionServers
    2. Handles RegionServer failover
    3. Handles metadata changes
    4. Handles region assignment and removal
    5. Performs load balancing
    6. Publishes its own location to clients through ZooKeeper
  2. RegionServer

    1. Stores the actual HBase data
    2. Serves the regions assigned to it
    3. Flushes the cache to HDFS
    4. Maintains the HLog (used to restore data)
    5. Performs compaction
    6. Handles region splits
  3. Components

    1. Write-Ahead Log (WAL)

    The record of HBase modifications. When data is written to HBase, it is held in memory for a period of time (the time and data-volume thresholds are configurable) rather than written straight to disk. Data that lives only in memory is more likely to be lost, so to solve this problem each change is written to a file called the write-ahead log before it is written to memory. If the system fails, the data can be reconstructed from this log file.

    2. HFile

    The actual physical storage file that holds the raw data on disk.

    3. Store

    HFiles are kept in a Store. One Store corresponds to one column family of an HBase table.

    4. MemStore

    As the name implies, this is the in-memory store. It holds the current data operations: once data has been written to the WAL, the RegionServer stores the key-value pairs in memory.

    5. Region

    A shard of an HBase table. An HBase table is divided into regions according to RowKey values, and the regions are stored on RegionServers. One RegionServer can hold multiple regions.
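To make the relationship between these components more concrete, here is a minimal toy model in Python. It is only a sketch of the ideas described above (log first, then memory, then flush to a file, with rows routed to a region by RowKey), not HBase's real implementation or API. The names `ToyStore`, `ToyRegionServer`, `flush_threshold` and `split_keys` are made up for illustration, and the "HFiles" are just sorted lists held in memory.

```python
import bisect


class ToyStore:
    """One column family of one region: a MemStore plus flushed 'HFiles'."""

    def __init__(self, flush_threshold=3):
        self.memstore = {}   # recent writes, kept in memory
        self.hfiles = []     # flushed, sorted files (newest last)
        self.flush_threshold = flush_threshold  # hypothetical size limit

    def put(self, row, value):
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the MemStore out as one sorted, immutable "HFile".
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row):
        # Newest data wins: check the MemStore, then HFiles newest-first.
        if row in self.memstore:
            return self.memstore[row]
        for hfile in reversed(self.hfiles):
            for key, value in hfile:
                if key == row:
                    return value
        return None


class ToyRegionServer:
    """Holds several regions; region i covers [start_keys[i], start_keys[i+1])."""

    def __init__(self, split_keys, flush_threshold=3):
        self.start_keys = [""] + sorted(split_keys)
        self.stores = [ToyStore(flush_threshold) for _ in self.start_keys]
        self.wal = []  # append-only write-ahead log

    def region_for(self, row):
        # Find the last region whose start key is <= row.
        return bisect.bisect_right(self.start_keys, row) - 1

    def put(self, row, value):
        self.wal.append((row, value))                       # 1. log first
        self.stores[self.region_for(row)].put(row, value)   # 2. then memory

    def get(self, row):
        return self.stores[self.region_for(row)].get(row)

    def crash_and_recover(self):
        # Simulate a failure: in-memory data is lost, flushed files survive.
        for store in self.stores:
            store.memstore = {}
        # Replay the log in order. Replaying edits that were already flushed
        # is harmless in this toy because puts are idempotent.
        for row, value in self.wal:
            self.stores[self.region_for(row)].put(row, value)


server = ToyRegionServer(split_keys=["m"], flush_threshold=2)
server.put("apple", "1")
server.put("zebra", "2")
server.put("banana", "3")
server.crash_and_recover()
print(server.get("zebra"))
```

In this sketch the row "zebra" was still only in the MemStore when the simulated crash happened, so it can be read back afterwards only because the write was logged first. A real system would also trim the log once data has been flushed, which this toy leaves out.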

HBase schematic diagram:

Sorry guys, the diagram is not reproduced here. If you want to see it, please go to the official Apache site and look at the Chinese documentation there, which is very complete.