What is HBase

HBase is a distributed, column-oriented storage system built on HDFS. It approaches scalability from a different angle than relational databases: it is designed from the ground up to scale linearly by adding nodes. HBase is not a relational database and does not support SQL, but it handles workloads that an RDBMS cannot, hosting very large, sparse tables on clusters of commodity servers. HBase is an open source implementation of Google Bigtable. Just as Bigtable uses GFS as its file storage system, HBase uses HDFS. Google uses MapReduce to process massive data in Bigtable, and HBase uses Hadoop MapReduce to do the same for data in HBase. Bigtable uses Chubby as its coordination service, and HBase uses ZooKeeper in the same role.

Storage structure

  • HBase stores data in tables made up of rows and columns; the columns are grouped into column families.

  • Table structure

    Row Key  column-family1                                                             column-family2
    key1     F1: first field of column family 1; F2: second field of column family 1    F1: first field of column family 2; F2: second field of column family 2
    key2     F1: first field of column family 1; F2: second field of column family 1    F1: first field of column family 2; F2: second field of column family 2
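
The layout above can be pictured as nested sorted maps: row key, then column family, then field, then value. The following is a minimal Java sketch of that idea only. The class name SparseTableSketch and its methods are hypothetical, and it deliberately leaves out things real HBase has, such as byte-array keys and timestamped cell versions.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of the logical layout: row key -> column family -> field -> value. */
public class SparseTableSketch {
    // Rows are kept sorted by row key; a cell that was never written takes no space.
    private final TreeMap<String, TreeMap<String, TreeMap<String, String>>> rows = new TreeMap<>();

    /** Stores a value under rowKey / family:field, creating the row and family on demand. */
    public void put(String rowKey, String family, String field, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(field, value);
    }

    /** Returns the stored value, or null when the cell was never written. */
    public String get(String rowKey, String family, String field) {
        Map<String, TreeMap<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> fields = families.get(family);
        return fields == null ? null : fields.get(field);
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("key1", "column-family1", "F1", "a");
        table.put("key1", "column-family2", "F2", "b");
        table.put("key2", "column-family1", "F2", "c");
        System.out.println(table.get("key1", "column-family1", "F1")); // prints "a"
        System.out.println(table.get("key2", "column-family2", "F1")); // prints "null"
    }
}
```

Because missing cells are simply absent from the maps, a row with values in only one column family costs nothing for the others, which is why sparse tables are cheap in this kind of model.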

HBase cluster architecture

  • Architecture diagram

  • Core components
    • HMaster

      1. HMaster is not a single point of failure. Multiple HMasters can be started in HBase, and ZooKeeper's master election mechanism ensures that only one of them is active at a time to manage tables and Regions. To start additional HMasters, use hbase-daemons.sh: 1) edit the backup-masters file in the hbase/conf directory; 2) set its content to the host name of the backup master; 3) run bin/hbase-daemons.sh start master-backup
      2. Manages user operations that create, delete, modify, and query tables
      3. Manages HRegionServer load balancing and adjusts Region distribution (the HBase shell has a "tools" command group; those commands are actually carried out by the Master)
      4. After a Region split, assigns the new Regions
      5. After an HRegionServer stops, migrates the Regions that were on the failed HRegionServer
    • HRegionServer

      1. Maintains HRegions, handles I/O requests to those HRegions, and reads and writes data on HDFS
      2. Splits HRegions that grow too large at runtime
      3. Clients do not need the HMaster to access HBase data: address lookups go through ZooKeeper and HRegionServer, and data reads and writes go to HRegionServer. The HMaster only maintains table and Region metadata, so its load is low
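
The HMaster duties listed above (a single active master, Region assignment, and Region migration after an HRegionServer stops) can be illustrated with a toy model. This is only a sketch and not HBase code: the class MasterSketch and its methods are hypothetical, registration order stands in for ZooKeeper's election, "least-loaded server" stands in for the real balancer, and running a backup master on hadoop02 is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of the HMaster duties described above; not HBase code. */
public class MasterSketch {
    // Masters in registration order; the first one still alive plays the active role.
    private final Set<String> masters = new LinkedHashSet<>();
    // Region server name -> regions it currently hosts.
    private final Map<String, List<String>> assignments = new TreeMap<>();

    public void registerMaster(String host) {
        masters.add(host);
    }

    /** Removing the active master implicitly promotes the next registered backup. */
    public void masterDown(String host) {
        masters.remove(host);
    }

    /** Returns the single active master, or null if none is alive. */
    public String activeMaster() {
        return masters.isEmpty() ? null : masters.iterator().next();
    }

    public void addRegionServer(String host) {
        assignments.putIfAbsent(host, new ArrayList<>());
    }

    /** Places a region (new, or produced by a split) on the least-loaded server. */
    public String assignRegion(String region) {
        String target = null;
        for (String server : assignments.keySet()) {
            if (target == null || assignments.get(server).size() < assignments.get(target).size()) {
                target = server;
            }
        }
        if (target == null) {
            throw new IllegalStateException("no region server available");
        }
        assignments.get(target).add(region);
        return target;
    }

    /** Moves every region of a stopped server onto the surviving servers. */
    public void regionServerDown(String host) {
        List<String> orphaned = assignments.remove(host);
        if (orphaned != null) {
            orphaned.forEach(this::assignRegion);
        }
    }

    public List<String> regionsOn(String host) {
        return assignments.getOrDefault(host, Collections.emptyList());
    }

    public static void main(String[] args) {
        MasterSketch cluster = new MasterSketch();
        cluster.registerMaster("hadoop01");
        cluster.registerMaster("hadoop02"); // assumed backup master
        cluster.addRegionServer("hadoop02");
        cluster.addRegionServer("hadoop03");
        cluster.assignRegion("region-a");
        cluster.assignRegion("region-b");
        cluster.regionServerDown("hadoop03");
        cluster.masterDown("hadoop01");
        System.out.println(cluster.activeMaster());        // prints "hadoop02"
        System.out.println(cluster.regionsOn("hadoop02")); // prints "[region-a, region-b]"
    }
}
```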

HBase cluster setup

An HBase cluster depends on Hadoop and ZooKeeper. Before installing HBase, set up the Hadoop and ZooKeeper clusters.

  • Server list

    IP host service
    10.19.3.194 hadoop01 HMaster
    10.19.3.195 hadoop02 HRegionServer
    10.19.3.196 hadoop03 HRegionServer

    Note: each host name must be configured on the corresponding server

  • Preparation

    • Download the HBase installation package
    • Decompress the installation package: tar -zvxf hbase-2.1.1-bin.tar.gz
  • Installation steps

    • Go to the installation directory

          [app@hadoop01 hbase]$ pwd
          /usr/local/hbase
          [app@hadoop01 hbase]$ ls
          bin  CHANGES.md  conf  hbase-webapps  LEGAL  lib  LICENSE.txt  logs  NOTICE.txt  README.txt  RELEASENOTES.md
          [app@hadoop01 hbase]$ 
    • Edit hbase-site.xml in the conf directory

          <configuration>
              <property>
                  <name>hbase.rootdir</name>
                  <value>hdfs://10.19.3.194:9000/hbase</value>
              </property>
              <property>
                  <name>hbase.zookeeper.quorum</name>
                  <value>10.19.3.194,10.19.3.195,10.19.3.196</value>
              </property>
              <property>
                  <name>hbase.zookeeper.property.clientPort</name>
                  <value>2181</value>
              </property>
              <property>
                  <name>hbase.cluster.distributed</name>
                  <value>true</value>
              </property>
          </configuration>
    • Add the configuration at the end of hbase-env.sh in the conf directory

          export JAVA_HOME=/opt/jdk1.8.0_131
          export HBASE_MANAGES_ZK=false
    • Edit the regionservers file in the conf directory

          hadoop02
          hadoop03
    • Copy the hbase configuration to the other two servers

          scp -r hbase [email protected]:/usr/local/hbase
          scp -r hbase [email protected]:/usr/local/hbase
    • Start HBase

          ./bin/start-hbase.sh
    • Check whether related processes are running properly

      1. hadoop01
            [app@hadoop01 hbase]$ jps
            24113 SecondaryNameNode
            23878 NameNode
            9398 Jps
            28685 HMaster
            24335 ResourceManager
            [app@hadoop01 hbase]$ 
      2. hadoop02
            [app@hadoop02 hbase]$ jps
            20049 Jps
            16779 DataNode
            9308 HRegionServer
            16910 NodeManager
            [app@hadoop02 hbase]$ 
      3. hadoop03
            [app@hadoop03 hbase]$ jps
            20049 Jps
            16779 DataNode
            9308 HRegionServer
            16910 NodeManager
            [app@hadoop03 hbase]$ 
    • You can view cluster information by visiting http://10.19.3.194:16010
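
If the cluster does not come up, a mistyped hbase-site.xml is a common suspect. The following is a minimal sketch, using only the JDK's XML parser, that reads the file and reports which of the four properties set in this article are missing. The class name HBaseSiteCheck, its methods, and the default path (derived from the /usr/local/hbase directory shown earlier) are assumptions for illustration, not part of HBase.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that sanity-checks an hbase-site.xml file. */
public class HBaseSiteCheck {
    // The four properties configured in this article.
    static final String[] EXPECTED = {
        "hbase.rootdir",
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort",
        "hbase.cluster.distributed"
    };

    /** Reads every <property> name/value pair from the configuration XML. */
    public static Map<String, String> parseXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                String name = text(property, "name");
                if (name != null) {
                    result.put(name, text(property, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed configuration file", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag, or null. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Returns the expected property names that are absent or empty. */
    public static List<String> missing(Map<String, String> props) {
        List<String> absent = new ArrayList<>();
        for (String name : EXPECTED) {
            String value = props.get(name);
            if (value == null || value.isEmpty()) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) throws Exception {
        // Default path is an assumption based on the installation directory used above.
        String path = args.length > 0 ? args[0] : "/usr/local/hbase/conf/hbase-site.xml";
        String xml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        Map<String, String> props = parseXml(xml);
        props.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println("missing: " + missing(props));
    }
}
```

A malformed file, such as one with a broken closing tag, fails in parseXml with an IllegalArgumentException instead of being silently accepted.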

  • Validate the cluster (Java code)

    • Add the POM dependencies

          <dependency>
              <groupId>org.apache.hbase</groupId>
              <artifactId>hbase-client</artifactId>
              <version>2.2.1</version>
          </dependency>
          <dependency>
              <groupId>org.apache.hbase</groupId>
              <artifactId>hbase-server</artifactId>
              <version>2.2.1</version>
          </dependency>
          <dependency>
              <groupId>org.apache.hbase</groupId>
              <artifactId>hbase-common</artifactId>
              <version>2.2.1</version>
          </dependency>
    • Code that checks whether the table test_table exists

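          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.TableName;
          import org.apache.hadoop.hbase.client.Admin;
          import org.apache.hadoop.hbase.client.Connection;
          import org.apache.hadoop.hbase.client.ConnectionFactory;

          // Run inside a method that declares "throws IOException"
          Configuration configuration = HBaseConfiguration.create();
          configuration.set("hbase.zookeeper.quorum", "10.19.3.194,10.19.3.195,10.19.3.196");
          configuration.set("hbase.zookeeper.property.clientPort", "2181");
          Connection connection = ConnectionFactory.createConnection(configuration);
          Admin admin = connection.getAdmin();
          TableName tableName = TableName.valueOf("test_table");
          System.out.println(admin.tableExists(tableName));
          admin.close();
          connection.close();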
    • Console output

          false

Appendix

  • HBase official website