What is HBase
HBase is a distributed, column-oriented storage system built on top of HDFS. HBase approaches scalability from a different angle: it scales out linearly by adding nodes. HBase is not a relational database and does not support SQL, but it has strengths of its own in areas an RDBMS cannot handle well: it hosts large, sparse tables on clusters of commodity servers. HBase is an open-source implementation of Google Bigtable. Just as Bigtable uses GFS as its file storage system, HBase uses HDFS; just as Google processes the massive data in Bigtable with MapReduce, HBase uses Hadoop MapReduce; and where Bigtable relies on Chubby as its coordination service, HBase uses ZooKeeper.
Storage structure
- HBase stores data in tables made up of rows and columns, and the columns are grouped into column families.
- Table structure
| Row Key | column-family1 | column-family2 |
| --- | --- | --- |
| key1 | F1: first field of column family 1; F2: second field of column family 1 | F1: first field of column family 2; F2: second field of column family 2 |
| key2 | F1: first field of column family 1; F2: second field of column family 1 | F1: first field of column family 2; F2: second field of column family 2 |
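To make the logical model above concrete, here is a minimal sketch of how a row key, a column family, and a column qualifier address a cell through the HBase Java client. The table name user and the family names cf1 and cf2 are illustrative assumptions, not names taken from this article; the ZooKeeper quorum is assumed to come from an hbase-site.xml on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DataModelSketch {
    public static void main(String[] args) throws Exception {
        // Cluster connection settings are read from hbase-site.xml on the classpath (assumption).
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("user"))) { // hypothetical table
            // Row "key1": every cell is addressed by (row key, column family, column qualifier).
            Put put = new Put(Bytes.toBytes("key1"));
            put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("F1"), Bytes.toBytes("value of cf1:F1"));
            put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("F2"), Bytes.toBytes("value of cf1:F2"));
            put.addColumn(Bytes.toBytes("cf2"), Bytes.toBytes("F1"), Bytes.toBytes("value of cf2:F1"));
            table.put(put);
        }
    }
}
```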
HBase cluster architecture
- Architecture diagram
- Core components
- HMaster
- The HMaster is not a single point of failure: multiple HMasters can be started in HBase, and ZooKeeper's master election mechanism ensures that only one of them is active at a time to manage tables and regions. To start multiple HMasters: 1) create the backup-masters file in the hbase/conf directory; 2) put the host names of the backup masters in it; 3) run the bin/hbase-daemons.sh start master-backup command.
- Manages user operations on tables: creating, deleting, altering, and querying tables
- Manages HRegionServer load balancing and adjusts the distribution of regions (the tools group of commands in the command line is actually carried out by the Master)
- Assigns the newly created regions after a region split
- When an HRegionServer fails or stops, migrates its regions to other HRegionServers
- HRegionServer
- Maintains HRegions, processes I/O requests for those regions, and reads and writes data to HDFS
- Splits HRegions that grow too large during operation
- The client does not go through the HMaster to access HBase data: region addresses are looked up via ZooKeeper and the HRegionServers, and data reads and writes go directly to the HRegionServers. The HMaster only maintains metadata about tables and regions, so its load is low (a minimal client sketch follows this list)
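To illustrate that access path, here is a minimal, hedged sketch: the client is configured only with the ZooKeeper quorum (the addresses from the server list below), locates the region through ZooKeeper and hbase:meta, and sends the Get directly to the HRegionServer that serves it; the HMaster is never contacted on the data path. The table name test_table and the row key key1 are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadPathSketch {
    public static void main(String[] args) throws Exception {
        // Only the ZooKeeper quorum is configured; no HMaster address is needed by the client.
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "10.19.3.194,10.19.3.195,10.19.3.196");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("test_table"))) {
            // The Get is routed to the HRegionServer hosting the region that contains "key1".
            Result result = table.get(new Get(Bytes.toBytes("key1")));
            System.out.println("Row is empty: " + result.isEmpty());
        }
    }
}
```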
HBase Cluster Setup
HBase clusters depend on Hadoop and ZooKeeper, so prepare the Hadoop and ZooKeeper clusters before installing HBase.
- Server list

| IP | Host | Service |
| --- | --- | --- |
| 10.19.3.194 | hadoop01 | HMaster |
| 10.19.3.195 | hadoop02 | HRegionServer |
| 10.19.3.196 | hadoop03 | HRegionServer |

Note: the host names are configured on the corresponding servers.
- Preparation
- Download the HBase installation package
- Decompress the installation package: tar -zvxf hbase-2.1.1-bin.tar.gz
- Installation steps
- Access the installation directory

```
[app@hadoop01 hbase]$ pwd
/usr/local/hbase
[app@hadoop01 hbase]$ ls
bin  CHANGES.md  conf  hbase-webapps  LEGAL  lib  LICENSE.txt  logs  NOTICE.txt  README.txt  RELEASENOTES.md
[app@hadoop01 hbase]$
```
- Edit hbase-site.xml in the conf directory

```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://10.19.3.194:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>10.19.3.194,10.19.3.195,10.19.3.196</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```
- Add the following configuration at the end of hbase-env.sh in the conf directory

```bash
export JAVA_HOME=/opt/jdk1.8.0_131
# Use the external ZooKeeper cluster instead of the ZooKeeper instance managed by HBase
export HBASE_MANAGES_ZK=false
```
- Edit the regionservers file in the conf directory

```
hadoop02
hadoop03
```
- Copy the hbase directory to the other two servers

```bash
scp -r hbase [email protected]:/usr/local/hbase
scp -r hbase [email protected]:/usr/local/hbase
```
- Start HBase

```bash
./bin/start-hbase.sh
```
- Check whether the related processes are running properly
- hadoop01

```
[app@hadoop01 hbase]$ jps
24113 SecondaryNameNode
23878 NameNode
9398 Jps
28685 HMaster
24335 ResourceManager
[app@hadoop01 hbase]$
```
- hadoop02

```
[app@hadoop02 hbase]$ jps
20049 Jps
16779 DataNode
9308 HRegionServer
16910 NodeManager
[app@hadoop02 hbase]$
```
- hadoop03

```
[app@hadoop03 hbase]$ jps
20049 Jps
16779 DataNode
9308 HRegionServer
16910 NodeManager
[app@hadoop03 hbase]$
```
- You can view cluster information by visiting the HMaster web UI on hadoop01 at http://10.19.3.194:16010
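Besides the web UI, the same cluster information can be read programmatically. The following is a hedged sketch using the hbase-client API (the dependency is introduced in the validation section below); it reports the active master and the live region servers.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterMetrics;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ClusterInfoSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "10.19.3.194,10.19.3.195,10.19.3.196");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // ClusterMetrics mirrors what the HMaster web UI shows: active master and live region servers.
            ClusterMetrics metrics = admin.getClusterMetrics();
            System.out.println("Active master: " + metrics.getMasterName());
            System.out.println("Live region servers: " + metrics.getLiveServerMetrics().keySet());
        }
    }
}
```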
- Validate the cluster (Java code)
- Introduce the POM dependencies

```xml
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>2.2.1</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>2.2.1</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-common</artifactId>
  <version>2.2.1</version>
</dependency>
```
- The code checks whether the table test_table exists

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Connect through the ZooKeeper quorum and ask the cluster whether test_table exists
Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.quorum", "10.19.3.194,10.19.3.195,10.19.3.196");
configuration.set("hbase.zookeeper.property.clientPort", "2181");
Connection connection = ConnectionFactory.createConnection(configuration);
Admin admin = connection.getAdmin();
TableName tableName = TableName.valueOf("test_table");
System.out.println(admin.tableExists(tableName));
admin.close();
connection.close();
```
- Console output

```
false
```
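The output is false because test_table has not been created yet. As an optional follow-up (a sketch, not a step from the original walkthrough; the column family name cf1 is an assumption), the table can be created through the same Admin so that a second check prints true:

```java
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

// Insert before admin.close() in the snippet above; admin and tableName are reused from there.
if (!admin.tableExists(tableName)) {
    admin.createTable(TableDescriptorBuilder.newBuilder(tableName)
            .setColumnFamily(ColumnFamilyDescriptorBuilder.of(Bytes.toBytes("cf1")))
            .build());
}
System.out.println(admin.tableExists(tableName)); // now prints true
```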
- Appendix
- HBase official website