Abstract:


A customer's big data test scenario is as follows: use Solr to search profile-like data for user labels, then use these labels to query detailed records in HBase, and test both the functionality and the performance of this flow.

The HBase data volume is 500 GB and the Solr data volume is 5 TB. All of this data had to be migrated manually from the customer's cluster to our own. Since Solr is not integrated into our cluster, HBase data migration was the priority. The following is a summary of the problems encountered during the HBase data migration and how they were solved.

I. Problems encountered during migration and solutions

Customer HBase version: 0.94.15

Tencent Big Data Suite HBase version: 1.2.1

Customer private cloud OS version (test): TLinux 1.2

The problems encountered and their solutions are as follows:

1. HBase running abnormally, symptom 1 (date and hwclock)

HBase occasionally ran abnormally and components stopped. The logs showed a time difference between nodes even though the date output was the same, so the discrepancy was suspected to come from hardware clock drift. It was later confirmed that the initialization script only synchronized the hardware clock for machines in the Tencent Cloud environment; this has since been fixed.
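A minimal way to check and correct this kind of drift on a node, using standard Linux tools (this is a sketch, not the original initialization script):

# compare the OS clock and the hardware clock on the node
date
hwclock -r

# if the hardware clock has drifted, write the (correct) system time back to it
hwclock --systohc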

2. HBase running abnormally, symptom 2 (hostname and /etc/resolv.conf)

HBase again stopped running properly and components shut down. The following error can be seen in the log:

ERROR [regionserver//10.0.0.106:16020] regionserver.HRegionServer: Master passed us a different hostname to use; was=10.0.0.106, but now=host-10-0-0-106.openstacklocal

Running hostname shows that every machine's hostname is its intranet IP. We suspected the inconsistency was caused by a reverse DNS lookup happening during network interaction. Checking the DNS resolution configuration shows the following:

[root@10 ~]# hostname
10.0.0.106
; Generated by /sbin/dhclient-script
#search openstacklocal 0.0.106
#nameserver 10.0.0.2
#nameserver 10.0.0.3

Comment out the search entry in resolv.conf, stop the nscd service, and restart HBase. The error did not occur again and HBase ran normally.
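The fix can be scripted roughly as follows (a sketch; adapt it to how resolv.conf is managed on your nodes, since dhclient may rewrite the file):

# back up resolv.conf and comment out the search line
cp /etc/resolv.conf /etc/resolv.conf.bak
sed -i 's/^search/#search/' /etc/resolv.conf

# stop the name-service cache daemon so stale cached lookups are not reused
service nscd stop

# then restart the HBase components through the suite's management tools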

3. Snappy support: discovery and repair process

For the table migration we planned to use the official import/export tool. The first step is to create the table in the target cluster from its DESC information. After the table is created in the target cluster it shows up in list, but the following error appears:

org.apache.hadoop.hbase.DoNotRetryIOException: Compression algorithm 'snappy' previously failed test.

A Google search showed that HBase itself must support the Snappy compression algorithm for this to work. By default the cluster does not support Snappy according to hadoop checknative (even though the snappy RPM is installed).
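For reference, the check referred to here is Hadoop's built-in native-library check, whose output is shown below:

hadoop checknative -a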

Native library checking:
hadoop:  true /data/TBDS-base/usr/hdp/2.2.0.0-2041/hadoop/lib/native/libhadoop.so
zlib:    true /lib64/libz.so.1
snappy:  false
lz4:     true revision:99
bzip2:   false
openssl: false build does not support openssl.

After manually creating a table with the DESC information below, the table can be seen with list, but its contents cannot be viewed with scan, and the log shows the following error:

Desc:

COLUMN FAMILIES DESCRIPTION
{NAME => 'A', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', METADATA => {'ENCODE_ON_DISK' => 'true'}}
{NAME => 'D', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '2147483647', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', ENCODE_ON_DISK => 'true'}

Error message:

org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support

One suggested fix is to set the hbase.regionserver.codecs property to snappy in hbase-site.xml; trying this method on the test cluster, HBase failed to start.
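For reference, the property mentioned above would be set in hbase-site.xml like this (hbase.regionserver.codecs makes the RegionServer verify at startup that the listed codecs can be loaded, and abort if they cannot):

<property>
  <name>hbase.regionserver.codecs</name>
  <value>snappy</value>
</property>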

We then set out to make the Hadoop cluster on TLinux 1.2 support Snappy: compile the Hadoop native libraries on that specific OS to replace the current Hadoop native libraries, and add the Hadoop home directory to the HBase startup environment script.

A Hadoop native library with Snappy support built for TLinux 1.2 is already available on the production network, so it can directly replace the native directory under hadoop/lib; make sure this Hadoop library can reference libjvm (an .so file shipped with the JRE) and that the snappy RPM package is installed. Then add HADOOP_HOME={Hadoop installation directory} to hbase-env.sh. After this, hadoop checknative shows Snappy as supported, and HBase can be fully restarted component by component:

Native library checking:
hadoop:  true /data/TBDS-base/usr/hdp/2.2.0.0-2041/hadoop/lib/native/libhadoop.so
zlib:    true /lib64/libz.so.1
snappy:  true /usr/lib64/libsnappy.so.1
lz4:     true revision:99
bzip2:   false
openssl: false build does not support openssl.
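A command sketch of the procedure above; the Hadoop path is the one shown in the checknative output, while the location of the TLinux 1.2 native build is a placeholder:

# replace the Hadoop native directory with the TLinux 1.2 build that includes Snappy
cp -r /path/to/tlinux1.2-native-build/* \
      /data/TBDS-base/usr/hdp/2.2.0.0-2041/hadoop/lib/native/

# confirm the snappy RPM is installed
rpm -qa | grep snappy

# point HBase at the Hadoop installation (in the HBase conf directory)
echo 'export HADOOP_HOME=/data/TBDS-base/usr/hdp/2.2.0.0-2041/hadoop' >> hbase-env.sh

# verify, then restart the HBase components one by one
hadoop checknative -a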

4. Migrating tables from the HBase 0.94 cluster to the HBase 1.2.1 cluster

Brute-force migration reference: my.oschina.net/CainGao/blo…

1) Locate the source table's directory in the source cluster's HDFS and move it under the HBase table root directory in the target cluster's HDFS

2) In the brute-force migration, the table descriptor (tableinfo) is a single file, i.e. .tableinfo.0000000001. In 0.94 this file sits directly in the table's directory in HDFS, while in 1.2.1 it sits in the table's .tabledesc subdirectory; you need to manually create this directory and move the file into place

3) Change the owner of the copied table directory and its files

4) Restart all HBase components

5) After logging in to the HBase shell, the migrated table can be seen with list, but the scan operation fails

6) Use hbase hbck -fixMeta to repair the meta information and hbase hbck -fixAssignments to fix region assignments, checking the logs for exceptions during both steps. The first time we used this method in practice, a large number of errors appeared, all related to Snappy (a command sketch of steps 1) to 6) follows below)
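A rough command sketch of steps 1) to 6), assuming a table named 'usertable'; the HDFS paths and namenode addresses are illustrative (0.94 commonly stores tables under /hbase, 1.2.1 under <hbase.rootdir>/data/default), so check the actual hbase.rootdir of both clusters first, and note that distcp between very different Hadoop versions may require webhdfs/hftp:

# copy the table directory from the source cluster into the target cluster's HBase data directory
hadoop distcp hdfs://src-nn:8020/hbase/usertable \
              hdfs://dst-nn:8020/apps/hbase/data/data/default/usertable

# 1.2.1 expects the table descriptor under .tabledesc/, so create the directory
# and move the tableinfo file into it
hdfs dfs -mkdir /apps/hbase/data/data/default/usertable/.tabledesc
hdfs dfs -mv /apps/hbase/data/data/default/usertable/.tableinfo.0000000001 \
             /apps/hbase/data/data/default/usertable/.tabledesc/

# hand ownership of the copied directory to the HBase user
hdfs dfs -chown -R hbase:hbase /apps/hbase/data/data/default/usertable

# after restarting all HBase components, repair meta and region assignments
hbase hbck -fixMeta
hbase hbck -fixAssignments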

7) When migrating with import/export, you need to manually create the target table in the target cluster first. The source cluster's table structure is as follows:

Import/export reference address

COLUMN FAMILIES DESCRIPTION
{NAME => 'A', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', METADATA => {'ENCODE_ON_DISK' => 'true'}}
{NAME => 'D', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '2147483647', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', ENCODE_ON_DISK => 'true'}

An error occurred when creating a new table with this DESC information:

Unknown argument ignored for column family A: ENCODE_ON_DISK

Looking at the code, I found that the ENCODE_ON_DISK field has been deprecated in the new version, but the customer's old cluster version needs it, so the data cannot be written normally via import. After the brute-force migration succeeded (the brute-force migration is compatible with this field), I checked the DESC information of the migrated table, which is as follows:

COLUMN FAMILIES DESCRIPTION
{NAME => 'A', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', METADATA => {'ENCODE_ON_DISK' => 'true'}}
{NAME => 'D', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '2147483647', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', METADATA => {'ENCODE_ON_DISK' => 'true'}}

Old cluster table structure

COLUMN FAMILIES DESCRIPTION
{NAME => 'A', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', METADATA => {'ENCODE_ON_DISK' => 'true'}}
{NAME => 'D', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '2147483647', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true', ENCODE_ON_DISK => 'true'}

It can be seen that the old and new versions define the ENCODE_ON_DISK field differently. We therefore tested creating a table in the new cluster with the DESC information above and then importing the data with the import method. Since still no data was written, we can conclude that the ENCODE_ON_DISK parameter has been completely deprecated in HBase 1.2.1, and the new version wraps this information inside the METADATA field. When the old cluster's tables carry this parameter, the official import/export method cannot currently be used for a direct HBase 0.94 to HBase 1.2.1 migration.

II. Follow-up

Test creating tables on the HBase 0.94 cluster with ENCODE_ON_DISK set to false (default is true) and creating tables on HBase 1.2.1 without ENCODE_ON_DISK, then migrate with export/import; also study other methods for migrating HBase data across clusters (different versions, networks that are not interconnected).
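A minimal sketch of this follow-up test; the table name 'usertable', the column family name, and the HDFS export path are made up for illustration, and the commands assume the standard HBase Export/Import MapReduce tools:

# on the old (0.94) cluster, inside the hbase shell: create the table with ENCODE_ON_DISK disabled
create 'usertable', {NAME => 'A', COMPRESSION => 'SNAPPY', ENCODE_ON_DISK => 'false'}

# export the table to HDFS on the old cluster, copy the export directory to the
# new cluster (e.g. with distcp), then import it into the pre-created table there
hbase org.apache.hadoop.hbase.mapreduce.Export usertable /tmp/usertable_export
hbase org.apache.hadoop.hbase.mapreduce.Import usertable /tmp/usertable_export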