1. Requirement: import the following MySQL table data into HBase

Based on this, I wrote the following Sqoop import command:

sqoop import -D sqoop.hbase.add.row.key=true --connect jdbc:mysql://192.168.1.9/spiders --username root --password root --table test_goods --hbase-create-table --hbase-table t_goods --column-family cf --hbase-row-key id -m 1

Errors during the import:

1. "No primary key could be found for table": the MySQL table has no primary key, so Sqoop cannot pick a split column on its own; specify one with --split-by or run a sequential import with -m 1.

2. "Could not insert row with null value for row-key column": the column named in --hbase-row-key (id) must also be listed in --columns.

3. "Error parsing arguments for import ... Unrecognized argument": I had added spaces after the commas separating the columns (ID, GOODS_NAME, GOODS_PRICE); the column names after --columns must be separated by commas only, with no spaces.
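The third error comes from shell word splitting, not from Sqoop itself. A minimal sketch (plain shell, no Sqoop needed) of why a space after a comma breaks the argument list:

```shell
# With a space after the comma, the shell splits the column list into two
# words, so the tool would see an extra, unrecognized argument.
set -- --columns ID, GOODS_NAME,GOODS_PRICE
echo "with space:    $# arguments"     # prints: with space:    3 arguments

# Without spaces, the whole column list stays one word.
set -- --columns ID,GOODS_NAME,GOODS_PRICE
echo "without space: $# arguments"     # prints: without space: 2 arguments
```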

After fixing the three problems above, my import command became the following:

sqoop import -D sqoop.hbase.add.row.key=true --connect jdbc:mysql://192.168.1.9:3306/spiders --username root --password root --table TEST_GOODS --columns ID,GOODS_NAME,GOODS_PRICE --hbase-create-table --hbase-table t_goods --column-family cf --hbase-row-key ID --where "ID >= 5" -m 1

Note: one small detail here: remember the quotes around the --where condition, "ID >= 5".

Looking at HBase, the data has been imported successfully

Finally, I wrote the command into a file (XXX) and executed the import with sqoop --options-file XXX.

This produced an error. The incorrect writing was as follows:

import -D sqoop.hbase.add.row.key=true --connect jdbc:mysql://192.168.1.9:3306/spiders --username root --password root --table TEST_GOODS --columns ID,GOODS_NAME,GOODS_PRICE --hbase-create-table --hbase-table test_goods --column-family cf --hbase-row-key ID --where "ID >= 5" -m 1

Cause of the error: the parameter names and parameter values were not separated by line breaks. In an options file, every option name and every option value must be on its own line.

Correct writing:

import
-D
sqoop.hbase.add.row.key=true
--connect
jdbc:mysql://192.168.1.9:3306/spiders
--username
root
--password
root
--table
TEST_GOODS
--columns
ID,GOODS_NAME,GOODS_PRICE
--hbase-create-table
--hbase-table
tt_goods
--column-family
cf
--hbase-row-key
ID
--where
ID>=5
-m
1
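A quick way to check the one-token-per-line rule is to generate the options file from the shell and count its lines. A minimal sketch, with an assumed file name /tmp/import_goods.opts and only a few of the options:

```shell
# Every option name and every option value sits on its own line,
# matching the rule above.
cat > /tmp/import_goods.opts <<'EOF'
import
--connect
jdbc:mysql://192.168.1.9:3306/spiders
--username
root
--table
TEST_GOODS
EOF

wc -l < /tmp/import_goods.opts    # 7 lines: one per token
```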

Note: explanation of the parameters

-D sqoop.hbase.add.row.key=true : whether to also write the row-key column into the column family; the default is false, in which case you will not see the row-key field among the column-family columns. Note that this parameter must be placed immediately after import.

--hbase-create-table : create the HBase table if it does not already exist.

--hbase-table : the name of the target HBase table.

--column-family : the HBase column family the columns are imported into.

--hbase-row-key : the source column to use as the HBase row key.

--split-by : the column used to split the work among multiple mappers, e.g. --split-by CREATE_TIME; in my case splitting on the timestamp column failed. Using -m 1 (a single mapper, which needs no split column) works around this, and the bug was solved.

2. Knowledge expansion: scheduled incremental import

1. SQOOP incremental import

sqoop import -D sqoop.hbase.add.row.key=true --connect jdbc:mysql://192.168.1.9:3306/spiders --username root --password root --table TEST_GOODS --columns ID,GOODS_NAME,GOODS_PRICE --hbase-create-table --hbase-table t_goods --column-family cf --hbase-row-key ID --incremental lastmodified --check-column U_DATE --last-value '2017-06-27' --split-by U_DATE

--incremental : incremental import mode; two modes are supported: append (for an incrementing column) and lastmodified (for a timestamp column).

--check-column : the column checked to decide which rows are new.

--last-value : the minimum value; in this example it means importing rows with U_DATE from 2017-06-27 up to today.
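The selection that --incremental lastmodified performs can be sketched locally: keep the rows whose check column is at or after the last value. A toy simulation with awk over a made-up CSV (the file name and data are hypothetical):

```shell
# Toy rows in the shape ID,GOODS_NAME,GOODS_PRICE,U_DATE
cat > /tmp/test_goods.csv <<'EOF'
1,apple,3.5,2017-06-25
2,pear,4.0,2017-06-27
3,peach,5.0,2017-06-30
EOF

# Keep rows whose U_DATE (field 4) >= the --last-value; ISO dates
# compare correctly as plain strings.
awk -F',' -v last='2017-06-27' '$4 >= last' /tmp/test_goods.csv
# prints the pear and peach rows only
```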

2. SQOOP job:

sqoop job --create testjob01 -- import --connect jdbc:mysql://192.168.1.9:3306/spiders --username root --password root --table TEST_GOODS --columns ID,GOODS_NAME,GOODS_PRICE --hbase-create-table --hbase-table t_goods --column-family cf --hbase-row-key ID -m 1

Use a Linux cron job to run the above Sqoop job on a schedule: edit the crontab with crontab -e and add an entry, for example to run it every day:

0 0 * * * /opt/local/sqoop-1.4.6/bin/sqoop job --exec testjob01
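For reference, the five leading fields of a crontab entry are minute, hour, day-of-month, month, and day-of-week, so 0 0 * * * fires at 00:00 every day. A small sketch that picks the schedule fields apart:

```shell
entry='0 0 * * * /opt/local/sqoop-1.4.6/bin/sqoop job --exec testjob01'
# Fields 1-5 are the schedule; the rest is the command cron runs.
echo "$entry" | awk '{printf "minute=%s hour=%s dom=%s month=%s dow=%s\n", $1, $2, $3, $4, $5}'
# prints: minute=0 hour=0 dom=* month=* dow=*
```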

3. The data is inconsistent after being imported from MySQL into Hive

Running hadoop fs -cat /user/hadoop/student/part-m-00000, you can see that the fields are separated by ',', which is Sqoop's default. If a field value itself contains a ',', the fields are split incorrectly when the data is then loaded into Hive, because Hive also splits on ','. Solution: have Sqoop use '\001' as the field separator when importing the data, via the --fields-terminated-by parameter. Example:

sqoop import --connect "jdbc:oracle:thin:@//localhost:1521/student" --password "***" --username "***" --query "select * from student where name='zhangsan' and class_id='003' and \$CONDITIONS" --target-dir "/user/hadoop/student" --fields-terminated-by "\001" --verbose -m 1
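The effect of the delimiter choice can be demonstrated without Sqoop or Hive: split a record whose second field contains a comma, first on ',' and then on '\001'. The record below is made up for illustration:

```shell
# A record whose name field is "Zhang,San" - the embedded comma is the trap.
echo '1,Zhang,San,2017' | cut -d',' -f2          # wrong: prints only "Zhang"

# The same record delimited by \001 (Ctrl-A) survives intact.
sep=$'\001'
printf '1%sZhang,San%s2017\n' "$sep" "$sep" | cut -d"$sep" -f2   # prints "Zhang,San"
```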