0x01 Kylin installation environment

Kylin relies on the Hadoop big data platform. Before installing and deploying Kylin, confirm that Hadoop, HBase, and Hive are already installed on the platform.

1.1 Understand the two binary packages of Kylin

Pre-packaged binary package: apache-kylin-1.6.0-bin.tar.gz
Special binary package: apache-kylin-1.6.0-hbase1.x-bin.tar.gz

The special binary package is a Kylin snapshot binary compiled against HBase 1.1+. It requires HBase 1.1.3 or higher; earlier HBase versions have a known defect in the fuzzy key filter (HBASE-14269) that causes missing records in Kylin query results. Also note that this is not an official release (it is rebased onto the latest changes of the Kylin 1.3.x branch every few weeks) and has not been fully tested.

0x02 Install and deploy

2.1 Download

Choose the version you want to download; here we use apache-kylin-1.6.0-bin.tar.gz.
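For example, fetching it from the Apache archive might look like this (the mirror path is an assumption; use whichever mirror or version page you prefer):

$ wget https://archive.apache.org/dist/kylin/apache-kylin-1.6.0/apache-kylin-1.6.0-bin.tar.gz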

2.2 Installation

$ tar -zxvf apache-kylin-1.6.0-bin.tar.gz
$ mv apache-kylin-1.6.0 /home/hadoop/cloud/
$ ln -s /home/hadoop/cloud/apache-kylin-1.6.0 /home/hadoop/cloud/kylin

2.3 Configure environment variables

Configure the Kylin environment variables and a variable called hive_dependency in /etc/profile:

export KYLIN_HOME=/home/hadoop/kylin
export PATH=$PATH:$KYLIN_HOME/bin
export hive_dependency=/home/hadoop/hive/conf:/home/hadoop/hive/lib/*:/home/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-core-2.0.0.jar

Make the configuration take effect:

# source /etc/profile
# su hadoop
$ source /etc/profile

This configuration must also be applied on master2, slave1, and slave2, because the Hive dependency information is needed when the Hadoop cluster distributes the tasks submitted by Kylin to the slave nodes. If it is missing, the MR tasks fail with an error like: HcatalogXXX cannot be found.
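Assuming passwordless ssh as root and the hostnames above (adjust to your cluster), a small loop is enough to push the same profile to the other nodes:

$ for node in master2 slave1 slave2; do scp /etc/profile ${node}:/etc/profile; done

Then run source /etc/profile on each node as before.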

2.4 Configure kylin.sh

$ vim ~/cloud/kylin/bin/kylin.sh

# explicitly declare KYLIN_HOME
export KYLIN_HOME=/home/hadoop/kylin
# add the $hive_dependency dependency to HBASE_CLASSPATH_PREFIX
export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX

2.5 Check whether the environment is set up successfully

$ check-env.sh
KYLIN_HOME is set to /home/hadoop/kylin

2.6 Configure kylin.properties

Go to the conf folder and modify the Kylin configuration file kylin.properties as shown below:

$ vim ~/cloud/kylin/conf/kylin.properties

kylin.rest.servers=master:7070
# define the job jar for Kylin MR jobs and the HBase coprocessor jar, used to improve performance
kylin.job.jar=/home/hadoop/kylin/lib/kylin-job-1.6.0-SNAPSHOT.jar
kylin.coprocessor.local.jar=/home/hadoop/kylin/lib/kylin-coprocessor-1.6.0-SNAPSHOT.jar

2.7 Configure kylin_hive_conf.xml and kylin_job_conf.xml

Set the HDFS replication factor to 2 in both kylin_hive_conf.xml and kylin_job_conf.xml:

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Block replication</description>
</property>

2.8 Start the service

Note: Before starting Kylin, make sure the following services are started

Hadoop’s HDFS/YARN/JobHistory service

start-all.sh
mr-jobhistory-daemon.sh start historyserver
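A quick sanity check is jps; the exact process list depends on your cluster layout, but after the commands above the master would typically show at least NameNode, ResourceManager, and JobHistoryServer:

$ jps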

Hive metastore service

hive --service metastore &
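The metastore listens on port 9083 by default, so you can confirm it is up with:

$ netstat -tnlp | grep 9083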

ZooKeeper

zkServer.sh start

This must be run on every node; the ZooKeeper service is started on each node separately.
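With passwordless ssh, a loop like the following (node list assumed; substitute the members of your ZooKeeper ensemble) saves logging in to each machine:

$ for node in master master2 slave1 slave2; do ssh ${node} "zkServer.sh start"; done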

HBase

start-hbase.sh

Check Hive and HBase dependencies

$ find-hive-dependency.sh
$ find-hbase-dependency.sh

Commands to start and stop Kylin

$ kylin.sh start
$ kylin.sh stop
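If the web UI does not come up after kylin.sh start, check the server log (path derived from the KYLIN_HOME set earlier):

$ tail -f $KYLIN_HOME/logs/kylin.log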

Web access address: http://192.168.1.10:7070/kylin/login

The default login username/password is ADMIN/KYLIN

0x03 Test

3.1 Test the sample that ships with Kylin

Kylin provides an automated script that creates a test Cube and the corresponding Hive tables. The steps to run the sample are:

S1: Run the ${KYLIN_HOME}/bin/sample.sh script

$ sample.sh

Key messages:

KYLIN_HOME is set to /home/hadoop/kylin
Going to create sample tables in hive...
Sample hive tables are created successfully; Going to create sample cube...
Sample cube is created successfully in project 'learn_kylin'; Restart Kylin server or reload the metadata from web UI to see the change.

S2: Check in MySQL (the Hive metastore database) which tables the sample created

select DB_ID,OWNER,SD_ID,TBL_NAME from TBLS;

S3: View the created tables and data volume (10,000 rows) in the Hive client

hive> show tables;
OK
kylin_cal_dt
kylin_category_groupings
kylin_sales
Time taken: 1.835 seconds, Fetched: 3 row(s)
hive> select count(*) from kylin_sales;
OK
10000
Time taken: 65.351 seconds, Fetched: 1 row(s)

S4: Restart Kylin Server to refresh the cache

$ kylin.sh stop
$ kylin.sh start

S5: Use the default username/password ADMIN/KYLIN to access http://192.168.200.165:7070/kylin

Once in the console, select the project named learn_kylin.

S6: Select the test cube “kylin_sales_cube”, click “Action” – “Build”, and choose a date after 2014-01-01 so that all 10,000 test records are covered.

After choosing the build date and submitting, a prompt indicates that the build task has been submitted successfully.

S7: On the Monitor page, watch the progress of the job until it reaches 100%.

After the job completes, switch to the Model page and you will find the Cube in the “READY” state, which means it can now serve SQL queries. During the build a temporary table is generated in Hive; it is deleted automatically once the job reaches 100%.
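As a quick check, run a query from the Insight page. The query below is just a typical aggregation over the sample data, assuming the standard kylin_sales sample schema (part_dt, price, seller_id):

-- total sales and distinct sellers per day, served from the Cube
select part_dt, sum(price) as total_sold, count(distinct seller_id) as sellers
from kylin_sales
group by part_dt
order by part_dt;

Since the Cube is READY, Kylin answers this from the pre-built Cube instead of scanning Hive, so it should come back far faster than the 65-second count(*) in S3.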

0x04 Common errors

4.1 Running check-env.sh prompts an error

please make sure user has the privilege to run hbase shell

Check whether the HBase environment variables are configured correctly; the issue was resolved after reconfiguring them. Reference: http://www.jianshu.com/p/632b…

4.2 hadoop-env.sh script problem

/home/hadoop-2.5.1/contrib/capacity-scheduler/.jar (No such file or directory)

WARNING: Failed to process JAR [jar:file:/home/hadoop-2.5.1/contrib/capacity-scheduler/.jar!/] for TLD files
java.io.FileNotFoundException: /home/hadoop-2.5.1/contrib/capacity-scheduler/.jar (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:215)
    at java.util.zip.ZipFile.<init>(ZipFile.java:145)
    at java.util.jar.JarFile.<init>(JarFile.java:153)
    at java.util.jar.JarFile.<init>(JarFile.java:90)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getJarFile(JarURLConnection.java:89)
    at org.apache.tomcat.util.scan.FileUrlJar.<init>(FileUrlJar.java:41)
    at org.apache.tomcat.util.scan.JarFactory.newInstance(JarFactory.java:34)
    at org.apache.catalina.startup.TldConfig.tldScanJar(TldConfig.java:485)
    at org.apache.catalina.startup.TldConfig.access$100(TldConfig.java:61)
    at org.apache.catalina.startup.TldConfig$TldJarScannerCallback.scan(TldConfig.java:296)
    at org.apache.tomcat.util.scan.StandardJarScanner.process(StandardJarScanner.java:258)
    at org.apache.tomcat.util.scan.StandardJarScanner.scan(StandardJarScanner.java:220)
    at org.apache.catalina.startup.TldConfig.execute(TldConfig.java:269)
    at org.apache.catalina.startup.TldConfig.lifecycleEvent(TldConfig.java:565)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
    at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:90)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5412)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1081)
    at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1877)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

Fix: comment out the following lines in ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh (on every node):

#for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
#  if [ "$HADOOP_CLASSPATH" ]; then
#    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
#  else
#    export HADOOP_CLASSPATH=$f
#  fi
#done

4.3 Clean up Kylin Space

kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete true
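The --delete true flag actually removes data. To see what would be cleaned first, the same job can be run as a dry run with --delete false, which only lists the unreferenced resources:

$ kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete false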

4.4 Permission denied

During the Kylin cube test, an error occurs:

org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

Solutions:

1. Configure hdfs-site.xml

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

2. Grant 777 permissions to the /user directory on HDFS

$ hadoop fs -chmod -R 777 /user
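Opening /user to everyone works but is heavy-handed. A narrower fix (a sketch, assuming hdfs is the HDFS superuser and jobs are submitted as root) is to create and hand over just that user's home directory:

$ sudo -u hdfs hadoop fs -mkdir -p /user/root
$ sudo -u hdfs hadoop fs -chown root:root /user/root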

0x05 Reference links

http://kylin.apache.org/cn/do…

http://kylin.apache.org/cn/do…

http://www.cnblogs.com/avivay…

It’s Friday

Update 1: 2017-05-04 20:10:05, Thursday