Docker

In this quickstart, we will download the Apache Druid image from Docker Hub and set it up on a single machine using Docker and Docker Compose.

After this initial setup, the cluster will be ready to load data.

Before you begin, it helps to read the following pages, because they explain concepts that the Docker installation and configuration steps rely on.

  • Druid overview
  • Ingestion overview

Familiarity with how Docker works is also recommended before running Druid on Docker.

Prerequisites

  • Docker

Getting started

Druid's source code includes an example docker-compose.yml file. This file pulls an image from Docker Hub and can be used to try out Docker-based Druid configuration and deployment.

Compose file

The example docker-compose.yml file will create a container for each Druid service, as well as a ZooKeeper container and a PostgreSQL container for metadata storage.

A named volume, druid_shared, will also be created and mounted at /opt/shared in the containers. It serves as deep storage, so that segments and task logs are kept and shared among the Druid services.

The Druid containers are configured through an environment file.

Configuration

The Druid Docker container is configured through environment variables. Some of these variables can also point to the paths of standard Druid configuration files.

Special environment variables:

  • JAVA_OPTS – set Java options
  • DRUID_LOG4J – set the entire log4j.xml verbatim
  • DRUID_LOG_LEVEL – override the default log level in the default log4j configuration
  • DRUID_XMX – set Java Xmx
  • DRUID_XMS – set Java Xms
  • DRUID_MAXNEWSIZE – set the Java maximum new size
  • DRUID_NEWSIZE – set the Java new size
  • DRUID_MAXDIRECTMEMORYSIZE – set the Java maximum direct memory size
  • DRUID_CONFIG_COMMON – full path to a file of Druid common properties
  • DRUID_CONFIG_${service} – full path to a file of Druid service properties
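
To make the memory-related variables more concrete, here is a minimal Python sketch of how they could correspond to JVM flags. The flag names are standard HotSpot options, but the mapping function, its name, and the example values are assumptions made for illustration. This is not Druid's actual launch script.

```python
# Illustrative only: a hypothetical mapping from the memory-related
# environment variables to the JVM flags they most plausibly control.
MEMORY_FLAGS = {
    "DRUID_XMX": "-Xmx{}",
    "DRUID_XMS": "-Xms{}",
    "DRUID_MAXNEWSIZE": "-XX:MaxNewSize={}",
    "DRUID_NEWSIZE": "-XX:NewSize={}",
    "DRUID_MAXDIRECTMEMORYSIZE": "-XX:MaxDirectMemorySize={}",
}


def jvm_flags_from_env(env):
    """Return JVM flags for the memory variables that are set in env."""
    flags = []
    for name, template in MEMORY_FLAGS.items():
        value = env.get(name)
        if value:  # skip variables that are unset or empty
            flags.append(template.format(value))
    return flags


# Example values are assumptions, not recommended settings.
example_env = {
    "DRUID_XMX": "1g",
    "DRUID_XMS": "1g",
    "DRUID_MAXDIRECTMEMORYSIZE": "2g",
}
print(jvm_flags_from_env(example_env))
```

Variables that are not set produce no flag, so the JVM defaults apply to them.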

In addition to the special environment variables above, the script that launches Druid in the container will also try to use any environment variable starting with the druid_ prefix as a Druid configuration setting.

For example, the environment variable

druid_metadata_storage_type=postgresql

is converted to the property

druid.metadata.storage.type=postgresql

Druid's docker-compose.yml file shows how to configure all Druid processes using a single environment file.

However, in a production environment, it is recommended to use DRUID_CONFIG_COMMON and DRUID_CONFIG_${service}, or environment files tailored to each service, so that every service receives its own configuration parameters.
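
The druid_ prefix rule described in this section can be sketched in a few lines of Python. This is a simplified illustration that assumes every underscore becomes a dot. The function name is hypothetical, and the real launch script may handle special cases that are not covered here.

```python
def druid_properties(env):
    """Convert druid_-prefixed environment variables to Druid property names.

    Simplified rule (an assumption): every underscore becomes a dot.
    Variables without the lowercase druid_ prefix are ignored.
    """
    props = {}
    for name, value in env.items():
        if name.startswith("druid_"):
            props[name.replace("_", ".")] = value
    return props


env = {
    "druid_metadata_storage_type": "postgresql",
    "DRUID_XMX": "1g",    # special variable, not a druid_ property
    "PATH": "/usr/bin",   # unrelated variable
}
for key, value in druid_properties(env).items():
    print(f"{key}={value}")
```

Only the lowercase druid_ variable is converted. The uppercase special variables are handled separately, as listed earlier.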

Start the cluster

Run the docker-compose up command to start the cluster with a shell attached.

If you want to start the cluster in the background, run the docker-compose up -d command.

If you are using the example files directly, run the command from the distribution/docker/ directory.

Once the cluster has started, you can open the console at http://localhost:8888 in your browser.

The Druid Router process serves the Druid console at this address.

It takes a few seconds for all Druid processes to start up completely. If you open the console immediately after the processes start, you may see some errors that can be safely ignored; simply refresh the page.

At this point, you can continue with step 4 of the Quickstart page to load data.

If you need additional external service dependencies, you can add them to the docker-compose.yml file and restart the cluster.
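
Instead of refreshing by hand, you could poll the Router until it responds. The sketch below uses only the Python standard library. The function name, the timeout values, and the choice of the /status/health path are assumptions made for illustration.

```python
import time
import urllib.request


def wait_until_ready(url, timeout_s=60.0, interval_s=2.0):
    """Poll url until it answers with HTTP 200 or timeout_s elapses.

    Returns True if the endpoint responded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Connection refused, timeout, or HTTP error:
            # the processes are probably still starting.
            pass
        time.sleep(interval_s)
    return False
```

For example, wait_until_ready("http://localhost:8888/status/health") would return True once the Router answers, or False if it does not respond within a minute.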

Docker memory requirements

If a process crashes with error code 137 during Docker startup, Docker has most likely run out of memory.

As a starting point, try allocating about 6 GB of memory to Docker.
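
The reason 137 points to a memory problem is that exit codes above 128 conventionally mean "terminated by signal (code - 128)", and 137 - 128 = 9 is SIGKILL, the signal the kernel sends when it kills a process for using too much memory. A small Python sketch of that decoding follows. The helper name is hypothetical, and it assumes a POSIX system.

```python
import signal


def describe_exit_code(code):
    """Explain a process or container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            return f"exited with status {code}"  # not a known signal number
        return f"terminated by {sig.name}"
    return f"exited with error status {code}"


print(describe_exit_code(137))
```

A SIGKILL can also come from other sources, such as a manual kill, so treat 137 as a strong hint of memory exhaustion rather than proof.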

The image above shows the Druid project in Docker Hub.

Because Druid is usually deployed as a cluster, the Docker setup makes it easier for users to get it configured and running.

www.ossez.com/t/docker-dr…