Preface
I recently started using Docker and wanted to try running a Flink cluster with it. Most of the guides I found online build the image from a Dockerfile, but the official website has a tutorial that uses the prebuilt Docker image. I followed the official approach in a Linux environment and recorded the pitfalls I ran into along the way.
There are two main ways to build the cluster: with plain docker commands, or with docker-compose. I had already installed Docker and switched it to a domestic registry mirror beforehand. That is not covered here; instructions are easy to find online.
Method 1: build with docker command
- Create network
docker network create app-tier --driver bridge
- Create a jobmanager container
docker run -t -d --name jmr \
--network app-tier \
-e JOB_MANAGER_RPC_ADDRESS=jmr \
-p 8081:8081 \
flink:1.9.2-scala_2.12 jobmanager
- Create taskmanager container
docker run -t -d --name tmr \
--network app-tier \
-e JOB_MANAGER_RPC_ADDRESS=jmr \
flink:1.9.2-scala_2.12 taskmanager
Method 2: build using docker-compose
This method requires docker-compose to be installed first.
I installed it with the following commands:
su root
curl -L https://get.daocloud.io/docker/compose/releases/download/1.25.4/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
After installation, check the docker-compose version to verify that the installation succeeded
docker-compose --version
Then we can create a file named docker-compose.yml in a folder. For convenience, I'll create it in the current folder
touch docker-compose.yml
Then edit the file, write the following content, and save and exit. Pay attention to the indentation, because YAML is whitespace-sensitive
version: "2.1"
services:
  jobmanager:
    image: flink:1.9.2-scala_2.12
    expose:
      - "6123"
    ports:
      - "8081:8081"
    command: jobmanager
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
  taskmanager:
    image: flink:1.9.2-scala_2.12
    expose:
      - "6121"
      - "6122"
    depends_on:
      - jobmanager
    command: taskmanager
    links:
      - "jobmanager:jobmanager"
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
Finally, start the cluster with the following command
docker-compose up -d
How to view Flink clusters and logs
View cluster through web
The official image exposes port 8081. In both methods we mapped port 8081 of the host to port 8081 of the container, so the cluster can be viewed in a browser. I run Docker inside a virtual machine, so I open the virtual machine's IP address on port 8081. If Docker is installed directly on your machine rather than in a virtual machine, you can open localhost:8081 directly
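Besides the browser, the same port also serves Flink's REST API, so the cluster can be checked from a terminal. A minimal sketch, assuming curl is installed and the web port is published as 8081 as above. The helper names flink_rest_url and flink_overview are hypothetical and only used for illustration:

```shell
#!/bin/sh
# Build the URL of a Flink REST endpoint.
# host and port default to localhost:8081, matching the -p 8081:8081 mapping above.
flink_rest_url() {
    path="$1"
    host="${2:-localhost}"
    port="${3:-8081}"
    printf 'http://%s:%s/%s\n' "$host" "$port" "${path#/}"
}

# Fetch the cluster overview as JSON from the /overview endpoint.
# Optional arguments: host and port (for example the virtual machine's IP).
flink_overview() {
    curl -s "$(flink_rest_url overview "$@")"
}
```

Running flink_overview on the Docker host, or flink_overview [virtual machine IP] from another machine, should return a JSON summary of the cluster that includes the number of registered taskmanagers.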
View log command
# tmr is the container name; replace it with the name of another container as needed
docker logs tmr
Questions and answers
Method 1 is adapted from the official website, which gives the following instructions
Running a JobManager or a TaskManager
You can run a JobManager (master).
$ docker run --name flink_jobmanager -d -t flink jobmanager
You can also run a TaskManager (worker). Notice that workers need to register with the JobManager directly or via ZooKeeper so the master starts to send them tasks to execute.
$ docker run --name flink_taskmanager -d -t flink taskmanager
There is a pitfall here. These commands work for starting a single container, but if you start several containers this way, errors are reported
For Method 2, the docker-compose.yml is copied from the official website
- Why create a network in Method 1?
A: The network lets the containers discover and communicate with each other by container name or container ID. If the containers are not attached to the created network, the taskmanager may fail with the following error
ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
java.net.UnknownHostException: jmr: Name or service not known
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
- Why must the environment variable JOB_MANAGER_RPC_ADDRESS be specified in Method 1?
A: If the environment variable is not specified, the jobmanager (jmr) does not report an error, but the taskmanager (tmr) reports the following error
2020-02-27 03:35:12,029 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@f59c179f3470:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@f59c179f3470:6123/user/resourcemanager..
2020-02-27 03:35:20,782 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor - Fatal error occurred in TaskExecutor akka.tcp://flink@172.18.0.9:38671/user/taskmanager_0.
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
    at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1111)
    at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$8(TaskExecutor.java:1097)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
The taskmanager failed to register with the jobmanager within 300000 ms and threw an exception
The reason for this error is that when Docker starts the Flink container, it executes the docker-entrypoint.sh script of the official image.
The script specifies that if the environment variable JOB_MANAGER_RPC_ADDRESS is not set in the startup command, the local hostname is used as its value and written to the configuration file. In Docker, the hostname is the container ID, so the log shows a line like this
INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, f59c179f3470
Therefore, if we do not set this environment variable, the taskmanager tries to register at port 6123 of its own container. This is wrong, because a taskmanager cannot accept registrations; it should register with the jobmanager. So JOB_MANAGER_RPC_ADDRESS must be set to the container name, container ID, or container IP of the jobmanager
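The fallback described above can be sketched in a few lines of shell. This is a simplified illustration and not the actual docker-entrypoint.sh; the function name resolve_jm_address is hypothetical:

```shell
#!/bin/sh
# Simplified illustration of the fallback: use JOB_MANAGER_RPC_ADDRESS if it is
# set and non-empty, otherwise fall back to the local hostname
# (inside a container, the hostname defaults to the container ID).
resolve_jm_address() {
    echo "${JOB_MANAGER_RPC_ADDRESS:-$(hostname)}"
}

# With the variable set, the taskmanager looks for the jobmanager at "jmr".
JOB_MANAGER_RPC_ADDRESS=jmr
echo "jobmanager.rpc.address: $(resolve_jm_address)"

# Without it, the address falls back to the machine's own hostname.
unset JOB_MANAGER_RPC_ADDRESS
echo "jobmanager.rpc.address: $(resolve_jm_address)"
```

In a taskmanager container, the second case yields the taskmanager's own container ID, which is why the registration never reaches the jobmanager.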
- How to add containers to or remove containers from this cluster
A: To remove a container, simply use the rm command
docker rm -f tmr
To add a container, use the docker run command again with a different container name
docker run -t -d --name tmr1 \
--network app-tier \
-e JOB_MANAGER_RPC_ADDRESS=jmr \
flink:1.9.2-scala_2.12 taskmanager
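To add several taskmanagers at once, the same command can be wrapped in a loop. A minimal sketch, assuming the jmr container and the app-tier network from Method 1 already exist. The helper names run and add_taskmanagers and the DRY_RUN variable are hypothetical; with DRY_RUN set, the commands are only printed:

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start taskmanager containers named tmr1 .. tmrN on the app-tier network.
add_taskmanagers() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        run docker run -t -d --name "tmr$i" \
            --network app-tier \
            -e JOB_MANAGER_RPC_ADDRESS=jmr \
            flink:1.9.2-scala_2.12 taskmanager
        i=$((i + 1))
    done
}

# Preview the commands for three taskmanagers without starting anything.
(DRY_RUN=1; add_taskmanagers 3)
```

Calling add_taskmanagers 3 without DRY_RUN would actually start the containers; docker reports an error if a container with one of these names already exists.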
You can also add one by editing the docker-compose.yml file: copy the taskmanager configuration, change the service name (I haven't tried this, but it should work), and start it with the command below. The -d option in both docker and docker-compose starts the container in the background, so it does not print a lot of logs to the terminal. You can view the logs with the docker logs [container name] command
docker-compose up -d [service name]
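As an alternative to copying the taskmanager configuration, docker-compose can start several containers from the same service with its --scale option. A minimal sketch, assuming it is run in the folder that contains the docker-compose.yml from Method 2. The function name scale_taskmanagers is hypothetical, and this has not been tried against the cluster above:

```shell
#!/bin/sh
# Run N taskmanager containers from the single "taskmanager" service.
scale_taskmanagers() {
    n="$1"
    # Reject anything that is not a non-negative integer before calling docker-compose.
    case "$n" in
        ''|*[!0-9]*)
            echo "usage: scale_taskmanagers <count>" >&2
            return 1
            ;;
    esac
    docker-compose up -d --scale taskmanager="$n"
}
```

For example, scale_taskmanagers 3 should bring the cluster to three taskmanagers. This relies on the taskmanager service having no fixed container name and no published host port, which is the case in the file above.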
- What does flink:1.9.2-scala_2.12 mean?
A: The string has the format image name:tag. In the tag, 1.9.2 is the Flink version, and scala_2.12 means that this Flink build was compiled with Scala 2.12. To use a different image, search for it with docker search [image name, such as flink], then pull it with the docker pull [image name] command
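The structure of the string can be made explicit by splitting it with shell parameter expansion. A small sketch; the variable names are only for illustration, and it assumes the tag follows the [Flink version]-scala_[Scala version] pattern used here:

```shell
#!/bin/sh
# Split an image reference of the form name:flinkVersion-scala_scalaVersion.
image="flink:1.9.2-scala_2.12"

name="${image%%:*}"              # text before the first ":"
tag="${image#*:}"                # text after the first ":"
flink_version="${tag%%-*}"       # text before the first "-" in the tag
scala_version="${tag##*scala_}"  # text after "scala_" in the tag

echo "image name:    $name"
echo "tag:           $tag"
echo "Flink version: $flink_version"
echo "Scala version: $scala_version"
```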