Apache Doris compilation and deployment

Posted by paperthinT on Mon, 08 Nov 2021 12:09:19 +0100

I. Official website

Compilation: http://doris.apache.org/master/zh-CN/installing/compilation.html

Deployment: http://doris.apache.org/master/zh-CN/installing/install-deploy.html

 

II. Docker

1 uninstall the old version

$ sudo yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine

2 install the required software packages

Note: yum-utils provides yum-config-manager, and the devicemapper storage driver requires device-mapper-persistent-data and lvm2.

$ sudo yum install -y yum-utils device-mapper-persistent-data lvm2

3 configure the repository address

$ sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

4 install Docker Engine (Community Edition)

$ sudo yum -y install docker-ce docker-ce-cli containerd.io

5 start Docker

$ sudo systemctl start docker

6 verify Docker

$ sudo docker run hello-world

7 view docker configuration

$ sudo docker info

8 restart docker

$ sudo systemctl restart docker

9 search image

$ docker search [image_name]

10 get image

$ docker pull [image_name:image_tag]

11 list local images

$ docker images

12 delete image

$ sudo docker rmi [image_name:image_tag]

13 create container

$ sudo docker create -it [image_name:image_tag]  /bin/bash

14 start container

$ sudo docker start [container_id]

15 create and start a container

# Interactive startup
$ sudo docker run -it [image_name:image_tag] /bin/bash

# Background start
$ sudo docker run -it -d [image_name:image_tag] /bin/bash

16 enter a running container

$ sudo docker exec -it [container_id] /bin/bash

17 stop container

$ sudo docker stop [container_id]

18 delete container

# Delete stopped containers
$ sudo docker rm [container_id]
# Delete running container
$ sudo docker rm -f [container_id]

19 viewing containers

# View running containers
$ sudo docker ps

# View all containers
$ sudo docker ps -a

III. Configure npm

1 configure the proxy

npm config set proxy [your_proxy]
npm config set https-proxy [your_proxy]

2 configure the registry

npm config set registry https://registry.npm.taobao.org
npm config get registry

IV. Configure Maven

Run mvn -v to find the Maven home directory, then edit conf/settings.xml under it.

1 configure the proxy

<proxy>
    <!-- Proxy id (optional) -->
    <id>optional</id>
    <!-- true enables this proxy -->
    <active>true</active>
    <!-- Protocol -->
    <protocol>http</protocol>
    <!-- Proxy user name and password; comment out or delete these if the proxy needs none -->
    <username></username>
    <password></password>
    <!-- Proxy IP and port; replace with your own values -->
    <host>[your_proxy_host]</host>
    <port>[your_proxy_port]</port>
    <!-- Hosts to reach without the proxy, separated by a vertical bar (|); usually the address of your local Maven repository -->
    <nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>

2 configure the mirror

<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>central</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
<!-- Or the OSChina (Open Source China) Maven mirror -->
<mirror>
    <id>nexus-osc</id>
    <mirrorOf>*</mirrorOf>
    <name>Nexus osc</name>
    <url>http://maven.oschina.net/content/groups/public/</url>
</mirror>

V. Compilation

1 download the image

docker pull apache/incubator-doris:build-env-1.2

2 download source code

wget https://dist.apache.org/repos/dist/dev/incubator/doris/0.14/0.14.0-rc06/apache-doris-0.14.0-incubating-src.tar.gz

tar -zxvf apache-doris-0.14.0-incubating-src.tar.gz

3 run image

docker run -it -v /[local]/.m2:/[docker]/.m2 -v /[local]/apache-doris-0.14.0-incubating-src/:/[docker]/apache-doris-0.14.0-incubating-src/ apache/incubator-doris:build-env-1.2
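The two -v options mount the local Maven repository and the source tree into the container, so downloaded dependencies and build output stay on the host. As a minimal sketch, the command can be assembled from concrete paths; build_env_run_command and its third argument (the mount prefix inside the container, standing in for /[docker]) are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# build_env_run_command LOCAL_M2 LOCAL_SRC CONTAINER_PREFIX
# Prints (does not run) the docker command with both bind mounts filled in.
# CONTAINER_PREFIX is an assumption: pick any directory inside the container.
build_env_run_command() {
    src_name=$(basename "$2")   # keep the same directory name in the container
    printf 'docker run -it -v %s:%s/.m2 -v %s:%s/%s apache/incubator-doris:build-env-1.2\n' \
        "$1" "$3" "$2" "$3" "$src_name"
}

# Example (paths are placeholders):
# build_env_run_command "$HOME/.m2" "$HOME/apache-doris-0.14.0-incubating-src" /root
```

Printing the command first makes it easy to check the mount paths before starting the container.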

4 compile FE and BE

cd /{doris_home}/

sh build.sh

 

Output directory: {doris_home}/output

 

5 compile the Broker

cd {doris_home}/fs_brokers/apache_hdfs_broker/

sh build.sh

Output directory: {doris_home}/fs_brokers/apache_hdfs_broker/output

VI. FE deployment

1 copy the FE deployment files to the specified node

2 configure FE

(1) Configuration file: {doris_home}/fe/conf/fe.conf

(2) Modify configuration item

meta_dir = ${DORIS_HOME}/doris-meta

Note: ① meta_dir is the metadata storage directory. The default value is ${DORIS_HOME}/doris-meta. The directory must be created manually.

② Other configuration items are optional. Please refer to the official website.
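Because the metadata directory has to exist before the first start, it can help to read meta_dir from fe.conf and create it in one step. The following is a minimal sketch; conf_get and resolve_meta_dir are hypothetical helper names, and the value substituted for ${DORIS_HOME} is passed in explicitly because it depends on where FE is installed.

```shell
#!/bin/sh
# conf_get KEY FILE: print the value of an uncommented "KEY = value" line.
conf_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$2" 2>/dev/null | tail -n 1
}

# resolve_meta_dir CONF_FILE DORIS_HOME: print the effective meta_dir path.
resolve_meta_dir() {
    value=$(conf_get meta_dir "$1")
    # Fall back to the default named in the text when the key is not set.
    [ -n "$value" ] || value='${DORIS_HOME}/doris-meta'
    # Replace the literal ${DORIS_HOME} placeholder with the given directory.
    printf '%s\n' "$value" | sed "s|[\$]{DORIS_HOME}|$2|g"
}

# Example (run from the FE directory; paths are placeholders):
# mkdir -p "$(resolve_meta_dir conf/fe.conf "$PWD")"
```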

 

3 start FE

sh bin/start_fe.sh --daemon

Note: the FE process starts and runs in the background. Logs are stored in the log/ directory by default. If startup fails, check log/fe.log or log/fe.out for the error.

 

4 FE high availability

4.1 connect to Doris using the MySQL client

# query_port is set in fe/conf/fe.conf and defaults to 9030
mysql -h {fe_ip} -u root -P {query_port}

 

4.2 add follower or observer

# The first FE started automatically becomes the Leader; edit_log_port defaults to 9010
ALTER SYSTEM ADD FOLLOWER "follower_host:edit_log_port";

ALTER SYSTEM ADD OBSERVER "observer_host:edit_log_port";

 

4.3 configure and start follower or observer

./bin/start_fe.sh --helper leader_host:edit_log_port --daemon

Note: ① the configuration of a Follower or Observer is the same as that of the Leader.

② the --helper leader_host:edit_log_port parameter is only required the first time a Follower or Observer is started.
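Since --helper is only needed on the first start, a start script can choose the command line automatically. The sketch below treats an empty or missing metadata directory as "first start"; that heuristic and the name fe_start_command are assumptions made for illustration, not something the official documentation prescribes.

```shell
#!/bin/sh
# fe_start_command META_DIR HELPER
# META_DIR: the FE metadata directory; HELPER: "leader_host:edit_log_port".
# Prints the start command; add --helper only when META_DIR has no content yet.
fe_start_command() {
    if [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]; then
        # Metadata already present: assume this FE has joined before.
        echo "sh bin/start_fe.sh --daemon"
    else
        # Assumed first start: join the cluster through the helper.
        echo "sh bin/start_fe.sh --helper $2 --daemon"
    fi
}

# Example (placeholders): eval "$(fe_start_command doris-meta leader_host:9010)"
```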

 

5 view FE status

SHOW PROC '/frontends'\G

 

 

 

6 remove an FE

ALTER SYSTEM DROP FOLLOWER[OBSERVER] "fe_host:edit_log_port";

 

7 precautions for FE scale-out and scale-in

7.1 scale-out

(1) The number of Follower FEs (including the Leader) must be odd. Deploying at most 3 of them to form a high-availability (HA) setup is recommended.

(2) When the FE is already deployed in high-availability mode (1 Leader and 2 Followers), we suggest adding Observer FEs to scale the FE's read capacity. You can also keep adding Follower FEs, but this is almost never necessary.

(3) Generally, one FE node can handle 10-20 BE nodes. It is recommended that the total number of FE nodes be less than 10. Three usually meet most needs.

(4) The helper cannot point to the FE itself; it must point to one or more existing Master or Follower FEs that are running normally.

7.2 scale-in

(1) When removing a Follower FE, make sure the number of remaining Follower nodes (including the Leader) is odd.
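The odd-number rule is easy to check before adding or removing a Follower. The sketch below also prints the majority size, assuming the Followers reach agreement by simple majority (an assumption here; the text only states the odd-number rule). Both function names are hypothetical.

```shell
#!/bin/sh
# follower_count_ok N: succeed when N Followers (Leader included) is odd.
follower_count_ok() {
    [ "$1" -ge 1 ] && [ $(( $1 % 2 )) -eq 1 ]
}

# quorum_size N: smallest majority among N voting FEs.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Example: refuse a change that would leave an even number of Followers.
# follower_count_ok 2 || echo "2 Followers is not allowed; use 1 or 3"
```

Under the majority assumption, 3 Followers need 2 votes and 4 Followers need 3, so both tolerate only one failure. The fourth node adds no fault tolerance.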

 

8 stop FE

sh bin/stop_fe.sh

 

VII. BE deployment

7.1 copy the BE deployment files to the specified node

7.2 configuration

7.2.1 configuration file: be/conf/be.conf

7.2.2 modify configuration items:

# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, seperate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)

① storage_root_path is the data storage directory. It defaults to be/storage, and the directory must be created manually.

② Separate multiple paths with an ASCII semicolon (;). Do not add a semicolon after the last path.

③ The storage medium of a directory, HDD or SSD, can be indicated through the path.

④ A capacity limit can be appended to each path, separated by an ASCII comma.

⑤ Example 1: storage_root_path=/home/disk1/doris.HDD,50;/home/disk2/doris.SSD,10;/home/disk2/doris

Note: for SSD disks, append .SSD to the directory; for HDD disks, append .HDD.

Explanation:

    • /home/disk1/doris.HDD,50: the storage limit is 50GB, HDD;
    • /home/disk2/doris.SSD,10: the storage limit is 10GB, SSD;
    • /home/disk2/doris: the storage limit is the full disk capacity, HDD by default.
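To double-check a storage_root_path value written in the Example 1 style before starting BE, the rules above can be applied mechanically. This is a minimal sketch, not how BE itself parses the setting; describe_storage_root_path is a hypothetical helper, and it only understands the suffix form (.HDD / .SSD plus an optional ",capacity").

```shell
#!/bin/sh
# describe_storage_root_path VALUE
# Prints one line per entry: "<path> <medium> <limit>".
describe_storage_root_path() {
    # Entries are separated by ';', path and capacity by ','.
    printf '%s\n' "$1" | tr ';' '\n' | while IFS=, read -r path capacity; do
        [ -n "$path" ] || continue          # skip empty entries
        case $path in
            *.SSD) medium=SSD ;;
            *)     medium=HDD ;;            # HDD is the default medium
        esac
        if [ -n "$capacity" ]; then
            limit="${capacity}GB"
        else
            limit="disk-capacity"           # no limit given: whole disk
        fi
        printf '%s %s %s\n' "$path" "$medium" "$limit"
    done
}

# Example:
# describe_storage_root_path '/home/disk1/doris.HDD,50;/home/disk2/doris'
```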

⑥ Example 2 is as follows: storage_root_path=/home/disk1/doris,medium:hdd,capacity:50;/home/disk2/doris,medium:ssd,capacity:50

***

Explanation:

    • ***
    • /home/disk2/doris,medium:ssd,capacity:50: the storage limit is 50GB, SSD;

7.2.3 add all BE nodes in FE

# heartbeat_service_port is 9050 by default, corresponding to be/conf/be.conf
ALTER SYSTEM ADD BACKEND "be_host:be_heartbeat_service_port";
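With several BE nodes, the ADD BACKEND statements can be generated from a host list and piped into the MySQL client. A minimal sketch follows; add_backend_sql is a hypothetical helper, and the host names in the example are placeholders.

```shell
#!/bin/sh
# add_backend_sql PORT HOST...
# Prints one ALTER SYSTEM ADD BACKEND statement per host.
add_backend_sql() {
    port=$1          # heartbeat_service_port from be/conf/be.conf
    shift
    for host in "$@"; do
        printf 'ALTER SYSTEM ADD BACKEND "%s:%s";\n' "$host" "$port"
    done
}

# Example (placeholders; 9030 and 9050 are the defaults mentioned in the text):
# add_backend_sql 9050 be1 be2 be3 | mysql -h {fe_ip} -P 9030 -u root
```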

7.3 start BE

sh bin/start_be.sh --daemon

Note: logs are stored in the be/log/ directory by default. If startup fails, check be/log/be.log or be/log/be.out for the error.

7.4 view BE status

SHOW PROC '/backends'\G

 

 

7.5 BE scale-out

ALTER SYSTEM ADD BACKEND "be_host:be_heartbeat_service_port";

Precautions for BE scale-out:

① After BE scale-out, Doris automatically rebalances data according to load; the cluster remains usable during rebalancing.

7.6 BE scale-in

ALTER SYSTEM DROP BACKEND "be_host:be_heartbeat_service_port";

# or

ALTER SYSTEM DECOMMISSION BACKEND "be_host:be_heartbeat_service_port";

Note: DROP BACKEND deletes the BE directly, and the data on it cannot be recovered! Using DROP BACKEND to remove a BE node is therefore strongly discouraged. When you use this statement, Doris shows a corresponding prompt to guard against mistakes.

7.7 DECOMMISSION command description

(1) This command safely removes a BE node. After it is issued, Doris tries to migrate the data on that BE to other BE nodes. Once all data has been migrated, Doris removes the node automatically.

(2) The command is asynchronous. After running it, SHOW PROC '/backends'; shows isDecommission as true for that BE, which means the node is going offline.

(3) The command may not complete successfully. For example, if the remaining BEs do not have enough storage space for the data on the BE being removed, or the number of remaining machines is below the minimum number of replicas, the command cannot finish and the BE stays in the isDecommission = true state.

(4) The progress of DECOMMISSION can be followed through TabletNum in SHOW PROC '/backends';. While it is in progress, TabletNum keeps decreasing.

(5) The operation can be cancelled with the CANCEL DECOMMISSION command. After cancellation, the data on the BE stays at its current remaining volume, and Doris rebalances the load afterwards.

CANCEL DECOMMISSION BACKEND "be_host:be_heartbeat_service_port";
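The two failure conditions in point (3) can be checked roughly by hand before issuing DECOMMISSION. The sketch below is a simplified pre-check based only on those two conditions; it is not the logic Doris runs internally, and decommission_looks_feasible and its four inputs are hypothetical (the numbers would come from your own monitoring or from SHOW PROC '/backends';).

```shell
#!/bin/sh
# decommission_looks_feasible DATA_GB FREE_GB REMAINING_BES REPLICATION_NUM
# DATA_GB:         data stored on the BE to be removed
# FREE_GB:         total free space on the BEs that will remain
# REMAINING_BES:   number of BEs left after removal
# REPLICATION_NUM: largest replication_num used by any table
decommission_looks_feasible() {
    # Enough room for the migrated data, and enough nodes for every replica.
    [ "$2" -ge "$1" ] && [ "$3" -ge "$4" ]
}

# Example (made-up numbers):
# decommission_looks_feasible 100 500 3 3 && echo "ok to decommission"
```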


VIII. Broker deployment (optional)

8.1 copy the broker file to the specified node

fs_brokers/apache_hdfs_broker/output/apache_hdfs_broker

8.2 configure the Broker

8.2.1 configuration file

apache_hdfs_broker/conf/apache_hdfs_broker.conf

8.2.2 configuration items

# the thrift rpc port
broker_ipc_port=8000

Note: the default value of 8000 can be used without modification

8.3 starting broker

sh bin/start_broker.sh --daemon

8.4 adding a broker

ALTER SYSTEM ADD BROKER broker_name "broker1_host:broker1_ipc_port","broker2_host:broker2_ipc_port",...;

8.5 view Broker status

SHOW PROC "/brokers"\G

 

8.6 Broker scale-out

# broker_ipc_port defaults to 8000, corresponding to apache_hdfs_broker/conf/apache_hdfs_broker.conf
ALTER SYSTEM ADD BROKER broker_name "broker_host:broker_ipc_port";

8.7 Broker scale-in

ALTER SYSTEM DROP BROKER broker_name "broker_host:broker_ipc_port";

ALTER SYSTEM DROP ALL BROKER broker_name;

 

IX. Basic usage of Apache Doris

9.1 creating a new database

CREATE DATABASE `test`;

9.2 switching database

use `test`;

9.3 create a table

CREATE TABLE `student` (
  `id` int(11) NULL COMMENT "",
  `name` varchar(50) NULL COMMENT "",
  `age` int(11) NULL COMMENT "",
  `count` bigint(20) SUM NULL DEFAULT "0" COMMENT ""
) ENGINE=OLAP
AGGREGATE KEY(`id`, `name`, `age`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"in_memory" = "false",
"storage_format" = "V2"
);

9.4 insert data with INSERT INTO

insert into student values(1,'stephen',18,2);

9.5 insert data with Stream Load

Sample data (saved as /app/student.csv, the file uploaded by the curl command):

3,stephen,18,33
6,lebron,28,44
4,stephen,18,33
5,stephen,18,33
1,stephen,18,33
2,lebron,28,44

 

curl --location-trusted -u root -T /app/student.csv -H "label:123" -H "column_separator:," http://{fe_host}:{fe_http_port}/api/test/student/_stream_load
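A Stream Load label identifies one load job, so reusing a fixed value such as 123 will be rejected once a load with that label has succeeded. The sketch below builds the URL from its parts and generates a fresh label per call. The function names are hypothetical, and "-u root:" assumes root still has the default empty password.

```shell
#!/bin/sh
# stream_load_url FE_HOST FE_HTTP_PORT DATABASE TABLE
stream_load_url() {
    printf 'http://%s:%s/api/%s/%s/_stream_load\n' "$1" "$2" "$3" "$4"
}

# stream_load CSV_FILE URL: upload a comma-separated file with a unique label.
stream_load() {
    # Timestamp plus shell PID keeps the label unique across repeated runs.
    label="student_$(date +%Y%m%d_%H%M%S)_$$"
    curl --location-trusted -u root: -T "$1" \
        -H "label:$label" -H "column_separator:," "$2"
}

# Example (host and port are placeholders):
# stream_load /app/student.csv "$(stream_load_url {fe_host} {fe_http_port} test student)"
```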

9.6 query data

select * from student;

 

 

X. Postscript

1. This document covers compiling, deploying, and basic use of an Apache Doris test environment. All configuration uses the defaults from the official website.

2. For tuned configuration and advanced usage, please refer to the Apache Doris official website.

3. If you find this useful and repost it, please credit the source.