Cluster planning
Host | IP | Function |
---|---|---|
centos01 | 192.168.52.221 | master |
centos02 | 192.168.52.222 | segment |
centos03 | 192.168.52.223 | segment |
The cluster consists of three machines in total; no standby master node is set up.
System environment
Component | Version |
---|---|
Operating system | CentOS 7 |
Greenplum | Greenplum 6.17 |
Java | JDK8 |
GCC | GCC 4.8.5 |
Modify system files
Raise the system resource limits by adding the following to the /etc/security/limits.conf file:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
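The new limits only apply to sessions started after the change, so it can help to compare what the shell actually reports against the configured values. A minimal sketch follows; `check_limit` is a hypothetical helper, not part of Greenplum, and the `ulimit -u` call in the usage comment assumes bash.

```shell
# check_limit NAME CURRENT REQUIRED
# Succeeds when CURRENT is "unlimited" or at least REQUIRED.
check_limit() {
    if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
        echo "$1 ok ($2)"
    else
        echo "$1 too low ($2 < $3)"
        return 1
    fi
}

# Usage (in a fresh login shell of the user that will run the database):
# check_limit nofile "$(ulimit -n)" 65536
# check_limit nproc  "$(ulimit -u)" 131072
```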
Modify the kernel parameters by adding the following to the /etc/sysctl.conf file:
# kernel.shmall is calculated with: echo $(expr $(getconf _PHYS_PAGES) / 2)
kernel.shmall = 357475
# kernel.shmmax is calculated with: echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
kernel.shmmax = 1464217600
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
After saving the file, run the following command so the changes take effect immediately:
sysctl -p
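The kernel.shmall and kernel.shmmax values depend on each machine's physical memory, so the numbers shown above will differ between hosts. The sketch below prints both settings using the same getconf formulas quoted in the comments; the function name `print_shm_settings` is made up for illustration.

```shell
# Print kernel.shmall and kernel.shmmax for the current machine.
#   shmall = physical pages / 2
#   shmmax = shmall * page size
print_shm_settings() {
    pages=$(getconf _PHYS_PAGES)
    page_size=$(getconf PAGE_SIZE)
    shmall=$((pages / 2))
    shmmax=$((shmall * page_size))
    echo "kernel.shmall = $shmall"
    echo "kernel.shmmax = $shmmax"
}

print_shm_settings
```

The two printed lines can be pasted into /etc/sysctl.conf in place of the example values.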
Modify the *-nproc.conf file under /etc/security/limits.d/ (the file name may be 20-nproc.conf or 90-nproc.conf):
* soft nproc 131072
Disable SELinux by editing /etc/sysconfig/selinux:
SELINUX=disabled
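The same edit can be scripted. This is a minimal sketch with a hypothetical helper, `set_selinux_disabled`. It writes the result back with `cat` rather than `sed -i`, on the assumption that /etc/sysconfig/selinux is a symlink (as it usually is on CentOS 7), which in-place editing would replace with a regular file.

```shell
# set_selinux_disabled FILE
# Rewrites the SELINUX= line in FILE to SELINUX=disabled and leaves all
# other lines (including SELINUXTYPE=) untouched.
set_selinux_disabled() {
    tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (as root): set_selinux_disabled /etc/sysconfig/selinux
# The file is read at boot, so the setting applies after a reboot.
```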
The /etc/hosts mappings for all nodes and the firewall shutdown were already done when the Hadoop and Spark clusters were configured earlier, so they are not repeated here.
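For reference, the host mappings implied by the cluster plan table can be sketched as follows. The helper name `print_hosts_entries` is hypothetical, and the firewall commands in the comment assume firewalld, the CentOS 7 default.

```shell
# Print the /etc/hosts lines that match the cluster plan table.
print_hosts_entries() {
    cat <<'EOF'
192.168.52.221 centos01
192.168.52.222 centos02
192.168.52.223 centos03
EOF
}

# Usage (as root, on every node):
# print_hosts_entries >> /etc/hosts
# systemctl stop firewalld && systemctl disable firewalld
```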
Add user group
This step cannot be omitted! The root user cannot be used for the subsequent initialization of Greenplum.
Add the gpadmin group and user and grant the corresponding permissions:
# Add group
groupadd -g 530 gpadmin
# Add user
useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
# Change owner
chown -R gpadmin:gpadmin /home/gpadmin
# Change password
passwd gpadmin
Configure passwordless SSH login for the gpadmin user on all three machines. This step cannot be omitted either! (For details, refer to the article "Centos 7 cluster configuration SSH password free login".) Otherwise you will have to enter the gpadmin password of each of the three machines many times during the subsequent Greenplum initialization. I was stubborn at first and skipped this configuration; repeatedly typing the passwords of three machines was exhausting, and in the end an exception occurred midway through initialization, while the gpseg instances were being created in batch, and the initialization could not continue.
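The referenced article is not reproduced here. One common approach, assumed here rather than taken from that article, is to generate a key with ssh-keygen and distribute it with ssh-copy-id. The sketch below only prints the commands so they can be reviewed before running; `print_ssh_setup` is a hypothetical helper.

```shell
# print_ssh_setup USER HOST...
# Prints (does not run) the commands for key-based login from this
# machine to each HOST as USER.
print_ssh_setup() {
    user=$1
    shift
    echo "ssh-keygen -t rsa"
    for host in "$@"; do
        echo "ssh-copy-id $user@$host"
    done
}

print_ssh_setup gpadmin centos01 centos02 centos03
```

The printed commands would be run as gpadmin on each of the three machines so that every node can reach every other node without a password.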
Installing Greenplum
Switch to the gpadmin user and create the configuration folder:
su gpadmin
mkdir -p /home/gpadmin/conf
Create the hostlist file and add the names of all three hosts, one per line:
vim /home/gpadmin/conf/hostlist
centos01
centos02
centos03
Create the seg_hosts file and add only the segment hosts, one per line:
vim /home/gpadmin/conf/seg_hosts
centos02
centos03
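Both files can also be written without an editor. A minimal sketch, assuming the host names from the cluster plan; `write_host_files` is a hypothetical helper name.

```shell
# write_host_files DIR
# Writes hostlist (all nodes) and seg_hosts (segment nodes only) into DIR.
write_host_files() {
    mkdir -p "$1" || return 1
    printf '%s\n' centos01 centos02 centos03 > "$1/hostlist"
    printf '%s\n' centos02 centos03 > "$1/seg_hosts"
}

# Usage (as gpadmin): write_host_files /home/gpadmin/conf
```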
Install Greenplum into the specified directory on each of the three machines (download address):
rpm -ivh --prefix=/usr/local/services/greenplum/ open-source-greenplum-db-6.17.0-rhel7-x86_64.rpm
If this step reports an error, it is usually caused by missing dependencies. Follow the prompts to install whatever is required.
# For example, my machine lacked apr and apr-util; install them one by one as prompted
yum install -y apr-util
Switch to the root user and configure passwordless connections for Greenplum:
source /usr/local/services/greenplum/greenplum-db/greenplum_path.sh
gpssh-exkeys -f /home/gpadmin/conf/hostlist
Create the data directories in batch and set their ownership:
# With passwordless connections in place, operate on all three machines at once
gpssh -f /home/gpadmin/conf/hostlist
mkdir -p /opt/greenplum/data/master
mkdir -p /opt/greenplum/data/primary
mkdir -p /opt/greenplum/data/mirror
mkdir -p /opt/greenplum/data2/primary
mkdir -p /opt/greenplum/data2/mirror
# Give ownership to the gpadmin user
chown -R gpadmin:gpadmin /usr/local
chown -R gpadmin:gpadmin /opt
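If the interactive gpssh session is not available, the same directory layout can be created locally on each machine. A minimal sketch; `make_gp_dirs` is a hypothetical helper, and the base directory is passed as an argument so it can be tried in a scratch location first.

```shell
# make_gp_dirs BASE
# Creates the master, primary and mirror directories under BASE.
make_gp_dirs() {
    for sub in data/master data/primary data/mirror data2/primary data2/mirror; do
        mkdir -p "$1/$sub" || return 1
    done
}

# Usage (as root, on each machine):
# make_gp_dirs /opt/greenplum
# chown -R gpadmin:gpadmin /opt
```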
Configure the environment variables for the gpadmin user:
# Open the file
vim /home/gpadmin/.bash_profile
# Content to add
source /usr/local/services/greenplum/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export GPPORT=5432
export PGDATABASE=gp_sydb
# Apply the changes immediately
source ~/.bash_profile
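Before moving on, it can be worth confirming that the variables are visible in the current shell. A minimal sketch; `check_gp_env` is a hypothetical helper that checks only the three variables exported above.

```shell
# check_gp_env
# Reports which of the variables from .bash_profile are unset or empty
# in the current shell; returns non-zero if any is missing.
check_gp_env() {
    missing=0
    for name in MASTER_DATA_DIRECTORY GPPORT PGDATABASE; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        fi
    done
    return $missing
}

# Usage (as gpadmin, after sourcing .bash_profile): check_gp_env
```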
Initialize database
Create a new initialization configuration file, initgp_config, by copying the bundled template:
cd /usr/local/services/greenplum/greenplum-db/docs/cli_help/gpconfigs
cp gpinitsystem_config initgp_config
Modify the configuration file initgp_config:
declare -a DATA_DIRECTORY=(/opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data2/primary /opt/greenplum/data2/primary)
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data2/mirror /opt/greenplum/data2/mirror)
ARRAY_NAME="gp_sydb" # Name used for the initialized system
MASTER_HOSTNAME=centos01 # Master node host name
MASTER_DIRECTORY=/opt/greenplum/data/master # Master directory created earlier
MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
DATABASE_NAME=gp_sydb # Name of the database created during initialization
MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
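Each entry in DATA_DIRECTORY produces one primary segment on every host listed in MACHINE_LIST_FILE, so the four entries above on the two segment hosts should yield eight primaries (and the same number of mirrors). The arithmetic can be sketched as follows; `count_primaries` is a hypothetical helper.

```shell
# count_primaries NUM_SEGMENT_HOSTS DIR...
# One primary segment is created per DIR entry on every segment host.
count_primaries() {
    hosts=$1
    shift
    echo $((hosts * $#))
}

count_primaries 2 \
    /opt/greenplum/data/primary /opt/greenplum/data/primary \
    /opt/greenplum/data2/primary /opt/greenplum/data2/primary
```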
Switch to the gpadmin user and run the initialization (you must use the user created earlier, not root):
source /usr/local/services/greenplum/greenplum-db/greenplum_path.sh
gpinitsystem -c initgp_config -D
If this step fails, delete all the gpseg files generated by the initialization and start the initialization again (the gpseg directories under primary in /opt/greenplum/data/ and /opt/greenplum/data2/, the gpseg-1 directory created in master, and so on). The specific error information can be found in the logs under /home/gpadmin/gpAdminLogs.
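The cleanup can be sketched as a small function that removes every gpseg* directory under the given base directories. `remove_gpseg_dirs` is a hypothetical helper; it deletes data irreversibly, so it is only meant for a failed first-time initialization.

```shell
# remove_gpseg_dirs BASE...
# Deletes every gpseg* directory found under the given base directories.
# Base directories that do not exist are skipped.
remove_gpseg_dirs() {
    for base in "$@"; do
        [ -d "$base" ] || continue
        find "$base" -type d -name 'gpseg*' -prune -exec rm -rf {} +
    done
}

# Usage (on every machine, after a failed gpinitsystem run):
# remove_gpseg_dirs /opt/greenplum/data /opt/greenplum/data2
```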
Handling errors when external clients connect to Greenplum
Error message: no pg_hba.conf entry for host
Modify the /opt/greenplum/data/master/gpseg-1/pg_hba.conf configuration file on the master node:
# Add a line that allows any user to connect from any address
host all all 0.0.0.0/0 trust
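A minimal sketch of applying this change from the shell: the hypothetical helper `add_hba_entry` appends the line only when it is not already present, and the configuration is then reloaded with `gpstop -u`. Note that `trust` on 0.0.0.0/0 accepts connections from any address without a password; a narrower network range with password authentication may be preferable outside a test environment.

```shell
# add_hba_entry FILE ENTRY
# Appends ENTRY to FILE unless an identical line already exists.
add_hba_entry() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as gpadmin on the master), followed by a configuration reload:
# add_hba_entry /opt/greenplum/data/master/gpseg-1/pg_hba.conf \
#     'host all all 0.0.0.0/0 trust'
# gpstop -u
```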
Database operation
Command | Meaning |
---|---|
gpstart | Start database |
gpstop -r | Restart database |
gpstop -u | Reload only configuration file changes |
gpstop | Stop database |
psql -d gp_sydb | Log in to gp_sydb database |