Building a big data platform on Alibaba Cloud: Flume installation, deployment, and testing

Posted by zahidraf on Tue, 10 Dec 2019 04:09:34 +0100

I. Flume installation

1. Extract the tarball

 tar -zxvf flume-ng-1.6.0-cdh5.15.0.tar.gz -C /opt/modules/

2. Rename the directory

mv apache-flume-1.6.0-cdh5.15.0-bin/ flume-1.6.0-cdh5.15.0-bin/ 

3. Configure flume-env.sh

export JAVA_HOME=/opt/modules/jdk1.8.0_151
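
If conf/ only contains the template, a minimal preparation sketch (the Flume tarball ships conf/flume-env.sh.template):

# Create flume-env.sh from the shipped template, then set JAVA_HOME in it
cp conf/flume-env.sh.template conf/flume-env.sh
echo 'export JAVA_HOME=/opt/modules/jdk1.8.0_151' >> conf/flume-env.sh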

4. Verify the installation

bin/flume-ng version

Result:

Flume 1.6.0-cdh5.15.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: efd9b9d9eccdb177341c096d73bcaf70f9ea31c6
Compiled by jenkins on Thu May 24 04:26:40 PDT 2018
From source with checksum ae1e74e47187f6790f7fd226a8ca1920

II. The flume-ng command

Usage: bin/flume-ng <command> [options]...

1. Commands:

  agent                     run a Flume agent
  avro-client               run an avro Flume client

2. Options

(1) Global options:

  --conf,-c <conf>          use configs in <conf> directory

(2) Agent options:

  --name,-n <name>          the name of this agent (required)
  --conf-file,-f <file>     specify a config file (required if -z missing)

(3) avro-client options:

  --rpcProps,-P <file>   RPC client properties file with server connection params
  --host,-H <host>       hostname to which events will be sent
  --port,-p <port>       port of the avro source
  --dirname <dir>        directory to stream to avro source
  --filename,-F <file>   text file to stream to avro source (default: std input)
  --headerFile,-R <file> File containing event headers as key/value pairs on each new line

(4) Example commands for submitting a task:

bin/flume-ng agent --conf conf --name agent --conf-file conf/test.properties
bin/flume-ng agent -c conf -n agent -f conf/test.properties -Dflume.root.logger=INFO,console
bin/flume-ng avro-client --conf conf --host hadoop --port 8080
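
To stream a file instead of standard input, --filename can be added; a minimal sketch (the file path is a hypothetical placeholder):

# Send each line of /tmp/events.log as an event to the avro source on hadoop:8080
bin/flume-ng avro-client --conf conf --host hadoop --port 8080 --filename /tmp/events.log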

III. Configuration scenarios

1. Flume is installed on a node of the Hadoop cluster (the setup used here)

Configure JAVA_HOME:

export JAVA_HOME=/opt/modules/jdk1.8.0_151

2. Flume is installed in a Hadoop cluster where HDFS HA is configured

(1) HDFS must be accessed through the HA nameservice rather than a single NameNode address; see the sketch after this list
(2) Configure JAVA_HOME: export JAVA_HOME=/opt/modules/jdk1.8.0_151
(3) Copy Hadoop's core-site.xml and hdfs-site.xml into Flume's conf directory
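
With HA enabled, an HDFS sink should point at the logical nameservice defined in hdfs-site.xml rather than one NameNode's host:port. A minimal sketch, assuming a hypothetical nameservice named ns1:

# Address HDFS through the HA nameservice, not a single NameNode's host:port
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://ns1/flume/events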

3. Flume is installed outside the Hadoop cluster

(1) Configure JAVA_HOME

export JAVA_HOME=/opt/modules/jdk1.8.0_151

(2) Copy Hadoop's core-site.xml and hdfs-site.xml into Flume's conf directory

(3) Copy the required Hadoop JARs into Flume's lib directory, matching the cluster's Hadoop version (see the sketch below)
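
A minimal sketch of that copy step, assuming a Hadoop 2.x layout under a hypothetical /opt/modules/hadoop-2.6.0; the exact JAR set depends on your Hadoop version:

# Hypothetical paths; adjust to your actual Hadoop home and version
HADOOP_HOME=/opt/modules/hadoop-2.6.0
FLUME_LIB=/opt/modules/flume-1.6.0-cdh5.15.0-bin/lib
cp $HADOOP_HOME/share/hadoop/common/hadoop-common-*.jar $FLUME_LIB/
cp $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-*.jar $FLUME_LIB/
cp $HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-*.jar $FLUME_LIB/
cp $HADOOP_HOME/share/hadoop/common/lib/commons-configuration-*.jar $FLUME_LIB/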

IV. Running the official website example

1. Write the agent configuration file conf/flume-conf.properties

# 1.Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 2.Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop
a1.sources.r1.port = 44444

# 3.Describe the sink
a1.sinks.k1.type = logger

# 4.Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# 5.Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

2. Start the Flume agent

 bin/flume-ng agent --name a1  --conf conf  --conf-file conf/flume-conf.properties -Dflume.root.logger=INFO,console

3. Install telnet

sudo yum -y install telnet

4. Connect to port 44444 and send test input

telnet hadoop 44444

Result: Flume receives the data typed into the telnet session.
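
For example, a session might look like this (the netcat source acknowledges each line with OK, and the logger sink dumps the event body as hex plus text on the agent console):

$ telnet hadoop 44444
Trying ...
Connected to hadoop.
Escape character is '^]'.
hello
OK

# Meanwhile, the agent console logs something like:
Event: { headers:{} body: 68 65 6C 6C 6F 0D          hello. }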

 
