1, Flume optimization
1. Adjust the memory size of the Flume process
The recommended setting is 1G~2G; too small a value leads to frequent GC. Because the Flume process runs on Java, its memory must be configured like that of any JVM process. In general, set the memory of a single Flume process (or a single Agent) to 1G~2G. If the memory is ...
Posted by mcfmullen on Fri, 04 Mar 2022 03:22:49 +0100
Flume introduction and flume deployment, principle and use
Flume is a highly available, reliable, distributed system for collecting, aggregating and transmitting massive volumes of log data, provided by Cloudera. Flume is based on a streaming architecture, which makes it flexible and simple.
Flume's main function is to read the data from the server ...
Posted by CONFUSIONUK on Sat, 19 Feb 2022 16:46:18 +0100
Spark BigData Program: big data real-time stream processing log
1, Project content
Write Python scripts to continuously generate user behavior logs for a learning website. Start Flume to collect the generated logs. Start Kafka to receive the logs collected by Flume. Use Spark Streaming to consume the user logs from Kafka. Spark Streaming cleans the data ...
Posted by iriedodge on Mon, 31 Jan 2022 13:43:09 +0100
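The first step of the project in the entry above (a Python script that keeps producing user behavior logs) could look roughly like the sketch below. It is not the original post's script: the log layout (IP, timestamp, request, status, separated by tabs), the sample URL paths, and the function names `make_log_line` and `generate` are all assumptions made for illustration.

```python
import random
import time
from datetime import datetime

# Illustrative values only; the original project's paths and codes are not known.
COURSE_PATHS = ["/class/112.html", "/class/128.html", "/learn/821", "/course/list"]
STATUS_CODES = [200, 404, 500]


def make_log_line(rng: random.Random, now: datetime) -> str:
    """Build one tab-separated log line: ip, timestamp, request, status."""
    ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
    path = rng.choice(COURSE_PATHS)
    status = rng.choice(STATUS_CODES)
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return f"{ip}\t{stamp}\tGET {path} HTTP/1.1\t{status}"


def generate(path: str, count: int, interval_s: float = 0.0, seed=None) -> None:
    """Append `count` log lines to `path`, pausing `interval_s` seconds between lines."""
    rng = random.Random(seed)
    with open(path, "a", encoding="utf-8") as out:
        for _ in range(count):
            out.write(make_log_line(rng, datetime.now()) + "\n")
            out.flush()  # make each line visible to a tailing collector right away
            if interval_s:
                time.sleep(interval_s)
```

Calling `generate("access.log", count=10, interval_s=1.0)` from a loop or a scheduled job gives a file that keeps growing, which a Flume source that tails files could then pick up.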
Xiaobai's big data journey (73)
The previous chapter introduced the internal principles of Flume. This chapter covers extended Flume topics, with a focus on understanding and using Flume's user-defined components.
The internal principle was introduced in the pr ...
Posted by xgab on Tue, 25 Jan 2022 12:08:08 +0100
Flume log collection tool: since it is a tool, the point is to use it!
A distributed streaming framework for data collection, processing and aggregation
A tool that collects data according to a collection scheme you write, that is, a configuration file. The configuration options are described in the official documentation
1. Flume architecture
Agent: a JVM process
Source: receive data ...
Posted by mikeylikesyou on Tue, 28 Dec 2021 16:52:10 +0100
Introduction: This is a learning-notes post about installing and deploying Flume. It covers Flume installation and deployment and two introductory Flume cases: the official case of monitoring port data, and real-time tracking of changes to multiple files in a specified directory. If there are mistakes, p ...
Posted by 3.grosz on Tue, 28 Dec 2021 09:57:51 +0100
26 data analysis cases -- the fourth station: web server log data collection based on Flume and Kafka
Python: Python 3.x; Hadoop 2.7.2 environment; Kafka_2.11; Flume-1.9.0.
Link: https://pan.baidu.com/s/1oZcqAx0EIRF7Aj1xxm3WNw Extraction code: kohe
Step 1: install and start the httpd s ...
Posted by cemzafer on Sun, 26 Dec 2021 07:37:24 +0100
I. Flume installation and deployment
Flume official website: http://flume.apache.org/
Documentation: http://flume.apache.org/FlumeUserGuide.html
Download: http://archive.apache.org/dist/flume/
Installation and deployment: CDH 6.3.1 is used locally, and Flume is already installed. The instal ...
Posted by safra on Sat, 25 Dec 2021 11:49:21 +0100
This article covers the first stage of a big data offline data processing project: data collection
1) Use Flume to collect website log data into access.log
2) Write a shell script that splits the collected log file (otherwise access.log grows too large) and renames it to access_mm/dd/yyyy.log.   ...
Posted by erth on Tue, 30 Nov 2021 12:59:03 +0100
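The entry above splits access.log with a shell script. As a rough illustration of the same idea in Python instead (not the post's script), the sketch below moves the current log to a dated name and starts a fresh empty file. The function name `rotate` and the `access_YYYY-MM-DD.log` naming pattern are assumptions; the post's exact date format is not recoverable from the excerpt.

```python
import shutil
from datetime import date
from pathlib import Path


def rotate(log_path: Path, day: date) -> Path:
    """Move `log_path` to a dated sibling file and recreate it empty.

    Example: access.log -> access_2021-11-30.log (naming pattern is an assumption).
    """
    dated = log_path.with_name(f"{log_path.stem}_{day:%Y-%m-%d}{log_path.suffix}")
    if dated.exists():
        # Refuse to overwrite an earlier split for the same day.
        raise FileExistsError(dated)
    shutil.move(str(log_path), str(dated))
    log_path.touch()  # new empty file for the next batch of log lines
    return dated
```

A daily scheduled job could call `rotate(Path("access.log"), date.today())`. A process that keeps the old file open will continue writing to the moved file until it reopens the path, so the writer may need to be signalled or restarted after the move.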
1. Agent Components
Components in Agent include Source, Channel, Sink.
The Source component can handle various types and formats of log data.
Common sources in Flume:
Category: Description
avro: Listens on an Avro port and receives Events from external Avro client streams
exec: Exec source r ...
Posted by sohdubom on Sun, 21 Nov 2021 19:51:24 +0100
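To make the Source, Channel, Sink split in the entry above concrete, here is a toy model in plain Python. It is not Flume's API and does not use any Flume classes: the `Channel` class and the `source` and `sink` functions are hypothetical names, and the only thing taken from Flume is the idea that a source wraps data as events (headers plus body), a channel buffers them, and a sink drains them.

```python
import queue


class Channel:
    """Bounded buffer between a source and a sink (toy stand-in for a Flume channel)."""

    def __init__(self, capacity: int):
        self._q = queue.Queue(maxsize=capacity)

    def put(self, event: dict) -> None:
        self._q.put_nowait(event)  # raises queue.Full when the buffer is full

    def take(self):
        """Return the next buffered event, or None when the channel is empty."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


def source(lines, channel: Channel) -> None:
    """Wrap each raw line as an event (headers + body) and hand it to the channel."""
    for line in lines:
        channel.put({"headers": {}, "body": line})


def sink(channel: Channel, out: list) -> int:
    """Drain the channel, append event bodies to `out`, and return how many were delivered."""
    delivered = 0
    while (event := channel.take()) is not None:
        out.append(event["body"])
        delivered += 1
    return delivered
```

The bounded queue is the point of the sketch: the source and sink never call each other directly, so a slow sink shows up as a full channel rather than as lost data.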