Flume 13: Flume optimization
1. Flume optimization
1. Adjust the memory size of the Flume process
It is recommended to set it to 1G~2G; too small a heap leads to frequent GC. Because the Flume process runs on Java, this means setting the JVM memory of the process. Generally, it is recommended to set the memory of a single Flume process (a single Agent) to 1G~2G. If the memory is ...
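The heap recommendation above can be applied through the agent's environment file; a minimal sketch, assuming the stock `conf/flume-env.sh` location and the 1G/2G values from the text (adjust for your workload):

```shell
# Sketch of conf/flume-env.sh: give a single Flume agent a 1G-2G JVM heap
# (values follow the 1G~2G recommendation above; tune for your workload)
export JAVA_OPTS="-Xms1g -Xmx2g"
```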
Posted by mcfmullen on Fri, 04 Mar 2022 03:22:49 +0100
Flume introduction and flume deployment, principle and use
Flume overview
Flume is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transporting massive amounts of log data. Flume is based on a streaming architecture, which is flexible and simple.
Flume's main function is to read data from the server ...
Posted by CONFUSIONUK on Sat, 19 Feb 2022 16:46:18 +0100
Spark BigData Program: big data real-time stream processing log
1. Project content
Write Python scripts to continuously generate user behavior logs for a learning website. Start Flume to collect the generated logs. Start Kafka to receive the logs Flume collects. Use Spark Streaming to consume Kafka's user logs. Spark Streaming then cleans the data ...
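The Flume-to-Kafka hop in this pipeline is wired with Flume's built-in Kafka sink in the agent configuration. A hedged sketch, where the agent/sink names, broker address, and topic name are all assumptions:

```properties
# Hypothetical names (a1, k1, c1); broker address and topic are assumptions
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = user_behavior
a1.sinks.k1.channel = c1
```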
Posted by iriedodge on Mon, 31 Jan 2022 13:43:09 +0100
Big data journey for beginners: fighting monsters and leveling up <Flume advanced>
Xiaobai's big data journey (73)
Flume advanced
Last review
The previous chapter introduced Flume's internal principles. This chapter explains extended Flume topics; the focus is understanding and learning to use Flume's custom (user-defined) components.
Custom components
The internal principle was introduced in the pr ...
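Custom components are wired into an agent by their fully qualified class name in the configuration file. A hedged sketch, where `com.example.MyInterceptor` is a hypothetical class; only the wiring pattern (an interceptor is registered via its nested `Builder`) comes from Flume's conventions:

```properties
# com.example.MyInterceptor is hypothetical; only the registration pattern is real
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.example.MyInterceptor$Builder
```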
Posted by xgab on Tue, 25 Jan 2022 12:08:08 +0100
Big data learning tutorial SD version Chapter 9 [Flume]
Flume is a log collection tool; since it is a tool, the focus is mainly on how to use it!
A distributed framework for streaming collection, processing, and aggregation of data.
Data is collected by writing a collection scheme, that is, a configuration file; the configuration options are described in the official documentation.
1. Flume architecture
Agent: a JVM process
Source: receive data ...
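The Source/Channel/Sink chain is declared in one properties file per agent. A minimal sketch following the official netcat-to-logger example from the Flume User Guide (the agent name `a1` and port 44444 are the documentation's conventions):

```properties
# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# netcat source listening on localhost:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# in-memory channel buffering events between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# logger sink prints events to the agent's log
a1.sinks.k1.type = logger

# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```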
Posted by mikeylikesyou on Tue, 28 Dec 2021 16:52:10 +0100
Flume cluster installation and deployment, flume entry operation cases: Official cases of monitoring port data and real-time monitoring of multiple additional files in the specified directory
Introduction: This is a study-notes blog about Flume installation and deployment. The main contents include Flume installation and deployment and two introductory Flume cases: the official case of monitoring port data, and real-time tracking of appended content across multiple files in a specified directory. If there are mistakes, p ...
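The second case (tracking appends across multiple files) uses Flume's TAILDIR source, which records read offsets in a position file so it can resume after restarts. A hedged sketch; the file paths and group patterns are assumptions:

```properties
# Paths and patterns are assumptions; TAILDIR persists offsets in positionFile
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /opt/module/flume/tail_dir.json
a1.sources.r1.filegroups = f1 f2
a1.sources.r1.filegroups.f1 = /opt/module/files1/.*file.*
a1.sources.r1.filegroups.f2 = /opt/module/files2/.*log.*
```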
Posted by 3.grosz on Tue, 28 Dec 2021 09:57:51 +0100
26 data analysis cases -- the fourth station: web server log data collection based on Flume and Kafka
26 data analysis cases -- the fourth station: web server log data collection based on Flume and Kafka
Experimental environment
Python 3.x; Hadoop 2.7.2; Kafka_2.11; Flume 1.9.0.
Data package
Link: https://pan.baidu.com/s/1oZcqAx0EIRF7Aj1xxm3WNw Extraction code: kohe
Experimental steps
Step 1: install and start the httpd s ...
Posted by cemzafer on Sun, 26 Dec 2021 07:37:24 +0100
Flume collection 2-Flume introduction
I. Flume installation and deployment
Installation address:
Flume official website: http://flume.apache.org/   Documentation: http://flume.apache.org/FlumeUserGuide.html   Download: http://archive.apache.org/dist/flume/
Installation and deployment: CDH 6.3 is used locally, in which Flume has already been installed. The instal ...
Posted by safra on Sat, 25 Dec 2021 11:49:21 +0100
Big data offline data processing project: website log file data collection, log splitting, data collection to HDFS, and preprocessing
Introduction:
This article covers the first stage of a big data offline data processing project: data collection
Main contents:
1) Use Flume to collect website log file data into access.log
2) Write a shell script to split the collected log data file (otherwise the access.log file grows too large) and rename it to access_MM/DD/yyyy.log.   ...
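Step 2) can be sketched as a small rotation script. This is a hedged illustration, not the project's actual script: the directory, the live-log name, and the `MM-DD-YYYY` date pattern are all assumptions.

```shell
#!/bin/bash
# Hedged sketch of the log-split step; LOG_DIR and the date pattern are assumptions.
LOG_DIR="${LOG_DIR:-/tmp/weblogs}"
mkdir -p "$LOG_DIR"
touch "$LOG_DIR/access.log"                      # the live log the collector writes to
DATED="$LOG_DIR/access_$(date +%m-%d-%Y).log"    # dated copy, e.g. access_03-04-2022.log
cp "$LOG_DIR/access.log" "$DATED"                # snapshot the current contents
: > "$LOG_DIR/access.log"                        # truncate so the live log stays small
```

In practice this would run from cron so each day's traffic lands in its own dated file.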
Posted by erth on Tue, 30 Nov 2021 12:59:03 +0100
Flume Agent Component Matching
1. Agent Components
The components in an Agent include the Source, Channel, and Sink.
1.1 Source
The Source component can handle various types and formats of log data.
Common sources in Flume: avro, exec, netcat, spooling directory, taildir

Category   Description
avro       Listens on an Avro port and receives Events from external Avro client streams
exec       Exec source r ...
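Of the sources listed above, exec is typically used to tail a single file by running a command and turning each output line into an Event. A hedged sketch; the file path is an assumption:

```properties
# The path /var/log/app.log is an assumption; tail -F follows the file across rotation
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1
```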
Posted by sohdubom on Sun, 21 Nov 2021 19:51:24 +0100