CDH installation guide: CM web install
Prerequisites
Before installing CDH, make sure the following services have been started on the master node (NODE1).
1. On every machine, synchronize the time against the NTP service provided by Aliyun: ntpdate -u ntp6.aliyun.com
Start the service:
systemctl start ntpd
systemctl restart ntpd
Check whether it started: ps -ef | grep n ...
Posted by Rhysickle on Mon, 10 Jan 2022 03:41:51 +0100
What if the Linux process gets stuck?
On a Linux system, a problem with the network or disk I/O can leave a process stuck. Even kill -9 cannot kill it, and many commonly used debugging tools, such as strace and pstack, also fail. What is going on? At this point, we use ps to view the process list. We can see that the status of the stuc ...
Posted by aesthetics1 on Thu, 06 Jan 2022 08:53:34 +0100
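The stuck-process excerpt above is cut off before it names the process status. As a minimal sketch, assuming the usual cause of this symptom (a process in uninterruptible sleep, shown as state D by ps), the following Python reads the state field from /proc/<pid>/stat on Linux. The function names parse_stat_line and find_uninterruptible are hypothetical and do not come from the post.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so find the LAST ')' instead of splitting naively.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_uninterruptible(proc_root="/proc"):
    """Return (pid, command) pairs for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the entry is unreadable
        if state == "D":  # assumption: D marks uninterruptible sleep
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in find_uninterruptible():
            print(pid, comm)
```

A process in this state does not act on signals until the blocking I/O returns, which would be consistent with kill -9 appearing to have no effect.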
Data analysis & node management & setting up NFS gateway service | Cloud computing
1. Data analysis
1.1 Problem
This case requires a statistical analysis exercise:
Use the client to create the input directory on HDFS
Upload the *.txt files to the input directory
Call the cluster to analyze the uploaded files and find the words that occur most often
1.2 steps
To implement this case, you need to follow the following st ...
Posted by Q695 on Wed, 05 Jan 2022 18:31:13 +0100
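The word-count task in the entry above can be illustrated locally. This is a minimal sketch in plain Python, not the Hadoop job the post runs on the cluster; the function name top_words and the rule for what counts as a word are assumptions made for illustration.

```python
import re
from collections import Counter


def top_words(texts, n=1):
    """Return the n most frequent words across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Assumption: a word is a run of letters, digits, or apostrophes,
        # compared case-insensitively.
        counts.update(re.findall(r"[A-Za-z0-9']+", text.lower()))
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["the cat and the hat", "The end"]
    print(top_words(sample, 1))
```

On a cluster the same counting is split into a map phase that emits words and a reduce phase that sums them; the sketch collapses both into one Counter.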
Big data and Hadoop & distributed file systems & distributed Hadoop clusters | Cloud computing
1. Deploy Hadoop
1.1 Problem
This case requires installing stand-alone Hadoop:
Hot word analysis:
Minimum configuration: 2 CPUs, 2 GB memory, 10 GB hard disk
Virtual machine IP: 192.168.1.50 hadoop1
Install and deploy Hadoop
Run data analysis to find the most frequently occurring words
1.2 steps
To implement this case, you need to ...
Posted by ball420 on Wed, 05 Jan 2022 18:23:47 +0100
Spark on YARN: source code analysis of how Spark submits tasks to a YARN cluster
Contents
1, Entry class - SparkSubmit
2, SparkApplication startup - JavaMainApplication, YarnClusterApplication
3, SparkContext initialization
4, YarnClientSchedulerBackend and YarnClusterSchedulerBackend initialization
5, ApplicationMaster startup
6, Spark on YARN task submission process summary
1, Entry class - SparkSubmit
When sub ...
Posted by suepahfly on Tue, 04 Jan 2022 10:03:33 +0100
Building Hadoop on virtual machines (pseudo-distributed and fully distributed setups)
After a semester of learning Hadoop, I finally cracked this tough nut. Tears!!! This article is more of a summary of what I learned about Hadoop.
1, Preparatory work
1. Hadoop archive
It is available on the official website. Download the archive and have it ready. I use version 2.7.1.
2. JDK archive
This is the Java r ...
Posted by many_pets on Sat, 01 Jan 2022 19:05:16 +0100
Installing Kerberos authentication on CDH5
BUG
A bug warning up front: Kerberos version 1.15.1-18.el7.x86_64 has a bug, do not install this version!!!! If you have already installed it, don't worry. Here is a solution: upgrade Kerberos.
1. System environment
1. Operating system: CentOS Linux release 7.5.1804 (Core)
2. CDH: 5.16.2-1.cdh5.16.2.p0.8
3. Kerberos: 1 ...
Posted by elum.chaitu on Sat, 01 Jan 2022 04:23:06 +0100
2021-12-30 the 58th step towards the program
Contents
1, Introduction to Azkaban
2, System architecture of Azkaban
3, Installation modes of Azkaban
3.1 Solo server installation
3.1.1 Introduction to solo server
3.1.2 Installation steps
3.2 Multi exec server installation
3.2.1 Node layout
3.2.2 Configure MySQL
3.2.3 Configure web server
3.2.4 Configure exec se ...
Posted by evolve4 on Sat, 01 Jan 2022 04:07:23 +0100
Big data learning tutorial SD version Chapter 9 [Flume]
Flume is a log collection tool, and since it is a tool, the main thing is to use it!
A distributed streaming framework for collecting, processing, and aggregating data
You collect data by writing a collection scheme, that is, a configuration file. Configuration schemes are described in the official documentation.
1. Flume architecture
Agent: a JVM process
Source: receive data ...
Posted by mikeylikesyou on Tue, 28 Dec 2021 16:52:10 +0100
Flume cluster installation and deployment, with introductory Flume cases: the official case of monitoring port data, and real-time monitoring of content appended to multiple files in a specified directory
Introduction: These are study notes on installing and deploying Flume. The main contents are Flume installation and deployment and two introductory Flume cases: the official case of monitoring port data, and real-time tracking of changes to multiple files in a specified directory. If there are mistakes, p ...
Posted by 3.grosz on Tue, 28 Dec 2021 09:57:51 +0100