Hadoop data processing (sophomore training in 2020)
1, Project background
The training project is a statistical analysis of automobile sales data. Through this project, we will deepen our understanding of the HDFS distributed file system and the MapReduce distributed parallel computing framework, learn to apply them skillfully, experience the dev ...
Posted by Garcia on Sat, 19 Feb 2022 11:54:35 +0100
Reference link 1, Reference link 2. The code comes from link 2 and has been modified by me. My skill level is limited, so please point out any mistakes.
Hadoop 3.2.1. Write the code in a CentOS 7 window, package it, and submit it to the Hadoop cluster on CentOS to run. Idea: put the pictures on HDFS, and then write the path of each im ...
Posted by benyhanna on Fri, 18 Feb 2022 06:16:31 +0100
Hadoop[03-03] access count test based on DFS and ZKFC (Hadoop 2.0)
Prepare the environment
Prepare multiple virtual machines and start DFS and ZooKeeper. See the link for details: Hadoop2.0 start DFS and Zookeeper
Some details of the virtual machines are as follows:
number  host name  host domain name  IP address
①       Toozky     Toozky            192.168.64.220
②       Toozky2    T ...
Posted by chantown on Fri, 11 Feb 2022 01:20:17 +0100
MapReduce is a cluster-based, high-performance parallel computing platform, a software framework for parallel computation, and a parallel programming model and method.
Characteristics:
① Reliable distribution. Operations on the data set are distributed to multiple nodes in the cluster to ac ...
Posted by warren on Thu, 10 Feb 2022 19:39:51 +0100
Part of the content is taken from the training materials of Atguigu (尚硅谷), Itheima (黑马程序员), and others.
1. Get to know MapReduce
1.1 Understand the MapReduce idea
The MapReduce idea can be seen everywhere in daily life, and everyone has encountered it to some degree. The core idea of MapReduce is "divide and then combine ...
Posted by mkili on Thu, 03 Feb 2022 16:06:08 +0100
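As a rough illustration of the "divide and then combine" idea mentioned in the excerpt above, here is a minimal single-machine sketch in plain Python. It does not use Hadoop at all; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. It assumes the classic word-count task: the input is divided into lines, each line is mapped to (word, 1) pairs, and the pairs are then combined per word.

```python
from collections import defaultdict


def map_phase(line):
    """Divide: turn one input line into a list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group the values of all pairs by key, as a stand-in for the shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine: sum all counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hello hadoop", "hello mapreduce"]))
```

In a real cluster the map calls would run in parallel on different nodes and the grouping would happen over the network; this sketch only shows the shape of the computation.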
Exception code description
I have just started working with Hadoop and do not yet understand MapReduce particularly well. The following records a problem that had me stuck for a day, along with its solution.
1. Execute the MapReduce task
hadoop jar wc.jar hejie.zheng.mapreduce.wordcount2.WordCountDriver /input /output
2. An exception is thrown ...
Posted by burzvingion on Thu, 03 Feb 2022 14:16:53 +0100
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
public cla ...
Posted by warydig on Mon, 31 Jan 2022 03:11:39 +0100
1. Introduction to big data
1.1 Big data concept
Big data refers to a data set that cannot be captured, managed, and processed by conventional software tools within a certain time range. It is a massive, fast-growing, and diversified information asset that requires new processing modes to provide stronger decision-making power, insight an ...
Posted by monkuar on Sat, 29 Jan 2022 15:27:44 +0100
First, set up the Hadoop 2.0 development environment for eclipse 7.1. The resources used are linked as follows:
Install Hadoop 2.0 for windows 7.1 environment
Building hadoop development environment under eclipse
With this in place, we can develop Hadoop programs in Eclipse.
1, Introduction to MapReduce model
1. Map and Reduce functions
2. MapReduce a ...
Posted by clown[NOR] on Fri, 21 Jan 2022 03:56:50 +0100