Chapter 2: Practical YARN operation cases

Take Linux snapshots of hadoop102, hadoop103 and hadoop104, then restore the snapshots. 2.1 Configuring core YARN parameters for a production environment. Requirement: count the occurrences of each word in 1 GB of data, on 3 servers, each with 4 GB of memory and a 4-core, 4-thread CPU. Analysis: 1 GB / 128 MB = 8 MapTasks; 1 ReduceTask ...
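The sizing above (8 MapTasks across 3 nodes with 4 GB RAM and 4 cores each) is usually reflected in yarn-site.xml. A minimal sketch under those assumptions — the concrete values below are illustrative choices, not taken from the post:

```xml
<!-- yarn-site.xml: hedged sketch for 3 nodes with 4 GB RAM / 4 vcores each -->
<configuration>
  <!-- Memory YARN may hand out per NodeManager (leave ~1 GB headroom for the OS) -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>
  </property>
  <!-- vcores available per NodeManager -->
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
  <!-- Per-container memory bounds for the scheduler -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>3072</value>
  </property>
</configuration>
```

With 3 GB usable per node, the 8 MapTasks of the example cannot all run at once; the scheduler runs them in waves, which is exactly the kind of trade-off this parameter tuning is about.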

Posted by bigMoosey on Thu, 27 Jan 2022 19:17:13 +0100

65. Comprehensive Spark case (Sogou search log analysis)

Sogou Labs: the search-engine query log dataset is a collection of web query logs covering query requests to the Sogou search engine and the corresponding user clicks over roughly one month (June 2008). It provides a benchmark corpus for researchers analyzing the behavior of Chinese search-engine users. Contents: original da ...

Posted by Ace_Online on Thu, 27 Jan 2022 16:49:35 +0100

Installation and use of Hive

1. Install Hive. 1.1 Install Java. Java must be installed on the system before Hive can be installed. Use the following command to verify whether Java is already installed; if it is, you will see the response below. Otherwise, download it from the official Java website and run the straightforward installer. 1.2 Hadoop installation: downl ...

Posted by simenss on Thu, 27 Jan 2022 12:27:34 +0100

A beginner's big data journey of fighting monsters and leveling up <Flume Advanced>

Xiaobai's big data journey (73): Flume advanced. Recap: the previous chapter introduced Flume's internal principles; this chapter covers extended Flume topics. The focus is understanding and learning to use Flume's user-defined components. Custom components: the internal principles were introduced in the pr ...

Posted by xgab on Tue, 25 Jan 2022 12:08:08 +0100

Apache Hive 3.x deployment

About Hive: Hive is a data-warehouse framework built on Hadoop that maps structured data files to database tables and provides SQL-like query functionality. Hive converts SQL into MapReduce tasks for execution, with HDFS providing the underlying data storage. Hive was originally developed by Facebook and later transferred to the Apache Softwa ...
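The SQL-to-MapReduce mapping described above can be illustrated with the classic word count in HiveQL; the table and column names here are hypothetical, not from the post:

```sql
-- Hypothetical table: one line of raw text per row
CREATE TABLE IF NOT EXISTS docs (line STRING);

-- Hive compiles this into a MapReduce job: split/explode run in the
-- map phase, GROUP BY becomes the shuffle, count(1) runs in reduce.
SELECT word, count(1) AS cnt
FROM (SELECT explode(split(line, '\\s+')) AS word FROM docs) t
GROUP BY word;
```

The point is that the author of the query never writes mapper or reducer code; Hive derives both from the SQL plan.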

Posted by srhino on Mon, 24 Jan 2022 11:52:35 +0100

Installing and building a hadoop-3.3.0 cluster on Ubuntu 18.04

Installing and building a hadoop-3.3.0 cluster on Ubuntu 18.04. Reference blog: https://blog.csdn.net/sunxiaoju/article/details/85222290 ...

Posted by adammo on Mon, 24 Jan 2022 03:58:19 +0100

Design and implementation of a multi-rack block placement policy for HDFS

Preface: As is well known, HDFS keeps three replicas of its data to ensure high availability, and the placement of those three replicas is carefully designed: two replicas are placed on the same rack (on different nodes), and the third replica is placed on another rack. Under this placement strategy, the replica data can toler ...
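The rack-aware placement described above only works if HDFS knows the cluster topology. A minimal configuration sketch, assuming a site-provided host-to-rack mapping script (the script path is hypothetical):

```xml
<!-- core-site.xml: tell Hadoop how to map hosts to racks -->
<property>
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf/rack-topology.sh</value> <!-- hypothetical path -->
</property>

<!-- hdfs-site.xml: the default replication factor of three -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

The script receives IPs or hostnames and prints a rack path such as /rack1; without it, every node falls into /default-rack and the two-racks guarantee is lost.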

Posted by polymnia on Sun, 23 Jan 2022 08:42:49 +0100

Hive data types, database operations, table operations, and data import/export

Hive data types. 1. Basic data types. 2. Collection data types. Case practice: (1) Suppose a table contains the following row; we use JSON to represent its data structure. As accessed under Hive, the format is { "name": "songsong", "friends": ["bingbing", "lili"], // list (Array) "children": { // key-value (Map) "xiao song": 18, ...
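The row sketched in JSON above maps to a Hive table that uses all three collection types. A hedged DDL sketch — the delimiters and any fields beyond the excerpt are assumptions for illustration:

```sql
CREATE TABLE test (
  name     STRING,
  friends  ARRAY<STRING>,     -- list (Array)
  children MAP<STRING, INT>,  -- key-value (Map)
  address  STRUCT<street:STRING, city:STRING>  -- struct (assumed field)
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '_'
  MAP KEYS TERMINATED BY ':'
  LINES TERMINATED BY '\n';
```

Elements are then addressed as friends[0], children['xiao song'] and address.street in queries.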

Posted by luanne on Sun, 23 Jan 2022 00:23:28 +0100

Alibaba Cloud MaxCompute SQL: learning DDL

💝 Today we introduce some features of MaxCompute, a big data engine, and MaxCompute SQL. Readers interested in Hive SQL can review the following 👇: Part I: Importing and exporting Hive data in Hadoop (DML). Part II: Hive query statements in Hadoop. Part III: The seven join statements of Hive in Hadoop. Part IV: Ranking in Hive in Hadoo ...

Posted by gilijk on Fri, 21 Jan 2022 23:49:20 +0100

Hadoop cluster construction and configuration

Environment description and purpose. Preparation: I set up three virtual machines myself (see "Using Hyper-V to build a virtual-machine cluster environment on the Windows platform", a18792721831's blog on CSDN). The environment is as follows (columns: hostname | NameNode | DataNode | ResourceManager | NodeManager | exposed externally): hadoop01 | start | do not start | start ...

Posted by Ardivaba on Fri, 21 Jan 2022 22:06:41 +0100