HDFS high availability architecture

First, we need to set up three virtual machines (used here for demonstration). For the setup process, please refer to the earlier article on completing a fully distributed Hadoop installation with virtual machines. After finishing the fully distributed Hadoop setup from the previous article, you can do the ...

Posted by antileon on Sat, 26 Feb 2022 14:10:24 +0100

atlas stand-alone installation

1. Virtual machine preparation: update the VM with yum -y update; set the hostname with hostnamectl set-hostname atlas; stop the firewall with systemctl stop firewalld.service and systemctl disable firewalld.service; then reboot. 2. Install the JDK: uninstall OpenJDK with rpm -e --nodeps java-1.7.0-openjdk, rpm -e ...
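The preparation steps in the excerpt above can be collected into one script sketch. This assumes a CentOS 7 host run as root, with the hostname atlas taken from the article; the OpenJDK package name matches the excerpt and may differ on your system.

```shell
# Update packages and set the hostname used by the Atlas install
yum -y update
hostnamectl set-hostname atlas

# Stop and disable the firewall, then reboot so the hostname change takes effect
systemctl stop firewalld.service
systemctl disable firewalld.service
reboot

# After the reboot: remove the bundled OpenJDK before installing a standalone JDK
rpm -e --nodeps java-1.7.0-openjdk
```

Disabling firewalld is common in lab setups like this one; on a production host you would open the required ports instead.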

Posted by frao_0 on Wed, 23 Feb 2022 05:20:54 +0100

Hive tutorial (06) - Hive SerDe serialization and deserialization

01 Introduction: in the previous tutorials, you gained a preliminary understanding of Hive's data model, data types and operation commands. Interested readers can refer to: Hive tutorial (01) - getting to know Hive; Hive tutorial (02) - Hive installation; Hive tutorial (03) - Hive data model; Hive tutorial (04) - Hive data types; Hive tutorial (05) ...

Posted by Jackanape on Tue, 22 Feb 2022 04:24:36 +0100

Hudi of data Lake: Hudi quick experience

Contents: 0. Links to related articles; 1. Compile the Hudi source code (1.1 Maven installation, 1.2 Download and compile Hudi); 2. Install HDFS; 3. Install Spark; 4. Run a Hudi program in the Spark shell. This article mainly introduces the integrated use of Apache Hudi, HDFS, Spark, etc. 0. Links to related articles: Summary of articles on basic know ...
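The "download and compile Hudi" step in the contents above can be sketched roughly as follows. The clone URL is the official Apache mirror; the exact Maven profiles (Spark/Scala version flags) depend on the Hudi release you check out, so treat this as an assumed minimal build rather than the article's exact command line.

```shell
# Maven and a JDK are assumed to be installed already (section 1.1 of the article)
git clone https://github.com/apache/hudi.git
cd hudi

# Build all modules, skipping tests to shorten the compile
mvn clean package -DskipTests
```

A full Hudi build is heavy; skipping tests and pinning a release tag (git checkout) keeps the first compile manageable.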

Posted by mrman23 on Mon, 21 Feb 2022 06:05:15 +0100

Hive installation, deployment and management

Experimental environment: Linux Ubuntu 16.04. Prerequisites: 1) Java runtime environment deployed; 2) Hadoop 3.0.0 single-node deployment completed; 3) MySQL database installed. These preconditions are already prepared for you. Experimental content: under the above preconditions, ...

Posted by Homer30 on Sun, 20 Feb 2022 04:37:39 +0100

Hadoop pseudo distributed cluster installation and deployment

Deployment: download the installation package; upload and unzip it; configure environment variables; modify the configuration files; format HDFS; modify the script files; start and verify; stop the cluster. Note: 1. The JDK environment is assumed to be installed and configured; 2. The CentOS 7 Linux environment is assumed to be installed and configured; 3. Change the curre ...
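The deployment steps above can be sketched as a command sequence. The Hadoop version and install path here are assumptions for illustration (the excerpt does not name them); the format/start/verify/stop commands are the standard Hadoop ones.

```shell
# Unzip the uploaded package and point the environment at it (assumed version/path)
tar -zxvf hadoop-3.2.1.tar.gz -C /opt
export HADOOP_HOME=/opt/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Format HDFS once, then start the daemons
hdfs namenode -format
start-dfs.sh

# Verify: jps should list NameNode, DataNode and SecondaryNameNode
jps

# Stop the cluster when done
stop-dfs.sh
```

Formatting the NameNode erases HDFS metadata, so it is done only on first deployment, never on a running cluster.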

Posted by MasterACE14 on Sat, 19 Feb 2022 21:33:02 +0100

Flume introduction and flume deployment, principle and use

Flume overview: Flume is a highly available, reliable, distributed system provided by Cloudera for massive log collection, aggregation and transmission. Flume is based on a streaming architecture, which is flexible and simple. Flume's main function is to read data from the server ...

Posted by CONFUSIONUK on Sat, 19 Feb 2022 16:46:18 +0100

Principle and application of Hadoop Technology

Hadoop data processing (sophomore training in 2020). 1. Project background: the training content is the statistical analysis of automobile sales data. Through this project, we will deepen our understanding of the HDFS distributed file system and the MapReduce distributed parallel computing framework, master and apply them skillfully, and experience the dev ...

Posted by Garcia on Sat, 19 Feb 2022 11:54:35 +0100

MapReduce processing pictures

Reference link 1; reference link 2. The code comes from link 2 and has been modified by the author; my level is limited, so please point out any mistakes. Environment: Hadoop 3.2.1 on CentOS 7; the code is written under Windows, then packaged and submitted to the Hadoop cluster on CentOS to run. Idea: put the pictures on HDFS, then write the path of each im ...
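The "put the pictures on HDFS, then write the paths" idea above can be sketched as follows. The directory names, jar name and driver class are hypothetical placeholders, not taken from the article's code.

```shell
# Upload the images to an HDFS directory (paths are illustrative)
hdfs dfs -mkdir -p /images
hdfs dfs -put ./pics/*.jpg /images/

# Write the HDFS path of each image into a text file that the job will read as input
hdfs dfs -ls /images | awk 'NR>1 {print $NF}' > image_paths.txt
hdfs dfs -mkdir -p /input
hdfs dfs -put image_paths.txt /input/

# Submit the packaged job to the cluster (hypothetical jar and main class)
hadoop jar picture-job.jar com.example.PictureDriver /input/image_paths.txt /output
```

Feeding a file of paths to MapReduce, rather than the images themselves, lets each map task open its image directly from HDFS, which suits binary inputs that ordinary text input formats cannot split.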

Posted by benyhanna on Fri, 18 Feb 2022 06:16:31 +0100

Using docker to build Hadoop cluster

I. Environment: 1. Ubuntu 20; 2. Hadoop 3.1.4; 3. JDK 1.8_301. II. Specific steps: pull the latest Ubuntu image; use a mounted directory to transfer the JDK, Hadoop and other installation packages, via Xftp or the command-line scp command; enter the Ubuntu container with docker exec -it <container id> /bin/bash; update a ...
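The first few steps above can be sketched with standard docker commands. The mount path and container name are assumptions for illustration; the article's actual directory layout may differ.

```shell
# Pull the latest Ubuntu image
docker pull ubuntu:latest

# Start a container with a host directory mounted, so packages copied to
# ~/pkgs on the host (via scp or Xftp) appear inside the container
docker run -itd --name hadoop-node -v ~/pkgs:/mnt/pkgs ubuntu /bin/bash

# Enter the running container, then update package lists inside it
docker exec -it hadoop-node /bin/bash
apt-get update
```

Mounting a host directory avoids rebuilding the image each time an installation package changes, which is why the article transfers the JDK and Hadoop tarballs through the mount rather than baking them in.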

Posted by The_Assistant on Wed, 16 Feb 2022 14:32:26 +0100