Hadoop - Programmer Think - where programmers share thinking

Hadoop3.x Fully Distributed Setup (Detailed)

Python WeChat Subscription Applet Course Video https://edu.csdn.net/course/detail/36074 Python Actual Quantitative Transaction Finance System https://edu.csdn.net/course/detail/35475 Environmental preparation vm virtual machine (self-installing Centos7 system)hadoop3.x Installation Package (linux version)java1.8 installation package (linux ...

Posted by acidHL on Wed, 09 Mar 2022 18:15:48 +0100

Installing Spark and Python exercises

1, Install Spark Introduction to Spark 2.4.0: installation and use of Spark Blog address: http://dblab.xmu.edu.cn/blog/1307-2/ 1.1 basic environment 1.1.1 before installing Spark: Linux system Java environment (Java8 or JDK1.8 or above) Hadoop environment Hadoop installation tutorial address: http://dblab.xmu.edu.cn/blog/install-hadoop/ Follow ...

Posted by imperialized on Tue, 08 Mar 2022 12:13:09 +0100

Flink tutorial (13) - Flink advanced API (state management)

01 introduction In the previous blog, we have a certain understanding of the use of Flink batch streaming API. Interested students can refer to the following: Flink tutorial (01) - Flink knowledge mapFlink tutorial (02) - getting started with FlinkFlink tutorial (03) - Flink environment constructionFlink tutorial (04) - getting started wi ...

Posted by FortMyersDrew on Mon, 07 Mar 2022 17:54:50 +0100

Flink tutorial (12) - Flink advanced API (Time and Watermaker)

01 introduction In the previous blog, we have a certain understanding of the use of Flink batch streaming API. Interested students can refer to the following: Flink tutorial (01) - Flink knowledge mapFlink tutorial (02) - getting started with FlinkFlink tutorial (03) - Flink environment constructionFlink tutorial (04) - getting started wi ...

Posted by cwls1184 on Mon, 07 Mar 2022 16:36:49 +0100

Big data Hadoop 3 1.3 detailed introduction notes of HDFS

On the right side of the page, there is a directory index, which can jump to the content you want to see according to the titleIf not on the right, look for the left Main article link https://blog.csdn.net/grd_java/article/details/115639179Chapter I: environmental construction https://blog.csdn.net/grd_java/article/details/115693312 If you hav ...

Posted by algarve4me on Sat, 05 Mar 2022 05:22:59 +0100

SST import without dependency on Nebula Exchange

This article tries to share the steps of SST writing in Nebula Exchange in a minimum way (stand-alone, containerized Spark, Hadoop and Nebula Graph). This article applies to v2 5 or above versions of Nebula Exchange. Original link: Foreign visits: https://siwei.io/nebula-exchange-sst-2.x/ Domestic visits: https://cn.siwei.io/nebula-exc ...

Posted by R0bb0b on Fri, 04 Mar 2022 20:21:59 +0100

Flume13: flume optimization

1, Flume optimization 1. Adjust the memory size of Flume process, It is recommended to set 1G~2G. Too small will lead to frequent GC Because Flume process is also based on Java, it involves the memory setting of the process. Generally, it is recommended to set the memory of a single Flume process (or a single Agent) to 1G~2G. If the memory is ...

Posted by mcfmullen on Fri, 04 Mar 2022 03:22:49 +0100

Hadoop 07: introduction to SecondaryNameNode and DataNode

1, Introduction to SecondaryNameNode When analyzing the edits log files just now, we have introduced the SecondaryNameNode. Here is a summary to show our attention. The secondary namenode is mainly responsible for regularly merging the contents of the edits file into the fsimage This merge operation is called checkpoint. When merging, the co ...

Posted by abselect on Wed, 02 Mar 2022 01:47:02 +0100

Hadoop of big data -- Window API + time semantics + Watermark in Flink

1, window concept Window is the core of dealing with infinite flow. The window divides the flow into "buckets" of limited size, and we can apply the calculation on the buckets. This document focuses on how to perform window operations in Flink and how programmers can get the most benefit from the functions it provides. T ...

Posted by BLeez on Sun, 27 Feb 2022 11:36:41 +0100

1, Spark overview and quick start

1, Spark overview 1.1 what is Spark Spark is a fast, universal and scalable big data analysis and calculation engine based on memory; 1.2 Spark && Hadoop SParkHadoop1.Scala development, fast, universal and extended big data analysis engine1. Java development, an open source framework for storing massive data on distributed server cl ...

Posted by brokeDUstudent on Sun, 27 Feb 2022 03:14:52 +0100

Hot Topics