Big data tutorial (8.4): mobile traffic analysis case

A previous post covered the implementation and principles of WordCount word counting with MapReduce. This post continues with a classic mobile-traffic analysis case to help you understand and use the Hadoop platform in practical work. I. Requirements: The following is a mobile traffic log. We need to analyze the upstream traffic, downstream ...
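As a hedged sketch of the per-line work the mapper in such a job would do (the log format is truncated above, so the field layout and sample line here are assumptions, not the post's actual data), the parsing can be shown without any Hadoop dependencies:

```java
// Minimal sketch of traffic-log parsing, as a mapper would perform per line.
// Assumption: tab-separated fields where the 3rd-from-last and 2nd-from-last
// columns are upstream and downstream bytes; adjust indices to the real log.
public class TrafficLineParser {
    /** Returns {upstream, downstream, total} parsed from one log line. */
    public static long[] parse(String line) {
        String[] f = line.split("\t");
        long up = Long.parseLong(f[f.length - 3]);
        long down = Long.parseLong(f[f.length - 2]);
        return new long[] { up, down, up + down };
    }

    public static void main(String[] args) {
        // Hypothetical sample line: timestamp, phone, ..., up, down, status
        String sample = "1363157985066\t13726230503\t...\t2481\t24681\t200";
        long[] t = parse(sample);
        System.out.println("up=" + t[0] + " down=" + t[1] + " total=" + t[2]);
    }
}
```

In the real job, the mapper would emit the phone number as the key and a writable bean holding these three longs as the value, and the reducer would sum per key.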

Posted by chantown on Fri, 06 Dec 2019 23:52:36 +0100

Typical HBase scenarios

1. HBase integration with MapReduce    In offline task scenarios, MapReduce accesses HBase data to speed up analysis and extend analysis capabilities. Reading data from HBase (result): public class ReadHBaseDataMR { private static final String ZK_KEY = "hbase.zookeeper.quorum"; private static final String ZK_VALUE = "hadoop01:2181,h ...

Posted by diagnostix on Wed, 04 Dec 2019 11:55:27 +0100

Getting started with Curator

Curator is an open-source ZooKeeper client from Netflix. Compared with the native client that ZooKeeper provides, Curator offers a higher level of abstraction and simplifies ZooKeeper client programming. Maven dependency:   <dependency>     <groupId>org.apache.zookeeper</groupId>     <artifactId>zookeeper</arti ...
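Since the dependency snippet above is cut off, note that Curator itself is published under the `org.apache.curator` group (the `org.apache.zookeeper` coordinate shown is the underlying ZooKeeper client). A typical dependency block looks like this; the version here is only an example, pick one compatible with your ZooKeeper:

```xml
<!-- Curator high-level client; curator-recipes pulls in curator-framework -->
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>4.2.0</version> <!-- example version; match to your ZooKeeper -->
</dependency>
```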

Posted by bdmovies on Wed, 04 Dec 2019 10:45:02 +0100

Advanced Spark SQL cases

(1) A classic case: word count with a UDTF. Data format: each line is a string, with words separated by spaces. Code implementation: object SparkSqlTest { def main(args: Array[String]): Unit = { // Suppress redundant logs Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN) Logger.getLogger("org.apache.spark").setLevel(Leve ...
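The split-and-count logic at the heart of that case can be sketched independently of Spark (plain Java, class and method names are mine; in the Spark SQL version, this is what the UDTF's per-row explode plus a group-by achieve):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Core word-count logic: split each line on whitespace and tally occurrences.
public class WordCount {
    public static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hello=2, spark=1, sql=1}
        System.out.println(count(Arrays.asList("hello spark", "hello sql")));
    }
}
```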

Posted by marklarah on Tue, 03 Dec 2019 04:36:38 +0100

Spark SQL programming scenarios

Introductory case: object SparkSqlTest { def main(args: Array[String]): Unit = { // Suppress redundant logs Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN) Logger.getLogger("org.apache.spark").setLevel(Level.WARN) Logger.getLogger("org.project-spark").setLevel(Level.WARN) //Building programmin ...

Posted by tigomark on Sun, 01 Dec 2019 01:17:03 +0100

JVM source code in practice: the OOP-Klass model

> Github original link 1. The OOP-Klass (ordinary object pointer) model The OOP-Klass model describes the properties and behaviors of a class. It is split into OOP and Klass because we don't want each object to carry a C++ vtbl pointer; therefore ordinary oops have no virtual functions. Instead, they forward all "virtual" functions t ...
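The design point, that type metadata (the Klass in HotSpot) exists once per class while each object only holds a pointer to it, is observable from plain Java: every instance of a class shares a single `java.lang.Class` object (the Klass's Java-side mirror), rather than each object carrying its own vtable:

```java
// Instance data is per-object; type metadata is shared: two instances of the
// same class resolve to the identical Class object (the Klass's mirror).
public class KlassDemo {
    static class Point { int x, y; }

    public static void main(String[] args) {
        Point a = new Point();
        Point b = new Point();
        System.out.println(a.getClass() == b.getClass()); // prints true
    }
}
```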

Posted by stopblackholes on Thu, 28 Nov 2019 20:24:27 +0100

High-availability configuration for a distributed Hadoop environment

The previous article introduced distributed Hadoop configuration, but it was not designed for high availability; this time we use ZooKeeper to make Hadoop highly available. 1. Environment preparation: 1) modify the IP; 2) map host names to IP addresses; 3) turn off the firewall; 4) set up passwordless SSH login; 5) create hado ...
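After the environment preparation above, the core of ZooKeeper-based HDFS HA is the nameservice configuration. A minimal sketch, where the nameservice name and host names are examples, not values from the post:

```xml
<!-- hdfs-site.xml (sketch): "mycluster", nn1/nn2 and hosts are examples -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: the ZooKeeper quorum used by the failover controllers -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
</property>
```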

Posted by eflopez on Tue, 26 Nov 2019 18:53:08 +0100

Compiling Spark 2.4.2 from source

Software versions:     jdk: 1.8     maven: 3.6.1    http://maven.apache.org/download.cgi     spark: 2.4.2      https://archive.apache.org/dist/spark/spark-2.4.2/ Hadoop version: hadoop-2.6.0-cdh5.7.0 (the Hadoop version the Spark build targets; it does not need to be installed). Configuring Maven: #Configure environment variables [root@hadoop004  ...
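For reference, once Maven is configured, the Spark source tree's own packaging script is typically invoked as below. The profile names follow the Spark 2.4 build documentation; treat this as a sketch for the CDH Hadoop version above (building against a CDH version also requires the Cloudera Maven repository to be resolvable):

```shell
# From the spark-2.4.2 source root: build a deployable tarball against
# hadoop-2.6.0-cdh5.7.0.
./dev/make-distribution.sh --name cdh5.7.0 --tgz \
  -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn \
  -Dhadoop.version=2.6.0-cdh5.7.0
```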

Posted by buddymoore on Wed, 20 Nov 2019 18:01:18 +0100

Deploying Tomcat on a Docker/k8s cluster, using one image to increase image reusability

Up front: the k8s cluster is already set up; see the previous article for the specific steps. Write a Dockerfile to create a shared image, so that each Tomcat deployment uses the image directly instead of building a new one every time. #At first, I wanted to use tomcat's official image, but I had no choice but to use diban ...
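A minimal sketch of such a reusable image (the base image tag, paths and port here are assumptions, since the post's Dockerfile is truncated):

```dockerfile
# Reusable Tomcat base image: each deployment copies or mounts its WAR
# instead of rebuilding the image every time.
FROM tomcat:8.5-jre8
# Drop the default webapps so only the application we add is served
RUN rm -rf /usr/local/tomcat/webapps/*
EXPOSE 8080
CMD ["catalina.sh", "run"]
```

Per-application images can then start `FROM` this one and only `COPY app.war /usr/local/tomcat/webapps/`, which keeps the shared layers cached.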

Posted by thankqwerty on Sat, 02 Nov 2019 15:13:20 +0100

Building a Hadoop, Spark, Hive and Azkaban cluster on Ubuntu

Tuesday, 08 October 2019, 11:01 am Initial preparation: 1. JDK installation. Do the following on all three machines (adjust to the number of machines you have): 1) you can install the JDK through apt-get; run whereis java on the command line to get the Java installation path, or download the JDK installation package manually f ...
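Step 1) above, as shell commands (the package name is for Ubuntu's repositories; adjust to the JDK you actually need on each machine):

```shell
# Install a JDK from apt and locate the installation (run on every node)
sudo apt-get update
sudo apt-get install -y openjdk-8-jdk
whereis java                   # lists the java binary locations
readlink -f "$(which java)"    # resolves symlinks to the real JDK path
```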

Posted by mattal999 on Sat, 02 Nov 2019 11:47:52 +0100