Big data tutorial (8.4) mobile traffic analysis case
The implementation and principles of WordCount word statistics with MapReduce were shared earlier. This post continues with a classic mobile traffic analysis case to help you understand and use the Hadoop platform in practical work.
I. Requirements
The following is a mobile traffic log. We need to analyze the upstream traffic, downstream ...
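Since the log layout is cut off above, the field positions in the sketch below are assumptions; still, the usual approach is a custom Writable bean keyed by phone number, roughly like this:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;

// Value type carrying the upstream and downstream traffic of one record
public class FlowBean implements Writable {
    private long upFlow;
    private long downFlow;

    public FlowBean() {}  // no-arg constructor required by Hadoop reflection

    public void set(long upFlow, long downFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
    }

    public void write(DataOutput out) throws IOException {
        out.writeLong(upFlow);
        out.writeLong(downFlow);
    }

    public void readFields(DataInput in) throws IOException {
        upFlow = in.readLong();
        downFlow = in.readLong();
    }

    public String toString() {
        return upFlow + "\t" + downFlow + "\t" + (upFlow + downFlow);
    }
}

// Mapper: the column indexes are hypothetical because the log format above is truncated
class FlowMapper extends Mapper<LongWritable, Text, Text, FlowBean> {
    private final Text phone = new Text();
    private final FlowBean bean = new FlowBean();

    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        phone.set(fields[1]);                                // assumed phone-number column
        bean.set(Long.parseLong(fields[fields.length - 3]),  // assumed upstream bytes
                 Long.parseLong(fields[fields.length - 2])); // assumed downstream bytes
        context.write(phone, bean);
    }
}

A reducer then sums the two fields per phone number, just as the WordCount reducer sums counts.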
Posted by chantown on Fri, 06 Dec 2019 23:52:36 +0100
Typical scenarios of HBase
1. HBase integration with MapReduce
In offline task scenarios, MapReduce accesses HBase data to speed up analysis and extend analysis capabilities. Read data from HBase (the result table):
public class ReadHBaseDataMR {
private static final String ZK_KEY = "hbase.zookeeper.quorum";
private static final String ZK_VALUE = "hadoop01:2181,h ...
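The class is truncated after the ZooKeeper constants, so the remainder here is a sketch under assumed names ("result" is the table mentioned above; the quorum string and output details are illustrative):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ReadHBaseDataMRSketch {

    // Emits each row key with its cell count, purely to show the plumbing
    static class ReadMapper extends TableMapper<Text, Text> {
        protected void map(ImmutableBytesWritable rowKey, Result result, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(Bytes.toString(rowKey.get())),
                          new Text(String.valueOf(result.size())));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "hadoop01:2181");  // full quorum string is truncated above
        Job job = Job.getInstance(conf, "read-from-hbase");
        job.setJarByClass(ReadHBaseDataMRSketch.class);
        // Wire the mapper to a full-table Scan of the "result" table
        TableMapReduceUtil.initTableMapperJob("result", new Scan(),
                ReadMapper.class, Text.class, Text.class, job);
        job.setNumReduceTasks(0);  // map-only: just dump what was read
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}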
Posted by diagnostix on Wed, 04 Dec 2019 11:55:27 +0100
Getting started with Curator
Curator is an open-source ZooKeeper client from Netflix. Compared with the native client that ZooKeeper provides, Curator offers a higher level of abstraction and simplifies ZooKeeper client programming.
Maven dependency
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</arti ...
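The dependency listing is cut off; note that Curator itself is published under the org.apache.curator groupId (curator-framework, curator-recipes) alongside the ZooKeeper dependency shown above. A minimal connect-and-create sketch, assuming a ZooKeeper server at hadoop01:2181:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorQuickStart {
    public static void main(String[] args) throws Exception {
        // Retry policy: start at 1s between retries, back off exponentially, at most 3 retries
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "hadoop01:2181",  // assumed ZooKeeper address
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Create a znode (and any missing parents) with a small payload
        client.create()
              .creatingParentsIfNeeded()
              .forPath("/curator/demo", "hello".getBytes());

        // Read it back
        byte[] data = client.getData().forPath("/curator/demo");
        System.out.println(new String(data));

        client.close();
    }
}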
Posted by bdmovies on Wed, 04 Dec 2019 10:45:02 +0100
Advanced cases of Spark SQL
(1) A hardcore classic: computing WordCount with a UDTF
Data format: each line is a string of space-separated words. Code implementation:
object SparkSqlTest {
def main(args: Array[String]): Unit = {
//Suppress redundant logs
Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)
Logger.getLogger("org.apache.spark").setLevel(Leve ...
Posted by marklarah on Tue, 03 Dec 2019 04:36:38 +0100
Specific programming scenarios of Spark SQL
Introductory case:
object SparkSqlTest {
def main(args: Array[String]): Unit = {
//Suppress redundant logs
Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
Logger.getLogger("org.project-spark").setLevel(Level.WARN)
//Building programmin ...
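The introductory case is cut off at the session-building step. A minimal sketch of the same shape, here in the Java API with invented sample data: build the session, suppress logs, register a temp view, and run SQL programmatically.

import java.util.Arrays;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class SparkSqlIntro {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SparkSqlIntro")
                .master("local[*]")  // assumption: local run for the demo
                .getOrCreate();
        spark.sparkContext().setLogLevel("WARN");  // same effect as the Logger calls above

        // A tiny in-memory dataset standing in for the case's real input
        Dataset<String> names = spark.createDataset(
                Arrays.asList("spark", "hadoop", "hive"), Encoders.STRING());
        names.createOrReplaceTempView("tools");

        // Programmatic SQL over the registered view
        spark.sql("SELECT value, length(value) AS len FROM tools ORDER BY len").show();

        spark.stop();
    }
}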
Posted by tigomark on Sun, 01 Dec 2019 01:17:03 +0100
JVM source code practice - OOP Klass model
> Github original link
1 OOP-Klass (ordinary object pointer) model
The OOP-Klass model is used to describe the properties and behavior of Java classes and their instances.
It is split into OOP and Klass because we don't want every object to carry a C++ vtbl pointer. Therefore, ordinary oops have no virtual functions; instead, they forward all "virtual" functions t ...
Posted by stopblackholes on Thu, 28 Nov 2019 20:24:27 +0100
High availability configuration of Hadoop distributed environment
The previous article introduced the Hadoop distributed configuration; to make it highly available, this time we use ZooKeeper to configure Hadoop for high availability.
1. Environment preparation
1) Modify the IP address 2) Modify the hostname and the hostname-to-IP mapping 3) Turn off the firewall 4) Set up passwordless SSH login 5) Create hado ...
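Once the environment is prepared, the ZooKeeper-backed failover is driven by a handful of HDFS properties. They normally live in hdfs-site.xml and core-site.xml; the sketch below merely lists the key names programmatically, with the nameservice and host names invented for illustration:

import org.apache.hadoop.conf.Configuration;

// These keys normally live in hdfs-site.xml / core-site.xml; they are set
// programmatically here only to enumerate them. Nameservice and hosts are invented.
public class HdfsHaConfSketch {
    public static Configuration haConf() {
        Configuration conf = new Configuration();
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "hadoop101:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "hadoop102:8020");
        // JournalNodes share the edit log between the two NameNodes
        conf.set("dfs.namenode.shared.edits.dir",
                "qjournal://hadoop101:8485;hadoop102:8485;hadoop103:8485/mycluster");
        // Client-side failover between the active and standby NameNode
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        // Automatic failover driven by the ZooKeeper quorum
        conf.set("dfs.ha.automatic-failover.enabled", "true");
        conf.set("ha.zookeeper.quorum", "hadoop101:2181,hadoop102:2181,hadoop103:2181");
        return conf;
    }
}

With automatic failover enabled, the ZKFC processes use the ZooKeeper quorum to elect the active NameNode.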
Posted by eflopez on Tue, 26 Nov 2019 18:53:08 +0100
Spark 2.4.2 source compilation
Software versions:
jdk: 1.8
maven: 3.6.1 http://maven.apache.org/download.cgi
spark: 2.4.2 https://archive.apache.org/dist/spark/spark-2.4.2/
Hadoop version: hadoop-2.6.0-cdh5.7.0 (the Hadoop version the Spark build is compiled against; it does not need to be installed)
Configure Maven:
#Configure environment variables
[root@hadoop004 ...
Posted by buddymoore on Wed, 20 Nov 2019 18:01:18 +0100
Deploying Tomcat on a Docker k8s cluster, using one image to increase image reusability.
Preface: the k8s cluster has already been set up; refer to the previous article for the specific steps.
Write a Dockerfile to create a shared image, so that each Tomcat deployment uses the image directly instead of rebuilding it every time.
#At first, I wanted to use Tomcat's official image, but I had no choice but to use diban ...
Posted by thankqwerty on Sat, 02 Nov 2019 15:13:20 +0100
Building a Hadoop, Spark, Hive, and Azkaban cluster on Ubuntu
Tuesday, 08. October 2019 11:01 am
Initial preparation:
1. JDK installation
Do the following on all three machines (adjust to however many machines you have):
1) You can install the JDK through apt-get and run whereis java on the command line to get the Java installation path, or download the JDK installation package manually f ...
Posted by mattal999 on Sat, 02 Nov 2019 11:47:52 +0100