Scala learning notes - type parameters

Multiple bounds: A type variable can have both an upper and a lower bound, written as T >: Lower <: Upper. A type variable cannot have multiple upper bounds or multiple lower bounds; however, you can still require a type to implement multiple traits, like this: T <: Comparable[T] with Serializable with Cloneable. Ther ...

Posted by mbhcool on Tue, 08 Feb 2022 06:46:44 +0100

Hadoop installation complete

Hadoop installation on stand-alone Linux. Download Hadoop: Hadoop 3.x download address: http://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/ Upload the archive to Linux via FTP. Extract the archive: tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/ Configure the Hadoop environment variables. Create a custom profile: vim /etc/prof ...

Posted by cesarcesar on Mon, 07 Feb 2022 09:17:24 +0100

High-performance message-oriented middleware: Kafka

Kafka was originally developed by LinkedIn. It is a distributed, partitioned, multi-replica messaging system coordinated through ZooKeeper. Its biggest feature is that it can process large amounts of data in real time to serve a variety of scenarios, such as Hadoop-based batch processing systems, low-latency real-time sy ...

Posted by s.eardley on Mon, 07 Feb 2022 08:46:25 +0100

Basic query operations in Elasticsearch

Basic operations in ES. 1. Create the es_db index and set its default analyzer to ik_max_word: PUT /es_db { "settings": { "index": { "analysis.analyzer.default.type": "ik_max_word" } } } 2. Basic index operations: GET /es_db, DELETE /es_db 3. Add a document: PUT /es_db/_doc/1 { "name": "Zh ...

Posted by sciencebear on Sun, 06 Feb 2022 22:29:28 +0100

Spark chasing Wife Series (Pair RDD Episode 2)

After a busy day, I didn't get anything done. Small talk: I didn't do anything today. Before I knew it, it was already the fifth day of the Lunar New Year. I'll start taking subject 4 in 5678 days. I hope to get my driver's license soon. combineByKey: First, the meaning of each parameter. createCombiner: a function that creates a combination withi ...

Posted by mj_23 on Sat, 05 Feb 2022 13:37:41 +0100
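The combineByKey excerpt above is cut off after its first parameter, so here is a rough sketch of how the three functions usually fit together. It is a plain-Python imitation and does not call Spark. The names mergeValue and mergeCombiners come from Spark's combineByKey signature, while the helper combine_by_key, the list-of-lists "partitions" and the sample data are assumptions made for illustration.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Hypothetical helper imitating combineByKey on a list of partitions."""
    per_partition = []
    for part in partitions:
        combiners = {}
        for key, value in part:
            if key not in combiners:
                # First time this key is seen in the partition: createCombiner.
                combiners[key] = create_combiner(value)
            else:
                # Key already seen in this partition: mergeValue.
                combiners[key] = merge_value(combiners[key], value)
        per_partition.append(combiners)

    result = {}
    for combiners in per_partition:
        for key, comb in combiners.items():
            if key in result:
                # Same key coming from different partitions: mergeCombiners.
                result[key] = merge_combiners(result[key], comb)
            else:
                result[key] = comb
    return result


# Made-up sample data: two partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 4), ("a", 3)], [("a", 5), ("b", 2)]]

# Build a (sum, count) pair per key, then derive the average.
sums_counts = combine_by_key(
    partitions,
    lambda v: (v, 1),
    lambda c, v: (c[0] + v, c[1] + 1),
    lambda c1, c2: (c1[0] + c2[0], c1[1] + c2[1]),
)
averages = {k: s / n for k, (s, n) in sums_counts.items()}
```

In this sketch the per-key average is derived from a (sum, count) combiner, which is a common reason to reach for combineByKey instead of a plain sum.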

Exceptions and solutions when running a MapReduce task on Hadoop

Exception code description: I have only just started working with Hadoop and do not yet understand MapReduce very well. The following records a problem that kept me stuck for a day, and its solution. 1. Execute the MapReduce task: hadoop jar wc.jar hejie.zheng.mapreduce.wordcount2.WordCountDriver /input /output 2. An exception is thrown ...

Posted by burzvingion on Thu, 03 Feb 2022 14:16:53 +0100

Classification case: sample imbalance in XGB

Parameter setting: Sample imbalance is a common problem in XGB classification. The parameter scale_pos_weight adjusts for it; usually we set it to the ratio of the negative sample count to the positive sample count. Classification case: create an unbalanced dataset: import numpy as np import xgboost as xgb im ...

Posted by jakebrewer1 on Thu, 03 Feb 2022 10:42:13 +0100
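The scale_pos_weight rule of thumb from the XGB excerpt above (negative count divided by positive count) can be sketched in a few lines. The helper name scale_pos_weight_for and the 950/50 label split are assumptions made for illustration, not taken from the original post, and the block deliberately avoids importing xgboost so it runs on its own.

```python
def scale_pos_weight_for(labels):
    """Hypothetical helper: ratio of negative (0) to positive (1) labels."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    if positives == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return negatives / positives


# Made-up unbalanced label vector: 950 negatives, 50 positives.
labels = [0] * 950 + [1] * 50
ratio = scale_pos_weight_for(labels)

# Parameter dictionary that could be handed to XGBoost for binary classification.
params = {"objective": "binary:logistic", "scale_pos_weight": ratio}
```

With xgboost installed, a dictionary like this could presumably be passed on as xgb.XGBClassifier(**params). The ratio is a starting point rather than a tuned value.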

[Flink] Flink computing resource management

1. Overview. Reprinted from: Flink source code reading notes (6) - Computing Resource Management. In Flink, computing resources are allocated with the slot as the basic unit. This article analyzes how Flink manages computing resources. 2. Basic concept of the task slot. In the previous article, we learned about the startup process of Fl ...

Posted by sssphp on Thu, 03 Feb 2022 07:53:26 +0100

Spark chasing Wife Series (RDD mapping)

I finally settled down and started writing again. Small talk: This article covers some operators in Spark RDD, all of which are about mapping. Specifically, there are three operators: map, mapPartitions and mapPartitionsWithIndex. There will be as few as possible to read the set of foreign operators in the set of six days after each ti ...

Posted by stokie-rich on Thu, 03 Feb 2022 07:00:04 +0100
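To make the difference between the three mapping operators named in the excerpt above concrete, here is a small plain-Python imitation that does not call Spark. An RDD is modelled as a list of partitions, and the rdd_* helper names and sample data are assumptions made for illustration. The point is the granularity: map works on one element at a time, mapPartitions on an iterator over a whole partition, and mapPartitionsWithIndex on the partition index plus that iterator.

```python
def rdd_map(partitions, f):
    # map: f is applied to each element individually.
    return [[f(x) for x in part] for part in partitions]


def rdd_map_partitions(partitions, f):
    # mapPartitions: f receives an iterator over one whole partition
    # and returns an iterable of results for that partition.
    return [list(f(iter(part))) for part in partitions]


def rdd_map_partitions_with_index(partitions, f):
    # mapPartitionsWithIndex: like mapPartitions, but f also gets
    # the index of the partition it is processing.
    return [list(f(i, iter(part))) for i, part in enumerate(partitions)]


# Made-up data: two partitions.
partitions = [[1, 2, 3], [4, 5]]

doubled = rdd_map(partitions, lambda x: x * 2)
partition_sums = rdd_map_partitions(partitions, lambda it: [sum(it)])
tagged = rdd_map_partitions_with_index(
    partitions, lambda i, it: ((i, x) for x in it)
)
```

Working per partition is typically useful when some setup cost, such as opening a connection, should be paid once per partition instead of once per element.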

[Flink] Reading notes of Flink source code (19) - Implementation of stream table Join in Flink SQL

1. Overview. Reprinted from: Reading notes of Flink source code (19) - Implementation of stream table Join in Flink SQL. When analyzing data with SQL, join queries are used frequently. In traditional OLTP and OLAP settings, the data sets being joined are bounded, so a join can rely on caching the bounded data set. However, in Streamin ...

Posted by rune_sm on Thu, 03 Feb 2022 00:38:27 +0100