Scala learning notes - type parameters
Multiple bounds
Type variables can have both upper and lower bounds. Written as:
T >: Lower <: Upper
A type variable cannot have multiple upper bounds or multiple lower bounds; however, you can still require a type to implement multiple traits, like this:
T <: Comparable[T] with Serializable with Cloneable
Ther ...
Posted by mbhcool on Tue, 08 Feb 2022 06:46:44 +0100
Hadoop installation complete
Hadoop installation on Linux (stand-alone)
Download Hadoop
Hadoop3.xx download address: http://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/
Upload to Linux via FTP
Extract the archive
tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/
Configure Hadoop environment variables
Create custom profile
vim /etc/prof ...
Posted by cesarcesar on Mon, 07 Feb 2022 09:17:24 +0100
High performance message oriented middleware Kafka
Kafka was originally developed by LinkedIn. It is a distributed, partitioned, multi-replica messaging system coordinated by ZooKeeper. Its biggest feature is that it can process large amounts of data in real time to meet various demand scenarios, such as Hadoop-based batch processing systems, low latency real-time sy ...
Posted by s.eardley on Mon, 07 Feb 2022 08:46:25 +0100
Basic query operation of elasticsearch
Basic operations in ES
1. Create the es_db index and set its default analyzer to ik_max_word
PUT /es_db
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}
2. Basic index operations
GET /es_db
DELETE /es_db
3. Add document
PUT /es_db/_doc/1
{
"name": "Zh ...
Posted by sciencebear on Sun, 06 Feb 2022 22:29:28 +0100
Spark chasing Wife Series (Pair RDD Episode 2)
After a busy day, I didn't do anything
Small talk:
I didn't do anything today. Unconsciously, it's the fifth day of the lunar new year. I'll start taking subject 4 in 5678 days. I hope to get my driver's license early
combineByKey
First explain the meaning of each parameter
createCombiner: a function that creates a combination withi ...
Posted by mj_23 on Sat, 05 Feb 2022 13:37:41 +0100
Exceptions and solutions when Hadoop runs MapReduce task
Exception code description
I have only just started working with Hadoop and do not yet understand MapReduce very well. The following records a problem that troubled me for a day, along with its solution.
1. Execute MapReduce task
hadoop jar wc.jar hejie.zheng.mapreduce.wordcount2.WordCountDriver /input /output
2. An exception is thrown ...
Posted by burzvingion on Thu, 03 Feb 2022 14:16:53 +0100
Classification case: sample imbalance in XGB
Parameter setting
Sample imbalance is a common problem in XGB classification.
XGBoost has a parameter for adjusting for sample imbalance, scale_pos_weight.
Usually, we set this parameter to the ratio of the number of negative samples to the number of positive samples.
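As a rough illustration of that ratio, here is a minimal sketch in plain Python. The helper name `scale_pos_weight_ratio` and the toy label list are hypothetical and not from the original post; the sketch assumes binary labels where 0 is the negative class and 1 is the positive class.

```python
from collections import Counter


def scale_pos_weight_ratio(labels):
    """Return (number of negative samples) / (number of positive samples).

    Hypothetical helper: assumes labels are 0 (negative) and 1 (positive).
    """
    counts = Counter(labels)
    negatives, positives = counts[0], counts[1]
    if positives == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return negatives / positives


# Toy labels invented for illustration: 8 negatives and 2 positives.
y = [0] * 8 + [1] * 2
ratio = scale_pos_weight_ratio(y)
print(ratio)
```

The resulting value is what would typically be supplied as `scale_pos_weight` when building the classifier, for example `xgb.XGBClassifier(scale_pos_weight=ratio)`.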
Classification case
Create unbalanced dataset
import numpy as np
import xgboost as xgb
im ...
Posted by jakebrewer1 on Thu, 03 Feb 2022 10:42:13 +0100
[Flink] Flink computing resource management
1. General
Reprint: Flink source code reading notes (6) - Computing Resource Management
In Flink, computing resources are allocated with the slot as the basic unit. This article analyzes the management mechanism of computing resources in Flink.
2. Basic concept of task slot
In the previous article, we learned about the startup process of Fl ...
Posted by sssphp on Thu, 03 Feb 2022 07:53:26 +0100
Spark chasing Wife Series (RDD mapping)
I finally settled down and started writing posts again
Small talk:
This article will talk about some operators in Spark RDD, which are all about mapping.
Specifically, there are three operators: map, mapPartitions, and mapPartitionsWithIndex. There will be as few as possible to read the set of foreign operators in the set of six days after each ti ...
Posted by stokie-rich on Thu, 03 Feb 2022 07:00:04 +0100
[Flink] reading notes of Flink source code (19) - Implementation of flow table Join in Flink SQL
1. General
Reprint: Reading notes of Flink source code (19) - Implementation of flow table Join in Flink SQL
When analyzing data with SQL, join queries are often used. In traditional OLTP and OLAP settings, the data sets being joined are bounded, so a query can rely on caching the bounded data set. However, in Streamin ...
Posted by rune_sm on Thu, 03 Feb 2022 00:38:27 +0100