Installation and use of Hive
1. Install Hive
1.1 Installing Java
Java must be installed on the system before Hive can be installed. Use the following command to verify whether Java has been installed. If Java has been installed on the system, you can see the following response. Otherwise, download it from the official Java website and install it with the default options.
1.2 Hadoop installation
downl ...
Posted by simenss on Thu, 27 Jan 2022 12:27:34 +0100
Syntax comparison between Presto and Hive
Time conversion problems come up often at work:
1) A log_date value such as 20200110 needs to be converted to a standard date or compared with timestamp data
2) The working environment involves both Presto and Hive. Queries run faster in Presto, so dates generally need to be converted with the syntax of both Presto and Hive a ...
Posted by Lord Brar on Tue, 25 Jan 2022 23:31:57 +0100
Hive data types, database operations, table operations, data import and export
Hive data types
1. Basic data types
2. Collection data types
Worked example
(1) Suppose a table contains the following row. We use JSON to represent its data structure; the form in which it is accessed in Hive is
{
"name": "songsong",
"friends": ["bingbing" , "lili"] , //List Array,
"children": { //Key value Map,
"xiao song": 18 ,
...
Posted by luanne on Sun, 23 Jan 2022 00:23:28 +0100
HiveSql interview question 29 -- find the maximum number of people online and the peak time period [accumulator idea, timing analysis]
Contents
0 Requirements analysis
1 Data preparation
2 Data analysis
3 Summary
0 Requirements analysis
In the data, id is the streamer (anchor) ID, stt is the start time of a broadcast, and edt is its end time.
id    stt                  edt
1001  2021-06-14 12:12:12  2021-06-14 18:12:12
1003  2021-06-14 13:12:12  2021-06-14 16:12:12
1004  2021-06-14 13:15:12  2021-06-14 20:12:12
1002  2021-06-14 15:12:12  ...
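The accumulator idea named in the title can be sketched outside Hive. The Python below is an illustration only, not the post's HiveSQL: it uses just the three complete rows visible in the sample data, turns each start into +1 and each end into -1, and keeps a running sum over the time-ordered events. The function name `peak_online` and the rule for ordering events that share a timestamp are assumptions made for this sketch.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# The three complete rows visible in the sample data: (id, stt, edt).
sessions = [
    (1001, "2021-06-14 12:12:12", "2021-06-14 18:12:12"),
    (1003, "2021-06-14 13:12:12", "2021-06-14 16:12:12"),
    (1004, "2021-06-14 13:15:12", "2021-06-14 20:12:12"),
]


def peak_online(rows):
    """Return (peak_count, periods), where periods lists the (start, end) spans at the peak."""
    events = []
    for _id, stt, edt in rows:
        events.append((datetime.strptime(stt, FMT), 1))   # going online
        events.append((datetime.strptime(edt, FMT), -1))  # going offline
    # Order by time. Assumption: at an identical timestamp, -1 is applied before +1.
    events.sort(key=lambda e: (e[0], e[1]))

    running = 0
    timeline = []  # (timestamp, number online from this timestamp onward)
    for ts, delta in events:
        running += delta
        timeline.append((ts, running))

    peak = max(count for _, count in timeline)
    periods = [
        (ts, timeline[i + 1][0])
        for i, (ts, count) in enumerate(timeline)
        if count == peak and i + 1 < len(timeline)
    ]
    return peak, periods


peak, periods = peak_online(sessions)
print(peak)
for start, end in periods:
    print(start.strftime(FMT), "->", end.strftime(FMT))
```

In SQL the same running sum is usually written as a windowed `SUM` over the unioned start and end events, ordered by time.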
Posted by Mzor on Sat, 22 Jan 2022 04:45:39 +0100
How Spark SQL reads and writes Hive
Spark SQL needs Hive-related configuration to read and write Hive, so the hive-site.xml file is generally placed in Spark's conf directory. The calling code is simple; the key points are the source code analysis and how Spark interacts with Hive.
1. Calling code
Code for reading Hive
SparkSession sparkSession = SparkSession.builder() ...
Posted by Brian W on Tue, 18 Jan 2022 02:41:37 +0100
Introduction to big data -- Hive data query
Syntax
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[ORDER BY col_list]
[CLUSTER BY col_list
| [DISTRIBUTE BY col_list] [SORT BY col_list]
]
[LIMIT number]
WHERE
Similar to SQL. As an extension, RLIKE supports regular expressions.
Sorting
order by
Global sorting, only one ...
Posted by kingconnections on Sun, 16 Jan 2022 10:29:28 +0100
Hive learning notes - Chapter 7 functions
1. System built-in functions
(1) View the functions provided by the system
show functions;
(2) Display the usage of a built-in function
desc function upper;
(3) Display the usage of a built-in function in detail
desc function extended upper;
2. Common built-in functions
2.1 Assigning values to NULL fields
(1) Function description
NVL: ...
Posted by holstead on Sun, 16 Jan 2022 06:56:57 +0100
Hive file storage format
Hive supports the following formats for storing data: TEXTFILE (row storage), SEQUENCEFILE (row storage), ORC (column storage), and PARQUET (column storage).
1: Column storage and row storage
The left side of the figure above is a logical table, the first one on the right is row storage, and the second one is column storage.
The storage ...
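As a rough illustration of the difference between the two layouts, the Python sketch below flattens a small logical table both ways. The table contents and the function names `row_layout` and `column_layout` are made up for this example. Real ORC and PARQUET files also split data into stripes or row groups and add encoding and indexes, which this sketch leaves out.

```python
# Hypothetical three-row logical table; the values are placeholders.
table = [
    ("a1", "b1", "c1"),
    ("a2", "b2", "c2"),
    ("a3", "b3", "c3"),
]


def row_layout(rows):
    """Row storage: all fields of one row are stored next to each other."""
    return [value for row in rows for value in row]


def column_layout(rows):
    """Column storage: all values of one column are stored next to each other."""
    return [value for column in zip(*rows) for value in column]


print(row_layout(table))     # a1, b1, c1, a2, b2, c2, ...
print(column_layout(table))  # a1, a2, a3, b1, b2, b3, ...
```

With the column layout, a query that reads a single column scans one contiguous run of values. With the row layout, fetching one whole row is the contiguous read.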
Posted by Lol5916 on Fri, 31 Dec 2021 12:59:52 +0100
Hive common function summary
Functions
View all built-in functions
show functions;
View the usage of a function
desc function [extended] function_name; (extended shows the detailed description)
UDF: one in, one out, applied row by row
UDAF: many in, one out
UDTF: one in, many out
UDF
NVL: assigns a value to data whose value is NULL. Its format is NVL(value, default_value). Its function is to ...
Posted by jerk on Thu, 30 Dec 2021 17:53:54 +0100
Hive SQL optimization ideas
Hive optimization mainly falls into configuration optimization, SQL statement optimization, task optimization, and other approaches. Of these, SQL optimization is the one most likely to come up during development. The core ideas of optimization are:
Reduce the amount of data (such as partitioning, column pruning)
Avoid data skew (e.g. adding param ...
Posted by ESCForums.com on Thu, 30 Dec 2021 12:47:56 +0100