Installation and use of Hive

1. Install Hive 1.1 Installing Java: Java must be installed on the system before Hive can be installed. Use the following command to verify whether Java is installed. If it is, you will see the following response. If not, download it from the official Java website and install it with the default options. 1.2 Hadoop installation downl ...

Posted by simenss on Thu, 27 Jan 2022 12:27:34 +0100

Syntax comparison between Presto and Hive

Time conversion problems come up often at work: 1) a log_date such as 20200110 needs to be converted to a standard date or compared with timestamp data; 2) the working environment involves both Presto and Hive. Queries run faster in Presto, so dates generally need to be converted using the syntax of both Presto and Hive a ...

Posted by Lord Brar on Tue, 25 Jan 2022 23:31:57 +0100
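
The conversion described in the Presto and Hive entry above can be illustrated outside either engine. The following is a minimal Python sketch, not Presto or Hive syntax; the function names `parse_log_date` and `is_on_or_after` are hypothetical and exist only for this illustration. It assumes `log_date` is a compact `yyyymmdd` string such as the `20200110` value in the excerpt.

```python
from datetime import date, datetime


def parse_log_date(log_date: str) -> date:
    """Parse a compact yyyymmdd string such as '20200110' into a date."""
    return datetime.strptime(log_date, "%Y%m%d").date()


def is_on_or_after(log_date: str, ts: datetime) -> bool:
    """Compare a compact log_date with a timestamp at day granularity."""
    return parse_log_date(log_date) >= ts.date()


# Standard (ISO 8601) form of the value from the excerpt.
print(parse_log_date("20200110").isoformat())  # 2020-01-10
```

Parsing the string into a real date before comparing avoids comparing text against timestamps, which is the underlying issue regardless of which engine performs the conversion.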

Hive data type, database related operations, table related operations, data import and export

Hive data types 1. Basic data types 2. Collection data types Case practice (1) Suppose a table has the following row; we use JSON to represent its data structure. Accessed under Hive, the format is { "name": "songsong", "friends": ["bingbing", "lili"], // list (Array) "children": { // key-value (Map) "xiao song": 18, ...

Posted by luanne on Sun, 23 Jan 2022 00:23:28 +0100

HiveSql interview question 29 -- find the maximum number of people online and the peak time period [accumulator idea, timing analysis]

Contents: 0 Requirement analysis 1 Data preparation 2 Data analysis 3 Summary. 0 Requirement analysis: the data contains the anchor (streamer) ID; stt is the start time and edt is the end time of the broadcast.

id    stt                  edt
1001  2021-06-14 12:12:12  2021-06-14 18:12:12
1003  2021-06-14 13:12:12  2021-06-14 16:12:12
1004  2021-06-14 13:15:12  2021-06-14 20:12:12
1002  2021-06-14 15:12:12 ...

Posted by Mzor on Sat, 22 Jan 2022 04:45:39 +0100
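
The "accumulator idea" named in the interview question 29 entry above can be sketched as follows. This is a Python illustration under stated assumptions, not the post's HiveSQL solution: each start time counts as +1, each end time as -1, and a running sum over the time-ordered events gives the number of people online. Only the three rows that appear complete in the excerpt are used; the function name `peak_online` and the tie-breaking rule (an end at time t is processed before a start at time t) are assumptions made for this sketch.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# The three complete rows visible in the excerpt: (id, stt, edt).
ROWS = [
    (1001, "2021-06-14 12:12:12", "2021-06-14 18:12:12"),
    (1003, "2021-06-14 13:12:12", "2021-06-14 16:12:12"),
    (1004, "2021-06-14 13:15:12", "2021-06-14 20:12:12"),
]


def peak_online(rows):
    """Return (peak count, peak start, peak end) for the first peak period."""
    events = []
    for _id, stt, edt in rows:
        events.append((datetime.strptime(stt, FMT), 1))   # someone comes online
        events.append((datetime.strptime(edt, FMT), -1))  # someone goes offline
    # Order by time; at equal times the -1 sorts before the +1 (assumption).
    events.sort(key=lambda e: (e[0], e[1]))

    online = 0
    peak = 0
    peak_start = None
    peak_end = None
    for i, (ts, delta) in enumerate(events):
        online += delta  # the accumulator: running count of people online
        if online > peak:
            peak = online
            peak_start = ts
            # The peak lasts until the next event changes the count.
            peak_end = events[i + 1][0] if i + 1 < len(events) else ts
    return peak, peak_start, peak_end


peak, start, end = peak_online(ROWS)
print(peak, start, end)  # 3 2021-06-14 13:15:12 2021-06-14 16:12:12
```

In SQL the same idea is usually expressed by unioning the start and end rows with a +1/-1 flag and taking a windowed running sum ordered by time; the sketch only reports the first period at which the maximum is reached.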

The process of Spark SQL reading and writing Hive

Spark SQL needs Hive-related configuration to read and write Hive, so the hive-site.xml file is generally placed in Spark's conf directory. The calling code is simple; the key points are the source code analysis and how Spark interacts with Hive. 1. Code call: read Hive code SparkSession sparkSession = SparkSession.builder() ...

Posted by Brian W on Tue, 18 Jan 2022 02:41:37 +0100

Introduction to big data -- Hive data query

Syntax: SELECT [ALL | DISTINCT] select_expr, select_expr, ... FROM table_reference [WHERE where_condition] [GROUP BY col_list] [ORDER BY col_list] [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list] ] [LIMIT number]. WHERE: similar to WHERE in standard SQL; the extended RLIKE operator supports regular expressions. Sorting: ORDER BY performs global sorting, only one ...

Posted by kingconnections on Sun, 16 Jan 2022 10:29:28 +0100

Hive learning notes - Chapter 7 functions

1. System built-in functions (1) View the functions provided by the system: show functions; (2) Display the usage of a built-in function: desc function upper; (3) Display the usage of a built-in function in detail: desc function extended upper; 2. Common built-in functions 2.1 Assigning values to empty fields (1) Function description NVL: ...

Posted by holstead on Sun, 16 Jan 2022 06:56:57 +0100

Hive file storage format

Hive supports the following formats for storing data: TEXTFILE (row storage), SEQUENCEFILE (row storage), ORC (column storage) and PARQUET (column storage). 1: Column storage and row storage. The left side of the figure above is a logical table, the first one on the right is row storage, and the second one is column storage. The storage ...

Posted by Lol5916 on Fri, 31 Dec 2021 12:59:52 +0100
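
The difference between row storage and column storage mentioned in the file storage format entry above can be made concrete with a toy layout. This is a conceptual Python sketch only; it does not reflect the actual on-disk layout of TEXTFILE, SEQUENCEFILE, ORC or PARQUET, and the table contents and function names are invented for illustration.

```python
# A tiny logical table; the values are made up for this sketch.
TABLE = [
    (1, "x", 10.0),
    (2, "y", 20.0),
    (3, "z", 30.0),
]


def row_layout(table):
    """Row storage: all values of one row are stored next to each other."""
    return [value for row in table for value in row]


def column_layout(table):
    """Column storage: all values of one column are stored next to each other."""
    return [value for column in zip(*table) for value in column]


def read_column(flat, n_rows, n_cols, col, layout):
    """Read one column back from a flat layout."""
    if layout == "row":
        # Values of the column are scattered: one every n_cols positions.
        return flat[col::n_cols]
    # Values of the column form one contiguous run.
    return flat[col * n_rows:(col + 1) * n_rows]


rows_flat = row_layout(TABLE)
cols_flat = column_layout(TABLE)
print(rows_flat)  # [1, 'x', 10.0, 2, 'y', 20.0, 3, 'z', 30.0]
print(cols_flat)  # [1, 2, 3, 'x', 'y', 'z', 10.0, 20.0, 30.0]
```

Reading a single column touches every row in the row layout but only one contiguous slice in the column layout, which is the usual reason columnar formats suit queries that select a few columns.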

Hive common function summary

Functions. View all built-in functions: show functions; Query how a function is used: desc function [extended] function_name (extended gives a detailed display). UDF: one in, one out, measured per row. UDAF: many in, one out. UDTF: one in, many out. UDF NVL: assigns a value to data whose value is NULL. Its format is NVL(value, default_value). Its function is to ...

Posted by jerk on Thu, 30 Dec 2021 17:53:54 +0100
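
The NVL behaviour described in the function summary entry above, replacing a NULL value with a default, can be mirrored in a few lines. This is a Python stand-in rather than Hive code: `None` plays the role of NULL, and the helper `nvl` is defined here purely for illustration.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def nvl(value: Optional[T], default_value: T) -> T:
    """Return default_value when value is None (standing in for NULL)."""
    return default_value if value is None else value


print(nvl(None, 0))   # 0
print(nvl(5, 0))      # 5
print(nvl("", "n/a") == "")  # True: an empty string is not NULL
```

The last line shows the distinction that matters in practice: only a missing value is replaced, while empty strings and zeros pass through unchanged.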

Hive SQL optimization ideas

Hive optimization falls mainly into configuration optimization, SQL statement optimization, task optimization and other approaches. Of these, SQL optimization is the one most often involved during development. The core ideas of optimization are: reduce the amount of data (e.g. partitioning, column pruning); avoid data skew (e.g. adding param ...

Posted by ESCForums.com on Thu, 30 Dec 2021 12:47:56 +0100