Installation and use of Hive

Posted by simenss on Thu, 27 Jan 2022 12:27:34 +0100

1. Install Hive

1.1 Installing Java

Java must be installed on the system before Hive can be installed. Run java -version to check: if Java is installed, the command prints the installed version.

If it is not installed, download it from the official Java website and follow the installer's default steps.
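
The check above can be scripted so that it also reports when Java is missing. This is a minimal sketch, assuming only that a java executable, if present, is on the PATH; the variable names java_status and java_version are illustrative.

```shell
# Report whether a usable Java runtime is on the PATH.
# `java -version` writes to stderr, so 2>&1 is needed to capture its text.
if command -v java >/dev/null 2>&1 && java_version="$(java -version 2>&1)"; then
    java_status="installed"
    echo "$java_version"
else
    java_status="missing"
    echo "Java not found: install it before continuing with Hadoop and Hive."
fi
```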

1.2 Hadoop installation

Download Hadoop
Unzip it into a directory of your choice and configure the environment variables (on macOS) as follows

  • Run vim ~/.bash_profile and add the following environment variables (on Linux, edit ~/.bashrc instead)
export HADOOP_HOME=/usr/local/hadoop 
export HADOOP_MAPRED_HOME=$HADOOP_HOME 
export HADOOP_COMMON_HOME=$HADOOP_HOME 
export HADOOP_HDFS_HOME=$HADOOP_HOME 
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
  • Run source ~/.bash_profile to apply the changes to the current shell

Verify the setup with hadoop version
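
A slightly fuller check can confirm that the variables from the profile actually reached the current shell. The sketch below assumes the /usr/local/hadoop location used above as a fallback; on_path is a hypothetical helper, not part of Hadoop.

```shell
# Use HADOOP_HOME from the profile, or fall back to the path used above.
hadoop_home="${HADOOP_HOME:-/usr/local/hadoop}"

# on_path DIR: succeed if DIR is one of the colon-separated PATH entries.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

for dir in "$hadoop_home/bin" "$hadoop_home/sbin"; do
    if on_path "$dir"; then
        echo "OK: $dir is on PATH"
    else
        echo "MISSING: $dir is not on PATH"
    fi
done

# Print the Hadoop version only if the command can actually be found.
if command -v hadoop >/dev/null 2>&1; then
    hadoop version || echo "hadoop is on PATH but failed to run"
else
    echo "hadoop command not found; check HADOOP_HOME and PATH"
fi
```

If a directory is reported as MISSING, the profile was probably not sourced in this shell.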

Hadoop configuration
Go to the Hadoop configuration directory, $HADOOP_HOME/etc/hadoop

  • To develop Hadoop programs in Java, reset the JAVA_HOME environment variable in hadoop-env.sh to the location of Java on your system

    First run /usr/libexec/java_home -V to find the Java installation path

    Then edit hadoop-env.sh and set JAVA_HOME to that path

  • Edit core-site.xml. This file holds settings such as the port number used by the Hadoop instance, the memory allocated to the file system, the memory limit for storing data, and the size of the read/write buffers

    Open core-site.xml and add the following property between the <configuration> tags

    <configuration>
    
    <property> 
      <name>fs.default.name</name> 
      <value>hdfs://localhost:9000</value> 
    </property>
    
    </configuration>
    
  • Edit hdfs-site.xml. This file holds settings such as the replication factor, the NameNode path, and the DataNode path on the local file system
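
The hadoop-env.sh and hdfs-site.xml edits described in this list can be sketched as two small shell functions. This is only an illustration for a single-node setup: set_java_home and write_hdfs_site are hypothetical helpers, the JDK path and the NameNode/DataNode directories are placeholders, and the example writes to a scratch directory rather than to $HADOOP_HOME/etc/hadoop.

```shell
# set_java_home FILE JDK_DIR
# Append a JAVA_HOME export to the given hadoop-env.sh file.
set_java_home() {
    printf 'export JAVA_HOME=%s\n' "$2" >> "$1"
}

# write_hdfs_site FILE NAME_DIR DATA_DIR
# Write a minimal single-node hdfs-site.xml (replication factor 1).
write_hdfs_site() {
    cat > "$1" <<EOF
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://$2</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://$3</value>
  </property>
</configuration>
EOF
}

# Example run against a scratch directory, so no real configuration is touched.
conf_dir="$(mktemp -d)"
set_java_home "$conf_dir/hadoop-env.sh" "/path/to/jdk"
write_hdfs_site "$conf_dir/hdfs-site.xml" "/tmp/hadoop/namenode" "/tmp/hadoop/datanode"
cat "$conf_dir/hadoop-env.sh" "$conf_dir/hdfs-site.xml"
```

On macOS, the JDK path passed to set_java_home would be the one reported by /usr/libexec/java_home.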

For more configuration details, see: hive installation

Verify the Hadoop installation

1.3 Hive installation

Download Hive
Download Hive 2.3.9, as shown in the figure below

Configure Hive environment variables
Run vim ~/.bash_profile and add the following lines

export HIVE_HOME=/Applications/apache-hive-2.3.9-bin
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/Applications/hadoop-3.3.1/lib/*:.
export CLASSPATH=$CLASSPATH:/Applications/apache-hive-2.3.9-bin/lib/*:.

Run source ~/.bash_profile to make the changes take effect

Configure Hive

cd $HIVE_HOME/conf
cp hive-env.sh.template hive-env.sh

Edit hive-env.sh and add the following line

export HADOOP_HOME=/Applications/hadoop-3.3.1
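
To confirm the Hive configuration took effect, a quick check along these lines may help. It is a sketch that assumes the install path used above as a fallback; has_hadoop_home is a hypothetical helper, not something shipped with Hive.

```shell
# has_hadoop_home FILE: succeed if FILE contains an export of HADOOP_HOME.
has_hadoop_home() {
    grep -q '^export HADOOP_HOME=' "$1" 2>/dev/null
}

hive_env="${HIVE_HOME:-/Applications/apache-hive-2.3.9-bin}/conf/hive-env.sh"
if has_hadoop_home "$hive_env"; then
    echo "OK: $hive_env sets HADOOP_HOME"
else
    echo "CHECK: $hive_env is missing or does not export HADOOP_HOME"
fi

# Print the Hive version only if the command can actually be found.
if command -v hive >/dev/null 2>&1; then
    hive --version || echo "hive is on PATH but failed to start"
else
    echo "hive command not found; check HIVE_HOME and PATH"
fi
```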

1.4 Download and install Apache Derby

With the steps above, Hive itself is installed. The Metastore still needs an external database server to be configured; we use the Apache Derby database.
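
For a quick local trial, one possible route is to initialise the Metastore schema with Hive's schematool. Note the assumption: this sketch uses the embedded Derby database bundled with Hive and Hive's default configuration, not a separately installed Derby server, so it is an alternative to the setup named above rather than a description of it. The variable metastore_status is illustrative.

```shell
# Initialise the Metastore schema in an embedded Derby database.
# Assumption: Hive's default configuration, which creates a metastore_db
# directory in the current working directory.
if command -v schematool >/dev/null 2>&1; then
    if schematool -dbType derby -initSchema; then
        metastore_status="initialised"
    else
        metastore_status="failed"
    fi
else
    metastore_status="skipped"
    echo "schematool not found; check that \$HIVE_HOME/bin is on PATH"
fi
echo "metastore: $metastore_status"
```

Because the embedded database lives in the working directory, later hive commands would need to be started from the same directory to see the same Metastore.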

2. Using Hive

  • Create database

    CREATE DATABASE [IF NOT EXISTS] <database name>;
    
  • List databases

    SHOW DATABASES;
    
  • Delete database

    DROP DATABASE [IF EXISTS] <database name>;
    
  • Create table

    CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
    [(col_name data_type, col_name2 data_type2, ...)]
    [COMMENT table_comment]
    [ROW FORMAT row_format]
    [STORED AS file_format]
    

    For example, create an employee table:
    CREATE TABLE IF NOT EXISTS employee ( id int, name String, salary String, destination String);

  • Insert data
    In standard SQL, rows are usually added to a table with the INSERT statement. In Hive, data is usually loaded in bulk from a file with the LOAD DATA statement

    LOAD DATA [LOCAL] INPATH '<filepath>' [OVERWRITE] INTO TABLE <tablename>;
    

    LOCAL indicates that the file path is on the local file system (without it, the path is read from HDFS); OVERWRITE replaces the data already in the table

  • Change the name of the table

    ALTER TABLE <name> RENAME TO <new_name>
    
  • Add a column

    ALTER TABLE <name> ADD COLUMNS (col_spec[, col_spec ...])
    
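  • Delete a column
    Hive has no DROP COLUMN clause; instead, redefine the column list with REPLACE COLUMNS, listing only the columns to keep

    ALTER TABLE <name> REPLACE COLUMNS (col_spec[, col_spec ...])
    
  • Change the column name / type of a column

    ALTER TABLE <name> CHANGE [COLUMN] <column_name> <new_name> <new_type>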
    
  • Delete table

    DROP TABLE [IF EXISTS] table_name;
    
  • Query data with SELECT

    SELECT [ALL | DISTINCT] <select_expr>, <select_expr>, ... 
    FROM <table_name>
    [WHERE <condition>] 
    [GROUP BY col_list] 
    [HAVING having_condition] 
    [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]] 
    [LIMIT number];
    
  • Query with sorting (ORDER BY)

    SELECT [ALL | DISTINCT] <select_expr1>, <select_expr2>, ... 
    FROM <table_name> 
    [WHERE where_condition] 
    [GROUP BY col_list] 
    [HAVING having_condition] 
    ORDER BY <column_list>
    [LIMIT number];
    

As the syntax above shows, Hive's query language is very similar to MySQL's.
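
To tie the statements together, the sketch below writes a small data file and a HiveQL script that creates the employee table, loads the file, and queries it, then runs the script with hive -f if Hive is available. The database name demo_db, the sample rows, and the tab delimiter are assumptions made for illustration only.

```shell
work_dir="$(mktemp -d)"
data_file="$work_dir/employee.txt"
hql_file="$work_dir/employee.hql"

# Three tab-separated sample rows matching the employee table's four columns.
printf '1\tAlice\t45000\tEngineer\n'  > "$data_file"
printf '2\tBob\t40000\tAnalyst\n'    >> "$data_file"
printf '3\tCarol\t50000\tManager\n'  >> "$data_file"

# The heredoc is unquoted so that $data_file expands; '\t' is passed through
# unchanged and interpreted by Hive as a tab delimiter.
cat > "$hql_file" <<EOF
CREATE DATABASE IF NOT EXISTS demo_db;
USE demo_db;
CREATE TABLE IF NOT EXISTS employee (id int, name String, salary String, destination String)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '$data_file' OVERWRITE INTO TABLE employee;
SELECT id, name, salary FROM employee ORDER BY id LIMIT 2;
EOF

# Run the script only when Hive is installed; otherwise just report its location.
if command -v hive >/dev/null 2>&1; then
    hive -f "$hql_file" || echo "hive run failed; see the messages above"
else
    echo "hive not found; the HiveQL script was written to $hql_file"
fi
```

Keeping the statements in a file rather than typing them at the prompt makes the run repeatable, since IF NOT EXISTS and OVERWRITE let the script be executed more than once.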

References

1. Yibai tutorial: Hive
