The following script is used for domestic brew installation:
/bin/zsh -c "$(curl -fsSL https://gitee.com/cunkai/HomebrewCN/raw/master/Homebrew.sh)"
brew install hive
Hide execute sql statement
hive can directly convert sql statements into mapreduce code (the link below is very good)
Because the company has a configured HIVE system, it is not necessary to install it on this machine.
So go straight to the SQL tutorial
SQL is case insensitive
Query and update instructions form the DML part of SQL:
SELECT - get data from database table
UPDATE - updates the data in the database table
DELETE - deletes data from a database table
INSERT INTO - inserts data into a database table
The most important DDL statements in SQL:
CREATE DATABASE - create a new database
ALTER DATABASE - modify database
CREATE TABLE - create a new table
ALTER TABLE - change (Alter) database tables
DROP TABLE - delete table
CREATE INDEX - CREATE INDEX (search key)
DROP INDEX - DROP INDEX
SELECT LastName,FirstName FROM Persons #Select all columns SELECT * FROM Persons #Returns a unique value SELECT DISTINCT Column name FROM Table name
SELECT * FROM Persons WHERE City='Beijing' #About the value of a value This is true: SELECT * FROM Persons WHERE FirstName='Bush' SELECT * FROM Persons WHERE Year>1965
AND and OR
#Use parentheses to use and and or together SELECT * FROM Persons WHERE (FirstName='Thomas' OR FirstName='William') AND LastName='Carter'
SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC, OrderNumber ASC
INSERT INTO Persons (LastName, Address) VALUES ('Wilson', 'Champs-Elysees')
# UPDATE table name SET column name = new value WHERE column name = a value UPDATE Person SET Address = 'Zhongshan 23', City = 'Nanjing' WHERE LastName = 'Wilson'
#Select the first few or the first 50% SELECT TOP 2 * FROM Persons SELECT TOP 50 PERCENT * FROM Persons
Now, we want to start from the above "Persons" Select the person who lives in "N" People in the beginning City SELECT * FROM Persons WHERE City LIKE 'N%' Next, we want to start from "Persons" Select the person who lives in "g" People in the city at the end: SELECT * FROM Persons WHERE City LIKE '%g
Specify the selected value range SELECT * FROM Persons WHERE LastName IN ('Adams','Carter')
Specify data range SELECT column_name(s) FROM table_name WHERE column_name BETWEEN value1 AND value2 To display in alphabetical order between "Adams"((including) and "Carter"((not included), please use the following SQL: SELECT * FROM Persons WHERE LastName BETWEEN 'Adams' AND 'Carter'
Suppose we have two tables:"Persons" and "Product_Orders". We assign aliases to each of them "p" and "po". Now, we want to list "John Adams" All orders. SELECT po.OrderID, p.LastName, p.FirstName FROM Persons AS p, Product_Orders AS po WHERE p.LastName='Adams' AND p.FirstName='John'
Aliases are used for easy reading and to simplify sql statements
where and join
**where** SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons, Orders WHERE Persons.Id_P = Orders.Id_P **join** SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons INNER JOIN Orders ON Persons.Id_P = Orders.Id_P ORDER BY Persons.LastName
inner join: understood as "effective join", the data in both tables will be displayed
left join: understood as "left display". For example, if on a.field=b.field, all data in table a and data in both a and B will be displayed. Data in a and not in B will be displayed as null
right join: understood as "right display". For example, if on a.field=b.field, all data in table B and data in a and B will be displayed, and data in B and not in a will be displayed as null
Full join: understood as "full join". All data in the two tables are displayed. In fact, it is inner + (left inner) + (right inner)
Join is equivalent to inner join is equivalent to where
Now, we want to list all the people, their orders, all the orders, and the people who ordered them.
SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons FULL JOIN Orders ON Persons.Id_P=Orders.Id_P ORDER BY Persons.LastName
The FULL JOIN keyword returns all rows from the left table (Persons) and the right table (Orders). If the rows in "Persons" do not match in the table "Orders", or if the rows in "Orders" do not match in the table "Persons", these rows are also listed.
The UNION operator is used to combine the result sets of two or more SELECT statements.
# Only two tables with different values are returned SELECT column_name(s) FROM table_name1 UNION SELECT column_name(s) FROM table_name2 # Return all values of two tables SELECT column_name(s) FROM table_name1 UNION ALL SELECT column_name(s) FROM table_name2
Tables and databases
CREATE DATABASE database_name
This example demonstrates how to create a file named "Person" Table of. The table contains five columns with the following names:"Id_P","LastName","FirstName","Address" as well as "City": CREATE TABLE Persons ( Id_P int, LastName varchar(255), FirstName varchar(255), Address varchar(255), City varchar(255) )
If a column in the table is optional, we can insert new records or update existing records without adding values to the column. This means that the field will be saved with a NULL value.
NULL values are handled differently from other values.
NULL is used as a placeholder for unknown or inapplicable values.
# How can we just select records with NULL values in the "Address" column? SELECT LastName,FirstName,Address FROM Persons WHERE Address IS NULL # How do we select records without NULL value in the "Address" column? SELECT LastName,FirstName,Address FROM Persons WHERE Address IS NOT NULL
SELECT function(column) FROM surface
The FIRST() function returns the value of the first record in the specified field.
Tip: you can use the ORDER BY statement to sort records.
SELECT FIRST(OrderPrice) AS FirstOrderPrice FROM Orders
And Max, min, sum
The last thing I understand is that there is no more than one row of data returned by group
Adding HAVING clause in SQL because WHERE keyword cannot be used with aggregate function.
SELECT column_name, aggregate_function(column_name) FROM table_name WHERE column_name operator value GROUP BY column_name HAVING aggregate_function(column_name) operator value
Now we want to find that the customer "Bush" or "Adams" has a total order amount of more than 1500.
We added a common WHERE clause to the SQL statement:
SELECT Customer,SUM(OrderPrice) FROM Orders WHERE Customer='Bush' OR Customer='Adams' GROUP BY Customer HAVING SUM(OrderPrice)>1500