1, Case analysis
Table structure of query
Get data from multiple tables:
-- Wrong query method: query employees Tabular id and departments Name of the table SELECT employee_id, department_name FROM employees, departments; # 2889 records, there can't be so many, it's wrong SELECT * FROM employees; #107 SELECT 2889/107 FROM DUAL; #27 SELECT * FROM departments; #27
Reason: Cartesian product is carried out, and the connection conditions of multiple tables are missing
# Cross connection is the Cartesian product SELECT employee_id, department_name FROM employees CROSS JOIN departments; #2889
Correct query
-- Correct query SELECT employee_id, department_name FROM employees, departments #Add conditions WHERE employees.department_id=departments.department_id; #106, one is null, so one is missing
The error of Cartesian product will occur under the following conditions
- Omit join conditions (or association conditions) for multiple tables
- The connection condition (or association condition) is invalid
- All rows in all tables are connected to each other
Extension: when there are the same columns in the table, prefix the column name with the table name
2, Explanation of multi table query classification
Category 1: equivalent connection vs non equivalent connection
Equivalent connection
Equivalent connection refers to the connection between different tables with the same conditions, as follows
SELECT employees.employee_id, employees.last_name, employees.department_id, departments.department_id, departments.location_id FROM employees, departments WHERE employees.department_id = departments.department_id;
Summary: to connect n tables, at least n-1 connection conditions are required. For example, connecting three tables requires at least two connection conditions.
Non equivalent connection
The two tables are not connected by equal conditions, but are queried by field judgment
SELECT e.last_name, e.salary, j.grade_level FROM employees e, job_grades j # WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal; WHERE e.salary >= j.lowest_sal AND e.salary <= j.highest_sal
Category 2: self connected vs non self connected
Self connection: that is to query yourself. For example, in this employee table, each employee has a supervisor id, and this supervisor id is queried through this employee table.
When table1 and table2 are essentially the same table, they are only virtualized into two tables by taking aliases to represent different meanings. Then the two tables are queried for internal connection and external connection.
-- Query employees in employee table id,Employee name, manager id,Manager name SELECT emp.employee_id ,emp.last_name,mgr.employee_id,mgr.last_name FROM employees emp ,employees mgr WHERE emp.manager_id = mgr.employee_id;
Title: query the employees table and return "Xxx works for Xxx"
# CONCAT splice SELECT CONCAT(worker.last_name ,' works for ' , manager.last_name) FROM employees worker, employees manager WHERE worker.manager_id = manager.employee_id ;
Non self join means that multiple tables join queries
Category 3: internal connection vs external connection
Inner connection
Merge rows of more than two tables with the same column. The result set does not contain rows that do not match one table with another, that is, only those that meet the conditions will be queried
-- Internal connection: here employees There are 107 records, but only 106 data are queried, because there is no record on one day department_id -- If the data does not meet the conditions, it will not be queried. This is inner connection SELECT employee_id,department_name FROM employees,departments WHERE employees.department_id=departments.department_id;
External connection
In addition to the rows that meet the connection conditions, two tables also return the rows that do not meet the conditions in the left (or right) table. This connection is called left (or right) outer connection. When there is no matching row, the corresponding column in the result table is empty (NULL).
Classification of external connection: left external connection, right external connection and full external connection
- If it is a left outer connection, the left table in the connection condition is also called the master table, and the right table is called the slave table.
- If it is a right outer connection, the right table in the connection condition is also called the master table, and the left table is called the slave table.
3, What are the common SQL standards
When we begin to talk about the types of join tables, we first need to know that SQL has different versions of standard specifications, because the table join operations under different specifications are different.
SQL has two main standards, SQL92 and SQL99. 92 and 99 represent the time when the standard was proposed. SQL92 is the standard specification proposed in 92. Of course, in addition to SQL92 and SQL99, there are other standards such as SQL-86, SQL-89, SQL:2003, SQL:2008, SQL:2011 and SQL:2016.
So many standards, which should we learn? In fact, the most important SQL standards are SQL92 and SQL99. Generally speaking, the form of SQL92 is simpler, but the SQL statements written will be longer and less readable. Compared with SQL92, SQL99 has more complex syntax and better readability. We can also see from the number of pages published by these two standards that SQL92 has 500 pages, while SQL99 has more than 1000 pages. In fact, after SQL99, few people can master all the content, because there are really too many. Just like when we use Windows, Linux and Office, few people can master all the content. We only need to master some core functions to meet the needs of daily work.
**SQL92 and SQL99 are classic SQL standards, also known as SQL-2 and SQL-3 standards respectively** It is also after the release of these two standards that SQL has become more and more influential, even beyond the database field. Nowadays, SQL is not only the mainstream language in the field of database, but also the mainstream language of information processing in the field of information. The use of SQL language can be seen in graphics retrieval, image retrieval and voice retrieval.
SQL92: create connection using (+)
-
In SQL92, (+) is used to represent the location of the slave table. That is, in the left or right outer join, (+) indicates which is the slave table.
-
Oracle supports SQL92 well, while MySQL does not support the external connection of SQL92.
-- Exercise: query all employees last_name,department_name information -- This is in Oracle Database to execute #SQL92 syntax inner join SELECT last_name,department_name FROM employees,departments WHERE employees.department_id=departments.department_id; #Left outer connection SELECT last_name,department_name FROM employees ,departments WHERE employees.department_id = departments.department_id(+); #Right outer connection SELECT last_name,department_name FROM employees ,departments WHERE employees.department_id(+) = departments.department_id;
- Moreover, in SQL92, there are only left external connection and right external connection, and there is no full (or full) external connection.
Implementation of multi table query with SQL99 syntax
graphic
1. Basic grammar
- Use the JOIN... ON clause to create the syntax structure of the JOIN:
SELECT table1.column, table2.column,table3.column FROM table1 JOIN table2 ON table1 and table2 Connection conditions JOIN table3 ON table2 and table3 Connection conditions
The nested structure adopted by SQL99 is very refreshing, hierarchical and readable. Even if more tables are connected, they are clearly visible. If you use SQL92, readability will be greatly reduced.
-
Syntax description:
- You can use the ON clause to specify additional join conditions.
- This connection condition is separate from other conditions.
- The ON clause makes the statement more readable.
- The keywords JOIN, INNER JOIN and CROSS JOIN have the same meaning, and all represent internal connections
2. Implementation of inner join
grammar
SELECT Field list FROM A surface INNER JOIN B surface ON Association condition WHERE Other clauses such as;
3. Implementation of left outer join
#The query result is A SELECT Field list FROM A surface LEFT JOIN B surface ON Association condition WHERE Other clauses such as;
4. Implementation of right outer join
#The query result is B SELECT Field list FROM A surface RIGHT JOIN B surface ON Association condition WHERE Other clauses such as;
It should be noted that LEFT JOIN and RIGHT JOIN only exist in SQL99 and later standards. They do not exist in SQL92 and can only be represented by (+).
5. Full outer join
- The result of full external connection = data matched by the left and right tables + data not matched by the left table + data not matched by the right table.
- SQL99 supports full external connections. Use FULL JOIN or FULL OUTER JOIN to implement.
- It should be noted that MySQL does not support FULL JOIN, but LEFT JOIN UNION RIGHT JOIN can be used instead.
6. Practical operation
-- Exercise: query all employees last_name,department_name information #SQL92 syntax inner join SELECT last_name,department_name FROM employees,departments WHERE employees.department_id=departments.department_id; #SQL92 syntax outer join: Using+ #SQL99 syntax inner connection SELECT last_name,department_name FROM employees INNER JOIN departments ON employees.department_id=departments.department_id;#Article 106 #SQL99 syntax left outer connection SELECT last_name,department_name FROM employees LEFT JOIN departments ON employees.department_id=departments.department_id;#Article 107 #SQL99 syntax right outer connection SELECT last_name,department_name FROM employees RIGHT JOIN departments ON employees.department_id=departments.department_id;#Article 122 #SQL99 syntax full external connection, MySQL does not support FULL JOIN, so the execution failed SELECT last_name,department_name FROM employees FULL JOIN departments ON employees.department_id=departments.department_id;
4, Use of UNION
Merge query results
Using the UNION keyword, you can give multiple SELECT statements and combine their results into a single result set. When merging, the number of columns and data types corresponding to the two tables must be the same and correspond to each other. SELECT statements are separated by the UNION or UNION ALL keyword.
Syntax format:
SELECT column,... FROM table1 UNION [ALL] SELECT column,... FROM table2
The UNION operator returns the UNION of the result sets of two queries to remove duplicate records.
The UNION ALL operator returns the union of the result sets of two queries. For the duplicate parts of the two result sets, the duplicate is not removed.
Note: less resources are required to execute the UNION ALL statement than the UNION statement. If it is clear that there is no duplicate data in the result data after merging data, or there is no need to remove the duplicate data, try to use the UNION ALL statement to improve the efficiency of data query.
For example: query employee information with department number > 90 or email containing a
SELECT * FROM employees WHERE email LIKE '%a%' UNION SELECT * FROM employees WHERE department_id>90;
Query the information of male users in China and middle-aged male users in the United States
SELECT id,cname FROM t_chinamale WHERE csex='male' UNION ALL SELECT id,tname FROM t_usmale WHERE tGender='male';
5, Implementation of 7 kinds of SQL JOINS
code implementation
Number of data entries in these two tables
Middle figure: inner connection A ∩ B
SELECT employee_id,last_name,department_name FROM employees e INNER JOIN departments d on e.department_id = d.department_id;#Article 106
Top left: left outer connection
SELECT employee_id,last_name,department_name FROM employees e left JOIN departments d on e.department_id = d.department_id; #Article 107
Top right: right outer connection
SELECT employee_id,last_name,department_name FROM employees e RIGHT JOIN departments d on e.department_id = d.department_id; #Article 122
Middle left: A - A ∩ B
SELECT employee_id,last_name,department_name FROM employees e LEFT JOIN departments d on e.department_id = d.department_id WHERE e.department_id IS NULL; # Article 1
Middle right: B-A ∩ B
SELECT employee_id,last_name,department_name FROM employees e RIGHT JOIN departments d ON e.`department_id` = d.`department_id` WHERE e.`department_id` IS NULL
Lower left figure: full external connection = middle left figure + upper right figure A ∪ B
That is, left join query and right join query through unino connection
SELECT employee_id,last_name,department_name FROM employees e LEFT JOIN departments d ON e.`department_id` = d.`department_id` WHERE d.`department_id` IS NULL UNION ALL #No weight removal operation, high efficiency SELECT employee_id,last_name,department_name FROM employees e RIGHT JOIN departments d ON e.`department_id` = d.`department_id`;
Bottom right
Equivalent to middle left figure + middle right figure A ∪ B- A ∩ B or (A - A ∩ B) ∪ (B- A ∩ B)
SELECT employee_id,last_name,department_name FROM employees e LEFT JOIN departments d ON e.`department_id` = d.`department_id` WHERE d.`department_id` IS NULL UNION ALL SELECT employee_id,last_name,department_name FROM employees e RIGHT JOIN departments d ON e.`department_id` = d.`department_id` WHERE e.`department_id` IS NULL
Grammar summary
- Left middle picture
#Achieve a - a ∩ B select Field list from A surface left join B surface on Association condition where Associate fields from table is null and Other clauses such as;
- Right middle figure
#Realize B - a ∩ B select Field list from A surface right join B surface on Association condition where Associate fields from table is null and Other clauses such as;
- Bottom left
#The query result is A ∪ B #Use A outside left, union B outside right select Field list from A surface left join B surface on Association condition where Other clauses such as union select Field list from A surface right join B surface on Association condition where Other clauses such as;
- Bottom right
#Implement A ∪ B - A ∩ B or (A - A ∩ B) ∪ (B - A ∩ B) #Use left outer (a - a ∩ b) union right outer (B - A ∩ b) select Field list from A surface left join B surface on Association condition where Associate fields from table is null and Other clauses such as union select Field list from A surface right join B surface on Association condition where Associate fields from table is null and Other clauses such as
6, New features of SQL99 syntax (just understand)
Natural connection
SQL99 provides some special syntax based on SQL92. For example, NATURAL JOIN is used to represent NATURAL JOIN. We can understand natural connection as equivalent connection in SQL92. It will help you automatically query all the same fields in the two connection tables, and then make equivalent connection.
Note: there will be several fields corresponding to the conditions of several fields, which must meet the equivalent conditions
In the SQL92 standard:
SELECT employee_id,last_name,department_name FROM employees e JOIN departments d ON e.`department_id` = d.`department_id` AND e.`manager_id` = d.`manager_id`;
In SQL99, you can write:
SELECT employee_id,last_name,department_name FROM employees e NATURAL JOIN departments d;
USING connection
When we connect, SQL99 also supports equivalent connection USING the fields with the same name in the USING specified data table. However, it can only be used together with JOIN, and it is not applicable to self connection. For example:
SELECT employee_id,last_name,department_name FROM employees e JOIN departments d USING (department_id);
You can see that different from the NATURAL JOIN, USING specifies the specific same field name. You need to fill in the * * same name * * field to be specified in the brackets () of USING. Use join at the same time USING can simplify the equivalent connection of JOIN ON. It is the same as the following SQL query results:
SELECT employee_id,last_name,department_name FROM employees e ,departments d WHERE e.department_id = d.department_id;
7. Chapter summary
There are three ways to constrain table joins: where, on, and using
-
WHERE: applicable to all associated queries
-
ON: can only be used with JOIN, and can only write association conditions. Although Association conditions can be written together with other conditions in WHERE, it is better to write them separately.
-
USING: it can only be used with JOIN, and the names of the two associated fields are required to be the same in the associated table, and it can only mean that the values of the associated fields are equal
#Association condition #Write the association condition after where SELECT last_name,department_name FROM employees,departments WHERE employees.department_id = departments.department_id; #Write the association condition after on, which can only be used with JOIN SELECT last_name,department_name FROM employees INNER JOIN departments ON employees.department_id = departments.department_id; SELECT last_name,department_name FROM employees CROSS JOIN departments ON employees.department_id = departments.department_id; SELECT last_name,department_name FROM employees JOIN departments ON employees.department_id = departments.department_id; #Write the associated field in using(), which can only be used with JOIN #Moreover, the associated fields in the two tables must have the same name and can only represent= #Query employee name and basic salary SELECT last_name,job_title FROM employees INNER JOIN jobs USING(job_id); #n tables are associated, and n-1 Association conditions are required #Query employee name, basic salary and department name SELECT last_name,job_title,department_name FROM employees,departments,jobs WHERE employees.department_id = departments.department_id AND employees.job_id = jobs.job_id; SELECT last_name,job_title,department_name FROM employees INNER JOIN departments INNER JOIN jobs ON employees.department_id = departments.department_id AND employees.job_id = jobs.job_id;
be careful:
We need to control the number of join tables. Multi table joins are like nested for loops. They consume resources and degrade the performance of SQL queries. Therefore, do not join unnecessary tables. In many DBMS, there is also the limit of the maximum connection table.
[mandatory] join is prohibited for more than three tables. The data types of the fields that need to be joined are absolutely consistent; In multi table Association query, ensure that the associated fields need to have indexes.
Note: pay attention to the table index and SQL performance even if the double table join is used.
Source: Alibaba Java development manual