MySQL multi table query

Posted by dm3 on Mon, 24 Jan 2022 15:07:26 +0100

1, Case analysis

Table structure of query


Get data from multiple tables:

-- Wrong query method: query employees Tabular id and departments Name of the table
SELECT employee_id, department_name
FROM employees, departments; # 2889 records, there can't be so many, it's wrong	


SELECT * 
FROM employees; #107

SELECT 2889/107
FROM DUAL;  #27

SELECT *
FROM departments; #27

Reason: Cartesian product is carried out, and the connection conditions of multiple tables are missing

# Cross connection is the Cartesian product

SELECT employee_id, department_name
FROM employees CROSS JOIN departments; #2889

Correct query

-- Correct query
SELECT employee_id, department_name
FROM employees, departments
#Add conditions
WHERE employees.department_id=departments.department_id; #106, one is null, so one is missing

The error of Cartesian product will occur under the following conditions

  • Omit join conditions (or association conditions) for multiple tables
  • The connection condition (or association condition) is invalid
  • All rows in all tables are connected to each other

Extension: when there are the same columns in the table, prefix the column name with the table name

2, Explanation of multi table query classification

Category 1: equivalent connection vs non equivalent connection

Equivalent connection

Equivalent connection refers to the connection between different tables with the same conditions, as follows

SELECT employees.employee_id, employees.last_name, 
       employees.department_id, departments.department_id,
       departments.location_id
FROM   employees, departments
WHERE  employees.department_id = departments.department_id;

Summary: to connect n tables, at least n-1 connection conditions are required. For example, connecting three tables requires at least two connection conditions.

Non equivalent connection

The two tables are not connected by equal conditions, but are queried by field judgment

SELECT e.last_name, e.salary, j.grade_level
FROM   employees e, job_grades j
# WHERE  e.salary BETWEEN j.lowest_sal AND j.highest_sal;
WHERE e.salary >= j.lowest_sal AND e.salary <= j.highest_sal

Category 2: self connected vs non self connected

Self connection: that is to query yourself. For example, in this employee table, each employee has a supervisor id, and this supervisor id is queried through this employee table.


When table1 and table2 are essentially the same table, they are only virtualized into two tables by taking aliases to represent different meanings. Then the two tables are queried for internal connection and external connection.

-- Query employees in employee table id,Employee name, manager id,Manager name
SELECT emp.employee_id ,emp.last_name,mgr.employee_id,mgr.last_name
FROM employees emp ,employees mgr 
WHERE emp.manager_id = mgr.employee_id;

Title: query the employees table and return "Xxx works for Xxx"

# CONCAT splice
SELECT CONCAT(worker.last_name ,' works for ' 
       , manager.last_name)
FROM   employees worker, employees manager
WHERE  worker.manager_id = manager.employee_id ;

Non self join means that multiple tables join queries

Category 3: internal connection vs external connection

Inner connection

Merge rows of more than two tables with the same column. The result set does not contain rows that do not match one table with another, that is, only those that meet the conditions will be queried

-- Internal connection: here employees There are 107 records, but only 106 data are queried, because there is no record on one day department_id
-- If the data does not meet the conditions, it will not be queried. This is inner connection
SELECT employee_id,department_name
FROM employees,departments
WHERE employees.department_id=departments.department_id;

External connection

In addition to the rows that meet the connection conditions, two tables also return the rows that do not meet the conditions in the left (or right) table. This connection is called left (or right) outer connection. When there is no matching row, the corresponding column in the result table is empty (NULL).

Classification of external connection: left external connection, right external connection and full external connection

  • If it is a left outer connection, the left table in the connection condition is also called the master table, and the right table is called the slave table.
  • If it is a right outer connection, the right table in the connection condition is also called the master table, and the left table is called the slave table.

3, What are the common SQL standards

When we begin to talk about the types of join tables, we first need to know that SQL has different versions of standard specifications, because the table join operations under different specifications are different.

SQL has two main standards, SQL92 and SQL99. 92 and 99 represent the time when the standard was proposed. SQL92 is the standard specification proposed in 92. Of course, in addition to SQL92 and SQL99, there are other standards such as SQL-86, SQL-89, SQL:2003, SQL:2008, SQL:2011 and SQL:2016.

So many standards, which should we learn? In fact, the most important SQL standards are SQL92 and SQL99. Generally speaking, the form of SQL92 is simpler, but the SQL statements written will be longer and less readable. Compared with SQL92, SQL99 has more complex syntax and better readability. We can also see from the number of pages published by these two standards that SQL92 has 500 pages, while SQL99 has more than 1000 pages. In fact, after SQL99, few people can master all the content, because there are really too many. Just like when we use Windows, Linux and Office, few people can master all the content. We only need to master some core functions to meet the needs of daily work.

**SQL92 and SQL99 are classic SQL standards, also known as SQL-2 and SQL-3 standards respectively** It is also after the release of these two standards that SQL has become more and more influential, even beyond the database field. Nowadays, SQL is not only the mainstream language in the field of database, but also the mainstream language of information processing in the field of information. The use of SQL language can be seen in graphics retrieval, image retrieval and voice retrieval.

SQL92: create connection using (+)

  • In SQL92, (+) is used to represent the location of the slave table. That is, in the left or right outer join, (+) indicates which is the slave table.

  • Oracle supports SQL92 well, while MySQL does not support the external connection of SQL92.

-- Exercise: query all employees last_name,department_name information 
-- This is in Oracle Database to execute
#SQL92 syntax inner join
SELECT last_name,department_name
FROM employees,departments
WHERE employees.department_id=departments.department_id;
#Left outer connection
SELECT last_name,department_name
FROM employees ,departments
WHERE employees.department_id = departments.department_id(+);

#Right outer connection
SELECT last_name,department_name
FROM employees ,departments
WHERE employees.department_id(+) = departments.department_id;
  • Moreover, in SQL92, there are only left external connection and right external connection, and there is no full (or full) external connection.

Implementation of multi table query with SQL99 syntax

graphic

1. Basic grammar

  • Use the JOIN... ON clause to create the syntax structure of the JOIN:
  SELECT table1.column, table2.column,table3.column
  FROM table1
      JOIN table2 ON table1 and table2 Connection conditions
          JOIN table3 ON table2 and table3 Connection conditions

The nested structure adopted by SQL99 is very refreshing, hierarchical and readable. Even if more tables are connected, they are clearly visible. If you use SQL92, readability will be greatly reduced.

  • Syntax description:

    • You can use the ON clause to specify additional join conditions.
    • This connection condition is separate from other conditions.
    • The ON clause makes the statement more readable.
    • The keywords JOIN, INNER JOIN and CROSS JOIN have the same meaning, and all represent internal connections

2. Implementation of inner join

grammar

SELECT Field list
FROM A surface INNER JOIN B surface
ON Association condition
WHERE Other clauses such as;

3. Implementation of left outer join

#The query result is A
SELECT Field list
FROM A surface LEFT JOIN B surface
ON Association condition
WHERE Other clauses such as;

4. Implementation of right outer join

#The query result is B
SELECT Field list
FROM A surface RIGHT JOIN B surface
ON Association condition
WHERE Other clauses such as;

It should be noted that LEFT JOIN and RIGHT JOIN only exist in SQL99 and later standards. They do not exist in SQL92 and can only be represented by (+).

5. Full outer join

  • The result of full external connection = data matched by the left and right tables + data not matched by the left table + data not matched by the right table.
  • SQL99 supports full external connections. Use FULL JOIN or FULL OUTER JOIN to implement.
  • It should be noted that MySQL does not support FULL JOIN, but LEFT JOIN UNION RIGHT JOIN can be used instead.

6. Practical operation

-- Exercise: query all employees last_name,department_name information 

#SQL92 syntax inner join
SELECT last_name,department_name
FROM employees,departments
WHERE employees.department_id=departments.department_id;

#SQL92 syntax outer join: Using+


#SQL99 syntax inner connection
SELECT last_name,department_name
FROM employees INNER JOIN departments
ON employees.department_id=departments.department_id;#Article 106

#SQL99 syntax left outer connection
SELECT last_name,department_name
FROM employees LEFT JOIN departments
ON employees.department_id=departments.department_id;#Article 107

#SQL99 syntax right outer connection
SELECT last_name,department_name
FROM employees RIGHT JOIN departments
ON employees.department_id=departments.department_id;#Article 122

#SQL99 syntax full external connection, MySQL does not support FULL JOIN, so the execution failed
SELECT last_name,department_name
FROM employees  FULL JOIN departments
ON employees.department_id=departments.department_id;

4, Use of UNION

Merge query results
Using the UNION keyword, you can give multiple SELECT statements and combine their results into a single result set. When merging, the number of columns and data types corresponding to the two tables must be the same and correspond to each other. SELECT statements are separated by the UNION or UNION ALL keyword.

Syntax format:

SELECT column,... FROM table1
UNION [ALL]
SELECT column,... FROM table2

The UNION operator returns the UNION of the result sets of two queries to remove duplicate records.

The UNION ALL operator returns the union of the result sets of two queries. For the duplicate parts of the two result sets, the duplicate is not removed.

Note: less resources are required to execute the UNION ALL statement than the UNION statement. If it is clear that there is no duplicate data in the result data after merging data, or there is no need to remove the duplicate data, try to use the UNION ALL statement to improve the efficiency of data query.

For example: query employee information with department number > 90 or email containing a

SELECT * FROM employees  WHERE email LIKE '%a%'
UNION
SELECT * FROM employees  WHERE department_id>90;

Query the information of male users in China and middle-aged male users in the United States

SELECT id,cname FROM t_chinamale WHERE csex='male'
UNION ALL
SELECT id,tname FROM t_usmale WHERE tGender='male';

5, Implementation of 7 kinds of SQL JOINS

code implementation

Number of data entries in these two tables

Middle figure: inner connection A ∩ B

SELECT employee_id,last_name,department_name
FROM employees e INNER JOIN departments d
on e.department_id = d.department_id;#Article 106

Top left: left outer connection

SELECT employee_id,last_name,department_name
FROM employees e left JOIN departments d
on e.department_id = d.department_id; #Article 107

Top right: right outer connection

SELECT employee_id,last_name,department_name
FROM employees e RIGHT JOIN departments d
on e.department_id = d.department_id; #Article 122

Middle left: A - A ∩ B

SELECT employee_id,last_name,department_name
FROM employees e LEFT JOIN departments d
on e.department_id = d.department_id
WHERE e.department_id IS NULL; # Article 1

Middle right: B-A ∩ B

SELECT employee_id,last_name,department_name
FROM employees e RIGHT JOIN departments d
ON e.`department_id` = d.`department_id`
WHERE e.`department_id` IS NULL

Lower left figure: full external connection = middle left figure + upper right figure A ∪ B
That is, left join query and right join query through unino connection

SELECT employee_id,last_name,department_name
FROM employees e LEFT JOIN departments d
ON e.`department_id` = d.`department_id`
WHERE d.`department_id` IS NULL
UNION ALL  #No weight removal operation, high efficiency
SELECT employee_id,last_name,department_name
FROM employees e RIGHT JOIN departments d
ON e.`department_id` = d.`department_id`;

Bottom right
Equivalent to middle left figure + middle right figure A ∪ B- A ∩ B or (A - A ∩ B) ∪ (B- A ∩ B)

SELECT employee_id,last_name,department_name
FROM employees e LEFT JOIN departments d
ON e.`department_id` = d.`department_id`
WHERE d.`department_id` IS NULL
UNION ALL
SELECT employee_id,last_name,department_name
FROM employees e RIGHT JOIN departments d
ON e.`department_id` = d.`department_id`
WHERE e.`department_id` IS NULL

Grammar summary

  • Left middle picture
#Achieve a - a ∩ B
select Field list
from A surface left join B surface
on Association condition
where Associate fields from table is null and Other clauses such as;
  • Right middle figure
#Realize B - a ∩ B
select Field list
from A surface right join B surface
on Association condition
where Associate fields from table is null and Other clauses such as;
  • Bottom left
#The query result is A ∪ B
#Use A outside left, union B outside right
select Field list
from A surface left join B surface
on Association condition
where Other clauses such as

union 

select Field list
from A surface right join B surface
on Association condition
where Other clauses such as;
  • Bottom right
#Implement A ∪ B - A ∩ B or (A - A ∩ B) ∪ (B - A ∩ B)
#Use left outer (a - a ∩ b) union right outer (B - A ∩ b)
select Field list
from A surface left join B surface
on Association condition
where Associate fields from table is null and Other clauses such as

union

select Field list
from A surface right join B surface
on Association condition
where Associate fields from table is null and Other clauses such as

6, New features of SQL99 syntax (just understand)

Natural connection

SQL99 provides some special syntax based on SQL92. For example, NATURAL JOIN is used to represent NATURAL JOIN. We can understand natural connection as equivalent connection in SQL92. It will help you automatically query all the same fields in the two connection tables, and then make equivalent connection.

Note: there will be several fields corresponding to the conditions of several fields, which must meet the equivalent conditions

In the SQL92 standard:

SELECT employee_id,last_name,department_name
FROM employees e JOIN departments d
ON e.`department_id` = d.`department_id`
AND e.`manager_id` = d.`manager_id`;

In SQL99, you can write:

SELECT employee_id,last_name,department_name
FROM employees e NATURAL JOIN departments d;

USING connection

When we connect, SQL99 also supports equivalent connection USING the fields with the same name in the USING specified data table. However, it can only be used together with JOIN, and it is not applicable to self connection. For example:

SELECT employee_id,last_name,department_name
FROM employees e JOIN departments d
USING (department_id);

You can see that different from the NATURAL JOIN, USING specifies the specific same field name. You need to fill in the * * same name * * field to be specified in the brackets () of USING. Use join at the same time USING can simplify the equivalent connection of JOIN ON. It is the same as the following SQL query results:

SELECT employee_id,last_name,department_name
FROM employees e ,departments d
WHERE e.department_id = d.department_id;

7. Chapter summary

There are three ways to constrain table joins: where, on, and using

  • WHERE: applicable to all associated queries

  • ON: can only be used with JOIN, and can only write association conditions. Although Association conditions can be written together with other conditions in WHERE, it is better to write them separately.

  • USING: it can only be used with JOIN, and the names of the two associated fields are required to be the same in the associated table, and it can only mean that the values of the associated fields are equal

#Association condition
#Write the association condition after where
SELECT last_name,department_name 
FROM employees,departments 
WHERE employees.department_id = departments.department_id;

#Write the association condition after on, which can only be used with JOIN
SELECT last_name,department_name 
FROM employees INNER JOIN departments 
ON employees.department_id = departments.department_id;

SELECT last_name,department_name 
FROM employees CROSS JOIN departments 
ON employees.department_id = departments.department_id;

SELECT last_name,department_name  
FROM employees JOIN departments 
ON employees.department_id = departments.department_id;

#Write the associated field in using(), which can only be used with JOIN
#Moreover, the associated fields in the two tables must have the same name and can only represent=
#Query employee name and basic salary
SELECT last_name,job_title
FROM employees INNER JOIN jobs USING(job_id);

#n tables are associated, and n-1 Association conditions are required
#Query employee name, basic salary and department name
SELECT last_name,job_title,department_name FROM employees,departments,jobs 
WHERE employees.department_id = departments.department_id 
AND employees.job_id = jobs.job_id;

SELECT last_name,job_title,department_name 
FROM employees INNER JOIN departments INNER JOIN jobs 
ON employees.department_id = departments.department_id 
AND employees.job_id = jobs.job_id;

be careful:

We need to control the number of join tables. Multi table joins are like nested for loops. They consume resources and degrade the performance of SQL queries. Therefore, do not join unnecessary tables. In many DBMS, there is also the limit of the maximum connection table.

[mandatory] join is prohibited for more than three tables. The data types of the fields that need to be joined are absolutely consistent; In multi table Association query, ensure that the associated fields need to have indexes.

Note: pay attention to the table index and SQL performance even if the double table join is used.

Source: Alibaba Java development manual

Topics: Database MySQL SQL