MySql connection query - internal connection and external connection

Posted by chipev on Sat, 05 Mar 2022 05:34:47 +0100

introduction

In actual business development, data interaction between multiple tables is often involved. At this time, single table query can no longer meet complex business requirements, so multi table connection query is needed;

Connection query is mainly divided into the following three types:

  • Inner connection
    • Equivalent connection
    • Non equivalent connection
    • Self connection
  • External connection
    • Left outer connection
    • Right outer connection
  • Full connection

Among them, internal connection and external connection are commonly used. Full connection is rarely used, so it is not introduced here;

Database table portal

The following example will use three tables:

emp employee form

+-------+--------+-----------+------+------------+---------+---------+--------+
| EMPNO | ENAME  | JOB       | MGR  | HIREDATE   | SAL     | COMM    | DEPTNO |
+-------+--------+-----------+------+------------+---------+---------+--------+
|  7369 | SMITH  | CLERK     | 7902 | 1980-12-17 |  800.00 |    NULL |     20 |
|  7499 | ALLEN  | SALESMAN  | 7698 | 1981-02-20 | 1600.00 |  300.00 |     30 |
|  7521 | WARD   | SALESMAN  | 7698 | 1981-02-22 | 1250.00 |  500.00 |     30 |
|  7566 | JONES  | MANAGER   | 7839 | 1981-04-02 | 2975.00 |    NULL |     20 |
|  7654 | MARTIN | SALESMAN  | 7698 | 1981-09-28 | 1250.00 | 1400.00 |     30 |
|  7698 | BLAKE  | MANAGER   | 7839 | 1981-05-01 | 2850.00 |    NULL |     30 |
|  7782 | CLARK  | MANAGER   | 7839 | 1981-06-09 | 2450.00 |    NULL |     10 |
|  7788 | SCOTT  | ANALYST   | 7566 | 1987-04-19 | 3000.00 |    NULL |     20 |
|  7839 | KING   | PRESIDENT | NULL | 1981-11-17 | 5000.00 |    NULL |     10 |
|  7844 | TURNER | SALESMAN  | 7698 | 1981-09-08 | 1500.00 |    0.00 |     30 |
|  7876 | ADAMS  | CLERK     | 7788 | 1987-05-23 | 1100.00 |    NULL |     20 |
|  7900 | JAMES  | CLERK     | 7698 | 1981-12-03 |  950.00 |    NULL |     30 |
|  7902 | FORD   | ANALYST   | 7566 | 1981-12-03 | 3000.00 |    NULL |     20 |
|  7934 | MILLER | CLERK     | 7782 | 1982-01-23 | 1300.00 |    NULL |     10 |
+-------+--------+-----------+------+------------+---------+---------+--------+

dept department table:

+--------+------------+----------+
| DEPTNO | DNAME      | LOC      |
+--------+------------+----------+
|     10 | ACCOUNTING | NEW YORK |
|     20 | RESEARCH   | DALLAS   |
|     30 | SALES      | CHICAGO  |
|     40 | OPERATIONS | BOSTON   |
+--------+------------+----------+

Salary scale of salgrad;

+-------+-------+-------+
| GRADE | LOSAL | HISAL |
+-------+-------+-------+
|     1 |   700 |  1200 |
|     2 |  1201 |  1400 |
|     3 |  1401 |  2000 |
|     4 |  2001 |  3000 |
|     5 |  3001 |  9999 |
+-------+-------+-------+

Cartesian product phenomenon

I believe many people have heard of this concept. It is a mathematical concept, which roughly means:

Cartesian product refers to two sets in mathematics X and Y Cartesian product of( Cartesian product),Also known as direct product,
Expressed as X * Y ,The first object is X And the second object is Y One of the members of all possible ordered pairs of

In the database, a set is actually a table, so the Cartesian product phenomenon in the database can be explained as follows:

Cartesian product, also known as cross join,yes SQL A way to connect two tables in. 
If A The data in the table is m that 's ok, B The data in the table is n OK, then A and B Do the Cartesian product and the result is m * n that 's ok

Now if you want to query the Department name (DNAME) corresponding to each employee (ENAME), it involves the query between emp and dept. if you write the following sql:

SELECT 
	e.ENAME, d.DNAME 
FROM 
	emp e, dept d;

The query result is:

+--------+------------+
| ENAME  | DNAME      |
+--------+------------+
| SMITH  | ACCOUNTING |
| SMITH  | RESEARCH   |
| SMITH  | SALES      |
| SMITH  | OPERATIONS |
| ALLEN  | ACCOUNTING |
| ALLEN  | RESEARCH   |
| ALLEN  | SALES      |
| ALLEN  | OPERATIONS |
...............Wait, a total of 56 lines
56 rows in set (0.00 sec)

It can be found that the query result does not meet the requirements. The actual result is that the data of each row in the emp table matches all the data in the dept table, so the final total row number is:

emp table rows (14) * dept table rows (4)

This is the Cartesian product phenomenon. The reason for this phenomenon is very simple. Because the connection of two tables does not set connection conditions, the result returned by the table relationship without connection conditions is Cartesian product;

To avoid this phenomenon, you can use the WHERE clause to establish a connection relationship:

SELECT 
	e.ENAME, d.DNAME 
FROM 
	emp e, dept d 
WHERE 
	e.DEPTNO=d.DEPTNO;

The query result is:

+--------+------------+
| ENAME  | DNAME      |
+--------+------------+
| SMITH  | RESEARCH   |
| ALLEN  | SALES      |
| WARD   | SALES      |
| JONES  | RESEARCH   |
| MARTIN | SALES      |
| BLAKE  | SALES      |
| CLARK  | ACCOUNTING |
| SCOTT  | RESEARCH   |
| KING   | ACCOUNTING |
| TURNER | SALES      |
| ADAMS  | RESEARCH   |
| JAMES  | SALES      |
| FORD   | RESEARCH   |
| MILLER | ACCOUNTING |
+--------+------------+
14 rows in set (0.00 sec)

This is the result we need;

But remember one thing: Although the use of WHERE clause can avoid Cartesian product, it will not reduce the number of matches in the query process, but only display the effective data;

(meaning: after using the WHERE clause, the query result has only 14 rows, but it actually matches 56 times in the query process)

Inner connection

Equivalent connection

The connection method used above is the equivalent connection in the internal connection, so the internal connection is based on the test of equality between two tables. Generally speaking, the condition between two tables is an equal relationship;

But the above syntax is out of date. It's 92 syntax. Now we generally use 99 syntax. We generally need to use inner join On connects two tables, so the above sql can be written as:

SELECT 
	e.ENAME, d.DNAME 
FROM 
	emp e 
INNER JOIN 
	dept d 
ON 
	e.DEPTNO=d.DEPTNO;

The result is the same; The INNER here can not be written. The default is internal connection, but it is more readable;

Condition judgment here uses ON to judge. The conditions passed to ON are the same as those passed to WHERE; Moreover, a WHERE judgment statement can be added after ON: a JOIN b ON condition 1 WHERE condition 2

Non equivalent connection

The main characteristics of non equivalent connection are: the relationship in the connection conditions is non equivalent relationship;

For example: find out the salary grade of each employee and display the employee name, salary and salary grade

You need to use the emp table and the salgrade table. It is very simple. You only need to use the inner connection to connect the two tables and find out the grade in the salgrade table where the salary is greater than the minimum wage and less than the maximum wage;

SELECT
	e.ENAME,e.SAL,s.GRADE
FROM
	emp e
INNER JOIN
	salgrade s
ON
	e.SAL BETWEEN s.LOSAL AND s.HISAL;

Query results:

+--------+---------+-------+
| ENAME  | SAL     | GRADE |
+--------+---------+-------+
| SMITH  |  800.00 |     1 |
| ALLEN  | 1600.00 |     3 |
| WARD   | 1250.00 |     2 |
| JONES  | 2975.00 |     4 |
| MARTIN | 1250.00 |     2 |
| BLAKE  | 2850.00 |     4 |
| CLARK  | 2450.00 |     4 |
| SCOTT  | 3000.00 |     4 |
| KING   | 5000.00 |     5 |
| TURNER | 1500.00 |     3 |
| ADAMS  | 1100.00 |     1 |
| JAMES  |  950.00 |     1 |
| FORD   | 3000.00 |     4 |
| MILLER | 1300.00 |     2 |
+--------+---------+-------+
14 rows in set (0.00 sec)

Self connection

Self join is a bit special. It is a query based on one table, but it uses this table as two tables; As the name suggests, it is to connect yourself;

The following is an example: find out the superior leader of each employee and display the employee name and the corresponding leader name

In the employee table emp, each employee has a corresponding superior (except KING employee), so this table can be regarded as two tables according to different meanings:

If the employee table and the superior table are regarded as which table, the other parts will be ignored and left out of control;

SELECT
	a.ENAME AS 'staff',b.ENAME AS 'leader'
FROM
	emp a
INNER JOIN
	emp b
ON
	a.MGR = b.EMPNO;

The constraint is to make the employee's leader number equal to the leader's employee number, so that one table can be operated as two tables;

Query results:

+--------+-------+
| staff   | leader  |
+--------+-------+
| SMITH  | FORD  |
| ALLEN  | BLAKE |
| WARD   | BLAKE |
| JONES  | KING  |
| MARTIN | BLAKE |
| BLAKE  | KING  |
| CLARK  | KING  |
| SCOTT  | JONES |
| TURNER | BLAKE |
| ADAMS  | SCOTT |
| JAMES  | BLAKE |
| FORD   | JONES |
| MILLER | CLARK |
+--------+-------+
13 rows in set (0.01 sec)

External connection

When associating a row in one table with a row in another table, the inner connection will skip the empty row. For example, in the example of finding an employee leader above, KING is the largest leader, so he has no leader and no data found. So how can he bring its data? This requires external connections;

External connection is divided into left external connection and right external connection, but the two connection methods are not very different. Here is a demonstration:

As an example of looking for employee leaders above, this time the additional requirement is to output employees without leaders:

SELECT
	a.ENAME AS 'staff',b.ENAME AS 'leader'
FROM
	emp a
LEFT JOIN
	emp b
ON
	a.MGR = b.EMPNO;

You can see that the INNER JOIN is no longer used here, but the LEFT JOIN is replaced by the OUTER join (equivalent to LEFT OUTER JOIN, which can be ignored). It means that the table on the left of the LEFT JOIN is used as the main table and the table on the right is used as the sub table. As a result, all contents of the main table will be found, and the contents of the sub table depend on the contents of the main table;

Query results:

+--------+-------+
| staff   | leader  |
+--------+-------+
| SMITH  | FORD  |
| ALLEN  | BLAKE |
| WARD   | BLAKE |
| JONES  | KING  |
| MARTIN | BLAKE |
| BLAKE  | KING  |
| CLARK  | KING  |
| SCOTT  | JONES |
| KING   | NULL  | Pay attention here
| TURNER | BLAKE |
| ADAMS  | SCOTT |
| JAMES  | BLAKE |
| FORD   | JONES |
| MILLER | CLARK |
+--------+-------+
14 rows in set (0.00 sec)

You can see that the content (KING) not in the secondary table automatically generates NULL;

This is the left outer join. Similarly, the right join is the main table on the right and the sub table on the left. This query can also be written as follows:

SELECT
	a.ENAME AS 'staff',b.ENAME AS 'leader'
FROM
	emp b
RIGHT JOIN
	emp a
ON
	a.MGR = b.EMPNO;

To summarize briefly:

  • The main table plays a dominant role in query, and the contents of the main table can be found;

  • The secondary table is attached to the primary table during query. If the content of the primary table does not correspond to the secondary table, NULL is generated;

summary

External connection can be divided into main table and auxiliary table, while internal connection can not be divided into main table and auxiliary table

In the example, only the connection of two tables is designed, and the same is true when connecting multiple tables for query. For example, table 1 JOIN table 2 ON condition 1 JOIN table 3 ON condition 2 JOIN table 4 ON condition 3

The actual business will be very complex, so we need to find the relationship between tables after splitting, and analyze it step by step;

When to use internal connection and when to use external connection? It only needs to be distinguished by foreign keys:

  • If the foreign key cannot be empty, the internal connection is preferred;
  • If the foreign key can be empty:
    • If you only need to query those corresponding records in another table, use inner connection;
    • If you need to query all qualified records in the left (right) side table, use the left (right) external connection;

Note: whether the foreign key can be empty is determined by the design of the database table. Because the foreign key is only a field of the database table, it needs to be analyzed according to the actual situation;

Topics: Java Database MySQL SQL