MySQL topic 5 - sub query

Posted by rahulephp on Sun, 06 Feb 2022 21:47:05 +0100

Subquery

give an example

Find out the name and salary of employees with higher salary than Abel

Method 1: multiple queries

SELECT salary `salary_abel` FROM employees WHERE last_name = 'Abel';
SELECT last_name, salary FROM employees WHERE salary > salary_abel;

Mode 2 self connection

SELECT e2.last_name, e2.salary FROM employees e1 , employees e2 WHERE e2.salary > e1.salary AND e1.last_name = 'Abel';

Method 3 sub query

SELECT last_name,salary FROM employees WHERE salary > (
	SELECT salary FROM employees WHERE last_name = 'Abel'
);

Subquery

Normative appellation

External query (or primary query) / internal query (or sub query)

Use of subqueries

The subquery is executed before the main query

The results of the sub query are used by the main query (external query)

matters needing attention

  • Subqueries should be enclosed in parentheses
  • Place the subquery on the right side of the comparison condition (to ensure readability and aesthetics)
  • Single row operators correspond to single row subqueries, and multi row operators correspond to multi row subqueries

Classification of subqueries

Angle 1: single row subquery VS multi row subquery

The sub query is divided into single line sub query and multi line sub query according to whether one or more records are returned from the results of internal query

Angle 2: related (or associated) sub query and unrelated (or non associated) sub query

Sub queries are divided into related (or related) sub queries and unrelated (or non related) sub queries according to whether the internal query has been executed multiple times

For example, the demand of related sub query: query the information of employees whose salary is greater than the average salary of the Department (external query and internal query are related, and the execution times of internal query are related to external query)

Requirements for unrelated sub query: query employee information whose salary is greater than the average salary of the company (external query and internal query are not related, and all external queries can be completed by executing internal query at one time)

single-row subqueries

Single line operator

= != > >= < <=

subject

Query and employee manager 141_ ID and department_ Employees of other employees with the same ID_ id, manager_id, department_id

Analysis: the same manager as 141_ id , department_ id ; Other employees;

answer:

Mode 1:

SELECT employee_id, manager_Id, department_id FROM employees 
WHERE 
manager_id = (SELECT manager_id FROM employees WHERE employee_id = 141)
AND 
department_id = (SELECT department_id FROM employees WHERE employee_id = 141)
AND 
employee_id <> 141;#other

Mode 2:

SELECT employee_id, manager_Id, department_id FROM employees 
WHERE (manager_id, department_id) = (SELECT manager_id, department_id FROM employees WHERE employee_id = 141)
AND employee_id <> 141;

The difference between mode 1 and mode 2:

Mode 1 has a wide range of application and slightly low efficiency; The application scope of mode 2 is narrow (sub query conditions can be merged) and the efficiency is slightly higher

Title: subquery in HAVING

Query the Department id and the minimum wage of the Department whose minimum wage is greater than the minimum wage of No. 50 department

SELECT department_id, MIN(salary)
FROM employees 
GROUP BY department_id
HAVING MIN(salary) > 
	(SELECT MIN(salary) FROM employees WHERE department_id = 50 );

be careful

GROUP BY cannot be forgotten (if there is one in SELECT, there must be one in GROUP BY; otherwise, there is no need!)

Sub question in CASE query

Displays the employee of the employee_id,last_name and location. Where, if the employee department_id and location_ Department with ID 1800_ If the ID is the same, the location is Canada and the rest is USA

SELECT employee_id,last_name,
(CASE department_id WHEN 
(SELECT department_id FROM departments WHERE location_id = 1800)
THEN 'Canada' ELSE 'USA' END) location
FROM employees
);

be careful

Null value problem of single line subquery

SELECT last_name, job_id FROM employees
WHERE job_id = 
(
	SELECT job_id FROM employees WHERE last_name = 'Haas'
);

The subquery does not return any rows

Multiline subquery

Multiline subquery

  • Also known as set comparison subquery
  • Multiple rows returned by internal query
  • Using multiline comparison operators

Operator comparison multiline

Operatormeaning
INEqual to any one in the list
ANYIt needs to be used together with the single line comparison operator to compare with any value returned by the sub query
ALLIt needs to be used together with the single line comparison operator to compare with all the values returned by the sub query
SOMEIn fact, it is an alias of ANY, which has the same function. Generally, ANY is often used

subject

Query other jobs_ Job in ID_ ID is' it '_ Prog 'employee number / name / job of any low paid employee in the Department_ ID and salary

SELECT employee_id, last_name,job_id,salary
FROM employees
WHERE salary < ANY
		(SELECT salary FROM employees WHERE job_id = 'IT_PROG')
AND job_id <> 'IT_PROG';

subject

Query other jobs_ Job in ID_ ID is' it '_ Prog 'employee number / name / job of all employees in the Department with low salary_ ID and salary

SELECT employee_id, last_name,job_id,salary
FROM employees
WHERE salary < ALL
		(SELECT salary FROM employees WHERE job_id = 'IT_PROG')
AND job_id <> 'IT_PROG';

subject

Query the Department id with the lowest average wage

#Mode 1:
SELECT department_id FROM employees
GROUP BY department_id
HAVING AVG(salary) = 
(
	SELECT MIN(avg_sal)
    FROM 
    (
    	SELECT AVG(salary) avg_sal
        FROM employees
        GROUP BY department_id        
    ) dept_avg_sal
)
#Mode 2:
SELECT department_id FROM employees
GROUP BY department_id
HAVING AVG(salary) <= ALL 
(
	SELECT AVG(salary) avg_sal
    FROM employees
    GROUP BY department_id
)

Null value problem of multi row subquery

SELECT last_name FROM employees
WHERE employee_id NOT IN 
(
	SELECT manager_id FROM employees
);

Related sub query

If the execution of a sub query depends on an external query, it is usually because the tables in the sub query use external tables and are conditionally associated. Therefore, every time an external query is executed, the sub query must be recalculated. Such a sub query is called an associated sub query.

Related sub queries are executed in row by row order. Each row of the main query executes a sub query.

# The columns in the main query are used in the subquery
SELECT column1,column2,...
FROM table1 `outer`
WHERE column1 operator
(
	SELECT column1, column2
    FROM table2
    WHERE expr1 = `outer`.expr2
)

subject

Query the last of employees whose salary is greater than the average salary of the Department_ Name, salary and its department_id

#Method 1: related sub query
SELECT last_name, salary, department_id
FROM employees outer
WHERE salary >
(
	SELECT AVG(salary)
    FROM employees
    WHERE department_id = outer.department_id
);
#Method 2: use sub query in FROM
SELECT last_name, salary, e1.department_id
FROM employees e1,
(SELECT department_id, AVG(salary) dept_avg_sal FROM employees GROUP BY department_id) e2
WHERE e1.`department_id` = e2.`department_id`
AND e2.dept_avg_sal < e1.`salary`;

subject

Query the employee's ID and salary according to department_name sort

SELECT employee_id, salary
FROM employees e
ORDER BY 
(
	SELECT department_name
    FROM departments d
    WHERE e.`department_id` = d.`department_id`
);

subject

If employees in the employees table_ ID and job_ Employee in history table_ The number of employees with the same ID is not less than 2, and the employees of these employees with the same ID are output_ id,last_ Name and its job_id

SELECT e.employee_id, last_name, e.job_id
FROM employees e
WHERE 2 <= 
(
	SELECT COUNT(*) FROM job_history
    WHERE employee_id = e.employee_id
);

EXISTS and NOT EXISTS keywords

  • The key subquery is usually used together with the EXISTS operator to check whether there are qualified rows in the subquery.

  • If there are no qualified rows in the subquery:

    • Condition returns FALSE
    • Continue to find in subquery
  • If there are qualified rows in the subquery:

    • Do not continue to find in subquery
    • The condition returns TRUE
  • NOT EXISTS keyword indicates that if there is no condition, it returns TRUE; otherwise, it returns FALSE

subject

Query the employees of company managers_ id,last_ name,job_id, department_id information

# Mode 1:
SELECT employee_id, last_name, job_id, department_id
FROM employees e1
WHERE EXISTS 
(
	SELECT * FROM employees e2 WHERE e2.manager_id = e1.employee_id
);
# Mode 2: self connection
SELECT DISTINCT e1.employee_id, e1.last_name, e1.job_id, e1.department_id
FROM employees e1 JOIN employees e2
WHERE e1.employee_id = e2.manager_id;
# Mode 3:
SELECT DISTINCT e1.employee_id, e1.last_name, e1.job_id, e1.department_id
FROM employees e1 WHERE e1.employee_id IN 
(
	SELECT DISTINCT manager_id FROM employees
);

subject

Query the departments of departments that do not exist in the employees table in the departments table_ ID and department_name

SELECT department_id, department_name
FROM departments d
WHERE NOT EXISTS 
(
	SELECT 'X'
    FROM employees
    WHERE department_id = d.department_id
);

Related updates

UPDATE table1 alias1
SET column = 
(
	SELECT expression FROM table2 alias2
    WHERE alias1.column = alias2.column
);

Use related subqueries to update data from one table to another

subject

Add a department in employees_ Name field, the data is the Department name corresponding to the employee

# 1)
ALTER TABLE employees
ADD(department_name VARCHAR2(14));
# 2)
UPDATE employees e
SET department_name = 
(
	SELECT department_name
    FROM departments d
    WHERE e.department_id = d.department_id
);

Related deletion

DELETE FROM table1 alias1
WHERE column operator
(
	SELECT expression FROM table2 alias2
    WHERE alias1.column = alias2.column
);

Use related subqueries to delete data from one table based on data from another

DELETE FROM employees e
WHERE employee_id IN
(
	SELECT employee_id FROM emp_history
    WHERE employee_id = e.employee_id
);

subject

Whose salary is higher than Abel?

# Mode 1: self connection
SELECT e2.last_name,e2.salary
FROM employees e1,employees e2
WHERE e1.last_name = 'Abel'
AND e1.`salary` < e2.`salary`
# Method 2: sub query
SELECT last_name,salary
FROM employees
WHERE salary > (
        SELECT salary
        FROM employees
        WHERE last_name = 'Abel'
);

Question: are the above two methods good or bad?

Answer: good self connection mode!
You can use subquery or self connection in the topic. In general, it is recommended that you use self connection, because in the processing of many DBMS, the processing speed of self connection is much faster than that of sub query.
It can be understood as follows: the sub query is actually the condition judgment after the query through the unknown table, while the self connection is the condition judgment through the known self data table. Therefore, the self connection processing is optimized in most DBMS.

Topics: MySQL