SQL training camp -- Task03: complex query methods - views, subqueries, functions, etc

Posted by duclet on Tue, 18 Jan 2022 06:26:14 +0100

1 View

1.1 what is a view

A view is a virtual table, which is different from a direct operation data table. A view is created according to a SELECT statement. Therefore, when operating a view, a virtual table will be generated according to the SELECT statement that created the view, and then SQL operations will be performed on this virtual table.

1.2 difference between view and table

The basic sql tutorial * * version 2 summarizes the difference between views and tables - "whether the actual data is saved". Therefore, we should distinguish the essence of view and data table, that is, view is a virtual table based on real table, and its data sources are based on real table.

Image source: sql foundation tutorial version 2

Relationship between view and table: "view is not a table, view is a virtual table, and view depends on table".

1.3 why are there views

Now that you have a data table, why do you need a view? The main reasons are as follows:

  1. By defining views, you can save frequently used SELECT statements to improve efficiency.
  2. By defining views, users can see data more clearly.
  3. By defining the view, you can not disclose all the fields of the data table, so as to enhance the confidentiality of the data.
  4. Data redundancy can be reduced by defining views.

1.4 how to create a view

The basic syntax for creating a view is as follows:

CREATE VIEW <View name>(<Column name 1>,<Column name 2>,...)
AS <SELECT sentence>

The SELECT statement needs to be written after the AS keyword. The columns in the SELECT statement are arranged in the same order AS the columns in the view. The first column in the SELECT statement is the first column in the view, the second column in the SELECT statement is the second column in the view, and so on. Moreover, the column name of the view is defined in the list after the view name.
It should be noted that the view name must be unique in the database and cannot be the same as other views and tables.

Views can not only be based on real tables, but we can also continue to create views based on views.

Image source: sql foundation tutorial version 2

Although there is no syntax error in continuing to create views on views, we should try to avoid this operation. This is because for most DBMS, multiple views degrade SQL performance.

  • matters needing attention

It should be noted that the ORDER BY statement cannot be used when defining views in a general DBMS. It is wrong to define a view like this.

CREATE VIEW productsum (product_type, cnt_product)
AS
SELECT product_type, COUNT(*)
  FROM product
 GROUP BY product_type
 ORDER BY product_type;

Why can't you use the ORDER BY clause? This is because, like a view and a table, data rows are out of order.

In MySQL, the definition of a view allows the use of ORDER BY statements. However, if you select from a specific view and the view uses its own ORDER BY statement, the ORDER BY in the view definition will be ignored.

  • Single table based view

We create a view based on the product table, as follows:

CREATE VIEW productsum (product_type, cnt_product)
AS
SELECT product_type, COUNT(*)
  FROM product
 GROUP BY product_type ;

The created view is shown in the following figure:

  • Multi table based view

We are in the product table and shop_ Create a view based on the product table.

CREATE VIEW view_shop_product(product_type, sale_price, shop_name)
AS
SELECT product_type, sale_price, shop_name
  FROM product,
       shop_product
 WHERE product.product_id = shop_product.product_id;

The created view is shown in the following figure

We can query based on this view

SELECT sale_price, shop_name
  FROM view_shop_product
 WHERE product_type = 'clothes';

The query result is:

1.5 how to modify view structure

The basic syntax for modifying the view structure is as follows:

ALTER VIEW <View name> AS <SELECT sentence>

The view name must be unique in the database and cannot be the same as other views and tables.
Of course, you can also delete and recreate the current view to achieve the effect of modification. (is this the same for the underlying database? You can explore it yourself.)

  • Modify view

We modify the product sum view above to

ALTER VIEW productSum
    AS
        SELECT product_type, sale_price
          FROM Product
         WHERE regist_date > '2009-09-11';

The contents of the productSum view are shown in the following figure

1.6 how to update view content

Because the view is a virtual table, the operation on the view is the operation on the underlying basic table. Therefore, the modification can be successful only if the definition of the underlying basic table is met.

For a view, any of the following structures cannot be updated:

  • Aggregate functions SUM(), MIN(), MAX(), COUNT(), etc.
  • DISTINCT keyword.
  • GROUP BY clause.
  • HAVING clause.
  • UNION or UNION ALL operator.
  • The FROM clause contains more than one table.

In the final analysis, the view is derived from the table. Therefore, if the original table can be updated, the data in the view can also be updated. Vice versa, if the view changes and the original table is not updated accordingly, the data consistency cannot be guaranteed.

  • update the view

Because the product sum view we just modified does not include the above restrictions, let's try to update the view

UPDATE productsum
   SET sale_price = '5000'
 WHERE product_type = 'Office Supplies';

At this point, we can check the product sum view and find that the data has been updated

At this time, you can also find that the data has been updated by observing the original table

I wonder if you will have any questions when you see this result. When you just modified the view, you set product_ Sales of goods with type = 'office supplies'_ Price = 5000, why is only one piece of data in the original table modified?

Or because of the definition of view, the view is only a window of the original table, so it can only modify the contents that can be seen through the window.

Note: Although the modification here is successful, it is not recommended. In addition, we also try to use restrictions when creating views. It is not allowed to modify tables through views

1.7 how to delete a view

The basic syntax for deleting a view is as follows:

DROP VIEW <View name 1> [ , <View name 2> ...]

Note: you need corresponding permissions to successfully delete.

  • Delete view

We delete the productSum view we just created

DROP VIEW productSum;

If we continue to operate this view, we will prompt that the content of the current operation does not exist.

2 sub query

Let's take a look at a statement (for example only, no relevant data is provided)

SELECT stu_name
FROM (
         SELECT stu_name, COUNT(*) AS stu_cnt
          FROM students_info
          GROUP BY stu_age) AS studentSum;

This statement looks easy to understand. The sql statement enclosed in parentheses is executed first, and then the external sql statement is executed after successful execution. However, the view mentioned in the previous section is also created according to the SELECT statement, and then queried on this basis. So what is a subquery? What is the relationship between subquery and view?

2.1 what is a subquery

Subquery refers to a query in which one query statement is nested inside another query statement. This feature was introduced from MySQL 4.1. The subquery is calculated first in the SELECT clause. The subquery result is used as the filter condition of another query in the outer layer. The query can be based on one table or multiple tables.

2.2 relationship between sub query and view

A subquery is to use the SELECT statement used to define the view directly in the FROM clause. Among them, AS studentSum can be regarded as the name of the sub query. Because the sub query is one-time, the sub query will not be saved in the storage medium like the view, but will disappear after the execution of the SELECT statement.

2.3 nested sub query

Similar to redefining views on views, sub queries have no specific restrictions. For example, we can do this

SELECT product_type, cnt_product
FROM (SELECT *
        FROM (SELECT product_type, 
                      COUNT(*) AS cnt_product
                FROM product 
               GROUP BY product_type) AS productsum
       WHERE cnt_product = 4) AS productsum2;

The innermost sub query is named productSum. This statement is based on product_type group and query the number. In the second level query, the number of products with 4 will be found, and the outermost layer will query product_type and cnt_product has two columns.
Although nested subqueries can produce results, with the superposition of nested layers of subqueries, SQL statements will not only be difficult to understand, but also have poor execution efficiency. Therefore, such use should be avoided as far as possible.

2.4 quantum query

Scalar means single, so scalar subquery is also a single subquery. What is a single subquery?

The so-called single SQL statement requires us to return only one value, that is, to return a column of a specific row in the table. For example, we have the following table

product_id | product_name | sale_price 
------------+-------------+----------
0003       | motion T Shirt       | 4000 
0004       | kitchen knife          | 3000 
0005       | pressure cooker        | 6800

Then, after executing a scalar subquery, we want to return results like "0004" and "kitchen knife".

2.5 what is the use of scalar quantum query

We now know that scalar quantum queries can return a value, so what is its role?

It may be difficult to think so directly. Let's look at several specific requirements:

  1. Find the goods whose sales unit price is higher than the average sales unit price
  2. Find out the product with the latest registration date

Do you have any ideas?

Let's see how to query the goods whose sales unit price is higher than the average sales unit price through scalar subquery statements.

SELECT product_id, product_name, sale_price
  FROM product
 WHERE sale_price > (SELECT AVG(sale_price) FROM product);

The above statement first queries the average selling price in the product table in the second half. The previous sql statement selects the appropriate goods according to the WHERE condition.
Due to the characteristics of scalar subquery, scalar subquery is not limited to WHERE clause. Generally, any location WHERE a single value can be used can be used. In other words, constants or column names can be used almost everywhere, whether in the SELECT clause, GROUP BY clause, HAVING clause, or ORDER BY clause.

We can also use scalar subqueries like this:

SELECT product_id,
       product_name,
       sale_price,
       (SELECT AVG(sale_price)
          FROM product) AS avg_price
  FROM product;

Can you guess what the result of this code is? Run it to see if the result is consistent with your imagination.

2.6 associated sub query

  • What is an associated subquery

Since the associated sub query contains the word association, it must mean that there is a relationship between the query and the sub query. How was this connection established?

Let's start with an example:

SELECT product_type, product_name, sale_price
  FROM product AS p1
 WHERE sale_price > (SELECT AVG(sale_price)
                       FROM product AS p2
                      WHERE p1.product_type = p2.product_type
   GROUP BY product_type);

Can you understand what this example is doing? Let's take a look at the execution results of this example

From the above example, we can guess that the associated sub query is to connect the internal and external queries through some flags to filter data. Next, let's take a look at the specific content of the associated sub query.

  • Relationship between associated subquery and subquery

Remember our previous example, query the goods whose sales unit price is higher than the average sales unit price. The SQL statement of this example is as follows

SELECT product_id, product_name, sale_price
  FROM product
 WHERE sale_price > (SELECT AVG(sale_price) FROM product);

Let's take another look at this demand and select the goods in each commodity category that are higher than the average sales unit price of the commodity category. The SQL statement is as follows:

SELECT product_type, product_name, sale_price
  FROM product ASp1
 WHERE sale_price > (SELECT AVG(sale_price)
   FROM product ASp2
                      WHERE p1.product_type =p2.product_type
   GROUP BY product_type);

Can you see the difference between the above two statements?
In the second SQL statement, that is, the associated sub query, we mark the external product table as p1, set the internal product table as p2, and connect the two queries through the WHERE statement.

However, if you have just touched it, you will be confused about the execution process of associated queries. Here is one Blog It was quite clear. Here we briefly summarize as follows:

  1. First, execute the main query without WHERE
  2. Match product according to the main query result_ Type to get sub query results
  3. Combine the sub query results with the main query to execute the complete SQL statement

In subquery, like scalar subquery, nested subquery or associated subquery can be regarded as an operation mode of subquery.

Summary

View and subquery are the basic contents of database operation. For some complex queries, you need to use the combination of subquery and some conditional statements to get the correct results. However, in any case, for an SQL statement, the number of layers that should not be designed is very deep and complex. It is not only poor in readability, but also difficult to ensure the execution efficiency. Therefore, try to have concise statements to complete the required functions.

3 function

sql comes with various functions, which greatly improves the convenience of sql language.

The so-called function is similar to a black box. If you give it an input value, it will give the return value according to the preset program definition. The input value is called a parameter.

Functions can be roughly divided into the following categories:

  • Arithmetic function (function used for numerical calculation)
  • String function (function used for string operation)
  • Date function (function used for date operation)
  • Conversion function (function used to convert data types and values)
  • Aggregate function (function used for data aggregation)

The total number of functions exceeds 200. You don't need to remember it completely. There are 30 ~ 50 commonly used functions. You can consult the document when using other infrequent functions.

3.1 arithmetic function

  • +- * / four operations have been introduced in previous chapters and will not be repeated here.
  • ABS – absolute value

Syntax: ABS (numeric)

ABS function is used to calculate the absolute value of a number, which represents the distance from a number to the origin.

When the parameter of the ABS function is NULL, the return value is also NULL.

  • MOD – remainder

Syntax: mod (dividend, divisor)

MOD is a function for calculating the division remainder (remainder), which is the abbreviation of modulo. Decimal has no concept of remainder. It can only find the remainder of integer column.

Note: the mainstream DBMS supports MOD function. Only SQL Server does not support this function. It uses the% symbol to calculate the remainder.

  • ROUND – ROUND

Syntax: round (object value, number of decimal places reserved)

The ROUND function is used for rounding.

Note: errors may be encountered when the parameter , keep decimal places , is a variable. Please use the variable with caution.

SELECT m,
ABS(m)ASabs_col ,
n, p,
MOD(n, p) AS mod_col,
ROUND(m,1)ASround_colS
FROM samplemath;
+----------+---------+------+------+---------+-----------+
| m        | abs_col | n    | p    | mod_col | round_col |
+----------+---------+------+------+---------+-----------+
|  500.000 | 500.000 |    0 | NULL |    NULL |     500.0 |
| -180.000 | 180.000 |    0 | NULL |    NULL |    -180.0 |
|     NULL |    NULL | NULL | NULL |    NULL |      NULL |
|     NULL |    NULL |    7 |    3 |       1 |      NULL |
|     NULL |    NULL |    5 |    2 |       1 |      NULL |
|     NULL |    NULL |    4 | NULL |    NULL |      NULL |
|    8.000 |   8.000 | NULL |    3 |    NULL |       8.0 |
|    2.270 |   2.270 |    1 | NULL |    NULL |       2.3 |
|    5.555 |   5.555 |    2 | NULL |    NULL |       5.6 |
|     NULL |    NULL |    1 | NULL |    NULL |      NULL |
|    8.760 |   8.760 | NULL | NULL |    NULL |       8.8 |
+----------+---------+------+------+---------+-----------+
11 rows in set (0.08 sec)

3.2 string function

String functions are also often used

  • CONCAT – splice

Syntax: CONCAT(str1, str2, str3)

CONCAT function is used for splicing in MySQL.

  • LENGTH – string LENGTH

Syntax: length (string)

  • LOWER – LOWER case conversion

The LOWER function can only be used for English letters. It converts all strings in the parameter to lowercase. This function is not applicable to situations other than English letters and does not affect characters that are originally lowercase.

Similarly, the UPPER function is used for uppercase conversion.

  • REPLACE – REPLACE string

Syntax: replace (object string, string before replacement, string after replacement)

  • SUBSTRING – interception of string

Syntax: SUBSTRING (the starting position of the object string intercepted FROM and the number of characters intercepted FOR)

You can use the SUBSTRING function to truncate a part of the string. The starting position of the interception is calculated from the leftmost side of the string, and the starting index value is 1.

  • (extended content) SUBSTRING_INDEX – the string is truncated by index

Syntax: SUBSTRING_INDEX (original string, separator, n)

This function is used to obtain the substring before (or after) the nth separator after the original string is divided according to the separator. It supports forward and reverse indexing, and the starting values of the index are 1 and - 1 respectively.

SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
+------------------------------------------+
| SUBSTRING_INDEX('www.mysql.com', '.', 2) |
+------------------------------------------+
| www.mysql                                |
+------------------------------------------+
1 row in set (0.00 sec)
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
+-------------------------------------------+
| SUBSTRING_INDEX('www.mysql.com', '.', -2) |
+-------------------------------------------+
| mysql.com                                 |
+-------------------------------------------+
1 row in set (0.00 sec)

It is easy to get the first element, and the second element / nth element can be written in the way of secondary splitting.

SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1);
+------------------------------------------+
| SUBSTRING_INDEX('www.mysql.com', '.', 1) |
+------------------------------------------+
| www                                      |
+------------------------------------------+
1 row in set (0.00 sec)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('www.mysql.com', '.', 2), '.', -1);
+--------------------------------------------------------------------+
| SUBSTRING_INDEX(SUBSTRING_INDEX('www.mysql.com', '.', 2), '.', -1) |
+--------------------------------------------------------------------+
| mysql                                                              |
+--------------------------------------------------------------------+
1 row in set (0.00 sec)

3.3 date function

The date function syntax of different DBMS is different. This course introduces some functions recognized by standard SQL that can be applied to most DBMS. Refer to the document for the date function of a specific DBMS.

  • CURRENT_DATE – gets the current date
SELECT CURRENT_DATE;
+--------------+
| CURRENT_DATE |
+--------------+
| 2020-08-08   |
+--------------+
1 row in set (0.00 sec)
  • CURRENT_TIME – current time
SELECT CURRENT_TIME;
+--------------+
| CURRENT_TIME |
+--------------+
| 17:26:09     |
+--------------+
1 row in set (0.00 sec)
  • CURRENT_TIMESTAMP – current date and time
SELECT CURRENT_TIMESTAMP;
+---------------------+
| CURRENT_TIMESTAMP   |
+---------------------+
| 2020-08-08 17:27:07 |
+---------------------+
1 row in set (0.00 sec)
  • EXTRACT – intercept date element

Syntax: extract (date element FROM date)

Use the EXTRACT function to EXTRACT a part of the date data, such as "year"

"Month", or "hour", "second", etc. The return value of this function is not a date type, but a numeric type

SELECT CURRENT_TIMESTAMP as now,
EXTRACT(YEAR   FROM CURRENT_TIMESTAMP) AS year,
EXTRACT(MONTH  FROM CURRENT_TIMESTAMP) AS month,
EXTRACT(DAY    FROM CURRENT_TIMESTAMP) AS day,
EXTRACT(HOUR   FROM CURRENT_TIMESTAMP) AS hour,
EXTRACT(MINUTE FROM CURRENT_TIMESTAMP) AS MINute,
EXTRACT(SECOND FROM CURRENT_TIMESTAMP) AS second;
+---------------------+------+-------+------+------+--------+--------+
| now                 | year | month | day  | hour | MINute | second |
+---------------------+------+-------+------+------+--------+--------+
| 2020-08-08 17:34:38 | 2020 |     8 |    8 |   17 |     34 |     38 |
+---------------------+------+-------+------+------+--------+--------+
1 row in set (0.00 sec)

3.4 conversion function

The word "conversion" has a very broad meaning. In SQL, it mainly has two meanings: one is data type conversion, which is called type conversion for short and cast in English; Another layer means value conversion.

  • CAST - type conversion

Syntax: CAST (value before conversion AS data type to be converted)

-- Converts a string type to a numeric type
SELECT CAST('0001' AS SIGNED INTEGER) AS int_col;
+---------+
| int_col |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)
-- Convert string type to date type
SELECT CAST('2009-12-14' AS DATE) AS date_col;
+------------+
| date_col   |
+------------+
| 2009-12-14 |
+------------+
1 row in set (0.00 sec)
  • COALESCE – converts NULL to another value

Syntax: coalesce (data 1, data 2, data 3...)

COALESCE is A SQL specific function. This function will return the first non NULL value from the left in variable parameter A. The number of parameters is variable, so it can be increased infinitely as needed.

The conversion function is used when converting NULL to other values in SQL statements.

SELECT COALESCE(NULL, 11) AS col_1,
COALESCE(NULL, 'hello world', NULL) AS col_2,
COALESCE(NULL, NULL, '2020-11-01') AS col_3;
+-------+-------------+------------+
| col_1 | col_2       | col_3      |
+-------+-------------+------------+
|    11 | hello world | 2020-11-01 |
+-------+-------------+------------+
1 row in set (0.00 sec)

4 predicate

4.1 what are predicates

A predicate is a function that returns a true value. Include TRUE / FALSE / UNKNOWN.

Predicates mainly include the following:

  • LIKE
  • BETWEEN
  • IS NULL,IS NOT NULL
  • IN
  • EXISTS

4.2 LIKE predicate – used for partial consistent query of strings

This predicate is used when a partially consistent query of a string is required.

Partial consistency can be divided into three types: front consistency, middle consistency and rear consistency.

First, let's create a table

-- DDL : Create table
CREATE TABLE samplelike
( strcol VARCHAR(6) NOT NULL,
PRIMARY KEY (strcol)
samplelike);
-- DML : insert data
START TRANSACTION; -- Start transaction
INSERT INTO samplelike (strcol) VALUES ('abcddd');
INSERT INTO samplelike (strcol) VALUES ('dddabc');
INSERT INTO samplelike (strcol) VALUES ('abdddc');
INSERT INTO samplelike (strcol) VALUES ('abcdd');
INSERT INTO samplelike (strcol) VALUES ('ddabc');
INSERT INTO samplelike (strcol) VALUES ('abddc');
COMMIT; -- Commit transaction
SELECT * FROM samplelike;
+--------+
| strcol |
+--------+
| abcdd  |
| abcddd |
| abddc  |
| abdddc |
| ddabc  |
| dddabc |
+--------+
6 rows in set (0.00 sec)
  • Consistent in front: select "dddabc"

Consistent in front, that is, the string as the query condition (here "ddd") is the same as the starting part of the query object string.

SELECT *
FROM samplelike
WHERE strcol LIKE 'ddd%';
+--------+
| strcol |
+--------+
| dddabc |
+--------+
1 row in set (0.00 sec)

Where% is a special symbol representing "zero or more arbitrary strings", and in this example represents "all strings starting with ddd".

  • Consistent in the middle: select "abcddd", "dddabc", "abddddc"

Consistent in the middle, that is, the query object string contains a string as the query condition, regardless of whether the string appears in the object word

It doesn't matter whether the last or the middle of the string.

SELECT *
FROM samplelike
WHERE strcol LIKE '%ddd%';
+--------+
| strcol |
+--------+
| abcddd |
| abdddc |
| dddabc |
+--------+
3 rows in set (0.00 sec)
  • Consistent in the rear: select "abcddd"“

The last part is consistent, that is, the string as the query condition (here "ddd") is the same as the end of the query object string.

SELECT *
FROM samplelike
WHERE strcol LIKE '%ddd';
+--------+
| strcol |
+--------+
| abcddd |
+--------+
1 row in set (0.00 sec)

Based on the above three types of queries, we can see that the query conditions are the most relaxed, that is, the middle consistency can obtain the most records. This is because it contains both front consistent and rear consistent query results.

  • _ Underscore matches any 1 character

Use_ (underline) instead of%, it represents "any 1 character", which is different from%.

SELECT *
FROM samplelike
WHERE strcol LIKE 'abc__';
+--------+
| strcol |
+--------+
| abcdd  |
+--------+
1 row in set (0.00 sec)

4.3 BETWEEN predicate – used for range query

Use BETWEEN to query the range. This predicate differs from other predicates or functions in that it uses three parameters.

-- Select the sales unit price as 100~ 1000 Yuan commodity
SELECT product_name, sale_price
FROM product
WHERE sale_price BETWEEN 100 AND 1000;
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| T Shirt          |       1000 |
| Punch       |        500 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
| ball pen       |        100 |
+--------------+------------+
5 rows in set (0.00 sec)

The characteristic of BETWEEN is that the result will contain two critical values of 100 and 1000, that is, closed interval. If you do not want the results to contain critical values, you must use < and >.

SELECT product_name, sale_price
FROM product
WHERE sale_price > 100
AND sale_price < 1000;
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| Punch       |        500 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
+--------------+------------+
3 rows in set (0.00 sec)

4.4 IS NULL, IS NOT NULL – used to determine whether it is NULL

In order to select the data of some columns with NULL value, you cannot use =, but only the specific predicate IS NULL.

SELECT product_name, purchase_price
FROM product
WHERE purchase_price IS NULL;
+--------------+----------------+
| product_name | purchase_price |
+--------------+----------------+
| Fork         |           NULL |
| ball pen       |           NULL |
+--------------+----------------+
2 rows in set (0.00 sec)

In contrast, when you want to select data other than NULL, you need to use IS NOT NULL.

SELECT product_name, purchase_price
FROM product
WHERE purchase_price IS NOT NULL;
+--------------+----------------+
| product_name | purchase_price |
+--------------+----------------+
| T Shirt          |            500 |
| Punch       |            320 |
| motion T Shirt      |           2800 |
| kitchen knife         |           2800 |
| pressure cooker       |           5000 |
| Dish cleaning board       |            790 |
+--------------+----------------+
6 rows in set (0.00 sec)

4.5 IN predicate – simple use of OR

When multiple query criteria are merged, you can choose to use the or statement.

-- adopt OR Specify multiple purchase unit prices for query
SELECT product_name, purchase_price
FROM product
WHERE purchase_price = 320
OR purchase_price = 500
OR purchase_price = 5000;
+--------------+----------------+
| product_name | purchase_price |
+--------------+----------------+
| T Shirt          |            500 |
| Punch       |            320 |
| pressure cooker       |           5000 |
+--------------+----------------+
3 rows in set (0.00 sec)

Although there is no problem with the above methods, there is still one deficiency, that is, with more and more objects to be selected, the SQL statement will be longer and longer, and it will be more and more difficult to read. At this point, we can use the IN predicate
`In (value 1, value 2, value 3,...) to replace the above SQL statement.

SELECT product_name, purchase_price
FROM product
WHERE purchase_price IN (320, 500, 5000);
+--------------+----------------+
| product_name | purchase_price |
+--------------+----------------+
| T Shirt          |            500 |
| Punch       |            320 |
| pressure cooker       |           5000 |
+--------------+----------------+
3 rows in set (0.00 sec)

The above statements are much simpler and the readability is greatly improved.
On the contrary, if you want to select goods whose purchase unit price is not 320 yuan, 500 yuan or 5000 yuan, you can use the negative form NOT IN.

SELECT product_name, purchase_price
FROM product
WHERE purchase_price NOT IN (320, 500, 5000);
+--------------+----------------+
| product_name | purchase_price |
+--------------+----------------+
| motion T Shirt      |           2800 |
| kitchen knife         |           2800 |
| Dish cleaning board       |            790 |
+--------------+----------------+
3 rows in set (0.00 sec)

It should be noted that NULL data cannot be selected when IN and NOT IN are used.
The same is true for the actual results. The forks and ballpoint pens with NULL purchase unit price are not included in the above two groups of results. NULL can only be judged using IS NULL and IS NOT NULL.

4.6 using subquery as parameter of IN predicate

  • IN and subquery

The IN predicate (NOT IN predicate) has a usage that other predicates do not have, that is, you can use a subquery as its parameter. We have learned IN section 5-2 that a subquery is a table generated within SQL, so it can also be said that "the table can be used as an IN parameter". Similarly, we can also say that "the view can be used as a parameter of IN".

Here, we create a new table shopproduct to show which stores sell which goods.

-- DDL : Create table
DROP TABLE IF EXISTS shopproduct;
CREATE TABLE shopproduct
(  shop_id CHAR(4)     NOT NULL,
 shop_name VARCHAR(200) NOT NULL,
product_id CHAR(4)      NOT NULL,
  quantity INTEGER      NOT NULL,
PRIMARY KEY (shop_id, product_id) -- Specify primary key
);
-- DML : insert data
START TRANSACTION; -- Start transaction
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000A', 'Tokyo', '0001', 30);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000A', 'Tokyo', '0002', 50);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000A', 'Tokyo', '0003', 15);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000B', 'Nagoya', '0002', 30);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000B', 'Nagoya', '0003', 120);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000B', 'Nagoya', '0004', 20);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000B', 'Nagoya', '0006', 10);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000B', 'Nagoya', '0007', 40);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000C', 'Osaka', '0003', 20);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000C', 'Osaka', '0004', 50);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000C', 'Osaka', '0006', 90);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000C', 'Osaka', '0007', 70);
INSERT INTO shopproduct (shop_id, shop_name, product_id, quantity) VALUES ('000D', 'Fukuoka', '0001', 100);
COMMIT; -- Commit transaction
SELECT * FROM shopproduct;
+---------+-----------+------------+----------+
| shop_id | shop_name | product_id | quantity |
+---------+-----------+------------+----------+
| 000A    | Tokyo      | 0001       |       30 |
| 000A    | Tokyo      | 0002       |       50 |
| 000A    | Tokyo      | 0003       |       15 |
| 000B    | Nagoya      | 0002       |       30 |
| 000B    | Nagoya      | 0003       |      120 |
| 000B    | Nagoya      | 0004       |       20 |
| 000B    | Nagoya      | 0006       |       10 |
| 000B    | Nagoya      | 0007       |       40 |
| 000C    | Osaka      | 0003       |       20 |
| 000C    | Osaka      | 0004       |       50 |
| 000C    | Osaka      | 0006       |       90 |
| 000C    | Osaka      | 0007       |       70 |
| 000D    | Fukuoka      | 0001       |      100 |
+---------+-----------+------------+----------+
13 rows in set (0.00 sec)

Each row of data in the table cannot be distinguished due to the separate use of store _idor product_id
Therefore, two columns are specified as the primary key to combine stores and goods to uniquely determine each row of data.

Suppose we need to take out the sales unit price of goods on sale in Osaka, how can we achieve it?

Step 1: take out the goods on sale in Osaka stores ` product_id ;

Step 2: take out the sales unit price of goods on sale in Osaka stores ` sale_price

-- step1: Take out the goods on sale in Osaka store `product_id`
SELECT product_id
FROM shopproduct
WHERE shop_id = '000C';
+------------+
| product_id |
+------------+
| 0003       |
| 0004       |
| 0006       |
| 0007       |
+------------+
4 rows in set (0.00 sec)

The above statement takes out the commodity number on sale in Osaka store. Next, we can use the above statement as the query criteria in the second step.

-- step2: Take out the sales unit price of goods on sale in Osaka stores `sale_price`
SELECT product_name, sale_price
FROM product
WHERE product_id IN (SELECT product_id
  FROM shopproduct
                       WHERE shop_id = '000C');
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| motion T Shirt      |       4000 |
| kitchen knife         |       3000 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
+--------------+------------+
4 rows in set (0.00 sec)

According to the knowledge learned in Chapter 5, the sub query is executed from the innermost layer (from inside to outside). Therefore, after the sub query of the above statement is executed, the sql is expanded into the following statement

-- Results after sub query expansion
SELECT product_name, sale_price
FROM product
WHERE product_id IN ('0003', '0004', '0006', '0007');
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| motion T Shirt      |       4000 |
| kitchen knife         |       3000 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
+--------------+------------+
4 rows in set (0.00 sec)

You can see that the subquery is converted into the in predicate usage. Do you understand?
Or, you wonder why subqueries should be used since the in predicate can also be implemented? Here are two reasons:

① : in real life, the goods on sale in a store are constantly changing, and the sql statement needs to be updated frequently when using the in predicate, which reduces the efficiency and improves the maintenance cost;

② : in fact, there may be hundreds or thousands of goods on sale in a store. Manually maintaining the number of goods on sale is really a big project.

Using sub query can keep the sql statement unchanged, which greatly improves the maintainability of the program, which is the key consideration in system development.

  • NOT IN and subqueries

NOT IN also supports sub queries as parameters, and the usage is exactly the same as in.

-- NOT IN Use the subquery as a parameter to retrieve the sales unit price of goods not sold in Osaka stores
SELECT product_name, sale_price
  FROM product
 WHERE product_id NOT IN (SELECT product_id
                            FROM shopproduct
                           WHERE shop_id = '000A');
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| kitchen knife         |       3000 |
| pressure cooker       |       6800 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
| ball pen       |        100 |
+--------------+------------+
5 rows in set (0.00 sec)

4.7 EXIST predicate

The use of the EXIST predicate is difficult to understand.

① The use of EXIST is different from that before

② Grammar is difficult to understand

③ IN fact, even if EXIST is not used, it can basically be replaced by IN (or NOT IN)

So, is it necessary to learn the EXIST predicate? The answer is yes, because once you can skillfully use the EXIST predicate, you can realize its great convenience.

However, you don't have to worry too much. This course introduces some basic usage. You can pay more attention to the usage of EXIST predicate in future study, so as to master this usage when you reach the intermediate level of SQL.

  • Use of EXIST predicate

The function of predicate is to "judge whether there are records that meet certain conditions".

If such a record exists, it returns TRUE (TRUE), and if it does not exist, it returns FALSE (FALSE).

The subject of the EXIST predicate is "record".

We continue to use the examples IN in and sub query to select the sales unit price of goods on sale IN Osaka stores using EXIST.

SELECT product_name, sale_price
  FROM product AS p
 WHERE EXISTS (SELECT *
                 FROM shopproduct AS sp
                WHERE sp.shop_id = '000C'
                  AND sp.product_id = p.product_id);
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| motion T Shirt      |       4000 |
| kitchen knife         |       3000 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
+--------------+------------+
4 rows in set (0.00 sec)
  • Parameters of EXIST

The predicates we learned before basically need to specify more than two parameters, such as "column LIKE string" or "column BETWEEN value 1 AND value 2", but there are no parameters on the left side of EXIST. Because EXIST is a predicate with only 1 parameter. Therefore, EXIST only needs to write one parameter on the right, which is usually a sub query.

(SELECT *
   FROM shopproduct AS sp
  WHERE sp.shop_id = '000C'
    AND sp.product_id = p.product_id)  

The subquery above is the only parameter. Specifically, because the product table and the shopproduct table are joined by the condition "SP.product_id = P.product_id", the associated subquery is used as a parameter. EXIST usually uses an associated subquery as a parameter.

  • SELECT in subquery*

Since EXIST only cares about whether records EXIST, it doesn't matter which columns are returned. EXIST will only judge whether there is a product table and a store that meet the criteria specified in the WHERE clause in the subquery: "shop_id is' 000C '"

Records with the same product _idin the shopproduct table will return TRUE only if such records exist.

Therefore, using the following query statement, the query results will not change.

SELECT product_name, sale_price
  FROM product AS p
 WHERE EXISTS (SELECT 1 -- The appropriate constants can be written here
                 FROM shopproduct AS sp
                WHERE sp.shop_id = '000C'
                  AND sp.product_id = p.product_id);
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| motion T Shirt      |       4000 |
| kitchen knife         |       3000 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
+--------------+------------+
4 rows in set (0.00 sec)

You can take writing SELECT * in the sub query of EXIST as a habit of SQL.

  • Replace NOT IN with NOT EXIST

Just as EXIST can be used to replace IN, NOT IN can also be replaced with NOT EXIST.

The following code example takes out the sales unit price of goods not sold in Osaka stores.

SELECT product_name, sale_price
  FROM product AS p
 WHERE NOT EXISTS (SELECT *
                     FROM shopproduct AS sp
                    WHERE sp.shop_id = '000A'
                      AND sp.product_id = p.product_id);
+--------------+------------+
| product_name | sale_price |
+--------------+------------+
| kitchen knife         |       3000 |
| pressure cooker       |       6800 |
| Fork         |        500 |
| Dish cleaning board       |        880 |
| ball pen       |        100 |
+--------------+------------+
5 rows in set (0.00 sec)

NOT EXIST, in contrast to EXIST, returns TRUE when "no record exists" that meets the criteria specified in the subquery.

5 CASE expression

5.1 what is a CASE expression?

CASE expression is a kind of function. It is one of the top two important functions in SQL. It is necessary to learn it well.

CASE expressions are used to distinguish situations, which are often called (conditional) branches in programming.

The syntax of CASE expression is divided into simple CASE expression and search CASE expression. Because the search CASE expression contains all the functions of a simple CASE expression. This lesson focuses on searching CASE expressions.

Syntax:

CASE WHEN <Evaluation expression> THEN <expression>
     WHEN <Evaluation expression> THEN <expression>
     WHEN <Evaluation expression> THEN <expression>
     .
     .
     .
ELSE <expression>
END  

When executing the above statements, judge whether the when expression is true. If yes, execute the statement after THEN. If all the when expressions are false, execute the statement after ELSE.
No matter how large a CASE expression is, it will only return one value in the end.

5.2 use of case expression

Suppose you want to achieve the following results now:

A : clothes
B : Office Supplies
C : kitchenware  

Because the records in the table do not contain strings such as "A:" or "B:", they need to be added in SQL. And combine "A:" "B:" "C:" with the record.

  • Application scenario 1: get different column values according to different branches
SELECT  product_name,
        CASE WHEN product_type = 'clothes' THEN CONCAT('A :  ',product_type)
             WHEN product_type = 'Office Supplies'  THEN CONCAT('B :  ',product_type)
             WHEN product_type = 'kitchenware'  THEN CONCAT('C :  ',product_type)
             ELSE NULL
        END AS abc_product_type
  FROM  product;
+--------------+------------------+
| product_name | abc_product_type |
+--------------+------------------+
| T Shirt          | A :  clothes        |
| Punch       | B :  Office Supplies    |
| motion T Shirt      | A :  clothes        |
| kitchen knife         | C :  kitchenware    |
| pressure cooker       | C :  kitchenware    |
| Fork         | C :  kitchenware    |
| Dish cleaning board       | C :  kitchenware    |
| ball pen       | B :  Office Supplies    |
+--------------+------------------+
8 rows in set (0.00 sec)

The ELSE clause can also be omitted without writing. At this time, it will be defaulted to ELSE NULL. However, in order to prevent people from missing reading, I hope you can write the ELSE clause explicitly.
In addition, the "END" at the END of the CASE expression cannot be omitted. Please pay special attention not to omit it. Forgetting to write END will lead to grammatical errors, which is also the easiest mistake to make at the beginning of learning.

  • Application scenario 2: aggregation in column direction

Generally, we use the following code to implement different kinds of aggregation (sum in this case) in the direction of lines

SELECT product_type,
       SUM(sale_price) AS sum_price
  FROM product
 GROUP BY product_type;  
+--------------+-----------+
| product_type | sum_price |
+--------------+-----------+
| clothes         |      5000 |
| Office Supplies      |       600 |
| kitchenware      |     11180 |
+--------------+-----------+
3 rows in set (0.00 sec)

If you want to display different kinds of aggregate values in the direction of the column, how should you write them?

sum_price_clothes | sum_price_kitchen | sum_price_office
------------------+-------------------+-----------------
             5000 |             11180 |              600  

Aggregate function + CASE WHEN expression can achieve this effect

-- Perform line to line conversion on the total sales unit price calculated by commodity type
SELECT SUM(CASE WHEN product_type = 'clothes' THEN sale_price ELSE 0 END) AS sum_price_clothes,
       SUM(CASE WHEN product_type = 'kitchenware' THEN sale_price ELSE 0 END) AS sum_price_kitchen,
       SUM(CASE WHEN product_type = 'Office Supplies' THEN sale_price ELSE 0 END) AS sum_price_office
  FROM product;
+-------------------+-------------------+------------------+
| sum_price_clothes | sum_price_kitchen | sum_price_office |
+-------------------+-------------------+------------------+
|              5000 |             11180 |              600 |
+-------------------+-------------------+------------------+
1 row in set (0.00 sec)
  • (extended content) application scenario 3: realize row to column conversion

Suppose there is the structure of the following chart

The following chart structure is planned

Aggregate function + CASE WHEN expression can realize this conversion

-- CASE WHEN Implement numeric columns score Row to column
SELECT name,
       SUM(CASE WHEN subject = 'language' THEN score ELSE null END) as chinese,
       SUM(CASE WHEN subject = 'mathematics' THEN score ELSE null END) as math,
       SUM(CASE WHEN subject = 'Foreign Languages' THEN score ELSE null END) as english
  FROM score
 GROUP BY name;
+------+---------+------+---------+
| name | chinese | math | english |
+------+---------+------+---------+
| Zhang San |      93 |   88 |      91 |
| Li Si |      87 |   90 |      77 |
+------+---------+------+---------+
2 rows in set (0.00 sec)

The above code realizes the row to column conversion of the digital column score, and can also realize the row to column conversion of the text column subject

-- CASE WHEN Implement text columns subject Row to column
SELECT name,
       MAX(CASE WHEN subject = 'language' THEN subject ELSE null END) as chinese,
       MAX(CASE WHEN subject = 'mathematics' THEN subject ELSE null END) as math,
       MIN(CASE WHEN subject = 'Foreign Languages' THEN subject ELSE null END) as english
  FROM score
 GROUP BY name;
+------+---------+------+---------+
| name | chinese | math | english |
+------+---------+------+---------+
| Zhang San | language    | mathematics | Foreign Languages    |
| Li Si | language    | mathematics | Foreign Languages    |
+------+---------+------+---------+
2 rows in set (0.00 sec

Summary:

  • When the column to be converted is a number, aggregate functions such as SUM AVG MAX MIN can be used;
  • When the column to be converted is text, aggregate functions such as MAX MIN can be used

Topics: SQL