MySQL's most commonly used grouping aggregation function

Posted by b0ksah on Fri, 11 Feb 2022 13:15:06 +0100

Hello, everyone. Meet again. I'm Jun Quan.

1, aggregation function - that is, group functions

Operate on a set of rows (a group of rows) and give a result to each group.

Common group functions:

AVG([distinct] expr)

Average

COUNT({*|[distinct] } expr)

Number of statistical lines

MAX([distinct] expr)

Find the maximum value

MIN([distinct] expr)

Find the minimum value

SUM([distinct] expr)

Summation

① each group function receives one parameter

② by default, the group function ignores the row with null column value and does not participate in the calculation

③ sometimes, the keyword distinct is used to eliminate the number of duplicate field values

be careful:

1) when there is no group by clause in the select statement using group function, all rows in the intermediate result set automatically form a group, and then calculate the group function;

2) group functions cannot be nested, for example: count(max(...));

3) the parameters of group functions can be columns or function expressions;

4) multiple aggregate functions can appear in a SELECT clause.

Table for experimental demonstration:

mysql> select * from salary_tab;
+--------+---------+
| userid | salary  |
+--------+---------+
|      1 | 1000.00 |
|      2 | 2000.00 |
|      3 | 3000.00 |
|      4 |    NULL |
|      5 | 1000.00 |
+--------+---------+
5 rows in set (0.00 sec) 
mysql> use TENNIS
mysql> show tables;
+-------------------+
| Tables_in_TENNIS  |
+-------------------+
| COMMITTEE_MEMBERS |
| MATCHES           |
| PENALTIES         |
| PLAYERS           |
| TEAMS             |
+-------------------+
5 rows in set (0.00 sec)

1. count function

① count(*): returns the number of rows in the table that meet the where condition

mysql> select count(*) from salary_tab where salary='1000';
+----------+
| count(*) |
+----------+
|        2 |
+----------+

mysql> select count(*) from salary_tab;  #There are no conditions. The default is the number of rows in the statistics table
+----------+
| count(*) |
+----------+
|        5 |
+----------+

② Count (column): returns the number of rows whose column value is not empty

mysql> select count(salary) from salary_tab;
+---------------+
| count(salary) |
+---------------+
|             4 |
+---------------+

③ count(distinct column): returns the number of rows with non empty column values and non duplicate column values

mysql> select count(distinct salary) from salary_tab;
+------------------------+
| count(distinct salary) |
+------------------------+
|                      3 |
+------------------------+

④ count(expr): statistics based on expression

mysql> select * from TT;
+------+------------+
| UNIT | DATE       |
+------+------------+
| a    | 2018-04-03 |
| a    | 2017-12-12 |
| b    | 2018-01-01 |
| b    | 2018-04-03 |
| c    | 2016-06-06 |
| d    | 2018-03-03 |
+------+------------+
6 rows in set (0.00 sec)

mysql> select UNIT as 'Company',
    ->     COUNT(TO_DAYS(DATE)=TO_DAYS(NOW()) or null) as 'Statistics today',
    ->     COUNT(YEAR(DATE)=YEAR(NOW()) or null) as 'Statistics this year'
    -> from v_jjd
    -> group by JJDW;
+------+----------+----------+
| Company  | Statistics today  | Statistics this year  |
+------+----------+----------+
| a    |        1 |        1 |
| b    |        1 |        2 |
| c    |        0 |        0 |
| d    |        0 |        1 |
+------+----------+----------+
4 rows in set (0.00 sec)

2. max and min functions - the maximum and minimum values in the statistical column

mysql> select max(salary) from salary_tab;
+-------------+
| max(salary) |
+-------------+
|     3000.00 |
+-------------+

mysql> select min(salary) from salary_tab;
+-------------+
| min(salary) |
+-------------+
|     1000.00 |
+-------------+

Note: if there is only NULL value in the statistical column, MAX and MIN will return NULL

3. Sum and avg functions - sum and average

!! Rows with null column values in the table do not participate in the calculation

mysql> select sum(salary) from salary_tab;
+-------------+
| sum(salary) |
+-------------+
|     7000.00 |
+-------------+

mysql> select avg(salary) from salary_tab;
+-------------+
| avg(salary) |
+-------------+
| 1750.000000 |
+-------------+

mysql> select avg(ifnull(salary,0)) from salary_tab;
+-----------------------+
| avg(ifnull(salary,0)) |
+-----------------------+
|           1400.000000 |
+-----------------------+

Note: if you want the row with NULL column value to participate in the calculation of group function, you must use IFNULL function to convert the NULL value.

2, Group SELECT

SELECT select_expr [, select_expr ...]

    [FROM table_references

      [PARTITION partition_list]

    [WHERE where_condition]

    [GROUP BY {col_name | expr | position}

      [ASC | DESC], ... [WITH ROLLUP]]

    [HAVING where_condition]

    [ORDER BY {col_name | expr | position}

      [ASC | DESC], ...]

[LIMIT {[offset,] row_count | row_count OFFSET offset}]

Basic format of group SELECT:

select [aggregate function] field name from table name

[where query criteria]

[group by field name]

[having filter condition]

1. group by clause

According to each different value of a given column or expression, the rows in the table are divided into different groups, and the group function is used to return the statistical information of each group

Rules:

① individual columns appearing in the SELECT clause must appear in the GROUP BY clause as grouping columns

② grouping columns may not appear in the SELECT clause

③ grouping columns can appear in a compound expression in the SELECT clause

④ if GROUP BY is followed by a compound expression, it must be used as a whole as part of an expression in the SELECT clause.

1) Specify a column to group

mysql> select salary,count(*) from salary_tab
    -> where salary>=2000
    -> group by salary;
+---------+----------+
| salary  | count(*) |
+---------+----------+
| 2000.00 |        1 |
| 3000.00 |        1 |
+---------+----------+

2) Specify multiple grouping columns, 'sub group in large group'

mysql> select userid,count(salary) from salary_tab
    -> where salary>=2000
    -> group by salary,userid;
+--------+---------------+
| userid | count(salary) |
+--------+---------------+
|      2 |             1 |
|      3 |             1 |
+--------+---------------+

3) Group by expression

mysql> select year(payment_date),count(*)
    -> from PENALTIES
    -> group by year(payment_date);
+--------------------+----------+
| year(payment_date) | count(*) |
+--------------------+----------+
|               1980 |        3 |
|               1981 |        1 |
|               1982 |        1 |
|               1983 |        1 |
|               1984 |        2 |
+--------------------+----------+
5 rows in set (0.00 sec)

4) Grouping with sorting: if the grouping column and row sequence are the same, the group by and order by clauses can be merged

mysql> select teamno,count(*)
    -> from MATCHES
    -> group by teamno
    -> order by teamno desc;
+--------+----------+
| teamno | count(*) |
+--------+----------+
|      2 |        5 |
|      1 |        8 |
+--------+----------+
2 rows in set (0.00 sec)

mysql> select teamno,count(*)
    -> from MATCHES
    -> group by teamno desc;  #Desc (or asc) can be included in the group by Clause for simplification
+--------+----------+
| teamno | count(*) |
+--------+----------+
|      2 |        5 |
|      1 |        8 |
+--------+----------+
2 rows in set (0.00 sec)

For group aggregation, note:

Through select, in the return set fields, these fields must either be included after the group by statement as the basis for grouping, or be included in the aggregate function. We can imagine the group by operation as a process as follows: first, the system obtains a result set according to the select statement, and then combines the records with the same grouping field into one record according to the grouping field. At this time, the remaining fields that do not exist and are used as the basis for grouping after the group by statement are likely to have multiple values. However, at present, there is only one record in a grouping situation, and one data grid cannot put multiple values. Therefore, at this time, it is necessary to convert these multivalued columns into single values through a certain process, and then put them in the corresponding data grid, Then the aggregation functions mentioned above are the ones that complete this step, which is why these functions are called aggregation functions.

2,GROUP_CONCAT() function

The value of the function is equal to all the values of the specified column belonging to a group, separated by commas, and represented as a string.

Example 1: for each team, get its number and the numbers of all players

mysql> select teamno,group_concat(playerno)
    -> from MATCHES
    -> group by teamno;
+--------+------------------------+
| teamno | group_concat(playerno) |
+--------+------------------------+
|      1 | 6,6,6,44,83,2,57,8     |
|      2 | 27,104,112,112,8       |
+--------+------------------------+
2 rows in set (0.01 sec)

If there is no group by clause, group_concat returns all values of a column

Example 2: get a list of all penalty numbers

mysql> select group_concat(paymentno)
    -> from PENALTIES;
+-------------------------+
| group_concat(paymentno) |
+-------------------------+
| 1,2,3,4,5,6,7,8         |
+-------------------------+
1 row in set (0.00 sec)

3. The with clause is used to group multiple roles in the with clause

Use less, but sometimes it can be used according to specific needs

If there are clauses GROUP BY E1,E2,E3,E4 WITH ROLLUP

Then the following groups will be executed respectively: [E1,E2,E3,E4], [E1,E2,E3], [E1,E2], [E1], []

Note: [] indicates that all rows are grouped together

Example: count the total number of players according to their gender and city of residence; Count the total number of players of each gender; Count the total number of all players

mysql> select sex,town,count(*)
    -> from PLAYERS
    -> group by sex,town with rollup;
+-----+-----------+----------+
| sex | town      | count(*) |
+-----+-----------+----------+
| F   | Eltham    |        2 |
| F   | Inglewood |        1 |
| F   | Midhurst  |        1 |
| F   | Plymouth  |        1 |
| F   | NULL      |        5 |
| M   | Douglas   |        1 |
| M   | Inglewood |        1 |
| M   | Stratford |        7 |
| M   | NULL      |        9 |
| NULL | NULL      |       14 |
+-----+-----------+----------+
10 rows in set (0.00 sec)

4. HAVING clause: filter grouping results

be careful:

You cannot use the WHERE clause to filter the grouped results

Group functions cannot be used in the WHERE clause, only for filtering rows

mysql> select playerno
    -> from PENALTIES
    -> where count(*)>1
    -> group by playerno;
ERROR 1111 (HY000): Invalid use of group function

Because the WHERE clause is executed before GROUP BY, and the group function must be executed after grouping, and the filtering of the result set must be performed with the having clause after grouping.

Basic syntax:

SELECT   select_expr [, select_expr ...]

   FROM  table_name

   [WHERE where_condition]

   [GROUP BY {col_name | expr} [ASC | DESC], ... [WITH ROLLUP]]

[HAVING where_condition]

!!! The difference between having sub statement and where sub statement:

where clause filters records before grouping;

having clause filters records after grouping

mysql> select salary,count(*) from salary_tab
    -> where salary>=2000
    -> group by salary
    -> having count(*)>=0;
+---------+----------+
| salary  | count(*) |
+---------+----------+
| 2000.00 |        1 |
| 3000.00 |        1 |
+---------+----------+

1) HAVING can be used alone without GROUP BY. If there is only a HAVING clause without GROUP BY, all rows in the table are divided into one group

2) Group functions can be used in the HAVING clause

3) The columns in the HAVING clause either appear in a group function or in the GROUP BY Clause (otherwise an error occurs)

mysql> select town,count(*)
    -> from PLAYERS
    -> group by town
    -> having birth_date>'1970-01-01';
ERROR 1054 (42S22): Unknown column 'birth_date' in 'having clause'
mysql> select town,count(*)
    -> from PLAYERS
    -> group by town
    -> having town in ('Eltham','Midhurst');
+----------+----------+
| town     | count(*) |
+----------+----------+
| Eltham   |        2 |
| Midhurst |        1 |
+----------+----------+
2 rows in set (0.00 sec)

3, Collection query operation

union is used to combine the results of two or more select queries into one

SELECT ...

UNION [ALL | DISTINCT]

SELECT ...

[UNION [ALL | DISTINCT]

SELECT ...]

By default, UNION = UNION DISTINCT

① for two queries to be merged, the SELECT list must be consistent in quantity and data type of corresponding columns;

② by default, duplicate rows in two query result sets will be removed; The default result set is not sorted;

③ the column name of the final result set comes from the SELECT list of the first query

Union all does not remove duplicate rows from the result set

Note: the joint query result uses the field name in the first select statement

mysql> select * from t1;
+------+------+
| num  | addr |
+------+------+
|  123 | abc  |
|  321 | cba  |
+------+------+
2 rows in set (0.00 sec)

mysql> select * from t2;
+------+------+
| id   | name |
+------+------+
|    1 | a    |
|    2 | A    |
+------+------+
2 rows in set (0.00 sec)

mysql> select * from t1
    -> union
    -> select * from t2;
+------+------+
| num  | addr |
+------+------+
|  123 | abc  |
|  321 | cba  |
|    1 | a    |
|    2 | A    |
+------+------+
4 rows in set (0.00 sec)

If you want to sort the entire merged result set, the ORDER BY clause can only appear in the last query

be careful:

In the de duplication operation, if the column values contain NULL values, they are considered equal

Publisher: full stack programmer, stack length, please indicate the source for Reprint: https://javaforall.cn/112073.html Original link: https://javaforall.cn