MysqlLimit query optimization

Posted by dmonares on Tue, 08 Feb 2022 14:44:06 +0100

If we only need a specified number of rows in the result set, use the LIMIT clause in the query instead of getting the entire result set and discarding additional data.

MySQL sometimes optimizes rows with limit_ Query with count clause and no HAVING clause:

  • If you use LIMIT to select only a few rows, MySQL will use indexes in some cases, but usually it prefers to perform a full table scan.
  • If LIMIT row_count is used in combination with ORDER BY, and MySQL will find the first row of the sorting result_ Stop sorting immediately after the count row, instead of sorting the entire result. If you use indexes for sorting, it is very fast. If you must sort files, find the first row in the_ Before count, all rows that match the query and do not have a LIMIT clause are selected and most or all of them are sorted. After finding the initial row, MySQL does not sort any remaining part of the result set.

    One manifestation of this behavior is that ORDER BY queries with and without LIMIT may return rows in different order, as described later in this section.

  • If limit row_ When count is used in combination with DISTINCT, MySQL will find row_ Stop immediately after counting the only row.

  • In some cases, GROUP BY can be solved by reading the index in order (or sorting the index), and then calculating the summary until the index value changes. In this case, LIMIT row_count does not calculate any unnecessary GROUP BY values.

  • Once MySQL sends the required number of rows to the client, it will abort the query unless we use SQL_CALC_FOUND_ROWS. In this case, you can use SELECT FOUND_ROWS() retrieves the number of rows.

  • LIMIT 0 quickly returns an empty set. This is useful for checking the validity of queries. It can also be used to get the type of result column in applications that use MySQL API, which makes result set metadata available. Using the mysql client program, we can use the -- column type info option to display the result column type.

  • If the server uses temporary tables to parse queries, it uses LIMIT row_count clause to calculate how much space is needed.

  • If the index is not used for ORDER BY, but there is also a LIMIT clause, the optimizer may be able to avoid using merged files and sort rows in memory using in memory file sorting operations.

If multiple rows have the same value in the ORDER BY column, the server is free to return these rows in any order and may return them in different ways according to the overall execution plan. In other words, the sort order of these rows is uncertain for unordered columns.

One factor that affects the execution plan is LIMIT, so ORDER BY queries with and without LIMIT may return rows in different order. Consider this query, which is sorted by category column, but is uncertain for id and rating columns:

mysql> SELECT * FROM ratings ORDER BY category;
+----+----------+--------+
| id | category | rating |
+----+----------+--------+
|  1 |        1 |    4.5 |
|  5 |        1 |    3.2 |
|  3 |        2 |    3.7 |
|  4 |        2 |    3.5 |
|  6 |        2 |    3.5 |
|  2 |        3 |    5.0 |
|  7 |        3 |    2.7 |
+----+----------+--------+

Including LIMIT may affect the order of rows in each category value. For example, this is a valid query result:

mysql> SELECT * FROM ratings ORDER BY category LIMIT 5;
+----+----------+--------+
| id | category | rating |
+----+----------+--------+
|  1 |        1 |    4.5 |
|  5 |        1 |    3.2 |
|  4 |        2 |    3.5 |
|  3 |        2 |    3.7 |
|  6 |        2 |    3.5 |
+----+----------+--------+

In each case, the rows are sorted by the ORDER BY column, which is required by the SQL standard.

If it is important to ensure that the order of rows with and without LIMIT is the same, include other columns in the ORDER BY clause to make the order deterministic. For example, if the id value is unique, we can make the rows of a given category value display in id ORDER BY sorting as follows:

mysql> SELECT * FROM ratings ORDER BY category, id;
+----+----------+--------+
| id | category | rating |
+----+----------+--------+
|  1 |        1 |    4.5 |
|  5 |        1 |    3.2 |
|  3 |        2 |    3.7 |
|  4 |        2 |    3.5 |
|  6 |        2 |    3.5 |
|  2 |        3 |    5.0 |
|  7 |        3 |    2.7 |
+----+----------+--------+

mysql> SELECT * FROM ratings ORDER BY category, id LIMIT 5;
+----+----------+--------+
| id | category | rating |
+----+----------+--------+
|  1 |        1 |    4.5 |
|  5 |        1 |    3.2 |
|  3 |        2 |    3.7 |
|  4 |        2 |    3.5 |
|  6 |        2 |    3.5 |
+----+----------+--------+

For queries with ORDER BY or GROUP BY and LIMIT clauses, the optimizer will try to select an ordered index by default, which will speed up query execution. Before MySQL 5.7.33, this behavior could not be overridden, even if other optimizations might be faster. Starting with MySQL 5.7.33, you can set optimizer_ Preference of switch system variable_ ordering_ Set the index flag to off to turn off this optimization.

Example: first, we create and populate a table t, as follows:

mysql> CREATE TABLE t (
    ->     id1 BIGINT NOT NULL,
    ->     id2 BIGINT NOT NULL,
    ->     c1 VARCHAR(50) NOT NULL,
    ->     c2 VARCHAR(50) NOT NULL,
    ->  PRIMARY KEY (id1),
    ->  INDEX i (id2, c1)
    -> );

# [Insert some rows into table t - not shown]

Verify that preference is enabled_ ordering_ Index flag:

mysql> SELECT @@optimizer_switch LIKE '%prefer_ordering_index=on%';
+------------------------------------------------------+
| @@optimizer_switch LIKE '%prefer_ordering_index=on%' |
+------------------------------------------------------+
|                                                    1 |
+------------------------------------------------------+

Since the following query has a LIMIT clause, we want it to use ordered indexes as much as possible. In this case, as we can see from the EXPLAIN output, it uses the primary key of the table.

mysql> EXPLAIN SELECT c2 FROM t
    ->     WHERE id2 > 3
    ->     ORDER BY id1 ASC LIMIT 2
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: index
possible_keys: i
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 2
     filtered: 70.00
        Extra: Using where

Now let's disable preference_ ordering_ Index flag and re run the same query; This time it uses index i (including id2 column used in WHERE clause) and file sorting:

mysql> SET optimizer_switch = "prefer_ordering_index=off";

mysql> EXPLAIN SELECT c2 FROM t
    ->     WHERE id2 > 3
    ->     ORDER BY id1 ASC LIMIT 2
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: range
possible_keys: i
          key: i
      key_len: 8
          ref: NULL
         rows: 14
     filtered: 100.00
        Extra: Using index condition; Using filesort

Topics: Java Database MySQL Back-end