In September, datawhale teamed up to learn the advanced operation of task05SQL

Posted by KenGR on Thu, 23 Sep 2021 16:51:52 +0200

In September, datawhale teamed up to learn task05 advanced SQL operations

1, Window function

1. Concept, basic usage

  • Window function (OLAP function)

    • Online AnalyticalProcessing
    • Real time analysis and processing of database
  • Window function allows us to selectively summarize, calculate and sort some data

    • Conventional SELECT statements query the entire table
  • General form

    <Window function> OVER ([PARTITION BY <Listing>]
                         ORDER BY <Column name for sorting>)  
    
    • The contents in [] can be omitted
    • PARTITION BY
      • For grouping: select which window to see
        • It is similar to the grouping function of the GROUP BY clause, but the PARTITION BY clause does not have the summary function of the GROUP BY clause and will not change the number of rows recorded in the original table
    • ORDER BY
      • Used for sorting: determines which rule (field) to sort in the window
  • example

    SELECT product_name
           ,product_type
           ,sale_price
           ,RANK() OVER (PARTITION BY product_type
                             ORDER BY sale_price) AS ranking
      FROM product;
    
    • Operation results

    • PARTITION BY enables you to set the window object range

      • Product is specified above_ Type enables it to be sorted by commodity category, that is, a commodity category is a small "window"
    • ORDER BY can specify which column and in which order to sort

      • Sale is specified above_ Price, which is arranged in ascending order of sales unit price

        The ORDER BY in the window function is the same as the ORDER BY at the end of the SELECT statement. You can specify the ascending / descending order through the keyword ASC/DESC. When this keyword is omitted, it will be sorted by ASC by default, that is, ascending order. In this example, the above keywords are omitted

2, Window function type

  • Roughly two categories
    • Window function using aggregate function
      • Aggregate functions such as SUM MAX MIN
    • Special window function for sorting
      • RANK DENSE_RANK

1. Special window function

SELECT product_name,
       product_type,
       sale_price,
       RANK() OVER (ORDER BY  sale_price) AS ranking,
       DENSE_RANK() OVER (ORDER BY sale_price) AS dense_ranking,
       ROW_NUMBER() OVER (ORDER BY sale_price) AS row_num
FROM product;

If you don't write rank, the header will be a little long

2. Use of aggregate function on window function

  • The usage of aggregate function in windowing function is the same as the previous special window function, but the result is a cumulative aggregate function value

  • example

    SELECT  product_id
           ,product_name
           ,sale_price
           ,SUM(sale_price) OVER (ORDER BY product_id) AS current_sum
           ,AVG(sale_price) OVER (ORDER BY product_id) AS current_avg
      FROM product;
    

  • As you can see, the result of the aggregation function is, in the order we specify, here is product_id, the total or average value of the current line and all previous lines. That is, the aggregate accumulated to the current line

3, Application of window function - calculating moving average

  • As mentioned above, when the aggregation function is used in the window function, it calculates the aggregation of all data accumulated to the current row. In fact, you can also specify a more detailed summary range. The summary scope becomes a frame

  • grammar

    <Window function> OVER (ORDER BY <Column name for sorting>
                     ROWS n PRECEDING )  
                     
    <Window function> OVER (ORDER BY <Column name for sorting>
                     ROWS BETWEEN n PRECEDING AND n FOLLOWING)
    
    • Forecasting ("before")
      • Specify the frame as "n lines until", plus its own lines
    • FOLLOWING ("after")
      • Specify the frame as "n lines after due" plus its own lines
    • BETWEEN 1 PRECEDING AND 1 FOLLOWING
      • Specify the frame as "1 line before" + "1 line after" + "itself"
  • example

    SELECT  product_id
           ,product_name
           ,sale_price
           ,AVG(sale_price) OVER (ORDER BY product_id
                                   ROWS 2 PRECEDING) AS moving_avg
           ,AVG(sale_price) OVER (ORDER BY product_id
                                   ROWS BETWEEN 1 PRECEDING 
                                            AND 1 FOLLOWING) AS moving_avg  
      FROM product;  
    

matters needing attention

4, GROUPING operator

Total and subtotal of calculation

  • Conventional GROUP BY can only get the subtotal of each category. Sometimes it is necessary to calculate the total of categories. You can use the ROLLUP keyword

  • SELECT  product_type
           ,regist_date
           ,SUM(sale_price) AS sum_price
      FROM product
     GROUP BY product_type, regist_date WITH ROLLUP;  
    

Topics: Database MySQL SQL