Mybatis < collection > use method and N+1 query problem

Posted by Thivya on Mon, 24 Jan 2022 06:15:17 +0100

catalogue

Mybatis tag nested Sql statement / nested result mapping

N+1 query questions

Lazy loading

Sql keyword In

ResultSet

Mybatis < Association > tag nested Sql statement / nested result mapping

Mybatis's result map function is really powerful. It can help you effectively convert Sql query results into corresponding Bean entities.

In order to map the query results of complex Sql statements to corresponding Bean entities (possibly multiple), you need to use the < Association > tag and < Collection > tag in MyBatis.

At the same time, the above two tags support nested select queries or nested resultMap queries.

As the document says:

  • Nested Select query: load the desired complex type by executing another SQL mapping statement.
  • Nested result mapping: use nested result mapping to handle duplicate subsets of connected results.

The choice of nesting depends on the SQL statement you use: if you construct a complex SQL statement with the Join keyword to connect multiple tables, you should nest the result mapping. If you want to load the association information in another table by executing another SQL mapping statement, you should nest the Select query.

Here is a simple example:

General part:

#desc book
id	int(11)	NO	PRI		auto_increment
name	varchar(128)	YES			
author	varchar(64)	YES			

#desc book_comment
id	int(11)	NO	PRI		auto_increment
book_id	int(11)	YES	MUL		
comment	varchar(255)	YES			
public class Book {
    private Integer id;
    private String name;
    private String author;
    //Omit getter/setter
}

public class Comment {
    private String bookId;
    private Integer id;
    private String content;
    //Omit getter/setter
}
# Model layer

# Service layer
@Service
public class BookService {
    @Autowired
    BookMapper bookMapper;
    
    //....
    public HashMap getBookById(Integer id) {
        return bookMapper.getBookById(id);
    }
    //.....
}

If it is written in a "nested Select" statement, the xml file of Mybatis should be written in this way. It mainly focuses on the column attribute in the collection tag. Its function is to pass the value of the column "id" in "bookResultMap" to "selectCommentsForBook", and then construct another Sql mapping statement to obtain the comments of the book with the corresponding id.

<select id="getBookById" parameterType="int" resultMap="bookResultMap">
        select
        * from book
        where id = #{id}
    </select>

    <select id="selectCommentsForBook" resultMap="commentResult">
        SELECT * FROM book_comment WHERE book_id = #{id}
    </select>

    <resultMap id="commentResult" type="org.sang.model.Comment">
        <id property="id" column="id"/>
        <result property="bookId" column="book_id"/>
        <result property="content" column="comment"/>
    </resultMap>

    <resultMap id="bookResultMap" type="hashmap">
        <id property="id" column="id" />
        <result property="name" column="name"/>
        <result property="author" column="author"/>
        <collection property="comments" javaType="ArrayList" ofType="org.sang.model.Comment" column="id" select="selectCommentsForBook"/>
    </resultMap>

If it is written in "nested mapping result", you need to construct a Sql statement to connect two tables. In the result mapping, the Collection tag does not need to nest the Select statement, but only the result mapping corresponding to the Comment class:

<select id="getBookById" parameterType="int" resultMap="bookResultMap">
        select
        B.id as book_id,
        B.name as book_name,
        B.author as book_author,
        A.Id as comment_id,
        A.book_id as comment_book_id,
        A.Comment as comment_comment
        from Book B left outer join book_comment A on B.id=A.book_id
        where B.id = #{id}
    </select>

    <resultMap id="bookResultMap" type="hashmap">
        <id property="id" column="book_id" />
        <result property="name" column="book_name"/>
        <result property="author" column="book_author"/>
        <collection property="comments" javaType="ArrayList" ofType="org.sang.model.Comment">
            <id property="id" column="comment_id"/>
            <result property="bookId" column="comment_book_id"/>
            <result property="content" column="comment_comment"/>
        </collection>
    </resultMap>

N+1 query questions

The Mybatis document states:

Although the writing method of nested select (corresponding to the first writing method above) is easy to understand, it performs poorly on large databases. The reason for this phenomenon is the N+1 query problem:

  • You execute a separate SQL statement to get a list of results (that is "+ 1").
  • For each record returned from the list, you execute a select query statement to load the details (i.e. "N") for each record.

Take the above example as a specific example. First, a SQL statement will be executed to obtain the list of books that meet the conditions, and then a select query will be executed for each book in the list to select the corresponding comments. Although using the indexing mechanism, the execution cost of these SQL statements will not be too large, the view of the document is that "this will lead to the execution of hundreds of SQL statements, which is sometimes undesirable".

In order to further understand the severity of the N+1 query problem, I look for relevant information. As a result, a search found that this is an old-fashioned problem.

The solutions include: (1) use the join statement In the Sql statement to connect multiple tables and query, and use the first method above to construct the result map, which will not be mentioned here; (2) Using lazy loading technology to delay the time node at which each operation In the "N query" part is executed; (3) Merge N queries into one query, which can be achieved by using the keyword In of Sql; (4) Use mybatis3 New label ResultSets after version 2.3

Lazy loading

In order to experiment with how to write lazy loading, you need to modify the configuration file to allow Mybatis to output log files. This step is omitted here.

First, it is intact. According to the above writing, it can be seen from the Debug message that N+1 executions will be executed immediately.

org.sang.mapper.BookMapper.getBookById   : ==>  Preparing: select * from book where id = ? 
o.s.m.BookMapper.selectCommentsForBook   : ====>  Preparing: SELECT * FROM book_comment WHERE book_id = ? 

Next, test the validity of lazy loading method, that is, add the fetchType attribute in the < Collection > tag and assign it as lazy. The test function is modified to the following style:

@Test
	public void contextLoads() {
		HashMap hMap = (bookService.getBookById(1));
		System.out.println("I am going to visit comment instance for now.");
		System.out.println(hMap);
	}

The output in the message log shows that the Lazy loading method does execute the corresponding Select statement when the user needs to access the Comment data.

org.sang.mapper.BookMapper.getBookById   : ==>  Preparing: select * from book where id = ? 
//.......
I am going to visit comment instance for now.
//.......
o.s.m.BookMapper.selectCommentsForBook   : ==>  Preparing: SELECT * FROM book_comment WHERE book_id = ? 
{comments=[Comment{bookId='1', id=2, content='Not bad'}], author=Lu Xun, name=Morning and evening, id=1}

But as the document says, "if you traverse the list to get nested data immediately after loading the record list, all delayed loading queries will be triggered, and the performance may become very bad". In this case, delayed loading is useless.

Sql keyword In

The idea of this method is very simple. Use the In keyword to merge the list obtained by the first query into the Sql query statement.

explain select * from test_user  where username in ('username_1','username_2','username_3')

In the case of building the corresponding index, the operation will not cost much.

The article In the following link discusses the execution efficiency of In and compares it with the efficiency of temporary table writing. (the idea of temporary table is also very simple, that is, create a new table for the set of IDS to be queried, and connect this new table with the corresponding original table)

The conclusion is that the efficiency of temporary table writing is better when the number of result items is larger.  

https://blog.csdn.net/LJFPHP/article/details/103656726

ResultSet

Some databases allow stored procedures to return multiple result sets, or execute multiple statements at once, each of which returns a result set. We can use this feature to access the database only once without using a connection.

The precondition of using ResultSet is that the database allows stored procedures to return multiple result sets or execute multiple statements at one time & return multiple result sets. Therefore, this method is not universal, so it is just a brief introduction here.

Topics: java web