Merge Join
In general, the effect of hash connection is better than that of merge connection. However, if there is an index on the source data, or the results have been sorted, sorting is not needed when performing sort merge connection, and the performance of merge connection will be better than that of hash connection.
In the following example, both the id field of the people and the depto field of dept01 have indexes, and the data scanned from the indexes has been arranged in order. You can directly go to Merge Join:
highgo=# explain select people.id from people,dept01 where people.id=dept01.deptno;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Merge Join (cost=0.86..64873.59 rows=1048576 width=4)
Merge Cond: (people.id = dept01.deptno)
-> Index Only Scan using people_pkey on people (cost=0.44..303935.44 rows=10000000 width=4)
-> Index Only Scan using idx_deptno on dept01 (cost=0.42..51764.54 rows=1048576 width=2)
(4 Row record)
To delete the index on dept01, you will find that dept01 is sorted before Merge Join in the execution plan. The example is as follows:
highgo=# explain select people.id from people,dept01 where people.id=dept01.deptno;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Merge Join (cost=136112.80..154464.29 rows=1048576 width=4)
Merge Cond: (people.id = dept01.deptno)
-> Index Only Scan using people_pkey on people (cost=0.44..303935.44 rows=10000000 width=4)
-> Materialize (cost=136112.36..141355.24 rows=1048576 width=2)
-> Sort (cost=136112.36..138733.80 rows=1048576 width=2)
Sort Key: dept01.deptno
-> Seq Scan on dept01 (cost=0.00..16918.76 rows=1048576 width=2)
(7 Row record)
In the above execution plan, you can see "Sort Key: dept01.deptno", which is to sort the id field of table dept01.