Postgresql source code (31) Btree index related system tables and overall structure

Posted by andrewgk on Thu, 10 Feb 2022 18:35:08 +0100

version: postgresql10.17

0 summary

summary

  • leaf page
    • If the level 0 page (the bottom page) has a right brother, the first one is the first one of the right brothers (it should be the last one on this page. Note that this one does not belong to this page, but the smallest one on the next page is worth copying). Note: the leftmost page of level 0 page also has this copy
    • If there is no data on the bottom page (0), the first page is the bottom page
  • root / branch
    • The data stored in the root/branch page is the minimum value pointing to the page (normally the second one, and the first one is generally the copy of the second "minimum value of current page" pointing to the right brother of the page). In particular, the data that the root/branch points to the leftmost page does not record information.
  • How to find the leftmost and rightmost of the intermediate node (branch/root)?
    • The data on the leftmost page is empty
    • The first piece of leaf pointed to by the rightmost node is the minimum value of the current page (the first piece of other pages saves the copy of the minimum value of right sibling)
--------
| meta |
--------
| root |
--------
| left branch | branch | branch | ..... | branch |
--------------------------------------------------------
| leaf | leaf | leaf | leaf | leaf | right leaf   |
--------------------------------------------------------

! left branch of data Empty
! branch of data Is the minimum value that points to the page
! right leaf The first entry is the minimum value of the current page
! leaf The first one is right sibling Minimum value of copy

1 related system table

What index types are supported?

postgres=# \d pg_am
               Table "pg_catalog.pg_am"
  Column   |  Type   | Collation | Nullable | Default 
-----------+---------+-----------+----------+---------
 amname    | name    |           | not null | 
 amhandler | regproc |           | not null | 
 amtype    | "char"  |           | not null |
 
postgres=# select * from pg_am;
 amname |  amhandler  | amtype 
--------+-------------+--------
 btree  | bthandler   | i
 hash   | hashhandler | i
 gist   | gisthandler | i
 gin    | ginhandler  | i
 spgist | spghandler  | i
 brin   | brinhandler | i

PS. the system table before 9.6 contains a lot of attribute information of method. In the new version, these attributes are saved in C code.

amhandler's correlation function is in PG_ In proc, all are C functions.

select proname, prosrc from pg_proc where oid in (330,331,332,333,334,335);
   proname   |   prosrc    
-------------+-------------
 bthandler   | bthandler
 hashhandler | hashhandler
 gisthandler | gisthandler
 ginhandler  | ginhandler
 spghandler  | spghandler
 brinhandler | brinhandler

example:

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           | not null | 
 info   | integer |           |          | 
Indexes:
    "pk_t2_a" PRIMARY KEY, btree (id)

System table meaning:

postgres=# select * from pg_index where indexrelid='pk_t2_a'::regclass;
-[ RECORD 1 ]--+-------
indexrelid     | 126817
indrelid       | 126814
indnatts       | 1          -- Number of columns
indisunique    | t          -- unique index
indisprimary   | t          -- primary key 
indisexclusion | f          -- support exclusion constraint
indimmediate   | t          -- Force every time insert Check uniqueness when
indisclustered | f          -- Clustering is done according to this index
indisvalid     | t          -- Used for query
indcheckxmin   | f          -- Index tuples need to be checked when querying xmin<Transactional xmin Can use
indisready     | t          -- Whether the index can be inserted or updated
indislive      | t          -- f Indicates that the index is being drop,Other operations should ignore this index
indisreplident | f          -- ALTER TABLE ... REPLICA IDENTITY USING INDEX
indkey         | 1          -- Which columns were built with
indcollation   | 0
indclass       | 1978
indoption      | 0
indexprs       | 
indpred        | 

-- indisreplident Subscribed update and delete,Need to specify REPLICA IDENTITY To perform update deletion

postgres=# select oid,* from pg_class order by oid desc limit 1;
-[ RECORD 1 ]-------+--------
oid                 | 126817
relname             | pk_t2_a
relnamespace        | 2200
reltype             | 0
reloftype           | 0
relowner            | 10
relam               | 403
relfilenode         | 126817
reltablespace       | 0
relpages            | 1
reltuples           | 0
relallvisible       | 0
reltoastrelid       | 0
relhasindex         | f
relisshared         | f
relpersistence      | p
relkind             | i
relnatts            | 1
relchecks           | 0
relhasoids          | f
relhaspkey          | f
relhasrules         | f
relhastriggers      | f
relhassubclass      | f
relrowsecurity      | f
relforcerowsecurity | f
relispopulated      | t
relreplident        | n
relispartition      | f
relfrozenxid        | 0
relminmxid          | 0
relacl              | 
reloptions          | 
relpartbound        |

2 data structure

|-----------------------------------------------------|
| linp0(page header) | linp1 | linp2 |                |
|-----------------------------------------------------|
|                                                     |
|                                                     |
|-----------------------------------------------------|
|      | itup2 | itup1 | special space(BTMetaPageData)|
|-----------------------------------------------------|

BTPageOpaqueData maintains the structure of B-linked-tree in special space

typedef struct BTPageOpaqueData
{
	BlockNumber btpo_prev;		/* The block number of the left brother is convenient for reverse scanning */
	BlockNumber btpo_next;		/* Right brother's block number, forward scanning*/
	union
	{
		uint32		level;		/* level in the index tree*/
		TransactionId xact;		/* Delete useless ID S */
	}			btpo;
	uint16		btpo_flags;		/* Page type */
	BTCycleId	btpo_cycleid;	/* vacuum cycle ID of latest split */
} BTPageOpaqueData;


/* Bits defined in btpo_flags */
#define BTP_LEAF 		 (1 << 0) 	/*  This flag indicates that there are leaf pages*/
#define BTP_ROOT 		 (1 << 1) 	/*  Root page*/
#define BTP_DELETED 		 (1 << 2) 	/*  Pages deleted from the index tree*/
#define BTP_META 		 (1 << 3) 	/*  Meta page*/
#define BTP_HALF_DEAD 	 (1 << 4) 	/*  Empty page, but keep it*/
#define BTP_SPLIT_END	(1 << 5)	/* rightmost page of split group */
#define BTP_ HAS_ GARBAGE (1 << 6) 	/*  page has LP_DEAD tuples, LP -- DEAD tuples*/
#define BTP_INCOMPLETE_SPLIT (1 << 7)	/* right sibling's downlink is missing */

// BTMetaPageData
typedef struct BTMetaPageData
{
	uint32		btm_magic;		/* should contain BTREE_MAGIC */
	uint32		btm_version;	/* should contain BTREE_VERSION */
	BlockNumber btm_root;		/* current root location */
	uint32		btm_level;		/* tree level of the root page */
	BlockNumber btm_fastroot;	/* current "fast" root location */
	uint32		btm_fastlevel;	/* tree level of the "fast" root page */
} BTMetaPageData;


3 observe the storage structure

summary

  • If the level 0 page (the bottom page) has a right brother, the first one is the first one of the right brothers (it should be the last one on this page. Note that this one does not belong to this page, but the smallest one on the next page is worth copying). Note: the leftmost page of level 0 page also has this copy
  • If there is no right brother in the level 0 page (the bottom page), the first item is the starting data
  • The data stored in the root page is the minimum value pointing to the page (normally the second one, and the first one is the copy of the right slicing minimum value). This first one is generally the copy of the second one (the minimum value of the current page) pointing to the right brother of the page. In particular, the data pointing to the leftmost page does not record information.

Observe the storage structure:

----------------
|  meta page   |
----------------
| root page    |
----------------

Create test table

create table t3(id int primary key, info text);

meta page

The 0th page of the test table is meta page, pointing to the root page

postgres=# \d t3
                 Table "public.t3"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           | not null | 
 info   | text    |           |          | 
Indexes:
    "t3_pkey" PRIMARY KEY, btree (id)


postgres=#  select * from bt_page_stats('t3_pkey',0);
ERROR:  block 0 is a meta page

postgres=# select * from bt_page_items('t3_pkey',0);
ERROR:  block 0 is a meta page

postgres=# select * from bt_metap('t3_pkey');
 magic  | version | root | level | fastroot | fastlevel 
--------+---------+------+-------+----------+-----------
 340322 |       2 |    1 |     0 |        1 |         0

Scenario 1: 100 entries | meta page | root | branch | leaf

create table t3(id int primary key, info text);
insert into t3 select generate_series(1,100), md5(random()::text);

level = 0

select * from bt_metap('t3_pkey');

btpo = grade
btpo_flags = type

postgres=#  select * from bt_page_stats('t3_pkey',1);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     1 | l    |        100 |          0 |            16 |      8192 |      6148 |         0 |         0 |    0 |          3
(1 row)

bt_page_stats data is taken from btpageopera of specialspan. Refer to btpageopera data structure for meaning

bt_page_stats
  GetBTPageStatistics
    BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    ...
    stat->btpo_prev = opaque->btpo_prev;
	stat->btpo_next = opaque->btpo_next;
	stat->btpo.level = opaque->btpo.level;
	stat->btpo_flags = opaque->btpo_flags;
	stat->btpo_cycleid = opaque->btpo_cycleid;
	...

https://developer.aliyun.com/article/53701
The hierarchy can be from BT_ page_ The btpo of stats is obtained, which represents the level of the current index page.
Note that the level is not unique. For example, the level with btpo=3 may be divided into several levels.
btpo=0 indicates the lowest level. The items(ctid) stored in index pages at this level points to the heap page.
The category and level are not linked. There can be multiple levels in the category, but only the ctid content stored in the index page with level = 0 points to the heap page
The ctid content stored in index pages of other levels refers to other index pages (two-way linked list) of the same level or index pages of lower levels.
0 layer structure, only meta and root pages.
The maximum number of item s that can be stored in the root page depends on the length of the index field data and the size of the index page.

current
btpo = grade = 0
btpo_flags = type = 3 = BTP_LEAF | BTP_ROOT

#define BTP_LEAF		(1 << 0)	/* leaf page, i.e. not internal page */
#define BTP_ROOT		(1 << 1)	/* root page (has no parent) */
#define BTP_DELETED		(1 << 2)	/* page has been deleted from tree */
#define BTP_META		(1 << 3)	/* meta-page */
#define BTP_HALF_DEAD	(1 << 4)	/* empty, but still in tree */
#define BTP_SPLIT_END	(1 << 5)	/* rightmost page of split group */
#define BTP_HAS_GARBAGE (1 << 6)	/* page has LP_DEAD tuples */
#define BTP_INCOMPLETE_SPLIT (1 << 7)	/* right sibling's downlink is missing */

btpo knows that this is a level 0 node, and the level 0 node points to the heap tuple

postgres=# select * from bt_page_items('t3_pkey',1) limit 10;
 itemoffset |  ctid  | itemlen | nulls | vars |          data           
------------+--------+---------+-------+------+-------------------------
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00
         
postgres=# select * from t3 where ctid='(0,10)';
 id |               info               
----+----------------------------------
 10 | 986dd76dcec6d1b705ea163fe6e11f37

Scenario 2: 1000 | meta | root | branch | leaf

create table t4(id int primary key, info text);
insert into t4 select generate_series(1,1000), md5(random()::text);

level = 1 (only leaf, meta and root are excluded)

select * from bt_metap('t4_pkey');

1. root page ID=3

postgres=# select * from bt_metap('t4_pkey');
 magic  | version | root | level | fastroot | fastlevel 
--------+---------+------+-------+----------+-----------
 340322 |       2 |    3 |     1 |        3 |         1

2. Page stats

-- btpo=0: It's already at the bottom
-- btpo_flags=1: BTP_LEAF
postgres=# select * from bt_page_stats('t4_pkey',1);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     1 | l    |        367 |          0 |            16 |      8192 |       808 |         0 |         2 |    0 |          1

-- btpo=0: It's already at the bottom
-- btpo_flags=1: BTP_LEAF
postgres=# select * from bt_page_stats('t4_pkey',2);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     2 | l    |        367 |          0 |            16 |      8192 |       808 |         1 |         4 |    0 |          1

-- btpo=1: Not the bottom
-- btpo_flags=2: BTP_ROOT
postgres=# select * from bt_page_stats('t4_pkey',3);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     3 | r    |          3 |          0 |            13 |      8192 |      8096 |         0 |         0 |    1 |          2

-- btpo=0: It's already at the bottom
-- btpo_flags=1: BTP_LEAF
postgres=# select * from bt_page_stats('t4_pkey',4);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     4 | l    |        268 |          0 |            16 |      8192 |      2788 |         2 |         0 |    0 |          1

postgres=# select * from bt_page_stats('t4_pkey',5);
ERROR:  block number out of range

3. root page items

Point to three pages. data stores the minimum value stored in this leaf page

postgres=# select * from bt_page_items('t4_pkey',3);
 itemoffset | ctid  | itemlen | nulls | vars |          data           
------------+-------+---------+-------+------+-------------------------
          1 | (1,1) |       8 | f     | f    | 
          2 | (2,1) |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          3 | (4,1) |      16 | f     | f    | dd 02 00 00 00 00 00 00

First level 0 page:

postgres=# select * from bt_page_items('t4_pkey',1);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------+---------+---------+-------+------+-------------------------
          1 | (3,7)   |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          2 | (0,1)   |      16 | f     | f    | 01 00 00 00 00 00 00 00
          3 | (0,2)   |      16 | f     | f    | 02 00 00 00 00 00 00 00
          4 | (0,3)   |      16 | f     | f    | 03 00 00 00 00 00 00 00
...
        365 | (3,4)   |      16 | f     | f    | 6c 01 00 00 00 00 00 00
        366 | (3,5)   |      16 | f     | f    | 6d 01 00 00 00 00 00 00
        367 | (3,6)   |      16 | f     | f    | 6e 01 00 00 00 00 00 00

Second level 0 page:

postgres=# select * from bt_page_items('t4_pkey',2);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------+---------+---------+-------+------+-------------------------
          1 | (6,13)  |      16 | f     | f    | dd 02 00 00 00 00 00 00
          2 | (3,7)   |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          3 | (3,8)   |      16 | f     | f    | 70 01 00 00 00 00 00 00
          4 | (3,9)   |      16 | f     | f    | 71 01 00 00 00 00 00 00
...
        365 | (6,10)  |      16 | f     | f    | da 02 00 00 00 00 00 00
        366 | (6,11)  |      16 | f     | f    | db 02 00 00 00 00 00 00
        367 | (6,12)  |      16 | f     | f    | dc 02 00 00 00 00 00 00

Fourth level 0 page:

postgres=# select * from bt_page_items('t4_pkey',4);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------+---------+---------+-------+------+-------------------------
          1 | (6,13)  |      16 | f     | f    | dd 02 00 00 00 00 00 00
          2 | (6,14)  |      16 | f     | f    | de 02 00 00 00 00 00 00
          3 | (6,15)  |      16 | f     | f    | df 02 00 00 00 00 00 00
          4 | (6,16)  |      16 | f     | f    | e0 02 00 00 00 00 00 00
...
        266 | (8,38)  |      16 | f     | f    | e6 03 00 00 00 00 00 00
        267 | (8,39)  |      16 | f     | f    | e7 03 00 00 00 00 00 00
        268 | (8,40)  |      16 | f     | f    | e8 03 00 00 00 00 00 00

Scenario 3: 10000000 | meta | root | branch | leaf

Summary:

  • The first number of ctid indicates the page ID, and the second number is the offset
  • btpo_flags = 0 is branch page, btpo_flags=2 is the root page
create table t5(id int primary key, info text);  
insert into t5 select trunc(random()*10000000), md5(random()::text) from generate_series(1,1000000) on conflict on constraint t5_pkey do nothing;

level = 2 (with branch and leaf) root is 412

select * from bt_metap('t5_pkey');
 magic  | version | root | level | fastroot | fastlevel 
--------+---------+------+-------+----------+-----------
 340322 |       2 |  412 |     2 |      412 |         2

Check the root (the data stores the maximum value of the left brother)
btpo = 2 (on the second floor)
btpo_flags = 2 = BTP_ROOT

postgres=# select * from bt_page_stats('t5_pkey', 412);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
   412 | r    |         11 |          0 |            15 |      8192 |      7936 |         0 |         0 |    2 |          2
   
postgres=# select * from bt_page_items('t5_pkey', 412);
 itemoffset |   ctid   | itemlen | nulls | vars |          data           
------------+----------+---------+-------+------+-------------------------
          1 | (3,1)    |       8 | f     | f    | 
          2 | (2714,1) |      16 | f     | f    | 66 07 0a 00 00 00 00 00
          3 | (1233,1) |      16 | f     | f    | a0 8c 16 00 00 00 00 00
          4 | (2381,1) |      16 | f     | f    | 0c 62 23 00 00 00 00 00
          5 | (583,1)  |      16 | f     | f    | 4d 71 30 00 00 00 00 00
          6 | (2286,1) |      16 | f     | f    | 6b 80 3d 00 00 00 00 00
          7 | (1062,1) |      16 | f     | f    | 59 fb 4b 00 00 00 00 00
          8 | (2046,1) |      16 | f     | f    | de b2 5a 00 00 00 00 00
          9 | (411,1)  |      16 | f     | f    | d6 df 6a 00 00 00 00 00
         10 | (2006,1) |      16 | f     | f    | 06 6d 79 00 00 00 00 00
         11 | (1380,1) |      16 | f     | f    | 8c 34 8a 00 00 00 00 00

It seems that there are 11 branch page s, including 3, 2714, 1233 and 2381.

postgres=# select * from bt_page_stats('t5_pkey', 3);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     3 | i    |        214 |          0 |            15 |      8192 |      3876 |         0 |      2714 |    1 |          0

postgres=# select * from bt_page_stats('t5_pkey', 2714);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
  2714 | i    |        273 |          0 |            15 |      8192 |      2696 |         3 |      1233 |    1 |          0

Let's take a look at the content of the branch, for example, 3 pages:

postgres=# select * from bt_page_items('t5_pkey', 3);
 itemoffset |   ctid   | itemlen | nulls | vars |          data           
------------+----------+---------+-------+------+-------------------------
          1 | (1852,1) |      16 | f     | f    | 66 07 0a 00 00 00 00 00
          2 | (1,1)    |       8 | f     | f    | 
          3 | (3176,1) |      16 | f     | f    | 5c 08 00 00 00 00 00 00
          4 | (1612,1) |      16 | f     | f    | ef 11 00 00 00 00 00 00
          5 | (3045,1) |      16 | f     | f    | 2d 1a 00 00 00 00 00 00
          6 | (724,1)  |      16 | f     | f    | e2 24 00 00 00 00 00 00
          7 | (2822,1) |      16 | f     | f    | ae 2d 00 00 00 00 00 00
          8 | (1369,1) |      16 | f     | f    | 96 38 00 00 00 00 00 00
          9 | (2612,1) |      16 | f     | f    | ac 42 00 00 00 00 00 00
         10 | (315,1)  |      16 | f     | f    | 0d 4e 00 00 00 00 00 00
         11 | (2458,1) |      16 | f     | f    | d6 58 00 00 00 00 00 00
...
...

What does branch data store?

Check the content of leaf page 1852. The first item is the copy of the minimum value of the right brother; The second is the current minimum value.
So data is the smallest data to save to the page.

postgres=# select * from bt_page_items('t5_pkey', 1852);
 itemoffset |    ctid    | itemlen | nulls | vars |          data           
------------+------------+---------+-------+------+-------------------------
          1 | (151,77)   |      16 | f     | f    | 2b 18 0a 00 00 00 00 00
          2 | (1231,33)  |      16 | f     | f    | 66 07 0a 00 00 00 00 00     <-----------
          3 | (4098,24)  |      16 | f     | f    | 67 07 0a 00 00 00 00 00
          4 | (2373,32)  |      16 | f     | f    | 73 07 0a 00 00 00 00 00
          5 | (7888,95)  |      16 | f     | f    | 9e 07 0a 00 00 00 00 00

How to find the minimum value?

Check the root and find that only (3,1) has no data. Data should have saved the minimum value of the right brother. No description (3,1) is the first page.

postgres=# select * from bt_page_items('t5_pkey', 412);
 itemoffset |   ctid   | itemlen | nulls | vars |          data           
------------+----------+---------+-------+------+-------------------------
          1 | (3,1)    |       8 | f     | f    | 
          2 | (2714,1) |      16 | f     | f    | 66 07 0a 00 00 00 00 00
          3 | (1233,1) |      16 | f     | f    | a0 8c 16 00 00 00 00 00
          4 | (2381,1) |      16 | f     | f    | 0c 62 23 00 00 00 00 00

View branch

postgres=# select * from bt_page_items('t5_pkey', 3);
 itemoffset |   ctid   | itemlen | nulls | vars |          data           
------------+----------+---------+-------+------+-------------------------
          1 | (1852,1) |      16 | f     | f    | 66 07 0a 00 00 00 00 00
          2 | (1,1)    |       8 | f     | f    | 
          3 | (3176,1) |      16 | f     | f    | 5c 08 00 00 00 00 00 00
          4 | (1612,1) |      16 | f     | f    | ef 11 00 00 00 00 00 00
          5 | (3045,1) |      16 | f     | f    | 2d 1a 00 00 00 00 00 00
          6 | (724,1)  |      16 | f     | f    | e2 24 00 00 00 00 00 00
          7 | (2822,1) |      16 | f     | f    | ae 2d 00 00 00 00 00 00
          8 | (1369,1) |      16 | f     | f    | 96 38 00 00 00 00 00 00
          9 | (2612,1) |      16 | f     | f    | ac 42 00 00 00 00 00 00

Check the leaf. The first one is the minimum value of the right sibling, and the second one is the minimum value of the current page.

postgres=# select * from bt_page_items('t5_pkey', 1);
 itemoffset |    ctid    | itemlen | nulls | vars |          data           
------------+------------+---------+-------+------+-------------------------
          1 | (212,57)   |      16 | f     | f    | 5c 08 00 00 00 00 00 00
          2 | (7238,62)  |      16 | f     | f    | 2a 00 00 00 00 00 00 00
          3 | (430,12)   |      16 | f     | f    | 2b 00 00 00 00 00 00 00
          4 | (5535,81)  |      16 | f     | f    | 3e 00 00 00 00 00 00 00
          5 | (1278,47)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00
          6 | (4255,45)  |      16 | f     | f    | 46 00 00 00 00 00 00 00

View data

postgres=# select * from bt_page_items('t5_pkey', 1);
 itemoffset |    ctid    | itemlen | nulls | vars |          data           
------------+------------+---------+-------+------+-------------------------
          1 | (212,57)   |      16 | f     | f    | 5c 08 00 00 00 00 00 00
          2 | (7238,62)  |      16 | f     | f    | 2a 00 00 00 00 00 00 00
          3 | (430,12)   |      16 | f     | f    | 2b 00 00 00 00 00 00 00
          4 | (5535,81)  |      16 | f     | f    | 3e 00 00 00 00 00 00 00
          5 | (1278,47)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00
          6 | (4255,45)  |      16 | f     | f    | 46 00 00 00 00 00 00 00

postgres=# select * from t5 where ctid='(7238,62)';
 id |               info               
----+----------------------------------
 42 | faca6c2f77cfb6eea5e9ee6bf2740578

postgres=# select min(id) from t5;
 min 
-----
  42

ref

https://developer.aliyun.com/article/53701

The next article begins to turn to the source code

Topics: Database PostgreSQL pgsql