This paper briefly introduces the new function multi-value index introduced in 8.0.17 below. As the name implies, the index can create multiple secondary index items on the same Primary key. In fact, it already supports the basic function of array type (feel that the official future will introduce array column type like pg), and based on arrayTo build a secondary index, which means that the secondary index can have more records than the clustered index records, so the index cannot be used for queries of general meaning, but only through specific interface functions, as illustrated in the following example.
This article does not do a thorough understanding of the code, only records the relevant entry functions, so that you can quickly review them when you encounter future work.A connection to the worklog is attached at the end, and interested friends can read the worklog directly to find out how it works.
Example
Extracted from official documents
root@test 04:08:50>show create table customers\G *************************** 1. row *************************** Table: customers Create Table: CREATE TABLE `customers` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `custinfo` json DEFAULT NULL, PRIMARY KEY (`id`), KEY `zips` ((cast(json_extract(`custinfo`,_latin1'$.zip') as unsigned array))) ) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) root@test 04:08:53>select * from customers; +----+---------------------+-------------------------------------------------------------------+ | id | modified | custinfo | +----+---------------------+-------------------------------------------------------------------+ | 1 | 2019-08-14 16:08:50 | {"user": "Jack", "user_id": 37, "zipcode": [94582, 94536]} | | 2 | 2019-08-14 16:08:50 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} | | 3 | 2019-08-14 16:08:50 | {"user": "Bob", "user_id": 31, "zipcode": [94477, 94536]} | | 4 | 2019-08-14 16:08:50 | {"user": "Mary", "user_id": 72, "zipcode": [94536]} | | 5 | 2019-08-14 16:08:50 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]} | +----+---------------------+-------------------------------------------------------------------+ 5 rows in set (0.00 sec)
The index is available through three functions, member of, json_contains, json_overlaps
root@test 04:09:00>SELECT * FROM customers WHERE 94507 MEMBER OF(custinfo->'$.zipcode'); +----+---------------------+-------------------------------------------------------------------+ | id | modified | custinfo | +----+---------------------+-------------------------------------------------------------------+ | 2 | 2019-08-14 16:08:50 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} | | 5 | 2019-08-14 16:08:50 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]} | +----+---------------------+-------------------------------------------------------------------+ 2 rows in set (0.00 sec) root@test 04:09:41>SELECT * FROM customers WHERE JSON_CONTAINS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON)); +----+---------------------+-------------------------------------------------------------------+ | id | modified | custinfo | +----+---------------------+-------------------------------------------------------------------+ | 2 | 2019-08-14 16:08:50 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} | | 5 | 2019-08-14 16:08:50 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]} | +----+---------------------+-------------------------------------------------------------------+ 2 rows in set (0.00 sec) root@test 04:09:54>SELECT * FROM customers WHERE JSON_OVERLAPS(custinfo->'$.zipcode', CAST('[94507,94582]' AS JSON)); +----+---------------------+-------------------------------------------------------------------+ | id | modified | custinfo | +----+---------------------+-------------------------------------------------------------------+ | 1 | 2019-08-14 16:08:50 | {"user": "Jack", "user_id": 37, "zipcode": [94582, 94536]} | | 2 | 2019-08-14 16:08:50 | {"user": "Jill", "user_id": 22, "zipcode": [94568, 94507, 94582]} | | 5 | 2019-08-14 16:08:50 | {"user": "Ted", "user_id": 56, "zipcode": [94507, 94582]} | +----+---------------------+-------------------------------------------------------------------+ 3 rows in set (0.00 sec)
Application program interface
multi-value index is an implementation of functional index. The definition of a column is a virtual column whose value is an array taken from json column
If the same value exists on the array, only one is stored on the index.Supported types: DECIMAL, INTEGER, DATETIME,VARCHAR/CHAR.There can only be one multi-value column on the index.
The following is a brief description of the interface functions
Array maximum capacity:
Entry function: ha_innobase::mv_key_capacity
Insert record:
Entry function row_ins_sec_index_multi_value_entry
Build the tuple by classing Multi_value_entry_builder_insert, then call the normal interface function row_ins_sec_index_entry to insert into the secondary index.
The parsed, sorted and unweighted data is stored in struct multi_value_data, and the pointer is in dfield_t::data. The multi_value_data structure is also a memory representation of the multi-value concrete value
Delete records:
Entry function: row_upd_del_multi_sec_index_entry
Build tuple based on class Multi_value_entry_builder_normal and delete it from index in turn
Update Records
Entry function: row_upd_multi_sec_index_entry
Since not all secondary index records may need to be updated, you need to calculate a diff to find out which records you want to update calc_row_difference--> innobase_get_multi_value_and_diff and set a bitmap that needs to be updated
rollback
Correlation function:
row_undo_ins_remove_multi_sec row_undo_mod_upd_del_multi_sec row_undo_mod_del_mark_multi_sec
When rolling back, get the value of the multi-value column from the undo log by trx_undo_rec_get_multi_value, and build and store it in the field data by interface Multi_value_logger::read
Record undo log
Function: trx_undo_store_multi_value
Store multi-value information in Undo log through Multi_value_logger::log.'Multi_value_logger'is an auxiliary class that records the value of multi-value column and how to read it out
purge secondary index record
Entry function:
row_purge_del_mark row_purge_upd_exist_or_extern_func |--> row_purge_remove_multi_sec_if_poss
Reference Documents
WL#10604: Create multi-value index
WL#8763: support multi-value functional index for InnoDB
WL#8955: Add support for multi-valued indexes
Official Documents