New Feature Interpretation | MySQL 8.0 Regular Replacement

Posted by cljones81 on Thu, 25 Jul 2019 09:12:28 +0200

Originality: Yang Taotao

MySQL has always supported regular matching, but regular substitution was not supported until MySQL 8.0. For such scenarios, it used to be either processed on MySQL or processed on Application side.

For example, I want to replace the substring of the third action in the column str1 of table y1 with dble.

1. Write your own storage function in the SQL layer. As a demonstration, MySQL does not recommend writing such functions.

mysql
DELIMITER $$
USE `ytt`$$
DROP FUNCTION IF EXISTS `func_instr_simple_ytt`$$
CREATE DEFINER=`root`@`localhost` FUNCTION `func_instr_simple_ytt`(
f_str VARCHAR(1000), -- Parameter 1
f_substr VARCHAR(100), -- Parameter 2
f_replace_str varchar(100),
f_times int -- times counter.only support 3.
) RETURNS varchar(1000)
BEGIN
declare v_result varchar(1000) default 'ytt'; -- result.
declare v_substr_len int default 0; -- search string length.

set f_times = 3; -- only support 3.
set v_substr_len = length(f_substr);
select instr(f_str,f_substr) into @p1; -- First real position .
select instr(substr(f_str,@p1+v_substr_len),f_substr) into @p2; Secondary virtual position.
select instr(substr(f_str,@p2+ @p1 +2*v_substr_len - 1),f_substr) into @p3; -- Third virtual position.
if @p1 > 0 && @p2 > 0 && @p3 > 0 then -- Fine.
select
concat(substr(f_str,1,@p1 + @p2 + @p3 + (f_times - 1) * v_substr_len - f_times)
,f_replace_str,
substr(f_str,@p1 + @p2 + @p3 + f_times * v_substr_len-2)) into v_result;
else
set v_result = f_str; -- Never changed.
end if;
-- Purge all session variables.
set @p1 = null;
set @p2 = null;
set @p3 = null;
return v_result;

end;
$$
DELIMITER ;

-- Call functions to update:
mysql> update y1 set str1 = func_instr_simple_ytt(str1,'action','dble',3);
Query OK, 20 rows affected (0.12 sec)
Rows matched: 20 Changed: 20 Warnings: 0

2. Export to replace import with sed or other tools. The steps are as follows: (recommended)

1) Export the records of table y 1.

mysql
mysql> select * from y1 into outfile '/var/lib/mysql-files/y1.csv';
Query OK, 20 rows affected (0.00 sec)

2) Replace the exported data with sed.

shell
root@ytt-Aspire-V5-471G:/var/lib/mysql-files# sed -i 's/action/dble/3' y1.csv

3) Import the processed data again and complete it.

mysql
mysql> truncate y1;
Query OK, 0 rows affected (0.99 sec)

mysql> load data infile '/var/lib/mysql-files/y1.csv' into table y1;
Query OK, 20 rows affected (0.14 sec)
Records: 20 Deleted: 0 Skipped: 0 Warnings: 0

The above two are still recommended to export to handle and re-import, the performance is better, but also do not have to work hard to write their own function code.

MySQL 8.0 is very simple for the above scenarios, and a function is done.

mysql
mysql> update y1 set str1 = regexp_replace(str1,'action','dble',1,3) ;
Query OK, 20 rows affected (0.13 sec)
Rows matched: 20 Changed: 20 Warnings: 0

There's also a regexp_instr that's very useful, especially in this particular scenario that highlights the current number of times. For example, define the SESSION variable @a.

mysql
mysql> set @a = 'aa bb cc ee fi lucy 1 1 1 b s 2 3 4 5 2 3 5 561 19 10 10 20 30 10 40';
Query OK, 0 rows affected (0.04 sec)

Get the location of the second substring of the number that appears at least twice.

mysql
mysql> select regexp_instr(@a,'[:digit:]{2,}',1,2);
+--------------------------------------+
| regexp_instr(@a,'[:digit:]{2,}',1,2) |
+--------------------------------------+
| 50 |
+--------------------------------------+
1 row in set (0.00 sec)

So let's see how multi-byte character support works.

mysql
mysql> set @a = 'China, USA, Russia, Japan, China, Beijing, Shanghai, Shenzhen, Guangzhou, Beijing, Shanghai, Wuhan, Dongguan, Beijing, Qingdao and Beijing';
Query OK, 0 rows affected (0.00 sec)

mysql> select regexp_instr(@a,'Beijing',1,1);
+-------------------------------+
| regexp_instr(@a,'Beijing',1,1) |
+-------------------------------+
| 17 |
+-------------------------------+
1 row in set (0.00 sec)

mysql> select regexp_instr(@a,'Beijing',1,2);
+-------------------------------+
| regexp_instr(@a,'Beijing',1,2) |
+-------------------------------+
| 29 |
+-------------------------------+
1 row in set (0.00 sec)

mysql> select regexp_instr(@a,'Beijing',1,3);
+-------------------------------+
| regexp_instr(@a,'Beijing',1,3) |
+-------------------------------+
| 41 |
+-------------------------------+
1 row in set (0.00 sec)

In conclusion, here I mention two of the most useful regular matching functions of MySQL 8.0, regexp_replace and regexp_instr. There is a perfect solution for similar scenarios in the past.

Topics: Database MySQL Session SQL shell