How does SQL split and intercept comma-separated strings?

Posted by harryman100 on Thu, 10 Feb 2022 06:01:59 +0100

This article was written because of a recent need:
Known field values for a table are irregular comma-separated strings with the following styles:

Now split the field into four fields by comma with the following styles:

I have not had a ready-made function in Baidu for a long time, so I can only think of my own way.
First of all, the simplest and most clumsy way to do this is to use the substring function and charindex function, which is to intercept the string. The charindex function is to get the position of a character in the string.

That is, use the charindex function to get the position of commas 1,2,3, and then intercept them, taking into account that if some fields do not have three commas, an error will be reported.

I've customized a function with this idea. It's not a lot of rubbish. I'll paste the main code and explain it step by step:

First, get the number of commas in the string:

Define a variable called @cfq (before splitting), take it as an input, and get the number of commas in the variable. Baidu has a very complex writing here. In fact, there is a clever way:

select @dhsl=(select len(@cfq)-len(replace(@cfq,',','')))

This means total character length - character length after removing commas = number of commas.
Programming, ideas are important.
Add a new entry called @type, int, which means the number of strings you want to look up.
Make a judgement: If @type>number of commas+1, or if the entry @cfq is empty, return the empty value directly to prevent errors. The code is as follows:

if @type>@dhsl+1  or  isnull(@cfq,'')='' ---Returns null if more than its own character count
  begin 
  select @cfhzfc =''    ---Split String
  end

2. Define a length variable @len followed by intercept to be used, and add a comma after the value passed in by @cfq to prevent subsequent errors:

select @len =0,@cfq=@cfq+','

When the number of commas is 0, that is, there is only a valid string to get the value of @cfq directly from the first sub-character to the character before the comma.

This is a bit of a twist. The number of commas is 0, which is the number of commas that were originally passed in. Then we added a comma at the end of @cfq. I said it was to prevent errors. Look at the code

if @dhsl=0
 begin 
 select @cfhzfc1 =SUBSTRING(@cfq,1,charindex(',',@cfq)-1)
 end 

The third parameter after the charindex function refers to where to start retrieving, and where to start without passing. So let's look at the code when @dhsl=1:

if  @dhsl=1
 begin 
   select @cfhzfc1 =SUBSTRING(@cfq,1,charindex(',',@cfq)-1)
   select @len =len(@cfhzfc1)+2
   select @cfhzfc2 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
 end

@len gets the position of the first character beyond the first comma, and charindex(',', @cfq,@len) gets the position of the next comma.

3. Write out the acquisition method of the third and fourth strings according to the same logic, that is, when the number of commas in the original input is 2 and 3, the acquisition data code is as follows:

if  @dhsl=2
begin
 select @cfhzfc1 =SUBSTRING(@cfq,1,charindex(',',@cfq)-1)
 select @len =len(@cfhzfc1)+2

 select @cfhzfc2 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
 select @len =@len+len(@cfhzfc2)+1

 select @cfhzfc3 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
end 

if  @dhsl=3
begin
select@cfhzfc1 =SUBSTRING(@cfq,1,charindex(',',@cfq)-1)
select @len =len(@cfhzfc1)+2

 select @cfhzfc2 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
 select @len =@len+len(@cfhzfc2)+1

 select @cfhzfc3 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
 select @len =@len+len(@cfhzfc3)+1

 select @cfhzfc4 =SUBSTRING(@cfq,@len,charindex(',',@cfq,@len)-@len)
end

4. Finally, we output the desired results according to the requirements of @type:

select @cfhzfc =
case @type when 1 then  @cfhzfc1
when 2 then  @cfhzfc2
when 3 then  @cfhzfc3
when 4 then  @cfhzfc4
end

5. I named this function dhzfcf and called it as follows:

select dbo.dhzfcf(crq,1) Field 1   
, dbo.dhzfcf(crq,2) Field 2   
,dbo.dhzfcf(crq,3) Field 3   
, dbo.dhzfcf(crq,4) Field 4    from #temp1

This gives you the result you want at the beginning of the article

Later in the process of communicating with colleagues, I found that there are actually several ways to make such a split, the simplest is to directly replace the comma with ",", and then use insert into to fetch, interested students can study it by themselves.

Okay, this is where today's share comes in. Those who don't understand it will read it several times. Welcome to comment, pay attention, collect and forward. For more wonderful sharing, please refer to this historical article!

Topics: Database MySQL SQL string