Implementing a string-splitting function, split, in C++

Posted by David Rech on Fri, 22 Nov 2019 17:34:51 +0100

Preface

While learning the basics of string in C++, I found that istringstream [1] from <sstream> can read from a string much as cin reads from the console. In essence, this behaviour splits the string on whitespace. That suggested using it to implement a string-splitting function, which std::string does not offer as a member function.

string src("Avatar 123 5.2 Titanic K");
istringstream istrStream(src); // Create a stream that reads from src
string s1, s2;
int n;  double d;  char c;
istrStream >> s1 >> n >> d >> s2 >> c;
// Each whitespace-separated value is read into the corresponding variable

Implementation details

The goal is a function that, like split in JavaScript, returns an array of the resulting substrings in a single call, with the parameters adapted to what C++ allows.

1. Input and output:

string* split(int& length, string str, const char token = ' ')

Return: a pointer to the first element of the resulting string array
Passed in: the string str, the separator token (a space by default), and the reference parameter length, which receives the length of the dynamically allocated array

2. Escaping existing spaces:
Like cin, istringstream treats whitespace as the boundary between values. When the separator is not a space, the separator must therefore be replaced with a space, and any spaces already in the string must first be swapped for a placeholder character so that they are not mistaken for separators.
Character substitution uses replace() [2] from the <algorithm> header.

  const char SPACE = 0;
  if(token!=' ') {
    // Replace each original space with a placeholder, the NUL character (ASCII 0)
    replace(str.begin(), str.end(), ' ', SPACE); 
    // Then replace each separator with a space so that the string stream splits on it
    replace(str.begin(), str.end(), token, ' ');
  }
  Suppose the input string is: "a b,c,d,e,f g"
  and the separator is not a space: ','
  The string then becomes: "aSPACEb c d e fSPACEg"
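
The substitution step can be tried on its own. The following is a minimal sketch, assuming a hypothetical helper name escape_for_stream that is not part of the original code; it only wraps the two replace() calls shown above.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Returns a copy of str in which the original spaces are hidden behind
// a NUL placeholder and every separator has become a space.
std::string escape_for_stream(std::string str, char token) {
    const char SPACE = 0;  // placeholder for the original spaces
    if (token != ' ') {
        std::replace(str.begin(), str.end(), ' ', SPACE);
        std::replace(str.begin(), str.end(), token, ' ');
    }
    return str;
}
```

One caveat of this scheme: if the input already contains a NUL character, the restore step later on will turn it into a space, so the placeholder has to be a character that never occurs in the data.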

3. Data segmentation:

  // Create a string input stream over the string to be processed
  istringstream i_stream(str); 
  // Reset length to zero
  length = 0; 
  queue<string> q;
  // Read each token from the stream into s, push it onto the queue, and count it
  string s;
  while (i_stream>>s) {
    q.push(s);
    length++;
  }
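
One consequence of reading with >> is worth seeing in isolation: the extraction skips runs of whitespace, so two adjacent separators do not produce an empty field, and tabs or newlines in the input also act as boundaries. The sketch below illustrates this with a hypothetical helper read_tokens, which is not part of the original code and collects tokens into a std::vector purely for convenience.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): collect every
// whitespace-separated token that operator>> extracts from str.
std::vector<std::string> read_tokens(const std::string& str) {
    std::istringstream in(str);
    std::vector<std::string> tokens;
    std::string s;
    while (in >> s) {
        tokens.push_back(s);
    }
    return tokens;
}
```

For example, read_tokens("a  b") yields two tokens rather than three, so with this approach an input such as "a,,b" split on ',' gives "a" and "b" with no empty string in between.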

4. Array generation:

  // Dynamically allocate a string array sized by the count (the caller must delete[] it)
  string* results = new string[length]; 
  // Move the queued tokens into the array
  for (int i = 0; i < length; i++) {
    results[i] = q.front();
    // Restore the spaces that were replaced by the placeholder
    if(token!=' ') replace(results[i].begin(), results[i].end(), SPACE, ' ');
    q.pop();
  }

Complete code

#include <iostream>
#include <string>
#include <queue>
#include <sstream>
#include <algorithm>
using namespace std;

string* split(int& length, string str,const char token = ' ') {
  const char SPACE = 0;
  if(token!=' ') {
    replace(str.begin(), str.end(), ' ', SPACE);
    replace(str.begin(), str.end(), token, ' ');
  }
  istringstream i_stream(str);
  queue<string> q;
  length = 0;
  string s;
  while (i_stream>>s) {
    q.push(s);
    length++;
  }
  string* results = new string[length];
  for (int i = 0; i < length; i++) {
    results[i] = q.front();
    q.pop();
    if(token!=' ') replace(results[i].begin(), results[i].end(), SPACE, ' ');
  }
  return results;
}

// Test:
int main() {
  int length;
  string* results = split(length, "a b,c,d,e,f g", ',');
  for (int i = 0; i < length; i++) cout<<results[i]<<endl;
  delete[] results; // Release the array allocated inside split
  return 0;
}
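
Because split returns a raw array, the caller has to carry the length separately and remember to release the memory with delete[]. As a possible alternative, here is a minimal sketch of the same technique that returns a std::vector instead. The name split_to_vector is hypothetical and this variant is not part of the original code.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant (illustration only): same placeholder technique,
// but the result owns its memory and knows its own size.
std::vector<std::string> split_to_vector(std::string str, char token = ' ') {
    const char SPACE = 0;  // placeholder for the original spaces
    if (token != ' ') {
        std::replace(str.begin(), str.end(), ' ', SPACE);
        std::replace(str.begin(), str.end(), token, ' ');
    }
    std::istringstream in(str);
    std::vector<std::string> results;
    std::string s;
    while (in >> s) {
        // Restore the spaces that were hidden behind the placeholder
        if (token != ' ') std::replace(s.begin(), s.end(), SPACE, ' ');
        results.push_back(s);
    }
    return results;
}
```

With this version the queue and the length parameter are no longer needed: the caller can iterate over the returned vector directly and nothing has to be freed by hand.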

References

[1] C++ string class (C++ string) complete introduction
[2] Replace specified characters with C++ string
