
Tokenizing strings in C++

Tokenizing a string means splitting it into pieces (tokens) based on some delimiter. There are many ways to tokenize a string. Four of them are explained in this article:

Using string streams

A stringstream associates a string object with a stream, so you can read from the string as if it were a stream. Here is the C++ implementation:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

int main()
{
	string line = "juejin is a must try";
	vector<string> tokens;       // holds the extracted tokens
	stringstream check1(line);   // stream that reads from the string
	string intermediate;

	// Read up to each space and store the piece as a token
	while (getline(check1, intermediate, ' ')) {
		tokens.push_back(intermediate);
	}

	for (size_t i = 0; i < tokens.size(); i++) cout << tokens[i] << '\n';
}

The output

juejin
is
a
must
try

Using strtok()

// Splits str[] according to the given delimiters and returns the next token.
// It must be called in a loop to get all the tokens.
// It returns NULL when there are no more tokens.
char *strtok(char str[], const char *delims);

Here is the C++ implementation:

#include <stdio.h>
#include <string.h>

int main(void) {
	char str[] = "juejin-for-juejin";

	// The first call receives the string to split
	char *token = strtok(str, "-");

	// Later calls pass NULL to continue with the same string
	while (token != NULL) {
		printf("%s\n", token);
		token = strtok(NULL, "-");
	}
	return 0;
}

The output

juejin
for
juejin

Another example of strtok():

#include <string.h>
#include <stdio.h>

int main(void) {
	char gfg[100] = " juejin - for - juejin - Contribute";
	const char s[4] = "-";   // the only delimiter; spaces stay inside the tokens
	char* tok;

	// Get the first token
	tok = strtok(gfg, s);

	// Walk through the remaining tokens
	while (tok != NULL) {
		printf(" %s\n", tok);
		tok = strtok(NULL, s);
	}
	return 0;
}

The output (the spaces around each token are kept, because "-" is the only delimiter)

  juejin 
  for 
  juejin 
  Contribute

Using strtok_r()

Like the strtok() function in C, strtok_r() parses a string into a sequence of tokens. strtok_r() is the reentrant version of strtok(): it keeps its position in a pointer supplied by the caller instead of in hidden internal state. We can call strtok_r() in two ways.

Here is a simple C++ program to show strtok_r() in use:

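#include <stdio.h>
#include <string.h>

int main(void) {
	char str[] = "juejin for juejin";
	char *token;
	char *rest = str;   // strtok_r stores its position here between calls

	// Split on spaces; rest is advanced past each returned token
	while ((token = strtok_r(rest, " ", &rest)))
		printf("%s\n", token);

	return 0;
}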

The output

juejin
for
juejin

Using std::sregex_token_iterator

In this approach, tokenization is based on regular expression matching. It is better suited to use cases that require multiple delimiters.

Here is a simple C++ program that shows the use of std::sregex_token_iterator:

#include <algorithm>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Splits str wherever re matches and returns the non-empty pieces
std::vector<std::string> tokenize(const std::string& str, const std::regex& re) {
	// -1 selects the text between matches instead of the matches themselves
	std::sregex_token_iterator it{ str.begin(), str.end(), re, -1 };
	std::vector<std::string> tokenized{ it, {} };
	// Drop empty tokens, e.g. the one produced when str starts with a delimiter
	tokenized.erase(std::remove_if(tokenized.begin(), tokenized.end(),
			[](std::string const& s) { return s.size() == 0; }),
		tokenized.end());
	return tokenized;
}

int main() {
	const std::string str = "Separate strings with Spaces and commas.";
	const std::regex re(R"([\s,]+)");  // one or more whitespace characters or commas
	const std::vector<std::string> tokenized = tokenize(str, re);
	for (const std::string& token : tokenized) std::cout << token << std::endl;
	return 0;
}

The output

Separate
strings
with
Spaces
and
commas.

The getline() function and character arrays

In C++, the stream classes support line-oriented functions: getline() and write() perform input and output, respectively. The getline() function reads a whole line of text, stopping at a newline or when the maximum size is reached. getline() is a member function of the istream class, with the following syntax:

// (buffer, buffer size, delimiter)
istream& getline(char* buffer, streamsize size, char delim);

// The delimiter is '\n'
istream& getline(char* buffer, streamsize size);

This function does the following:

  1. Extracts characters up to the delimiter.
  2. Stores the characters in the buffer.
  3. Extracts at most size - 1 characters.

Note that the terminator (or delimiter) can be any character (such as ',', ' ', or any special character). The terminator is read but not stored in the buffer; a null character is stored in its place.

// A C++ program that demonstrates getline() with a character array
#include <iostream>
using namespace std;

int main()
{
	char str[20];
	cout << "Enter Your Name::";

	// Note the use of getline() with the array str. Replace the statement
	// below with cin >> str and compare the output.
	cin.getline(str, 20);

	cout << "\nYour name is: " << str;
	return 0;
}

Input:

Whale fall

Output:

Your name is: Whale fall

In the above program, the statement cin.getline(str, 20) reads characters until it encounters a newline or has stored the maximum of 19 characters (size - 1, which leaves room for the terminating null character). Try the function with different limits and see how the output changes.