Foreword

Strings are an important data type, and regular expressions give programmers more power to manipulate them. ES6 adds many new features to strings and regular expressions, which Linglong summarizes comprehensively below.

The full text takes about 10 minutes to read

Strings

1. Better Unicode support

Unicode is a character set that aims to contain every character in the world. As long as a computer supports this one character set, it can display all characters without garbled text.

Before ES6, JS strings were built around 16-bit character encoding: each 16-bit sequence is a code unit that represents one character. Once Unicode introduced its extended character set, 16 bits were no longer enough to contain every character, and the encoding rules changed as a result.

In UTF-16, a code point can be represented by more than one code unit.

In UTF-16, each of the first 2^16 code points is represented by a single 16-bit code unit. This range is called the Basic Multilingual Plane (BMP). For code points beyond it, UTF-16 uses a surrogate pair: two 16-bit code units that together represent one code point, i.e. a 32-bit supplementary plane character. A character stored as a surrogate pair is a single character, but the length property reports 2.
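As a small illustration of the surrogate pair rule, the sketch below inspects the character U+20BB7 (𠮷), which lies outside the BMP. The variable name is only illustrative.

```javascript
// U+20BB7 is outside the BMP, so UTF-16 stores it as a surrogate pair.
const supplementaryChar = "\u{20BB7}";

// length counts 16-bit code units, not characters.
console.log(supplementaryChar.length);                      // 2

// The two code units of the surrogate pair, in hexadecimal.
console.log(supplementaryChar.charCodeAt(0).toString(16));  // "d842"
console.log(supplementaryChar.charCodeAt(1).toString(16));  // "dfb7"

// A BMP character needs only one code unit.
console.log("a".length);                                    // 1
```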

To learn more, see Ruan Yifeng's blog post: www.ruanyifeng.com/blog/2014/1… The "code points" mentioned in that post are the same code points discussed here.

1.1 codePointAt() method

Before ES6, the charCodeAt() method returned the value of the 16-bit code unit at a given position. ES6 adds the codePointAt() method, which returns the code point that starts at the given position. For a character made of multiple code units (a code point above hexadecimal FFFF), codePointAt(0) returns the full code point, while charCodeAt(0) returns only the first code unit at position 0.

So you can use this method to determine whether a character occupies one code unit or two:

function is32Bit(c) {
    // A code point above 0xFFFF needs two 16-bit code units
    return c.codePointAt(0) > 0xFFFF;
}
console.log(is32Bit("𠮷"));  // true
console.log(is32Bit("a"));   // false

1.2 String.fromCodePoint() method

The codePointAt() method retrieves the code point of a character in a string. Conversely, the String.fromCodePoint() method generates a character from a given code point.

console.log(String.fromCodePoint(134071)); // "𠮷"

1.3 normalize() method

Different code point sequences can represent the same character, which matters when comparing or sorting strings. There are two kinds of equivalence:

  • Canonical equivalence: the two code point sequences are interchangeable from every point of view.
  • Compatibility equivalence: the two code point sequences look different but can be used interchangeably in certain situations. They are not strictly equivalent unless they are normalized in some way.

The normalize() method converts a string to a Unicode normalization form. It accepts an optional string argument naming one of the four forms:

  • "NFC" (the default): decompose by canonical equivalence, then recompose by canonical equivalence
  • "NFD": decompose by canonical equivalence
  • "NFKD": decompose by compatibility equivalence
  • "NFKC": decompose by compatibility equivalence, then recompose by canonical equivalence
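A minimal sketch of how the normalization forms behave in practice. The sample characters (ñ and the "fi" ligature) and the variable names are chosen here for illustration only.

```javascript
const composed = "\u00F1";     // "ñ" as a single code point
const decomposed = "n\u0303";  // "n" followed by a combining tilde

// The two strings render the same but differ code point by code point.
console.log(composed === decomposed);                   // false

// Canonical decomposition (NFD) and composition (NFC, the default) make them comparable.
console.log(composed.normalize("NFD") === decomposed);  // true
console.log(decomposed.normalize() === composed);       // true

// The "fi" ligature is only compatibility-equivalent to the letters "fi".
const ligature = "\uFB01";
console.log(ligature.normalize("NFC") === ligature);    // true (unchanged)
console.log(ligature.normalize("NFKC"));                // "fi"
```

When sorting or comparing user input, normalizing both sides to the same form before the comparison avoids mismatches like the first one above.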

1.4 Regular expression u modifier

Adding the u modifier to a regular expression switches it from code unit mode to character mode, in which a surrogate pair is not treated as two characters.
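The difference between the two modes can be seen with a pattern that matches exactly one character. This is a small sketch using the supplementary character U+20BB7; the variable name is illustrative.

```javascript
const pairText = "\u{20BB7}";  // one character, two code units

// Without u, "." matches a single code unit, so one "." cannot cover the pair.
console.log(/^.$/.test(pairText));   // false

// With u, "." matches a whole code point.
console.log(/^.$/u.test(pairText));  // true
```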

The length property, however, still returns the number of code units in the string, not the number of code points. A regular expression with the u modifier can be used to count code points instead.

function codePointLength(text) {
    // [\s\S] matches any character; with u, each match is one code point
    let result = text.match(/[\s\S]/gu);
    return result ? result.length : 0;
}
console.log(codePointLength("𠮷abc")); // 4

Checking whether the u modifier is supported

The u modifier causes a syntax error in JavaScript engines that are not ES6-compatible. The following function checks for support.

function hasRegExpU() {
    try {
        // The constructor throws if the engine does not know the u modifier
        var pattern = new RegExp(".", "u");
        return true;
    } catch (ex) {
        return false;
    }
}

2. Other string changes

2.1 Detecting substrings in a string

Before ES6, the indexOf() method was used to detect a substring within a string. ES6 provides three methods that achieve a similar effect:

  • The startsWith() method returns true if the specified text is found at the beginning of the string, false otherwise.
  • The includes() method returns true if the specified text is found anywhere in the string, false otherwise.
  • The endsWith() method, as the name implies, checks at the end of the string and is used in the same way.

All three methods take two arguments. The first is the text to search for, a string. The second, which is optional, is a number giving the index at which the search starts. For endsWith(), the second argument marks the position treated as the end of the string; if it is omitted, matching starts from the end of the string. For example:

let mes = "hello world!";

console.log(mes.startsWith("hello"));   // true
console.log(mes.endsWith("!"));         // true
console.log(mes.includes("o"));         // true

console.log(mes.startsWith("o"));       // false
console.log(mes.endsWith("d!"));        // true
console.log(mes.includes("x"));         // false

console.log(mes.startsWith("o", 4));    // true
console.log(mes.endsWith("o", 8));      // true
console.log(mes.includes("o", 8));      // false

In console.log(mes.endsWith("o", 8)), matching starts at the second o, at position 7. That position is the index value minus the length of the text to search for: 8 - 1 = 7.

2.2 repeat() method

ES6 adds a repeat() method to strings. It takes a number as its argument and returns a new string that repeats the original string that many times.

console.log("x".repeat(3)); // "xxx"
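One place repeat() tends to be handy is building indentation. The sketch below assumes four spaces per level; indent is a hypothetical helper name, not part of the language.

```javascript
// Hypothetical helper: prefix a line with four spaces per indentation level.
function indent(line, level) {
    return " ".repeat(4 * level) + line;
}

console.log(indent("return x;", 0));  // "return x;"
console.log(indent("return x;", 1));  // "    return x;"

// Repeating zero times yields an empty string.
console.log("ab".repeat(0));          // ""
```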


Regular expressions

1. Other regular expression changes

1.1 Regular expression y modifier

The y modifier makes a regular expression sticky: matching starts at the position given by the regular expression's lastIndex property. If there is no match at exactly that position, matching stops and no result is returned.

let text = 'hello1 hello2 hello3';
let patt = /hello\d\s?/,
    result = patt.exec(text);
let gPatt = /hello\d\s?/g,
    gResult = gPatt.exec(text);
let yPatt = /hello\d\s?/y,
    yResult = yPatt.exec(text);

console.log(result[0]);    // "hello1 "
console.log(gResult[0]);   // "hello1 "
console.log(yResult[0]);   // "hello1 "

patt.lastIndex = 1;
gPatt.lastIndex = 1;
yPatt.lastIndex = 1;

result = patt.exec(text);
gResult = gPatt.exec(text);
yResult = yPatt.exec(text);

console.log(result[0]);    // "hello1 "
console.log(gResult[0]);   // "hello2 "
console.log(yResult[0]);   // throws an error, because yResult is null

Of the three regular expressions, the first has no modifier, the second uses the global modifier g, and the third uses the y modifier.

The first round of matching starts at the character h. After lastIndex is set to 1, the expression with no modifier ignores lastIndex and the result is hello1 again. The g expression searches forward from the character e and outputs hello2. The y expression must match exactly at the character e; since "ello" does not match "hello", yResult is null, and reading yResult[0] throws an error.

After a successful sticky match, the index of the character following the last matched character is stored in lastIndex. If the match fails and returns null, lastIndex is reset to 0. The g modifier behaves the same way.
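The lastIndex bookkeeping can be observed directly. This is a small sketch with illustrative variable names, using the same pattern and text as the example above.

```javascript
const stickyPattern = /hello\d\s?/y;
const sample = "hello1 hello2 hello3";

// Each successful match moves lastIndex just past the matched text.
stickyPattern.exec(sample);
console.log(stickyPattern.lastIndex);     // 7

stickyPattern.exec(sample);
console.log(stickyPattern.lastIndex);     // 14

// A failed sticky match returns null and resets lastIndex to 0.
stickyPattern.lastIndex = 1;
console.log(stickyPattern.exec(sample));  // null
console.log(stickyPattern.lastIndex);     // 0
```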

The lastIndex property is consulted when the exec() and test() methods of the regular expression object are called. Be careful with string methods such as match(): in engines that follow the current specification they also apply the sticky rule, starting from the expression's current lastIndex.

The sticky property can be used to check whether the y modifier is present. In a JS engine that supports the modifier, sticky is true when y is set and false otherwise.

let patt = /hello\d/y;
console.log(patt.sticky); // true

1.2 Copying regular expressions

In ES5, you can copy a regular expression by passing it to the RegExp constructor. However, when the first argument is a regular expression, ES5 does not allow a second argument. ES6 changed this behavior: the second argument can specify modifiers, which replace those of the original expression.

let re1 = /ab/i;
let re2 = new RegExp(re1,"g");
console.log(re1.toString());  // "/ab/i"
console.log(re2.toString());  // "/ab/g"

1.3 flags property

The new flags property in ES6 returns all modifiers applied to the current regular expression.

let re = /ab/g;
console.log(re.source);   //"ab"
console.log(re.flags);    //"g"
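Without the flags property, the modifiers have to be parsed out of the text returned by toString(). The sketch below contrasts the two approaches; getFlags is a hypothetical helper written only for this comparison.

```javascript
// Hypothetical helper: take everything after the last "/" of the regex text.
function getFlags(re) {
    const reText = re.toString();
    return reText.substring(reText.lastIndexOf("/") + 1);
}

const flagged = /ab/gi;
console.log(getFlags(flagged));  // "gi"
console.log(flagged.flags);      // "gi"
```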

2. Template literals

2.1 Basic Syntax

In a nutshell, replace the double or single quotes with backticks (`).

If you want to use a backtick inside the string, escape it with \. For example:

let message = `\`hello\`!`;
console.log(message);

The result is `hello`!

2.2 Simplified multi-line strings

Before ES6, multi-line strings were created with arrays or string concatenation. In ES6, you only need to break the line directly inside the template literal. The line break is part of the string and counts toward its length property, and all whitespace inside the backticks is part of the string as well.

let message = `Multiline
string`;
console.log(message);
console.log(message.length); // 16 = 9 + 6 + 1 (the line break counts as one character)

2.3 String placeholders

In a template literal, you can embed any valid JavaScript expression in a placeholder, and its value is output as part of the resulting string.

A placeholder is written as ${ } and can contain any JavaScript expression between the braces. Template literals are themselves JavaScript expressions, so one template literal can be embedded within another.

let name = "sarah";
let message = `my ${`name is ${name}.`}`;
console.log(message); // my name is sarah.

Here message is a template literal that contains a second template literal, `name is ${name}.`, inside its placeholder.
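Placeholders are not limited to variable names. As a minimal sketch, the example below embeds an arithmetic expression and a method call; the variable names and values are made up for illustration.

```javascript
const itemCount = 10;
const unitPrice = 0.25;

// The first "$" is literal text; "${...}" is the placeholder.
const summary = `${itemCount} items cost $${(itemCount * unitPrice).toFixed(2)}.`;
console.log(summary);  // "10 items cost $2.50."
```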

2.4 Tagged templates

Using raw values in template literals

The String.raw tag returns the raw form of a template literal, in which escape sequences are left unprocessed.

let message = String.raw`Mul\nddd`;
console.log(message); // Mul\nddd, the same as the string "Mul\\nddd": \n stays a backslash followed by n
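String.raw is a built-in tag, and a tag is simply a function that receives the literal pieces and the placeholder values. The sketch below shows one possible user-defined tag; passthru and firstRaw are hypothetical names, and the tag only reassembles the string the way an untagged template literal would.

```javascript
// Hypothetical tag: join the literal pieces and substitutions back together.
function passthru(literals, ...substitutions) {
    let result = "";
    for (let i = 0; i < substitutions.length; i++) {
        result += literals[i] + substitutions[i];
    }
    // There is always one more literal piece than there are substitutions.
    return result + literals[literals.length - 1];
}

const qty = 10;
const tagged = passthru`${qty} items in stock.`;
console.log(tagged);  // "10 items in stock."

// Hypothetical tag: literals.raw holds the pieces with escapes unprocessed.
function firstRaw(literals) {
    return literals.raw[0];
}
console.log(firstRaw`a\nb`.length);  // 4: a, backslash, n, b
```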

Conclusion

  • The codePointAt() and String.fromCodePoint() methods in ES6 convert between strings and code points.

  • When comparing strings, the normalize() method converts them to one of four normalization forms so that they can be compared more accurately.

  • Other new methods detect substrings within a string and repeat strings.

  • For regular expressions, ES6 adds the u and y modifiers, the flags property, and a new way to copy regular expressions through the constructor. Template literals replace double and single quotes with backticks.