Introduction

Continuing through the Swift documentation: the previous section, Basic Operators, covered Swift's basic operators, which are largely the same as those in C and Objective-C. Swift also has some special operators that simplify code, such as the nil-coalescing operator (??), the closed range operator (...), the half-open range operator (..<), and one-sided ranges. If you are unfamiliar with them, go back to the previous section (sections 7, 8, 9). Now let's look at Swift's strings and characters. Because the material is long, it gets a section of its own. Let's begin!

This section is fairly straightforward, so if you're already familiar with it, you can skip it and move on to the next section: Collection Types.

Strings and characters

A string is a series of characters, such as "hello, world" or "albatross". Swift strings are represented by the String type. The contents of a String can be accessed in a variety of ways, including as a collection of Character values.

Swift's String and Character types provide a fast, Unicode-compliant way to work with text in code. The syntax for string creation and manipulation is lightweight and readable, with a string literal syntax similar to C. String concatenation is as simple as combining two strings with the + operator, and string mutability is managed by choosing between a constant and a variable, just like any other value in Swift. You can also use strings to insert constants, variables, literals, and expressions into longer strings, in a process known as string interpolation. This makes it easy to create custom string values for display, storage, and printing.

Despite its simple syntax, Swift’s string type is a fast, modern implementation of strings. Each string consists of Unicode characters independent of the encoding and supports access to these characters in various Unicode representations.

Swift's String type is bridged with Foundation's NSString class. Foundation also extends String to expose methods defined by NSString. This means that if you import Foundation, you can access those NSString methods on String without casting.

For more information about using strings in Foundation and Cocoa, see Bridging Strings and NSStrings.

1 String Literals

You can include predefined String values in your code as string literals. A string literal is a sequence of characters surrounded by double quotation marks (").

Use a string literal as an initial value for a constant or variable:

let someString = "Some string literal value"

Note that Swift infers a type of String for the someString constant because it is initialized with a string literal value.

1.1 Multi-line strings

If you need a string that spans several lines, use a multi-line string literal: a sequence of characters surrounded by three double quotation marks:

let quotation = """
The White Rabbit put on his spectacles.  "Where shall I begin,
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on
till you come to the end; then stop."
"""

A multi-line string literal includes all of the lines between its opening and closing quotation marks. The string begins on the first line after the opening quotation marks (""") and ends on the line before the closing quotation marks, which means that neither of the strings below starts or ends with a line break:

let singleLineString = "These are the same."
let multilineString = """
These are the same.
"""

When your source code includes a line break inside a multi-line string literal, that line break also appears in the string's value. If you want to use line breaks to make your source code easier to read, but you don't want the line breaks to be part of the string's value, write a backslash (\) at the end of those lines:

let softWrappedQuotation = """
The White Rabbit put on his spectacles.  "Where shall I begin, \
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on \
till you come to the end; then stop."
"""

To make a multi-line string literal that begins or ends with a line break, write a blank line as the first or last line. For example:

let lineBreaks = """

This string starts with a line break.
It also ends with a line break.

"""

A multi-line string can be indented to match the surrounding code. The whitespace before the closing quotation marks (""") tells Swift what whitespace to ignore before all of the other lines. However, if you write whitespace at the beginning of a line in addition to what appears before the closing quotation marks, that whitespace is included.

For example, when an entire multi-line string literal is indented to line up with its closing quotation marks, the first and last lines of the string don't begin with any spaces. A middle line that is indented further than the closing quotation marks begins with that extra indentation, such as four extra spaces.
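The indentation rule described above can be sketched as follows. The constant name linesWithIndentation is chosen for illustration; the closing quotation marks are indented by four spaces, so four spaces are stripped from every line.

```swift
// The closing delimiter is indented by four spaces, so Swift strips
// four leading spaces from every line of the literal.
let linesWithIndentation = """
    This line doesn't begin with whitespace.
        This line begins with four spaces.
    This line doesn't begin with whitespace.
    """

// Wrap each line in brackets to make any leading spaces visible.
for line in linesWithIndentation.split(separator: "\n") {
    print("[\(line)]")
}
```

Only the middle line should show leading spaces inside the brackets.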

1.2 Special Characters in the String

A string can contain the following special characters:

  • The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark), and \' (single quotation mark)
  • An arbitrary Unicode scalar value, written as \u{n}, where n is a hexadecimal number of 1 to 8 digits (Unicode is discussed in the Unicode section below)

The code below shows four examples of these special characters. The wiseWords constant contains two escaped double quotation marks. The dollarSign, blackHeart, and sparklingHeart constants demonstrate the Unicode scalar format:

let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
// "Imagination is more important than knowledge" - Einstein
let dollarSign = "\u{24}"        // $,  Unicode scalar U+0024
let blackHeart = "\u{2665}"      // ♥,  Unicode scalar U+2665
let sparklingHeart = "\u{1F496}" // 💖, Unicode scalar U+1F496

Because multi-line string literals use three double quotation marks instead of one, you can include a double quotation mark (") inside a multi-line string literal without escaping it. To include the text """ in a multi-line string, escape at least one of the quotation marks. For example:

let threeDoubleQuotationMarks = """
Escaping the first quotation mark \"""
Escaping all three quotation marks \"\"\"
"""

1.3 Extended String Delimiters

You can place a string literal within extended delimiters to include special characters in the string without invoking their effect. You place the string within quotation marks (") and surround it with number signs (#). For example, printing the string literal #"Line 1\nLine 2"# prints the line feed escape sequence (\n) rather than printing the string across two lines.

If you need the special effect of a character in such a string literal, match the number of number signs in the string following the escape character (\). For example, if your string is #"Line 1\nLine 2"# and you want to break the line, you can use #"Line 1\#nLine 2"# instead. Similarly, ###"Line1\###nLine2"### also breaks the line.
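As a small illustration of the rule above, the sketch below compares a raw escape sequence with a matched one. The constant names are made up for this example.

```swift
// With extended delimiters, \n is kept as two literal characters.
let raw = #"Line 1\nLine 2"#

// Matching the number of # signs after the backslash restores the escape.
let broken = #"Line 1\#nLine 2"#
let alsoBroken = ###"Line 1\###nLine 2"###

print(raw.count)             // 14: the backslash and the n are both counted
print(broken.count)          // 13: \#n is a single line feed character
print(broken == alsoBroken)  // true
```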

String literals created using extended delimiters can also be multi-line string literals. You can use extended delimiters to include the text """ in a multi-line string, overriding the default behavior that ends the literal. For example:

let threeMoreDoubleQuotationMarks = #"""
Here are three more double quotes: """
"""#

2 Initializing an Empty String

To create an empty string value as a starting point for building a longer string, you can assign an empty string literal to the variable, or initialize a new string instance using the initializer syntax:

var emptyString = ""               // empty string literal
var anotherEmptyString = String()  // initializer syntax
// these two strings are both empty, and are equivalent to each other

Find out whether a String value is empty by checking its Boolean isEmpty property:

if emptyString.isEmpty {
    print("Nothing to see here")
}
// Prints "Nothing to see here"

3 String Mutability

You can indicate whether a particular string can be modified (or changed) by assigning it to a variable (in which case it can be modified), or to a constant (in which case it cannot be modified):

var variableString = "Horse"
variableString += " and carriage"
// variableString is now "Horse and carriage"

let constantString = "Highlander"
constantString += " and another Highlander"
// this reports a compile-time error - a constant string cannot be modified

Note that this approach is different from string mutation in Objective-C and Cocoa, where you choose between two classes (NSString and NSMutableString) to indicate whether a string can be mutated.

4 Strings Are Value Types

Swift's String type is a value type. If you create a new String value, that String value is copied when it is passed to a function or method, or when it is assigned to a constant or variable. In each case, a new copy of the existing String value is created, and the new copy is passed or assigned, not the original version. Value types are described in Structures and Enumerations Are Value Types.

Swift's copy-by-default String behavior ensures that when a function or method passes you a String value, it is clear that you own that exact String value, regardless of where it came from. You can be confident that the string you are passed won't be modified unless you modify it yourself.

Behind the scenes, Swift’s compiler optimizes the use of strings so that actual copying occurs only when absolutely necessary. This means that you can always get good performance when using strings as value types.
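A minimal sketch of the value semantics described above. The names original, copy, and shout(_:) are illustrative only and do not come from the Swift documentation.

```swift
let original = "Horse"
var copy = original          // copy is an independent value
copy += " and carriage"      // mutating the copy leaves original untouched

print(original)  // Horse
print(copy)      // Horse and carriage

// A function receives its own copy of the string it is passed.
func shout(_ text: String) -> String {
    var text = text          // local mutable copy of the parameter
    text += "!"
    return text
}

let louder = shout(original)
print(original)  // Horse
print(louder)    // Horse!
```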

5 Using Characters

You can access individual character values by iterating through the string in a for-in loop:

for character in "Dog!🐶" {
    print(character)
}
// D
// o
// g
// !
// 🐶

The for-in loop is described in For-In Loops.

Alternatively, you can create a stand-alone Character constant or variable from a single-character string literal by providing a Character type annotation:

let exclamationMark: Character = "!"

String values can be constructed by passing an array of character values as arguments to its initializer:

let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
let catString = String(catCharacters)
print(catString)
// Prints "Cat!🐱"

6 Concatenating Strings and Characters

String values can be added (or concatenated) using the addition operator (+) to create a new string value:

let string1 = "hello"
let string2 = " there"
var welcome = string1 + string2
// welcome now equals "hello there"

You can also append a String value to an existing String variable with the addition assignment operator (+=):

var instruction = "look over"
instruction += string2
// instruction now equals "look over there"

You can append a Character value to a String variable with the String type's append(_:) method:

let exclamationMark: Character = "!"
welcome.append(exclamationMark)
// welcome now equals "hello there!"

Note that you can't append a String or Character to an existing Character variable, because a Character value must contain a single character only.

If you use multi-line string literals to build up the lines of a longer string, you want every line in the string to end with a line break, including the last line. For example:

let badStart = """
one
two
"""
let end = """
three
"""
print(badStart + end)
// Prints two lines:
// one
// twothree

let goodStart = """
one
two

"""
print(goodStart + end)
// Prints three lines:
// one
// two
// three

In the above code, concatenating badStart with end yields a two-line string, which is not the expected result. Because the last line of badStart does not end with a newline, it is merged with the first line of end. In contrast, goodStart has a newline character at the end of both lines, so when it is merged with end, the result is three lines as expected.

7 String Interpolation

String interpolation is a way to construct a new String value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. You can use string interpolation in both single-line and multi-line string literals. Each item that you insert into the string literal is wrapped in a pair of parentheses, prefixed by a backslash (\):

let multiplier = 3
let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
// message is "3 times 2.5 is 7.5"

In the example above, the value of multiplier is inserted into the string literal as \(multiplier). This placeholder is replaced with the actual value of multiplier when the string interpolation is evaluated to create an actual string.

The value of multiplier is also part of a larger expression later in the string. This expression calculates the value of Double(multiplier) * 2.5 and inserts the result (7.5) into the string. In this case, the expression is written as \(Double(multiplier) * 2.5) when it is included inside the string literal.

You can use extended string delimiters to create strings containing characters that would otherwise be treated as a string interpolation. For example:

print(#"Write an interpolated string in Swift using \(multiplier)."#)
// Prints "Write an interpolated string in Swift using \(multiplier)."

To use string interpolation inside a string that uses extended delimiters, match the number of number signs after the backslash to the number of number signs at the beginning and end of the string. For example:

print(#"6 times 7 is \#(6 * 7)."#)
// Prints "6 times 7 is 42."

Note that the expressions you write inside parentheses within an interpolated string can't contain an unescaped backslash (\), a carriage return, or a line feed. However, they can contain other string literals.

8 Unicode

Unicode is an international standard for encoding, representing, and processing text in different writing systems. It enables you to represent almost any character from any language in a standardized form, and to read and write these characters from external sources, such as text files or Web pages. Swift’s string and character types are fully Unicode compatible, as described in this section.

8.1 Unicode scalar values

Behind the scenes, Swift's native String type is built from Unicode scalar values. A Unicode scalar value is a unique 21-bit number for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").

Note that not all 21-bit Unicode scalar values are assigned to a character; some are reserved for future assignment or for use in UTF-16 encoding. Scalar values that have been assigned to a character typically also have a name, such as LATIN SMALL LETTER A and FRONT-FACING BABY CHICK in the examples above.

8.2 Extended Grapheme Clusters

Every instance of Swift's Character type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that, when combined, produce a single human-readable character.

Here's an example. The letter é can be represented as the single Unicode scalar é (LATIN SMALL LETTER E WITH ACUTE, or U+00E9). However, the same letter can also be represented as a pair of scalars: a standard letter e (LATIN SMALL LETTER E, or U+0065), followed by the COMBINING ACUTE ACCENT scalar (U+0301). The COMBINING ACUTE ACCENT scalar is graphically applied to the scalar that precedes it, turning an e into an é when it is rendered by a Unicode-aware text-rendering system.

In both cases, the letter é is represented as a single Swift Character value that represents an extended grapheme cluster. In the first case, the cluster contains a single scalar; in the second case, it is a cluster of two scalars:

let eAcute: Character = "\u{E9}"                         // é
let combinedEAcute: Character = "\u{65}\u{301}"          // e followed by ́
// eAcute is é, combinedEAcute is é

Extended grapheme clusters are a flexible way to represent many complex script characters as a single Character value. For example, Hangul syllables from the Korean alphabet can be represented as either a precomposed or a decomposed sequence. Both of these representations qualify as a single Character value in Swift:

let precomposed: Character = "\u{D55C}"                  // 한
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}"   // ᄒ, ᅡ, ᆫ
// precomposed is 한, decomposed is 한

Extended grapheme clusters enable scalars for enclosing marks (such as COMBINING ENCLOSING CIRCLE, or U+20DD) to enclose other Unicode scalars as part of a single Character value:

let enclosedEAcute: Character = "\u{E9}\u{20DD}"
// enclosedEAcute is é⃝

Unicode scalars for regional indicator symbols can be combined in pairs to make a single Character value, such as this combination of REGIONAL INDICATOR SYMBOL LETTER U (U+1F1FA) and REGIONAL INDICATOR SYMBOL LETTER S (U+1F1F8):

let regionalIndicatorForUS: Character = "\u{1F1FA}\u{1F1F8}"
// regionalIndicatorForUS is 🇺🇸
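To see the difference between characters and scalars in the examples above, the following sketch compares a string's character count with its scalar count. The constant names flag and hangul are chosen for illustration.

```swift
// Each string below renders as one character but holds several scalars.
let flag = "\u{1F1FA}\u{1F1F8}"           // 🇺🇸, two regional indicator scalars
let hangul = "\u{1112}\u{1161}\u{11AB}"   // 한, three decomposed jamo scalars

print(flag.count, flag.unicodeScalars.count)      // 1 2
print(hangul.count, hangul.unicodeScalars.count)  // 1 3
```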

9 Counting Characters

To retrieve a count of the Character values in a string, use the count property of the string:

let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
print("unusualMenagerie has \(unusualMenagerie.count) characters")
// Prints "unusualMenagerie has 40 characters"

Note that Swift's use of extended grapheme clusters for Character values means that string concatenation and modification may not always affect a string's character count.

For example, if you initialize a new string with the four-character word cafe, and then append a COMBINING ACUTE ACCENT (U+0301) to the end of the string, the resulting string still has a character count of 4, with a fourth character of é, not e:

var word = "cafe"
print("the number of characters in \(word) is \(word.count)")
// Prints "the number of characters in cafe is 4"

word += "\u{301}"    // COMBINING ACUTE ACCENT, U+0301

print("the number of characters in \(word) is \(word.count)")
// Prints "the number of characters in café is 4"

Note that extended grapheme clusters can be composed of multiple Unicode scalars. This means that different characters, and different representations of the same character, can require different amounts of memory to store. Because of this, characters in Swift don't each take up the same amount of memory within a string's representation. As a result, the number of characters in a string can't be calculated without iterating through the string to determine its extended grapheme cluster boundaries. If you are working with particularly long string values, be aware that the count property must iterate over the Unicode scalars in the entire string in order to determine the characters for that string.

The count of the characters returned by the count property isn't always the same as the length property of an NSString that contains the same characters. The length of an NSString is based on the number of 16-bit code units within the string's UTF-16 representation, not the number of Unicode extended grapheme clusters within the string.
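The difference can be sketched without importing Foundation by comparing count with the size of the string's other views. This assumes, as the paragraph above states, that an NSString's length equals the number of UTF-16 code units, so utf16.count stands in for it here.

```swift
var word = "cafe"
word += "\u{301}"   // append COMBINING ACUTE ACCENT; the string is now "café"

print(word.count)                 // 4: extended grapheme clusters
print(word.unicodeScalars.count)  // 5: e and U+0301 are separate scalars
print(word.utf16.count)           // 5: what NSString's length would report
print(word.utf8.count)            // 6: U+0301 takes two bytes in UTF-8
```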

10 Accessing and Modifying a String

Strings can be accessed and modified through string methods and properties or subscript syntax.

10.1 String Index

Each String value has an associated index type, String.Index, which corresponds to the position of each Character in the string.

As mentioned above, different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings can't be indexed by integer values.

Use the startIndex property to access the position of the first Character of a String. The endIndex property is the position after the last character in a String. As a result, the endIndex property isn't a valid argument to a string's subscript. If a String is empty, startIndex and endIndex are equal.

Use the index(before:) and index(after:) methods of String to access the indexes before and after a given index. To access indexes that are farther away from a given index, you can use the index(_:offsetBy:) method instead of calling one of the methods multiple times.

Characters at a particular string index can be accessed using subscript syntax.

let greeting = "Guten Tag!"
greeting[greeting.startIndex]
// G
greeting[greeting.index(before: greeting.endIndex)]
// !
greeting[greeting.index(after: greeting.startIndex)]
// u
let index = greeting.index(greeting.startIndex, offsetBy: 7)
greeting[index]
// a

Attempting to access an index outside of a string's range, or a Character at an index outside of a string's range, will trigger a runtime error.

greeting[greeting.endIndex] // Error
greeting.index(after: greeting.endIndex) // Error

Use the indices property to access all of the indices of individual characters in a string.

for index in greeting.indices {
    print("\(greeting[index]) ", terminator: "")
}
// Prints "G u t e n   T a g ! "

Note that you can use the startIndex and endIndex properties and the index(before:), index(after:), and index(_:offsetBy:) methods on any type that conforms to the Collection protocol. This includes String, as shown here, as well as collection types such as Array, Dictionary, and Set.
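One way to avoid the out-of-range runtime errors described in this section is the standard library's index(_:offsetBy:limitedBy:) method, which returns nil instead of trapping when the offset passes the limit. The helper character(in:atOffset:) below is a hypothetical name introduced for this sketch, not part of the standard library.

```swift
// Hypothetical helper: returns nil rather than trapping on a bad offset.
func character(in text: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index < text.endIndex   // endIndex itself is not a valid subscript argument
    else {
        return nil
    }
    return text[index]
}

let greeting = "Guten Tag!"
print(character(in: greeting, atOffset: 7) as Any)   // Optional("a")
print(character(in: greeting, atOffset: 10) as Any)  // nil
```

Note that each call still walks the string from startIndex, so this is a convenience for occasional lookups rather than a replacement for iterating over indices.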

10.2 Insertion and Deletion

To insert a single character into a string at a specified index, use the insert(_:at:) method. To insert the contents of another string at a specified index, use the insert(contentsOf:at:) method.

var welcome = "hello"
welcome.insert("!", at: welcome.endIndex)
// welcome now equals "hello!"

welcome.insert(contentsOf: " there", at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there!"

To remove a single character from a string at a specified index, use the remove(at:) method; to remove a substring at a specified range, use the removeSubrange(_:) method:

welcome.remove(at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there"

let range = welcome.index(welcome.endIndex, offsetBy: -6)..<welcome.endIndex
welcome.removeSubrange(range)
// welcome now equals "hello"

Note that you can use the insert(_:at:), insert(contentsOf:at:), remove(at:), and removeSubrange(_:) methods on any type that conforms to the RangeReplaceableCollection protocol. This includes String, as shown here, as well as collection types such as Array.

10.3 Substrings

When you get a substring from a string, for example by using a subscript or a method like prefix(_:), the result is an instance of Substring, not another string. Substrings in Swift have most of the same methods as strings, which means you can work with substrings the same way you work with strings. However, unlike strings, you use substrings for only a short amount of time while performing actions on a string. When you are ready to store the result for a longer time, you convert the substring to an instance of String. For example:

let greeting = "Hello, world!"
let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning is "Hello"

// Convert the result to a String for long-term storage.
let newString = String(beginning)

Like strings, each substring has a memory region that stores the characters that make up the substring. The difference between a string and a substring is that, as a performance optimization, a substring can reuse part of the memory used to store the original string, or part of the memory used to store another substring. (Strings have similar optimizations, but if two strings share memory, they are equal.) This performance optimization means that there is no performance cost to copying memory until a string or substring is modified. As mentioned above, substrings are not good for long-term storage – because they reuse the storage of the original string, the entire original string must be kept in memory whenever any of its substrings are used.

In the example above, greeting is a string, which means it has a region of memory where the characters that make up the string are stored. Because beginning is a substring of greeting, it reuses the memory that greeting uses. In contrast, newString is a string; when it is created from the substring, it gets its own storage. The following figure shows these relationships:

Note that both strings and substrings conform to the StringProtocol protocol, which means that it is often convenient for string manipulation functions to accept StringProtocol values. Such functions can be called with string or substring values.
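As an illustration of that last point, here is a sketch of a function that is generic over StringProtocol, so it can be called with either a String or a Substring. The function countVowels(in:) is a hypothetical example, not an existing API.

```swift
// Hypothetical helper that accepts any StringProtocol value.
func countVowels<S: StringProtocol>(in text: S) -> Int {
    let vowels: Set<Character> = ["a", "e", "i", "o", "u"]
    return text.lowercased().filter { vowels.contains($0) }.count
}

let greeting = "Hello, world!"
let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<index]   // a Substring sharing greeting's storage

print(countVowels(in: greeting))     // 3
print(countVowels(in: beginning))    // 2
```

Accepting StringProtocol here avoids forcing callers to convert a Substring to a String just to call the function.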

11 Comparing Strings

Swift provides three ways to compare textual values: string and character equality, prefix equality, and suffix equality.

11.1 String and Character Equality

String and character equality is checked with the "equal to" operator (==) and the "not equal to" operator (!=), as described in Comparison Operators:

let quotation = "We're a lot alike, you and I."
let sameQuotation = "We're a lot alike, you and I."
if quotation == sameQuotation {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Two String values (or two Character values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they are composed from different Unicode scalars behind the scenes.

For example, LATIN SMALL LETTER E WITH ACUTE (U+00E9) is canonically equivalent to LATIN SMALL LETTER E (U+0065) followed by COMBINING ACUTE ACCENT (U+0301). Both of these extended grapheme clusters are valid ways to represent the character é, so they are considered canonically equivalent:

// "Voulez-vous un café?" using LATIN SMALL LETTER E WITH ACUTE
let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"

// "Voulez-vous un café?" using LATIN SMALL LETTER E and COMBINING ACUTE ACCENT
let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"

if eAcuteQuestion == combinedEAcuteQuestion {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Conversely, LATIN CAPITAL LETTER A (U+0041, or "A"), as used in English, is not equivalent to CYRILLIC CAPITAL LETTER A (U+0410, or "А"), as used in Russian. The characters are visually similar, but don't have the same linguistic meaning:

let latinCapitalLetterA: Character = "\u{41}"

let cyrillicCapitalLetterA: Character = "\u{0410}"

if latinCapitalLetterA != cyrillicCapitalLetterA {
    print("These two characters are not equivalent.")
}
// Prints "These two characters are not equivalent."

Note that string and character comparisons in Swift are not locale-sensitive.
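The following sketch shows canonical equivalence at work: two strings compare as equal even though their underlying scalars differ. The constant names precomposed and decomposed are chosen for this example.

```swift
let precomposed = "caf\u{E9}"          // é as a single scalar
let decomposed = "caf\u{65}\u{301}"    // e followed by COMBINING ACUTE ACCENT

// String equality uses canonical equivalence of grapheme clusters.
print(precomposed == decomposed)                 // true

// The scalar views expose the differing underlying representations.
print(precomposed.unicodeScalars.count)          // 4
print(decomposed.unicodeScalars.count)           // 5
print(precomposed.unicodeScalars.elementsEqual(decomposed.unicodeScalars))  // false
```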

11.2 Prefix and Suffix Equality

To check whether a string has a particular string prefix or suffix, call the string's hasPrefix(_:) and hasSuffix(_:) methods, both of which take a single argument of type String and return a Boolean value.

The examples below consider an array of strings representing the scene locations from the first two acts of Shakespeare's Romeo and Juliet:

let romeoAndJuliet = [
    "Act 1 Scene 1: Verona, A public place",
    "Act 1 Scene 2: Capulet's mansion",
    "Act 1 Scene 3: A room in Capulet's mansion",
    "Act 1 Scene 4: A street outside Capulet's mansion",
    "Act 1 Scene 5: The Great Hall in Capulet's mansion",
    "Act 2 Scene 1: Outside Capulet's mansion",
    "Act 2 Scene 2: Capulet's orchard",
    "Act 2 Scene 3: Outside Friar Lawrence's cell",
    "Act 2 Scene 4: A street in Verona",
    "Act 2 Scene 5: Capulet's mansion",
    "Act 2 Scene 6: Friar Lawrence's cell"
]

You can count the number of scenes in Act 1 using the hasPrefix(_:) method and the romeoAndJuliet array:

var act1SceneCount = 0
for scene in romeoAndJuliet {
    if scene.hasPrefix("Act 1 ") {
        act1SceneCount += 1
    }
}
print("There are \(act1SceneCount) scenes in Act 1")
// Prints "There are 5 scenes in Act 1"

Similarly, use the hasSuffix(_:) method to count the number of scenes that take place in or around Capulet's mansion and Friar Lawrence's cell:

var mansionCount = 0
var cellCount = 0
for scene in romeoAndJuliet {
    if scene.hasSuffix("Capulet's mansion") {
        mansionCount += 1
    } else if scene.hasSuffix("Friar Lawrence's cell") {
        cellCount += 1
    }
}
print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
// Prints "6 mansion scenes; 2 cell scenes"

Note that the hasPrefix(_:) and hasSuffix(_:) methods perform a character-by-character canonical equivalence comparison between the extended grapheme clusters in each string, as described in String and Character Equality.

12 Unicode Representations of Strings

When a Unicode string is written to a text file or some other storage, the Unicode scalars in that string are encoded in one of several Unicode-defined encoding forms. Each form encodes the string in small chunks known as code units. These include the UTF-8 encoding form (which encodes a string as 8-bit code units), the UTF-16 encoding form (which encodes a string as 16-bit code units), and the UTF-32 encoding form (which encodes a string as 32-bit code units).

Swift provides several different ways to access Unicode representations of strings. You can iterate over the string with a for-in statement to access its individual Character values as Unicode extended grapheme clusters. This process is described in Using Characters.

Alternatively, access a String value in one of three other Unicode-compliant representations:

  • A collection of UTF-8 code units (accessed with the string's utf8 property)
  • A collection of UTF-16 code units (accessed with the string's utf16 property)
  • A collection of 21-bit Unicode scalar values, equivalent to the string's UTF-32 encoding form (accessed with the string's unicodeScalars property)

Each example below shows a different representation of the following string, which is made up of the characters D, o, g, ‼ (DOUBLE EXCLAMATION MARK, or Unicode scalar U+203C), and the 🐶 character (DOG FACE, or Unicode scalar U+1F436):

let dogString = "Dog‼🐶"

12.1 UTF-8 Representation

You can access a UTF-8 representation of a String by iterating over its utf8 property. This property is of type String.UTF8View, which is a collection of unsigned 8-bit (UInt8) values, one for each byte in the string's UTF-8 representation:

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 226 128 188 240 159 144 182 "

In the example above, the first three decimal codeUnit values (68, 111, 103) represent the characters D, o, and g, whose UTF-8 representation is the same as their ASCII representation. The next three decimal codeUnit values (226, 128, 188) are a three-byte UTF-8 representation of the DOUBLE EXCLAMATION MARK character. The last four codeUnit values (240, 159, 144, 182) are a four-byte UTF-8 representation of the DOG FACE character.

12.2 UTF-16 Representation

You can access a UTF-16 representation of a String by iterating over its utf16 property. This property is of type String.UTF16View, which is a collection of unsigned 16-bit (UInt16) values, one for each 16-bit code unit in the string's UTF-16 representation:

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 55357 56374 "

Again, the first three codeUnit values (68, 111, 103) represent the characters D, o, and g, whose UTF-16 code units have the same values as in the string's UTF-8 representation (because these Unicode scalars represent ASCII characters).

The fourth codeUnit value (8252) is the decimal equivalent of the hexadecimal value 203C, which represents the Unicode scalar U+203C for the DOUBLE EXCLAMATION MARK character. This character can be represented as a single code unit in UTF-16.

The fifth and sixth codeUnit values (55357 and 56374) are a UTF-16 surrogate pair representation of the DOG FACE character. These values are a high-surrogate value of U+D83D (decimal value 55357) and a low-surrogate value of U+DC36 (decimal value 56374).
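The surrogate pair values above can be checked by hand. The sketch below applies the standard UTF-16 rule for scalars above U+FFFF: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. The constant names are chosen for this illustration.

```swift
let scalar: UInt32 = 0x1F436                  // DOG FACE
let offset = scalar - 0x10000                 // 20-bit value: 0xF436

// The upper 10 bits go into the high surrogate, the lower 10 into the low one.
let high = UInt16(0xD800 + (offset >> 10))    // 0xD83D
let low = UInt16(0xDC00 + (offset & 0x3FF))   // 0xDC36

print(high, low)                                 // 55357 56374
print(Array("\u{1F436}".utf16) == [high, low])   // true
```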

12.3 Unicode scalar representation

You can access a Unicode scalar representation of a String value by iterating over its unicodeScalars property. This property is of type UnicodeScalarView, which is a collection of values of type UnicodeScalar.

Each UnicodeScalar has a value property that returns the scalar's 21-bit value, represented within a UInt32 value:

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 128054 "

The value properties of the first three UnicodeScalar values (68, 111, 103) once again represent the characters D, o, and g.

The value property of the fourth UnicodeScalar (8252) is again the decimal equivalent of the hexadecimal value 203C, which represents the Unicode scalar U+203C for the DOUBLE EXCLAMATION MARK character.

The value property of the fifth and final UnicodeScalar, 128054, is the decimal equivalent of the hexadecimal value 1F436, which represents the Unicode scalar U+1F436 for the DOG FACE character.

As an alternative to querying their value properties, each UnicodeScalar value can also be used to construct a new String value, such as with string interpolation:

for scalar in dogString.unicodeScalars {
    print("\(scalar) ")
}
// D
// o
// g
// ‼
// 🐶

Conclusion

This chapter covered strings and characters: string initialization, multi-line string literals, string mutability, strings as value types, working with characters, concatenating strings and characters, string interpolation, Unicode, and string properties. String indexing in Swift differs considerably from Objective-C. The article includes examples for each case, and it is best to type them out and run them yourself to reinforce the material. This part is relatively simple, so readers who have already mastered it can skip to the next chapter.

Finally, if you like this article, please give it a star to encourage me to keep writing.

Previous section: Basic operations

Next chapter: Collection types
