Preface

This article introduces the concept and use of sequences in Kotlin, and explains the lazy evaluation that lets them optimize the performance of chained collection calls.

This article takes about 5 minutes to read and took about 7 hours to write.

Sequences

Concept

When Kotlin collection operators such as map and filter are chained, each call creates an intermediate collection inside the function. In the following example, map and filter are applied to the users collection to keep the entries whose sex is "male", and the result is a new collection.

users.map(User::sex)
     .filter { it == "male" }

Usage of sequences

Using a sequence is as simple as calling asSequence() on the collection before the chained operations:

users.asSequence()
     .map(User::sex)
     .filter { it == "male" }
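
To make the comparison concrete, here is a minimal runnable sketch of both forms. The article does not show the definition of User, so the data class below (with name and sex fields) and the two helper functions are assumptions made for illustration. Note that a sequence chain computes nothing until a terminal operation such as toList() is called.

```kotlin
// Hypothetical model class; the article does not show how User is defined.
data class User(val name: String, val sex: String)

// Eager version: map builds a full intermediate List<String> before filter runs.
fun maleEntriesEager(users: List<User>): List<String> =
    users.map(User::sex)
        .filter { it == "male" }

// Lazy version: each element flows through map and filter one at a time,
// and nothing is evaluated until the terminal toList() call.
fun maleEntriesLazy(users: List<User>): List<String> =
    users.asSequence()
        .map(User::sex)
        .filter { it == "male" }
        .toList()

fun main() {
    val users = listOf(User("Ann", "female"), User("Bob", "male"), User("Carl", "male"))
    println(maleEntriesEager(users)) // [male, male]
    println(maleEntriesLazy(users))  // [male, male]
}
```

Both functions return the same result; only the way the elements are processed differs.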

This example uses a concept worth introducing: User::sex is a member reference, described below.

Member References

Concept

A member reference is a concise way to refer to a member of a class, either a property or a method, without calling it directly. For example, the following reference returns the sex property of the User it is applied to:

User::sex

A member reference is a value, so it can be assigned to a variable or passed to a function, as in the "male" example above. The same call can also be written in a slightly more verbose way with a lambda:

users.map { user: User -> user.sex }
     .filter { it == "male" }

As you can see, the member reference form is more readable.
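
The equivalence of the two forms can be checked with a short sketch. The User data class and the function names below are hypothetical and exist only for this illustration.

```kotlin
// Hypothetical model class for illustration only.
data class User(val name: String, val sex: String)

// Member reference form.
fun sexesByReference(users: List<User>): List<String> = users.map(User::sex)

// Equivalent lambda form with an explicitly typed parameter.
fun sexesByLambda(users: List<User>): List<String> = users.map { user: User -> user.sex }

fun main() {
    val users = listOf(User("Ann", "female"), User("Bob", "male"))

    // A member reference is a value: it can be stored in a variable and invoked later.
    val sexOf: (User) -> String = User::sex
    println(sexOf(users[0]))          // female

    println(sexesByReference(users))  // [female, male]
    println(sexesByLambda(users))     // [female, male]
}
```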

Back to sequences

Let's return to sequences. As mentioned above, both map and filter create an intermediate collection inside the function. This becomes a problem when the source list, users in this case, has many elements: chained processing gets inefficient because several intermediate collections are allocated. If the collection is first converted to a sequence with asSequence(), the map and filter steps are applied lazily and no intermediate collections are created. Sequences only pay off under certain conditions, such as a large amount of data to process; used in the wrong situation they can cost performance.

Sequence performance test

As mentioned above, one of the conditions for using sequences is a large amount of data. But where is the threshold? The following test adds 100 to the price of every commodity and then counts how many of the resulting prices are odd. count() returns the number of elements that satisfy the given predicate. The code is as follows:

/** Commodity entity */
data class Commodity(var name: String, var price: Int)

import java.util.*

fun main(args: Array<String>) {

    val commodityList = ArrayList<Commodity>()

    // Build the test data
    for (i in 0..1000000) {
        val goods = Commodity("Commodity$i", i * 5)
        commodityList.add(goods)
    }

    val startTime = System.currentTimeMillis()

    commodityList
            .asSequence() // Calling this function switches the chain to a Kotlin sequence
            .map { it.price + 100 }
            .count { it % 2 != 0 }

    println("consume time is ${System.currentTimeMillis() - startTime} ms")
}

The test results are plotted in the line chart below, where the x-axis is the number of elements in the collection, the y-axis is the execution time, the orange line is the run without sequences, and the blue line is the run with sequences:

The following conclusions can be drawn from the figure:

  • Above the threshold, which is roughly one million elements, using sequences yields roughly a 90 percent performance improvement
  • Below 100,000 elements, using sequences actually degrades performance
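
A comparison of this kind could be set up roughly as follows. This is only a sketch: the helper names and the three collection sizes are assumptions, and single-run timings depend heavily on the machine and on JVM warm-up, so the printed numbers are rough indications rather than a reproduction of the chart.

```kotlin
import kotlin.system.measureTimeMillis

data class Commodity(val name: String, val price: Int)

// Builds test data the same way as the article's loop: price = i * 5.
fun buildCommodities(size: Int): List<Commodity> =
    List(size) { i -> Commodity("Commodity$i", i * 5) }

// Without a sequence: map allocates an intermediate list of all prices.
fun countOddEager(items: List<Commodity>): Int =
    items.map { it.price + 100 }.count { it % 2 != 0 }

// With a sequence: each price is mapped and tested without an intermediate list.
fun countOddLazy(items: List<Commodity>): Int =
    items.asSequence().map { it.price + 100 }.count { it % 2 != 0 }

fun main() {
    // Illustrative sizes only; adjust to explore where the crossover lies on your machine.
    for (size in listOf(10_000, 100_000, 1_000_000)) {
        val items = buildCommodities(size)
        val eagerMs = measureTimeMillis { countOddEager(items) }
        val lazyMs = measureTimeMillis { countOddLazy(items) }
        println("size=$size eager=${eagerMs} ms lazy=${lazyMs} ms")
    }
}
```

For more trustworthy numbers, a benchmarking harness that handles warm-up and repeated runs would be a better choice than one-off wall-clock measurements.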

Why do sequences improve the performance of collection operations?

  1. Sequences operate lazily on collections.
  2. No intermediate collection has to be created to hold the intermediate results of the chained operations.

The first point, laziness, can be confusing because it concerns the order in which elements are evaluated once a sequence is used. Here is an example that multiplies each number in a list of integers by 2 and finds the first resulting element that is greater than 10.

val result = listOf(2, 4, 6, 8, 10).asSequence()
                  .map { it * 2 }
                  .find { it > 10 }

The find() function looks for the first element that satisfies the predicate and returns it. The following figure shows how Kotlin evaluates the chain when a sequence is used:

As the figure shows, laziness means that operators such as map or filter do not traverse the whole collection one operator at a time. Instead, the first element is mapped and immediately passed to find, then the second element, and so on. As soon as an element satisfies the find predicate, the result is returned and the remaining elements are never processed. This is why sequences pay off most when operators such as map are combined with a short-circuiting operation such as find.
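
The difference in evaluation order can be observed by counting how often the map lambda runs. The sketch below uses two hypothetical helper functions that return the result together with that count; the counter is added purely for illustration.

```kotlin
// Eager pipeline: returns the found value and how many times the map lambda ran.
fun eagerFind(numbers: List<Int>): Pair<Int?, Int> {
    var mapCalls = 0
    val result = numbers
        .map { mapCalls++; it * 2 }   // runs for every element before find starts
        .find { it > 10 }
    return result to mapCalls
}

// Lazy pipeline: the same chain on a sequence.
fun lazyFind(numbers: List<Int>): Pair<Int?, Int> {
    var mapCalls = 0
    val result = numbers.asSequence()
        .map { mapCalls++; it * 2 }   // runs only until find gets a match
        .find { it > 10 }
    return result to mapCalls
}

fun main() {
    val numbers = listOf(2, 4, 6, 8, 10)
    println(eagerFind(numbers)) // (12, 5)
    println(lazyFind(numbers))  // (12, 3)
}
```

Both versions find 12, but the list version maps all five elements, while the sequence version stops after the third element, because 6 * 2 = 12 is the first value greater than 10.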

Summary

  1. Using sequences for chained collection operations can reduce both running time and memory footprint, because the operations are lazy and no intermediate collections are created.
  2. Sequences are useful, but depending on the scenario, for example with small collections, they can be less efficient.

