“This is the 15th day of my participation in the First Challenge 2022. For details: First Challenge 2022.”


These are some brief notes on the design of the Parallel Iterator trait. This article does not describe how to use parallel iterators. If necessary, browse ParallelIterator.

challenge

Parallel iterators are more complex than sequential iterators. The reason is that they must be able to divide themselves and operate in parallel between the separated parts.

There are currently two different usage patterns for the design of a Parallel iterator; As we will see, not all iterators support both patterns. Here’s why there are two models:

Pull

Producer and UnindexedProducer traits

In this mode, the iterator is required to produce the next item by calling next(). This is basically just like a normal iterator, but with one difference: you can split the iterator into two parts, generating unrelated items in different threads.

Describe the two traits:

In the Producer Trait, splitting is done through split_at(), which accepts an index that should be split. Only iterators with indexes can work in this mode because they know exactly how much data they will produce and how to locate the accessed indexes.

In the UnindexedProducer Trait, splitting is accomplished through split(), which only requires producers to roughly split themselves into two parts. This is useful when the length or memory layout is unknown, such as strings, or when the length may exceed usize, such as Range

on 32-bit platforms.

Theoretically, any Producer could use a non-indexed approach, but we don’t currently use one. When you know the exact length, split() is simply implemented as split_at(length/2).

Push

Consumer and the UnindexedConsumer Trait

In this mode, the iterator gives each item in turn and then processes it. This is the opposite of a normal iterator. It’s more like a call to for_each: Consumer () is called with that item every time a new item is generated. (The attributes themselves are a little more complicated, because they support states that can be traversed and eventually reduced.) Unlike producers, consumers come in two varieties, the difference being how they are segmented.

In the Consumer trait, splitting is done by splIT_AT, which accepts an index to split the data. All iterators can work in this mode. In this way, the segments generated have a basic estimate of how much data they consume.

In the UnindexedConsumer trait, splitting is done by split_off_left(). In this case, there is no index: the partitioned parts have to prepare themselves to handle an unknown amount of data flow, and they don’t know where the data is in the overall data flow.

For the UnindexedConsumer trait, not all consumers can operate in this mode. For example, it works for for_each and reduce, but it does not work for collect_into_vec, because in this case each item is dependent on its final position in the target set.