collect is arguably the most powerful terminal operation in Stream; you can use it to build almost any aggregate of the stream's data.


The Stream interface declares two collect methods:

  <R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);

  <R, A> R collect(Collector<? super T, A, R> collector);

The first is the simple form and the second is the advanced form. More complex operations are encapsulated behind the Collector interface, and ready-made collectors are exposed as static methods for callers to use. Let's go through them one by one.

Simple call form

The simple call form is the first of the two methods, which looks like this:

  <R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);

It is called as shown below. The first parameter, supplier, creates the result container; the second, accumulator, defines how an element is added to the container; and the third, combiner, defines how multiple containers are merged, which matters when the stream runs in parallel.

String concat = stringStream.collect(StringBuilder::new, StringBuilder::append, StringBuilder::append).toString();
// Equivalent alternative written with lambdas, which makes each role clearer
// (use one form or the other; a stream can only be consumed once)
String concat = stringStream.collect(() -> new StringBuilder(), (l, x) -> l.append(x), (r1, r2) -> r1.append(r2)).toString();
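
To see the three roles in a runnable form, here is a minimal sketch that collects strings into an ArrayList, sequentially or in parallel. The class and method names are made up for illustration; only the three-argument collect call itself comes from the Stream API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ThreeArgCollectDemo {

    // Collects the strings into a new ArrayList using the three-argument collect.
    static List<String> toArrayList(List<String> source, boolean parallel) {
        Stream<String> stream = parallel ? source.parallelStream() : source.stream();
        return stream.collect(
                ArrayList::new,     // supplier: creates a fresh container
                ArrayList::add,     // accumulator: adds one element to a container
                ArrayList::addAll); // combiner: merges the second container into the first
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d");
        System.out.println(toArrayList(words, false));
        System.out.println(toArrayList(words, true));
    }
}
```

In a sequential run the combiner is typically never invoked; in a parallel run each chunk gets its own container from the supplier, and the combiner joins them in encounter order.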

As another example, consider summing numbers. Following the requirements of collect, we first need a container sum, then the accumulation step sum + x, and finally the merge step sum1 + sum2; with those in hand the code is easy to write. Once the code below makes sense, the advanced usage will be easier to follow. Of course, the built-in sum method is the better solution in practice; this is only an illustration.

// Integer values are immutable, so a one-element array serves as the mutable container
final Integer[] integers = Lists.newArrayList(1, 2, 3, 4, 5)
        .stream()
        .collect(() -> new Integer[]{0}, (a, x) -> a[0] += x, (a1, a2) -> a1[0] += a2[0]);
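
For comparison, the following sketch puts the collect-based sum next to the built-in sum on an IntStream, without depending on Guava. The class and method names are hypothetical; the one-element int[] plays the same role as the Integer[] above.

```java
import java.util.Arrays;
import java.util.List;

class SumWithCollectDemo {

    // Sums with the three-argument collect; a one-element int[] is the mutable container.
    static int sumWithCollect(List<Integer> numbers) {
        int[] box = numbers.stream()
                .collect(() -> new int[]{0},
                         (a, x) -> a[0] += x,
                         (a1, a2) -> a1[0] += a2[0]);
        return box[0];
    }

    // The simpler alternative: map to an IntStream and call sum().
    static int sumWithMapToInt(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumWithCollect(numbers));
        System.out.println(sumWithMapToInt(numbers));
    }
}
```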

Suppose there is a Person class with two attributes, type and name, and we want to use collect to gather the people into a Map whose key is the type and whose value is the list of Person objects with that type. Once the code below is clear, you have mastered this form.

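   Lists.<Person>newArrayList().stream()
        .collect(() -> new HashMap<Integer,List<Person>>(),
            (h, x) -> {
              List<Person> value = h.getOrDefault(x.getType(), Lists.newArrayList());
              value.add(x);
              h.put(x.getType(), value);
            },
            // Merge list by list: HashMap::putAll would overwrite the list of any
            // key present in both maps when the stream runs in parallel
            (h1, h2) -> h2.forEach((k, v) -> h1.merge(k, v, (l, r) -> { l.addAll(r); return l; }))
        );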
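
The same grouping as a self-contained sketch, without Guava. The Person class here is a hypothetical stand-in built from the description above (a type and a name), and computeIfAbsent is swapped in for the getOrDefault/put pair as a shorter way to express the same accumulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupByTypeDemo {

    // Hypothetical Person class: only the two attributes mentioned in the text.
    static final class Person {
        private final int type;
        private final String name;
        Person(int type, String name) { this.type = type; this.name = name; }
        int getType() { return type; }
        String getName() { return name; }
    }

    static Map<Integer, List<Person>> groupByType(List<Person> people) {
        return people.stream().collect(
                () -> new HashMap<Integer, List<Person>>(),
                // accumulator: append the person to the list for its type
                (map, p) -> map.computeIfAbsent(p.getType(), k -> new ArrayList<>()).add(p),
                // combiner: append the right map's lists to the left map's lists
                (left, right) -> right.forEach(
                        (type, list) -> left.computeIfAbsent(type, k -> new ArrayList<>()).addAll(list)));
    }

    static List<Person> sample() {
        return Arrays.asList(new Person(1, "Ann"), new Person(2, "Bob"), new Person(1, "Cid"));
    }

    public static void main(String[] args) {
        groupByType(sample()).forEach((type, list) ->
                System.out.println(type + " -> " + list.size() + " person(s)"));
    }
}
```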

Collector Advanced Call

The Collector interface is what makes the collect operation so powerful. Most collection operations can be broken down into four main steps: provide an initial container -> add elements to the container -> merge multiple containers (for parallel execution) -> apply a final operation to the merged result. The Collector interface also provides static of methods to help you build custom collectors, and the JDK provides the Collectors class, which encapsulates most common collection operations. CollectorImpl is the Collector implementation class used inside Collectors, because an interface cannot be instantiated directly.

    // Creates the initial container
    Supplier<A> supplier();
    // Adds an element to the container
    BiConsumer<A, T> accumulator();
    // Merges two containers
    BinaryOperator<A> combiner();
    // Transforms the merged container into the final result
    Function<A, R> finisher();
    // Characteristics flags that allow the operation to be optimized
    Set<Characteristics> characteristics();
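
As an illustration of how these functions fit together, the sketch below builds a small custom collector with the static Collector.of method. The averaging collector and its class name are invented for this example (the JDK already ships Collectors.averagingInt); the point is only to show one function per step.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class AverageCollectorDemo {

    // A hypothetical averaging collector: T = Integer, A = long[] {sum, count}, R = Double.
    static Collector<Integer, long[], Double> averaging() {
        return Collector.of(
                () -> new long[2],                                    // supplier: {sum, count}
                (a, x) -> { a[0] += x; a[1]++; },                     // accumulator
                (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; },  // combiner
                a -> a[1] == 0 ? 0.0 : (double) a[0] / a[1]);         // finisher
    }

    public static void main(String[] args) {
        System.out.println(Stream.of(1, 2, 3, 4).collect(averaging()));
    }
}
```

No characteristics are passed here, so the stream always applies the finisher to convert the long[] container into the Double result.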

Collectors method encapsulation

A Collector is a combination of the five functions above. The following sections show how the Collectors class uses them to implement common collectors.

toList()

Container: ArrayList::new
Add-to-container operation: List::add
Merge of multiple containers: left.addAll(right); return left;
Result operation after merging: castingIdentity()
Characteristics: CH_ID

This one is simple, and the Map, Set, and other collectors are implemented in a similar way.

    public static <T> Collector<T, ?, List<T>> toList() {
        return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_ID);
    }
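
CollectorImpl is internal to Collectors, so user code cannot call it directly. A rough equivalent can be sketched with the public Collector.of factory; the method name toArrayList below is made up, and this is an approximation of the JDK code rather than a copy of it. The overload of Collector.of that takes no finisher marks the collector as IDENTITY_FINISH, which corresponds to the CH_ID set above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class ToListLikeDemo {

    // A toList()-like collector built from the public API (illustrative only).
    static <T> Collector<T, ?, List<T>> toArrayList() {
        return Collector.of(
                () -> new ArrayList<T>(),                               // container
                (list, t) -> list.add(t),                               // add to container
                (left, right) -> { left.addAll(right); return left; }); // merge containers
    }

    public static void main(String[] args) {
        List<String> letters = Stream.of("a", "b", "c").collect(toArrayList());
        System.out.println(letters);
        System.out.println(ToListLikeDemo.<String>toArrayList().characteristics());
    }
}
```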

joining()

Container: StringBuilder::new
Add-to-container operation: StringBuilder::append
Merge of multiple containers: r1.append(r2); return r1;
Result operation after merging: StringBuilder::toString
Characteristics: CH_NOID

    public static Collector<CharSequence, ?, String> joining() {
        return new CollectorImpl<CharSequence, StringBuilder, String>(
                StringBuilder::new, StringBuilder::append,
                (r1, r2) -> { r1.append(r2); return r1; },
                StringBuilder::toString, CH_NOID);
    }
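
From the caller's side, joining() is used like any other collector. The short sketch below shows the no-argument form discussed above and, for contrast, the three-argument overload that takes a delimiter, prefix, and suffix; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {

    // Plain concatenation, as implemented by the joining() source above.
    static String joinPlain(List<String> parts) {
        return parts.stream().collect(Collectors.joining());
    }

    // Concatenation with a delimiter, prefix, and suffix.
    static String joinDelimited(List<String> parts) {
        return parts.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(joinPlain(parts));     // abc
        System.out.println(joinDelimited(parts)); // [a, b, c]
    }
}
```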

Now let's look at a more complex one.

groupingBy()

groupingBy is an advanced form of toMap that makes up for the fact that toMap cannot apply a further collection operation to its values. For example, toMap is not convenient when the result should be a Map whose values are themselves collections, such as Map<K, List<V>>. groupingBy focuses on encapsulating the processing of keys and values. In the following signature, classifier computes the key, mapFactory specifies the Map implementation used as the container, and downstream is the collection operation applied to the values. The implementation is not analyzed here; in essence it puts each value into the container for its key, one by one.

   public static <T, K, D, A, M extends Map<K, D>>
   Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
                                 Supplier<M> mapFactory,
                                 Collector<? super T, A, D> downstream) {
       .......
   }

The grouping done earlier with the raw collect method can easily be rewritten using groupingBy:

// Raw collect form
   Lists.<Person>newArrayList().stream()
        .collect(() -> new HashMap<Integer,List<Person>>(),
            (h, x) -> {
              List<Person> value = h.getOrDefault(x.getType(), Lists.newArrayList());
              value.add(x);
              h.put(x.getType(), value);
            },
            // Merge list by list: HashMap::putAll would overwrite the list of any
            // key present in both maps when the stream runs in parallel
            (h1, h2) -> h2.forEach((k, v) -> h1.merge(k, v, (l, r) -> { l.addAll(r); return l; }))
        );
// groupingBy form
Lists.<Person>newArrayList().stream()
        .collect(Collectors.groupingBy(Person::getType, HashMap::new, Collectors.toList()));
// Because the downstream collector operates on the values, they can be converted more flexibly
Lists.<Person>newArrayList().stream()
        .collect(Collectors.groupingBy(Person::getType, HashMap::new, Collectors.mapping(Person::getName, Collectors.toSet())));
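
A runnable sketch of the groupingBy forms, again with a hypothetical Person class standing in for the one described in the text. TreeMap is used as the mapFactory here only so that the printed key order is predictable; counting() is an extra downstream collector added for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingByDemo {

    // Hypothetical Person class with the two attributes the text describes.
    static final class Person {
        private final int type;
        private final String name;
        Person(int type, String name) { this.type = type; this.name = name; }
        int getType() { return type; }
        String getName() { return name; }
    }

    // Names per type: classifier = getType, mapFactory = TreeMap, downstream = mapping + toList.
    static Map<Integer, List<String>> namesByType(List<Person> people) {
        return people.stream().collect(
                Collectors.groupingBy(Person::getType, TreeMap::new,
                        Collectors.mapping(Person::getName, Collectors.toList())));
    }

    // Number of people per type, using counting() as the downstream collector.
    static Map<Integer, Long> countByType(List<Person> people) {
        return people.stream().collect(
                Collectors.groupingBy(Person::getType, Collectors.counting()));
    }

    static List<Person> sample() {
        return Arrays.asList(new Person(1, "Ann"), new Person(2, "Bob"), new Person(1, "Cid"));
    }

    public static void main(String[] args) {
        System.out.println(namesByType(sample())); // {1=[Ann, Cid], 2=[Bob]}
        System.out.println(countByType(sample()));
    }
}
```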

reducing()

reducing collects the elements into a single value of type T.

Container: boxSupplier(identity), which wraps the identity value in an Object[] array of length 1; the reason, as before, is that the value itself is immutable and needs a mutable holder
Add-to-container operation: a[0] = op.apply(a[0], t)
Merge of multiple containers: a[0] = op.apply(a[0], b[0]); return a;
Result operation after merging: a -> a[0]
Characteristics: CH_NOID

    public static <T> Collector<T, ?, T> reducing(T identity, BinaryOperator<T> op) {
        return new CollectorImpl<>(
                boxSupplier(identity),
                (a, t) -> { a[0] = op.apply(a[0], t); },
                (a, b) -> { a[0] = op.apply(a[0], b[0]); return a; },
                a -> a[0],
                CH_NOID);
    }

Next, the earlier collect-based sum can be rewritten with reducing:

// Raw collect form
final Integer[] integers = Lists.newArrayList(1, 2, 3, 4, 5)
        .stream()
        .collect(() -> new Integer[]{0}, (a, x) -> a[0] += x, (a1, a2) -> a1[0] += a2[0]);
// reducing form
final Integer collected = Lists.newArrayList(1, 2, 3, 4, 5)
        .stream()
        .collect(Collectors.reducing(0, Integer::sum));
// Of course, Stream also provides a reduce operation directly
final Integer reduced = Lists.newArrayList(1, 2, 3, 4, 5)
        .stream().reduce(0, Integer::sum);
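
Since Stream.reduce already covers the top-level case, reducing tends to be more useful as a downstream collector. The sketch below, with illustrative class and method names, sums a list directly and then sums it per group, using groupingBy with reducing as the downstream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ReducingDemo {

    // Top-level use: equivalent to stream.reduce(0, Integer::sum).
    static Integer sum(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.reducing(0, Integer::sum));
    }

    // Downstream use: one sum per group (key true = even numbers, false = odd numbers).
    static Map<Boolean, Integer> sumByParity(List<Integer> numbers) {
        return numbers.stream().collect(
                Collectors.groupingBy(n -> n % 2 == 0,
                        Collectors.reducing(0, Integer::sum)));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sum(numbers));
        System.out.println(sumByParity(numbers));
    }
}
```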

Possible problems

This section records some small pitfalls encountered when using these tools in production.

Exceptions thrown by toMap

Exceptions from toMap come from two sources:

  1. The operation calls the map.merge method, which throws an NPE if the value is null, even if you are using a HashMap, which otherwise accepts null values. I don't know why it is designed like this.
  2. If the conflict merge policy, that is, the third parameter BinaryOperator<U> mergeFunction, is not specified, a duplicate key causes an IllegalStateException to be thrown directly, so take care.
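
Both pitfalls can be reproduced with a few lines. The sketch below uses made-up method names; the workaround for null values (falling back to the three-argument collect with HashMap::put) is one common option and an assumption of this example, not something the article prescribes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapPitfallsDemo {

    // Pitfall 1: a null value makes toMap throw NullPointerException.
    static boolean nullValueThrows(List<String> keys) {
        try {
            keys.stream().collect(Collectors.toMap(Function.identity(), k -> (String) null));
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Pitfall 2: duplicate keys without a merge function throw IllegalStateException.
    static boolean duplicateKeyThrows(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(String::length, Function.identity()));
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Supplying the third parameter resolves key conflicts; here the first value wins.
    static Map<Integer, String> keepFirst(List<String> words) {
        return words.stream().collect(
                Collectors.toMap(String::length, Function.identity(), (first, second) -> first));
    }

    // Possible workaround for null values: the three-argument collect with HashMap::put.
    static Map<String, String> lookupAll(List<String> keys, Map<String, String> source) {
        return keys.stream().collect(HashMap::new, (m, k) -> m.put(k, source.get(k)), HashMap::putAll);
    }

    public static void main(String[] args) {
        System.out.println(nullValueThrows(Arrays.asList("a")));
        System.out.println(duplicateKeyThrows(Arrays.asList("bb", "cc")));
        System.out.println(keepFirst(Arrays.asList("a", "bb", "cc")));
        System.out.println(lookupAll(Arrays.asList("x"), new HashMap<>()));
    }
}
```

Note that the HashMap::put workaround silently lets later duplicates overwrite earlier ones, so it trades one pitfall for a different behavior.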

Conclusion

By now, the collect operation should be very clear. I hope these examples help you grasp the core, namely the role of each function in the Collector interface, and that this article is useful to you.

Personal blog: mrDear.cn. Feedback and discussion are welcome.