As mentioned in previous articles, collectors are the core of a Stream: they gather the processed data. The Collectors class provides many powerful APIs for collecting the results into lists, sets, maps, and even more complex structures (nested combinations of the three).

Collectors has many methods, and most of them are overloaded. I personally classify them into three categories:

  1. Data collection: Set, Map, and List
  2. Aggregation and reduction: statistics, summation, maximum, average, string concatenation, reduction
  3. Pre- and post-processing: partitioning, grouping, and custom operations
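
To make the three categories concrete, here is a minimal sketch with one collector from each. The class name `CategoryDemo` and the sample words are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CategoryDemo {

    // Hypothetical sample data, used only for this illustration.
    static final List<String> WORDS = Arrays.asList("map", "set", "list", "stream");

    // 1. Data collection: gather the elements into a List.
    static List<String> collected() {
        return WORDS.stream().collect(Collectors.toList());
    }

    // 2. Aggregation and reduction: concatenate the elements into one String.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(","));
    }

    // 3. Pre- and post-processing: group by word length, then count each group.
    static Map<Integer, Long> countByLength() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(collected());     // [map, set, list, stream]
        System.out.println(joined());        // map,set,list,stream
        System.out.println(countByLength()); // counts per length: 3 -> 2, 4 -> 1, 6 -> 1
    }
}
```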

Using the API

I'll cover some of the common APIs here, but not all of them: there are too many, and the ways they can be combined are dauntingly complex.

Data collection

  1. Collectors.toCollection() collects data into a Collection. Any Collection implementation, such as ArrayList or HashSet, can be used. The method takes a factory (a Supplier) that creates the target Collection.

    Example:

        // List
        Stream.of(1, 2, 3, 4, 5, 6, 8, 9, 0)
                .collect(Collectors.toCollection(ArrayList::new));

        // Set
        Stream.of(1, 2, 3, 4, 5, 6, 8, 9, 0)
                .collect(Collectors.toCollection(HashSet::new));
  2. Collectors.toList() and Collectors.toSet() are similar to Collectors.toCollection(), except that the container is fixed: in the JDK source it is an ArrayList and a HashSet respectively. I expected the two methods to call Collectors.toCollection() internally, but instead each creates a new CollectorImpl directly.

    Expectation:

        public static <T> Collector<T, ?, List<T>> toList() {
            return toCollection(ArrayList::new);
        }

        public static <T> Collector<T, ?, Set<T>> toSet() {
            return toCollection(HashSet::new);
        }

    Actual:

        public static <T> Collector<T, ?, List<T>> toList() {
            return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                       (left, right) -> { left.addAll(right); return left; },
                                       CH_ID);
        }

        public static <T> Collector<T, ?, Set<T>> toSet() {
            return new CollectorImpl<>((Supplier<Set<T>>) HashSet::new, Set::add,
                                       (left, right) -> { left.addAll(right); return left; },
                                       CH_UNORDERED_ID);
        }

    At first I really could not work out why. Later I found that CollectorImpl needs a Set<Collector.Characteristics>. Because a Set is unordered, the implementation of toSet() passes in CH_UNORDERED_ID, whereas toCollection() always uses CH_ID, so toSet() at least cannot simply delegate to it. If anyone knows the full reasoning, please let me know.

    Example:

        // List
        Stream.of(1, 2, 3, 4, 5, 6, 8, 9, 0)
                .collect(Collectors.toList());

        // Set
        Stream.of(1, 2, 3, 4, 5, 6, 8, 9, 0)
                .collect(Collectors.toSet());
  3. Collectors.toMap() and Collectors.toConcurrentMap() do what their names suggest: they collect into a Map and a ConcurrentMap, using HashMap and ConcurrentHashMap by default. toConcurrentMap() supports concurrent collection from a parallel stream. Each has three overloads. The difference from collecting into a Collection is that a Map stores key-value pairs, so when collecting into a Map you must specify the key to collect by. The smallest overloads of toMap() and toConcurrentMap() take two parameters: a function that extracts the key and a function that produces the value to store.

    Example: Here we use the Student structure, which contains id and name.

    public class Student {

        // unique
        private String id;

        private String name;

        public Student() {
        }

        public Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        public String getId() {
            return id;
        }

        public void setId(String id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    Note: the key is the ID; the value can be the object itself or one of its fields. As you can see, collecting into a Map is highly customizable.

     				
        Student studentA = new Student("20190001", "Xiao Ming");
        Student studentB = new Student("20190002", "Little red");
        Student studentC = new Student("20190003", "Ding");

        // Function.identity() returns the object itself, so the result is
        // Map<String, Student>: id -> Student
        // Serial collection
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.toMap(Student::getId, Function.identity()));

        // Concurrent collection
        Stream.of(studentA, studentB, studentC)
                .parallel()
                .collect(Collectors.toConcurrentMap(Student::getId, Function.identity()));

        // ================================================================

        // Map<String, String>: id -> name
        // Serial collection
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.toMap(Student::getId, Student::getName));

        // Concurrent collection
        Stream.of(studentA, studentB, studentC)
                .parallel()
                .collect(Collectors.toConcurrentMap(Student::getId, Student::getName));

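    What if a key is duplicated? With the two-argument overloads, a duplicate key throws an IllegalStateException. The three-argument overloads take a merge function that resolves the conflict. Here we assume that two students may share the same ID; when that happens, the Student with the larger name is kept and the smaller one is discarded as the Map is built.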

        // Map<String, Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                BinaryOperator
                                        .maxBy(Comparator.comparing(Student::getName))));

        // If the logic is more complicated, write it imperatively
        // Map<String, Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                (s1, s2) -> {
                                    // compareTo returns a negative number if s1 < s2,
                                    // 0 if they are equal, and a positive number otherwise
                                    if (s1.getName().compareTo(s2.getName()) < 0) {
                                        return s2;
                                    } else {
                                        return s1;
                                    }
                                }));

    If you don't want the default HashMap or ConcurrentHashMap, the third overload also accepts a custom Map factory (a Supplier that creates the Map).

        // Use a LinkedHashMap instead of the default HashMap
        // Map<String, Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors
                        .toMap(Student::getId,
                                Function.identity(),
                                BinaryOperator
                                        .maxBy(Comparator.comparing(Student::getName)),
                                LinkedHashMap::new));
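
The duplicate-key behaviour described above can be sketched as follows. This is a minimal illustration rather than the article's original code: the class `ToMapDuplicateDemo`, its trimmed-down `Student`, and the sample names are all assumptions. Without a merge function a duplicate key throws IllegalStateException; with one, the merge function decides which value survives.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapDuplicateDemo {

    // Trimmed-down stand-in for the article's Student class (assumption).
    static final class Student {
        private final String id;
        private final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String getId() { return id; }

        String getName() { return name; }
    }

    // Two students deliberately share the ID "20190001".
    static Stream<Student> students() {
        return Stream.of(
                new Student("20190001", "Ann"),
                new Student("20190001", "Bob"),
                new Student("20190002", "Cid"));
    }

    // Two-argument toMap: a duplicate key throws IllegalStateException.
    static boolean failsWithoutMergeFunction() {
        try {
            students().collect(Collectors.toMap(Student::getId, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Four-argument toMap: keep the student with the larger name, in a LinkedHashMap.
    static Map<String, String> keepLargerName() {
        Map<String, Student> byId = students().collect(Collectors.toMap(
                Student::getId,
                Function.identity(),
                BinaryOperator.maxBy(Comparator.comparing(Student::getName)),
                LinkedHashMap::new));
        // Project to id -> name so the result is easy to print.
        Map<String, String> names = new LinkedHashMap<>();
        byId.forEach((id, student) -> names.put(id, student.getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutMergeFunction()); // true
        System.out.println(keepLargerName());            // {20190001=Bob, 20190002=Cid}
    }
}
```

The LinkedHashMap factory is used here only so that the printed order follows the encounter order of the keys.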

Aggregation and reduction

  1. Collectors.joining() concatenates strings and has three overloads. The no-argument version appends to a StringBuilder internally; the other two accept a custom delimiter and, optionally, a prefix and suffix. This is useful when you want to turn a list into a String, and being able to specify the delimiter is very handy.

    Example:

    		
        Student studentA = new Student("20190001", "Xiao Ming");
        Student studentB = new Student("20190002", "Little red");
        Student studentC = new Student("20190003", "Ding");

        // No delimiter: 201900012019000220190003
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining());

        // Use ^_^ as the delimiter
        // 20190001^_^20190002^_^20190003
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining("^_^"));

        // Use ^_^ as the delimiter
        // [ and ] are the prefix and suffix
        // [20190001^_^20190002^_^20190003]
        Stream.of(studentA, studentB, studentC)
                .map(Student::getId)
                .collect(Collectors.joining("^_^", "[", "]"));
  2. Collectors.counting() counts the elements. It has the same effect as Stream.count(); the difference is that counting() returns a boxed Long while count() returns a primitive long.

    Example:

        // Long 8
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.counting());

        // If you only need the count, there is no need for Collectors, which costs more resources
        // long 8
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .count();
  3. Collectors.minBy() and Collectors.maxBy() work the same way as Stream.min() and Stream.max(). The collector versions are meant for advanced scenarios, such as serving as a downstream collector.

    Example:

        // maxBy 200
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.maxBy(Integer::compareTo)).ifPresent(System.out::println);

        // max 200
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .max(Integer::compareTo).ifPresent(System.out::println);

        // minBy -80
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .collect(Collectors.minBy(Integer::compareTo)).ifPresent(System.out::println);

        // min -80
        Stream.of(1, 0, -10, 9, 8, 100, 200, -80)
                .min(Integer::compareTo).ifPresent(System.out::println);
  4. Collectors.summarizingInt(), Collectors.summarizingLong(), and Collectors.summarizingDouble() summarize int, long, and double data respectively. Each returns a summary-statistics object (IntSummaryStatistics, LongSummaryStatistics, or DoubleSummaryStatistics) containing count, sum, min, average, and max. Although IntStream, LongStream, and DoubleStream can all compute a sum with sum(), that yields only the sum. The summarizing collectors are very handy when you want the count, the average, and so on in one pass.

    Example:

        // IntSummaryStatistics{count=10, sum=55, min=1, average=5.500000, max=10}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingInt(Integer::valueOf));

        // DoubleSummaryStatistics{count=10, sum=55.000000, min=1.000000, average=5.500000, max=10.000000}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingDouble(Double::valueOf));

        // LongSummaryStatistics{count=10, sum=55, min=1, average=5.500000, max=10}
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.summarizingLong(Long::valueOf));


        // 55
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToInt(Integer::valueOf)
                .sum();

        // 55.0
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToDouble(Double::valueOf)
                .sum();

        // 55
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).mapToLong(Long::valueOf)
                .sum();
  5. Collectors.averagingInt(), Collectors.averagingDouble(), and Collectors.averagingLong() compute the average. They are suited to advanced scenarios, which will be covered later.

    Example:

        // Each of the three returns a Double: 5.5
        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingInt(Integer::valueOf));

        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingDouble(Double::valueOf));

        Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
                .collect(Collectors.averagingLong(Long::valueOf));
  6. Collectors.reducing() is much the same thing as Stream.reduce(); both are reduction operations. Collectors.counting(), for example, is implemented with reducing() in the JDK 8 source, as this code shows:

    public static <T> Collector<T, ?, Long> counting() {
        return reducing(0L, e -> 1L, Long::sum);
    }

    In the same way, we can implement a reduction that sums the lengths of all student names.

    Example:

        // Optional[23] (name lengths 9 + 10 + 4)
        Stream.of(studentA, studentB, studentC)
                .map(student -> student.getName().length())
                .collect(Collectors.reducing(Integer::sum));

        // 23
        // Alternatively, specify an initial value, so the call still works when there are no elements
        Stream.of(studentA, studentB, studentC)
                .map(student -> student.getName().length())
                .collect(Collectors.reducing(0, (i1, i2) -> i1 + i2));


        // 23
        // Or skip the map() step and do the conversion inside the reduction
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.reducing(0, s -> s.getName().length(), Integer::sum));
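
A short sketch of why the initial (identity) value matters in reducing(): the one-argument overload returns an Optional, which is empty for an empty stream, while the overloads that take an identity return that identity. The class `ReducingDemo` and its method names are hypothetical and exist only for this illustration.

```java
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ReducingDemo {

    // One-argument reducing(): no identity, so the result is wrapped in an Optional.
    static Optional<Integer> totalLength(String... names) {
        return Stream.of(names)
                .map(String::length)
                .collect(Collectors.reducing(Integer::sum));
    }

    // Three-argument reducing(): identity 0, mapping and summing in one collector.
    static int totalLengthOrZero(String... names) {
        return Stream.of(names)
                .collect(Collectors.reducing(0, String::length, Integer::sum));
    }

    // Counting expressed through reducing(), mirroring the JDK snippet quoted above.
    static long count(String... names) {
        return Stream.of(names)
                .collect(Collectors.reducing(0L, e -> 1L, Long::sum));
    }

    public static void main(String[] args) {
        System.out.println(totalLength("Xiao Ming", "Ding")); // Optional[13]
        System.out.println(totalLength());                    // Optional.empty
        System.out.println(totalLengthOrZero());              // 0
        System.out.println(count("a", "b", "c"));             // 3
    }
}
```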

Pre- and post-processing

  1. Collectors.groupingBy() and Collectors.groupingByConcurrent() differ only in their usage scenario: single-threaded versus multi-threaded. Why is groupingBy classified as pre- and post-processing? Because groupingBy groups the data before it is collected, and then passes the grouped data to the downstream collector.

    This is the groupingBy overload with the most parameters: classifier is the classification function, mapFactory is the factory for the result Map, and downstream is the downstream collector, which can do a lot of work on each group's data before it is delivered.

    public static <T, K, D, A, M extends Map<K, D>>
    Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier,
                                  Supplier<M> mapFactory,
                                  Collector<? super T, A, D> downstream)

    Example: a set of integers is divided into negative, zero, and positive. groupingByConcurrent() takes the same arguments.

        // Map<String, List<Integer>>
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "Less than";
                    } else if (integer == 0) {
                        return "Equal";
                    } else {
                        return "More than";
                    }
                }));

        // Map<String, Set<Integer>>
        // Customize the downstream collector
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "Less than";
                    } else if (integer == 0) {
                        return "Equal";
                    } else {
                        return "More than";
                    }
                }, Collectors.toSet()));

        // Map<String, Set<Integer>>
        // Customize the map container and the downstream collector
        Stream.of(-6, -7, -8, -9, 1, 2, 3, 4, 5, 6)
                .collect(Collectors.groupingBy(integer -> {
                    if (integer < 0) {
                        return "Less than";
                    } else if (integer == 0) {
                        return "Equal";
                    } else {
                        return "More than";
                    }
                }, LinkedHashMap::new, Collectors.toSet()));
  2. Collectors.partitioningBy() partitions the data, as its name says, but it can split the data into at most two parts: it partitions by a Predicate, and a Predicate can only yield true or false. Apart from taking a Predicate instead of a classifier, partitioningBy mirrors groupingBy: it accepts an optional downstream collector, although it has no overload that takes a map factory.

    Example:

        // Map<Boolean, List<Integer>>
        Stream.of(0, 1, 0, 1)
                .collect(Collectors.partitioningBy(integer -> integer == 0));

        // Map<Boolean, Set<Integer>>
        // Customize the downstream collector
        Stream.of(0, 1, 0, 1)
                .collect(Collectors.partitioningBy(integer -> integer == 0, Collectors.toSet()));
  3. Collectors.mapping() lets you choose which field to collect: it applies a mapping function to each element before handing it to a downstream collector.

    Example:

        // List<String>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.mapping(Student::getName, Collectors.toList()));
  4. Collectors.collectingAndThen() performs an extra operation after collection. It is very useful when you want to collect the data and then act on the result.

    Example: here the collected list is turned into a ListIterator. This is only a simple example; what you actually do in the finishing step is limited only by your imagination.

        // ListIterator<Student>
        Stream.of(studentA, studentB, studentC)
                .collect(Collectors.collectingAndThen(Collectors.toList(), List::listIterator));
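
Because groupingBy() and partitioningBy() accept a downstream collector, the collectors from the earlier sections can be nested inside them. The following is a minimal sketch of that idea, assuming a hypothetical `DownstreamDemo` class and made-up sample words; it is one possible combination, not the only one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class DownstreamDemo {

    // Hypothetical sample data.
    static final List<String> WORDS = Arrays.asList("kiwi", "fig", "plum", "pear", "apple");

    // groupingBy + mapping: group by length, but collect only the upper-cased words.
    static Map<Integer, List<String>> upperByLength() {
        return WORDS.stream().collect(Collectors.groupingBy(
                String::length,
                TreeMap::new, // map factory: keys come out sorted
                Collectors.mapping(String::toUpperCase, Collectors.toList())));
    }

    // partitioningBy + counting: how many words are longer than three characters?
    static Map<Boolean, Long> countLongWords() {
        return WORDS.stream().collect(Collectors.partitioningBy(
                word -> word.length() > 3,
                Collectors.counting()));
    }

    // collectingAndThen: collect to a List, then wrap it as unmodifiable.
    static List<String> readOnlyCopy() {
        return WORDS.stream().collect(Collectors.collectingAndThen(
                Collectors.toList(),
                Collections::unmodifiableList));
    }

    public static void main(String[] args) {
        System.out.println(upperByLength());  // {3=[FIG], 4=[KIWI, PLUM, PEAR], 5=[APPLE]}
        System.out.println(countLongWords()); // {false=1, true=4}
        System.out.println(readOnlyCopy());   // [kiwi, fig, plum, pear, apple]
    }
}
```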

Conclusion

As the core of Stream, Collectors offers a rich and powerful set of tools; almost all the business code I have written relies on it. The most complicated operation I have written in business code combined these APIs to sort by ID. It took me about six hours, but it worked.

One more thing: some Stream operations overlap with Collectors. When a Stream operation can do the job, use it instead of the equivalent collector to reduce overhead.
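
As a rough illustration of how Stream operations and collectors overlap, the sketch below pairs a few Stream operations with collectors that produce the same result. The class `OverlapDemo` is hypothetical, and the sketch only shows that the results match; it does not measure overhead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class OverlapDemo {

    static final List<Integer> NUMBERS = Arrays.asList(1, 0, -10, 9, 8, 100, 200, -80);

    // Stream.count() versus Collectors.counting().
    static long countWithStream() {
        return NUMBERS.stream().count();
    }

    static long countWithCollector() {
        return NUMBERS.stream().collect(Collectors.counting());
    }

    // Stream.max() versus Collectors.maxBy().
    static Optional<Integer> maxWithStream() {
        return NUMBERS.stream().max(Integer::compareTo);
    }

    static Optional<Integer> maxWithCollector() {
        return NUMBERS.stream().collect(Collectors.maxBy(Integer::compareTo));
    }

    // IntStream.sum() versus Collectors.summingInt().
    static int sumWithStream() {
        return NUMBERS.stream().mapToInt(Integer::intValue).sum();
    }

    static int sumWithCollector() {
        return NUMBERS.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(countWithStream() + " " + countWithCollector()); // 8 8
        System.out.println(maxWithStream() + " " + maxWithCollector());     // Optional[200] Optional[200]
        System.out.println(sumWithStream() + " " + sumWithCollector());     // 228 228
    }
}
```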