As a Java developer, you often work with collections. In the past, my first instinct when facing a collection-processing requirement was to reach for a for-each loop, write the skeleton of the algorithm inside the loop body, and then add a handful of private methods to handle the details. Having recently fallen in love with Java 8’s functional programming style and the Stream API, a stream pipeline is now the first thing that comes to mind for this kind of problem.

This article records a fairly complex collection-processing requirement that I ran into today. I could have written it in an imperative style, which would have felt more intuitive and easier to understand. In the end, though, I couldn’t resist the temptation of streams, because pipelines are just too pleasant to write.

Problem description

First, a database query and some follow-up processing of its results give us a collection of type List<Map<String, Object>>. Second, the elements of the list are grouped by a combination of certain keys. Then, within each group, a signed sum is computed over one of the keys. Finally, the processed collection is output, again as a List<Map<String, Object>>. In addition, because the data may already have been sorted by some keys before processing, the original sort order should be preserved after processing.

That description is rather abstract, so let’s make the requirement concrete with some data (the data here is not the real data from my work; it is purely fictional, and any resemblance is coincidental). The collection before processing is built by the following code:

List<Map<String, Object>> generateData() {
    List<Map<String, Object>> data = new ArrayList<>();

    Map<String, Object> rec1 = new HashMap<String, Object>() {{
        put("Product", "A");
        put("Lot", 1);
        put("Sign", "Positive");
        put("Units", 100);
    }};
    Map<String, Object> rec2 = new HashMap<String, Object>() {{
        put("Product", "A");
        put("Lot", 1);
        put("Sign", "Positive");
        put("Units", 200);
    }};
    Map<String, Object> rec3 = new HashMap<String, Object>() {{
        put("Product", "A");
        put("Lot", 1);
        put("Sign", "Negative");
        put("Units", 100);
    }};
    Map<String, Object> rec4 = new HashMap<String, Object>() {{
        put("Product", "A");
        put("Lot", 2);
        put("Sign", "Negative");
        put("Units", 300);
    }};
    Map<String, Object> rec5 = new HashMap<String, Object>() {{
        put("Product", "A");
        put("Lot", 2);
        put("Sign", "Negative");
        put("Units", 400);
    }};
    Map<String, Object> rec6 = new HashMap<String, Object>() {{
        put("Product", "B");
        put("Lot", 1);
        put("Sign", "Positive");
        put("Units", 400);
    }};
    Map<String, Object> rec7 = new HashMap<String, Object>() {{
        put("Product", "B");
        put("Lot", 2);
        put("Sign", "Negative");
        put("Units", 150);
    }};

    data.add(rec1);
    data.add(rec2);
    data.add(rec3);
    data.add(rec4);
    data.add(rec5);
    data.add(rec6);
    data.add(rec7);
    return data;
}

Each element in the list is of type Map<String, Object> and contains four keys: Product, Lot, Sign, and Units. Product and Lot together are treated as a composite ID, while Sign and Units together represent a signed quantity of the product (positive or negative). The requirement is as follows: the output is still a list whose elements are of type Map<String, Object>, but each element contains only three keys: Product, Lot, and Units. Each Product/Lot composite ID appears exactly once, and Units is the signed sum over all records that share that composite ID.

For example, the output for this data contains four elements, A1 (Product A, Lot 1), A2, B1, and B2, with Units of 200, -700, 400, and -150 respectively. For A1 this is 100 + 200 - 100 = 200, and for A2 it is -300 - 400 = -700.

Also, the original list is sorted by Product and then Lot in ascending order, and the output list must be sorted by the same rule.

Imperative-style solution

The requirement should now be clearer. In the traditional imperative style, the code looks like this (leaving the sorting requirement aside for now):

import org.apache.commons.lang3.StringUtils;
import java.util.*;

public void traditionalSolution() {
    List<Map<String, Object>> data = generateData();

    Map<String, List<Map<String, Object>>> groupedData = grouping(data);
    List<Map<String, Object>> result = aggregating(groupedData);
    System.out.println(result);
}

private Map<String, List<Map<String, Object>>> grouping(List<Map<String, Object>> data) {
    Map<String, List<Map<String, Object>>> result = new HashMap<>();
    for (Map<String, Object> record : data) {
        String compositeKey = getCompositeKey(record);
        if (!result.containsKey(compositeKey)) {
            List<Map<String, Object>> products = new ArrayList<>();
            result.put(compositeKey, products);
        }
        result.get(compositeKey).add(record);
    }

    return result;
}

private List<Map<String, Object>> aggregating(Map<String, List<Map<String, Object>>> productsByKey) {
    List<Map<String, Object>> result = new ArrayList<>();
    for (Map.Entry<String, List<Map<String, Object>>> entry : productsByKey.entrySet()) {
        List<Map<String, Object>> products = entry.getValue();
        Map<String, Object> merged = merging(products);
        result.add(merged);
    }

    return result;
}

private Map<String, Object> merging(List<Map<String, Object>> records) {
    Integer sum = 0;
    String product = "";
    Integer lot = 0;

    for (Map<String, Object> record : records) {
        product = (String) record.get("Product");
        lot = (Integer) record.get("Lot");

        Integer factor = "Positive".equals(record.get("Sign")) ? 1 : -1;
        Integer units = (Integer) record.getOrDefault("Units", 0) * factor;

        sum += units;
    }

    Map<String, Object> result = new HashMap<>();
    result.put("Product", product);
    result.put("Lot", lot);
    result.put("Units", sum);

    return result;
}

private String getCompositeKey(Map<String, Object> rec) {
    return StringUtils.join(rec.get("Product"), rec.get("Lot"));
}

The output is:

[{Lot=1, Product=A, Units=200}, {Lot=2, Product=B, Units=-150}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]

As you can see, the result matches our expectations (the order is scrambled, but we are leaving that requirement aside for now). The problem is solved in about 60 lines of code, which is not a lot, but the logic is not easy to take in at a glance. Requirements such as grouping, on the other hand, are easy to express with the Stream API, and the resulting logic is clear. Next, part of the code is rewritten with the Stream API to make it more concise.

Introducing some stream operations

First, change the implementation of the grouping method and leave the rest untouched (one step at a time):

import static java.util.stream.Collectors.*;
private Map<String, List<Map<String, Object>>> grouping(List<Map<String, Object>> data) {
    return data.stream().collect(groupingBy(this::getCompositeKey));
}

Wow! A single line of code now does what used to take 11 lines, and the purpose of the method is obvious. The output is as follows:

[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]

The data in the list is the same as before, but the order of the elements has changed, because the stream-based grouping builds its map differently internally and neither version guarantees the iteration order of the grouped map. That doesn’t matter here: if we need to, we can sort the result afterwards to make sure the output is exactly as expected.
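
If the encounter order of the groups matters, one option is the three-argument overload of Collectors.groupingBy, which accepts a map factory. The sketch below is only an illustration and not part of the original solution: the class name OrderedGroupingSketch and the helper rec are hypothetical, and the records carry only the two ID fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedGroupingSketch {
    // Hypothetical helper that builds a record with only the two ID fields.
    static Map<String, Object> rec(String product, int lot) {
        Map<String, Object> m = new HashMap<>();
        m.put("Product", product);
        m.put("Lot", lot);
        return m;
    }

    // Groups by Product + Lot; the LinkedHashMap factory keeps keys in first-seen order.
    static Map<String, List<Map<String, Object>>> groupInEncounterOrder(List<Map<String, Object>> data) {
        return data.stream().collect(Collectors.groupingBy(
                (Map<String, Object> r) -> "" + r.get("Product") + r.get("Lot"),
                LinkedHashMap::new,
                Collectors.toList()));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> data =
                Arrays.asList(rec("B", 2), rec("A", 1), rec("B", 2), rec("A", 2));
        // Keys come out in the order in which they were first encountered.
        System.out.println(new ArrayList<>(groupInEncounterOrder(data).keySet())); // [B2, A1, A2]
    }
}
```

With a sequential stream over an already sorted list, this keeps the groups in the input order, which may make a later sort unnecessary.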

Can we go further?

We would certainly like to refactor the aggregating and merging methods in the same way and get the same benefit. As the imperative code shows, however, this is not so easy: aggregating and merging each contain their own loop, and merging is called from inside the loop body of aggregating.

Looking more closely, the merging method takes an argument of type List<Map<String, Object>> and returns a Map<String, Object>. In the abstract, this behaves much like a Collectors.reducing operation, which applies a reduction function to the elements of the list in turn and collapses them into a single element.

On the other hand, the idea behind the traditionalSolution method is grouping -> reduction -> collection. The grouping is done in the previous step, and the reduction is done by the merging method. The grouping step returns a Map, while our final target type is a List, so we need to convert the result returned by the collector into another type. This is usually done with a collector returned by the Collectors.collectingAndThen factory method (see p. 132 of the second edition of Java in Action).
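
To make the grouping -> reduction -> conversion idea concrete, here is a minimal sketch on plain integers rather than on the article's records. The class name ReducingSketch and the parity buckets are assumptions made purely for illustration; a TreeMap is used so that the bucket order is predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ReducingSketch {
    static List<Integer> sumsByParity(List<Integer> numbers) {
        // groupingBy + reducing: one sum per bucket, kept in a sorted TreeMap ("even" before "odd").
        Collector<Integer, ?, Map<String, Integer>> sumPerBucket = Collectors.groupingBy(
                (Integer n) -> n % 2 == 0 ? "even" : "odd",
                TreeMap::new,
                Collectors.reducing(0, Integer::sum));
        // collectingAndThen: a finishing function converts the Map result into a List.
        return numbers.stream().collect(
                Collectors.collectingAndThen(sumPerBucket, grouped -> new ArrayList<>(grouped.values())));
    }

    public static void main(String[] args) {
        // even: 2 + 4 + 6 = 12, odd: 1 + 3 + 5 = 9
        System.out.println(sumsByParity(Arrays.asList(1, 2, 3, 4, 5, 6))); // [12, 9]
    }
}
```

This sketch uses the two-argument reducing with an identity value of 0, so no Optional is involved; the record-based version in this article has no obvious identity element, which is why it relies on the single-argument form.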

Further integration

The code for further integration looks like this:

import org.apache.commons.lang3.StringUtils;
import java.util.*;
import static java.util.stream.Collectors.*;

public void functionalSolution() {
    List<Map<String, Object>> data = generateData();

    List<Map<String, Object>> processedData =
        data.stream()
            .map(this::transform)
            .collect(collectingAndThen(
                    groupingBy(this::getCompositeKey, reducing(this::accumulating)),                                // The Collector to convert
                    map -> map.values().stream().filter(Optional::isPresent).map(Optional::get).collect(toList())   // Conversion function
            ));
    System.out.println(processedData);
}

private Map<String, Object> transform(Map<String, Object> originalData) {
    Map<String, Object> newData = new HashMap<>(originalData);
    Integer units = (Integer) originalData.getOrDefault("Units", 0);
    // get and remove "Sign"
    Integer factor = "Negative".equals(newData.remove("Sign")) ? -1 : 1;

    newData.put("Units", factor * units);
    return newData;
}

private Map<String, Object> accumulating(Map<String, Object> m1, Map<String, Object> m2) {
    Map<String, Object> accumulated = new HashMap<>(m1);

    Integer units1 = (Integer) m1.getOrDefault("Units", 0);
    Integer units2 = (Integer) m2.getOrDefault("Units", 0);

    accumulated.put("Units", units1 + units2);
    return accumulated;
}

private String getCompositeKey(Map<String, Object> rec) {
    return StringUtils.join(rec.get("Product"), rec.get("Lot"));
}

A brief analysis: the whole pipeline first groups the data with groupingBy, then reduces the data within each group with reducing, and finally uses collectingAndThen to transform the result of the collector composed from those two steps, changing the output type from Map<String, Optional<Map<String, Object>>> to List<Map<String, Object>>.

It is worth noting that, to make the processing easier, a map operation runs before the steps above. It converts the “Units” value of each element into a signed integer and removes the “Sign” field, which is no longer needed. I ran a test without this conversion, applying the sign inside the reduction function instead, and the results could have the wrong sign. The reason is that the single-argument version of reducing uses the first element as its initial value. If a bucket contains only one element and its “Sign” is “Negative”, the reduction function we pass in is never invoked, so “Units” is never negated. If we call the three-argument version of reducing instead, a suitable initial value is hard to provide. That leaves us in a dilemma, so I think handling the sign in a map step first is a good way around the problem. In addition, the filter(Optional::isPresent).map(Optional::get) steps unwrap the Optional values, and the results are then collected with Collectors.toList().
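
The single-element behavior described above can be observed directly. The following sketch is an illustration on plain integers, with a hypothetical class name SingleElementReducingSketch; it counts how often the reduction function passed to the single-argument Collectors.reducing is actually invoked.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class SingleElementReducingSketch {
    // Returns how many times the reduction function ran while reducing the given list.
    static int reducerCalls(List<Integer> values) {
        AtomicInteger calls = new AtomicInteger();
        // The single-argument reducing yields an Optional because the input may be empty.
        Optional<Integer> reduced = values.stream().collect(
                Collectors.reducing((Integer a, Integer b) -> {
                    calls.incrementAndGet();
                    return a + b;
                }));
        return calls.get();
    }

    public static void main(String[] args) {
        // One element: it becomes the result as-is and the function is never called.
        System.out.println(reducerCalls(Collections.singletonList(150)));  // 0
        // Three elements: the function combines them pairwise, so it runs twice.
        System.out.println(reducerCalls(Arrays.asList(100, 200, 100)));    // 2
    }
}
```

Any sign handling placed inside the reduction function would therefore be skipped for a bucket such as B2, which holds a single record.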

This way, about 40 lines of code (really just over 30, because the pipeline is split across several lines for readability) solve the same problem. The result is as follows, and it is as expected:

[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]
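
As a possible alternative that I did not use in the solution above, Collectors.toMap with a merge function performs the grouping and the reduction in one step and never produces an Optional. The sketch below assumes the records have already been through the sign-conversion step, so “Units” is signed and “Sign” is gone; the class name ToMapMergeSketch and the helpers rec and add are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeSketch {
    // Hypothetical helper: a record whose Units value is already signed.
    static Map<String, Object> rec(String product, int lot, int signedUnits) {
        Map<String, Object> m = new HashMap<>();
        m.put("Product", product);
        m.put("Lot", lot);
        m.put("Units", signedUnits);
        return m;
    }

    // Merge function: copies the first record and stores the sum of both Units values.
    static Map<String, Object> add(Map<String, Object> m1, Map<String, Object> m2) {
        int u1 = (Integer) m1.get("Units");
        int u2 = (Integer) m2.get("Units");
        Map<String, Object> sum = new HashMap<>(m1);
        sum.put("Units", u1 + u2);
        return sum;
    }

    static List<Map<String, Object>> aggregate(List<Map<String, Object>> signedRecords) {
        // LinkedHashMap keeps the composite keys in first-seen order.
        Map<String, Map<String, Object>> byKey = signedRecords.stream().collect(Collectors.toMap(
                (Map<String, Object> r) -> "" + r.get("Product") + r.get("Lot"),
                (Map<String, Object> r) -> r,
                ToMapMergeSketch::add,
                LinkedHashMap::new));
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> data = Arrays.asList(
                rec("A", 1, 100), rec("A", 1, 200), rec("A", 1, -100),
                rec("A", 2, -300), rec("A", 2, -400), rec("B", 1, 400), rec("B", 2, -150));
        List<Object> units = aggregate(data).stream()
                .map(r -> r.get("Units")).collect(Collectors.toList());
        System.out.println(units); // [200, -700, 400, -150]
    }
}
```

The trade-off is that the grouping key and the merge logic are tied together in one collector, which some readers may find less explicit than the groupingBy and reducing pair.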

Addressing sorting requirements

The sorting requirement is largely independent of the rest, so it is discussed separately here, now that the other requirements are implemented. It breaks down into two points:

  1. Sorting by a combination of fields
  2. Sorting in reverse order

Requirement 1 is simple. It mainly uses the comparing and thenComparing methods added to Comparator in Java 8, together with the new sort method on List. The code is as follows:

private Comparator<Map<String, Object>> getComparator() {
    return Comparator.comparing((Map<String, Object> map) -> (String)map.get("Product"))
            .thenComparing(map -> (Integer)map.get("Lot"));
}

@Test
public void functionalSolution() {
    // omit existing code

    System.out.println(processedData);
    processedData.sort(getComparator());
    System.out.println(processedData);
}

The output is as follows, and the second line is clearly sorted as expected:

[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]
[{Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}, {Lot=2, Product=B, Units=-150}]

Now add requirement 2: the list must be sorted in descending order of both “Product” and “Lot”. Using the reversed method of Comparator, the code changes as follows:

private Comparator<Map<String, Object>> getComparator() {
    return Comparator.comparing((Map<String, Object> map) -> (String)map.get("Product")).reversed()
            .thenComparing(map -> (Integer)map.get("Lot")).reversed();
}

In this case, the output is as follows

[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]
[{Lot=2, Product=A, Units=-700}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=B, Units=-150}, {Lot=1, Product=B, Units=400}]

Alas, the output order is not what we expected. In functional style, the intent of this declarative code seems clear: the comparator should sort both fields in reverse order. So why doesn’t it?

After some trial and error, I found overloads of comparing and thenComparing that take two parameters. Change the code as follows:

private Comparator<Map<String, Object>> getComparator() {
    return Comparator.comparing((Map<String, Object> map) -> (String)map.get("Product"), Comparator.reverseOrder())
            .thenComparing(map -> (Integer)map.get("Lot"), Comparator.reverseOrder());
}

This time the output order is as expected

[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=A, Units=200}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=B, Units=400}]
[{Lot=2, Product=B, Units=-150}, {Lot=1, Product=B, Units=400}, {Lot=2, Product=A, Units=-700}, {Lot=1, Product=A, Units=200}]

But I still had no idea why the first approach failed. An article I found on the subject says:

The following two sorts are completely different and must be distinguished.

  1. Comparator.comparing(ClassX::extractY).reversed();
  2. Comparator.comparing(ClassX::extractY, Comparator.reverseOrder());

Variant 1 sorts in the natural order first and then reverses the result, while variant 2 sorts directly in reverse order. Many people confuse the two, which leads to misunderstandings. Variant 2 is easier to understand, so using it is recommended.

That article also contains examples that can be used as a reference.

This explains why the output of the first reverse-ordering attempt did not match expectations. reversed() reverses the entire comparator it is called on, not just the most recent key. The first reversed() reversed the “Product” order (so the Product B records came first), and the second reversed() reversed the whole chain, that is, the “Lot” order and the “Product” order again (so the Product B records moved to the back). Following this line of thought, change the code of the first approach as follows, and the result is exactly as expected.

private Comparator<Map<String, Object>> getComparator() {
    return Comparator.comparing((Map<String, Object> map) -> (String)map.get("Product"))
            .thenComparing(map -> (Integer)map.get("Lot")).reversed();
}

But this approach is still not very intuitive. Not only does it take a while to get your head around, but once more sort fields are added, the relationship between ascending and descending order becomes even more confusing. The second approach is simpler and more intuitive.
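
The three comparator variants can be compared side by side on a reduced example. In this sketch the composite keys are plain strings such as "A1" (product letter followed by lot digit) instead of map records; the class name ReversedSketch and its method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ReversedSketch {
    static final Comparator<String> BY_PRODUCT = Comparator.comparing((String k) -> k.charAt(0));
    static final Comparator<String> BY_LOT = Comparator.comparing((String k) -> k.charAt(1));

    // reversed() flips the whole comparator built so far, so Product ends up flipped twice.
    static Comparator<String> doubleReversed() {
        return BY_PRODUCT.reversed().thenComparing(BY_LOT).reversed();
    }

    // A single reversed() at the end flips the complete two-field ordering once.
    static Comparator<String> reversedOnce() {
        return BY_PRODUCT.thenComparing(BY_LOT).reversed();
    }

    // A reverse-order key comparator states the direction of each field independently.
    static Comparator<String> perKey() {
        Comparator<Character> descending = Comparator.reverseOrder();
        return Comparator.comparing((String k) -> k.charAt(0), descending)
                .thenComparing((String k) -> k.charAt(1), descending);
    }

    static List<String> sorted(Comparator<String> order) {
        List<String> keys = new ArrayList<>(Arrays.asList("A1", "A2", "B1", "B2"));
        keys.sort(order);
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(sorted(doubleReversed())); // [A2, A1, B2, B1]
        System.out.println(sorted(reversedOnce()));   // [B2, B1, A2, A1]
        System.out.println(sorted(perKey()));         // [B2, B1, A2, A1]
    }
}
```

The per-key form also extends naturally to mixed directions, for example Product descending and Lot ascending, which is awkward to express with reversed() alone.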

There is also a null pitfall: a NullPointerException is thrown if one of the sort fields in a record is null. The solution is nullsFirst or nullsLast. The Javadoc describes nullsFirst as follows:

Returns a null-friendly comparator that considers {@code null} to be less than non-null. When both are {@code null}, they are considered equal. If both are non-null, the specified {@code Comparator} is used to determine the order. If the specified comparator is {@code null}, then the returned comparator considers all non-null values to be equal.

For example, if we know that the “Lot” value might be missing, changing the comparator as follows lets the sorting code run successfully again:

private Comparator<Map<String, Object>> getComparator() {
    return Comparator.comparing((Map<String, Object> map) -> (String)map.get("Product"), Comparator.reverseOrder())
            .thenComparing(map -> (Integer)map.get("Lot"), Comparator.nullsFirst(Comparator.reverseOrder()));
}
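
To see where the null records end up, here is a small runnable sketch of the same comparator applied to data with one missing “Lot”. The class name NullsFirstSketch, the helper rec, and the sample records are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullsFirstSketch {
    // Hypothetical helper; lot may be null to simulate a missing value.
    static Map<String, Object> rec(String product, Integer lot) {
        Map<String, Object> m = new HashMap<>();
        m.put("Product", product);
        m.put("Lot", lot);
        return m;
    }

    // Product descending, then Lot descending with null lots placed first.
    static Comparator<Map<String, Object>> nullSafeDescending() {
        Comparator<String> productDesc = Comparator.reverseOrder();
        Comparator<Integer> lotDescNullsFirst = Comparator.nullsFirst(Comparator.<Integer>reverseOrder());
        return Comparator
                .comparing((Map<String, Object> m) -> (String) m.get("Product"), productDesc)
                .thenComparing((Map<String, Object> m) -> (Integer) m.get("Lot"), lotDescNullsFirst);
    }

    // Sorts a copy of the records and renders each one as Product + Lot for easy reading.
    static List<String> sortedKeys(List<Map<String, Object>> records) {
        List<Map<String, Object>> copy = new ArrayList<>(records);
        copy.sort(nullSafeDescending());
        return copy.stream()
                .map(m -> "" + m.get("Product") + m.get("Lot"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> data =
                Arrays.asList(rec("A", 1), rec("A", null), rec("A", 2), rec("B", 1));
        System.out.println(sortedKeys(data)); // [B1, Anull, A2, A1]
    }
}
```

The null handling works here because the two-argument overloads hand the extracted key to the supplied key comparator; the single-argument comparing calls compareTo on the key itself, which is where the NullPointerException comes from.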