Eclipse Collections: Take Java Streams to the next level

Key points

Eclipse Collections is a high-performance Java Collections framework that adds rich functionality to the native JDK Collections.

Streams is a very popular feature of the JDK, but it lacks some features and relies heavily on older collection implementations and verbose APIs.

Eclipse Collections provides an alternative to traditional JDK data structures and supports data structures such as Bag and Multimap.

Refactoring Streams into Eclipse Collections helps improve code readability and reduce memory footprint.

Best of all, refactoring Streams using Eclipse Collections is easy!

Java Streams, introduced in Java 8, is great: it lets us make full use of lambda expressions to replace iterative loops, bringing Java closer to a functional programming style.

However, despite the improvements Streams has made, it is ultimately just an extension of the existing collections framework, and still carries a lot of baggage.

Can we improve it further? Can we have richer interfaces and cleaner, more readable code? Can we save more memory than traditional collections? Can we support functional programming better and more seamlessly?

The answer is yes! Eclipse Collections (formerly known as GS Collections) is an alternative to the Java Collections framework that we can use for our purposes.

In this article, we’ll walk through several examples of refactoring standard Java code to use Eclipse Collections data structures and APIs, and show how this saves memory.

There will be many code examples that show how to change code that uses standard Java collections and Streams into code that uses the Eclipse Collections framework.

Before diving into the code, we’ll spend some time looking at what Eclipse Collections is, why we need it, and why it is worth refactoring conventional Java code to use it.

The history of Eclipse Collections

Eclipse Collections was originally created by Goldman Sachs, whose application platform had a large distributed cache component. The system stores hundreds of gigabytes of data in memory (and is still running in production).

In essence, a cache is a Map in which we store and read objects. These objects can contain other maps and collections. Initially, the cache was built on the standard data structures in the java.util.* package. It soon became clear, however, that these collections had two significant drawbacks: inefficient memory usage and very limited interfaces (resulting in repetitive and hard-to-read code). Because the problems stemmed from the implementation of the collections themselves, they could not be solved with additional libraries. To solve both problems, Goldman Sachs decided to create a new collections framework from scratch.

It seemed like a radical solution at the time, but it worked. The framework is now hosted by the Eclipse Foundation.

At the end of the article, we share links that will help you learn more about the project itself, learn how to use Eclipse Collections, and become a code contributor to the project.

Why refactor to Eclipse Collections?

What are the benefits of Eclipse Collections? It offers richer APIs, more efficient memory usage, and better performance. Eclipse Collections is, in our opinion, the richest collections library in the Java ecosystem. And it is fully compatible with the collections in the JDK.

Easy migration

Before delving into these benefits, it’s important to note that migrating to Eclipse Collections is very easy and you don’t have to do it all at once. Eclipse Collections is fully compatible with the JDK’s java.util.* List, Set, and Map interfaces. It is also compatible with other parts of the JDK, such as Collectors. Our data structures extend these JDK interfaces, so they can be used in place of their JDK equivalents (the exceptions are the Stack interface, which is incompatible, and the new immutable collections, which have no equivalent in the JDK).

Richer APIs

Eclipse Collections types implement the java.util List, Set, and Map interfaces and add much richer APIs, which we’ll explore in the code examples that follow. There are also types that are missing from the JDK, such as Bag, Multimap, and BiMap. A Bag is a multiset, that is, a set that can contain repeated elements; logically, we can think of it as a mapping from elements to the number of times they occur. A BiMap is an “invertible” Map in which we can look up not only values by keys but also keys by values. A Multimap is a Map whose values are themselves collections (Key->List, Key->Set, and so on).
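To make these three shapes concrete, here is a minimal JDK-only sketch of how each one is commonly emulated with plain maps. It does not use Eclipse Collections at all; the class and method names (JdkEmulations, bagOf, putMulti, putBoth) are hypothetical and exist only for illustration. The point is to show the bookkeeping that dedicated Bag, Multimap, and BiMap types take off the developer’s hands.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: plain-JDK stand-ins for Bag, Multimap and BiMap.
final class JdkEmulations {

    /** Bag emulation: maps each element to the number of times it occurs. */
    static Map<String, Integer> bagOf(List<String> items) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    /** Multimap emulation: key -> list of values; the list is created on first use. */
    static <K, V> void putMulti(Map<K, List<V>> multimap, K key, V value) {
        multimap.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    /**
     * BiMap emulation: two maps kept in sync so lookups work in both directions.
     * This sketch does not enforce that values are unique, which a real
     * bidirectional map has to guarantee.
     */
    static <K, V> void putBoth(Map<K, V> forward, Map<V, K> inverse, K key, V value) {
        forward.put(key, value);
        inverse.put(value, key);
    }

    public static void main(String[] args) {
        System.out.println(bagOf(List.of("Bah", "Bah", "sheep")));

        Map<String, List<String>> foodsByAnimal = new HashMap<>();
        putMulti(foodsByAnimal, "ZigZag", "Banana");
        putMulti(foodsByAnimal, "ZigZag", "Apple");
        System.out.println(foodsByAnimal);

        Map<String, Integer> idByName = new HashMap<>();
        Map<Integer, String> nameById = new HashMap<>();
        putBoth(idByName, nameById, "Alice", 1);
        System.out.println(idByName.get("Alice") + " <-> " + nameById.get(1));
    }
}
```

Each emulation needs its own helper code and its own invariants to maintain, which is the repetition the dedicated types are meant to remove.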

Eager or lazy?

With Eclipse Collections, we can easily switch between lazy and eager evaluation, which helps when writing, understanding, and debugging functional code. Unlike the Streams API, eager is the default mode. If you want lazy evaluation, just call .asLazy() on your data structure before the rest of your logic.
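For readers who have not observed laziness directly, the following JDK-only sketch shows the Streams side of this comparison: an intermediate operation such as filter() does no work until a terminal operation runs. The class name LazyVersusEager and the counter are illustrative assumptions; the Eclipse Collections side is not shown here.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: counts how often the filter predicate is evaluated.
final class LazyVersusEager {

    /** Returns {predicate calls before the terminal operation, calls after it}. */
    static int[] predicateCalls() {
        AtomicInteger calls = new AtomicInteger();
        List<Integer> ages = List.of(19, 52, 35);

        // Building the pipeline only describes the work; nothing is filtered yet.
        Stream<Integer> pipeline = ages.stream()
                .filter(age -> {
                    calls.incrementAndGet();
                    return age > 21;
                });
        int before = calls.get();

        // The terminal operation pulls the elements through the filter.
        List<Integer> over21 = pipeline.collect(Collectors.toList());
        int after = calls.get();

        if (over21.size() != 2) {
            throw new IllegalStateException("unexpected filter result: " + over21);
        }
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] calls = predicateCalls();
        System.out.println("before terminal op: " + calls[0] + ", after: " + calls[1]);
    }
}
```

With an eager API, the predicate would already have been applied to every element by the time the filtering call returns, which tends to make stepping through the code in a debugger more predictable.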

Immutable collection interfaces

With immutable collections, you can write more correct code by expressing immutability at the API level. The correctness of the program is then guaranteed by the compiler, which avoids surprises at run time. With immutable collections and richer interfaces, you can write purely functional code in Java.

Primitive collections

Eclipse Collections also provides containers for primitive types, all of which have immutable equivalents. It’s worth noting that JDK Streams support only int, long, and double, while Eclipse Collections supports all eight primitive types and lets you define collections that hold primitive values directly rather than their boxed objects. For example, an Eclipse Collections IntList is a list of ints, while a List<Integer> in the JDK is a list of boxed values.

No “bun” methods

What is a “bun” method? It is a metaphor coined by Oracle’s Java language architect Brian Goetz. A hamburger (meat between two halves of a bun) represents the typical structure of stream code. With Java Streams, whatever you want to do, you must put your methods between two “buns”: the stream() (or parallelStream()) call in front and the collect() call at the end. The buns are not really nutritious, but you can’t eat the meat without them. In Eclipse Collections, these methods are not required. The following example shows the bun methods in the JDK. Suppose we have a list of people with their names and ages, and we want to extract the names of the people older than 21:

var people = List.of(new Person("Alice", 19),
        new Person("Bob", 52), new Person("Carol", 35));

var namesOver21 = people.stream()               // Bun
        .filter(person -> person.getAge() > 21) // Meat
        .map(Person::getName)                   // Meat
        .collect(Collectors.toList());          // Bun

namesOver21.forEach(System.out::println);

Here is the same code with Eclipse Collections, with no bun methods required!

var people = Lists.immutable.of(new Person("Alice", 19),
        new Person("Bob", 52), new Person("Carol", 35));

var namesOver21 = people
        .select(person -> person.getAge() > 21) // Meat, no buns
        .collect(Person::getName);              // Meat

namesOver21.forEach(System.out::println);

Any type you need

In Eclipse Collections, there are types and methods for every use case, and you can find them based on your requirements. There’s no need to memorize their names; just think about what kind of data structure you need. Do you need a mutable or an immutable collection? Sorted or unsorted? What do you want to store in it, primitive values or objects? What kind of iteration do you need: lazy, eager, or parallel? Following the approach outlined in the diagram below, we can easily build the data structure we need.

Instantiation through factory methods

This is similar to the factory methods of the List, Set, and Map interfaces in Java 9, but with more options!

A selection of methods, grouped by category

The collection types themselves provide a rich API that can be used directly. These collection types extend the RichIterable interface (or PrimitiveIterable). We’ll see some of these APIs in the following examples.

More methods

A word cloud may look a little dated, but it is not entirely gratuitous here, because it makes some important points. First, there are a great many methods, covering every iteration pattern imaginable, and they are available directly on the collection types. Second, the size of each word in the cloud is proportional to the number of implementations of that method: there are multiple implementations across the different collection types, each optimized for its specific type.

Example: word count

Let’s start with something simple.

Given a text (in this case, a nursery rhyme), count the number of occurrences of each word in the text, and the output is the set of words and the corresponding number of occurrences for each word.

@BeforeClass
static public void loadData()
{
    words = Lists.mutable.of((
            "Bah, Bah, black sheep,\n" +
            "Have you any wool?\n").split("[ ,\n?]+"));
}

Note that we use an Eclipse Collections factory method to build the list of words. It is equivalent to the JDK’s Arrays.asList(…) method, but it returns an instance of MutableList. Since the MutableList interface is fully compatible with the JDK’s List, we can use the same list in both the JDK and the Eclipse Collections examples below.

First, let’s look at an implementation that doesn’t use Streams:

@Test
public void countJdkNaive()
{
    Map<String, Integer> wordCount = new HashMap<>();

    words.forEach(w -> {
        int count = wordCount.getOrDefault(w, 0);
        count++;
        wordCount.put(w, count);
    });

    System.out.println(wordCount);

    Assert.assertEquals(2, wordCount.get("Bah").intValue());
    Assert.assertEquals(1, wordCount.get("sheep").intValue());
}

As you can see, we create a String-to-Integer HashMap (mapping each word to its number of occurrences), iterate over the words, and look up each word’s count in the Map, defaulting to zero if the word is not there yet. We then increment the value and store it back into the Map. This is not a good implementation, because it focuses on the “how” rather than the “what”, and its performance is not very good either. Let’s try rewriting it with Streams:

@Test
public void countJdkStream()
{
    Map<String, Long> wordCounts = words.stream()
            .collect(Collectors.groupingBy(w -> w, Collectors.counting()));

    Assert.assertEquals(2, wordCounts.get("Bah").intValue());
    Assert.assertEquals(1, wordCounts.get("sheep").intValue());
}

In this case, the code is more readable, but still not very efficient. You also need to know how to use methods of the Collectors class — methods that are not easy to find because they are not part of the Streams API.

An efficient way to do this is to introduce a separate counter class and store it as a value in the Map. For example, we have a class called Counter that holds an integer value and provides a method increment() that increments the value by 1. We can then rewrite the above code as:

@Test
public void countJdkEfficient()
{
    Map<String, Counter> wordCounts = new HashMap<>();

    words.forEach(w -> {
        Counter counter = wordCounts.computeIfAbsent(w, x -> new Counter());
        counter.increment();
    });

    Assert.assertEquals(2, wordCounts.get("Bah").intValue());
    Assert.assertEquals(1, wordCounts.get("sheep").intValue());
}

This is actually a very efficient solution, but we have to write a whole new class (Counter).
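The article names the Counter class but does not show it. A minimal sketch of what such a class might look like follows; the field name, the intValue() accessor (inferred from the assertions in the test above), and the CounterDemo wrapper are assumptions made for illustration, not the authors’ actual code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mutable counter: holds an int and can be incremented in place,
// so the map entry never has to be replaced with a new boxed Integer.
final class Counter {
    private int value;

    void increment() {
        value++;
    }

    int intValue() {
        return value;
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}

// Hypothetical demo of the counting loop described in the text.
final class CounterDemo {

    static Map<String, Counter> countWords(List<String> words) {
        Map<String, Counter> wordCounts = new HashMap<>();
        for (String word : words) {
            // One lookup per word: create the counter on first sight, then bump it.
            wordCounts.computeIfAbsent(word, w -> new Counter()).increment();
        }
        return wordCounts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(List.of("Bah", "Bah", "black", "sheep")));
    }
}
```

The efficiency gain comes from mutating the counter in place: each word costs a single map lookup and no new Integer objects are created on increment.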

The Eclipse Collections Bag provides a solution tailored and optimized for this problem.

@Test
public void countEc()
{
    Bag<String> bagOfWords = words.toBag(); // toBag() is a method on MutableList

    Assert.assertEquals(2, bagOfWords.occurrencesOf("Bah"));
    Assert.assertEquals(1, bagOfWords.occurrencesOf("sheep"));
    // null safe - returns a zero instead of throwing an NPE
    Assert.assertEquals(0, bagOfWords.occurrencesOf("Cheburashka"));
}

All we need to do is call the collection’s toBag() method. In addition, because occurrencesOf() returns a count directly, we never call intValue() on an object that might be null, so a word that is absent cannot cause a NullPointerException.
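The hazard being avoided is easy to reproduce with a plain JDK Map. The sketch below is JDK-only and uses hypothetical names (MissingKeyDemo, unboxingMissingKeyThrows, safeCount); it shows that get() returns null for a missing word, so calling intValue() on the result throws, and that getOrDefault() is the usual JDK workaround.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: what happens when a word is missing from a Map of counts.
final class MissingKeyDemo {

    /** Returns true if reading the count of an absent word throws an NPE. */
    static boolean unboxingMissingKeyThrows() {
        Map<String, Integer> wordCount = new HashMap<>();
        wordCount.put("Bah", 2);
        try {
            // get() returns null for an absent key; intValue() on null throws.
            wordCount.get("Cheburashka").intValue();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** The JDK workaround: supply a default for absent keys. */
    static int safeCount(Map<String, Integer> counts, String word) {
        return counts.getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        System.out.println("NPE on missing key: " + unboxingMissingKeyThrows());
        System.out.println("safe count: " + safeCount(Map.of("Bah", 2), "Cheburashka"));
    }
}
```

With a Map, every caller has to remember the default; with a Bag, the zero for absent elements is part of the type’s contract.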

Example: zoo

Suppose we have a zoo. In the zoo we have animals that live on different kinds of food.

We would like to query some information about the animals and the food they eat:

The most popular food item
The number of favorite food items per animal
Unique food items
Types of food
Meat eaters and non-meat eaters

These code snippets have been tested using the Java Microbenchmark Harness (JMH) framework. We’ll go through the code and compare them. For specific performance comparison results, see the “JMH Benchmark Results” section below.

These are animals and the foods they like to eat (each food has a name, type and quantity).

private static final Food BANANA = new Food("Banana", FoodType.FRUIT, 50);
private static final Food APPLE = new Food("Apple", FoodType.FRUIT, 30);
private static final Food CAKE = new Food("Cake", FoodType.DESSERT, 22);
private static final Food CEREAL = new Food("Cereal", FoodType.DESSERT, 80);
private static final Food SPINACH = new Food(…, 97);
private static final Food CARROT = new Food("Carrot", FoodType.VEGETABLE, 27);
private static final Food HAMBURGER = new Food("Hamburger", FoodType.MEAT, 3);

private static MutableList<Animal> zooAnimals = Lists.mutable.with(
        new Animal("ZigZag", AnimalType.ZEBRA, Lists.mutable.with(BANANA, APPLE)),
        new Animal("Tony", AnimalType.TIGER, Lists.mutable.with(…)),
        new Animal("Phil", AnimalType.GIRAFFE, Lists.mutable.with(CAKE, CARROT)),
        new Animal("Lil", AnimalType.GIRAFFE, Lists.mutable.with(SPINACH)),
        …

Example 1 – Most popular food

@Benchmark
public List<Map.Entry<Food, Long>> mostPopularFoodItemJdk()
{
    //output: [Hamburger=2]
    return zooAnimals.stream()
     .flatMap(animals -> animals.getFavoriteFoods().stream())
     .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
     .entrySet()
     .stream()
     .sorted(Map.Entry.<Food, Long>comparingByValue().reversed())
     .limit(1)
     .collect(Collectors.toList());
}

We first stream zooAnimals and flatMap() each animal to its favorite foods, which gives us a stream of foods. Next, we group the foods, using the food itself as the key and the count as the value, so that we know how many animals like each food; this is the job of Collectors.counting(). To sort the result, we call the Map’s entrySet() method, stream it, and sort it by value in reverse order (the value is the count for each food, and the most popular food is the one with the highest count). We then call limit(1) to keep only the top entry and, finally, collect it into a List.

The most popular food was [Hamburger = 2].

Next, let’s look at how Eclipse Collections can be used to do the same thing.

@Benchmark
public MutableList<ObjectIntPair<Food>> mostPopularFoodItemEc()
{
    //output: [Hamburger:2]
    MutableList<ObjectIntPair<Food>> intIntPairs = zooAnimals.asLazy()
            .flatCollect(Animal::getFavoriteFoods)
            .toBag()
            .topOccurrences(1);
    return intIntPairs;
}

We also start by flattening each animal to its favorite foods. Since what we really want is a mapping from food to count, a Bag is a perfect fit for the problem. We call toBag() and then topOccurrences(), which returns the most frequent items. topOccurrences(1) returns the most popular food as a list of ObjectIntPairs (note that the int is a primitive), resulting in [Hamburger:2].

Example 2 – Number of favorite foods: How many animals eat only one food? How many animals eat two kinds of food?

First, the JDK implementation:

@Benchmark
public Map<Integer, String> printNumberOfFavoriteFoodItemsToAnimalsJdk()
{
    //output: {1=[Lil, GIRAFFE],[Simba, LION], 2=[ZigZag, ZEBRA],
    //         [Tony, TIGER],[Phil, GIRAFFE]}
    return zooAnimals.stream()
            .collect(Collectors.groupingBy(
                    Animal::getNumberOfFavoriteFoods,
                    Collectors.mapping(
                            Object::toString, 
                              // Animal.toString() returns [name,  type]
                            Collectors.joining(","))));
                              // Concatenate the list of animals for 
                              // each count into a string
}

Then we use Eclipse Collections:

@Benchmark
public Map<Integer, String> printNumberOfFavoriteFoodItemsToAnimalsEc()
{
    //output: {1=[Lil, GIRAFFE], [Simba, LION], 2=[ZigZag, ZEBRA],
    // [Tony, TIGER], [Phil, GIRAFFE]}
    return zooAnimals
            .stream()
            .collect(Collectors.groupingBy(
                    Animal::getNumberOfFavoriteFoods,
                    Collectors2.makeString()));
}

This example shows native Java Collectors and Eclipse Collections Collectors2 being used together; the two are not mutually exclusive. Here we want to group the animals by the number of their favorite foods. How do we do that? In native Java, we first use Collectors.groupingBy to group the animals by the number of their favorite foods. We then use Collectors.mapping to map each animal to its toString() result, and finally Collectors.joining concatenates the strings, separated by commas.

With Eclipse Collections, we can still use Collectors.groupingBy, but we pass it the more succinct Collectors2.makeString to get the same result (makeString collects the stream’s elements into a comma-separated string).
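Because the Animal class is not shown in the article, the groupingBy / mapping / joining combination can be hard to try out. The following JDK-only sketch applies the same three collectors to plain strings, grouping words by length instead of animals by number of foods. The class name GroupAndJoin and the bracket formatting (standing in for Animal.toString()) are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the zoo example: groups words by length and joins
// each group into one comma-separated string.
final class GroupAndJoin {

    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        String::length,                      // classifier: the grouping key
                        Collectors.mapping(
                                word -> "[" + word + "]",    // plays the role of toString()
                                Collectors.joining(","))));  // concatenates each group
    }

    public static void main(String[] args) {
        System.out.println(wordsByLength(List.of("Bah", "black", "sheep", "you", "any")));
    }
}
```

The downstream collector passed to groupingBy decides what each group becomes; swapping in a different downstream collector is the only change needed to produce a different per-group result.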

Example 3 – Unique food items: How many different foods are there, and what are they?

@Benchmark
public Set<Food> uniqueFoodsJdk()
{
    return zooAnimals.stream()
            .flatMap(each -> each.getFavoriteFoods().stream())
            .collect(Collectors.toSet());
}

@Benchmark
public Set<Food> uniqueFoodsEcWithoutTargetCollection()
{
    return zooAnimals.flatCollect(Animal::getFavoriteFoods).toSet();
}

@Benchmark
public Set<Food> uniqueFoodsEcWithTargetCollection()
{
    return zooAnimals.flatCollect(Animal::getFavoriteFoods, 
                                   Sets.mutable.empty());
}

There are several ways to solve this problem! With the JDK, we stream zooAnimals, flatMap their favorite foods, and finally collect them into a Set. With Eclipse Collections, we have two options. The first is roughly the same as the JDK version: it flattens the foods and then calls toSet() to put them into a set. The second is more interesting because it uses the concept of a target collection. flatCollect is an overloaded method, so there are several ways to call it. Passing a set as the second argument means the foods are flattened directly into that set, skipping the intermediate list that the first option creates. The intermediate list could also be avoided by calling asLazy(), which defers evaluation until the terminal operation and so never materializes the intermediate state. However, if you prefer fewer API calls, or need to add the results to an existing collection, consider using a target collection when converting from one type to another.

Example 4 – Meat and Non-meat: How many meat eaters are there? How many non-carnivores?

Note that in the following two examples, we chose to declare the Predicate lambda explicitly at the top (rather than inline) to emphasize the difference between the JDK Predicate and the Eclipse Collections Predicate. Eclipse Collections had its own definitions of Function, Predicate, and other functional types long before Java 8 introduced the java.util.function package. The functional types in Eclipse Collections now extend the equivalent JDK types, so they interoperate with libraries that rely on the JDK types.

@Benchmark
public Map<Boolean, List<Animal>> getMeatAndNonMeatEatersJdk()
{
    java.util.function.Predicate<Animal> eatsMeat = animal ->
            animal.getFavoriteFoods().stream().anyMatch(
                            food -> food.getFoodType()== FoodType.MEAT);

    Map<Boolean, List<Animal>> meatAndNonMeatEaters = zooAnimals
            .stream()
            .collect(Collectors.partitioningBy(eatsMeat));
    // returns {false=[[ZigZag, ZEBRA], [Phil, GIRAFFE], [Lil, GIRAFFE]],
    //          true=[[Tony, TIGER], [Simba, LION]]}
    return meatAndNonMeatEaters;
}

@Benchmark
public PartitionMutableList<Animal> getMeatAndNonMeatEatersEc()
{
    org.eclipse.collections.api.block.predicate.Predicate<Animal> eatsMeat = 
           animal ->animal.getFavoriteFoods()
                   .anySatisfy(food -> food.getFoodType() == FoodType.MEAT);

    PartitionMutableList<Animal> meatAndNonMeatEaters = 
                                           zooAnimals.partition(eatsMeat);
    // meatAndNonMeatEaters.getSelected() = [[Tony, TIGER], [Simba, LION]]
    // meatAndNonMeatEaters.getRejected() = [[ZigZag, ZEBRA], [Phil, GIRAFFE], 
    //                                        [Lil, GIRAFFE]]
    return meatAndNonMeatEaters;
}

We want to partition the animals into meat eaters and non-meat eaters. We build a Predicate, eatsMeat, that looks at each animal’s favorite foods and checks, with anyMatch (JDK) or anySatisfy (Eclipse Collections), whether any of them has the food type FoodType.MEAT.

In the JDK example, we stream() the animals and collect them with partitioningBy(), passing in the eatsMeat Predicate. The result is a Map with true and false as keys: the “true” entry holds the meat eaters and the “false” entry holds the non-meat eaters.

In Eclipse Collections, we call partition() on zooAnimals and pass in the Predicate. We get back a PartitionMutableList, which provides two methods, getSelected() and getRejected(), both of which return MutableLists. The selected elements are the meat eaters, and the rejected elements are the non-meat eaters.

Comparison of memory usage

In the example above, the focus is on the type and interface of the collection. We mentioned at the beginning that using Eclipse Collections would lead to memory optimizations. Depending on how large and what type of collection is used in a particular application, the effect can be quite dramatic.

You can see the memory usage comparison between Eclipse Collections and the java.util.* collections in the figures.

The horizontal axis represents the number of elements stored in the collection, and the vertical axis represents the storage overhead in kilobytes. Overhead here means the memory used after subtracting the collection’s payload, so we show only the memory consumed by the data structure itself. To measure it, we call System.gc() and then compute totalMemory() - freeMemory(). The results we observed are stable and agree with those obtained on Java 8 using jdk.nashorn.internal.ir.debug.ObjectSizeCalculator (this utility calculates object sizes exactly, but unfortunately it is not compatible with Java 9 and later).
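The measurement approach described above can be sketched in a few lines of plain Java. This is not the authors’ benchmark code: the class and method names are hypothetical, the sketch does not subtract the payload as the article’s figures do, and System.gc() is only a hint to the JVM, so individual readings are approximate and can even come out negative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of heap measurement via Runtime; results are approximate.
final class HeapOverheadSketch {

    /** Used heap in bytes after requesting a garbage collection. */
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        System.gc(); // a request, not a guarantee
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Approximate heap growth caused by a List<Integer> of the given size. */
    static long measureIntegerList(int size) {
        long before = usedHeapBytes();

        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            // Values from 1000 upward lie outside the default Integer cache
            // (-128..127), so each element is a separately allocated object.
            list.add(i + 1_000);
        }

        long after = usedHeapBytes();

        // Use the list after measuring so it stays reachable until this point.
        if (list.size() != size) {
            throw new IllegalStateException("unexpected list size");
        }
        return after - before;
    }

    public static void main(String[] args) {
        System.out.println("approx. bytes for 1,000,000 Integers: "
                + measureIntegerList(1_000_000));
    }
}
```

Running the measurement several times and comparing the readings is a reasonable way to check that they are stable before drawing conclusions from them.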

The first figure shows the advantage of Eclipse Collections int lists over JDK Integer lists. For a million values, the java.util.* list uses about 15 MB more memory (about 20 MB for the JDK versus about 5 MB for Eclipse Collections).

Map in Java is very memory-inefficient because it requires Map.Entry objects, which increase memory usage.

If Map is memory-inefficient, Set is abysmally so, because the JDK’s Set is implemented on top of Map. The Map.Entry objects are wasteful there, because only one of their fields is useful: the key, which is the element of the set. As a result, a Set and a Map in Java use the same amount of memory, even though a Set can be made far more compact, as Eclipse Collections does. It ends up using much less memory than the JDK collection, as shown in the figure above.

Finally, the fourth figure shows the advantage of a specialized collection type. As mentioned earlier, a Bag is a collection that allows multiple instances of each element and maps each element to its number of occurrences. You can use a Bag to count occurrences of elements. The equivalent data structure in java.util.* is a Map from elements to their number of occurrences, where the developer is responsible for keeping the counts up to date. As you can see, the specialized data structure (Bag) has been optimized to minimize memory usage and garbage collection.

Of course, we recommend testing every case. If you replace standard Java collections with Eclipse Collections, the results will certainly improve, but how much that affects overall memory usage depends on the situation.

JMH benchmark test results

In this section, we’ll look at the speed of the previous examples and compare the performance of the code before and after the Eclipse Collections rewrite. The figure shows the number of Eclipse Collections and JDK operations per second in each example. Longer bars mean better results. As you can see, the speed increase is very noticeable:

It is important to emphasize that the results we present apply only to the specific example above. The results will depend on your particular situation, so be sure to test against your real-world scenario.

Conclusion

Eclipse Collections has evolved over the past decade to optimize Java code and applications. It is easy to use: it offers ready-made data structures and a more fluent API than traditional stream code. Are there use cases we haven’t solved? We hope you will join us as a contributor! Feel free to pull our code from GitHub and share your results! We would love to hear about your experiences with Eclipse Collections and how it affects your applications. Happy coding!

Useful links

Your one stop shop for getting started with Eclipse Collections
The Eclipse Collections GitHub project
Source code for this article
Good introductory literature to Eclipse Collections (formerly GS Collections)
Optimization Strategies with Eclipse Collections
UnifiedSet — The Memory Saver
UnifiedMap: How it works?
Bag — The Counter

About the author

Kristen O’Leary is senior vice president of the Services engineering group at Goldman Sachs. She has contributed multiple containers, APIs, and performance enhancements to Eclipse Collections, and also teaches courses on the framework both internally and externally.

Vladimir Zakharov has over twenty years of experience in software development. He is currently managing director of Goldman Sachs’ platform business group. He has been developing in Java for the past 18 years, and before that he used Smalltalk and other obscure programming languages.

Refactoring to Eclipse Collections: Making Your Java Streams Leaner, Meaner, and Cleaner