10 Simple Performance Optimizations in Java

Java 7's ForkJoinPool and Java 8's parallel Stream help parallelize work, which is great when you deploy your Java program onto a multi-core machine. The advantage of such parallelism over scaling across different machines on a network is that you can almost completely eliminate latency effects, because all cores have access to the same memory. But don't be fooled by the effect that parallelism has! Remember the following two points:

Parallelism eats up your cores. This is great for batch processing, but a nightmare for asynchronous servers such as HTTP servers. There are good reasons why we have used the single-threaded servlet model for the past decades. Parallelism therefore only helps when scaling up.
Parallelism has no effect on your algorithm's Big O notation. If your algorithm is O(n log n) and you let it run on c cores, you still have an O(n log n / c) algorithm, because c is an insignificant constant in your algorithm's complexity. You will save wall-clock time, but not reduce complexity!

Of course, the best way to improve performance is to reduce algorithmic complexity. The killer is achieving O(1) or quasi-O(1), of course, for instance with a HashMap lookup. But that is not always possible, let alone easy. If you cannot reduce the complexity, you can still gain a lot of performance by tweaking your algorithm where it really matters, provided you find the right spots. Assume the following algorithm, visualized as a tree: an outer loop over N branches left into a loop over M that runs a heavy operation, and right into nested loops over O and P that run an easy operation.

The overall complexity of the algorithm is O(N^3), or O(N x O x P) if we want to deal with the individual orders of magnitude. However, when you profile this code, you might notice an interesting scenario:

On your development box, the left branch (N -> M -> Heavy operation) is the only branch you can see in the profiler, because the values of O and P are small in your development sample data.
In production, however, the right branch (N -> O -> P -> Easy operation, or N.O.P.E. for short) is what really causes trouble. Your operations team may have figured this out using AppDynamics, DynaTrace, or similar software.

Without production data, you might jump to conclusions and optimize the "heavy operation". You ship to production, and your fix has no effect. There are no golden rules of optimization other than these facts:

Well-designed applications are easier to optimize
Premature optimization will not solve any performance problems, but it will make your application less well designed, which in turn makes it harder to optimize

Enough theory. Let's assume that you have found the right branch to be the problem. It may well be that a very simple operation is blowing up in production because it is called a great many times (if N, O, and P are large). Please read this article in the context of a problem at the leaf node of an inevitable O(N^3) algorithm. These optimizations will not help you scale. They will help you save your customer's day for now, deferring the difficult improvement of the overall algorithm until later! Here are 10 of the simplest performance optimizations in Java:

1. Use StringBuilder

This should be your default in almost all Java code. Try to avoid the + operator. Of course, you could argue that it is just syntactic sugar for a StringBuilder anyway, for example:

String x = "a" + args.length + "b";

This compiles to:

new java.lang.StringBuilder [16]
dup
ldc <String "a"> [18]
invokespecial java.lang.StringBuilder(java.lang.String) [20]
aload_0 [args]
arraylength
invokevirtual java.lang.StringBuilder.append(int) : java.lang.StringBuilder [23]
ldc <String "b"> [27]
invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [29]
invokevirtual java.lang.StringBuilder.toString() : java.lang.String [32]
astore_1 [x]

But what happens later if you need to modify your string using optional parts?

String x = "a" + args.length + "b";

if (args.length == 1)
    x = x + args[0];

You will now have a second StringBuilder, which just needlessly consumes your heap memory and puts pressure on your GC. Instead write:

StringBuilder x = new StringBuilder("a");
x.append(args.length);
x.append("b");

// Append the optional part to the same builder
if (args.length == 1)
    x.append(args[0]);

In the example above, it is probably completely irrelevant whether you use an explicit StringBuilder instance or rely on the Java compiler to create implicit instances for you. But remember, we are in the N.O.P.E. branch. Every CPU cycle that we waste on something as stupid as GC or allocating a StringBuilder's default capacity, we waste N x O x P times.

As a rule of thumb, always use a StringBuilder rather than the + operator. If you can, keep the StringBuilder reference across several methods when your String is more complex to build. A SQL generator, for example, can have a single StringBuilder "walk" through the entire SQL AST (Abstract Syntax Tree). And for crying out loud, if you still have StringBuffer references, replace them with StringBuilder. You hardly ever need to synchronize on a string that is being created.
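As a minimal sketch of keeping one StringBuilder across several methods, the following renders a toy SELECT statement. The class SqlRenderSketch, its methods, and the initial capacity of 64 are hypothetical and chosen only for illustration; they are not taken from any real SQL library.

```java
import java.util.List;

final class SqlRenderSketch {

    // Renders a toy statement; every helper appends to the same builder.
    static String render(List<String> columns, String table, List<String> conditions) {
        StringBuilder sb = new StringBuilder(64); // pre-sized; 64 is an arbitrary guess
        appendSelect(sb, columns);
        appendFrom(sb, table);
        appendWhere(sb, conditions);
        return sb.toString(); // the only String allocated for the whole statement
    }

    private static void appendSelect(StringBuilder sb, List<String> columns) {
        sb.append("SELECT ");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(columns.get(i));
        }
    }

    private static void appendFrom(StringBuilder sb, String table) {
        sb.append(" FROM ").append(table);
    }

    private static void appendWhere(StringBuilder sb, List<String> conditions) {
        for (int i = 0; i < conditions.size(); i++) {
            sb.append(i == 0 ? " WHERE " : " AND ").append(conditions.get(i));
        }
    }
}
```

The helpers return nothing and never build intermediate strings, so optional parts (here, the WHERE clause) cost no extra builder.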

2. Avoid regular expressions

Regular expressions are relatively cheap and convenient. However, if you are in the N.O.P.E. branch, they are the worst thing you can do. If you absolutely must use regular expressions in computationally intensive code parts, at least cache the Pattern reference instead of recompiling it all the time:

static final Pattern HEAVY_REGEX =
    Pattern.compile("(((X)*Y)*Z)*");

But if your regular expression is really trivial, like this:

String[] parts = ipAddress.split("\\.");

Then you’d really be better off resorting to plain char[] or index-based operations. For example, this completely unreadable loop does the same thing:

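// Assumes a dotted IPv4 address, i.e. exactly four parts
String[] parts = new String[4];
int length = ipAddress.length();
int offset = 0;
int part = 0;

for (int i = 0; i < length; i++) {
    // A part ends at the last character or right before a dot
    if (i == length - 1 || ipAddress.charAt(i + 1) == '.') {
        parts[part] = ipAddress.substring(offset, i + 1);
        part++;
        offset = i + 2;
    }
}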

This also shows why you shouldn't do any premature optimization: compared to the split() version, this is unmaintainable. Challenge: clever readers may find even faster algorithms.

Takeaway: regular expressions are useful, but they come at a price. If you are deep down in a N.O.P.E. branch, you must avoid regular expressions at all costs. Beware of the various JDK String methods that use regular expressions, such as String.replaceAll() or String.split(). Use a popular library such as Apache Commons Lang for your string manipulation instead.
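A somewhat more readable index-based alternative can be sketched with String.indexOf(). The class and method names below are hypothetical. Unlike String.split(), this version keeps trailing empty segments, and some JDK implementations already shortcut trivial separators inside split(), so measure before assuming a gain.

```java
import java.util.ArrayList;
import java.util.List;

final class DotSplitter {

    // Splits on '.' without touching java.util.regex.
    static List<String> splitOnDot(String s) {
        List<String> parts = new ArrayList<>(4);
        int start = 0;
        int dot;
        while ((dot = s.indexOf('.', start)) >= 0) {
            parts.add(s.substring(start, dot));
            start = dot + 1;
        }
        parts.add(s.substring(start)); // last segment, possibly empty
        return parts;
    }
}
```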

3. Do not use iterator()

Now, this advice is really not for general use cases, but only applies deep down in a N.O.P.E. branch. Nonetheless, you should think about it. Writing Java-5-style foreach loops is convenient. You can completely forget about the looping internals and write:

for (String value : strings) {
    // Do something useful here
}

However, each time you run into this loop, if strings is an Iterable, you create a new Iterator instance. If you are using an ArrayList, this allocates an object with 3 ints on your heap:

private class Itr implements Iterator<E> {
    int cursor;
    int lastRet = -1;
    int expectedModCount = modCount;
    // ...

Instead, you can write the following equivalent loop and “waste” only a single int on the stack, which is very cheap:

int size = strings.size();
for (int i = 0; i < size; i++) {
    String value = strings.get(i);

    // Do something useful here
}

... or, if your list doesn't really change, you might even operate on an array version of it:

for (String value : stringArray) {
    // Do something useful here
}

Iterators, Iterable, and the foreach loop are extremely useful from a writability and readability perspective, as well as from an API design perspective. However, they create a small new instance on the heap each time a loop is started. If you run such a loop many, many times, make sure you avoid creating this useless instance and write index-based iterations instead.

4. Do not call that method

Some methods are simply expensive. In our N.O.P.E. branch example, we don't have such a method at the leaf, but you may well have one. Let's assume that your JDBC driver needs to go through incredible trouble to calculate the value of ResultSet.wasNull(). Your own SQL framework code might look something like this:

if (type == Integer.class) {
    result = (T) wasNull(rs, Integer.valueOf(rs.getInt(index)));
}

// And then...
static final <T> T wasNull(ResultSet rs, T value)
throws SQLException {
    return rs.wasNull() ? null : value;
}

This logic now calls ResultSet.wasNull() every time you get an int from the result set. But the getInt() contract reads:

Returns: the column value; if the value is SQL NULL, the value returned is 0

So a simple but potentially huge improvement to the above would be:

static final <T extends Number> T wasNull(ResultSet rs, T value)
throws SQLException {
    return (value == null || (value.intValue() == 0 && rs.wasNull()))
        ? null : value;
}

Takeaway: do not call expensive methods in the leaf nodes of your algorithm. Cache the call instead, or avoid it if the method contract allows it.
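To make the effect of the guarded check visible, here is a small sketch that counts how often the "expensive" call really happens. IntColumn and CountingColumn are hypothetical stand-ins for the two ResultSet methods discussed above, not JDBC types.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LeafCallDemo {

    // Hypothetical stand-in for ResultSet.getInt() and ResultSet.wasNull()
    interface IntColumn {
        int getInt();
        boolean wasNull();
    }

    // Counts how often the "expensive" wasNull() is actually invoked
    static final class CountingColumn implements IntColumn {
        final Integer stored; // null models SQL NULL
        final AtomicInteger wasNullCalls = new AtomicInteger();

        CountingColumn(Integer stored) {
            this.stored = stored;
        }

        @Override
        public int getInt() {
            return stored == null ? 0 : stored; // same contract as getInt(): NULL reads as 0
        }

        @Override
        public boolean wasNull() {
            wasNullCalls.incrementAndGet();
            return stored == null;
        }
    }

    // Always pays for wasNull()
    static Integer naive(IntColumn column) {
        int value = column.getInt();
        return column.wasNull() ? null : value;
    }

    // Pays for wasNull() only when the value is 0, as the contract allows
    static Integer guarded(IntColumn column) {
        int value = column.getInt();
        return (value == 0 && column.wasNull()) ? null : value;
    }
}
```

For any non-zero value, guarded() never touches wasNull(); only a stored 0 or SQL NULL triggers the expensive call.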

5. Use primitives and the stack

The example above uses a lot of generics and is thus forced to use wrapper types for byte, short, int, and long, at least until generics can be specialized over primitives, as planned in Project Valhalla. But you may not have this constraint in your code, so you should take all measures to replace this:

// Goes to the heap
Integer i = 817598;

... with this:

// Stays on the stack
int i = 817598;

Things get even worse when you use arrays. Replace this:

// Three heap objects!
Integer[] i = { 1337, 424242 };

... with this:

// One heap object.
int[] i = { 1337, 424242 };

As you go deep down into your N.O.P.E. branch, you should be extremely wary of wrapper types. Chances are that you will put a lot of pressure on your GC, which has to kick in all the time to clean up your mess. A particularly useful optimization might be to take some primitive type and create a large one-dimensional array of it, along with a couple of delimiter variables that indicate exactly where your encoded object is located in the array. An excellent library for primitive collections, which are a bit more sophisticated than your average int[], is trove4j, which ships under the LGPL.

Exception: there is one exception to this rule. Boolean and Byte have few enough values to be cached entirely by the JDK. You can write:

Boolean a1 = true; // ... syntax sugar for:
Boolean a2 = Boolean.valueOf(true);

Byte b1 = (byte) 123; // ... syntax sugar for:
Byte b2 = Byte.valueOf((byte) 123);

The same is true for low values of the other integer primitive types, including char, short, int, and long. But only if you are auto-boxing them or calling TheType.valueOf(), not when you call the constructor!
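The caching behaviour can be observed through reference comparisons. This is only a sketch with a hypothetical class name; it restricts itself to values that the JDK documents as cached (all Boolean and Byte values, and int values from -128 to 127) and makes no claim about values outside that range.

```java
final class WrapperCacheDemo {

    // Boolean.valueOf() hands out the shared Boolean.TRUE / Boolean.FALSE
    static boolean sameBoolean() {
        Boolean boxed = true; // auto-boxing, i.e. Boolean.valueOf(true)
        return boxed == Boolean.TRUE;
    }

    // All 256 Byte values are cached
    static boolean sameByte() {
        Byte a = (byte) 123;
        Byte b = Byte.valueOf((byte) 123);
        return a == b; // reference comparison on purpose
    }

    // Small ints (-128..127) are cached as well
    static boolean sameSmallInt() {
        Integer a = 127;
        Integer b = Integer.valueOf(127);
        return a == b; // reference comparison on purpose
    }
}
```

All three methods compare references rather than values, which is normally a bug; here it is the point of the demonstration.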

Never call a constructor on a wrapper type unless you really want a new instance

This fact can also help you write a sophisticated April Fool's joke for your co-workers.

Off heap: of course, you might also want to experiment with off-heap libraries, although they are more of a strategic decision than a local optimization.
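Returning to the one-dimensional primitive array idea from this section, a minimal sketch could encode 2D points as consecutive ints in a single int[]. PointBuffer, its stride of 2, and its fixed capacity are assumptions made for illustration only.

```java
final class PointBuffer {

    private static final int STRIDE = 2; // ints per encoded point: x, then y

    private final int[] data; // one heap object for all points
    private int size;         // number of points stored so far

    PointBuffer(int capacity) {
        this.data = new int[capacity * STRIDE];
    }

    // Fixed capacity: adding beyond it throws ArrayIndexOutOfBoundsException
    void add(int x, int y) {
        int base = size * STRIDE;
        data[base] = x;
        data[base + 1] = y;
        size++;
    }

    int x(int index) {
        return data[index * STRIDE];
    }

    int y(int index) {
        return data[index * STRIDE + 1];
    }

    int size() {
        return size;
    }

    // Tight loop over primitives: no boxing, no per-point objects
    long sumOfX() {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            sum += data[i * STRIDE];
        }
        return sum;
    }
}
```

Compared with a List of Point objects holding Integer fields, this layout allocates one array up front and nothing per element.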

6. Avoid recursion

Modern functional programming languages like Scala encourage the use of recursion, because they offer ways to optimize tail-recursive algorithms back into iterative ones. If your language supports such optimizations, you are probably fine. But even then, the slightest change to the algorithm can produce a branch that prevents your recursion from being tail-recursive. Hopefully the compiler will detect this! Otherwise, you might be wasting a lot of stack frames on something that could have been implemented with just a few local variables.

Takeaway: always prefer iteration over recursion when you are deep down in the N.O.P.E. branch.
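As a deliberately tiny illustration, here are a recursive and an iterative way to sum the integers from 1 to n. The class and method names are hypothetical. Java does not eliminate tail calls, so the recursive variant uses one stack frame per step, while the loop needs two local variables.

```java
final class SumTo {

    // One stack frame per value of n; deep inputs risk a StackOverflowError
    static long recursive(int n) {
        return n == 0 ? 0 : n + recursive(n - 1);
    }

    // Same result with constant stack usage
    static long iterative(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }
}
```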

7. Use entrySet()

When you want to iterate through a Map and you need both keys and values, you must have a very good reason to write the following:

for (K key : map.keySet()) {
    V value = map.get(key);
}

Instead of the following:

for (Entry<K, V> entry : map.entrySet()) {
    K key = entry.getKey();
    V value = entry.getValue();
}

When you are in the N.O.P.E. branch, you should be wary of maps anyway, because lots and lots of O(1) map access operations are still lots of operations. And the access is not free either. But at the very least, if you cannot do without maps, use entrySet() to iterate over them! The Map.Entry instance is there anyway; you only need to access it.

Takeaway: always use entrySet() when you need both keys and values during map iteration.

8. Use EnumSet or EnumMap

In some cases, the number of possible keys in a map is known in advance, for instance when using a configuration map. If that number is relatively small, you should really consider using EnumSet or EnumMap instead of a regular HashSet or HashMap. This is easily explained by looking at EnumMap.put():

private transient Object[] vals;

public V put(K key, V value) {
    // ...
    int index = key.ordinal();
    vals[index] = maskNull(value);
    // ...
}

The essence of this implementation is that we have an array of indexed values rather than a hash table. To look up the map entry when inserting a new value, all we have to do is ask the enum for its constant ordinal, which the Java compiler generates for each enum type. If this is a global configuration map (that is, only one instance), the increased access speed will help EnumMap heavily outperform HashMap, which may use a bit less heap memory, but which has to run hashCode() and equals() on each key.

Takeaway: Enum and EnumMap are very close friends. Whenever you use enum-like structures as keys, consider actually making those structures enums and using them as keys in an EnumMap.
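A configuration map keyed by an enum might be sketched like this. ConfigSketch, its Key constants, and the default values are hypothetical; only EnumMap itself is JDK API.

```java
import java.util.EnumMap;
import java.util.Map;

final class ConfigSketch {

    // The full set of possible keys is known at compile time
    enum Key { HOST, PORT, TIMEOUT_MS }

    static Map<Key, String> defaults() {
        // Backed by an array indexed by Key.ordinal(); no hashing involved
        Map<Key, String> config = new EnumMap<>(Key.class);
        config.put(Key.HOST, "localhost");
        config.put(Key.PORT, "8080");
        return config;
    }
}
```

Callers still program against the Map interface, so switching from HashMap to EnumMap is usually a one-line change at the construction site.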

9. Optimize your hashCode() and equals() methods

If you cannot use an EnumMap, at least optimize your hashCode() and equals() methods. A good hashCode() method is essential because it prevents further calls to the much more expensive equals(), as it produces more distinct hash buckets per set of instances. In every class hierarchy, you may have popular and simple objects. The simplest and fastest possible implementation of hashCode() looks like this:

// AbstractTable, a common Table base implementation:

@Override
public int hashCode() {

    // [#1938] This is a much more efficient hashCode()
    // implementation compared to that of standard
    // QueryParts
    return name.hashCode();
}

... where name is simply the table name. We don't even consider the schema or any other attribute of the table, because table names are usually distinct enough across a database. Also, the name is a String, so it already has a cached hashCode() value inside. The comment is important, because AbstractTable extends AbstractQueryPart, which is a common base implementation for any AST (Abstract Syntax Tree) element. The common AST element does not have any properties, so it cannot make any assumptions about an optimized hashCode() implementation. Thus, the overridden method looks like this:

// AbstractQueryPart, a common AST element
// base implementation:

@Override
public int hashCode() {
    // This is a working default implementation.
    // It should be overridden by concrete subclasses,
    // to improve performance
    return create().renderInlined(this).hashCode();
}

In other words, the entire SQL rendering workflow has to be triggered to compute the hash code of a common AST element. Things get more interesting with equals():

// AbstractTable, a common Table base implementation:

@Override
public boolean equals(Object that) {
    if (this == that) {
        return true;
    }

    // [#2144] Non-equality can be decided early,
    // without executing the rather expensive
    // implementation of AbstractQueryPart.equals()
    if (that instanceof AbstractTable) {
        if (StringUtils.equals(name, (((AbstractTable<?>) that).name))) {
            return super.equals(that);
        }

        return false;
    }

    return false;
}

First thing: always (not only in a N.O.P.E. branch) abort every equals() method early if:

this == argument
this is of an "incompatible type" with argument

Note that the latter condition includes argument == null if you use instanceof to check for compatible types. We have blogged about this before in 10 Subtle Best Practices when Coding Java. Now, after aborting the comparison early in these obvious cases, you may also want to abort early when you can make partial decisions. For instance, the contract of Table.equals() is that for two tables to be considered equal, they must have the same name, regardless of the concrete implementation type. For example, there is no way these two items can be equal:

com.example.generated.Tables.MY_TABLE
DSL.tableByName("MY_OTHER_TABLE")

If the argument cannot be equal to this, and if we can check that easily, let's do so and abort if the check fails. If the check succeeds, we can still proceed with the more expensive implementation from super. Given that most objects in the universe are not equal, we will save a lot of CPU time by shortcutting this method.
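The early-abort pattern can be sketched in a self-contained class. NamedTable, its schema field, and the expensiveComparisons counter are hypothetical; the counter exists only to make the shortcut observable and stands in for a genuinely costly super.equals().

```java
import java.util.Objects;

final class NamedTable {

    // Hypothetical counter, only here to show when the costly path runs
    static int expensiveComparisons = 0;

    private final String schema;
    private final String name;

    NamedTable(String schema, String name) {
        this.schema = schema;
        this.name = Objects.requireNonNull(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode(); // cheap: String caches its own hash code
    }

    @Override
    public boolean equals(Object that) {
        if (this == that) {
            return true; // same instance
        }
        if (!(that instanceof NamedTable)) {
            return false; // incompatible type, also covers null
        }
        NamedTable other = (NamedTable) that;
        if (!name.equals(other.name)) {
            return false; // partial decision: different names can never be equal
        }
        return expensiveEquals(other);
    }

    // Stand-in for an expensive comparison such as rendering both objects
    private boolean expensiveEquals(NamedTable other) {
        expensiveComparisons++;
        return Objects.equals(schema, other.schema);
    }
}
```

Because hashCode() depends only on name and equal objects always share a name, the equals()/hashCode() contract still holds.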

10. Think in sets, not individual elements

Last but not least, there is one thing that is not Java-related but applies to any language. Besides, we are leaving the N.O.P.E. branch here, as this advice might just help you move from O(N^3) to O(n log n), or something like that. Unfortunately, many programmers think in terms of simple, local algorithms. They solve a problem step by step, branch by branch, loop by loop, method by method. That is the imperative and/or functional programming style. While it is increasingly easy to model the "bigger picture" when going from pure imperative to object-oriented (still imperative) to functional programming, all these styles lack something that only SQL, R, and similar languages have: declarative programming. In SQL (and we love it) you can declare the outcome you want to get from your database without making any algorithmic implications whatsoever. The database can then take all the available metadata into consideration (such as constraints, keys, indexes, and so on) to figure out the best possible algorithm. In theory, this has been the main idea behind SQL and relational calculus from the beginning. The main advantage of thinking in sets is that your algorithms become much more concise.

Compare the element-by-element way of intersecting two sets:

// Pre-Java 8
Set result = new HashSet();
for (Object candidate : someSet)
    if (someOtherSet.contains(candidate))
        result.add(candidate);

// Even Java 8 doesn't really help
someSet.stream()
       .filter(someOtherSet::contains)
       .collect(Collectors.toSet());

Some may argue that functional programming and Java 8 will help you write simpler, more concise algorithms. That is not necessarily true. You can translate your imperative Java 7 loop into a functional Java 8 Stream collection, but you are still writing the very same algorithm. Writing a SQL-like expression is different:

SomeSet INTERSECT SomeOtherSet

This can be implemented in 1000 ways by the implementation engine. As we have learned today, perhaps it is wise to transform the two sets into EnumSet automatically before running the INTERSECT operation. Perhaps we can parallelize this INTERSECT without making low-level calls to Stream.parallel().
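Java has no INTERSECT operator, but the closest set-level call in the JDK is Collection.retainAll(). The following sketch hides the algorithm behind a hypothetical SetOps.intersect() helper, which is then free to pick a strategy; here it copies the smaller set and filters it against the larger one. This is one possible strategy under those assumptions, not what a database engine does.

```java
import java.util.HashSet;
import java.util.Set;

final class SetOps {

    // Callers state what they want (an intersection), not how to compute it
    static <T> Set<T> intersect(Set<T> a, Set<T> b) {
        Set<T> smaller = a.size() <= b.size() ? a : b;
        Set<T> larger = (smaller == a) ? b : a;

        Set<T> result = new HashSet<>(smaller); // copy, inputs stay untouched
        result.retainAll(larger);               // keep only elements also in 'larger'
        return result;
    }
}
```

The implementation behind intersect() could later switch to EnumSet or a parallel strategy without any change at the call sites.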

Conclusion

In this article, we discussed optimizations in the N.O.P.E. branch, that is, deep down in a high-complexity algorithm. Applied to our own code, this means:

Every query is generated on a single StringBuilder only
Our template engine actually parses characters instead of using regular expressions
We use arrays wherever we can, especially when iterating over listeners
We stay away from JDBC methods that we do not have to call
And so on...

This article is based on a Google Translate translation with light manual adjustments to the sentences. Some passages may still read awkwardly; interested readers can consult the original article.
