Preface

I used Groovy in a previous project to abstract some business logic, and it worked fairly well. There were also some pitfalls along the way, so this is a brief record of how I implemented it step by step. Here you will learn:

1. Why Groovy as the rule scripting engine

2. The fundamentals of Groovy and how it integrates with Java

3. Some problems and pitfalls of integrating Groovy with Java

4. What performance optimizations were made in the project

5. Some tips to consider when using it

Problems that rule scripts can solve

In the Internet era, services evolve quickly, and iteration and product onboarding keep getting faster, which calls for flexible configuration. The usual approaches are:

1. The most traditional way is to hard-code the logic in Java, expose a few adjustable parameters, and package it as an independent business module. After parameters are added or rules are adjusted, the module has to be released again.

2. Use an open source solution such as the Drools rule engine, which suits complex business systems.

3. Use a dynamic scripting engine: Groovy, simpleEl, QLExpress.

Introducing rule scripts to abstract the business can greatly improve efficiency. For example, in a loan approval system I worked on before, a loan order passes through several process transitions after it is accepted. The next step has to be decided from the results returned by the risk control system, and the transition rules differ from product to product. Every time a new product was onboarded, developers had to write a pile of process logic for it, and the rules for existing products also changed often. So I wanted to use the dynamic parsing and execution of a script engine to abstract the process transitions into rule scripts and improve efficiency.

How to choose a wheel

Traditional open source solutions such as Activiti and Drools are too heavy for the complexity of our business. Alibaba also has some open source projects. When comparing rule scripts, performance, stability, and syntactic flexibility all need to be considered. Overall, I chose Groovy for the following reasons:

1. Long history, widely used, fewer pitfalls

2. Compatibility with Java: Java code works seamlessly, so you can write rules even if you don't know Groovy syntax

3. Syntactic sugar

4. The project cycle was short and the launch date was urgent 😢

Abstraction of the project flow

Different businesses handle the logic of a process transition differently. Consider a simple case: the project has business process transitions for different loan orders. For example, an order can move from process A to process B or process C, depending on the execution of each Strategy Unit (see figure below). Each Strategy Unit returns a Boolean value after execution, and its specific logic can be defined freely. Here we assume that if the conditions of all Strategy Units of A are met (that is, every unit returns true), the order moves to Scenario B; if the conditions of all Strategy Units of B are met, the order moves to Scenario C.

Why multiple StrategyLogicUnits? In my project, to make configuration easier, the StrategyLogicUnit configuration of the whole process is displayed on the UI. This is more readable, and a change only requires modifying the execution logic of one unit.

The data that each StrategyLogicUnit depends on can be abstracted into a Context, which contains two parts. One is business data, such as the product of the order and the risk control data the order depends on. The other is rule execution data, whose design can vary by service and mainly has to account for breakpoint rerun and policy groups; for example, you can design the association between policy groups and products. This part is tightly coupled to the business, so this article focuses on Groovy instead.

The Context can be understood as the input and output of a StrategyLogicUnit. Each StrategyLogicUnit is executed in Groovy, and each unit can be displayed and configured on the UI. A unit can make logical decisions based on the information in the context, or change values in the context object.
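To make the design concrete, here is a minimal Java sketch in which plain Java lambdas stand in for the Groovy-backed units. The names FlowContext, StrategyLogicUnit.evaluate, and ScenarioRouter are hypothetical and chosen only for illustration; they are not the project's actual classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical context: business data plus the result written by the rules.
class FlowContext {
    long amount;          // business data (illustrative field)
    String nextScenario;  // rule output, null until a transition is decided

    FlowContext(long amount) {
        this.amount = amount;
    }
}

// Hypothetical unit contract: read the context and return a Boolean verdict.
interface StrategyLogicUnit {
    boolean evaluate(FlowContext context);
}

class ScenarioRouter {
    // Moves the order to targetScenario only if every unit returns true.
    // allMatch short-circuits, so units after the first failure are not run.
    static boolean route(List<StrategyLogicUnit> units, FlowContext context, String targetScenario) {
        boolean allPassed = units.stream().allMatch(unit -> unit.evaluate(context));
        if (allPassed) {
            context.nextScenario = targetScenario;
        }
        return allPassed;
    }

    public static void main(String[] args) {
        List<StrategyLogicUnit> unitsOfA = Arrays.asList(
                ctx -> ctx.amount >= 20000,
                ctx -> ctx.amount < 1_000_000);
        FlowContext context = new FlowContext(30000);
        System.out.println(route(unitsOfA, context, "B") + " -> " + context.nextScenario);
    }
}
```

In the real project each unit body would be a configured Groovy script rather than a lambda, but the contract stays the same: context in, Boolean out, with any side effects written back to the context.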

Flow-based integration of Groovy with Java

So how do we combine Groovy and Java for the flow above? Under this design, executing a Groovy script essentially means taking the context object, making logical judgments based on the key information in it, and outputting a result, which is also stored in the context. Let's look at the ways Groovy can be integrated with Java:

GroovyClassLoader

With Groovy's GroovyClassLoader you can dynamically load a script and execute it. GroovyClassLoader is a class loader customized by Groovy that parses and loads the Groovy classes used from Java classes.

GroovyShell

GroovyShell allows you to evaluate arbitrary Groovy expressions in Java classes (and even Groovy classes). You can use a Binding object to pass parameters into an expression and get the result of the evaluation back from GroovyShell.

GroovyScriptEngine

GroovyShell is mostly used to evaluate standalone scripts or expressions. When several scripts depend on each other, GroovyScriptEngine is the better choice. GroovyScriptEngine loads Groovy scripts from locations you specify (file systems, URLs, databases, and so on) and reloads them when the scripts change. Like GroovyShell, GroovyScriptEngine allows you to pass in parameter values and returns the value of the script.

GroovyClassLoader example

All three approaches work, so let's take GroovyClassLoader as the example and show how to integrate it with Java.

For example, assume that orders with an application amount of 20000 or more go to process A. First, add the Maven dependency to the Spring Boot project:

<dependency>
    <groupId>org.codehaus.groovy</groupId>
    <artifactId>groovy-all</artifactId>
    <version>2.4.10</version>
</dependency>

Define the Java interface that Groovy executes:

public interface EngineGroovyModuleRule {
    boolean run(Object context);
}

Abstract a Groovy template file and place it under the resources directory for loading:

import com.groovyexample.groovy.*
class %s implements EngineGroovyModuleRule {
    boolean run(Object context) {
        %s // business execution logic: configurable
    }
}
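To see how the two %s placeholders get filled, here is a small sketch that uses only String.format. The template is inlined as a string for illustration (the article loads it from a resource file), and the class name TemplateFiller is hypothetical.

```java
class TemplateFiller {
    // Inlined copy of the template: the first %s is the class name,
    // the second %s is the configurable rule body.
    static final String TEMPLATE =
            "import com.groovyexample.groovy.*\n"
          + "class %s implements EngineGroovyModuleRule {\n"
          + "    boolean run(Object context) {\n"
          + "        %s\n"
          + "    }\n"
          + "}\n";

    // Returns the complete Groovy source for one rule.
    static String fill(String className, String ruleBody) {
        return String.format(TEMPLATE, className, ruleBody);
    }

    public static void main(String[] args) {
        System.out.println(fill("testGroovy", "return context.amount >= 20000"));
    }
}
```

One thing to keep in mind with this approach: String.format interprets the template, so a literal percent sign in the template itself must be written as %%. Percent signs inside the rule body are passed through unchanged because arguments are not parsed as format strings.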

Next, parse the Groovy template file; the template content can be cached. I located the file with Spring's PathMatchingResourcePatternResolver. The StrategyLogicUnit string below holds the logic of a specific business rule, and this is the part that gets configured. For example, assume the logic is: when the application amount of the order is 20000 or more, go to process A. A simple example:

        // Cache of template file name -> template content
        ConcurrentHashMap<String, String> concurrentHashMap = new ConcurrentHashMap<>(128);
        final String path = "classpath*:*.groovy_template";
        PathMatchingResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
        Arrays.stream(resolver.getResources(path))
                .parallel()
                .forEach(resource -> {
                    // try-with-resources closes the reader and the underlying stream
                    try (BufferedReader br = new BufferedReader(
                            new InputStreamReader(resource.getInputStream()))) {
                        String fileName = resource.getFilename();
                        StringBuilder template = new StringBuilder();
                        for (String line; (line = br.readLine()) != null; ) {
                            template.append(line).append("\n");
                        }
                        concurrentHashMap.put(fileName, template.toString());
                    } catch (Exception e) {
                        log.error("resolve file failed", e);
                    }
                });
        String scriptBuilder = concurrentHashMap.get("ScriptTemplate.groovy_template");
        String scriptClassName = "testGroovy";
        // The configurable business rule that fills the second %s of the template
        String StrategyLogicUnit = "if(context.amount>=20000){\n" +
                " context.nextScenario='A'\n" +
                " return true\n" +
                " }\n" +
                "";
        String fullScript = String.format(scriptBuilder, scriptClassName, StrategyLogicUnit);

        // Compile the generated source and run the rule against a context
        GroovyClassLoader classLoader = new GroovyClassLoader();
        Class<EngineGroovyModuleRule> aClass = classLoader.parseClass(fullScript);
        Context context = new Context();
        context.setAmount(30000);
        try {
            EngineGroovyModuleRule engineGroovyModuleRule = aClass.newInstance();
            log.info("Groovy Script returns: {}", engineGroovyModuleRule.run(context));
            log.info("Next Scenario is {}", context.getNextScenario());
        } catch (Exception e) {
            log.error("error...", e);
        }

Execute the code above:

Groovy Script returns: true
Next Scenario is A

The key part is the configurability of StrategyLogicUnit. On the management console UI we display the StrategyLogicUnits of each product and support CRUD on them. To make configuration easier, we also added features such as policy groups, copying and associating product policies, and one-click template copying.

Pitfalls and performance optimization during integration

During testing, we found that frequent Full GCs occurred as the number of orders grew. After reproducing the problem in the test environment, the log showed:

[Full GC (Metadata GC Threshold) [PSYoungGen: 64K->0K(43008K)] [ParOldGen: [Metaspace: 15031K->15031K(1062912K)], 0.009340 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]

The log shows that Metaspace has run out of space and that Full GC cannot reclaim it. JVisualVM shows what is happening:

Question 1, the number of classes: introducing Groovy may have caused too many classes to be loaded, yet the project is configured with only 10 StrategyLogicUnits. Different orders executing the same StrategyLogicUnit should map to the same Class, so this class count is abnormal.

Question 2: even with too many classes, why can't Full GC reclaim them?

Let's investigate with these two questions in mind.

GroovyClassLoader loading

Let's start by analyzing what Groovy does. The most important pieces of code are as follows:

GroovyClassLoader classLoader = new GroovyClassLoader();
Class<EngineGroovyModuleRule> aClass = classLoader.parseClass(fullScript);
EngineGroovyModuleRule engineGroovyModuleRule = aClass.newInstance();
engineGroovyModuleRule.run(context);

GroovyClassLoader is a custom class loader that dynamically loads Groovy scripts as Java classes while the code is running. We all know the parent delegation model of class loaders, so let's look at the ancestors of GroovyClassLoader:

def cl = this.class.classLoader  
while (cl) {  
    println cl  
    cl = cl.parent  
}  

Output:

groovy.lang.GroovyClassLoader$InnerLoader@13322f3  
groovy.lang.GroovyClassLoader@127c1db  
org.codehaus.groovy.tools.RootLoader@176db54  
sun.misc.Launcher$AppClassLoader@199d342  
sun.misc.Launcher$ExtClassLoader@6327fd  

This gives the following hierarchy:

Bootstrap ClassLoader
    ↑
sun.misc.Launcher.ExtClassLoader       // i.e. the Extension ClassLoader
    ↑
sun.misc.Launcher.AppClassLoader       // i.e. the System ClassLoader
    ↑
org.codehaus.groovy.tools.RootLoader   // the following are user-defined ClassLoaders
    ↑
groovy.lang.GroovyClassLoader
    ↑
groovy.lang.GroovyClassLoader.InnerLoader
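The same walk up the parent chain can be written in plain Java, which is handy for checking the hierarchy from inside the Java side of the application. This is only a sketch; the class name LoaderChain is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Collects the loader of the given class followed by all of its parents.
    // The bootstrap loader is represented by null, so it never appears in the list.
    static List<ClassLoader> of(Class<?> type) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        of(LoaderChain.class).forEach(System.out::println);
    }
}
```

The printed names depend on the JVM version: on Java 8 the chain includes the extension class loader seen in the output above, while on Java 9 and later it is replaced by the platform class loader.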

Looking at the key parseClass method of GroovyClassLoader, we find the following code:

    public Class parseClass(String text) throws CompilationFailedException {
        return parseClass(text, "script" + System.currentTimeMillis() +
                Math.abs(text.hashCode()) + ".groovy");
    }

    protected ClassCollector createCollector(CompilationUnit unit, SourceUnit su) {
        InnerLoader loader = AccessController.doPrivileged(new PrivilegedAction<InnerLoader>() {
            public InnerLoader run() {
                return new InnerLoader(GroovyClassLoader.this);
            }
        });
        return new ClassCollector(loader, unit, su);
    }

These two snippets show that Groovy generates a new class object for the script every time it is parsed. The class is named "script" + System.currentTimeMillis() + Math.abs(text.hashCode()), so even when the same StrategyLogicUnit is executed again, a different class is generated: every execution of a rule script produces a new class. This answers question 1.
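The effect of this naming scheme can be reproduced without Groovy. The sketch below copies the name expression from the parseClass code quoted above into a hypothetical helper, ScriptNames.generate, and takes the timestamp as a parameter so the behavior is deterministic.

```java
class ScriptNames {
    // Same expression as in the quoted parseClass(String), with the clock passed in
    // instead of calling System.currentTimeMillis().
    static String generate(String text, long nowMillis) {
        return "script" + nowMillis + Math.abs(text.hashCode()) + ".groovy";
    }

    public static void main(String[] args) {
        String script = "return context.amount >= 20000";
        // Identical script text, parsed one millisecond apart
        System.out.println(generate(script, 1000L));
        System.out.println(generate(script, 1001L));
    }
}
```

The hash part of the name is the same for identical text, but the timestamp part changes, so parsing the same script at two different moments yields two different names and therefore two different classes.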

For question 2, look at the InnerLoader part: Groovy creates a new InnerLoader to load the class each time a script is parsed. We can therefore infer that neither the InnerLoader nor the script classes are being reclaimed during Full GC, so the PERM area (Metaspace since Java 8) fills up after running for a while and Full GC is triggered continuously.

Why do we need InnerLoader?

Under the parent delegation model, a ClassLoader can load a class with a given name only once. If every class were loaded directly by the GroovyClassLoader, then after one script defined a class C, a second script defining its own class C could not be loaded.

In addition, a class can be garbage collected only when its ClassLoader is garbage collected.

If all classes were loaded by a single GroovyClassLoader, they could be collected only when that GroovyClassLoader itself is collected. With an InnerLoader, nothing references the loader once the source has been compiled, apart from the classes it loaded. So as soon as those classes are no longer referenced, both the InnerLoader and its classes can be collected.

Conditions for class unloading (from "In-depth Understanding of the Java Virtual Machine")

A class in the JVM can be collected by the GC only if it meets all three of the following conditions:

1. All instances of the class have been collected, meaning no instance of the class exists in the JVM.

2. The ClassLoader that loaded the class has been collected.

3. The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.

Let's analyze these three points one by one:

The first point can be ruled out:

From the code of GroovyClassLoader.parseClass() we can conclude: Groovy compiles the script into a class named Scriptxx, creates an instance through reflection, and executes it by calling its main function. This happens only once, and nothing else in the application references the class or the instance it generated.

The second point can also be ruled out:

About InnerLoader: Groovy creates a new InnerLoader for each script precisely to solve the GC problem, so the InnerLoader should be independent and not referenced by the application.

Only the third possibility remains:

The Class object of the class is still referenced.

    /**
     * sets an entry in the class cache.
     *
     * @param cls the class
     * @see #removeClassCacheEntry(String)
     * @see #getClassCacheEntry(String)
     * @see #clearCache()
     */
    protected void setClassCacheEntry(Class cls) {
        synchronized (classCache) {
            classCache.put(cls.getName(), cls);
        }
    }

You can reproduce the problem and confirm the cause: parse scripts in an infinite loop, check the state of the class loaders with jmap -clstats, and export a heap dump to inspect the reference relationships. The cause turns out to be this: every time Groovy parses a script, it caches the script's Class, and the next time it parses the script it reads from the cache first. The cache is a Map held by the GroovyClassLoader, where the key is the script's class name and the value is the Class. The class name is generated as follows:

"script" + System.currentTimeMillis() + Math.abs(text.hashCode()) + ".groovy"

Therefore the name is different on every compilation, and each compilation adds one more Class object to the cache that can never be released. As the number of compilations grows, these classes fill up Metaspace (the PERM area before Java 8).

The solution

Most of the time Groovy scripts are compiled and then executed, and in this scenario, although the script is passed in as a parameter, the content of most scripts is identical. The solution is to cache the Class objects produced by parseClass when the project starts, via the InitializingBean interface, keyed by the MD5 of the Groovy script, and to refresh the cache whenever the configuration is modified on the configuration side. This has two advantages:

1. It solves the Metaspace overflow problem

2. It speeds up script execution, because nothing needs to be compiled and loaded at runtime
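Here is a minimal sketch of such a cache, under stated assumptions: the class ScriptClassCache and its methods are hypothetical, not the project's code, and the compile step is passed in as a function so the sketch runs without Groovy on the classpath. In the project, that function would be the call to GroovyClassLoader.parseClass, and the cache would be filled at startup from the configured rules.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class ScriptClassCache {
    private final ConcurrentHashMap<String, Class<?>> cache = new ConcurrentHashMap<>();
    private final Function<String, Class<?>> compiler; // stand-in for the Groovy compile step

    ScriptClassCache(Function<String, Class<?>> compiler) {
        this.compiler = compiler;
    }

    // Hex-encoded MD5 of the script text, used as the cache key.
    static String md5(String script) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(script.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Compiles a script at most once per distinct content.
    Class<?> get(String script) {
        return cache.computeIfAbsent(md5(script), key -> compiler.apply(script));
    }

    // Drops the entry for a rule that was changed or deleted on the configuration side.
    void evict(String oldScript) {
        cache.remove(md5(oldScript));
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        ScriptClassCache cache = new ScriptClassCache(script -> Object.class); // dummy compiler
        cache.get("return true");
        cache.get("return true"); // same content, served from the cache
        System.out.println(cache.size());
    }
}
```

Keying by content rather than by generated name means that ten configured units yield at most ten compiled classes, however many orders are processed.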

Conclusion

Groovy is well suited to configurable processing where the business changes quickly, and it is easy to use. In essence it is still Java code running on the JVM, so when using it we need to understand its class loading mechanism and have the basics of JVM memory at our fingertips, and we can use caching to solve some potential problems while improving performance. It fits rule engines with a relatively small number of rules that are not updated frequently.

I have uploaded the template to GitHub: github.com/loveurwish/…