Fans, I don’t know if you are tired of reading the story (if not, please leave a comment and tell me ^_^), today’s article is a change of flavor, to write a serious technical article. Anyway, let’s get started!

Structure of this paper:

- Requirement background - Attacking Python - Java and Python - Speeding Up Python - Looking for direction - Jython? - Python->Native code - overall idea - actual hands-on - automation - key issues - Import issues - Python GIL issues - test effects - summaryCopy the code

Demand background

To advance the Python

With the rise of artificial intelligence, Python, once a niche programming language, is enjoying a second Renaissance.

The popularity of machine learning/deep learning frameworks, such as TensorFlow and PyTorch, helped Python, a programming language once known for its crawlers (don’t be angry with Python fans), to climb all the way to the top three TIOBE programming languages. It is second only to Java and C, beating C++, JavaScript, PHP, C#, etc.

Of course, Xuanyuan Jun always does not advocate the competition between programming languages, each language has its own advantages and disadvantages, has its own application field. On the other hand, the TIOBE statistics are not representative of the actual situation in China, and the above examples only reflect the popularity of Python today.

Java 还是 Python

Back to our requirements, in many enterprises today, there are both Python research and development teams and Java research and development teams. The Python team is responsible for the development of artificial intelligence algorithms, while the Java team is responsible for algorithm engineering, which provides the interface of algorithm capabilities to higher-level applications through the engineering packaging.

Maybe you want to ask, why not do AI development in Java directly? It’s going to take two teams. In fact, frameworks including TensorFlow are gradually starting to support the Java platform, so it is not a bad idea to use Java for AI development (in fact, there are many teams doing this), but due to historical reasons, there are not many people doing AI development. Most of these people are using Python technology to stack the pit. The AI development ecology of Python has been relatively well built, so in many companies, the algorithm team and the engineering team have to use different languages.

Now it’s time to throw in the big question of this article: How does a Java engineering team invoke Python’s algorithmic capabilities?

The answer is basically one: Python starts a Web service through a framework like Django/Flask, and Java interacts with it through Restful apis

The above approach does solve the problem, but it comes with a performance problem. In particular, as the number of users increases, the amount of concurrent interface access, network access and Python code execution speed will become the bottleneck of the entire project.

Of course, companies with good money can build performance out of hardware; if one doesn’t work, deploy several Python Web services.

Is there a more affordable solution? That’s what this article is about.

To the Python acceleration

Looking for direction

In the performance bottleneck above, there are two main reasons for the slow execution:

  • Access through the network is not as fast as calling the internal module directly
  • Python is interpreted, not fast

As we all know, Python is an interpreted scripting language, and in general, in terms of execution speed:

Interpreted language < intermediate bytecode language < native compiled language

Naturally, we have to work in two directions:

  • Can you call it locally without network access
  • Python does not interpret execution

Combining the two points above, our goal becomes clear:

Convert Python code into modules that Java can call directly and locally

For Java, there are two types of calls that can be made locally:

  • Java code package
  • Native code module

When we talk about Python, we usually mean CPython, which is an interpreter developed in C to explain execution. In addition, in addition to C, many other programming languages can develop virtual machines to interpret and execute Python scripts based on the Python language specifications:

  • CPython: An interpreter written in C
  • Jython: An interpreter written in Java
  • IronPython: Interpreter for the.NET platform
  • PyPy: Python’s own interpreter (chicken and egg)

Jython?

Interacting with Java business code is naturally the easiest if you can execute Python scripts directly in the JVM. But a subsequent investigation found that the road was quickly blocked:

  • Syntax above Python3.0 is not supported
  • Third-party libraries referenced in python source code that contain C extensions, such as Numpy, will not be supported

This route doesn’t work, so here’s another: Convert Python code into Native code blocks that Java calls through JNI’s interface.

Python -> Native code

The overall train of thought

First, convert the Python source code into C code, then compile the C code into binary module SO/DLL with GCC, then encapsulate the Java Native interface, use Jar package command to convert it into Jar package, and then Java can be directly invoked.

The process is not complicated, but there is a key issue that needs to be addressed to achieve this goal in its entirety:

How to convert Python code into C code?

Finally, it’s time for the main character of this article, a core tool called Cython

Note that this Cython is not the same thing as the aforementioned CPython. CPython is a Python interpreter written in C language. It is the default Python script interpreter for Windows and Linux.

Cython is a third-party library for Python that you can install using PIP install Cython.

Cython is a superset of Python language specifications that convert mixed Python+C pyx scripts into C code. It is mainly used to optimize The performance of Python scripts or to call C function libraries.

This may sound a bit complicated and convoluted, but it doesn’t matter, just get a core point: Cython can convert Python scripts into C code

Here’s an experiment:

# FileName: test.py
def TestFunction(a):
  print("this is print from python script")
Copy the code

Convert the above code through Cython to generate test.c, which looks like this:

Reality get to work

1. Prepare Python source code

# FileName: Test.py
Example code: convert the input string to uppercase
def logic(param):
  print('this is a logic function')
  print('param is [%s]' % param)
  return param.upper()

Interface function, exported to Java Native interface
def JNI_API_TestFunction(param):
  print("enter JNI_API_test_function")
  result = logic(param)
  print("leave JNI_API_test_function")
  return result
Copy the code

Note 1: We use a convention here in python source code: functions prefixed with JNI_API_ represent the interface functions that python code modules export to call externally, so that our Python one-key transJAR system can automatically identify which interfaces to extract as export functions.

Note 2: The input of this type of interface function is a Python string of type STR, and the output is also a string of type STR, which makes it easy to port RESTful interfaces that used to take arguments in JSON form. The advantage of using JSON is that you can encapsulate parameters and support multiple complex parameter forms without overloading different interface functions for external calls.

Note 3: It is also important to note that after the JNI_API_ prefix of the interface function, the function name should not be named in python’s usual underline naming method, but in the hump naming method. Note that this is not a recommendation, but a requirement, for reasons that will be discussed later.

Prepare a main.c file

The purpose of this file is to encapsulate the code generated by the Cython transformation into a Java JNI interface style for use in Java in the next step.

/* DO NOT EDIT THIS FILE - it is machine generated */
#include <jni.h>
#include <Python.h>
#include <stdio.h>

#ifndef _Included_main
#define _Included_main
#ifdef __cplusplus
extern "C" {
#endif

#if PY_MAJOR_VERSION < 3
# define MODINIT(name)  init ## name
#else
# define MODINIT(name)  PyInit_ ## name
#endif
PyMODINIT_FUNC  MODINIT(Test)(void);

JNIEXPORT void JNICALL Java_Test_initModule
(JNIEnv *env, jobject obj) {
  PyImport_AppendInittab("Test", MODINIT(Test));
  Py_Initialize();

  PyRun_SimpleString("import os");
  PyRun_SimpleString("__name__ = \"__main__\"");
  PyRun_SimpleString("import sys");
  PyRun_SimpleString("sys.path.append('./')");

  PyObject* m = PyInit_Test_Test();
  if(! PyModule_Check(m)) { PyModuleDef *mdef = (PyModuleDef *) m; PyObject *modname = PyUnicode_FromString("__main__");
  	m = NULL;
  	if (modname) {
  	  m = PyModule_NewObject(modname);
  	  Py_DECREF(modname);
  	  if (m) PyModule_ExecDef(m, mdef);
  	}
  }
  PyEval_InitThreads();
}


JNIEXPORT void JNICALL Java_Test_uninitModule
(JNIEnv *env, jobject obj) {
  Py_Finalize();
}

JNIEXPORT jstring JNICALL Java_Test_testFunction
(JNIEnv *env, jobject obj, jstring string)
{
  const char* param = (char*)(*env)->GetStringUTFChars(env, string.NULL);
  static PyObject *s_pmodule = NULL;
  static PyObject *s_pfunc = NULL;
  if(! s_pmodule || ! s_pfunc) { s_pmodule = PyImport_ImportModule("Test");
    s_pfunc = PyObject_GetAttrString(s_pmodule, "JNI_API_testFunction");
  }
  PyObject *pyRet = PyObject_CallFunction(s_pfunc, "s", param);
  (*env)->ReleaseStringUTFChars(env, string, param);
  if (pyRet) {
    jstring retJstring = (*env)->NewStringUTF(env, PyUnicode_AsUTF8(pyRet));
    Py_DECREF(pyRet);
    return retJstring;
  } else {
    PyErr_Print();
    return (*env)->NewStringUTF(env, "error"); }}#ifdef __cplusplus
}
#endif
#endif
Copy the code

There are three functions in this file:

  • Java_Test_initModule: Python initialization
  • Java_Test_uninitModule: Python uninitialization
  • Java_Test_testFunction: a real business interface that encapsulates the call to the JNI_API_testFuncion function defined in Python and is responsible for the JNI layer parameter jString conversion.

According to the JNI interface specification, the naming of C functions at the native level should conform to the following form:

QualifiedClassName: Specifies the full class name
MethodName: indicates the name of the JNI interface function
void
JNICALL
Java_QualifiedClassName_MethodName(JNIEnv*, jobject);
Copy the code

This is why the definition in the main.c file should be named as above. This is why the python interface function names should not be underlined. This will cause the JNI interface to not find the corresponding native function.

3. Use the Cython tool to compile and generate dynamic libraries

Add a little prep work: change the suffix of Python source files from.py to.pyx

With the Python source code test.pyx and the main.c file ready, it’s time for Cython, which automatically converts all pyx files into.c files, and internally calls GCC to generate a dynamic binary library file, combined with our own main.c file.

To work on Cython, you need to prepare a setup.py file, which configurates the compilation information of the transformation, including the input file, output file, compilation parameters, include directory, and link directory, as follows:

from distutils.core import setup
from Cython.Build import cythonize
from distutils.extension import Extension

sourcefiles = ['Test.pyx'.'main.c']

extensions = [Extension("libTest", sourcefiles, 
  include_dirs=['/ Library/Java/JavaVirtualMachines jdk1.8.0 _191. JDK/Contents/Home/include'.'/ Library/Java/JavaVirtualMachines jdk1.8.0 _191. JDK/Contents/Home/include/Darwin/'.'/ Library/Frameworks/Python framework Versions / 3.6 / include/python3.6 m'],
  library_dirs=['/ Library/Frameworks/Python framework Versions / 3.6 / lib/'],
  libraries=['python3.6 m'])]

setup(ext_modules=cythonize(extensions, language_level = 3))
Copy the code

Note: This involves the compilation of Python binaries and requires a link to the Python library

Note: This refers to JNI related data structure definitions and requires the Java JNI directory to be included

With the setup.py file ready, run the following command to start the conversion + compilation:

Python3.6 setup. Py build_ext -- inplaceCopy the code

Generate the dynamic library file we need: libtest.so

4. Prepare the interface file for Java JNI invocation

Java business code usage needs to define an interface, as follows:

// FileName: Test.java
public class Test {
  public native void initModule(a);
  public native void uninitModule(a);
  public native String testFunction(String param); }Copy the code

At this point, the purpose of calling in Java has been realized. Note that before calling the business interface, you need to call the initModule to initialize Python at the native level.


import Test;
public class Demo {
    public void main(String[] args) {
        System.load("libTest.so");
        Test tester = new Test();
        tester.initModule();
        String result = tester.testFunction("this is called from java"); tester.uninitModule(); System.out.println(result); }}Copy the code

Output:

enter JNI_API_test_function
this is a logic function
param is [this is called from java]
leave JNI_API_test_function
THIS IS CALLED FROM JAVA!
Copy the code

Successful implementation of calling Python code in Java!

5. Encapsulate it as a Jar package

To achieve the above is not enough, in order to better use experience, we go one step further and encapsulate it into a Jar package.

First, the original JNI interface file needs to be expanded to add a static loadLibrary method to automatically release and load the so file.

// FileName: Test.java
public class Test {
  public native void initModule(a);
  public native void uninitModule(a);
  public native String testFunction(String param);
  public synchronized static void loadLibrary(a) throws IOException {
    // Implement...}}Copy the code

Then convert the above interface file to a Java class file:

javac Test.java
Copy the code

Finally, prepare to put the class and so files in the Test directory and package them:

jar -cvf Test.jar ./Test
Copy the code

automation

The above 5 steps are a hassle if you have to do them manually every time! The good news is that we can write A Python script to completely automate this process and actually convert the Jar packages in Python with one click

For the sake of space, here’s just the key to the automation process:

  • Automatically scan python source code to extract interface functions that need to be exported
  • Java files for main.c, setup.py, and JNI interfaces need to be generated automatically (you can define template + parameter form to quickly build), and the corresponding relationship between module names and function names needs to be handled properly

The key problem

1. An import problem

The example shown above is just a single PY file. In practice, our projects usually have multiple PY files, and these files usually form a complex directory hierarchy with various import relationships among each other.

One of the biggest pitfalls of the Cython tool is that the directory level information of the code file is lost in the file code it processes. As shown in the figure below, the c.py converted code is no different from the code generated by m/C.py.

This leads to a very big problem: if the a.py or B.py code has a reference to the c. py module in the m directory, the directory information will be lost, and the two can not find the corresponding module when they execute the import m.C!

Fortunately, experiments have shown that in the figure above, import works correctly if modules A, B, and C are in the same level of directory.

Xuyuanjun once tried to read the source code of Cython and modified it to keep the directory information so that the generated C code could still import normally. However, due to the short time and insufficient understanding of the mechanism of Python interpreter, he gave up after trying.

After getting stuck on this problem for a long time, I finally chose the clumsy option of expanding the tree of code-level directories into a flat directory structure, which in the case of the example above becomes

A.py
B.py
m_C.py
Copy the code

That alone is not enough; all references to C in A and B need to be corrected to be references to m_C.

This may seem simple, but it’s more complicated than that. In Python, import comes in a variety of complicated forms than just import:

import package import module import package.module import module.class / function import package.module.class / function  import package.* import module.* from module import * from module import module from package import * from package import module from package.module import class / function ...Copy the code

In addition, it is possible to write references directly through modules in your code.

The cost of expanding into a flat structure is to deal with all of the above! Xuanyuan jun helpless under only this bad policy, if you have a better solution also hope not stingy give advice.

2. The problem of Python GIL

The Python converted JAR package started to be used in production, but then a problem was discovered:

Every time the Number of Java concurrency goes up, the JVM periodically crashes

Subsequent analysis of the crash information revealed that the crash was in the Python converted code in the Native code.

  • Is it a Cython bug?
  • Is the converted code cratered?
  • Or is there something wrong with the import fixes above?

Crash cloud hanging over the head for a long time, calm down to think: why the test is normal when no problems, online will crash?

Looking through the crash log again, I found that in the native code, the exception always occurs at the place where malloc allocates memory. Is the memory damaged? It was also discovered that only functional testing was done, not concurrent stress testing, and that the crash scenario was always in a multi-concurrent environment. With multiple threads accessing the JNI interface, Native code will execute in a multi-thread context.

Alert: 99% of Python’s GIL locks are related!

As we all know, due to historical reasons, Python was born in the 1990s, when the concept of multithreading was far from being as popular as it is today. As a product of this era, Python was born as a single-threaded product.

Although Python also has a multithreading library, which allows the creation of multiple threads, because the C version of the interpreter is not thread-safe in terms of memory management, there is a very important lock within the interpreter that restricts Python’s multithreading, so it’s really just a matter of everyone taking turns.

Originally, GIL was managed by the interpreter. Now, after being converted into C code, who is responsible for managing the security of multiple threads?

Since Python provides a set of interfaces called by THE C language to allow Python scripts to be executed in C programs, look at the documentation for the API to see if you can find out.

Fortunately, I did find it:

Obtain the GIL lock:

Release the GIL lock:

GIL locks need to be acquired when JNI calls enter and released when interfaces exit.

With the addition of GIL lock control, the annoying Crash issue has finally been resolved!

The test results

Prepare two identical PY files with the same algorithm function. One is accessed through the Flask Web interface (the Web service is deployed locally 127.0.0.1 to minimize network latency), and the other is converted into A Jar package through the above process.

In Java service, the two interfaces are called 100 times respectively, and the whole test is performed 10 times. The execution time is calculated:

In the above tests, to further distinguish between network latency and the latency of the code execution itself, we timed the entry and exit of the algorithm function, as well as the time before Java executed the interface call and where the result was obtained, so that we could calculate the proportion of the time of the algorithm execution itself in the entire interface call.

  • It can be seen from the results that the interface access executed through Web API only takes 30%+ of the algorithm execution time, and most of the time is spent on network overhead (packet sending and receiving, Flask framework scheduling processing, etc.).

  • Through JNI interface local invocation, the execution time of the algorithm takes up more than 80% of the entire interface execution time, while the Java JNI interface conversion process only takes up 10%+ time, effectively improving the efficiency and reducing the waste of extra time.

  • In addition, looking at the execution part of the algorithm itself, the execution time of the same code converted into Native code is 300~500μs, while the CPython interpretation execution time is 2000~4000μs, which is also quite different.

conclusion

This article provides a new way of thinking about Java calling Python code. For reference only, its maturity and stability are questionable, and access through HTTP Restful interfaces is still the preferred way to interconnect across languages.

As for the method in the article, interested friends are welcome to leave messages.

PS: limited to the level of the author is limited, if there are mistakes in the article, welcome to give advice, so as not to mislead the reader, thank you.

Previous hot reviews

A Java Object memoir: Garbage Collection

Kernel Address Space Adventure 3: Permission management

Who’s moving your HTTPS traffic?

Advertising secrets in routers

Kernel Address Space Adventure 2: Interrupts and Exceptions

DDoS attacks: Infinite war

An SQL injection leads to a big case

Kernel address space adventure: System call

A fantastical journey through HTTP packets

A DNS packet adventure

I am a rogue software thread

Scan code attention, more wonderful