• Optimization in Python — Interning
  • Author: Chetan Ambi
  • The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: samyu2000
  • Proofread by: Caiyundong

Optimization in Python: Interning

Several Python interpreters are available today, including CPython, Jython, and IronPython. The optimization techniques discussed here apply to CPython, the standard Python interpreter.

Interning

Interning means reusing existing objects on demand instead of creating new ones. Let’s look at some examples to understand how interning works for integer and string objects.

is: an operator that checks whether two Python objects occupy the same memory location, that is, whether they are the same object.

id(): a function that returns an object's identity; in CPython this is its memory address as a decimal integer.
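To make these two tools concrete, here is a minimal sketch. The names first, alias and clone are purely illustrative and do not come from the original article.

```python
first = [1, 2, 3]
alias = first          # a second name for the same object
clone = list(first)    # a new object with equal contents

print(first is alias)          # True: one object, two names
print(first is clone)          # False: two separate objects
print(first == clone)          # True: the contents are equal
print(id(first) == id(alias))  # True: same identity
```

Note that == compares values while is compares identities, which is why the clone is equal to the original without being the same object.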

Interning of integer objects

At startup, CPython preloads integer objects for every value from -5 to 256. Whenever we create an integer in that range, Python returns a reference to the preloaded object instead of creating a new one.

The reason for this optimization is simple: integers from -5 to 256 are used very often, so it makes sense to keep them in memory. Preloading them at startup saves both time and memory.

Case 1

In this case, both variables a and b are assigned the value 100. Since 100 lies in the range -5 to 256, Python uses the interned object, and b points to the same memory location as a instead of to a second object with the value 100.

In other words, a and b refer to the same object in memory. Python does not create a new object for b; it reuses the one a already points to. This is integer interning at work.
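Case 1 can be sketched as follows. One assumption to flag: the values are built with int("100") at run time rather than written as literals, because identical literals in a single script may be merged by the compiler regardless of interning. The results in the comments are what CPython gives; other interpreters may differ.

```python
# Build the values at run time so the compiler cannot merge two identical literals.
a = int("100")
b = int("100")

print(a == b)          # True: same value
print(a is b)          # True on CPython: both names refer to the cached 100
print(id(a) == id(b))  # True on CPython: identical memory location
```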

Case 2

In this case, both variables a and b are assigned the value 1000. Since 1000 is outside the range -5 to 256, Python creates two separate integer objects.

As a result, a and b are stored at different locations in memory, even though their values are equal.
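Case 2 can be sketched the same way. As before, int("1000") is an assumption made for the demonstration: it forces the values to be created at run time, similar to typing each assignment on its own line at the interactive prompt. The identity result assumes a CPython release whose cache covers -5 to 256.

```python
# Two run-time integers outside the cached range.
a = int("1000")
b = int("1000")

print(a == b)  # True: the values are equal
print(a is b)  # False on CPython releases whose cache covers -5 to 256
```

If both names were instead assigned the literal 1000 inside one script, the compiler might store a single constant and the identity check could come out True, so the literal form is not a reliable test of interning.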

Interning of string objects

Like integer objects, some string objects are interned. In general, any string that follows the naming rules for identifiers is interned. There are exceptions, though, so don’t rely on interning alone.

Case 1

The string “Data” is a valid identifier, so it is interned, and both variables point to the same memory location.
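A minimal sketch of this case follows. It assumes CPython, and it uses eval() only as a demonstration device: each call compiles its argument separately, much like entering two separate lines at the interactive prompt, so the compiler cannot simply share one constant between the two assignments.

```python
# Each eval() call compiles the literal on its own.
a = eval('"Data"')
b = eval('"Data"')

print(a == b)  # True: equal contents
print(a is b)  # True on CPython: identifier-like string constants are interned
```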

Case 2

The string “Data Science” contains a space and is therefore not a valid identifier, so it is not interned, and the two variables point to different memory locations.
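The same demonstration device can be applied to this case. Again this is a sketch that assumes a default CPython build; eval() stands in for two separately entered lines, and the identity result may differ on other interpreters or builds.

```python
# Each eval() call compiles the literal on its own.
a = eval('"Data Science"')
b = eval('"Data Science"')

print(a == b)  # True: equal contents
print(a is b)  # False on a default CPython build: the string is not interned
```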

The results described above were obtained in Google Colab, using Python version 3.6.9.

In Python 3.6, valid strings are interned only if they are at most 20 characters long; in Python 3.7 the limit was raised to 4096. This limit applies to strings that the compiler computes from constant expressions. As mentioned earlier, these rules vary from one Python version to the next.

Because not all strings are interned, Python provides the function sys.intern() to intern a string explicitly. Avoid using it unless you really need it.
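Here is a minimal sketch of sys.intern(). The strings are assembled with str.join() so that they are created at run time as separate objects; the variable names are illustrative only.

```python
import sys

parts = ["Data", "Science"]

# Without interning: two equal strings built at run time are separate objects on CPython.
plain_a = " ".join(parts)
plain_b = " ".join(parts)
print(plain_a is plain_b)  # False on CPython

# With interning: sys.intern() returns the single shared copy for equal strings.
interned_a = sys.intern(" ".join(parts))
interned_b = sys.intern(" ".join(parts))
print(interned_a is interned_b)  # True
```

Note that sys.intern() returns the interned string, so the return value is what must be kept; calling it and discarding the result has no useful effect on the original variable.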

Why string interning matters

Suppose your application performs string operations frequently. When you compare long strings with the == operator, Python may have to compare them character by character, which takes time. If those long strings are interned, however, equal strings point to the same memory location. Because comparing memory locations is much faster, we can use the is operator to compare them.
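The idea can be sketched as follows, under the assumption that every string taking part in the comparison has been passed through sys.intern(). The variable names and the sample text are hypothetical, and no timing figures are claimed here.

```python
import sys

# Build two equal long strings at run time so they start out as separate objects.
words = ["interning"] * 5000
raw_left = " ".join(words)
raw_right = " ".join(words)

print(raw_left == raw_right)  # True: the contents match
print(raw_left is raw_right)  # False on CPython: two separate objects

# After interning, equal strings are one shared object, so an identity check is enough.
left = sys.intern(raw_left)
right = sys.intern(raw_right)
print(left is right)          # True
```

The is comparison is only trustworthy when both operands are interned; for strings of unknown origin, == remains the correct check.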

Conclusion

After reading this article, you should understand Python’s interning mechanism.

If you’re interested in Python’s mutability and immutability, check out my article.


Thank you for taking the time to read this.