1. Introduction

References appear in many programming languages, for example in Java’s distinction between passing by value and passing by reference. Object references also deserve special attention when learning Python, especially when passing function arguments, where a misunderstanding can easily cause avoidable bugs. This article covers references under the following topics:

  • Variables and assignments
  • Mutable objects and immutable objects
  • References to function arguments
  • Shallow copy and deep copy
  • Garbage collection
  • Weak references

2. References in Python

2.1 Variables and Assignments

Every Python object has three attributes: identity, type, and value. The identity never changes from the moment the object is created until its memory is reclaimed, and it can be understood as the object’s memory address.

When Python assigns a value to a variable, it stores the memory address of the assigned object in the variable. In other words, Python variables are reference variables: under reference semantics a variable holds only the location of its value, not the value itself.

You can check whether two variables reference the same memory address with the is operator or by comparing the results of id().

  • == compares the contents of two objects, that is, whether their values are equal
  • is compares identity, that is, whether two names refer to the same object at the same memory address; it does not compare values
  • id() returns the identity of an object as an integer (in CPython, its memory address)

a = [1, 2, 3]
c = [1, 2, 3]
print(a is c)  # False
print(a == c)  # True
b = a
a.append(5)
print(a is b)  # True

  • Initialization: creating a new object (as with the list literals above) allocates new space, and the address of the new content is assigned to the variable
  • Assignment between variables: only the reference (the address) is passed
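
The two rules above can be checked directly. The following is a minimal sketch; the names first, second, and alias are illustrative and do not come from the original example.

```python
# Initialization: each list literal builds a new object with its own identity
first = [1, 2, 3]
second = [1, 2, 3]

# Assignment: only the reference is passed, so alias names the same object
alias = first

same_value = first == second        # equal contents
same_object = first is second       # two distinct objects
alias_is_first = alias is first     # one object, two names
ids_match = id(alias) == id(first)  # id() agrees with the `is` check

print(same_value, same_object, alias_is_first, ids_match)  # True False True True
```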

2.2 Mutable objects and Immutable Objects

Everything in Python is an object. Python objects are divided into mutable and immutable objects. The difference is whether the value of an object can be modified without changing the memory address.

  • Mutable objects include dict, list, set, instances of user-defined classes, and so on
  • Immutable objects include the numeric types int and float, the string type str, None, tuple, and so on

Here are a few typical examples:

  • A list is a mutable object: its content changes while its address stays the same

    a = [1, 2, 3]
    print(id(a))
    a.append(5)
    print(id(a))  # same id as before
  • Immutable objects (commonly used values share an address or are cached)

    # Small integers are used frequently, so Python manages them with shared addresses
    a = 1
    b = 1
    print(a is b)  # True
    # For word-like str values, Python uses a shared cache so they share an address
    a = 'hello'
    b = 'hello'
    print(a is b)  # True
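  • Immutable objects (do not share addresses)

    # Results shown are for statements typed one at a time in the interactive
    # interpreter; inside a single script CPython may merge equal constants
    a = (1999, 1)
    b = (1999, 1)
    print(a is b)  # False

    a = 'hello everyone'
    b = 'hello everyone'
    print(a is b)  # False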
  • The relative immutability of tuples

    a = (1999, [1, 2])
    ida = id(a)
    a[-1].append(3)
    idb = id(a)

    print(ida == idb)  # True

The reason for discussing mutable and immutable objects is to show that how a variable references an object is not directly related to whether the object is mutable. Mutability is about whether an object can be modified in place. Note that such a modification is not performed through an assignment: assigning to a variable binds it to another object.

a = [1, 2, 3]
print(id(a))
a = a + [5]
print(id(a))  # the address differs before and after: a now refers to a new object
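
For contrast, here is a small sketch of an in-place modification next to a rebinding assignment. The variable names are illustrative; the sketch relies on the fact that += on a list extends the existing object, while a = a + [...] builds a new list.

```python
nums = [1, 2, 3]
original_id = id(nums)

nums += [5]          # in-place: the existing list is extended
same_after_iadd = id(nums) == original_id

nums = nums + [6]    # rebinding: a new list is built and the name is re-pointed
same_after_concat = id(nums) == original_id

print(same_after_iadd, same_after_concat)  # True False
print(nums)  # [1, 2, 3, 5, 6]
```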

2.3 References to Function Arguments

Python passes function arguments by sharing (call by sharing): each parameter of a function is an alias of the corresponding argument, that is, a copy of the reference. As a result, a function can modify an argument that is a mutable object, because the parameter and the argument refer to the same object, but it cannot change which object the caller’s variable refers to.

def func(d):
    d['a'] = 10
    d['b'] = 20  # changes the value of the external argument
    d = {'a': 0, 'b': 1}  # assignment: the local name d is bound to a new object
    print(d)  # {'a': 0, 'b': 1}

d = {}
func(d)
print(d)  # {'a': 10, 'b': 20}

Avoid writing code like the example above, where a local variable and a global variable share the same name. Give them distinct names; otherwise it is easy to introduce a bug without noticing.
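
The same behaviour can be shown with distinct names for parameters and arguments. This is only a sketch: the functions add_item, rebind, and increment are hypothetical helpers written for this illustration.

```python
def add_item(items):
    items.append(99)   # mutates the caller's list through the shared reference

def rebind(items):
    items = [0]        # rebinds only the local name; the caller is unaffected
    return items

def increment(n):
    n += 1             # int is immutable, so a new object is bound to the local n
    return n

data = [1, 2]
add_item(data)
rebind(data)

count = 10
result = increment(count)

print(data, count, result)  # [1, 2, 99] 10 11
```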

Avoid mutable objects as default values for function parameters; use None instead. The default value is stored as an attribute of the function object, so if it is a mutable object and you modify it, every subsequent call that relies on the default is affected.

class bus():
    def __init__(self, param=[]):
        self.param = param

    def test(self, elem):
        self.param.append(elem)

b = bus([2, 3])
b.param  # [2, 3]

c = bus()
c.test(3)
c.param  # [3]

d = bus()
d.param  # [3]: c modified the list that the default value refers to
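
A possible version that follows the None recommendation is sketched below. The class name Bus is a renamed variant used only for this illustration, and copying the caller’s list with list(param) is an extra design choice, not something the original text requires.

```python
class Bus:
    def __init__(self, param=None):
        # Build a fresh list per instance; copy the caller's list if one is given
        self.param = [] if param is None else list(param)

    def test(self, elem):
        self.param.append(elem)

c = Bus()
c.test(3)
d = Bus()

print(c.param, d.param)  # [3] []
```

Because each instance now owns its own list, changing c no longer leaks into d.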

2.4 Shallow copy and Deep Copy

With mutable objects, always keep their mutability in mind. When you modify a variable’s content after an assignment or a copy, consider whether the change also affects the original variable; if a program has a puzzling bug, this is a good place to look. Copying means making a duplicate of an object, and there are two kinds of copy:

  • Shallow copy: only the top-level object is copied. For nested data structures, the inner elements are still references to the original objects
  • Deep copy: all objects are copied recursively, so the copy is completely independent of the original

For immutable objects, both a shallow copy and a deep copy return the original object, so the address is the same. An immutable object that contains nested mutable elements behaves differently:

import copy

test_a = (1, 2, 3)
test_b = copy.copy(test_a)
test_c = copy.deepcopy(test_a)
print(test_a is test_b)  # True
print(test_a is test_c)  # True

# A tuple that contains a mutable element
test_a = (1, 2, [3, 4])
test_b = copy.copy(test_a)
test_c = copy.deepcopy(test_a)
test_a[2].append(5)  # change the mutable element inside the immutable object
print(test_a is test_b)  # True
print(test_a is test_c)  # False
print(test_c)  # (1, 2, [3, 4])

For mutable objects, a copy creates a new top-level object. If the copy is shallow, the internally nested mutable objects just copy the references, so they affect each other. Deep copy does not have this problem.

l1 = [3, [66, 55, 44], (2, 3, 4)]
l2 = list(l1)  # l2 is a shallow copy of l1

# Top-level changes do not affect each other, because l1 and l2 are two different objects
l1.append(50)
print(l1)  # [3, [66, 55, 44], (2, 3, 4), 50]
print(l2)  # [3, [66, 55, 44], (2, 3, 4)]

# Nested mutable elements are shared, so a change is visible through both lists
l1[1].append(100)
print(l1)  # [3, [66, 55, 44, 100], (2, 3, 4), 50]
print(l2)  # [3, [66, 55, 44, 100], (2, 3, 4)]

# Nested immutable elements: the operation creates a new object, so l2 is not affected
l1[2] += (2, 3)
print(l1)  # [3, [66, 55, 44, 100], (2, 3, 4, 2, 3), 50]
print(l2)  # [3, [66, 55, 44, 100], (2, 3, 4)]
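
To see that a deep copy avoids this sharing, the same kind of list can be copied with copy.deepcopy. This is a short sketch; the names shallow and deep are illustrative.

```python
import copy

l1 = [3, [66, 55, 44], (2, 3, 4)]
shallow = copy.copy(l1)     # new top-level list, inner objects shared
deep = copy.deepcopy(l1)    # inner mutable objects are copied as well

l1[1].append(100)

print(shallow[1])  # [66, 55, 44, 100]
print(deep[1])     # [66, 55, 44]

shallow_shares_inner = shallow[1] is l1[1]
deep_shares_inner = deep[1] is l1[1]
print(shallow_shares_inner, deep_shares_inner)  # True False
```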

2.5 Garbage Collection

Python’s garbage collection strategy is based on reference counting and mark-sweep + generational collection.

  • Reference counting: Python maintains a reference count for every object. When a reference to an object is created or copied, the count increases by 1; when a reference is destroyed, the count decreases by 1. When the count drops to zero, no variable refers to the object any longer, so the memory it occupies can be freed. Use sys.getrefcount() to view an object’s reference count
  • Generational collection: this mainly improves the efficiency of garbage collection. Objects are created and destroyed at different rates, and Python has to detect which objects are garbage before it can reclaim them. When there are many objects, this detection becomes time-consuming and inefficient, so Python groups objects into generations and checks each generation at a different frequency. An object’s generation depends on how long it has survived: an object that is still reachable after several consecutive checks is promoted to an older generation, which is checked less often. By default Python divides all objects into three generations; generation 0 contains the newest objects and generation 2 the oldest
  • Circular references: an object refers to itself directly or indirectly, so the chain of references forms a ring. The reference count of such an object can never reach zero. Objects that can refer to other objects are called containers, and circular references can only occur between containers. Python’s garbage collector takes advantage of this to find the objects that need to be released
import sys
a = [1, 2]
b = a

print(sys.getrefcount(a))  # 3: the call itself also references the object once
del b
print(sys.getrefcount(a))  # 2
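
The circular-reference case can be observed with the gc module. The sketch below assumes CPython; Node is a hypothetical class, and a weakref.ref is used only as a probe to check whether the object still exists without keeping it alive.

```python
import gc
import weakref

class Node:
    """A minimal container that can point at another Node."""
    def __init__(self):
        self.partner = None

gc.disable()  # keep the automatic collector out of the demonstration
left, right = Node(), Node()
left.partner, right.partner = right, left  # the two objects now form a ring
probe = weakref.ref(left)                  # observes left without owning it

del left, right  # no names remain, yet the counts never reach zero
alive_before = probe() is not None

gc.collect()     # the cycle detector finds and frees the unreachable pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)  # True False
```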

3. Weak References

Weak references exist in many high-level languages. As mentioned earlier, the garbage collection mechanism destroys an object when its reference count reaches zero. Sometimes, however, you need to refer to an object without increasing its reference count. What is the benefit?

  • Caching: an object should stay in a cache only as long as it is in use elsewhere. The cached object is available while the referent exists, and None is returned once it no longer exists
  • Because the reference count does not increase, the risk of memory leaks caused by circular references is reduced

This is called a weak reference: it does not increase the reference count of the object. The target object of a reference is called the referent.

import weakref
a_set = {0, 1}
wref = weakref.ref(a_set)  # create a weak reference
print(wref())  # {0, 1}
a_set = {2, 3, 4}  # the original set's reference count drops to 0 and it is garbage collected
print(wref())  # None: the referent has been collected, so the weak reference returns None

Weak references are generally used through the collections in the weakref module: weakref.WeakValueDictionary and weakref.WeakKeyDictionary, which differ in whether the values or the keys are weakly referenced. In addition, there is weakref.WeakSet.
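
As an illustration of the caching use case, here is a minimal sketch built on weakref.WeakValueDictionary. The Image class and the "logo" key are hypothetical, and the immediate removal of the entry relies on CPython’s reference counting.

```python
import weakref

class Image:
    """Stand-in for an expensive object held in a cache."""
    def __init__(self, name):
        self.name = name

cache = weakref.WeakValueDictionary()

img = Image("logo")
cache["logo"] = img  # the cache holds only a weak reference to img
cached_while_alive = cache.get("logo") is img

del img              # the last strong reference is gone
cached_after_del = cache.get("logo")

print(cached_while_alive, cached_after_del)  # True None
```

The entry disappears on its own once nothing else uses the object, so the cache never keeps an object alive by itself.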

4. Summary

This article has described several aspects of references in Python, which I hope will be helpful. In summary:

  • Assigning an object creates a reference; a variable is a reference to an address
  • Always be aware that modifying a mutable object through one reference changes what every other variable referring to it sees
  • A function can modify arguments that are mutable objects
  • A shallow copy copies only the top level. Nested mutable objects inside the copy are still references to the originals
  • When an object’s reference count reaches zero, the object is garbage collected
  • Weak references do not increase the reference count; they live and die with the referent and do not keep objects in a reference cycle alive

About the author: WeDO Experimental Jun, data analyst; Love life, love writing
