Introduction

This article introduces how threads share global variables and examines how competition for shared resources can leave Python threads out of sync.

It then uses the thread Lock mechanism to achieve thread synchronization.


Multithreading: sharing global variables

import time
from threading import Thread


g_num = 100


def work1():
    global g_num
    for i in range(3):
        g_num += 1

    print("----in work1, g_num is %d---" % g_num)


def work2():
    global g_num
    print("----in work2, g_num is %d---" % g_num)


def main():
    print("----before the threads are created, g_num is %d---" % g_num)

    t1 = Thread(target=work1)
    t1.start()

    # Wait a moment to make sure the work in thread t1 is done
    time.sleep(1)

    t2 = Thread(target=work2)
    t2.start()


if __name__ == "__main__":
    main()

Running results:

----before the threads are created, g_num is 100---
----in work1, g_num is 103---
----in work2, g_num is 103---


Passing a list to a thread as an argument

import time
from threading import Thread


def work1(nums):
    nums.append(44)
    print("----in work1---", nums)


def work2(nums):
    # Wait a moment to make sure the work in thread t1 is done
    time.sleep(1)
    print("----in work2---", nums)


g_nums = [11, 22, 33]

t1 = Thread(target=work1, args=(g_nums,))
t1.start()

t2 = Thread(target=work2, args=(g_nums,))
t2.start()

Running results:

----in work1--- [11, 22, 33, 44]
----in work2--- [11, 22, 33, 44]


  • Global variables are shared by all threads within a process, which makes it easy to share data between threads
  • The disadvantage is that any thread can modify a global variable at will, so concurrent updates can corrupt it (that is, global variables are not thread-safe)


Python multithreaded resource contention issues

We’ll simulate the resource contention problem by defining a custom Thread class that inherits from the threading.Thread class.

Code demo

""" Python multithreaded synchronization problem """
import time
import threading


# Variables shared between threads
num1 = 0
num2 = 0


class NumIncrement(threading.Thread):
    """ Custom increment thread class """

    def __init__(self, count):
        super().__init__()
        self.count = count

    def run(self):
        self.num_increment()

    def num_increment(self):
        """ Increment the shared number """
        global num1
        for i in range(self.count):
            num1 += 1


def sync_test():
    """ Multithreaded synchronization test """
    global num1, num2

    print('num1=%d' % num1)
    print('num2=%d' % num2)

    count = 1000
    t1 = NumIncrement(count)
    t2 = NumIncrement(count)

    t1.start()
    t2.start()
    t1.join()
    t2.join()

    # Single-thread increment, for comparison
    for i in range(2):
        for j in range(count):
            num2 += 1

    print('num1=%d' % num1)
    print('num2=%d' % num2)


def main():
    sync_test()


if __name__ == '__main__':
    main()


Results

With count = 1000, the result is as follows:

num1=0
num2=0
num1=2000
num2=2000
[Finished in 0.1s]


But if you set count to 1000000 (1 million) or more, you may find that the multithreaded increment no longer produces the right number, and the result may differ on every run.

With count = 1000000, the result is as follows:

num1=0
num2=0
num1=1377494
num2=2000000
[Finished in 0.4s]

If multiple threads operate on the same global variable at the same time, resource contention can occur and produce incorrect results.

The statement num += 1 is equivalent to num = num + 1.


Problem analysis

Suppose num = 100. The first thread runs in its time slice, reads num, and computes num + 1 = 101, but before it can assign the result back, the scheduler switches to the second thread. The second thread also reads num (still 100), adds 1, and assigns num = 101. The scheduler then switches back to the first thread, which has already finished the addition and simply carries on with its pending assignment num = 101. Two increments have run, but num was assigned 101 twice instead of reaching 102, so one update is lost and the data is wrong.
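The interleaving described above can be replayed by hand. The sketch below uses no real threads; the names t1_result and t2_result are made up for illustration and simply stand for each thread's private intermediate value.

```python
# Replay the interleaving step by step; no real threads are involved.
num = 100

# Thread 1 reads num and computes the sum, then is switched out.
t1_result = num + 1      # 101, not yet assigned back

# Thread 2 runs its whole increment.
t2_result = num + 1      # also reads 100
num = t2_result          # num is now 101

# Thread 1 resumes and performs its pending assignment.
num = t1_result          # assigns 101 again

print(num)  # 101, although two increments ran
```

Both "threads" finish their increment, yet the second assignment overwrites the first with the same value, which is exactly the lost update.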

Why is it hard to get a wrong result when count is as small as 1000? A likely explanation is that the CPU completes so few +1 operations within a single time slice, so no thread switch happens in the middle of an increment. With millions or tens of millions of iterations, one time slice is no longer enough to complete all the operations, and a thread can be switched out in the middle of a non-atomic operation.

It is a well-known rule of multithreaded programming that atomic operations do not need synchronization. An atomic operation is one that cannot be interrupted by thread scheduling: once it starts, it runs through to the end without any context switch.

num += 1 is a non-atomic operation: the addition num + 1 is performed first, followed by a separate assignment to num.
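One way to see the separate steps is to look at the bytecode CPython generates for the statement. This is a small sketch using the standard dis module; the function name increment is made up for the example, and the exact instruction names vary between CPython versions, so no output is shown.

```python
import dis

num = 0


def increment():
    global num
    num += 1


# Collect the names of the bytecode instructions in the function body.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```

In the list that is printed, the instruction that loads the global num comes before the one that stores it, with the addition in between. A thread switch between the load and the store is what makes the lost update possible.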


Thread synchronization

The concept of synchronization

Synchronization means coordinating pace and running in a predetermined order, for example: you finish speaking, and then I speak.

The character "同" (tong) in the Chinese word for synchronization, 同步, is easily read literally as "acting together at the same time".

In fact it does not mean that; here "同" refers to coordination, assistance, and mutual cooperation.

For example, synchronizing two processes or threads A and B means that they cooperate: when A reaches a point where it depends on a result from B, it stops and signals B to run; B executes and hands the result to A, and A then continues.
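The A-waits-for-B cooperation described above can be sketched with threading.Event. The names thread_a, thread_b, result, and ready are invented for this example, and the value 42 is arbitrary.

```python
import threading

result = {}
ready = threading.Event()    # signals that B's result is available


def thread_b():
    result["value"] = 42     # B produces the value that A depends on
    ready.set()              # tell A it may continue


def thread_a():
    ready.wait()             # A stops here until B signals
    result["doubled"] = result["value"] * 2


a = threading.Thread(target=thread_a)
b = threading.Thread(target=thread_b)
a.start()
b.start()
a.join()
b.join()

print(result["doubled"])  # 84
```

Even though A is started first, it cannot run past ready.wait() until B has stored its result, so the order of the two steps is fixed.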


Thread locking mechanism

Mutex

Synchronization control is needed when multiple threads modify the same shared data at almost the same time.

Thread synchronization ensures that multiple threads can safely access a contended resource. The simplest synchronization mechanism is a mutex (mutual-exclusion lock).

A mutex gives a resource a state: locked or unlocked.

When a thread wants to change shared data, it first locks the resource. The resource is then in the locked state and other threads cannot change it. Only after the thread releases the resource, returning it to the unlocked state, can another thread lock it. As long as every thread acquires the lock before writing, the mutex guarantees that only one thread writes at a time, which keeps the data correct under multithreading.


The calculation error described above can be solved with thread synchronization.

The idea is as follows:

  1. The system schedules t1, which locks g_num so that no other thread may operate on it, and reads its value, 0
  2. t1 adds 1 to the value of g_num
  3. t1 unlocks g_num, whose value is now 1. Other threads that use g_num see 1 instead of 0
  4. Similarly, when other threads modify g_num, they must lock it first and unlock it after processing. While the lock is held, no other thread may access g_num, which keeps the data correct

The threading module defines the Lock class to handle locking easily:

import threading

# create a lock
mutex = threading.Lock()

# lock
mutex.acquire()

# release
mutex.release()


""" Python mutex solves the problem of multithreaded resource contention """
import time
import threading


# Variable shared between threads
g_num = 0

# Create a mutex
# The default state is unlocked
mutex = threading.Lock()


def work1(num):
    global g_num
    for i in range(num):
        mutex.acquire()  # lock
        g_num += 1
        mutex.release()  # unlock

    print("---work1---g_num=%d" % g_num)


def work2(num):
    global g_num
    for i in range(num):
        mutex.acquire()  # lock
        g_num += 1
        mutex.release()  # unlock

    print("---work2---g_num=%d" % g_num)


def mutex_test():
    """ Mutex test """

    # Create two threads and let each increment g_num 1,000,000 times
    count = 1000000
    t1 = threading.Thread(target=work1, args=(count,))
    t1.start()

    t2 = threading.Thread(target=work2, args=(count,))
    t2.start()

    # Wait for the calculation to complete
    # len(threading.enumerate()) is the number of threads currently alive
    # A value of 1 means that only the main thread is left
    while len(threading.enumerate()) != 1:
        time.sleep(1)

    print("The end result of two threads operating on the same global variable is :%s" % g_num)


def main():
    mutex_test()


if __name__ == '__main__':
    main()


The running results are as follows:

---work1---g_num=1974653
---work2---g_num=2000000
The end result of two threads operating on the same global variable is :2000000

As the final result shows, the total is as expected once the mutex is added.
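The same fix can be applied to the class-based NumIncrement demo from the contention section. The following is a sketch rather than the original author's code: SafeNumIncrement, num1_lock, and run_safe_increment are names invented here, and the count is reduced to keep the run short.

```python
import threading

num1 = 0
num1_lock = threading.Lock()    # hypothetical name; guards num1


class SafeNumIncrement(threading.Thread):
    """Increment thread that takes the lock around every update."""

    def __init__(self, count):
        super().__init__()
        self.count = count

    def run(self):
        global num1
        for _ in range(self.count):
            with num1_lock:
                num1 += 1


def run_safe_increment(count):
    """Reset num1, run two increment threads, and return the total."""
    global num1
    num1 = 0
    threads = [SafeNumIncrement(count) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return num1


print(run_safe_increment(100000))  # 200000
```

Because every read-modify-write of num1 happens while the lock is held, the total is always twice the count, at the cost of some extra time spent acquiring and releasing the lock.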


Note:

  • If the lock is not currently locked, acquire() returns immediately without blocking
  • If the lock is already held by another thread when acquire() is called, the call blocks until the lock is released
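The two cases in the note can be observed without hanging the program by passing blocking=False, which makes acquire() return False instead of waiting. This is a minimal single-threaded sketch; the variable names first and second are made up.

```python
import threading

lock = threading.Lock()

first = lock.acquire(blocking=False)    # lock is free: succeeds at once
second = lock.acquire(blocking=False)   # lock is held: fails without waiting
lock.release()

print(first, second)  # True False
```

With the default blocking=True, the second call would wait until the lock is released, which here would be forever, since threading.Lock is not reentrant.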


Deadlock

When threads share multiple resources, a deadlock occurs if two threads each hold part of the resources and each waits for the resource held by the other.

Although deadlocks are rare, when one occurs the application stops responding. Here is an example of a deadlock:

""" Python deadlock demo """
import time
import threading


mutexA = threading.Lock()
mutexB = threading.Lock()


class MyThread1(threading.Thread):

    def run(self):

        # Lock mutexA
        mutexA.acquire()

        # After mutexA is locked, wait 1 second so the other thread can lock mutexB
        print(self.name + '----do1---up----')
        time.sleep(1)

        # This blocks because mutexB is already locked by the other thread
        mutexB.acquire()
        print(self.name + '----do1---down----')
        mutexB.release()

        # Unlock mutexA
        mutexA.release()


class MyThread2(threading.Thread):

    def run(self):
        # Lock mutexB
        mutexB.acquire()

        # After mutexB is locked, wait 1 second so the other thread can lock mutexA
        print(self.name + '----do2---up----')
        time.sleep(1)

        # This blocks because mutexA is already locked by the other thread
        mutexA.acquire()
        print(self.name + '----do2---down----')
        mutexA.release()

        # Unlock mutexB
        mutexB.release()


def main():
    t1 = MyThread1()
    t2 = MyThread2()

    t1.start()
    t2.start()


if __name__ == '__main__':
    main()

Running results:

Thread-1----do1---up----
Thread-2----do2---up----


The program is now deadlocked. Press Ctrl-C to exit.


Avoiding deadlock

  • Design the program to avoid deadlock in the first place (for example, with the banker’s algorithm)
  • Add a timeout when acquiring a lock
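The timeout approach can be sketched with the timeout parameter of Lock.acquire(), which returns False if the lock could not be taken in time. The helpers acquire_both and release_both are hypothetical names for this example, not part of the threading module.

```python
import threading

mutexA = threading.Lock()
mutexB = threading.Lock()


def acquire_both(first, second, timeout=1.0):
    """Take both locks, or back off if the second is not free in time."""
    first.acquire()
    if second.acquire(timeout=timeout):
        return True
    first.release()    # give up the first lock so other threads can progress
    return False


def release_both(first, second):
    """Release the locks in reverse order of acquisition."""
    second.release()
    first.release()


if acquire_both(mutexA, mutexB):
    print("got both locks")
    release_both(mutexA, mutexB)
```

A thread that gets False back holds neither lock, so it can wait a moment and try again instead of blocking forever as in the deadlock demo. A real program would probably add a short random delay before retrying so that two threads do not keep colliding.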


Banker’s algorithm

Background knowledge

The banker’s problem is how a banker can safely lend a certain amount of money to several clients so that they can borrow to do what they want, and at the same time the banker can get all his money back without going bankrupt. The problem is similar to that of resource allocation in an operating system: the banker is like an operating system, the client is like a running process, and the banker’s money is the system’s resource.


Description of the problem

A banker has a certain amount of money and a number of clients who want loans. Each client must state at the outset the total amount of the loan required. The banker may accept a client’s request provided that the client’s total loan does not exceed the banker’s total funds. Loans are made one unit at a time (for example, RMB 10,000). A client may have to wait before the required amount is fully borrowed, but the banker must guarantee that this wait is finite and that the loan can be completed.

For example, three clients C1, C2, and C3 borrow from the banker. The banker’s total capital is 10 units. C1 needs to borrow 9 units, C2 needs 3 units, and C3 needs 8 units, 20 units in total. The state at a given moment is shown in the figure.

In state (a) of the figure, C1 still needs 7 units, C2 still needs 1 unit, and C3 still needs 4 units, while the banker has 2 units left. By the requirement of a safe sequence, the first client chosen must need no more than the money the banker currently has left. Only C2 qualifies: C2 needs 1 more unit and the banker holds 2 units, so the banker lends 1 unit to C2, who completes the work and repays all 3 units borrowed. This leads to state (b), in which the banker holds 4 units.

In the same way, the banker lends 4 units to C3, who completes the work and repays the loan. In state (c) only C1 remains, needing 7 units; the banker now holds 8 units, so C1 can also borrow and finish. In the end (state (d)) the banker has recovered all 10 units with no loss. The client sequence {C2, C3, C1} is therefore a safe sequence, and lending in that order is safe for the banker.

If instead the banker lends 4 units to C1 in state (b), the state becomes unsafe: neither C1 nor C3 can complete their work, and the banker has no money left. The system is deadlocked and the banker cannot recover the loans.

To sum up, starting from the current state, the banker’s algorithm repeatedly looks for a client whose remaining need can be met, assumes that this client finishes the work and repays all loans, and then looks for the next client who can finish, and so on. If every client can finish in this way, a safe sequence has been found and the banker is safe.
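The safety check summarized above can be sketched in a few lines. This is a simplified single-resource version under stated assumptions: find_safe_sequence is a name made up for this example, and the amounts already lent in state (a) (2, 2, and 4 units) are derived from the totals and remaining needs given in the text.

```python
def find_safe_sequence(available, allocated, maximum):
    """Return a safe client order, or None if the state is unsafe."""
    # Remaining need of each client: declared maximum minus what is already lent.
    need = {c: maximum[c] - allocated[c] for c in maximum}
    pending = list(maximum)
    sequence = []
    while pending:
        for client in pending:
            if need[client] <= available:
                # The client borrows the rest, finishes, and repays everything,
                # so the banker's funds grow by what was already lent.
                available += allocated[client]
                sequence.append(client)
                pending.remove(client)
                break
        else:
            return None    # no remaining client can finish: unsafe state
    return sequence


maximum = {"C1": 9, "C2": 3, "C3": 8}

# State (a): the banker has 2 units left.
state_a = find_safe_sequence(2, {"C1": 2, "C2": 2, "C3": 4}, maximum)
print(state_a)  # ['C2', 'C3', 'C1']

# After C2 has finished, lending 4 units to C1 leaves the banker with 0.
unsafe = find_safe_sequence(0, {"C1": 6, "C3": 4}, {"C1": 9, "C3": 8})
print(unsafe)  # None
```

A banker following this check would grant a request only if the state that results from it still has a safe sequence.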

