This article is divided into the following parts:

  • The introduction
  • Try a manager
  • Matters needing attention
  • Distributed process

The introduction

With multiple processes, each process has its own copy of every variable. If a variable from the main process is passed to another process and modified there, the change stays in that process, and the variable in the main process is untouched. For changes made by other processes to be visible in the main process, you need to create variables that can be shared between processes.

Here is an example:

from multiprocessing import Process

def f1(x, l):
    x += 1
    l.append(2)

def f2(x, l):
    x -= 2
    l.append(3)

if __name__ == '__main__':
    x = 0
    l = [1]
    p1 = Process(target=f1, args=(x, l))
    p2 = Process(target=f2, args=(x, l))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(x, l)

The output is:

0 [1]

x and l are unchanged in the main process, because the modifications were made to copies inside the other processes.
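For comparison, a single number can be shared with multiprocessing.Value, one of the lower-level tools this series mentions. The following is only a minimal sketch: the names add_one and shared_counter_demo are made up for illustration and are not part of the original example.

```python
from multiprocessing import Process, Value


def add_one(counter):
    # get_lock() guards the read-modify-write so that no increment is lost
    with counter.get_lock():
        counter.value += 1


def shared_counter_demo(workers=4):
    # 'i' is the typecode for a signed C int; the value lives in shared memory
    counter = Value('i', 0)
    procs = [Process(target=add_one, args=(counter,)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(shared_counter_demo())
```

Unlike the plain integer x above, the change made by each child process is visible in the main process after join().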

Try a manager

Previous articles covered ways to share data between processes, such as Queue, Pipe, and Value. The multiprocessing module also provides a higher-level wrapper, the Manager, for creating variables that can be shared between processes. Let's jump right into an example:

from multiprocessing import Process, Manager

def f1(ns, l):
    ns.x += 1
    l.append(2)

def f2(ns, l):
    ns.x -= 2
    l.append(3)

if __name__ == '__main__':
    manager = Manager()
    ns = manager.Namespace()
    l = manager.list([1])
    ns.x = 0
    p1 = Process(target=f1, args=(ns, l))
    p2 = Process(target=f2, args=(ns, l))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(ns, l)

The results are as follows

Namespace(x=-1) [1, 2, 3]

The code above uses manager.Namespace() and manager.list(). The former creates a namespace whose attributes act as shared variables, and the latter creates a shared list. Objects created through a Manager can be modified from different processes.

The other types a Manager can create are listed in the official documentation.
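As one illustration of those other types, the sketch below combines manager.dict() with manager.Lock(). The function names and the 'hits' key are hypothetical. The lock is there because an update such as counts[key] = counts.get(key, 0) + 1 is a separate read and write through the proxy, so two processes could otherwise overwrite each other's result.

```python
from multiprocessing import Process, Manager


def add_many(counts, lock, key, n):
    for _ in range(n):
        # Hold the lock across the read and the write so no update is lost
        with lock:
            counts[key] = counts.get(key, 0) + 1


def count_with_lock(workers=2, n=50):
    with Manager() as manager:
        counts = manager.dict()   # shared dict proxy
        lock = manager.Lock()     # shared lock proxy
        procs = [Process(target=add_many, args=(counts, lock, 'hits', n))
                 for _ in range(workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # copy() returns an ordinary dict that outlives the manager
        return counts.copy()


if __name__ == '__main__':
    print(count_with_lock())
```

The same consideration applies to ns.x += 1 in the example above: it happened to give the expected result there, but nothing in the Namespace itself makes the read and the write atomic.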

Matters needing attention

Sometimes, even with a Manager, you will find that changes made in another process did not take effect, for example when you store a list inside a manager.Namespace() or nest lists inside a manager.list(). The nested list is an ordinary mutable object, not a proxy. Reading it through the proxy hands the process a copy, so modifying it in place changes only that copy, and the manager never learns about the change. To propagate the change, assign the modified object back through the proxy.

The code in this answer illustrates the point so well that I will simply reproduce it here:

import multiprocessing

def f(ns, ls, di):
    ns.x += 1          # reassigned through the proxy: propagated
    ns.y[0] += 1       # nested list modified in place: not propagated
    ns_z = ns.z        # read out, modify, assign back: propagated
    ns_z[0] += 1
    ns.z = ns_z

    ls[0] += 1
    ls[1][0] += 1
    ls_2 = ls[2]
    ls_2[0] += 1
    ls[2] = ls_2

    di[0] += 1
    di[1][0] += 1
    di_2 = di[2]
    di_2[0] += 1
    di[2] = di_2

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    ns = manager.Namespace()
    ns.x = 1
    ns.y = [1]
    ns.z = [1]
    ls = manager.list([1, [1], [1]])
    di = manager.dict({0: 1, 1: [1], 2: [1]})

    print('before', ns, ls, di)
    p = multiprocessing.Process(target=f, args=(ns, ls, di))
    p.start()
    p.join()
    print('after', ns, ls, di)

The run result is

before Namespace(x=1, y=[1], z=[1]) [1, [1], [1]] {0: 1, 1: [1], 2: [1]}

after Namespace(x=2, y=[1], z=[2]) [2, [1], [2]] {0: 2, 1: [1], 2: [2]}

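The results for ns, ls, and di each have three parts. In every case the first and third parts changed and the second did not. The first is a plain value reassigned through the proxy, the second is a nested list modified in place, and the third is a nested list that was read out, modified, and assigned back. Compare the three carefully to see the difference.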
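Besides assigning the modified object back, Python 3.6 and later allow shared objects to be nested: if the inner list is itself created with manager.list(), the outer list stores a proxy and in-place changes to the inner list do propagate. The sketch below assumes Python 3.6 or newer, and the names bump_nested and nested_proxy_demo are made up for illustration.

```python
from multiprocessing import Process, Manager


def bump_nested(outer):
    # outer[0] is itself a list proxy, so this update goes through the manager
    outer[0][0] += 1


def nested_proxy_demo():
    with Manager() as manager:
        inner = manager.list([1])       # managed inner list
        outer = manager.list([inner])   # outer list holds a proxy to it
        p = Process(target=bump_nested, args=(outer,))
        p.start()
        p.join()
        # Read the inner value back in the main process
        return outer[0][0]


if __name__ == '__main__':
    print(nested_proxy_demo())
```

Here the nested value changes from 1 to 2 in the main process, unlike ls[1][0] in the example above, where the inner list was an ordinary list.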

Distributed process

A distributed setup would normally run on two computers, but I only have one, so I open two CMD windows and run two files to simulate distributed operation.

We will create two files, master.py and task1.py:

  • master.py creates a list and is responsible for appending data to it. It does not use Process, because there is only one process and it executes linearly.
  • task1.py pops elements from the list held by master.py.

The implementation idea is as follows:

  • master.py creates the list and keeps adding elements to it in a while True loop
  • master.py exposes the list variable at a network address, protected by an authentication key (the "account and password")
  • task1.py connects to master.py with the same address and key and retrieves the list variable

The master.py file contains the following contents

import random, time
from multiprocessing.managers import BaseManager as bm

l = []

def return_l():
    # Called with no arguments by the manager when get_l() is requested
    return l

if __name__ == '__main__':
    # Just do it this way: register 'get_l' so that it returns the list
    bm.register('get_l', callable=return_l)
    m = bm(address=('127.0.0.1', 5000), authkey=b'abc')
    m.start()
    new_l = m.get_l()
    try:
        while True:
            new = random.randint(0, 100)
            new_l.append(new)
            print('produce {} now all {}'.format(new, new_l))
            time.sleep(2 * random.random())
    except KeyboardInterrupt:
        # Ctrl+C ends the loop so that the shutdown below is reachable
        pass
    m.shutdown()
    print('master exit')

The task1.py file contains the following contents

import random
import time
from multiprocessing.managers import BaseManager as bm

if __name__ == '__main__':
    # The client registers only the name; the callable lives in master.py
    bm.register('get_l')
    m = bm(address=('127.0.0.1', 5000), authkey=b'abc')
    m.connect()
    l = m.get_l()
    while True:
        # pop() raises IndexError if the list is empty
        print('drop {}'.format(l.pop()))
        time.sleep(3 * random.random())

If you read the code above together with the implementation idea, the way distribution works should be clear. Here is how to run it.

After saving the two files, open a CMD window in each file's directory. I will call the two windows CMDM and CMDT.

Enter in CMDM

python master.py

You will see the program running and the list gaining more and more elements. Then enter the following in CMDT:

python task1.py

You will see elements start to disappear from the list printed in CMDM.

The whole process amounts to running two files, as long as master.py is started first. Running two files is equivalent to starting two processes, and starting them on two different computers works the same way, with 127.0.0.1 replaced by the master's network address.
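To check the mechanism without opening two windows, the same round trip can be sketched in a single script. This is an illustrative variant rather than the article's setup: ListManager, get_shared, pop_or_none, and round_trip are hypothetical names, and port 0 is an assumption that asks the operating system to pick a free port, which is then read back from the manager's address attribute.

```python
from multiprocessing.managers import BaseManager

shared = []  # the copy that matters lives in the manager's server process


def get_shared():
    return shared


class ListManager(BaseManager):
    """Subclass so that register() does not modify BaseManager itself."""


ListManager.register('get_l', callable=get_shared)


def pop_or_none(proxy):
    """Pop from the shared list; return None instead of raising when empty."""
    try:
        return proxy.pop()
    except IndexError:
        return None


def round_trip():
    # Plays the role of master.py: start the server and produce two items
    server = ListManager(address=('127.0.0.1', 0), authkey=b'abc')
    server.start()
    try:
        produced = server.get_l()
        produced.append(10)
        produced.append(20)

        # Plays the role of task1.py: connect with the same address and key
        client = ListManager(address=server.address, authkey=b'abc')
        client.connect()
        consumed = client.get_l()

        first = pop_or_none(consumed)   # takes the last item appended
        left = produced.copy()          # plain-list snapshot seen by the producer
        pop_or_none(consumed)           # drains the remaining item
        empty = pop_or_none(consumed)   # list is now empty
        return first, left, empty
    finally:
        server.shutdown()


if __name__ == '__main__':
    print(round_trip())
```

The pop_or_none helper also shows one way to keep a consumer such as task1.py alive when it catches up with the producer, since pop() on an empty shared list raises IndexError in the client.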

Note: the code above runs on Windows. For distributed processes on Linux, see the article by Liao Xuefeng. The code in that article does not run on Windows, so I rewrote it as the Windows version above for reference.
