Using multiprocessing in Jupyter: the code runs without reporting an error, but produces no output

The code runs fine in PyCharm, but I need to run it in Jupyter. The problem is that in Jupyter the program keeps running indefinitely and produces no output.

In Jupyter Notebook/Lab, the AttributeError is not reported in the interface; the cell simply stays in the running state. The underlying error is "Can't get attribute 'XXX' on ...": for some reason, Pool cannot always use objects that are not defined in an imported module.
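Because the notebook shows no traceback, one way to avoid an endlessly running cell is to wait for the result with a timeout. The following is a minimal sketch, not the method used in this post: `map_with_timeout` is a hypothetical helper name, and `math.sqrt` merely stands in for any function that lives in an importable module.

```python
import math
from multiprocessing import Pool, TimeoutError


def map_with_timeout(func, values, seconds=30):
    """Run func over values in a small pool; return None if no result arrives in time.

    map_with_timeout is a hypothetical helper written for this illustration.
    """
    with Pool(2) as pool:
        pending = pool.map_async(func, values)  # returns immediately
        try:
            return pending.get(timeout=seconds)  # wait at most `seconds`
        except TimeoutError:
            return None  # leaving the with-block terminates the workers


if __name__ == "__main__":
    # math.sqrt lives in an importable module, so worker processes can find it.
    print(map_with_timeout(math.sqrt, [4.0, 9.0, 16.0]))
```

If the workers can never load the function, the call gives up after `seconds` and returns None instead of blocking the cell forever.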

As the previous blog post explained, the root cause is this: on Linux, a new process is forked, so it inherits the parent's global variables and functions; on Windows, a brand-new process is created that has none of those global variables and functions, so the problem occurs.
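To see which of these behaviours applies on a given machine, the standard multiprocessing module can report its start methods. This is a small sketch for illustration; the `NOTES` table and the `explain_start_methods` helper are hypothetical names, not part of any library.

```python
import multiprocessing as mp

# Hypothetical lookup table for illustration: what a child process sees
# of the parent's module-level names under each start method.
NOTES = {
    "fork": "child is a copy of the parent, so names defined in the notebook are visible",
    "spawn": "child starts a fresh interpreter and must import every function it runs",
    "forkserver": "child is forked from a clean server process and must import every function it runs",
}


def explain_start_methods():
    """Return (method, note) pairs for every start method this platform supports."""
    return [(m, NOTES.get(m, "not described here")) for m in mp.get_all_start_methods()]


if __name__ == "__main__":
    print("start method in use:", mp.get_start_method())
    for method, note in explain_start_methods():
        print(f"{method}: {note}")
```

On Windows only `spawn` is listed, which matches the behaviour described above.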

② For the error AttributeError: module '__main__' has no attribute '__spec__', set the following under if __name__ == '__main__': __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)"

Jupyter can only track the main process, not the child processes. There are other solutions online, but personally I think the easiest is to put the code into a .py file and then have Jupyter execute that file.
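One way to have a notebook execute a .py file is to start it in a separate interpreter and capture what it prints. The sketch below is only an illustration under assumptions: the `SCRIPT` body, the file name `pool_job.py`, and the `run_as_script` helper are all hypothetical, and a real script would hold your own code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical script body for illustration only.
SCRIPT = '''\
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))
'''


def run_as_script(source: str) -> str:
    """Write source to a temporary .py file, run it in a new interpreter, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "pool_job.py"
        path.write_text(source, encoding="utf-8")
        done = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True, timeout=120,
        )
    return done.stdout


if __name__ == "__main__":
    print(run_as_script(SCRIPT))
```

Because the script is a real file, its worker processes can import it and find `square`; with `check=True`, a failing script raises an exception in the notebook rather than failing silently.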

I'm using the multiprocessing module, but I'm still learning how it works. I'm working through a book by Dusty Phillips, and this code comes from that book.

import multiprocessing  
import random
from multiprocessing.pool import Pool

def prime_factor(value):
    factors = []
    for divisor in range(2, value-1):
        quotient, remainder = divmod(value, divisor)
        if not remainder:
            # Found a divisor: factor both parts recursively, then stop.
            factors.extend(prime_factor(divisor))
            factors.extend(prime_factor(quotient))
            break
    else:
        # for/else: the loop found no divisor, so value itself is prime.
        factors = [value]
    return factors

if __name__ == '__main__':
    pool = Pool()
    to_factor = [random.randint(100000, 50000000) for i in range(20)]
    results = pool.map(prime_factor, to_factor)
    for value, factors in zip(to_factor, results):
        print("The factors of {} are {}".format(value, factors))

In Windows PowerShell (not in a Jupyter notebook), I saw the following:

Process SpawnPoolWorker-5:
Process SpawnPoolWorker-1:
AttributeError: Can't get attribute 'prime_factor' on <module '__main__' (built-in)>

Why does the cell never stop running?

The solution

In Jupyter notebooks, the problem seems to stem from the design itself. Therefore, we have to write the function (prime_factor) in a separate file and import it as a module. We also have to adjust the calling code accordingly. For example, in my case, I put the function in a file called defs.py:

def prime_factor(value):
    factors = []
    for divisor in range(2, value-1):
        quotient, remainder = divmod(value, divisor)
        if not remainder:
            # Found a divisor: factor both parts recursively, then stop.
            factors.extend(prime_factor(divisor))
            factors.extend(prime_factor(quotient))
            break
    else:
        # for/else: the loop found no divisor, so value itself is prime.
        factors = [value]
    return factors

Then, in the Jupyter notebook, I wrote the following lines:

import multiprocessing  
import random
from multiprocessing import Pool
import defs



if __name__ == '__main__':
    pool = Pool()
    to_factor = [random.randint(100000, 50000000) for i in range(20)]
    results = pool.map(defs.prime_factor, to_factor)
    for value, factors in zip(to_factor, results):
        print("The factors of {} are {}".format(value, factors))

That solved my problem.
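A quick way to check in advance whether a function is at risk of this error is to look at where it was defined. The sketch below assumes a notebook on a platform that creates brand-new worker processes (as on Windows); `lives_in_main` and `notebook_function` are hypothetical names used only for illustration.

```python
import math


def lives_in_main(func) -> bool:
    """Return True if func was defined in the running session (module '__main__').

    Hypothetical helper for illustration: in a notebook on a platform that
    spawns worker processes, such functions are the ones Pool cannot find.
    """
    return getattr(func, "__module__", None) == "__main__"


def notebook_function(x):
    return x + 1


# Simulate a function typed into a notebook cell, wherever this file is run from.
notebook_function.__module__ = "__main__"

if __name__ == "__main__":
    print(lives_in_main(math.sqrt))          # False: defined in the math module
    print(lives_in_main(notebook_function))  # True: defined in __main__
```

A function for which this returns True is a candidate for moving into a separate file such as defs.py before it is passed to pool.map.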
