Jupyter is a widely used development environment for data analysis, and writing data analysis code in Jupyter can save a great deal of development time.

Imagine a scenario where a colleague in another department sends you a module for advanced data analysis. The module contains hundreds of functions.

If you write a Python file directly to call the data analysis module, the method is very simple:

from analyze import FathersAnalyzer

data = [...]
father = FathersAnalyzer(data)
result = father.analyze()
print(f'The analysis result is: {result}')

Now you need to call the analysis module from Jupyter. How should you do that?

You might think: isn't that easy? Just put the module file next to the Jupyter Notebook's .ipynb file and import it into Jupyter as you would a normal module, as shown in the figure below:

Now, what happens if I change the analyze.py file?

Let’s change it, as shown in the figure below.

Re-run the code in this Cell. Although the line from analyze import FathersAnalyzer looks as if it re-imports the module, the Cell actually runs the code from before the change.

This is because all the code in a Jupyter Notebook runs in the same runtime. When you import the same module more than once, Python's import mechanism ignores every import after the first and keeps using the module object loaded the first time (which is also why a module can serve as a simple singleton).
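The caching behaviour can be reproduced outside Jupyter as well. The following is a minimal sketch, assuming a throwaway module named demo_cached_mod (a hypothetical name, written to a temporary directory only for this demonstration). It relies on the standard sys.modules cache, where Python keeps every module it has already imported:

```python
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

# Create a throwaway module (hypothetical name) in a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
module_path = tmp_dir / "demo_cached_mod.py"
module_path.write_text("VERSION = 'old'\n")
sys.path.insert(0, str(tmp_dir))

import demo_cached_mod
first = demo_cached_mod.VERSION

# Change the file on disk, then import the module a second time.
module_path.write_text("VERSION = 'new and longer'\n")
import demo_cached_mod  # already in sys.modules, so the file is not re-read
second = demo_cached_mod.VERSION

print(first, second)  # prints: old old
print("demo_cached_mod" in sys.modules)  # prints: True
```

The second import statement only looks the name up in sys.modules, so the edited file is never executed.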

What if I modify the imported package and then want to import it again? There are three options:

  • Restart the entire Notebook. However, this will result in the loss of all variables in the current runtime.
  • Use importlib:

The downside of this approach is obvious: unless you always run every Cell in order, your code will look something like this:

The analysis module needs to be reloaded in each Cell. Otherwise, you may run a Cell on its own against the old code and introduce bugs that are hard to detect.
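As a rough illustration of the importlib option, here is a minimal sketch that uses importlib.reload from the standard library. The module name demo_reload_mod and its analyze function are hypothetical stand-ins for the real analysis module, created in a temporary directory so the example is self-contained:

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

# Hypothetical stand-in for the analysis module.
tmp_dir = Path(tempfile.mkdtemp())
module_path = tmp_dir / "demo_reload_mod.py"
module_path.write_text("def analyze():\n    return 'old result'\n")
sys.path.insert(0, str(tmp_dir))

import demo_reload_mod
before = demo_reload_mod.analyze()

# Simulate editing the module, then reload it explicitly.
module_path.write_text("def analyze():\n    return 'new result after edit'\n")
importlib.reload(demo_reload_mod)  # re-executes the module's source
after = demo_reload_mod.analyze()

print(before)  # prints: old result
print(after)   # prints: new result after edit
```

Note that the sketch calls the function through the module object (demo_reload_mod.analyze). A name bound earlier with from ... import ... keeps pointing at the old object after a reload, which is one more reason the reload call ends up repeated in every Cell.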

  • Use Jupyter's own %autoreload:
%load_ext autoreload
%autoreload 1
%aimport analyze 

data = 123
father = analyze.FathersAnalyzer(data)
result = father.analyze()
print(result)

The running effect is shown in the figure below:

The key code is three lines:

%load_ext autoreload
%autoreload 1
%aimport analyze 

These three lines work only in Jupyter; in a normal .py file they cause an error. Here is what they do: line 1 loads the autoreload extension. Line 2 sets autoreload to reload only the modules imported through %aimport. Line 3 imports the analyze module with %aimport.

Once this is in place, every time any Cell runs, the modules imported with %aimport are reloaded first, so you always use the latest code.

Of course, you can go even further and reduce the special code to two lines:

%load_ext autoreload
%autoreload 2

When %autoreload is set to 2, all modules imported by import XXX will be automatically reloaded each time any Cell is run. The trade-off is that it will run slower.
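Because the % magic lines are only valid inside Jupyter, code that must also run as a normal .py file cannot contain them directly. One possible workaround, sketched here under assumptions, is to issue the same magics through IPython's get_ipython() and run_line_magic() calls. The helper name enable_autoreload is hypothetical and not part of IPython:

```python
def enable_autoreload(mode: str = "2") -> bool:
    """Enable IPython's autoreload when running inside IPython/Jupyter.

    Hypothetical helper: returns True if autoreload was switched on,
    False when the code runs in a plain Python interpreter.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed

    shell = get_ipython()
    if shell is None:
        return False  # not running inside IPython/Jupyter

    # Equivalent to "%load_ext autoreload" followed by "%autoreload <mode>".
    shell.run_line_magic("load_ext", "autoreload")
    shell.run_line_magic("autoreload", mode)
    return True


enabled = enable_autoreload()
print(enabled)
```

In a Notebook the call behaves like the two magic lines above; in an ordinary script it simply does nothing and returns False instead of raising a syntax error.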