Original title | Tutorial: Advanced Jupyter Notebooks

Author | Benjamin Pryke

Translator | kbsc13 (author of the WeChat official account "the growth of algorithmic ape")

Original link | www.dataquest.io/blog/advanc…

Statement | This translation is for learning and exchange purposes only. Reprinting is welcome, but please keep this statement and do not use the translation for commercial or illegal purposes.

Preface

The previous article was an introductory tutorial on Jupyter. This one covers more tips on using Jupyter Notebook.

This article covers the following:

  • Some basic shell commands and handy magic commands, including debugging, timing, and running code in multiple languages;
  • Topics such as logging, macros, running external code, and Jupyter extensions;
  • How to enhance charts with the Seaborn module, run notebooks from the command line, and use a database.

Shell commands

In a notebook, you can run shell commands directly from a code cell: any line that begins with ! is treated as a shell command. This is useful when working with data or files and when managing Python packages. For example, !pip freeze lists the installed Python packages.

You can also use Python variables in a shell command by prefixing the variable name with $, for example !echo $message, where message is a Python variable.

Each command that starts with ! runs in a temporary subshell that is discarded after execution, so commands like cd have no lasting effect. However, IPython's magic commands provide a solution.
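
The subshell behavior described above can be checked outside a notebook. The following is a minimal sketch in plain Python that uses the standard subprocess module as a stand-in for the ! prefix (an assumption made for illustration; IPython's own implementation is not shown here). The notebook forms appear only as comments, and the variable name message is hypothetical.

```python
import os
import subprocess

# Notebook equivalents (IPython syntax, so shown here as comments):
#   !echo hello
#   message = "hello"
#   !echo $message

message = "hello"  # hypothetical variable used for illustration

# Run a command in a child shell and capture what it prints.
result = subprocess.run(
    f"echo {message}", shell=True, capture_output=True, text=True, check=True
)
echoed = result.stdout.strip()
print(echoed)

# A directory change inside the child shell is lost when that shell exits,
# which is why "!cd .." has no lasting effect in a notebook.
cwd_before = os.getcwd()
subprocess.run("cd ..", shell=True, check=True)
cwd_after = os.getcwd()
print(cwd_before == cwd_after)
```

The second half is the point of the sketch: the working directory of the parent process is the same before and after the child shell runs cd.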

Basic magic commands

Magic commands are handy commands built into the IPython kernel that are designed to handle specific tasks. Although they look like Unix commands, they are actually implemented in Python. There are many magic commands, but only a few of them are covered in this article.

There are two types of magic commands:

  • Line Magics
  • Cell Magics

As the names suggest, the two types differ in scope: a line magic applies to a single line, while a cell magic applies to multiple lines or the entire cell.

To see which magic commands are available, run the %lsmagic command. Its output lists the two types of commands, line and cell, separately.

If you want to know in detail what these commands do, see the documentation at ipython.readthedocs.io/en/stable/i…

Line magics and cell magics are also invoked differently: line magics start with % and cell magics start with %%.

In fact, the ! prefix for shell commands is itself a special kind of magic syntax. As mentioned above, commands like cd do not persist when run with !, but the same effect can be achieved with magic commands, namely %cd, %alias, and %env.

Here are some more examples.

Autosaving

The %autosave command sets how often the notebook is saved automatically. Pass the interval in seconds after the command, as in the following example.

%autosave 60

Output result:

Autosaving every 60 seconds

Displaying Matplotlib plots

One of the most common line magic commands in data science is %matplotlib. It controls how Matplotlib plots are displayed, as shown in the following example:

%matplotlib inline

The inline argument ensures that Matplotlib charts are displayed inside the notebook, below the cell that produced them. You usually run this line magic before importing Matplotlib, typically in the first code cell.

Timing Execution

We often need to know how long code takes to run. Notebooks provide two timing magic commands, %time and %timeit, both of which have line and cell modes.

%time runs the given code once and reports how long it took.

%timeit differs from %time in that it runs the given code multiple times and reports an average time. The number of loops can be set with the -n argument; if it is omitted, a suitable number is chosen automatically.
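
Outside a notebook, the standard timeit module offers a rough equivalent of %timeit. The sketch below is an illustration under that assumption, not the magic command itself; the statement being timed and the values chosen for number and repeat are arbitrary.

```python
import timeit

# Notebook form (IPython syntax, shown as a comment):
#   %timeit -n 100 sum(range(10000))

# number plays the role of -n: how many times the statement runs per measurement.
# repeat controls how many measurements are taken.
measurements = timeit.repeat("sum(range(10000))", number=100, repeat=5)

# Each entry is the total time for 100 runs, so divide to get a per-run figure.
per_run = [total / 100 for total in measurements]
average = sum(per_run) / len(per_run)
print(f"average over {len(per_run)} measurements: {average:.6f} s per run")
```

Averaging several measurements smooths out one-off delays, which is the main reason to prefer this style of timing over a single run.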

Executing different programming languages

Jupyter Notebook can execute code in different programming languages. Although the selected kernel is tied to one language, such as Python 3 in this example, other languages can be run through magic commands, which are listed in the %lsmagic output.

Examples include the %%HTML cell magic for rendering HTML and the %%latex cell magic for displaying mathematical formulas written in LaTeX.

Of course, you can also use other languages, including Ruby, Markdown, JavaScript, R, and so on.

Configuring Logging

Jupyter has a built-in way to display custom error messages prominently, which you can use through Python's standard logging module.

Error messages logged this way are highlighted in the cell output.

In addition, the output of the logging module is shown separately from print output and the standard cell output.

This happens because Jupyter Notebook listens to both standard output streams, stdout and stderr, and displays them differently: print and cell output go to stdout by default, while logging goes to stderr.
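
The split between the two streams can be observed in plain Python. The following sketch captures stdout and stderr separately with the standard contextlib helpers; the logger name stream_demo is an arbitrary choice for illustration, and how Jupyter then renders each stream is not shown here.

```python
import contextlib
import io
import logging

captured_out = io.StringIO()
captured_err = io.StringIO()

# A dedicated logger (the name "stream_demo" is arbitrary) keeps this
# example from touching the root logger's configuration.
stream_logger = logging.getLogger("stream_demo")
stream_logger.setLevel(logging.WARNING)
stream_logger.propagate = False

with contextlib.redirect_stdout(captured_out), contextlib.redirect_stderr(captured_err):
    # StreamHandler() with no argument writes to sys.stderr.
    handler = logging.StreamHandler()
    stream_logger.addHandler(handler)
    print("printed text")
    stream_logger.warning("logged text")

stream_logger.removeHandler(handler)

print("stdout received:", repr(captured_out.getvalue()))
print("stderr received:", repr(captured_err.getvalue()))
```

The printed text ends up only in the stdout buffer and the logged text only in the stderr buffer, which matches the separation described above.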

Logging can therefore be configured to display other levels of messages on stderr as well, such as the INFO and DEBUG levels shown below.

import logging

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

logging.info('This is some information')
logging.debug('This is a debug message')

You can also customize the output format of the message:

handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)

formatter = logging.Formatter('%(levelname)s: %(message)s')
handler.setFormatter(formatter)

# Replace any existing handlers so messages are not printed twice
logger.handlers = [handler]

logging.error('An error')
logging.warning('A warning')
logging.info('An info')

Note that if the code logger.addHandler(handler) is executed on every run, a new stream handler is added each time, so every message is printed one more time per run. You can either configure logging in a separate cell or, as shown above, replace all existing handlers directly with logger.handlers = [handler] instead of calling addHandler. This also removes the default handler.
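
The duplicate-output problem can be reproduced without a notebook. In the sketch below, calling configure stands in for re-running a configuration cell; the logger name handler_demo, the function name configure, and the in-memory buffer are all assumptions made for illustration.

```python
import io
import logging

buffer = io.StringIO()  # stands in for the notebook's stderr output
handler_logger = logging.getLogger("handler_demo")  # arbitrary name
handler_logger.setLevel(logging.INFO)
handler_logger.propagate = False


def configure(replace):
    """Simulate one execution of a logging configuration cell."""
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    if replace:
        handler_logger.handlers = [handler]
    else:
        handler_logger.addHandler(handler)


# Running the cell twice with addHandler leaves two handlers attached.
configure(replace=False)
configure(replace=False)
handler_logger.info("An info")
with_add_handler = buffer.getvalue().splitlines()

# Clear the buffer, then run the cell twice with the replacement approach.
buffer.seek(0)
buffer.truncate(0)
configure(replace=True)
configure(replace=True)
handler_logger.info("An info")
with_replacement = buffer.getvalue().splitlines()

print(with_add_handler)
print(with_replacement)
```

With addHandler the single message appears twice, while assigning the handlers list keeps it to one line no matter how often the configuration runs.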

It is also possible to save log information to a file, as shown below, using FileHandler instead of StreamHandler.

handler = logging.FileHandler(filename='important_log.log', mode='a')

Finally, the log level used here is separate from the one set by %config Application.log_level='INFO'. The level set by %config controls the log messages that Jupyter itself writes to the terminal in which it is running.

Extensions

Jupyter is an open source tool, so developers have written many extensions for it. A list can be found at:

github.com/ipython/ipy…

This article uses the ipython-sql extension in the database section below. Another notable package is jupyter_contrib_nbextensions on GitHub, which includes spell checking, code folding, and other features:

github.com/ipython-con…

These extensions can be installed with the following commands:

pip install ipython-sql
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable spellchecker/main
jupyter nbextension enable codefolding/main

Enhancing charts with Seaborn

One of the most common uses of Jupyter Notebook is plotting. However, Matplotlib, Python's most common plotting library, does not give very attractive results in Jupyter by default. Seaborn can make these charts look better and adds some extra features.

If Seaborn is not installed, run pip install seaborn on the command line, or run !pip install seaborn in Jupyter using the shell command syntax introduced at the beginning. Then import the necessary libraries and load the data:

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
data = sns.load_dataset("tips")

This uses a simple dataset provided by Seaborn. The tips dataset used here is a pandas DataFrame containing billing information from a bar or restaurant.

data.head() displays the first five rows along with the column names.

Use Matplotlib to plot total_bill against tip:

plt.scatter(data.total_bill, data.tip);

Adding Seaborn is also easy. The following applies the darkgrid style:

sns.set(style="darkgrid")
plt.scatter(data.total_bill, data.tip);

Seaborn has five styles: darkgrid, whitegrid, dark, white, and ticks.

We can also use Seaborn's own plotting functions directly, as follows:

sns.scatterplot(x="total_bill", y="tip", data=data);

The resulting plot automatically gets a label on each axis and improved markers for each data point. Seaborn can also group points automatically by a category in the data, which adds another dimension to the plot. Here we pass the smoker column as the hue parameter, which sets the color of each data point:

sns.scatterplot(x="total_bill", y="tip", hue="smoker", data=data);

After adding smoker, each data point takes one of two colors according to the value of smoker, which makes the plot more informative. Going further, we can color the points by the size column and use smoker for the marker style, as shown below:

sns.scatterplot(x="total_bill", y="tip", hue="size", style="smoker", data=data);

Seaborn can create even more attractive charts. For more examples, see the Seaborn website:

seaborn.pydata.org/examples/in…

Macros

You may often find yourself repeating the same tasks, such as importing the same set of third-party libraries, running the same statistics on each dataset, or drawing the same types of charts every time you create a new notebook.

In Jupyter, you can save snippets of code as executable macros that are available to all notebooks. This may not be very friendly for others who read and use your notebook, but it can be a convenient, labor-saving approach for your own work.

Macros are just code, so they can also contain variables. Let's start with an example.

First, write a code cell whose job is to print Hello, name!. The %macro command then saves the macro under the name __hello_world; the number 28 refers to the cell that was executed as In [28]. The %store command then saves the macro so that it can be used in other notebooks.

To load the macro, use %store again, this time with -r and the name of the macro.

If you change the variable used in the macro command, the output will also change:

name = 'Ben'
__hello_world

Output result:

Hello, Ben!

You can do much more with macros; see the official documentation for details.
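
As a rough analogy in plain Python, a macro behaves like stored source text that is executed again in the current namespace, which is why it picks up the latest value of name. The sketch below uses exec for this purpose; it illustrates the idea only and is not how IPython implements %macro. The cell number 28 comes from the text above, the exact print statement and the starting value Tim are assumptions, and the notebook commands are shown as comments because they are IPython syntax.

```python
import contextlib
import io

# Notebook form (IPython syntax, shown as comments):
#   %macro __hello_world 28
#   %store __hello_world
#   %store -r __hello_world      # in another notebook

# Stand-in for the saved cell: the macro is just source text (assumed wording).
hello_world_source = "print('Hello, %s!' % name)"


def run_macro(source, namespace):
    """Execute stored source in the given namespace and return what it printed."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(source, namespace)
    return captured.getvalue().strip()


namespace = {"name": "Tim"}  # hypothetical starting value
first = run_macro(hello_world_source, namespace)

namespace["name"] = "Ben"  # change the variable, then run the same source again
second = run_macro(hello_world_source, namespace)

print(first)
print(second)
```

The same stored text produces different output on the second run because the variable is looked up when the code executes, not when it was saved.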

Executing external code

In Jupyter you can also load and run external code, i.e. .py files. The commands for this are %load and %run, respectively.

Let's start by creating a new code file, imports.py, with the following contents:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Then load the code file in Jupyter:

%load imports.py

Running this cell replaces the cell's contents with the code from the file.

Next we create a new code file, triangle_hist.py, which contains the following code and draws a histogram of samples from a triangular distribution.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")

if __name__ == '__main__':
    h = plt.hist(np.random.triangular(0, 5, 9, 1000), bins=100, linewidth=0)
    plt.show()

Then run it with the %run command, for example %run triangle_hist.py.

You can also pass arguments to the script by adding them after the file name, such as %run myfile.py 0 "Hello, World!", or pass variables, such as %run $filename {arg0} {arg1}. Adding -p runs the code through Python's profiler.

  • stackoverflow.com/a/14411126/…
  • stackoverflow.com/questions/5…
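
To see what a script receives when arguments are passed this way, the sketch below writes a small stand-in for myfile.py to a temporary directory and runs it with the same two arguments. It uses subprocess rather than %run, so the script runs in a separate process instead of the notebook's own session; the script contents are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Notebook form (IPython syntax, shown as a comment):
#   %run myfile.py 0 "Hello, World!"

# Hypothetical contents for myfile.py: print the arguments it received.
script_source = "import sys\nprint(sys.argv[1:])\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = Path(tmp_dir) / "myfile.py"
    script_path.write_text(script_source)
    # Run the script with the current interpreter and the same two arguments.
    completed = subprocess.run(
        [sys.executable, str(script_path), "0", "Hello, World!"],
        capture_output=True,
        text=True,
        check=True,
    )

received = completed.stdout.strip()
print(received)
```

Both arguments arrive as strings in sys.argv, so a script that expects a number has to convert the first one itself.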

Scripted execution

The most powerful thing about Jupyter Notebook is its interactive workflow, but it can also be run in non-interactive mode, which means you can run a notebook from a script or the command line.

The basic syntax of the command line is as follows:

jupyter nbconvert --to <format> notebook.ipynb

nbconvert is a tool for converting notebooks to other formats, such as PDF, HTML, Python scripts (i.e. .py files), and even LaTeX files.

For example, if you need to convert the notebook to PDF:

jupyter nbconvert --to pdf notebook.ipynb

This generates a PDF file, notebook.pdf. Converting to PDF requires some additional software, namely Pandoc and LaTeX:

stackoverflow.com/a/52913424/…

By default, nbconvert does not execute the code in the notebook, but you can add --execute to make it run the code:

jupyter nbconvert --to pdf --execute notebook.ipynb

You can also add --allow-errors so that nbconvert writes any error messages into the output instead of aborting the conversion when a cell raises an error:

jupyter nbconvert --to pdf --execute --allow-errors notebook.ipynb
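
When the conversion is driven from a Python script, the commands above can be assembled programmatically. The sketch below is one possible way to do that; the helper names build_nbconvert_command and convert_notebook are hypothetical, and only the flags already shown in this section are used.

```python
import subprocess


def build_nbconvert_command(notebook, to_format, execute=False, allow_errors=False):
    """Assemble the argument list for the jupyter nbconvert commands shown above."""
    command = ["jupyter", "nbconvert", "--to", to_format]
    if execute:
        command.append("--execute")
    if allow_errors:
        command.append("--allow-errors")
    command.append(notebook)
    return command


def convert_notebook(notebook, to_format, execute=False, allow_errors=False):
    """Run the conversion; requires Jupyter (and Pandoc/LaTeX for PDF) to be installed."""
    command = build_nbconvert_command(notebook, to_format, execute, allow_errors)
    return subprocess.run(command, check=True)


command = build_nbconvert_command(
    "notebook.ipynb", "pdf", execute=True, allow_errors=True
)
print(" ".join(command))
```

Calling convert_notebook with the same arguments would actually run the conversion, provided the tools mentioned above are installed.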

Using a database

To use a database in Jupyter, you first need to install ipython-sql:

pip install ipython-sql

Once it is installed, run the following magic command to load ipython-sql:

%load_ext sql

Then connect to a database:

%sql sqlite://

Output:

'Connected: @None'

This connects to a temporary database. You can also connect to your own database by following the syntax described in the SQLAlchemy documentation (docs.sqlalchemy.org/en/latest/c…):

dialect+driver://username:password@host:port/database

For example, in postgresql://scott:tiger@localhost/mydatabase, the dialect is postgresql, the username is scott, the password is tiger, the host is localhost, and the database is mydatabase.
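
The parts of such a URL can be labeled with Python's standard urllib.parse module. This is only a sketch for reading the syntax above; SQLAlchemy has its own URL handling, which is not used here.

```python
from urllib.parse import urlsplit

url = "postgresql://scott:tiger@localhost/mydatabase"
parts = urlsplit(url)

components = {
    "dialect": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,  # None here because the URL does not give one
    "database": parts.path.lstrip("/"),
}
for key, value in components.items():
    print(f"{key}: {value}")
```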

The tips dataset loaded earlier with Seaborn offers a quick way to populate the database.

Next, you can run queries against the data. These use the cell magic form, %%sql.

More complex queries can also be run in the same way.
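
ipython-sql is not required to see the idea. The following sketch uses Python's built-in sqlite3 module with an in-memory database instead of the %sql magic, a substitution made so that the example is self-contained. The table name tips and the three rows are made-up placeholder values, not the actual Seaborn dataset; only the column names total_bill, tip, and smoker come from the examples above.

```python
import sqlite3

# In-memory database, comparable to the temporary one from "%sql sqlite://".
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE tips (total_bill REAL, tip REAL, smoker TEXT)")

# Placeholder rows for illustration only; these are not real dataset values.
placeholder_rows = [
    (10.0, 1.5, "No"),
    (20.0, 3.0, "Yes"),
    (30.0, 4.5, "No"),
]
connection.executemany("INSERT INTO tips VALUES (?, ?, ?)", placeholder_rows)

# A simple query, then a more complex one with grouping and aggregation.
first_rows = connection.execute(
    "SELECT * FROM tips ORDER BY total_bill LIMIT 2"
).fetchall()
by_smoker = connection.execute(
    "SELECT smoker, COUNT(*), AVG(tip) FROM tips GROUP BY smoker ORDER BY smoker"
).fetchall()
connection.close()

print(first_rows)
print(by_smoker)
```

In a notebook with ipython-sql loaded, the same SELECT statements would go in a cell that begins with %%sql.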

More examples can be found at github.com/catherinede…

Summary

Compared with the original article, some content has been removed, such as part of the section on running Jupyter from scripts and the section on customizing Jupyter's style. Part of the database section has also been removed, mainly because the original code there was always missing some content.


References

  • github.com/catherinede…
  • github.com/ipython-con…
  • github.com/mwaskom/sea…
  • stackoverflow.com/a/14411126/…
  • stackoverflow.com/questions/5…
  • nbconvert.readthedocs.io/en/latest/i…
  • stackoverflow.com/a/52913424/…

Finally, the code for this article has been uploaded to GitHub:

github.com/ccc013/Pyth…

You are welcome to follow my WeChat official account, the growth of algorithmic ape, so that we can communicate, learn, and progress together!