In Python, when we want to traverse the files in a directory, the first method that comes to mind is os.listdir():

import os
os.listdir("path")

This approach, however, has a serious limitation:

If the directory passed in contains subdirectories, os.listdir() returns only the names of those subdirectories; it does not descend into them.

That is, you cannot reach the files inside a subdirectory with a single os.listdir() call.
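As a minimal sketch of this limitation, the snippet below builds a small throwaway tree in a temporary directory and lists it with os.listdir(). The names a.txt, sub and b.txt are made up for illustration and are not part of the article's example directory.

```python
import os
import tempfile

# Build a small throwaway tree: a.txt and sub/b.txt (illustrative names).
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(demo, rel), "w") as fh:
            fh.write("x")

    # os.listdir() lists only the immediate entries:
    # "sub" appears as a name, but "b.txt" inside it does not.
    top_level = sorted(os.listdir(demo))
    print(top_level)  # ['a.txt', 'sub']
```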

To solve this problem and achieve true deep traversal, this article introduces another function in Python's os module: os.walk().

Let’s go!

1. os.walk()

1.1. Syntax

To use os.walk(), we first need to import Python's os module.

Once os is imported, the signature of os.walk() is as follows:

import os

os.walk(top, topdown=True, onerror=None, followlinks=False)

It is easy to use: the main parameter of os.walk() is top, the path of the directory to traverse. Usually this is the only argument we need, so os.walk(top) is enough to traverse the specified folder in depth.

1.2. Detailed description of parameters

Parameter     Meaning
top           Path of the directory to traverse.
topdown       Optional (default: True). If True, each directory is visited before its subdirectories, so the top directory comes first. If False, each directory is visited after its subdirectories, so the top directory comes last.
onerror       Optional (default: None). A callable that is called with an OSError instance when a directory cannot be listed during the walk. By default such errors are ignored.
followlinks   Optional (default: False). If True, the walk also descends into directories pointed to by symbolic links (soft links on Linux). If False, symbolic links to directories are not followed.
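The effect of topdown and onerror can be checked with a small experiment. The following is only a sketch under assumptions: it runs in a temporary directory, and the helper names walk_order and remember_error are hypothetical, not part of the os module. followlinks is left out because creating symbolic links may require extra privileges on some systems.

```python
import os
import tempfile

def walk_order(top, topdown):
    """Return directory paths (relative to top) in the order os.walk() yields them."""
    return [os.path.relpath(root, top)
            for root, _dirs, _files in os.walk(top, topdown=topdown)]

errors = []

def remember_error(exc):
    # os.walk() passes the OSError instance to the onerror callable.
    errors.append(exc)

with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "assets", "images"))

    top_down = walk_order(demo, topdown=True)    # parent first: ".", "assets", ...
    bottom_up = walk_order(demo, topdown=False)  # children first, "." last
    print(top_down)
    print(bottom_up)

    # Walking a path that does not exist yields nothing;
    # the error is reported to onerror instead of being raised.
    missing = os.path.join(demo, "does-not-exist")
    visited = list(os.walk(missing, onerror=remember_error))
    print(visited, len(errors))
```

Without an onerror callable the same failing walk would simply produce an empty result, which is why a callback is useful when silent failures matter.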

1.3. The return value

os.walk() returns an iterator (more precisely, a generator):

import os
from collections.abc import Iterator

# The path is just my own example directory; the details do not matter
print(isinstance(os.walk('Downloads/chrome-2.0.0'), Iterator))
# Output: True
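Because the return value is a generator, results are produced one directory at a time rather than all at once. The sketch below, which assumes a temporary directory with a single made-up subdirectory named assets, pulls just the first result with next().

```python
import os
import tempfile
from collections.abc import Generator, Iterator

with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "assets"))

    walker = os.walk(demo)
    is_iterator = isinstance(walker, Iterator)
    is_generator = isinstance(walker, Generator)

    # The walk has not started yet; next() produces the first (root, dirs, files) triple.
    root, dirs, files = next(walker)
    print(is_iterator, is_generator, dirs, files)  # True True ['assets'] []

    walker.close()  # stop the generator before the temporary directory is removed
```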

2. Code demonstration

This section uses a directory named chrome-2.0.0 to demonstrate in-depth traversal. Its structure is as follows:

chrome-2.0.0 contains 6 files and a subdirectory assets; assets contains a subdirectory images, which in turn contains 5 files.

2.1. Return the result as a list

Since os.walk() returns an iterator, we need to convert it to a list if we want to display all the results at once:

import os

# Collect every (root, dirs, files) tuple into a list so it can be printed at once
traResult = list(os.walk('Downloads/chrome-2.0.0'))
print(traResult)

This code produces a list with one tuple per directory visited: the top directory itself plus every subdirectory at any depth. Each tuple consists of three elements: ('current path', [directories contained in the current path], [files contained in the current path]).

The output of the above code is:

[('Downloads/chrome-2.0.0', ['assets'], ['background.js', 'browser-polyfill.min.js', 'browser-polyfill.min.js.map', 'manifest.json', 'popup.html', 'popup.js']), ('Downloads/chrome-2.0.0\\assets', ['images'], []), ('Downloads/chrome-2.0.0\\assets\\images', [], ['128.png', '16.png', '32.png', '48.png', 'icon-large.png'])]
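To check the "one tuple per directory" rule without the original folder, the following sketch rebuilds the same layout inside a temporary directory. The file names are copied from the output above, the file contents are empty placeholders, and the temporary location is an assumption made so the example runs anywhere.

```python
import os
import tempfile

# File names copied from the article's example directory; contents are placeholders.
top_files = ["background.js", "browser-polyfill.min.js", "browser-polyfill.min.js.map",
             "manifest.json", "popup.html", "popup.js"]
image_files = ["128.png", "16.png", "32.png", "48.png", "icon-large.png"]

with tempfile.TemporaryDirectory() as tmp:
    top = os.path.join(tmp, "chrome-2.0.0")
    images = os.path.join(top, "assets", "images")
    os.makedirs(images)
    for folder, names in ((top, top_files), (images, image_files)):
        for name in names:
            open(os.path.join(folder, name), "w").close()

    result = list(os.walk(top))
    # One tuple per directory: the top directory plus its 2 nested subdirectories.
    print(len(result))  # 3
    for root, dirs, files in result:
        print(os.path.relpath(root, top), sorted(dirs), len(files))
```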

2.2. Using a loop to read the results (most common)

The list form is clearly inconvenient to use and to read, so we usually consume the traversal result with a loop instead. Example code:

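import os

# topdown=False visits the deepest directories first and the top directory last
for root, dirs, files in os.walk("Downloads/chrome-2.0.0", topdown=False):
    for name in files:
        print(os.path.join(root, name))  # full path of each file
    for name in dirs:
        print(os.path.join(root, name))  # full path of each subdirectory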

The output is:

Downloads/chrome-2.0.0\assets\images\128.png
Downloads/chrome-2.0.0\assets\images\16.png
Downloads/chrome-2.0.0\assets\images\32.png
Downloads/chrome-2.0.0\assets\images\48.png
Downloads/chrome-2.0.0\assets\images\icon-large.png
Downloads/chrome-2.0.0\assets\images
Downloads/chrome-2.0.0\background.js
Downloads/chrome-2.0.0\browser-polyfill.min.js
Downloads/chrome-2.0.0\browser-polyfill.min.js.map
Downloads/chrome-2.0.0\manifest.json
Downloads/chrome-2.0.0\popup.html
Downloads/chrome-2.0.0\popup.js
Downloads/chrome-2.0.0\assets
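The same loop pattern is easy to adapt, for example to collect only files with a given extension. The sketch below is one possible variation, not something from the original example: find_files is a hypothetical helper, and the tiny directory tree is created in a temporary location for the demonstration.

```python
import os
import tempfile

def find_files(top, suffix):
    """Return sorted paths (relative to top) of all files under top ending in suffix."""
    matches = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(matches)

with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "assets", "images"))
    for rel in ("popup.js", "popup.html", os.path.join("assets", "images", "16.png")):
        open(os.path.join(demo, rel), "w").close()

    png_files = find_files(demo, ".png")  # found two levels below the top directory
    js_files = find_files(demo, ".js")    # found directly in the top directory
    print(png_files, js_files)
```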

That is how os.walk() is used to traverse files in depth.

If you have any comments or suggestions, please leave a comment or send me a private message.