When importing third-party modules, we usually use the import keyword, such as:

import scrapy
from scrapy.spider import Spider

But if you look at a Scrapy project's settings.py, you'll see that pipelines and middlewares are specified as strings, for example:

DOWNLOADER_MIDDLEWARES = {
    'Test.middlewares.ExceptionRetryMiddleware': 545,
    'Test.middlewares.BOProxyMiddlewareV2': 543,
}

SPIDER_MIDDLEWARES = {
    'Test.middlewares.LoggingRequestMiddleware': 543,
}

We know that Test.middlewares.ExceptionRetryMiddleware here refers to the ExceptionRetryMiddleware class defined in the middlewares.py file, inside the Test folder under the project root. So how does Scrapy import this class from nothing but a string?

In the Scrapy source code, we can find the relevant code:

from importlib import import_module


def load_object(path):
    """Load an object given its absolute object path, and return it.

    object can be a class, function, variable or an instance.
    path ie: 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware'
    """

    try:
        # Position of the rightmost dot, which separates module from object name
        dot = path.rindex('.')
    except ValueError:
        raise ValueError("Error loading object '%s': not a full path" % path)

    module, name = path[:dot], path[dot+1:]
    mod = import_module(module)

    try:
        obj = getattr(mod, name)
    except AttributeError:
        raise NameError("Module '%s' doesn't define any object named '%s'" % (module, name))

    return obj

From this code, we know that it uses the import_module function of the importlib module:

  1. Split the string path into two parts at the rightmost dot (.). For example, Test.middlewares.LoggingRequestMiddleware is split into Test.middlewares and LoggingRequestMiddleware.
  2. Use import_module to import the left part as a module.
  3. Call getattr on the imported module, passing the right part as the name, to get the concrete class.
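
The three steps can be tried outside Scrapy with a standard-library path. The sketch below is only an illustration: load_from_path is a hypothetical helper name (not part of Scrapy), and it uses str.rpartition instead of rindex to do the split, so it is a variation on the code above rather than a copy of it.

```python
import collections
from importlib import import_module


def load_from_path(path):
    """Hypothetical helper: resolve a dotted path such as 'pkg.module.Name'."""
    # Step 1: split at the rightmost dot into the module path and the object name.
    module_path, dot, name = path.rpartition(".")
    if not dot:
        # No dot at all, so there is no module part to import.
        raise ValueError("Error loading object %r: not a full path" % path)
    # Step 2: import the left part as a module.
    module = import_module(module_path)
    # Step 3: look up the right part on the imported module.
    return getattr(module, name)


cls = load_from_path("collections.OrderedDict")
# The object found through the string is the very same class a normal import gives.
print(cls is collections.OrderedDict)
```

Unlike Scrapy's load_object, this sketch lets the AttributeError from getattr propagate unchanged when the module does not define the requested name.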

Now let’s test that out. The test file structure we created is shown below:

[Figure: test file structure]

The contents of the main.py file look like this:

[Listing: contents of main.py]