Linux is often described as a monolithic kernel, meaning that the bulk of the operating system's functionality resides in the kernel itself and runs in privileged mode.

This contrasts with microkernels, which run only the basic functions (interprocess communication [IPC], scheduling, basic input/output [I/O], and memory management) in the kernel, pushing other functions (drivers, network stacks, and file systems) out of the privileged space.

You might therefore assume that Linux is a completely static kernel, but the opposite is true.

Linux can be changed dynamically at run time through loadable kernel modules (LKMs).

Dynamically changeable means that you can load new functionality into the kernel, remove functionality from the kernel, and even add new LKMs that depend on other LKMs.

The advantage of LKMs is that the kernel's memory footprint is kept to a minimum, with only the elements that are needed actually loaded (an important feature for embedded systems).

Linux is not the only (nor the first) monolithic kernel that can be changed dynamically. Variants of Berkeley Software Distribution (BSD), Sun Solaris, older operating systems such as OpenVMS, and other popular operating systems such as Microsoft® Windows® and Apple Mac OS X all support loadable modules.

A profile of kernel modules

LKMs are fundamentally different from elements compiled directly into the kernel, and also from typical programs. A typical program has a main function, whereas an LKM contains entry and exit functions (in version 2.6, you can name these functions anything you wish).

The entry function is called when a module is inserted into the kernel, and the exit function is called when a module is removed from the kernel.

Because the entry and exit functions are user-defined, the module_init and module_exit macros exist to declare which functions play those roles.

An LKM also contains a set of required and optional macros that define the module's license, its author, its description, and so on. Figure 1 provides a very simple view of an LKM.
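
To make this concrete, here is a minimal sketch of a 2.6 LKM; the function names, message text, and macro values are all illustrative:

  #include <linux/init.h>
  #include <linux/module.h>
  #include <linux/kernel.h>

  /* Entry function: called when the module is inserted into the kernel. */
  static int __init hello_init(void)
  {
          printk(KERN_INFO "hello: module loaded\n");
          return 0;               /* Nonzero means initialization failed. */
  }

  /* Exit function: called when the module is removed from the kernel. */
  static void __exit hello_exit(void)
  {
          printk(KERN_INFO "hello: module unloaded\n");
  }

  /* Tell the kernel which functions serve as the entry and exit points. */
  module_init(hello_init);
  module_exit(hello_exit);

  /* Optional macros that record information about the module. */
  MODULE_LICENSE("GPL");
  MODULE_AUTHOR("Anonymous");
  MODULE_DESCRIPTION("A trivial example module");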

The 2.6 Linux kernel provides a new, simpler way to build LKMs.

Once you have built an LKM, you manage it with the typical user-space tools (although their internals have changed):

  • insmod (install an LKM),
  • rmmod (remove an LKM),
  • modprobe (a wrapper for insmod and rmmod),
  • depmod (build module dependencies),
  • and modinfo (read the values of module macros).

For more information on building LKMs for the 2.6 kernel, see Resources.

A profile of the kernel module object

An LKM is nothing more than a special Executable and Linkable Format (ELF) object file.

Normally, object files must be linked to resolve their symbols, with an executable as the result.

But because an LKM cannot resolve its symbols until it is loaded into the kernel, the LKM remains an ELF object.

You can use the standard object tools on an LKM (in version 2.6, kernel objects carry the suffix .ko).

For example, if you run the objdump utility on an LKM, you’ll find familiar sections such as .text (instructions), .data (initialized data), and .bss (block started by symbol, or uninitialized data).

You can also find other sections in the module that support dynamic features.

The .init.text section contains the module_init code, and the .exit.text section contains the module_exit code (see Figure 2).

The .modinfo section holds the text produced by the various macros that record the module's license, author, description, and so on.
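
The mechanics behind this layout are worth a glance: the __init and __exit attributes attached to the entry and exit functions place them in .init.text and .exit.text, and each informational macro drops a "tag=value" string into .modinfo. The following is a simplified sketch of those mechanisms, not the literal contents of ./linux/include/linux/init.h and ./linux/include/linux/module.h:

  /* Simplified sketches of the real kernel definitions. */
  #define __init  __attribute__((__section__(".init.text")))
  #define __exit  __attribute__((__section__(".exit.text")))

  static int  __init my_entry(void) { return 0; }  /* lands in .init.text */
  static void __exit my_exit(void)  { }            /* lands in .exit.text */

  /* Each MODULE_* macro boils down to a string such as "license=GPL"
   * placed in the .modinfo section, where modinfo can later read it. */
  #define MODULE_INFO_SKETCH(tag, info)                                \
          static const char __modinfo_##tag[]                          \
          __attribute__((section(".modinfo"), unused)) = #tag "=" info

  MODULE_INFO_SKETCH(license, "GPL");        /* -> "license=GPL"      */
  MODULE_INFO_SKETCH(author, "Anonymous");   /* -> "author=Anonymous" */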

With the basics of LKMs behind us, let’s now explore how modules enter the kernel and how they are managed once inside.

LKM life cycle

In user space, insmod (insert module) initiates the module loading process.

The insmod command identifies the module to load and invokes the init_module system call from user space to begin the loading process.

For the 2.6 kernel, the insmod command was reworked to be very simple (70 lines of code), leaving more of the work to the kernel.

Rather than performing all of the necessary symbol resolution itself (as was once done with kerneld), insmod simply copies the module binary into the kernel through the init_module function, and the kernel does the rest.

The init_module function works through the system call layer and into the kernel, eventually reaching the kernel function sys_init_module (see Figure 3).

This is the main function for loading modules, and it uses many other functions to do the hard work.

Similarly, the rmmod command results in a delete_module system call, which eventually enters the kernel as sys_delete_module to remove the module.
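
To make the user-space side concrete, the following sketch does (in miniature) what insmod and rmmod do: it reads a .ko image into memory, hands it to the kernel with the init_module system call, and later removes it with delete_module. The file name and module name are illustrative, and error handling is pared down:

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/stat.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          /* Read the whole .ko image into memory, as insmod does. */
          int fd = open("hello.ko", O_RDONLY);      /* illustrative path */
          struct stat st;
          void *image;

          if (fd < 0 || fstat(fd, &st) < 0)
                  return 1;
          image = malloc(st.st_size);
          read(fd, image, st.st_size);
          close(fd);

          /* Hand the raw ELF image to the kernel; sys_init_module does
           * the rest.  There is no common libc wrapper, hence syscall(). */
          if (syscall(SYS_init_module, image, st.st_size, "") != 0)
                  perror("init_module");

          /* The rmmod path: ask the kernel to remove the module by name. */
          if (syscall(SYS_delete_module, "hello", O_NONBLOCK) != 0)
                  perror("delete_module");

          free(image);
          return 0;
  }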

During module loading and unloading, the module subsystem maintains a simple set of state variables that represent module operations.

While a module is being loaded, its state is MODULE_STATE_COMING.

Once the module is loaded and available, the state is MODULE_STATE_LIVE.

And while the module is being unloaded, the state is MODULE_STATE_GOING.
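
These states come from a small enumeration in the module header; as it appears in the 2.6 sources (comments added):

  /* From include/linux/module.h (2.6). */
  enum module_state {
          MODULE_STATE_LIVE,      /* loaded and available             */
          MODULE_STATE_COMING,    /* in the process of being loaded   */
          MODULE_STATE_GOING      /* in the process of being unloaded */
  };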

Module loading details

Now let’s look at the internal functions involved when a module is loaded (see Figure 4).

When the kernel function sys_init_module is called, it begins with a permission check to find out whether the caller is authorized to perform the operation (done through the capable function).

The load_module function is then called, which handles the work of bringing the module into the kernel and performing the necessary setup (more on this shortly).

The load_module function returns a module reference to the newly loaded module.

The module is then placed on a doubly linked list of all modules in the system, and any threads waiting for module state changes are notified through the notifier list.

Finally, the module’s init() function is called, and the module's state is updated to indicate that it is loaded and live.
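
Stitching these steps together, the control flow can be paraphrased as follows. This is a simplified sketch of ./linux/kernel/module.c, not the literal source; locking and error paths are omitted:

  /* Simplified paraphrase of sys_init_module. */
  asmlinkage long sys_init_module(void __user *umod, unsigned long len,
                                  const char __user *uargs)
  {
          struct module *mod;

          /* Permission check: may the caller load modules at all? */
          if (!capable(CAP_SYS_MODULE))
                  return -EPERM;

          /* Parse the ELF image, lay out memory, resolve, relocate. */
          mod = load_module(umod, len, uargs);

          /* Link the module into the doubly linked list of all modules
           * and tell threads waiting on the notifier list about it. */
          list_add(&mod->list, &modules);
          notifier_call_chain(&module_notify_list,
                              MODULE_STATE_COMING, mod);

          /* Run the module's entry function, then mark it live. */
          if (mod->init != NULL)
                  mod->init();
          mod->state = MODULE_STATE_LIVE;

          return 0;
  }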

The internal details of module loading amount to ELF module parsing and manipulation.

The load_module function (found in ./linux/kernel/module.c) begins by allocating a block of temporary memory to hold the entire ELF module.

The ELF module is then read from user space into temporary memory using the copy_from_user function.

As an ELF object, the file has a well-defined structure that is relatively simple to parse and verify.

The next step is to perform a set of sanity checks on the loaded ELF image (Is it a valid ELF file? Does it match the current architecture? And so on).

With the sanity checks out of the way, the ELF image is parsed, and a set of convenience variables is created for each section header to simplify later access.

Because offsets within the ELF object are zero-based (until relocation), the convenience variables hold relative offsets into the temporary memory block.

As the convenience variables are created, the ELF section headers are also checked to verify that a valid module is being loaded.

Any optional module parameters are loaded from user space into another allocated block of kernel memory (step 4), and the module state is updated to indicate that the module is being loaded (MODULE_STATE_COMING).

Per-CPU blocks are allocated if per-CPU data is required (which is determined while examining the section headers).

In the previous steps, the module's sections were loaded into (temporary) kernel memory, and it is now known which sections should be kept and which can be discarded.

The next step (step 7) allocates the module's final location in memory and moves the necessary sections there (those marked SHF_ALLOC in the ELF headers, that is, the sections that occupy memory during execution).

Another allocation is then performed, sized to hold the module's required sections.

Each section in the temporary ELF block is then iterated, and those needed at run time are copied into the new block.
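
The SHF_ALLOC test at the heart of this step is ordinary ELF bookkeeping. The following user-space sketch shows the idea using the <elf.h> types; walk_alloc_sections and image are illustrative names, and a 32-bit ELF is assumed:

  #include <elf.h>

  /* Sketch: walk the section headers of an ELF image already in memory
   * and pick out the sections that must occupy memory at run time
   * (SHF_ALLOC), as load_module does when laying out the module's
   * final location.  'image' points to the raw .ko bytes. */
  static void walk_alloc_sections(const unsigned char *image)
  {
          const Elf32_Ehdr *ehdr = (const Elf32_Ehdr *)image;
          const Elf32_Shdr *shdrs =
                  (const Elf32_Shdr *)(image + ehdr->e_shoff);
          unsigned int i;

          for (i = 0; i < ehdr->e_shnum; i++) {
                  if (shdrs[i].sh_flags & SHF_ALLOC) {
                          /* This section (.text, .data, and friends)
                           * would be copied into the final block. */
                  }
          }
  }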

Some additional housekeeping follows.

Symbol resolution is also performed here, resolving symbols that reside in the kernel proper (compiled into the kernel image) as well as symbols exported from other modules.

Each remaining section of the new module is then iterated, and relocations are applied.

This step is architecture-specific, so it relies on helper functions defined for each architecture (./linux/arch/<arch>/kernel/module.c).

Finally, the instruction cache is flushed (because the temporary .text sections were used), some last housekeeping is performed (freeing the temporary module memory, setting up the sysfs entries), and load_module at last returns the module to sys_init_module.
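
For reference, the architecture-specific helpers mentioned above are declared in ./linux/include/linux/moduleloader.h; in a 2.6 tree the core declarations look like this (lightly abridged, comments added):

  /* Per-architecture hooks from include/linux/moduleloader.h (2.6). */
  void *module_alloc(unsigned long size);        /* final module memory */
  void module_free(struct module *mod, void *module_region);

  /* Apply REL / RELA relocations to the freshly copied sections. */
  int apply_relocate(Elf_Shdr *sechdrs, const char *strtab,
                     unsigned int symindex, unsigned int relsec,
                     struct module *mod);
  int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
                         unsigned int symindex, unsigned int relsec,
                         struct module *mod);

  /* Architecture-specific finishing touches and cleanup. */
  int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
                      struct module *mod);
  void module_arch_cleanup(struct module *mod);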

Module unloading details

The process of unloading a module mirrors that of loading one, except that a few sanity checks must pass (to ensure that the module can be removed safely).

Unloading begins in user space with an invocation of the rmmod (remove module) command.

Inside the rmmod command, the delete_module system call is made, which eventually reaches sys_delete_module inside the kernel (see Figure 3).

Figure 5 illustrates the basic process of deleting a module.

When the kernel function sys_delete_module is called (with the name of the module to remove passed in as an argument), the first step is to confirm that the caller has permission.

Next, a list is checked to see whether any other modules depend on this one.

This list, called modules_which_use_me, contains one element per dependent module.

If the list is empty, there are no module dependents, and the module is a candidate for removal (otherwise, an error is returned).

The next step is to check that the module is live.

A user could invoke rmmod on a module that is still in the process of being installed, so this check ensures that the module is actually live.

After a couple of housekeeping checks, the penultimate step is to call the module's exit function (contained within the module itself).

Finally, the free_module function is called.
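
Putting those steps together, the removal path can be paraphrased as follows. Again, this is a simplified sketch of ./linux/kernel/module.c, not the literal source; locking and most error handling are omitted:

  /* Simplified paraphrase of sys_delete_module. */
  asmlinkage long sys_delete_module(const char __user *name_user,
                                    unsigned int flags)
  {
          char name[MODULE_NAME_LEN];
          struct module *mod;

          /* Permission check: may the caller remove modules at all? */
          if (!capable(CAP_SYS_MODULE))
                  return -EPERM;

          if (strncpy_from_user(name, name_user, MODULE_NAME_LEN) < 0)
                  return -EFAULT;
          mod = find_module(name);               /* look the module up */
          if (mod == NULL)
                  return -ENOENT;

          /* Refuse if any other module depends on this one. */
          if (!list_empty(&mod->modules_which_use_me))
                  return -EWOULDBLOCK;

          /* Refuse if the module is not fully live (coming or going). */
          if (mod->state != MODULE_STATE_LIVE)
                  return -EBUSY;

          /* Run the module's exit function, then tear everything down. */
          if (mod->exit != NULL)
                  mod->exit();
          free_module(mod);

          return 0;
  }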

Once free_module is called, the module is known to be safe to remove.

No dependencies remain on the module, so the kernel cleanup process for it can begin.

First, the module is removed from the various lists and places (sysfs, the module list, and so on) to which it was added during installation.

Second, an architecture-specific cleanup routine is invoked (found in ./linux/arch/<arch>/kernel/module.c).

Then, any modules that this module depended upon have it removed from their dependents lists.

Finally, from the kernel’s point of view the cleanup is complete: the various blocks of memory allocated for the module are freed, including the parameter memory, the per-CPU memory, and the module's ELF memory (both core and init).

Optimizing the kernel for module management

In many applications, the ability to load modules dynamically is important, but once loaded, the modules never need to be unloaded.

This lets the kernel be dynamic at startup (loading modules to match the devices that are found) without remaining changeable for the rest of its operation.

If you don’t need to unload modules once they are loaded, you can optimize away some of the code required for module management.

Simply disable the kernel configuration option CONFIG_MODULE_UNLOAD, and a large amount of the kernel functionality associated with unloading modules is removed.
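
Inside ./linux/kernel/module.c, this shows up as an ordinary configuration guard; a sketch of the pattern:

  #ifdef CONFIG_MODULE_UNLOAD
  /* Reference counting, dependency tracking, and sys_delete_module
   * are compiled in only when module unloading is configured. */
  asmlinkage long sys_delete_module(const char __user *name_user,
                                    unsigned int flags);
  #endif /* CONFIG_MODULE_UNLOAD */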

Conclusion

This has been a high-level look at the module management process in the kernel.

For details of module management, the source code itself is the best documentation.

For the main functions called during module management, see ./linux/kernel/module.c (and the header file ./linux/include/linux/module.h).

You can also find several architecture-related functions in ./linux/arch/<arch>/kernel/module.c.

Finally, you can find the kernel's module auto-load functionality (which lets the kernel load modules on demand) in ./linux/kernel/kmod.c.

This feature can be enabled through the CONFIG_KMOD configuration option.
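
Kernel code reaches the auto-loader through the request_module function (declared in ./linux/include/linux/kmod.h); a minimal usage sketch, with an illustrative module name:

  #include <linux/kmod.h>

  /* Ask kmod to load a module on demand.  The kernel spawns the
   * user-space helper (modprobe) to do the actual loading; returns
   * zero on success. */
  static int demand_load_example(void)
  {
          return request_module("my-driver");    /* illustrative name */
  }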