

I. Foreword

The shell is a command-line interpreter: commands are handed to bash for execution, and bash hands each task to a child process, so that if something goes wrong with the task, bash itself is not affected.

The child process does not carry out the task with its own code; it does so through program replacement. The two most important parts of simulating a shell are therefore creating child processes and program replacement.

Creating a child process was already covered in the article on the process concept, so here we focus on program replacement and then simulate a shell.


II. Process program replacement

1. Replacement principle

A child created with fork executes the same program as its parent (though possibly a different branch of the code), and it often calls one of the exec functions to run a different program. When a process calls an exec function, its user-space code and data are completely replaced by the new program, and execution starts from the new program's startup routine. Calling exec does not create a new process, so the process ID is the same before and after the exec call.


2. Replacement functions

The letters in the function names have the following meanings:

l (list): the arguments are passed as a list
v (vector): the arguments are passed as an array
p (path): the program is searched for in the directories listed in the PATH environment variable
e (env): the caller passes a custom environment as a parameter

The six functions of the exec family (execl, execlp, execle, execv, execvp, execve) are used in much the same way, so execl serves as the example here.


int execl(const char *path, const char *arg, ...);

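path is the path of the program to run, arg is how it is invoked (the command name followed by its options), and "..." is a variable argument list that must end with NULL. Here is an example.

First, check where ls is located; in this example it is /usr/bin/ls.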

#include <stdio.h>
#include <unistd.h>

int main()
{
	printf("I am a process!\n");
	sleep(3);
	// Program path first, then the command-line arguments, ending with NULL
	execl("/usr/bin/ls", "ls", "-a", "-i", "-l", (char *)NULL);
	return 0;
}

The output of this program is the same as the output of running ls -a -i -l directly.

This allows me to run other programs (such as ls) from within the myProc process I wrote.


Program replacement does not create a new process; it replaces the original code and data with new code and data. The code that follows a successful exec call therefore never runs, because it has been replaced.

#include <stdio.h>
#include <unistd.h>

int main()
{
	printf("I am a process!\n");
	sleep(3);
	execl("/usr/bin/ls", "ls", "-a", "-i", "-l", (char *)NULL);
	printf("you can't see me!\n"); // See whether this line is executed
	return 0;
}

The line "you can't see me!" is not printed, which confirms the conclusion above.


Program replacement can also fail. Take a wrong path as an example.

#include <stdio.h>
#include <unistd.h>

int main()
{
	printf("I am a process!\n");
	sleep(3);
	// The correct path is /usr/bin/ls
	execl("/us/bin/ls", "ls", "-a", "-i", "-l", (char *)NULL);
	printf("you can't see me!\n"); // See whether this line is executed
	return 0;
}

The replacement failed, so the code and data of the process were not replaced, and the process simply continues with the code after the exec call.

Therefore, there is no need to test the return value of an exec function. It returns only on failure (with -1), so if it returns at all, the call has failed.


In general, an exec function is called after creating a child process: the code and data of the child are replaced so that the child runs another program. Building on the code above, the new version looks like this.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>

int main()
{
	pid_t id = fork();
	if (id == 0)
	{
		// child
		printf("I am a process!\n");
		sleep(3);
		execl("/usr/bin/ls", "ls", "-a", "-i", "-l", (char *)NULL);
		exit(10); // Reached only if execl fails
	}
	// parent
	int status = 0;
	pid_t ret = waitpid(id, &status, 0);
	if (ret > 0 && WIFEXITED(status))
	{
		// The child exited normally: report its exit code
		printf("exit code:%d\n", WEXITSTATUS(status));
	}
	else if (ret > 0 && WIFSIGNALED(status))
	{
		// The child was killed by a signal: report which one
		printf("signal:%d\n", WTERMSIG(status));
	}
	return 0;
}


Here are some equivalents of functions that do not require environment variables.


III. Simulating a simple shell

Take ls as an example to illustrate how a shell works in general.

The timeline shows the sequence of events, with time running from left to right. The shell is represented by a square labeled sh, which moves from left to right as time passes.

The shell reads the string "ls" from the user, creates a new child process, runs the ls program in that new process, and waits for the process to terminate.

The shell then reads a new line of input, sets up a new process, runs the program in that process, and waits for the process to terminate.

So to write a shell, you loop through the following process:

  1. Get the command line
  2. Parse the command line
  3. Create a child process (fork)
  4. Replace child process (exec)
  5. Parent waits for child to exit (wait)

1. gethostname

First, the command line is preceded by a prompt. The host name shown in the prompt can be obtained with gethostname.

Get the hostname with the following code:

#include <stdio.h>
#include <unistd.h>

int main()
{
	char name[100];
	// gethostname returns 0 on success and stores the host name in name
	if (gethostname(name, sizeof(name)) == 0)
	{
		printf("%s\n", name);
	}
	return 0;
}

In the code below, I simply use a made-up host name in the prompt to represent the simulated shell.


2. Simulation implementation

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <string.h>

#define LEN 1024 // A maximum of 1024 characters can be read
#define NUM 32   // Room for 32 pointers: the command, its arguments and the final NULL

int main()
{
	char cmd[LEN];
	while (1)
	{
		// Print the prompt
		printf("[yh@my_centos dir]& ");
		fflush(stdout); // The prompt has no '\n', so flush it explicitly

		// Get user input (which is essentially a string)
		if (fgets(cmd, LEN, stdin) == NULL) // Read one line from standard input (typically the keyboard)
		{
			break; // End of input
		}
		cmd[strcspn(cmd, "\n")] = '\0'; // Remove the trailing '\n'

		// Parse the string
		char* myArg[NUM];
		myArg[0] = strtok(cmd, " "); // Split the command on spaces
		if (myArg[0] == NULL)
		{
			continue; // Empty line: nothing to execute
		}
		int i = 1;
		// Get the following command-line arguments; passing NULL makes
		// strtok continue after the previous token
		while (i < NUM - 1 && (myArg[i] = strtok(NULL, " ")) != NULL)
		{
			i++;
		}
		myArg[i] = NULL; // execvp needs a NULL-terminated array

		// Let the child process execute the command
		pid_t id = fork();
		if (id == 0)
		{
			// child
			execvp(myArg[0], myArg);
			exit(11); // Reached only if execvp fails
		}

		int status = 0;
		pid_t ret = waitpid(id, &status, 0);
		if (ret > 0 && WIFEXITED(status))
		{
			printf("exit code:%d\n", WEXITSTATUS(status));
		}
		else if (ret > 0 && WIFSIGNALED(status))
		{
			printf("signal:%d\n", WTERMSIG(status));
		}
	}
	return 0;
}


However, some commands do not work in this shell, such as two commands separated by ';', pipelines that use '|', and cd. Some fail because the program does not handle those cases when it parses the string, and others because the command itself cannot be carried out through fork.

This simulated shell is built only on fork and program replacement, so it is bound to have flaws.


Thank you for reading. If there are any mistakes, please correct them