What you see…
If you are curious about what happens behind your shell when you run one of the most common commands on a Linux system, this article tries to clarify what happens when we run the “ls -l” command.
When we execute the command “ls -l” in the shell, we expect to obtain as a result a list of all files and directories in the current directory.
For this, we use the command “ls” with the option “-l” (a lowercase L), which tells “ls” to print files and directories in a long listing format.
The blinking cursor indicates that the shell is waiting for us to type. Each character we type is read from stdin and buffered, then echoed to the console so that we can see what we are typing, and so on until the command line reads “ls -l”.
In simple words, the Shell is a simple loop that waits for user input to search the Linux PATH …
Once we press enter …
…and this begins when the keyboard driver recognizes that characters have been typed and passes them to the Shell. The getline function reads the entered command line as a string from standard input and stores it in a buffer.
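As a minimal sketch of this reading step, assuming a POSIX system where getline is available: the helper name read_command_line is hypothetical and not taken from any particular shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read one command line from `in`.
 * Returns a malloc'd string without the trailing newline,
 * or NULL on end of input or error. The caller frees the result. */
char *read_command_line(FILE *in)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, in);

    if (len == -1) {     /* end of input (Ctrl+D) or read error */
        free(line);
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';   /* drop the newline added by Enter */
    return line;
}
```

A shell loop would typically call read_command_line(stdin) once per prompt and stop when it returns NULL.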
Then tokenization begins: a string tokenization function splits the command line into tokens, discarding the blanks between them. Our command has two tokens, “ls” and “-l”, and once tokenization finishes they are stored in an array of strings.
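The tokenization step could look roughly like the sketch below, which assumes the standard strtok function and splits on spaces, tabs, and newlines only. The name split_line is hypothetical, and a real shell also has to handle quoting, which is ignored here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on blanks.
 * Returns a malloc'd, NULL-terminated array whose entries point
 * into `line`, or NULL if memory runs out. The caller frees the array. */
char **split_line(char *line)
{
    size_t cap = 8, n = 0;
    char **tokens = malloc(cap * sizeof(*tokens));
    char *tok;

    if (tokens == NULL)
        return NULL;
    for (tok = strtok(line, " \t\n"); tok != NULL; tok = strtok(NULL, " \t\n")) {
        if (n + 1 >= cap) {              /* keep room for the final NULL */
            char **bigger;

            cap *= 2;
            bigger = realloc(tokens, cap * sizeof(*tokens));
            if (bigger == NULL) {
                free(tokens);
                return NULL;
            }
            tokens = bigger;
        }
        tokens[n++] = tok;
    }
    tokens[n] = NULL;                    /* same layout execve expects for argv */
    return tokens;
}
```

For the buffer "ls -l" this yields the array { "ls", "-l", NULL }.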
With the command line tokenized, the shell checks whether the first token (the command itself) is an alias, and if so, replaces the alias with the actual command. Normally, the shell looks for aliases defined in its configuration files. If “ls” is an alias for something else, the shell replaces the “ls” token with the command string the alias stands for, so that the following steps operate on the real command. The replacement text is split into tokens as before, and its first token is checked against the aliases again.
The next step is to check whether the command is a built-in. If it is, the shell executes it directly without invoking another program; if it is not, the shell has to find an executable to run. For example, “cd” and “pwd” are built-in commands, but “ls” is not, so the system now needs to find the executable for “ls”.
After the built-in check, the shell will locate the directory where “ls” lives and create something known as a process, which is simply a running instance of a program. The shell itself is the parent process, and the process that executes “ls” is a child process.
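A built-in check can be as simple as comparing the first token against a table of names. The sketch below assumes a tiny illustrative table; the list and the name is_builtin are hypothetical, and real shells have many more built-ins.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative subset only, not the full list of any real shell. */
static const char *const builtins[] = { "cd", "pwd", "exit", NULL };

/* Hypothetical helper: return 1 if `cmd` is in the table, 0 otherwise. */
int is_builtin(const char *cmd)
{
    size_t i;

    for (i = 0; builtins[i] != NULL; i++)
        if (strcmp(cmd, builtins[i]) == 0)
            return 1;
    return 0;
}
```

With this table, is_builtin("cd") is true and is_builtin("ls") is false, so “ls” falls through to the search for an executable.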
Thus the shell, using the fork() system call, clones itself to create a new process (the child process) that runs alongside the original (the parent process). This is done so that the shell can return to the prompt after the command completes or fails. fork() returns 0 in the child process, which distinguishes it from the parent, where fork() returns the child's non-zero PID (process ID). The parent process uses the wait() system call to wait until the child process has run and terminated.
If fork() fails, no child is created and it returns -1 to the parent.
At this moment both processes (parent and child) are running. Everything described in the following steps happens within the child process, while the parent process waits until the child process finishes.
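The three possible return values of fork() can be seen in a small sketch. It assumes a POSIX system; fork_and_wait is a hypothetical name, and the child here simply exits with a given code where a real shell would go on to run the command.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`.
 * The parent waits and returns the child's exit status, or -1 on failure. */
int fork_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {            /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0)               /* child: fork returned 0 */
        _exit(code);            /* a real shell would call execve here */

    /* parent: pid holds the child's PID */
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

waitpid is used instead of plain wait so that the parent waits for this specific child; for a shell with a single child the effect is the same.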
The next step is to check the PATH. The shell takes the PATH environment variable and checks whether the “ls” command is in one of the directories it lists. The shell takes a copy of the PATH value and tokenizes it with the delimiter “:”; the result is an array of strings, each of which is a path to a directory.
The shell appends “/” and the first token from the previous step (“ls”) to the first directory and checks whether the resulting path exists (for example, /usr/bin/ls). If it does not exist, the shell moves on to the following directories. If no directory contains the command, the command is not found and control returns to the main process.
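The PATH lookup described above can be sketched as follows, assuming POSIX strtok_r, strdup, and access. The name find_in_path is hypothetical. As a simplification, empty PATH entries are skipped, and access with X_OK also succeeds for directories, so a real shell does a few more checks.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: look for an executable named `cmd` in the
 * colon-separated directory list `path`. Returns a malloc'd full
 * path (caller frees), or NULL if nothing executable was found. */
char *find_in_path(const char *cmd, const char *path)
{
    char *copy, *dir, *save = NULL;
    char *found = NULL;

    if (cmd == NULL || path == NULL)
        return NULL;
    copy = strdup(path);        /* strtok_r modifies its input, so use a copy */
    if (copy == NULL)
        return NULL;
    for (dir = strtok_r(copy, ":", &save); dir != NULL;
         dir = strtok_r(NULL, ":", &save)) {
        size_t len = strlen(dir) + 1 + strlen(cmd) + 1;
        char *candidate = malloc(len);

        if (candidate == NULL)
            break;
        snprintf(candidate, len, "%s/%s", dir, cmd);   /* e.g. /usr/bin/ls */
        if (access(candidate, X_OK) == 0) {            /* exists and is executable */
            found = candidate;
            break;
        }
        free(candidate);
    }
    free(copy);
    return found;
}
```

A shell would typically call it as find_in_path("ls", getenv("PATH")) and report an error when the result is NULL.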
If the path exists, the executable file has been found, and the shell runs it by calling the execve function (a system call, the mechanism programs use to communicate with the kernel) in a process separate from the main program, which then prints its output. execve() receives the path of the file, the argument list, and the environment. It runs the program the path points to, which must be a binary executable or a script that begins with a line of the form: #! interpreter [optional-arg].
Through the arguments passed to execve(), the shell states which command is to be executed, with which arguments, and where to find it, such as:
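A minimal sketch of such a call, assuming a POSIX system: run_program is a hypothetical helper that forks, calls execve in the child, and waits in the parent. The exit code 127 for a failed execve follows a common shell convention and is an assumption here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the current environment, passed on to the child */

/* Hypothetical helper: run the program at `path` with `argv` in a child
 * process. Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(path, argv, environ);   /* returns only if it failed */
        perror("execve");
        _exit(127);                    /* assumed "could not execute" code */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For our command, the shell would build the array { "ls", "-l", NULL } and call run_program("/usr/bin/ls", args), assuming that is where ls was found on the system.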
In this example, the command “ls” found at “/usr/bin/ls” is executed with the option “-l” (long), which prints for each file its permissions, owner name, group name, size (in bytes), modification time, and name.
The key factor for this to happen is that commands are executed after a fork() system call. This system call clones the main process, and each copy then takes its own branch depending on the value fork() returned. The child and parent run simultaneously with different PIDs (process IDs), and it is the return value of fork() that lets them run different code. In the case of ls, fork() creates a child process, which executes ls and writes the output to STDOUT if the executable is found in $PATH, for example at “/bin/ls”.
While this is happening, the parent process waits for the child process to finish executing and terminate, after which the child's memory is released and the parent process takes control again and waits for the next input from the user.
As you can see, many things happen behind the execution of a simple command or program with an argument, in the blink of an eye.
I hope this post helps those who read it; writing it is part of my training at Holberton Montevideo.
Until next time, happy learning!