A Detailed Analysis of Linux Process Management

The previous article, an embedded developer's notes on exploring the Linux kernel, covered Linux kernel development and application development. Today we turn to a key part of Linux: process management.

What does an OS actually do? It provides an abstraction over the hardware layer, and it also carries many hardware-management functions. In one sentence, these functions solve the problem of multiplexing each component in time and space. The CPU, for example, is time-multiplexed (multi-core CPUs being a special case), while memory is multiplexed in both time and space...

Few of the experts have ever given a crisp definition of an OS. The most convincing one I have seen is: "the intersection of all software (specifically, of all applications)". As I see it, the computer started out as a special-purpose machine running a single custom program, and that program had to coordinate all the hardware itself on top of doing its real job. As computers gradually became general-purpose, the OS took shape step by step, driven by the need to improve system utilization and to avoid blindly reimplementing the same low-level code, and it has kept improving ever since...

I don't know if you have noticed, but many things repeat themselves on the stage of history, as if to match the saying "twenty years later, a hero once again". Look at Google's Chrome OS: put bluntly, its original prototype strips the underlying layers that the Chrome browser needs out of the OS, so that the browser can run on its own. Doesn't that have the flavor of the earliest dedicated machines? Computing is magical: one good idea may change a whole market, even a century...

That was a digression; back to multiplexing. A CPU has to run many programs, but the number of CPUs is limited, so the CPU must be shared among multiple processes. (I won't dwell on the difference between a process and a program, since you probably know your OS basics. In short: to run a program, a process is requested from the kernel, the code in the program image is copied into the code segment of that process, the relevant data is set up, and the process starts running. So the program is just static code, while the process is that code in execution.) Because the CPU is shared between processes, someone has to keep the sharing organized and disciplined: coordinating process priorities, the rules for switching, the work to be done after a switch...

On top of that: how is a process created, how is it switched, how is it destroyed... Who takes care of all this? The OS, of course, and for Linux that means the kernel. After all, the OS exists to run programs, and the process is the soul of a program, which shows how important process management is. So let's start with the process itself.

A process is a program in execution, but it is not limited to a piece of executable code. Think about it: to run a process, the kernel has to know which process it is, so that it can tell it apart from the others. It has to ensure that the current process cannot freely access the address space of other processes; without that restriction, writing a malicious program would be all too easy. Multithreading is involved as well. And because processes share the CPU, each one runs for a while and is then swapped out for another, so the kernel must make sure a process can pick up exactly where it left off, or everything would be a mess... So a process needs the following:

Open files

Pending signals (event handling in Linux is based on the signal mechanism, much like the event mechanism in Windows)

Kernel internal data (this is the saved context: to restore the state a process had before it was switched out, that state has to be saved somewhere)

Processor state (also part of the saved context, because a process is usually switched out partway through its execution)

Address space

One or more threads of execution (the thread implementation under Linux is interesting and very simple: threads are essentially processes that share resources, with no dedicated thread data structure)

These are the main components of a Linux process. Process management, of course, also has to cover how processes are managed: the related policies, the process life cycle, and so on. We will get to these one by one.
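To make the address-space point concrete, here is a minimal user-space sketch. It uses fork(), which the article introduces next, to create a second process and shows that a write made by that process is not visible in the original one. The function name parent_counter_after_child_write and the variable counter are my own, made up for the illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* lives in this process's address space */

/* Fork a child that overwrites its own copy of `counter`, then return
 * the value the parent still sees. Returns -1 if fork/waitpid fails. */
int parent_counter_after_child_write(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {           /* child: modifies only its private copy */
        counter = 99;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)   /* parent: wait for the child */
        return -1;
    assert(WIFEXITED(status));          /* the child called _exit() */
    return counter;           /* unchanged: the child's write stayed in the child */
}
```

Calling parent_counter_after_child_write() should give back the original value of 1, because the child changed a copy in its own address space rather than the parent's memory.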

The story goes that a process comes alive at the moment it is created. In the Linux world, the fork() system call gives birth to a small process, faster than the river water of the Women's Kingdom in Journey to the West. A process has no ears and no eyes, so how does it know that it has been born? Since fork() is its birth mother, fork() is the one who knows best: the fork() system call returns twice, once in the parent process and once in the child process.

Usually the new process is meant to run a different program right away, so it then calls one of the exec*() family of functions, which creates a new address space and loads the new program into it. (On Linux, fork() is actually implemented through the clone() system call.)

Eventually, the program exits via the exit() system call, which terminates the process and releases the resources it holds. The parent process can query a terminated child through the wait4() system call, which gives a process the ability to wait for a specific process to finish. After a process exits, it stays in the zombie state until its parent calls wait() or waitpid().

Now that we know a process is more than a piece of executable code, let's walk through the general life of a process under Linux. A process is in effect the dynamic execution of a piece of software (strictly speaking, of a software subsystem, but if we think of the subsystem as a small piece of software it is easier to understand). To create a process in Linux, you use the fork() system call, which produces a child process by copying the parent. fork() returns twice: once in the parent process and once in the child process. Why twice, and how do the two returns differ? This puzzled me at first. Because the child is a copy of the parent, both resume from the same fork() return point with the same context, but the returned value is different so that each side can tell whether it is the parent or the child: the parent receives the PID of the child, and the child receives 0.
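The two returns are easy to observe from user space. The sketch below is only an illustration: the helper name fork_result_seen_by_child is invented here, and the pipe is just a convenient way to carry the child's view of the return value back to the parent.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Report what fork() returned in the child, as observed by the parent.
 * The child sends its own return value back through a pipe.
 * Returns that value (expected to be 0), or -1 on any error. */
int fork_result_seen_by_child(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();          /* one call, two returns */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {              /* child: fork() returned 0 */
        int seen = (int)pid;
        close(fds[0]);
        ssize_t n = write(fds[1], &seen, sizeof seen);
        _exit(n == (ssize_t)sizeof seen ? 0 : 1);
    }

    /* parent: fork() returned the child's PID, which is positive */
    assert(pid > 0);
    close(fds[1]);
    int seen = -1;
    ssize_t n = read(fds[0], &seen, sizeof seen);
    close(fds[0]);
    waitpid(pid, NULL, 0);       /* reap the child */
    return n == (ssize_t)sizeof seen ? seen : -1;
}
```

The parent and the child run the same code after fork(); only the value of pid tells them which branch to take.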

So the new child process has been created; now what? It is surely not meant to repeat the work of the parent, or why create it at all? So at this point the child calls one of the exec*() family of functions, which sets up a new address space, loads the new program into the current process, and executes it.

Finally, the program exits through the exit() syscall (short for system call; I will use the abbreviation from here on). This function terminates the process and releases the resources it holds. The parent process uses the wait4() syscall to check whether the child has terminated, which gives a process the ability to wait for a specific process to finish (isn't this the legendary synchronization? Haha). After the process exits, it stays in the zombie state until the parent calls wait() or waitpid() (both of which, as far as I can tell, are built on wait4()).
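Put together, the create, exec, exit, and wait steps can be sketched in a few lines of user-space C. This is a minimal sketch rather than the only way to do it: run_and_wait is a name I made up, it uses execv() as one member of the exec*() family, and it treats exit status 127 as "exec failed" by convention.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` with `argv` in a child process and return its exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();              /* 1. create the child */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(path, argv);           /* 2. replace the child's program image */
        _exit(127);                  /* reached only if execv() failed */
    }

    /* 3. the new program eventually exits; 4. the parent reaps it */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *const args[] = { "sh", "-c", "exit 7", NULL }, calling run_and_wait("/bin/sh", args) should return 7 on a system where /bin/sh exists. Until waitpid() returns, the exited child lingers as a zombie.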

That is the simple path from creating a process to reclaiming it. The child is created by the parent and reclaimed by the parent, which has a feel of "restoring the scene". In any reasonably robust system, "restoring the scene" and similar ideas show up everywhere. It is like borrowing money: return what you borrowed, and borrowing again is not hard. The same holds when writing programs. Haha, you slowly come to realize that there is a lot of philosophy in programming.

1 Process Descriptors and the Task Structure

In software design, the first step is to abstract the nouns: every noun finds its data abstraction in the computer, and that abstraction is our great friend the data structure. It may be abstracted into a variable, an array, a struct, an object... it may be of any type, and as long as it meets your needs, it is the perfect abstraction.

The kernel keeps processes in the task list, which is a circular doubly linked list. Each item in the list is of type task_struct, known as the process descriptor, and it contains all the information about a specific process.
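The shape of such a list is easier to see in code. What follows is a simplified user-space sketch, not the kernel's actual definitions: list_node, demo_task_struct, TASK_OF, and the helper functions are all names invented for the illustration, and the real task_struct carries far more fields. The kernel also anchors its list at the descriptor of the initial task rather than at a separate head node, which I use here only for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, embedded in each element. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical stand-in for task_struct: a PID plus the list linkage. */
struct demo_task_struct {
    int pid;
    struct list_node tasks;
};

/* Recover the enclosing descriptor from a pointer to its embedded node. */
#define TASK_OF(node) \
    ((struct demo_task_struct *)((char *)(node) - \
        offsetof(struct demo_task_struct, tasks)))

void list_init(struct list_node *head)
{
    head->next = head->prev = head;   /* an empty list points at itself */
}

/* Insert `node` just before `head`, i.e. at the tail of the circle. */
void list_add_tail(struct list_node *node, struct list_node *head)
{
    assert(node != NULL && head != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the circle once, stopping when we arrive back at the head. */
int sum_pids(const struct list_node *head)
{
    int sum = 0;
    for (const struct list_node *n = head->next; n != head; n = n->next)
        sum += TASK_OF(n)->pid;
    return sum;
}
```

Because the list is circular, there is no NULL terminator: iteration ends when the walk returns to where it started, and both ends of the list can be reached from the head in one step.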

1.1 Allocating the Process Descriptor

Processes are eager to come into being, but they cannot be created indiscriminately. It is like courting: you cannot confess to everyone you meet, or you are just a rogue. The number of places is limited, so accept it. The number of processes an OS can run concurrently is limited too; a program that keeps creating processes until the machine can no longer run is in effect a virus, eating up the CPU. Linux uses the slab allocator to allocate task_struct structures. Since task_struct is generated dynamically from the slab, only a new thread_info structure needs to be created at the bottom of the stack, and the location of the descriptor is then easy to calculate from that structure.


The thread_info structure contains a pointer to the task_struct, along with other information about the process.

1.2 Storing the Process Descriptor

The kernel uniquely identifies each process by its PID. A PID has the type pid_t, which is in practice an int. The default limit, however, is only 32768, a value kept for compatibility with older systems that stored PIDs in a short int. You can change the limit through /proc/sys/kernel/pid_max, because 32768 processes may not be enough on, say, the web server cluster of a large company.
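Reading the current limit from a program takes only a few lines. This is a small sketch under my own naming: read_pid_limit is a hypothetical helper, and it takes the path as a parameter so that it can be pointed at /proc/sys/kernel/pid_max or at any other file holding a single integer.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer limit from a file such as
 * /proc/sys/kernel/pid_max. Returns the value, or -1 on failure. */
long read_pid_limit(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;              /* file did not start with a number */
    fclose(f);
    return value;
}
```

On a Linux machine, read_pid_limit("/proc/sys/kernel/pid_max") returns whatever limit is currently configured there; writing a new value to that file requires root privileges.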

The kernel generally works with a process directly through a pointer to its task_struct, and it finds or calculates the task_struct of the currently running process through the current macro. Architectures that are rich in registers do not need to calculate anything: they simply keep a pointer to the current task_struct in a dedicated register. PowerPC, for example, uses register r2, whereas x86, which has few registers to spare, has to calculate the value.
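The calculation on x86 can be imitated in user space. The sketch below rests on several assumptions of mine: an 8 KB stack (THREAD_SIZE), thread_info placed at the lowest address of a stack aligned to its own size, and simplified stand-in types (demo_task, demo_thread_info). Under those assumptions, masking the low bits off any address inside the stack lands on thread_info, and its pointer leads to the descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u   /* assumed kernel stack size: two 4 KB pages */

struct demo_task { int pid; };                        /* stand-in for task_struct */
struct demo_thread_info { struct demo_task *task; };  /* stand-in for thread_info */

/* Allocate a THREAD_SIZE-aligned "stack" and place thread_info at its
 * lowest address, pointing back at `task`. Returns NULL on failure. */
void *alloc_stack(struct demo_task *task)
{
    void *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (stack == NULL)
        return NULL;
    assert(((uintptr_t)stack & (THREAD_SIZE - 1)) == 0);
    ((struct demo_thread_info *)stack)->task = task;
    return stack;
}

/* Given any address inside the stack (playing the role of the stack
 * pointer), mask off the low bits to find thread_info, then follow its
 * pointer to the task descriptor. */
struct demo_task *current_from_sp(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~(uintptr_t)(THREAD_SIZE - 1);
    return ((struct demo_thread_info *)base)->task;
}
```

The trick only works because the stack is aligned to its own size, so every address inside it shares the same high bits. That is why no dedicated register is needed: the stack pointer itself is enough to find the current process.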

1.3 Process State

A process is always in exactly one of five states:

TASK_RUNNING // running or ready to run

TASK_INTERRUPTIBLE // interruptible sleep

TASK_UNINTERRUPTIBLE // uninterruptible sleep

TASK_ZOMBIE // zombie

TASK_STOPPED // stopped
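These states can be observed from user space, because the third field of /proc/<pid>/stat is a one-letter state code. The sketch below maps the common letters onto the five names above. The function names are mine, and the mapping is a simplification: current kernels report more letters than these five.

```c
#include <assert.h>
#include <string.h>

/* Extract the state letter from a /proc/<pid>/stat line, which has the
 * form "pid (comm) state ...". The command name may itself contain
 * spaces or parentheses, so search for the last ')'. Returns '?' if
 * the line does not have the expected shape. */
char state_from_stat_line(const char *line)
{
    assert(line != NULL);
    const char *p = strrchr(line, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Map a state letter to the state names used in this article. */
const char *state_name(char code)
{
    switch (code) {
    case 'R': return "TASK_RUNNING";
    case 'S': return "TASK_INTERRUPTIBLE";
    case 'D': return "TASK_UNINTERRUPTIBLE";
    case 'Z': return "TASK_ZOMBIE";
    case 'T': return "TASK_STOPPED";
    default:  return "unknown";
    }
}
```

Feeding it a line such as "42 (my prog) Z 1 42" yields TASK_ZOMBIE, which is the state an exited child stays in until its parent waits for it.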

The figure below shows the approximate transitions between these states. It is not very detailed, so search for a fuller diagram if you need one...
