A process is a program executing on the operating system. It consists of a program (machine code) and a state of the program (current control point, variable values, call stack, open file descriptors, etc.).
This section presents the Unix system calls to create new processes and make them run other programs.
The system call fork creates a process.
The new child process is a nearly perfect clone of the parent process that called fork. Both processes execute the same code, are initially at the same control point (the return from fork), attribute the same values to all variables, have identical call stacks, and hold open the same file descriptors to the same files. The only thing that distinguishes the two processes is the return value from fork: zero in the child process, and a non-zero integer in the parent. By checking the return value from fork, a program can thus determine whether it is in the parent process or the child and behave accordingly.
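As a minimal sketch of this dispatch on the return value (assuming the program is linked with OCaml's unix library; the name child_pid and the printed messages are ours, chosen for illustration):

```ocaml
(* Fork once and branch on the value returned by fork. *)
let child_pid =
  match Unix.fork () with
  | 0 ->
      (* fork returned 0: we are in the child *)
      Printf.printf "child: my pid is %d\n%!" (Unix.getpid ());
      exit 0
  | pid ->
      (* fork returned a non-zero integer: we are in the parent *)
      Printf.printf "parent: fork returned %d\n%!" pid;
      pid

(* Reap the child so that it does not linger in the process table. *)
let () = ignore (Unix.waitpid [] child_pid)
```

The order in which the two messages appear is not fixed: it depends on how the kernel schedules the two processes.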
The non-zero integer returned by fork in the parent process is the process id of the child. The kernel uses the process id to uniquely identify each process. A process can obtain its own process id by calling getpid.
The child process is initially in the same state as the parent process (same variable values, same open file descriptors). This state is not shared between the parent and the child, but merely duplicated at the moment of the fork. For example, if a variable is bound to a reference before the fork, a copy of that reference and its current contents is made at the moment of the fork; after the fork, each process independently modifies its “own” reference without affecting the other process.
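The duplication of references can be observed with a small experiment. This is only a sketch; the reference counter is a hypothetical name introduced for the illustration:

```ocaml
(* A reference created before the fork: each process gets its own copy. *)
let counter = ref 0

let () =
  match Unix.fork () with
  | 0 ->
      (* child: this assignment only affects the child's copy *)
      counter := 42;
      exit 0
  | pid ->
      (* parent: wait until the child has finished, then look at our copy *)
      ignore (Unix.waitpid [] pid);
      Printf.printf "parent still sees %d\n%!" !counter
```

The parent prints 0, since the assignment made by the child never reaches the parent's memory.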
Similarly, the open file descriptors are copied at the moment of the fork: one may be closed and the other kept open. On the other hand, the two descriptors designate the same entry in the file table (residing in system memory) and share their current position: if one process reads and then the other, each will read a different part of the file; likewise, changes in the read/write position made by one process with lseek are immediately visible to the other.
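The shared position can be made visible with a scratch file. In this sketch the file contents, the prefix fork_offset and the name parent_read are our own choices, and we assume that a read of three bytes on a small regular file returns all three bytes:

```ocaml
let parent_read =
  (* Create a six-byte scratch file; its name is chosen by the system. *)
  let path = Filename.temp_file "fork_offset" ".txt" in
  let oc = open_out path in
  output_string oc "abcdef";
  close_out oc;
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  let buf = Bytes.create 3 in
  match Unix.fork () with
  | 0 ->
      (* child: consume the first three bytes through its copy of fd *)
      ignore (Unix.read fd buf 0 3);
      exit 0
  | pid ->
      ignore (Unix.waitpid [] pid);
      (* parent: the shared position has already moved past the child's read *)
      let n = Unix.read fd buf 0 3 in
      Unix.close fd;
      Sys.remove path;
      Bytes.sub_string buf 0 n

let () = Printf.printf "parent read %S\n%!" parent_read
```

Although the parent never read the first half of the file itself, its read starts where the child's read stopped.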
The command leave hhmm exits immediately, but forks a background process which, at the time hhmm, reports that it is time to leave.
The program begins with a rudimentary parsing of the command line, in order to extract the time provided. It then calculates the delay in seconds (line 8). The time call returns the current date, in seconds from the epoch (January 1st 1970, midnight). The function localtime splits this duration into years, months, days, hours, minutes and seconds.
It then creates a new process using fork. The parent process (whose return value from fork is a non-zero integer) terminates immediately. The shell which launched leave thereby returns control to the user. The child process (whose return value from fork is zero) continues executing. It does nothing during the indicated time (the call to sleep), then displays its message and terminates.
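The program described above might be sketched as follows. This is a reconstruction under assumptions, not the original listing, so the line numbers mentioned in the text do not apply to it; the helper delay_seconds, the messages and the usage line are ours:

```ocaml
(* Seconds from the current time (now_h:now_m) until hh:mm on the same day. *)
let delay_seconds ~hh ~mm ~now_h ~now_m =
  (hh - now_h) * 3600 + (mm - now_m) * 60

let leave hhmm =
  (* rudimentary parsing: the argument must look like "1730" *)
  let hh = int_of_string (String.sub hhmm 0 2)
  and mm = int_of_string (String.sub hhmm 2 2) in
  let now = Unix.localtime (Unix.time ()) in
  let delay =
    delay_seconds ~hh ~mm ~now_h:now.Unix.tm_hour ~now_m:now.Unix.tm_min in
  if delay <= 0 then begin
    print_endline "That time has already passed!";
    exit 0
  end;
  (* the parent exits at once, handing control back to the shell *)
  if Unix.fork () <> 0 then exit 0;
  (* the child sleeps until the requested time, then reports *)
  Unix.sleep delay;
  print_endline "Time to leave!";
  exit 0

let () =
  if Array.length Sys.argv = 2 then Unix.handle_unix_error leave Sys.argv.(1)
  else prerr_endline "usage: leave hhmm"
```

Keeping the delay computation in a separate pure function makes it possible to check it without forking or sleeping.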
The system call wait waits for one of the child processes created by fork to terminate and returns information about how it terminated. It provides a parent-child synchronization mechanism and a very rudimentary form of communication from the child to the parent.
The primitive system call is waitpid, and the function wait () is merely a shortcut for the expression waitpid [] (-1).
The behavior of waitpid [] p depends on the value of p:
If p > 0, it awaits the termination of the child with id equal to p.
If p = 0, it awaits any child with the same process group id as the calling process.
If p = -1, it awaits any child process.
If p < -1, it awaits a child process with process group id equal to -p.
The first component of the result is the process id of the child caught by wait. The second component of the result is a value of type process_status:
WEXITED r | The child process terminated normally via exit or by reaching the end of the program; r is the return code (the argument passed to exit). |
WSIGNALED s | The child process was killed by a signal (ctrl-C, kill, etc., see chapter 4 for more information about signals); s identifies the signal. |
WSTOPPED s | The child process was halted by the signal s; this occurs only in very special cases where a process (typically a debugger) is currently monitoring the execution of another (by calling ptrace). |
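As an illustration, the three cases of process_status can be decoded by pattern matching. The function describe_status and its messages are hypothetical names introduced for this sketch:

```ocaml
(* Turn a process_status into a human-readable description. *)
let describe_status = function
  | Unix.WEXITED r -> Printf.sprintf "exited with return code %d" r
  | Unix.WSIGNALED s -> Printf.sprintf "killed by signal %d" s
  | Unix.WSTOPPED s -> Printf.sprintf "stopped by signal %d" s

let () =
  match Unix.fork () with
  | 0 -> exit 3                      (* child: terminate normally with code 3 *)
  | pid ->
      let _, status = Unix.waitpid [] pid in
      print_endline (describe_status status)
```

Here the parent prints "exited with return code 3". Note that OCaml reports the standard signals using its own numbering (the constants of the Sys module, such as Sys.sigkill), so the integer carried by WSIGNALED or WSTOPPED is not necessarily the system's signal number.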
If one of the child processes has already terminated by the time the parent calls wait, the call returns immediately. Otherwise, the parent process blocks until some child process terminates (a behavior called “rendezvous”). To wait for n child processes, one must call wait n times.
The function waitpid accepts two optional flags in its first argument: the flag WNOHANG indicates not to wait if there is a child that matches the request but has not yet terminated. In that case, the first result is 0 and the second is undefined. The flag WUNTRACED also reports the child processes that have been halted by the signal sigstop. The function raises the exception Unix_error with the error code ECHILD if no child processes match p (in particular, if p is -1 and the current process has no more children).
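The following sketch shows both behaviors: polling a child with WNOHANG, and using ECHILD to detect that no children are left. The functions try_reap and reap_all are hypothetical helpers, and we assume the process has no children other than the ones forked here:

```ocaml
(* Poll one child without blocking: None means "still running". *)
let try_reap pid =
  match Unix.waitpid [Unix.WNOHANG] pid with
  | 0, _ -> None                     (* the second component is meaningless *)
  | _, status -> Some status

(* Reap every remaining child; ECHILD signals that none are left. *)
let rec reap_all count =
  match Unix.wait () with
  | _ -> reap_all (count + 1)
  | exception Unix.Unix_error (Unix.ECHILD, _, _) -> count

let () =
  for i = 1 to 3 do
    if Unix.fork () = 0 then exit i  (* each child exits at once *)
  done;
  Printf.printf "reaped %d children\n%!" (reap_all 0)
```

Since wait is called once per child before ECHILD is raised, reap_all also illustrates that n children require n calls.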
The function fork_search below performs a linear search in an array with two processes. It relies on the function simple_search to perform the linear search.
After the fork, the child process traverses the upper half of the array, and exits with the return code 1 if it found an element satisfying the predicate cond, or 0 otherwise (lines 16 and 17). The parent process traverses the lower half of the array, then calls wait to synchronize with the child process (lines 21 and 22). If the child terminated normally, it combines its return code with the boolean result of the search in the lower half of the array. Otherwise, something horrible happened, and the function fork_search fails.
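A sketch of the two functions, reconstructed from the description above rather than copied from the original listing (so the line numbers quoted in the text do not apply to it). We assume simple_search can be written with the standard function Array.exists:

```ocaml
(* Linear search: does some element of v satisfy cond? *)
let simple_search cond v = Array.exists cond v

let fork_search cond v =
  let n = Array.length v in
  match Unix.fork () with
  | 0 ->
      (* child: search the upper half and report through the return code *)
      let found = simple_search cond (Array.sub v (n / 2) (n - n / 2)) in
      exit (if found then 1 else 0)
  | _ ->
      (* parent: search the lower half, then synchronize with the child *)
      let found = simple_search cond (Array.sub v 0 (n / 2)) in
      (match Unix.wait () with
       | _, Unix.WEXITED retcode -> found || retcode <> 0
       | _, _ -> failwith "fork_search")
```

For example, fork_search (fun x -> x = 9) [|1; 2; 3; 9|] finds the element in the half handled by the child, and the parent learns about it only through the return code.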
In addition to the synchronization between processes, the wait call also ensures recovery of all resources used by the child processes. When a process terminates, it moves into a “zombie” state, where most, but not all, of its resources (memory, etc.) have been freed. It continues to occupy a slot in the process table to transmit its return value to the parent via the wait call. Once the parent calls wait, the zombie process is removed from the process table. Since this table is of fixed size, it is important to call wait on each forked process to avoid leaks.
If the parent process terminates before the child, the child is given the process number 1 (usually init) as parent. This process contains an infinite loop of wait calls, and will therefore make the child process disappear once it finishes. This leads to the useful “double fork” technique if you cannot easily call wait on each process you create (because you cannot afford to block on termination of the child process, for example).
The child terminates via exit just after the second fork. The grandchild becomes an orphan, and is adopted by init. In this way, it leaves no zombie processes. The parent immediately calls wait to reap the child. This wait will not block for long since the child terminates very quickly.
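The double fork might be packaged as follows. This is a sketch: the function detach and its parameter job are hypothetical names, and we use waitpid on the child's id, which behaves like wait when the child is the only one pending:

```ocaml
(* Run job in a grandchild process that the caller never has to wait for. *)
let detach (job : unit -> unit) =
  match Unix.fork () with
  | 0 ->
      (* child: fork the grandchild, then exit immediately *)
      if Unix.fork () <> 0 then exit 0;
      (* grandchild: now an orphan, adopted by process number 1 *)
      job ();
      exit 0
  | pid ->
      (* parent: reap the short-lived child; the grandchild is not ours *)
      ignore (Unix.waitpid [] pid)
```

After a call such as detach (fun () -> Unix.sleep 10), the caller continues at once and has no child left to wait for.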
The system calls execve, execv, and execvp launch a program within the current process. Except in case of error, these calls never return: they halt the progress of the current program and switch to the new program.
The first argument is the name of the file containing the program to execute. In the case of execvp, this name is looked for in the directories of the search path (specified in the environment variable PATH).
The second argument is the array of command line arguments with which to execute the program; this array will be the Sys.argv array of the executed program.
In the case of execve, the third argument is the environment given to the executed program; execv and execvp pass the current environment unchanged.
The calls execve, execv, and execvp never return a result: either everything works without errors and the process starts the requested program, or an error occurs (file not found, etc.) and the call raises the exception Unix_error in the calling program.
The following three forms are equivalent:
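As an illustration of such equivalent forms, here is a sketch built on assumptions of ours: the command ls -l /tmp, and an ls executable located at /bin/ls and found first on the search path. Each form is wrapped in a function, because a successful call never returns:

```ocaml
(* Arguments for the executed program; args.(0) is the program name. *)
let args = [| "ls"; "-l"; "/tmp" |]

(* Explicit path, current environment passed implicitly. *)
let via_execv () = Unix.execv "/bin/ls" args

(* Name looked up in the directories of PATH. *)
let via_execvp () = Unix.execvp "ls" args

(* Explicit path, current environment passed explicitly. *)
let via_execve () = Unix.execve "/bin/ls" args (Unix.environment ())
```

Calling any one of the three functions replaces the current program with ls, so only one of them can ever be called in a given process.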
Here is a “wrapper” around the command grep which adds the option -i (to ignore case) to the list of arguments:
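One possible way to write this wrapper is sketched below. The helper grep_args is a hypothetical name; it is kept separate so that the construction of the argument array can be examined without actually replacing the process:

```ocaml
(* Argument vector for grep: "grep", "-i", then our own arguments. *)
let grep_args argv =
  Array.append [| "grep"; "-i" |] (Array.sub argv 1 (Array.length argv - 1))

(* Replace the current process by grep, found through PATH. *)
let grep () = Unix.execvp "grep" (grep_args Sys.argv)
```

Ending the program with Unix.handle_unix_error grep () turns it into the wrapper: the call reports an error if grep cannot be executed, and otherwise never returns.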
Here’s a “wrapper” around the command emacs which changes the terminal type:
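A sketch of such a wrapper follows. The path /usr/bin/emacs, the terminal type hacked-xterm and the helper with_term are all assumptions made for the illustration. The helper drops any inherited TERM binding so that the new one is the only one the executed program sees:

```ocaml
(* Build an environment in which TERM is bound to term and nothing else. *)
let with_term term env =
  let is_term binding =
    String.length binding >= 5 && String.sub binding 0 5 = "TERM=" in
  let others =
    List.filter (fun b -> not (is_term b)) (Array.to_list env) in
  Array.append [| "TERM=" ^ term |] (Array.of_list others)

(* Replace the current process by emacs, with the modified environment. *)
let emacs () =
  Unix.execve "/usr/bin/emacs" Sys.argv
    (with_term "hacked-xterm" (Unix.environment ()))
```

As before, the program would end with Unix.handle_unix_error emacs (). Since execve does not search PATH, the full path of the executable has to be given.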
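The process that calls exec is the same one that executes the new program. As a result, the new program inherits some features of the execution environment of the program that called exec:
the same process id and the same parent process;
the same standard input, standard output and standard error;
the same ignored signals (see chapter 4).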
The following program is a simplified command interpreter: it reads lines from standard input, breaks them into words, launches the corresponding command, and repeats until the end of file on the standard input. We begin with the function which splits a string into a list of words. Please, no comments on this horror.
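The original splitting function is not reproduced here. As a stand-in, a minimal sketch using the standard library (String.split_on_char, available since OCaml 4.04) could be the following; it treats only the space character as a separator, which is an assumption on our part:

```ocaml
(* Split a line into words separated by one or more spaces. *)
let split_words s =
  String.split_on_char ' ' s
  |> List.filter (fun word -> word <> "")   (* drop the empty pieces *)
```

For instance, split_words "  ls  -l /tmp " evaluates to ["ls"; "-l"; "/tmp"], and a blank line yields the empty list.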
We now move on to the main loop of the interpreter.
The function exec_command executes a command and handles errors. The return code 255 indicates that the command could not be executed. (This is not a standard convention; we just hope that few commands terminate with a return code of 255.) The function print_status decodes and prints the status information returned by a process, ignoring the return code of 255.
Each time through the loop, we read a line from stdin with the function input_line. This function raises the End_of_file exception when the end of file is reached, causing the loop to exit. We split the line into words, and then call fork. The child process uses exec_command to execute the command. The parent process calls wait to wait for the command to finish and prints the status information returned by wait.
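Putting the pieces together, the interpreter might be sketched as follows. This is a reconstruction under assumptions, not the original listing: split_words is a simple stand-in, status_message is a hypothetical helper, the messages are ours, and blank lines are skipped, which the description above does not mention:

```ocaml
(* Stand-in word splitter: spaces separate words, empty words are dropped. *)
let split_words s =
  List.filter (fun w -> w <> "") (String.split_on_char ' ' s)

(* Run in the child: replace it by the command, or exit with code 255. *)
let exec_command words =
  let cmd = Array.of_list words in
  try Unix.execvp cmd.(0) cmd
  with Unix.Unix_error (err, _, _) ->
    Printf.printf "Cannot execute %s: %s\n%!" cmd.(0) (Unix.error_message err);
    exit 255

(* Message for a status, or None for the reserved return code 255. *)
let status_message program = function
  | Unix.WEXITED 255 -> None
  | Unix.WEXITED code ->
      Some (Printf.sprintf "%s exited with code %d" program code)
  | Unix.WSIGNALED s ->
      Some (Printf.sprintf "%s killed by signal %d" program s)
  | Unix.WSTOPPED _ -> Some (Printf.sprintf "%s stopped" program)

let print_status program status =
  match status_message program status with
  | Some msg -> print_endline msg
  | None -> ()

(* Main loop: one fork and one wait per non-blank line read from ic. *)
let minishell ic =
  try
    while true do
      match split_words (input_line ic) with
      | [] -> ()                     (* blank line: nothing to run *)
      | words ->
          (match Unix.fork () with
           | 0 -> exec_command words
           | _ ->
               let _, status = Unix.wait () in
               print_status "Program" status)
    done
  with End_of_file -> ()
```

Calling Unix.handle_unix_error minishell stdin as the last line of the program starts the interpreter on standard input.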
Add the ability to execute commands in the background if they are followed by &.