[OSTEP] 3. Process API

ch5: Let's look at the APIs for working with processes

2024.07.16.

CH5: Interlude: Process API

This chapter discusses process creation in UNIX systems.

1. The `fork()` System Call

used to create a new process

Example

Figure 5.1: Calling fork() (p1.c)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char* argv[]) {
	printf("hello world (pid:%d)\n", (int)getpid());
	int rc = fork();
	if (rc < 0) {
		// fork failed
		fprintf(stderr, "fork failed\n");
		exit(1);
	}
	else if (rc == 0) {
		// child (new process)
		printf("hello, I am child (pid:%d)\n", (int)getpid());
	}
	else {
		// parent goes down this path (main)
		printf("hello, I am parent of %d (pid:%d)\n",
			rc, (int)getpid());
	}
	return 0;
}

output

prompt> ./p1
hello world (pid:29146)
hello, I am parent of 29147 (pid:29146)
hello, I am child (pid:29147)
prompt>

process identifier (PID)
- used to name the process in UNIX systems

process (PID 29146) prints out a hello world message
process (PID 29146) calls the fork() system call
- The odd part:
  - the process that is created is an (almost) exact copy of the calling process
  - to the OS, it looks like there are two copies of the program p1 running, and both are about to return from the fork() system call.
  - the child doesn't start running at main(); rather, it just comes into life as if it had called fork() itself
- the child isn't an exact copy
  - the value fork() returns to its caller is different:
    - the parent receives the PID of the child
    - the child receives a return code of zero

the output (of p1.c) is not deterministic
- Assuming we are running on a system with a single CPU (for simplicity), then either the child or the parent might run at that point.
- the output below can also occur:
```
prompt> ./p1
hello world (pid:29146)
hello, I am child (pid:29147)
hello, I am parent of 29147 (pid:29146)
prompt>
```
- the CPU scheduler determines which process runs at a given moment in time

2. The `wait()` System Call

makes the parent wait for a child process to finish what it has been doing.

Example

Figure 5.2: Calling fork() And wait() (p2.c)

parent process calls wait() to delay its execution until the child finishes executing.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char* argv[]) {
	printf("hello world (pid:%d)\n", (int)getpid());
	int rc = fork();
	if (rc < 0) { // fork failed; exit
		fprintf(stderr, "fork failed\n");
		exit(1);
	}
	else if (rc == 0) { // child (new process)
		printf("hello, I am child (pid:%d)\n", (int)getpid());
	}
	else { // parent goes down this path (main)
		int rc_wait = wait(NULL);
		printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
			rc, rc_wait, (int)getpid());
	}
	return 0;
}

output

prompt> ./p2
hello world (pid:29266)
hello, I am child (pid:29267)
hello, I am parent of 29267 (rc_wait:29267) (pid:29266)
prompt>

the child will always print first.

wait() won't return until the child has run and exited, so the parent's message always comes after the child's

3. The `exec()` System Call

run a program that is different from the calling program

On Linux, there are six variants of exec(): execl(), execlp(), execle(), execv(), execvp(), and execvpe()

Example

Figure 5.3: Calling fork(), wait(), And exec() (p3.c)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main(int argc, char* argv[]) {
	printf("hello world (pid:%d)\n", (int)getpid());
	int rc = fork();
	if (rc < 0) { // fork failed; exit
		fprintf(stderr, "fork failed\n");
		exit(1);
	}
	else if (rc == 0) { // child (new process)
		printf("hello, I am child (pid:%d)\n", (int)getpid());
		char* myargs[3];
		myargs[0] = strdup("wc"); // program: "wc" (word count)
		myargs[1] = strdup("p3.c"); // argument: file to count
		myargs[2] = NULL; // marks end of array
		execvp(myargs[0], myargs); // runs word count
		printf("this shouldn't print out");
	}
	else { // parent goes down this path (main)
		int rc_wait = wait(NULL);
		printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
			rc, rc_wait, (int)getpid());

	}
	return 0;
}

output
```
prompt> ./p3
hello world (pid:29383)
hello, I am child (pid:29384)
29 107 1030 p3.c
hello, I am parent of 29384 (rc_wait:29384) (pid:29383)
prompt>
```
- given the name of an executable (e.g., wc), and some arguments (e.g., p3.c), it loads code (and static data) from that executable and overwrites its current code segment (and current static data) with it; the heap and stack and other parts of the memory space of the program are re-initialized.
- Then the OS simply runs that program, passing in any arguments as the argv of that process.
- Thus, it does not create a new process; rather, it transforms the currently running program (formerly p3) into a different running program (wc).
- After the exec() in the child, it is almost as if p3.c never ran; a successful call to exec() never returns.

4. Why?

the separation of `fork()` and `exec()`

essential for building a UNIX shell
- it lets the shell run code after the call to fork() but before the call to exec()

redirection

prompt> wc p3.c > newfile.txt

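the output of the program wc is redirected into the output file newfile.txt.
when the child is created, before calling exec(), the shell closes standard output and opens the file newfile.txt. Because open() returns the lowest free file descriptor, the file takes the place of standard output.
the program below does the same.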

Figure 5.4: All Of The Above With Redirection (p4.c)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/wait.h>

int main(int argc, char* argv[]) {
	int rc = fork();
	if (rc < 0) {
		// fork failed
		fprintf(stderr, "fork failed\n");
		exit(1);

	}
	else if (rc == 0) {
		// child: redirect standard output to a file
		close(STDOUT_FILENO);
		open("./p4.output", O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);

		// now exec "wc"...
		char* myargs[3];
		myargs[0] = strdup("wc"); // program: wc (word count)
		myargs[1] = strdup("p4.c"); // arg: file to count
		myargs[2] = NULL; // mark end of array
		execvp(myargs[0], myargs); // runs word count

	}
	else {
		// parent goes down this path (main)
		int rc_wait = wait(NULL);
	}
	return 0;
}

output
```
prompt> ./p4
prompt> cat p4.output
32 109 846 p4.c
prompt>
```
1. When p4 is run, it looks as if nothing has happened. However, p4 did indeed call fork() and then run the wc program via a call to execvp().
2. Nothing appears on the screen because the output has been redirected to the file p4.output.
3. When we cat the output file, we find the output that wc would normally have printed.

pipe

UNIX pipes are implemented in a similar way to redirection, but with the pipe() system call.
the output of one process is connected to an in-kernel pipe, and the input of another process is connected to that same pipe

5. Process Control

There are a lot of other interfaces for interacting with processes in UNIX systems

the kill() system call is used to send signals to a process
in UNIX shells, certain keystroke combinations are configured to deliver a specific signal to the currently running process
- control-c: SIGINT (interrupt, normally terminating the process)
- control-z: SIGTSTP (stop), which pauses the process so you can resume it later
