📖 Preface: this post covers process control: creation, termination, waiting, and program replacement.

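🕒 1. Process creation
- 🕘 1.1 A first look at fork
- 🕘 1.2 The return values of fork
- 🕘 1.3 Copy-on-write
- 🕘 1.4 Creating many processes
🕒 2. Process termination
- 🕘 2.1 Exit codes
- 🕘 2.2 How a process exits
🕒 3. Process waiting
- 🕘 3.1 Why waiting is necessary
- 🕘 3.2 How to wait
  - 🕤 3.2.1 Reclaiming child resources: wait
  - 🕤 3.2.2 Getting the child's exit information: waitpid
  - 🕤 3.2.3 Decoding the child's status
- 🕘 3.3 Process exit revisited
- 🕘 3.4 Blocking and non-blocking waits
  - 🕤 3.4.1 Blocking vs. non-blocking
🕒 4. Program replacement
- 🕘 4.1 Demo
- 🕘 4.2 How it works
  - 🕤 4.2.1 Multiple processes
- 🕘 4.3 The exec functions
  - 🕤 4.3.1 execlp
  - 🕤 4.3.2 execv
  - 🕤 4.3.3 execvp
  - 🕤 4.3.4 execle
- 🕘 4.4 Running a program you wrote yourself
- 🕘 4.5 Use case: a minimal shell
  - 🕤 4.5.1 The current working directory
  - 🕤 4.5.2 Changing the working directory: chdir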

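🕒 1. Process creation

🕘 1.1 A first look at fork

As mentioned earlier, fork creates a new process from an existing one. The new process is the child; the original is the parent.

#include <unistd.h>
pid_t fork(void);
// Returns 0 in the child, the child's PID in the parent, and -1 on error

Before the call there is a single process. When it calls fork and control transfers to the fork code in the kernel, the kernel:

- Allocates memory and kernel data structures for the child (the PCB, an address space, and page tables, and sets up the corresponding mappings).
- Copies part of the parent's data structures into the child.
- Adds the child to the system's process list (indexed so that it can be looked up by PID).
- Returns from fork, after which the scheduler can pick either process to run.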

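🕘 1.2 The return values of fork

How can fork appear to return twice?

By the time the code inside fork is about to return, the child has already been created and may already be sitting in the OS run queue. From that point there are two execution flows, one in the parent and one in the child, and each executes its own return. Both flows then continue with the code that follows the fork call in main(), which is why we previously saw two sets of output: one from each process.

Why does fork return the child's PID to the parent and 0 to the child?

As in real life, the parent-to-child relationship is 1:n. A child has exactly one parent and can always find it (with getppid()), but a parent may have many children, so it needs an identifier to address a particular one. That is why the parent receives the child's PID.

How can the same variable id hold two different values, so that both the if and the else if branches run?

In pid_t id = fork(), returning a value means writing to id. Each process performs that write on its own, and because processes are independent, the write triggers copy-on-write. Both processes see id at the same virtual address, but that address maps to different physical memory, so the contents differ.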
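The two return values can be observed directly. The following is a minimal sketch, not code from the original post: the helper name fork_return_values_ok and the exit code 42 are made up for illustration. The child exits with 42, and the parent checks that the value fork gave it is the PID that waitpid later reports.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent's copy of id is the PID of the
 * child that ran the id == 0 branch, 0 if not, and -1 on fork/waitpid error. */
int fork_return_values_ok(void)
{
    pid_t id = fork();              /* one call, two returns */
    if (id < 0)
        return -1;
    if (id == 0)
        _exit(42);                  /* child: its private copy of id holds 0 */

    /* Parent: its copy of id holds the child's PID. */
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret < 0)
        return -1;
    return ret == id && WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

Calling fork_return_values_ok() from main should yield 1 on a system where fork succeeds.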

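🕘 1.3 Copy-on-write

The previous post on process address spaces introduced copy-on-write; here is a recap.

Parent and child normally share their code, and as long as neither writes, they share their data too. When either one tries to write, the affected data is copied so that each has its own private copy. (The "virtual memory" here is the process address space.)
In other words, while the data is unmodified, the virtual memory of parent and child maps to the same physical memory. When the child modifies some data, the kernel copies the affected physical page to another location in physical memory and updates the child's page table so that its virtual address points to the new copy.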
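The effect of copy-on-write is easy to check from user space. This is a small sketch under assumptions: the names g_val and parent_value_after_child_write are invented here, and the test only shows the visible result (the parent's value is unchanged), not the page copy itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_val = 100;   /* same virtual address in parent and child */

/* Hypothetical helper: the child overwrites g_val; return the value the
 * parent still sees afterwards, or -1 on error. */
int parent_value_after_child_write(void)
{
    g_val = 100;
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        g_val = 200;                     /* the write gives the child a private copy */
        _exit(g_val == 200 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return g_val;                        /* the parent's copy was never touched */
}
```

The parent reads 100 even though the child wrote 200 to what looks like the same variable.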

🕘 1.4 Creating many processes

The code is as follows:

#include <stdio.h>
#include <unistd.h>

int main()
{
    int cnt = 0;
    while(1)
    {
        int ret = fork();
        if(ret < 0){
            printf("fork error!, cnt: %d\n", cnt);
            break;
        }
        else if(ret == 0){
            // child
            while(1) sleep(1);
        }
        else
        {
            // parent: nothing to do here, keep looping
        }
        cnt++;
    }
    return 0;
}

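Because this loop never stops forking, it eventually exhausts the process limit or other system resources: fork starts to fail and the machine may become unresponsive. Run it only on a machine you can afford to reboot; restarting the server clears the leftover processes.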
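For experimenting safely, a bounded variant is more practical. The sketch below is an illustration, not the original program: spawn_and_reap is a made-up helper that creates at most n children, lets each exit immediately, and reaps every one of them.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork up to n children and reap them all.
 * Returns the number of children that were reaped. */
int spawn_and_reap(int n)
{
    int created = 0;
    for (int i = 0; i < n; i++) {
        pid_t id = fork();
        if (id < 0)
            break;                  /* fork failed: stop creating children */
        if (id == 0)
            _exit(0);               /* child: exit right away */
        created++;                  /* parent: count the new child */
    }

    int reaped = 0;
    while (reaped < created && wait(NULL) > 0)
        reaped++;                   /* each wait() collects one exited child */
    return reaped;
}
```

Because every child is waited for, no zombies are left behind when the function returns.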

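🕒 2. Process termination

🕘 2.1 Exit codes

C and C++ programs usually end with return 0;. The value returned from main is the process's exit code. By convention a process that succeeded exits with 0 and one that failed exits with a non-zero value, with different non-zero values standing for different errors.

Meaning of the exit code: 0 means success; non-zero means failure, and the specific value identifies the error.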

#include <stdio.h>
int addToTarget(int from, int to)
{
    int sum = 0;
    for(int i = from; i < to; i++)
    {
        sum += i;
    }
	return sum;
}
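int main()
{
	// The loop in addToTarget stops at to - 1, so the sum is 4950, not 5050
	int num = addToTarget(1, 100);
	if(num == 5050)
	    return 0;	// exit code 0: the result is correct
	else
	    return 1;	// non-zero exit code: the result is wrong
}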

The exit code can be inspected with echo $?:

[hins@VM-12-13-centos exit]$ ./mytest
[hins@VM-12-13-centos exit]$ echo $?	# exit code of the most recent command run from this shell
1
[hins@VM-12-13-centos exit]$ echo $?	
0							# why 0 this time? The most recent command was the previous echo, and it succeeded

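C provides strerror(n), declared in <string.h>, which maps an error number n (an errno value) to a message, so different values of n stand for different errors. The loop below, which also needs <stdio.h>, prints the messages: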

for(int i=0; i<200; i++)
{
    printf("%d: %s\n", i, strerror(i));
}

0: Success
1: Operation not permitted
2: No such file or directory
3: No such process
4: Interrupted system call
5: Input/output error
6: No such device or address
......
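These numbers are the values a failed call leaves in errno. As a small sketch (the helper name and the path are made up, and the path is assumed not to exist), the function below provokes error 2 from the list above:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open a path assumed not to exist and return
 * the errno value of the failure (0 if the open unexpectedly succeeded). */
int errno_for_missing_file(void)
{
    int fd = open("/no/such/dir/no-such-file", O_RDONLY);
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    return errno;   /* ENOENT, whose message is "No such file or directory" */
}
```

Passing the returned value to strerror gives the matching message from the table.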

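To summarize:

- The code ran to completion and the result is correct: return 0;
- The code ran to completion and the result is wrong: return a non-zero value (this is where the exit code matters, because it identifies the error).
- The code terminated abnormally: the exit code is meaningless.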

🕘 2.2 How a process exits

Returning from main

This is the approach we use most often.

Calling exit(code) from anywhere

code is the exit code. A demonstration:

#include <stdio.h>
#include <stdlib.h>
int main()
{
	printf("Hello World!\n");  
	exit(12); 
}

[hins@VM-12-13-centos exit]$ ./mytest
Hello World!
[hins@VM-12-13-centos exit]$ echo $?
12

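So when we want to know how a process ended, we can look at its exit code.

Calling exit inside a function also terminates the process immediately, and the function never returns a value. Consider this example: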

#include <stdio.h>
#include <stdlib.h>
int addToTarget(int from, int to)
{
    int sum = 0;
    for(int i = from; i < to; i++)
    {
        sum += i;
    }
    exit(15); 
	// return sum;
}
int main()
{
	printf("Hello World!\n");  
	int ret = addToTarget(1, 100);
	printf("%d\n", ret);  
}

[hins@VM-12-13-centos exit]$ ./mytest
Hello World!
[hins@VM-12-13-centos exit]$ echo $?
15

The process ends at the exit call, so none of the code after it runs.

Calling _exit()

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int addToTarget(int from, int to)
{
    int sum = 0;
    for(int i = from; i < to; i++)
    {
        sum += i;
    }
    //exit(15); 	// library function
    _exit(31);		// system call
	// return sum;
}
int main()
{
	printf("Hello World!\n");  
	int ret = addToTarget(1, 100);
	printf("%d\n", ret);  
}

[hins@VM-12-13-centos exit]$ ./mytest
Hello World!
[hins@VM-12-13-centos exit]$ echo $?
31

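It appears to do the same thing as exit(). In fact, _exit() is a system call, while exit() is a library function layered on top of the OS: exit does its own cleanup and then calls _exit internally. There is a visible difference between them, though. Remove the newline from the output and compare: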

int main()
{
	printf("Hello World!");  
	sleep(2);
	exit(6);		// test 1
	// _exit(6);	// test 2
}

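With exit, the buffer is flushed when the process ends, so the text appears after the 2-second pause. With _exit, it never appears.

In summary:

- exit terminates the process and flushes the stdio buffers first;
- _exit terminates the process without flushing them.

It follows that this user-level buffer lives above the system-call layer, in the C library rather than in the kernel. The details will be covered in the post on basic I/O.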

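🕒 3. Process waiting

🕘 3.1 Why waiting is necessary

As discussed before, if a child exits and the parent ignores it, the child can become a zombie process and its kernel resources leak. Once a process is a zombie, even kill -9 cannot help, because nothing can kill a process that is already dead. In addition, the parent needs to know how the task it handed to the child went: whether the child finished, whether the result is correct, and whether it exited normally. By waiting, the parent reclaims the child's resources and obtains its exit information.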

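🕘 3.2 How to wait

🕤 3.2.1 Reclaiming child resources: wait

The manual page is available via man 2 wait:

#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);
Return value:
    The PID of the child that was waited for on success, -1 on failure.
Parameter:
    status is an output parameter that receives the child's exit status; pass NULL if you don't care.

The following example leaves the child in the zombie state for a while and then reaps it: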

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
	pid_t id = fork();
	if (id == 0)
	{
		// child
		int cnt = 5;
		while (cnt)
		{
			printf("我是子进程: %d, 父进程: %d, cnt: %d\n", getpid(), getppid(), cnt--);
			sleep(1);
		}
		exit(0); // child exits
	}
	// parent
	sleep(10);	// the child exits after 5 s and stays a zombie (Z) for about 5 s until it is reaped
	pid_t ret = wait(NULL);
	if (ret > 0)	// wait returned the PID of the reaped child
	{
		printf("wait success：%d\n", ret);
	}
	sleep(5);
}

Monitor the processes from another terminal with: while :; do ps ajx | head -1 && ps ajx | grep mytest | grep -v grep; sleep 1; done

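🕤 3.2.2 Getting the child's exit information: waitpid

pid_t waitpid(pid_t pid, int *status, int options);
Return value:
    On success, the PID of the child whose state was collected;
    0 if WNOHANG was set and no matching child has exited yet;
    -1 on error, with errno set to indicate the cause.
Parameters:
    pid:
        pid == -1: wait for any child, equivalent to wait.
        pid > 0: wait for the child whose PID equals pid.
    status:
        WIFEXITED(status): true if the child terminated normally.
        WEXITSTATUS(status): if WIFEXITED is true, extracts the child's exit code.
    options:
        0: block until the child exits.
        WNOHANG: if the child specified by pid has not exited, return 0 immediately without waiting; if it has exited, return its PID.

Another example: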

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
	pid_t id = fork();
	if (id == 0)
	{
		// child
		int cnt = 5;
		while (cnt)
		{
			printf("我是子进程: %d, 父进程: %d, cnt: %d\n", getpid(), getppid(), cnt--);
			sleep(1);
		}
		exit(10); // child exits
	}
	// parent
	int status = 0; // not used as a plain integer: it is a bit field
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0)	// waitpid returned the PID of the reaped child
    {
        printf("wait success: %d, sig number: %d, child exit code: %d\n", ret, (status & 0x7F), (status>>8)&0xFF);
    }
    
	sleep(5);
}

[hins@VM-12-13-centos exit]$ ./mytest
我是子进程: 24005, 父进程: 24004, cnt: 5
我是子进程: 24005, 父进程: 24004, cnt: 4
我是子进程: 24005, 父进程: 24004, cnt: 3
我是子进程: 24005, 父进程: 24004, cnt: 2
我是子进程: 24005, 父进程: 24004, cnt: 1
wait success: 24005, sig number: 0, child exit code: 10

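🕤 3.2.3 Decoding the child's status

wait and waitpid both take a status parameter. It is an output parameter filled in by the operating system. Passing NULL means we don't care about the child's exit status; otherwise the OS reports the child's exit information through it. status must not be treated as a plain integer. It is a bit field, and only its low 16 bits matter here.

Put differently, the value we get back does not directly give the result we want: it packs several fields together and has to be taken apart.

Of the 32 bits, only the low 16 are meaningful here. Bits 0 to 6 hold the number of the signal that terminated the child (0 means it was not killed by a signal), bit 7 is the core dump flag, and bits 8 to 15 hold the child's exit code. That is why the code above extracts the signal with status & 0x7F and the exit code with (status >> 8) & 0xFF.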
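Both fields can be exercised with one helper. This is a sketch under assumptions: struct child_result and run_child are names invented here, and SIGKILL is used as the test signal only because it cannot be caught and produces no core dump. The manual bit operations shown are the ones described above; the portable equivalents are WEXITSTATUS and WTERMSIG.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct child_result {
    int exit_code;   /* bits 8..15 of status; meaningful only when sig == 0 */
    int sig;         /* bits 0..6 of status; 0 means a normal exit */
};

/* Hypothetical helper: run a child that exits with exit_code or, if sig is
 * non-zero, kills itself with that signal; then decode the wait status. */
struct child_result run_child(int exit_code, int sig)
{
    struct child_result r = { -1, -1 };
    pid_t id = fork();
    if (id < 0)
        return r;
    if (id == 0) {
        if (sig != 0)
            raise(sig);              /* terminated by a signal */
        _exit(exit_code);            /* normal termination */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return r;
    r.sig = status & 0x7F;               /* same value as WTERMSIG(status) */
    r.exit_code = (status >> 8) & 0xFF;  /* same value as WEXITSTATUS(status) */
    return r;
}
```

run_child(10, 0) reports exit code 10 with signal 0, while run_child(0, SIGKILL) reports signal 9 and an exit code that should be ignored.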

Now let's plant a bug in the child to simulate code that does not run to completion.

...
while (cnt)
{
	printf("我是子进程: %d, 父进程: %d, cnt: %d\n", getpid(), getppid(), cnt--);
	sleep(1);
	int *p = NULL;
	*p = 100;		// bug: writes through a NULL pointer
}
...

[hins@VM-12-13-centos exit]$ ./mytest
我是子进程: 25101, 父进程: 25100, cnt: 5
wait success: 25101, sig number: 11, child exit code: 0

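sig number is 11, which indicates a problem; kill -l shows that signal 11 is SIGSEGV, a segmentation fault.

Reading the source of struct task_struct, we find that both the exit code and the termination signal are stored in the PCB.

To recap what waiting does:

- It lets the OS release the child's zombie state.
- It retrieves the child's exit result (if the child has not finished, the parent blocks until it exits).

To show WIFEXITED(status) and WEXITSTATUS(status) in use, the next example kills the child while parent and child are both running normally: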

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

int main()
{
	pid_t id = fork();
	assert(id != -1);

	if (id == 0)
	{
		//child
		int cnt = 50;
		while (cnt)
		{
			printf("child running, pid: %d, ppid: %d, cnt: %d\n", getpid(), getppid(), cnt--);
			sleep(1);
		}
		exit(11);
	}

	// parent
	int status = 0;
	pid_t ret = waitpid(id, &status, 0);
	if (ret > 0)
	{
		// did the child exit normally?
		if (WIFEXITED(status))
		{
			// yes: the exit code tells us whether its result is OK
			printf("exit code: %d\n", WEXITSTATUS(status));
		}
		else
		{
			printf("child exit not normal!\n");
		}
	}
	return 0;
}

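Find the child with ps ajx | head -1 && ps ajx | grep mychild, then kill it with kill -9 followed by its PID. WIFEXITED is then false and the parent prints child exit not normal!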

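🕘 3.3 Process exit revisited

- When a process exits it becomes a zombie and records its exit result in its own task_struct.
- wait/waitpid are system calls, so they run with the OS's privileges, and the OS can read the child's task_struct to fill in status.

Together, these two points mean that the child's termination signal and exit code are kept in the child's PCB.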

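🕘 3.4 Blocking and non-blocking waits

An analogy first:

It is mealtime and you have agreed to eat with your roommate, but his game of Honor of Kings isn't over, so you stand there and watch until he finishes. The next day the same thing happens, but this time you scroll through videos and glance over now and then to see whether the enemy base has fallen yet. When "Victory" finally sounds, you go and eat together.

Watching your roommate until he finishes is a blocking wait: you do nothing else until his state changes. On the second day you do something else and check on him from time to time. Each check is a non-blocking wait, and repeating it is polling. Checking on your roommate corresponds to the wait/waitpid system call, you are the parent process, and your roommate is the child.

Blocking waits were demonstrated above, so here is the non-blocking version: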

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

int main()
{
	pid_t id = fork();
	assert(id != -1);

	if (id == 0)
	{
		//child
		int cnt = 10;
		while (cnt)
		{
			printf("child running, pid: %d, ppid: %d, cnt: %d\n", getpid(), getppid(), cnt--);
			sleep(3);
		}
		exit(11);
	}

	// parent
	int status = 0;
	while (1)
	{
		pid_t ret = waitpid(id, &status, WNOHANG);	// WNOHANG: non-blocking; return at once if the child has not exited
		if (ret == 0)
		{
			// waitpid succeeded, but the child has not exited yet
			// (not a failure: the call merely found the child still running)
			printf("wait done, but child is running......\n");
		}
		else if (ret > 0)
		{
			// waitpid succeeded and the child has exited
			printf("wait success, exit code: %d, sign: %d\n", (status >> 8) & 0xFF, status & 0x7F );
			break;
		}
		else
		{
			// waitpid failed
			printf("waitpid call failed\n");
			break;
		}
		sleep(1);
	}
	return 0;
}

In this code the parent does not block while the child runs. It polls with a non-blocking waitpid inside the while loop until the child exits.

[hins@VM-12-13-centos child]$ ./mychild
wait done, but child is running......
child running, pid: 22017, ppid: 22016, cnt: 10
wait done, but child is running......
wait done, but child is running......
child running, pid: 22017, ppid: 22016, cnt: 9
wait done, but child is running......
wait done, but child is running......
wait done, but child is running......
child running, pid: 22017, ppid: 22016, cnt: 8
wait done, but child is running......
wait done, but child is running......
wait done, but child is running......
......
child running, pid: 22017, ppid: 22016, cnt: 1
wait done, but child is running......
wait done, but child is running......
wait done, but child is running......
wait success, exit code: 11, sign: 0

If the child crashes, the parent sees that too. To demonstrate, we add a NULL pointer dereference to the child:

...
while (cnt)
{
	printf("child running, pid: %d, ppid: %d, cnt: %d\n", getpid(), getppid(), cnt--);
	sleep(3);
	int *p = NULL;
	*p = 100;		// bug: writes through a NULL pointer
}
...

[hins@VM-12-13-centos child]$ ./mychild
wait done, but child is running......
child running, pid: 23075, ppid: 23074, cnt: 10
wait done, but child is running......
wait done, but child is running......
wait success, exit code: 0, sign: 11

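The child's exit code is 0 and the termination signal is 11. For a process that was killed by a signal the exit code is meaningless, so we ignore the 0.

When does the wait itself fail? For example, when the PID passed in is not one of our children: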

......
	while (1)
	{
		pid_t ret = waitpid(id+1, &status, WNOHANG);	// id+1: a deliberately wrong PID
		if (ret == 0)
......

[hins@VM-12-13-centos child]$ ./mychild
waitpid call failed		# the wait failed
child running, pid: 23924, ppid: 23923, cnt: 10
[hins@VM-12-13-centos child]$ child running, pid: 23924, ppid: 1, cnt: 9
child running, pid: 23924, ppid: 1, cnt: 8
[hins@VM-12-13-centos child]$ child running, pid: 23924, ppid: 1, cnt: 7
child running, pid: 23924, ppid: 1, cnt: 6
^C		# the orphaned child was adopted by PID 1 and now runs in the background, so Ctrl+C does not reach it
[hins@VM-12-13-centos child]$ child running, pid: 23924, ppid: 1, cnt: 5
child running, pid: 23924, ppid: 1, cnt: 4
child running, pid: 23924, ppid: 1, cnt: 3
# kill -9 23924 would kill the child
......
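When waitpid fails this way, errno says why. A minimal sketch follows, with a made-up helper name; it passes the caller's own PID, which is certainly not one of its children, so the call is expected to fail with ECHILD.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for a PID that is not our child and return the
 * resulting errno (0 if the call unexpectedly did not fail). */
int waitpid_errno_for_non_child(void)
{
    int status = 0;
    errno = 0;
    pid_t ret = waitpid(getpid(), &status, WNOHANG);   /* our own PID: not a child */
    if (ret == -1)
        return errno;   /* ECHILD: no such child process */
    return 0;
}
```

Checking errno in the failure branch lets a program tell a wrong PID apart from other errors such as an interrupted call.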

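🕤 3.4.1 Blocking vs. non-blocking

The advantage of a non-blocking wait is that it does not tie up the parent: between polls the parent can do other work. The code below demonstrates this: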

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <assert.h>
#define NUM 10
typedef void(*func_t)();	// function pointer type for a task
func_t handlerTask[NUM] ;
// sample tasks
void task1() 
{
	printf("handler task1\n");
}

void task2()
{
	printf("handler task2\n");
}

void task3()
{
	printf("handler task3\n");
}

void loadTask()
{
	memset(handlerTask, 0, sizeof(handlerTask));
	handlerTask[0] = task1;
	handlerTask[1] = task2;
	handlerTask[2] = task3;
}

int main()
{
	pid_t id = fork();
	assert(id != -1);

	if (id == 0)
	{
		//child
		int cnt = 10;
		while (cnt)
		{
			printf("child running, pid: %d, ppid: %d, cnt: %d\n", getpid(), getppid(), cnt--);
			sleep(1);
		}
		exit(10);
	}
	loadTask();
	// parent
	int status = 0;
	while (1)
	{
		pid_t ret = waitpid(id, &status, WNOHANG);	// WNOHANG: non-blocking; return at once if the child has not exited
		if (ret == 0)
		{
			// waitpid succeeded, but the child has not exited yet
			// (not a failure: the call merely found the child still running)
			printf("wait done, but child is running......parent running other things\n");
			for (int i = 0; handlerTask[i] != 0; i++)
			{
				handlerTask[i]();	// callbacks: work the parent does while it would otherwise be idle
			}
		}
		else if (ret > 0)
		{
			// waitpid succeeded and the child has exited
			printf("wait success, exit code: %d, sign: %d\n", (status >> 8) & 0xFF, status & 0x7F );
			break;
		}
		else
		{
			// waitpid failed
			printf("waitpid call failed\n");
			break;
		}
		sleep(1);
	}
	return 0;
}

[hins@VM-12-13-centos child]$ ./mychild
wait done, but child is running......parent running other things
handler task1
handler task2
handler task3
child running, pid: 27116, ppid: 27115, cnt: 10
wait done, but child is running......parent running other things
handler task1
handler task2
handler task3
child running, pid: 27116, ppid: 27115, cnt: 9

......
wait done, but child is running......parent running other things
handler task1
handler task2
handler task3
child running, pid: 27116, ppid: 27115, cnt: 1
wait done, but child is running......parent running other things
handler task1
handler task2
handler task3
wait success, exit code: 10, sign: 0

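Some readers may conclude that non-blocking is always better. It isn't: both modes exist side by side and neither is better in general. A blocking wait is simpler when the parent has nothing else to do; a non-blocking wait fits when it does.

A short recap:

What is process waiting?

A: A mechanism, provided through system calls, by which a parent waits for its child.

Why wait?

A: To release the zombie child and to obtain its status (exit code and termination signal).

How do we wait?

A: With wait/waitpid, in either blocking or non-blocking mode.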

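🕒 4. Program replacement

Why create a child process?

- To have the child run part of the parent's code (part of the parent's program on disk).
- To have the child run a completely different program: the child loads a specified program from disk and executes that program's code and data. This is program replacement.

🕘 4.1 Demo

There are six functions in this family. To see what program replacement looks like, we start with one of them.

int execl(const char *path, const char *arg, ...);// load the specified program into memory and have the calling process execute it

Running a program takes two steps: find it, then run it with whatever options are wanted, just as on the command line. For execl, the first parameter path is the path to the program. The parameters after it are the program's command-line arguments, which select how it runs; by convention the first of them is the program's own name (argv[0]). The ... is a variadic parameter list, the same mechanism that lets scanf and printf accept any number of arguments. Here it lets us pass any number of options (cmd option1 option2 ...), and the list must end with NULL.

With that in mind, let's try it: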

#include <stdio.h>
#include <unistd.h>
int main()
{
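	// .c -> executable -> loaded -> process -> runs the code we wrote
	printf("process is running...\n");	
	// load and run another executable
	execl("/usr/bin/ls"/*which program to run*/,"ls","--color=auto"/*color scheme*/, "-a", "-l", NULL/*how to run it*/);	// every exec* argument list ends with NULL
	printf("process is running...\n" );	// reached only if execl failed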
	return 0;
}

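The output is identical to running the command directly. This is program replacement.

Notice, though, that the first printf produced output but the printf after execl did not. The next section explains why.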

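🕘 4.2 How it works

Program replacement loads the code and data of the specified program over the calling process's own code and data. No new process is created.

When our program runs, the process's address space is mapped through the page table to physical memory, into which the program was loaded from disk. The first printf runs as usual. At the execl call the replacement happens: the code we wrote is overwritten by the code of the program named in the execl call, loaded from disk, and execution continues in that new code and data. That is why the printf after execl never ran.

Any function call can fail, and for exec that means the replacement did not happen. On failure the exec functions return -1, the program is not replaced, and the code after execl continues to run. On success they do not return at all.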
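The failure path can be checked with a path that does not exist. The sketch below is an illustration under assumptions: the helper name and the path are invented, and the exit codes 127 and 126 are borrowed from the usual shell convention rather than required by exec.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child tries to exec a program assumed not to
 * exist; returns the child's exit code, or -1 on error. */
int exit_code_when_exec_fails(void)
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        execl("/no/such/dir/no-such-program", "no-such-program", (char *)NULL);
        /* Reached only because execl failed and returned -1. */
        _exit(errno == ENOENT ? 127 : 126);
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because exec only returns on failure, no check of its return value is needed: any statement after the call is already the error path.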

🕤 4.2.1 Multiple processes

This time we create a child with fork and call execl in the child:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

int main()
{
	printf("process is running...\n");
	pid_t id = fork();
	assert(id != -1);

	if (id == 0)
	{
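		// child
		// does the replacement here affect the parent?
		// rule of thumb: pass the arguments exactly as you would type them on the command line
		sleep(1);
		execl("/usr/bin/ls", "ls", "-a", "-l", "--color=auto", NULL);
		exit(1); // reached only if execl failed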
	}

	int status = 0;
	pid_t ret = waitpid(id, &status, 0);
	if (ret > 0) 
		printf("wait success: exit code: %d, sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
	return 0;
}

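If we pass a path or program that does not exist, execl fails, the child falls through to exit(1), and the parent reports exit code 1. Does the child's execl affect the parent? No, because processes are independent. Here is why:

With only the parent, the mapping described above is in place. When fork runs, the child gets its own PCB, address space, and page table, and initially shares the parent's physical memory. When the child calls execl, its image changes. To preserve independence, the child's mappings are redirected to newly loaded code and data in a different region of physical memory, in the same spirit as copy-on-write, so the parent's pages are untouched. This also shows that code, not just data, can be separated in this way.

Summary: virtual address spaces plus page tables guarantee process independence. Whenever one execution flow replaces its code or data, it gets its own copy and the other process is unaffected.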
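The independence is easy to confirm: after the child replaces itself, the parent still runs its own code with its own data. This is a sketch under assumptions: the names are invented, and /bin/sh is assumed to exist and is used only as a convenient program that exits with a chosen code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int parent_marker = 1234;   /* data that belongs to the parent's image */

/* Hypothetical helper: the child replaces itself with "sh -c 'exit 3'".
 * Returns the child's exit code if the parent's data is intact, else -1. */
int parent_unaffected_by_child_exec(void)
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        _exit(127);                 /* reached only if execl failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (parent_marker != 1234)      /* the parent's own data is still in place */
        return -1;
    return WEXITSTATUS(status);
}
```

The exit code 3 comes from the replaced child, while the function that observes it is still the parent's original code.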

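🕘 4.3 The exec functions

Besides execl there are several similar interfaces. The functions whose names start with exec are collectively called the exec functions.

#include <unistd.h>
int execl(const char *path, const char *arg, ...);// l (list): arguments are passed as a list.
int execlp(const char *file, const char *arg, ...);// p (path): searches the PATH environment variable, so a bare program name is enough.
int execle(const char *path, const char *arg, ...,char *const envp[]);// e (env): the caller supplies the environment.
int execv(const char *path, char *const argv[]);// v (vector): arguments are passed together in an array instead of a variadic list.
int execvp(const char *file, char *const argv[]);// vp combines v and p.

On success these functions load the new program, which starts executing from its startup code, and they never return. On error they return -1. So the exec functions have a return value only for failure.

Usage examples follow.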

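🕤 4.3.1 execlp

execlp("ls", "ls", "-a", "-l", "--color=auto", NULL);	// a bare ls is enough, no full path needed

The two ls arguments are not redundant: the first says which program to run, and the second is argv[0], the start of the description of how to run it.

🕤 4.3.2 execv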

char *const argv_[] = {
	    "ls",
	    "-a",
	    "-l",
	    "--color=auto",
	    NULL
};
execv("/usr/bin/ls", argv_);

🕤 4.3.3 execvp

execvp simply combines v and p:

char *const argv_[] = {
	    "ls",
	    "-a",
	    "-l",
	    "--color=auto",
	    NULL
};
execvp("ls", argv_);
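The fragments above only show the exec call itself. A complete fork, exec, and wait cycle around execvp might look like the sketch below. The helper name run_argv is made up, and 127 as the failure code follows shell convention, not the exec API.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the NULL-terminated argv in a child via execvp
 * and return its exit code, or -1 on error or abnormal termination. */
int run_argv(char *const argv[])
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        execvp(argv[0], argv);      /* argv[0] is looked up in PATH */
        _exit(127);                 /* reached only if execvp failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the arguments arrive as an array, the same helper works for any command without changing the call site.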

🕤 4.3.4 execle

Note: read section 4.4 first, then come back to this one.

// mybin.c
#include <stdio.h>
#include <stdlib.h>
int main()
{
	// provided by the system
    printf("PATH:%s\n", getenv("PATH"));
    printf("PWD:%s\n", getenv("PWD"));
    // user-defined
    printf("MYENV:%s\n", getenv("MYENV"));
	printf("I'm another C program.\n");
	printf("I'm another C program.\n");
	printf("I'm another C program.\n");                                             
	return 0;
}

// myexec.c
char *const envp_[] = {
	(char*)"MYENV=11112222233334444",
	NULL
};
execle("./mybin", "mybin", NULL, envp_); // custom environment

[hins@VM-12-13-centos exec]$ ./myexec
process is running...
total 44
drwxrwxr-x  2 hins hins 4096 Feb 15 19:23 .
drwxrwxr-x 11 hins hins 4096 Feb 15 12:33 ..
-rw-rw-r--  1 hins hins  146 Feb 15 17:46 Makefile
-rwxrwxr-x  1 hins hins 8464 Feb 15 19:23 mybin
-rw-rw-r--  1 hins hins  347 Feb 15 19:23 mybin.c
-rwxrwxr-x  1 hins hins 8824 Feb 15 19:23 myexec
-rw-rw-r--  1 hins hins 1147 Feb 15 19:22 myexec.c
wait success: exit code: 0, sig: 0
PATH:(null)
PWD:(null)
MYENV:11112222233334444
I'm another C program.
I'm another C program.
I'm another C program.

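With this call the system's environment variables are unavailable and only our custom one is visible. The reason is the last argument: the new program gets exactly the environment we pass in, and nothing else. So if we change the call to:

extern char **environ;
execle("./mybin", "mybin", NULL, environ); // the system's environment

the situation is reversed: the system variables are visible and our custom one is not.

To get both, we use a function mentioned in an earlier post, putenv:

extern char **environ;
putenv((char*)"MYENV=4443332211"); // add the variable to the environment table that environ points to
execle("./mybin", "mybin", NULL, environ); // the system's environment, now including MYENV

Ignoring the compiler warning, the result is: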

[hins@VM-12-13-centos exec]$ ./myexec
process is running...
total 44
drwxrwxr-x  2 hins hins 4096 Feb 15 19:42 .
drwxrwxr-x 11 hins hins 4096 Feb 15 12:33 ..
-rw-rw-r--  1 hins hins  146 Feb 15 17:46 Makefile
-rwxrwxr-x  1 hins hins 8464 Feb 15 19:23 mybin
-rw-rw-r--  1 hins hins  347 Feb 15 19:23 mybin.c
-rwxrwxr-x  1 hins hins 8952 Feb 15 19:42 myexec
-rw-rw-r--  1 hins hins 1268 Feb 15 19:42 myexec.c
wait success: exit code: 0, sig: 0
PATH:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hins/.local/bin:/home/hins/bin
PWD:/home/hins/testLinux/exec
MYENV:4443332211
I'm another C program.
I'm another C program.
I'm another C program.

Both sets of variables are now visible, which is what we wanted.
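The rule that the new program sees exactly the environment it is handed can be tested without a separate mybin. In this sketch, which rests on assumptions, /bin/sh stands in for the target program and its built-in [ test checks the variable; child_sees_myenv is a made-up name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: exec /bin/sh with an environment containing only
 * `assignment` (for example "MYENV=..."). Returns the exit code of the test
 * [ "$MYENV" = "11112222233334444" ]: 0 if it matched, 1 if not, -1 on error. */
int child_sees_myenv(const char *assignment)
{
    char *const envp[] = { (char *)assignment, NULL };
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        execle("/bin/sh", "sh", "-c",
               "[ \"$MYENV\" = \"11112222233334444\" ]",
               (char *)NULL, envp);
        _exit(127);                 /* reached only if execle failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Passing a different assignment makes the test fail, which shows that the child reads the table given to execle and not the parent's environment.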

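Q: Between execle and main, which is called first when a process starts?

In our earlier code, main often had these parameters (they are optional, so many programs simply leave them out):

int main(int argc, char* argv[], char* env[])

A: exec comes first. The job of the exec functions is to load a program into memory. A program must be in memory before the CPU can run it, and on Linux that loading is done through exec, which is why the exec functions are also described as the loader. So a program is always loaded first, and only then does main run.

main is a function too and needs arguments. The arguments passed to exec map one-to-one onto main's parameters: the argument list becomes argv (and determines argc), and the environment becomes env. In effect main is called with what was handed to exec; the call itself is made by startup code that we never see.

The exec functions without an env parameter still give the new program the default environment, because they pass along the caller's environ.

Recall the layout of the virtual address space: the command-line arguments and environment variables sit at the top of it, above the stack. The global variable environ points to the start of the environment table, so all environment variables can be reached from that address, and they can also be passed into main's parameters if needed.

That covers program replacement. execvpe takes the same kinds of parameters, just in another combination, so it is not discussed separately.

Looking at the naming pattern of the functions above, one combination is missing: execve. man execve shows that it lives in section 2 of the manual, while the others are in section 3. The conclusion: execve is the only actual system call, and the others are wrappers built on top of it. (The wrappers exist to give us convenient choices for different situations.)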
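To make the wrapper relationship concrete, here is a sketch of how an execv-like function could be written on top of execve. It is an illustration only: my_execv and run_with_my_execv are invented names, and this is not how any particular C library actually implements execv.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical wrapper: behaves like execv by passing the caller's current
 * environment to the execve system call. */
int my_execv(const char *path, char *const argv[])
{
    return execve(path, argv, environ);
}

/* Hypothetical helper: run path/argv in a child through my_execv and return
 * the child's exit code, or -1 on error. */
int run_with_my_execv(const char *path, char *const argv[])
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        my_execv(path, argv);
        _exit(127);                 /* reached only if execve failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The other variants differ only in how they build the path, the argv array, or the environment before making the same system call.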

The characteristics of these functions can now be summarized:
$\begin{array}{|l|l|l|l|} \hline \text { Function } & \text { Argument format } & \text { Searches PATH } & \text { Uses the current environment } \\ \hline \text { execl } & \text { list } & \text { no } & \text { yes } \\ \hline \text { execlp } & \text { list } & \text { yes } & \text { yes } \\ \hline \text { execle } & \text { list } & \text { no } & \text { no, the caller assembles the environment } \\ \hline \text { execv } & \text { array } & \text { no } & \text { yes } \\ \hline \text { execvp } & \text { array } & \text { yes } & \text { yes } \\ \hline \text { execve } & \text { array } & \text { no } & \text { no, the caller assembles the environment } \\ \hline \end{array}$

(Some of these arguments can be left out in practice and the call still works, but while learning it is better to spell them out.)

🕘 4.4 Running a program you wrote yourself

In the same directory, create mybin.c with the following code:

#include<stdio.h>
int main()
{
	printf("I'm another C program.\n");
	printf("I'm another C program.\n");
	printf("I'm another C program.\n");                                             
	return 0;
}

We want the myexec executable to run mybin, so the Makefile must build both. By default make builds only the first target in the file, so we add an all target that depends on both:

.PHONY:all
all: mybin myexec

mybin:mybin.c
	gcc -o $@ $^ -std=c99
myexec:myexec.c
	gcc -o $@ $^ -std=c99
.PHONY:clean
clean:
	rm -f myexec mybin

Then make a small change to myexec.c (note: mybin is not in any directory listed in PATH, so the p variants cannot find it by name alone):

execl("./mybin","mybin",NULL);

The result:

[hins@VM-12-13-centos exec]$ make
gcc -o mybin mybin.c -std=c99
gcc -o myexec myexec.c -std=c99
[hins@VM-12-13-centos exec]$ ./myexec
process is running...
total 44
drwxrwxr-x  2 hins hins 4096 Feb 15 18:00 .
drwxrwxr-x 11 hins hins 4096 Feb 15 12:33 ..
-rw-rw-r--  1 hins hins  146 Feb 15 17:46 Makefile
-rwxrwxr-x  1 hins hins 8360 Feb 15 18:00 mybin
-rw-rw-r--  1 hins hins  168 Feb 15 17:52 mybin.c
-rwxrwxr-x  1 hins hins 8776 Feb 15 18:00 myexec
-rw-rw-r--  1 hins hins  969 Feb 15 18:00 myexec.c
wait success: exit code: 0, sig: 0
I'm another C program.
I'm another C program.
I'm another C program.

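We have now run our own mybin program from myexec.c.

This mechanism has no language barrier: a program can exec programs written in any language (C++, Java, Python, and so on).

🕘 4.5 Use case: a minimal shell

Let's change the program as follows: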

int main(int argc, char* argv[])
{
	printf("process is running...\n");
	pid_t id = fork();
	assert(id != -1);

	if (id == 0)
	{
		sleep(1);
		execvp(argv[1], &argv[1]);
		exit(1); // reached only if execvp failed
	}

	int status = 0;
	pid_t ret = waitpid(id, &status, 0);
	if (ret > 0) 
		printf("wait success: exit code: %d, sig: %d\n", (status >> 8) & 0xFF, status & 0x7F);
	return 0;
}

We skip argv[0] because it is our own program, myexec. Executing it again would repeat the same fork-and-exec forever, so we start from the address of the second element, argv[1].

[hins@VM-12-13-centos exec]$ ./myexec ls -a -l --color=auto	# argv[0], argv[1], argv[2], argv[3], argv[4] in order
process is running...
total 44
drwxrwxr-x  2 hins hins 4096 Feb 15 21:17 .
drwxrwxr-x 11 hins hins 4096 Feb 15 12:33 ..
-rw-rw-r--  1 hins hins  146 Feb 15 17:46 Makefile
-rwxrwxr-x  1 hins hins 8464 Feb 15 19:23 mybin
-rw-rw-r--  1 hins hins  347 Feb 15 19:23 mybin.c
-rwxrwxr-x  1 hins hins 8776 Feb 15 21:17 myexec
-rw-rw-r--  1 hins hins 1335 Feb 15 21:17 myexec.c
wait success: exit code: 0, sig: 0

Drop the leading ./myexec and what remains is essentially a shell of our own. So let's write a command-line interpreter.

Create a directory myshell and write the following:

# Makefile
myshell:myshell.c
	gcc -o $@ $^ -std=c99 #-DDEBUG
.PHONY:clean
clean:
	rm -f myshell

// myshell.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

#define NUM 1024
#define OPT_NUM 64

char lineCommand[NUM];
char *myargv[OPT_NUM]; // array of argument pointers

int main()
{
    while(1)
    {
        // print the prompt
        printf("用户名@主机名 当前路径# ");
        fflush(stdout);

        // read one line of input; it ends with \n
        char *s = fgets(lineCommand, sizeof(lineCommand)-1, stdin);
        if (s == NULL) break;   // EOF (Ctrl+D) or read error: leave the shell
        // strip the trailing \n: "abcd\n" -> "abcd"
        size_t len = strlen(lineCommand);
        if (len > 0 && lineCommand[len-1] == '\n') lineCommand[len-1] = 0;
        //printf("test : %s\n", lineCommand);
        
        // split the line: "ls -a -l -i" -> "ls" "-a" "-l" "-i"
        myargv[0] = strtok(lineCommand, " ");
        if (myargv[0] == NULL) continue;   // empty line: prompt again
        int i = 1;
        if (strcmp(myargv[0], "ls") == 0)
		{
			myargv[i++] = (char*)"--color=auto";
		}
		// strtok returns NULL when no tokens remain, which also terminates myargv
        while((myargv[i++] = strtok(NULL, " ")) != NULL);
        
        // debug output, enabled with -DDEBUG
#ifdef DEBUG
        for(int i = 0 ; myargv[i]; i++)
        {
            printf("myargv[%d]: %s\n", i, myargv[i]);
        }
#endif
        // built-in commands (such as echo) would be handled here

        // run the command
        pid_t id = fork();
        assert(id != -1);

        if(id == 0)
        {
            execvp(myargv[0], myargv);
            exit(1);
        }
        waitpid(id, NULL, 0);
    }
}
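The tokenizing step of the shell can be isolated into a helper that also guards against empty input and too many arguments. This is a sketch under assumptions: split_line is a made-up name, and unlike the loop above it also treats tabs and the trailing newline as separators.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place on blanks, tabs, and newlines.
 * Stores at most max_args - 1 tokens in argv, always NULL-terminates the
 * array, and returns the number of tokens stored. */
int split_line(char *line, char *argv[], int max_args)
{
    int argc = 0;
    if (max_args < 1)
        return 0;                           /* no room even for the terminator */
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;                 /* token points into line itself */
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;                      /* execvp requires this terminator */
    return argc;
}
```

A return value of 0 means the line was empty, in which case the shell should simply print the prompt again instead of calling fork.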

This is a reasonable imitation of a shell, but one problem remains. Going up a directory behaves like this:

[hins@VM-12-13-centos myshell]$ ./myshell
用户名@主机名 当前路径# pwd
/home/hins/testLinux/myshell
用户名@主机名 当前路径# cd ..
用户名@主机名 当前路径# pwd
/home/hins/testLinux/myshell

A real shell would show a different path after cd, so let's fix this.

🕤 4.5.1 The current working directory

Create a new file myproc.c to illustrate:

#include <stdio.h>
#include <unistd.h>

int main()
{
	while (1)
	{
		printf("I'm a process.%d\n", getpid());
		sleep(1);
	}
	return 0;
}

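Compile it with gcc myproc.c -o myproc, run it, and inspect the process with: ls /proc/<PID> -al

The cwd entry is the process's current working directory, which is what we mean by the current path. Changing the current path therefore means changing the process's working directory, and as you may have guessed, a process can change it.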

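🕤 4.5.2 Changing the working directory: chdir

Adding one line to myproc.c, chdir("/home/<your-directory>");, with a directory of your choice, changes the process's working directory.

Back to the question: why did the path not change after cd .. in our shell?

In the shell above we fork a child, and the child has its own working directory. cd therefore changed only the child's working directory. When the child finished, control returned to the parent, our shell, whose working directory never changed.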
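The explanation can be verified directly. In the sketch below (the helper name is made up, and the 4096-byte buffers are an assumed upper bound on the path length), the child changes its working directory to / and exits, and the parent compares its own working directory before and after.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent's working directory is the
 * same before and after a child calls chdir("/"), 0 if it changed, and -1
 * on error. */
int cwd_unchanged_after_child_chdir(void)
{
    char before[4096], after[4096];
    if (getcwd(before, sizeof before) == NULL)
        return -1;
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0)
        _exit(chdir("/") == 0 ? 0 : 1);   /* changes only the child's directory */
    int status = 0;
    if (waitpid(id, &status, 0) < 0)
        return -1;
    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) == 0;
}
```

This is why cd has to be executed by the shell process itself and cannot be delegated to a child.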

Now modify the shell accordingly:

......
while (myargv[i++] = strtok(NULL, " "));
// cd must not run in a child: the shell performs it itself, which boils down to calling chdir
if (myargv[0] != NULL && strcmp(myargv[0], "cd") == 0)
{
	if (myargv[1] != NULL) chdir(myargv[1]);
	continue;
}
......

[hins@VM-12-13-centos myshell]$ ./myshell
用户名@主机名 当前路径# pwd
/home/hins/testLinux/myshell
用户名@主机名 当前路径# cd ..
用户名@主机名 当前路径# pwd
/home/hins/testLinux

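That fixes the shortcoming. Commands like cd, which the shell executes itself instead of handing to a child, are called built-in commands.

One more built-in: echo. In a real shell, echo $? prints the exit code of the most recent command; our version prints both the exit code and the termination signal.

The final myshell code: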

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

#define NUM 1024
#define OPT_NUM 64

char lineCommand[NUM];
char* myargv[OPT_NUM]; // array of argument pointers
int  lastCode = 0;
int  lastSig = 0;

int main()
{
	while (1)
	{
		// print the prompt
		printf("用户名@主机名 当前路径# ");
		fflush(stdout);

		// read one line of input; it ends with \n
		char* s = fgets(lineCommand, sizeof(lineCommand) - 1, stdin);
		if (s == NULL) break;	// EOF (Ctrl+D) or read error: leave the shell
		// strip the trailing \n: "abcd\n" -> "abcd"
		size_t len = strlen(lineCommand);
		if (len > 0 && lineCommand[len - 1] == '\n') lineCommand[len - 1] = 0;
		//printf("test : %s\n", lineCommand);

		// split the line: "ls -a -l -i" -> "ls" "-a" "-l" "-i"
		myargv[0] = strtok(lineCommand, " ");
		if (myargv[0] == NULL) continue;	// empty line: prompt again
		int i = 1;
		if (strcmp(myargv[0], "ls") == 0)
		{
			myargv[i++] = (char*)"--color=auto";
		}

		// strtok returns NULL when no tokens remain, which also terminates myargv
		while ((myargv[i++] = strtok(NULL, " ")) != NULL);

		// cd must not run in a child: the shell performs it itself via chdir
		if (strcmp(myargv[0], "cd") == 0)
		{
			if (myargv[1] != NULL) chdir(myargv[1]);
			continue;
		}
		if (myargv[0] != NULL && myargv[1] != NULL && strcmp(myargv[0], "echo") == 0)
		{
			if (strcmp(myargv[1], "$?") == 0)
			{
				printf("%d, %d\n", lastCode, lastSig);
			}
			else
			{
				printf("%s\n", myargv[1]);
			}
			continue;
		}
		// debug output, enabled with -DDEBUG
#ifdef DEBUG
		for (int i = 0; myargv[i]; i++)
		{
			printf("myargv[%d]: %s\n", i, myargv[i]);
		}
#endif
		// the built-ins (cd, echo) were handled above

		// run any other command in a child process
		pid_t id = fork();
		assert(id != -1);

		if (id == 0)
		{
			execvp(myargv[0], myargv);
			exit(1);
		}
		int status = 0;
		pid_t ret = waitpid(id, &status, 0);
		assert(ret > 0);
		(void)ret;
		lastCode = ((status >> 8) & 0xFF);
		lastSig = (status & 0x7F);
	}
}

[hins@VM-12-13-centos myshell]$ ./myshell
用户名@主机名 当前路径# echo $?
0, 0
用户名@主机名 当前路径# echo 1234
1234
用户名@主机名 当前路径# ^C

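That's all for this post on process control (creation, termination, waiting, and program replacement). Thanks for reading; more posts will follow.
💫 If you spot a mistake ❌, corrections are welcome 👀
🎉 If you found this useful, a like 👍 is appreciated

❗ Please credit the source when reposting
Author: HinsCoder
Blog: 🔎 author's blog homepage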

[Linux in Depth] Process Control (Creation, Termination, Waiting, Replacement)
