Process Waiting on Linux (Process Series)

Why does process waiting exist, and what does it do?

How do we wait for a process?

status

options

Why does process waiting exist, and what does it do?

Code example: simulating a zombie process

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    pid_t id = fork();

    if(id == 0)
    {
        //child process
        int cnt = 5;
        while(cnt--)
        {
            printf("I am child process! cnt: %d ,pid: %d, ppid: %d\n", cnt, getpid(), getppid());
            sleep(1);
        }

    }
    else
    {
        //father process
        while(1)
        {  
           printf("I am father process! pid: %d, ppid: %d\n", getpid(), getppid());
           sleep(1);
        }
    }

    return 0;
}

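Output:

Process waiting exists because, when a child exits and the parent never collects its exit status, the child lingers as a "zombie process". The kernel has to keep the child's process control block around, so that resource is leaked (in effect, a memory leak).
Once a process is a zombie, even kill -9 cannot remove it, because nobody can kill a process that is already dead.
Second, a parent creates a child so that the child does some work, and the parent needs to know how that work went: whether the result was correct, and whether the child exited normally.
In short: by waiting, the parent reclaims the child's resources and obtains the child's exit information.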

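How do we wait for a process?

Two interfaces are used here:

wait()

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *status);

Return value:

On success, returns the pid of the child that was waited for; on failure, returns -1.

Parameter:

status is an output parameter that receives the child's exit status. Pass NULL if you do not care about it.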

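waitpid()

pid_t waitpid(pid_t pid, int *status, int options);

Return value:

On normal return, waitpid returns the process ID of the child it collected;

if the WNOHANG option is set and waitpid finds no exited child to collect, it returns 0;

if the call fails, it returns -1 and errno is set to indicate the error.

Parameters:

pid:

pid == -1: wait for any child; equivalent to wait().

pid > 0: wait for the child whose process ID equals pid.

status:

WIFEXITED(status): true if the child terminated normally. (Checks whether the process exited normally.)

WEXITSTATUS(status): if WIFEXITED is nonzero, extracts the child's exit code. (Reads the process's exit code.)

options:

WNOHANG: if the child specified by pid has not terminated, waitpid() returns 0 immediately without waiting. If it has terminated, waitpid() returns that child's ID.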

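Note: a few details are worth spelling out.

Return value of waitpid:

Return value > 0: the wait succeeded and the returned child has terminated
Return value == 0: (only with WNOHANG) the call succeeded, but the child is still running and has not terminated yet
Return value < 0: the wait failed

The pid argument of waitpid:

pid > 0: wait for the child whose process ID equals pid
pid == -1: wait for any child
pid == 0: wait for any child in the caller's process group
pid < -1: wait for any child whose process group ID equals the absolute value of pid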
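The difference between a positive pid and -1 can be sketched as follows. This is an illustrative sketch only; `spawn_child` and `wait_specific_then_any` are hypothetical helper names, not functions from the original article.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with `code`.
 * Returns the child's pid in the parent, or -1 if fork failed. */
static pid_t spawn_child(int code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;
}

/* Returns 1 if waitpid(pid > 0) collected exactly the requested child and
 * waitpid(-1) then collected the remaining one; 0 otherwise. */
int wait_specific_then_any(void)
{
    pid_t first = spawn_child(1);
    pid_t second = spawn_child(2);
    if (first < 0 || second < 0)
        return 0;

    int status = 0;

    /* pid > 0: only `second` can satisfy this call, even if `first`
     * has already exited. */
    pid_t got = waitpid(second, &status, 0);
    if (got != second || !WIFEXITED(status) || WEXITSTATUS(status) != 2)
        return 0;

    /* pid == -1: any child will do; only `first` is left. */
    got = waitpid(-1, &status, 0);
    if (got != first || !WIFEXITED(status) || WEXITSTATUS(status) != 1)
        return 0;

    return 1;
}
```

A parent that must collect results in a fixed order would use the positive-pid form; a parent that just wants "whichever child finishes next" would use -1.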

Example:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>

int main()
{
    pid_t id = fork();

    if(id == 0)
    {
        //child process
        int cnt = 5;
        while(cnt--)
        {
            printf("I am child process! cnt: %d ,pid: %d, ppid: %d\n", cnt, getpid(), getppid());
            sleep(1);
        }

    }
    else
    {
        //father process
        while(1)
        {  
           printf("I am father process! pid: %d, ppid: %d\n", getpid(), getppid());
           sleep(1);

            pid_t ret = wait(NULL); // blocking wait
            if(ret > 0) printf("wait child process success!  ret :%d\n", ret);


        }
    }

    return 0;
}

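Output:

Observation shows that the zombie process problem is indeed gone.

waitpid()

Example:

     //pid_t ret = wait(NULL);
     pid_t ret = waitpid(id, NULL, 0); // blocking wait

Output:

If the child has already exited when wait/waitpid is called, the call returns immediately, releases the child's resources, and yields the child's exit information.
If the child exists and is still running when wait/waitpid is called, the calling process may block.
If no such child exists, the call fails and returns immediately (it returns -1 with errno set to ECHILD).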
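The last case, waiting when there is nothing to wait for, is easy to check in isolation. A small sketch, assuming the hypothetical helper name `wait_errno_for` (it is not from the original article):

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: calls waitpid(pid, NULL, 0) and returns 0 on
 * success, or the errno value if the call failed. */
int wait_errno_for(pid_t pid)
{
    if (waitpid(pid, NULL, 0) == -1)
        return errno;
    return 0;
}
```

In a process that has no children, `wait_errno_for(-1)` should come back with `ECHILD` right away instead of blocking. The same holds for `wait_errno_for(getpid())`, because a process is never its own child.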

status

An output parameter that receives the child's exit status.

Code example: set the child's exit code to 99 and check whether status picks it up

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>

int main()
{
    pid_t id = fork();

    if(id == 0)
    {
        //child process
        int cnt = 5;
        while(cnt--)
        {
            printf("I am child process! cnt: %d ,pid: %d, ppid: %d\n", cnt, getpid(), getppid());
            sleep(1);
        }
        
        exit(99);

    }
    else
    {
        //father process

            //pid_t ret = wait(NULL);
            
            int status = 0;
            pid_t ret = waitpid(id, &status, 0); // blocking wait

            if(ret > 0) printf("waitpid child process success!  ret :%d, status:%d\n", ret, status);

      //  while(1)
      //  {  
      //     printf("I am father process! pid: %d, ppid: %d\n", getpid(), getppid());
      //     sleep(7);
      //  }
    }

    return 0;
}

Output:

[wxq@VM-4-9-centos code_4_10]$ ./test 
I am child process! cnt: 4 ,pid: 4736, ppid: 4735
I am child process! cnt: 3 ,pid: 4736, ppid: 4735
I am child process! cnt: 2 ,pid: 4736, ppid: 4735
I am child process! cnt: 1 ,pid: 4736, ppid: 4735
I am child process! cnt: 0 ,pid: 4736, ppid: 4735
waitpid child process success!  ret :4736, status:3840
[wxq@VM-4-9-centos code_4_10]$

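Why is status 3840 here rather than the exit code we set? (Decoding it gives 3840 = 15 << 8, so this captured run appears to come from a build in which the child exited with code 15; with exit(99) the raw value would be 25344 = 99 << 8. Either way, the raw value is not the exit code itself.)

That is because status is not meant to be read as a single integer.
Its 32 bits are divided into fields, and the exit code occupies only the second-lowest 8 bits.

The layout is as follows (only the low 16 bits are used): bits 0-6 hold the number of the signal that terminated the process, bit 7 is the core dump flag, and bits 8-15 hold the exit code.

So when a process exits abnormally or crashes, what actually happened is that the operating system killed it (a running program is a process; at that point the programming language is irrelevant and only the operating system matters).

How does the system learn that a process is misbehaving, and how does it kill it? Through signals (not covered in detail here).

Now that we know the exit code sits in the second-lowest 8 bits, we can extract it with bitwise operators.

"&": each result bit is 1 only when both input bits are 1; otherwise it is 0
0xFF: 0000 0000 0000 .... .... 1111 1111

0x7F: 0000 0000 0000 .... .... 0111 1111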

if(ret > 0) printf("waitpid child process success!  ret :%d, status:%d\n", ret, (status >> 8) & 0XFF);

Output:

[wxq@VM-4-9-centos code_4_10]$ vim process_wait.c 
[wxq@VM-4-9-centos code_4_10]$ make
gcc -o test process_wait.c
[wxq@VM-4-9-centos code_4_10]$ ./test 
I am child process! cnt: 4 ,pid: 9576, ppid: 9575
I am child process! cnt: 3 ,pid: 9576, ppid: 9575
I am child process! cnt: 2 ,pid: 9576, ppid: 9575
I am child process! cnt: 1 ,pid: 9576, ppid: 9575
I am child process! cnt: 0 ,pid: 9576, ppid: 9575
waitpid child process success!  ret :9576, status:99
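The shifts and masks can be wrapped in small functions and tried on raw values. This is a sketch assuming the traditional Linux status layout; the helper names are hypothetical, and portable code should prefer the W* macros from <sys/wait.h>.

```c
/* Field extraction for the traditional Linux wait-status layout.
 * Hypothetical helper names, for illustration only. */

/* Bits 8-15: the exit code passed to exit(). */
int status_exit_code(int status)
{
    return (status >> 8) & 0xFF;
}

/* Bits 0-6: number of the signal that terminated the process (0 = none). */
int status_term_signal(int status)
{
    return status & 0x7F;
}

/* Bit 7: set when the process dumped core. */
int status_core_flag(int status)
{
    return (status >> 7) & 0x1;
}
```

For instance, `status_exit_code(99 << 8)` gives 99, while the raw value 3840 shown earlier decodes to exit code 15 with signal number 0.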

Indeed, this gives us the exit code. In the same way, we can use status to get the number of the signal that terminated the child:

printf("waited for child successfully: ret: %d\n,child's signal number: %d\n child's exit code: %d\n",
                    ret, status & 0x7F, (status >> 8) & 0xFF);

Output: the signal number is 0, meaning the child exited normally

[wxq@VM-4-9-centos code_4_10]$ ./test 
I am child process! cnt: 4 ,pid: 14639, ppid: 14638
I am child process! cnt: 3 ,pid: 14639, ppid: 14638
I am child process! cnt: 2 ,pid: 14639, ppid: 14638
I am child process! cnt: 1 ,pid: 14639, ppid: 14638
I am child process! cnt: 0 ,pid: 14639, ppid: 14638
waited for child successfully: ret: 14639
,child's signal number: 0
 child's exit code: 99
[wxq@VM-4-9-centos code_4_10]$

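Next, let's test the signal number to see whether it is accurate:

Test 1:

Output: signal 8, SIGFPE: an arithmetic error (for example, overflow). The program itself is at fault, and the exit code is meaningless in this case.

Test 2:

Output:

The child did not exit normally, so the exit code is meaningless.

So a program can terminate abnormally not only because of a bug in its own code, but also because something external killed it (did the child finish running its code? We cannot tell).

These tests show that retrieving the child's exit code and termination signal from status works fine.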
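Both kinds of abnormal termination can be reproduced in a small sketch. The assumptions here are mine, not the article's: `child_term_signal` is a hypothetical helper, `raise(SIGFPE)` stands in for a real arithmetic fault inside the child, and `kill()` from the parent stands in for an external `kill -9`.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs a child that is terminated by `sig`.
 * If from_parent is nonzero, the parent sends the signal (an "outside"
 * kill); otherwise the child raises it on itself (an "inside" failure).
 * Returns the terminating signal number, 0 if the child exited normally,
 * or -1 on error. */
int child_term_signal(int sig, int from_parent)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (!from_parent) {
            raise(sig);        /* normally does not return */
            _exit(0);          /* reached only if the signal did not kill us */
        }
        for (;;)
            pause();           /* sit idle until the parent kills us */
    }

    if (from_parent && kill(pid, sig) < 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    if (WIFSIGNALED(status))
        return WTERMSIG(status);   /* the exit code is meaningless here */
    if (WIFEXITED(status))
        return 0;
    return -1;
}
```

From the parent's point of view the two cases look the same: `child_term_signal(SIGFPE, 0)` reports SIGFPE and `child_term_signal(SIGKILL, 1)` reports SIGKILL, and in neither case can the parent tell how much of its code the child managed to run.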

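But there is a catch: do we really have to do bit arithmetic every time we want the child's exit code or signal? That would be tedious, so macros are provided for working with status.

WIFEXITED(status): true if the child terminated normally. (Checks whether the process exited normally.)

WEXITSTATUS(status): if WIFEXITED is nonzero, extracts the child's exit code. (Reads the process's exit code.)

So this is how we usually retrieve it:

Code example: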

            //father process

            //pid_t ret = wait(NULL);
            
            int status = 0;
            pid_t ret = waitpid(id, &status, 0); // blocking wait

          // printf("waited for child successfully: ret: %d\n,child's signal number: %d\n child's exit code: %d\n",
          //          ret, status & 0x7F, (status >> 8) & 0xFF); 

            if(ret > 0)
            {
                if(WIFEXITED(status))
                {
                    printf("child exited normally, exit code: %d\n", WEXITSTATUS(status));
                }
                else if(WIFSIGNALED(status))
                {
                    // killed by a signal: report the signal number, since the exit code is meaningless here
                    printf("child terminated abnormally, signal: %d\n", WTERMSIG(status));
                }
            }

Output:

[wxq@VM-4-9-centos code_4_10]$ vim process_wait.c 
[wxq@VM-4-9-centos code_4_10]$ make
gcc -o test process_wait.c
[wxq@VM-4-9-centos code_4_10]$ ./test 
I am child process! cnt: 4 ,pid: 20268, ppid: 20267
I am child process! cnt: 3 ,pid: 20268, ppid: 20267
I am child process! cnt: 2 ,pid: 20268, ppid: 20267
I am child process! cnt: 1 ,pid: 20268, ppid: 20267
I am child process! cnt: 0 ,pid: 20268, ppid: 20267
child exited normally, exit code: 99
[wxq@VM-4-9-centos code_4_10]$
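One consequence of the exit code occupying a single byte is that only the low 8 bits of the value passed to exit() reach the parent. A short sketch to check this, using the hypothetical helper name `observed_exit_code`:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs a child that calls _exit(code) and returns the
 * exit code the parent observes through the macros, or -1 if the child
 * did not exit normally. */
int observed_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

`observed_exit_code(99)` gives 99, but `observed_exit_code(256 + 3)` gives 3, because the value is truncated to 8 bits before it is stored in status.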

options

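pid_t ret = waitpid(id, &status, 0);   // by default, block until the child changes state (exits)

Only when the child exits does the parent's waitpid call return (note that the parent process is still alive the whole time; it is simply blocked inside the call).

When there are several children, waitpid/wait lets the parent collect their exits in a controlled order, so that the parent can do further cleanup work afterwards.

options is normally passed as 0, which means a blocking wait
WNOHANG: non-blocking wait

What exactly is WNOHANG?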

[wxq@VM-4-9-centos code_4_10]$ grep -ER 'WNOHANG' /usr/include/
/usr/include/sys/wait.h:   If the WNOHANG bit is set in OPTIONS, and that child
/usr/include/sys/wait.h:   If the WNOHANG bit is set in OPTIONS, and that child
/usr/include/bits/waitflags.h:#define	WNOHANG		1	/* Don't block waiting.  */
/usr/include/valgrind/vki/vki-linux.h:#define VKI_WNOHANG	0x00000001
/usr/include/linux/wait.h:#define WNOHANG		0x00000001
[wxq@VM-4-9-centos code_4_10]$

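It is simply a macro definition: #define WNOHANG 1

You could pass 1 directly, as in waitpid(id, &status, 1);, but a bare 1 like that is a "magic number" whose meaning is easily forgotten over time, which is why the value is given a name through the macro.

So on this system 0 means a blocking wait and 1 means a non-blocking wait; the 1 has simply been wrapped in a macro.

So what are blocking and non-blocking waits?

Blocking wait: the process is typically blocked inside the kernel, waiting to be woken up (which involves a context switch).

Non-blocking wait: the parent calls waitpid to wait, and if the child has not exited, the waitpid system call returns immediately.

Example:

The essence of a blocked process is that it is blocked inside a system function.
This means the code after the call does not run for the time being.
When the condition is met, the parent is woken up. But where does it resume?
Is waitpid called again, or does execution continue at the if statement that follows the waitpid call? It continues at the if.
(Why? When the parent was suspended, the program counter saved the address of the parent's next instruction; in other words, the PC points there.)
The parent then continues executing the rest of its code.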

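An analogy:

I call Xiaomei and say: let me copy your winter-break homework.

Xiaomei says: sure, I'm almost done. Do you want to come over and wait?

① I think, what a deal, so I go to Xiaomei's place to wait and hang up. ---> This is a blocking call.

② No, I'm a man of principle, I won't go; I'll pick it up once you're done. Then I head off to the internet cafe with my buddies.

After a while I ask Xiaomei whether she is done. She says no, and I hang up.

A while later I ask again. She says no, and I hang up again.

Later still I ask once more. She says no, and I hang up yet again.

Each of those phone calls ---> is a non-blocking call.

The repeated calling as a whole is a polling scheme built on non-blocking calls!

Me (the user) ---> phone call (the system call) ---> Xiaomei (the operating system)

So what does this look like in code?

Example: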

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>

int main()
{
    pid_t id = fork();

    if(id == 0)
    {
        //child process
        int cnt = 5;
        while(cnt--)
        {
            printf("I am child process! cnt: %d ,pid: %d, ppid: %d\n", cnt, getpid(), getppid());
            sleep(1);
        }
        
        exit(99);

    }
    else
    {
        //father process

        int quit = 0;
        while(!quit)
        {
            int status = 0;
            pid_t result = waitpid(-1, &status, WNOHANG); // -1: wait for any child; WNOHANG: wait without blocking
            if(result > 0)
            {        
                printf("waited for child successfully, exit code: %d\n", WEXITSTATUS(status)); 
                quit = 1;
            }
            else if(result == 0 )
            {
                printf("child is still running and has not exited yet; the parent can keep running its own code\n");
            }
            else
            {
                printf("waitpid error\n");
                quit = 1; // nothing left to wait for, so stop polling instead of looping forever
            }
            sleep(1);

            printf("hello father process\n");

        }
    }

    return 0;
}

Output:

[wxq@VM-4-9-centos code_4_10]$ vim process_wait.c 
[wxq@VM-4-9-centos code_4_10]$ make
gcc -o test process_wait.c
[wxq@VM-4-9-centos code_4_10]$ ./test 
child is still running and has not exited yet; the parent can keep running its own code
I am child process! cnt: 4 ,pid: 10576, ppid: 10575
hello father process
child is still running and has not exited yet; the parent can keep running its own code
I am child process! cnt: 3 ,pid: 10576, ppid: 10575
hello father process
child is still running and has not exited yet; the parent can keep running its own code
I am child process! cnt: 2 ,pid: 10576, ppid: 10575
hello father process
I am child process! cnt: 1 ,pid: 10576, ppid: 10575
child is still running and has not exited yet; the parent can keep running its own code
hello father process
I am child process! cnt: 0 ,pid: 10576, ppid: 10575
child is still running and has not exited yet; the parent can keep running its own code
hello father process
child is still running and has not exited yet; the parent can keep running its own code
hello father process
waited for child successfully, exit code: 99
hello father process
[wxq@VM-4-9-centos code_4_10]$
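The same polling pattern can be packaged as a function so that the number of "not done yet" answers becomes visible. This is only a sketch: `sleep_ms` and `poll_until_exit` are hypothetical helpers, and the millisecond intervals are arbitrary choices made to keep the run short.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep for roughly `ms` milliseconds. */
static void sleep_ms(int ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Hypothetical helper: forks a child that sleeps child_ms milliseconds and
 * exits with code 99, then polls it every poll_ms milliseconds with WNOHANG.
 * Returns how many polls found the child still running, or -1 on error.
 * *exit_code receives the child's exit code (-1 if it did not exit normally). */
int poll_until_exit(int child_ms, int poll_ms, int *exit_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        sleep_ms(child_ms);
        _exit(99);
    }

    int busy_polls = 0;
    for (;;) {
        int status = 0;
        pid_t ret = waitpid(pid, &status, WNOHANG);
        if (ret == 0) {
            busy_polls++;          /* the parent's own work would go here */
            sleep_ms(poll_ms);
        } else if (ret == pid) {
            *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
            return busy_polls;
        } else {
            return -1;             /* waitpid failed */
        }
    }
}
```

With a call such as `poll_until_exit(200, 20, &code)`, the parent gets several "still running" answers before the final poll collects exit code 99. The polling interval is the trade-off: a shorter interval notices the exit sooner, at the cost of more system calls.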

A final question to think about: processes are independent, and the exit code is the child's own data, so why can the parent read it? What do wait/waitpid actually do?