Linux/Unix System Programming: Process Management (3)

This chapter covers the final part of process management.

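Contents

Linux/Unix System Programming: Process Management (3)
- I/O Redirection
  - How It Works
    - Internal Structure of the FILE Struct
    - How Redirection Is Implemented
  - scanf and printf
    - scanf
    - printf
  - Redirecting Standard Input
    - Redirection Example Code
- Pipes
  - How to Use Pipes
    - Pipe Command Processing
  - Named Pipes
    - Named Pipe Terminal Example
    - Named Pipe C Program Example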

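I/O Redirection

In the shell, we can perform redirection with > or <,

feeding the contents of a file into a running program, or sending a program's output into a file.

But how does this actually work?

How It Works

The sh process has three file streams for terminal I/O: stdin, stdout and stderr. Each stream is essentially a pointer to a FILE structure in the process's execution image.

Internal Structure of the FILE Struct

The layout of a FILE structure is shown below for reference only. We will not go into detail; interested readers can look up the relevant material. Note that this particular definition (struct _iobuf) comes from the stdio.h of the Microsoft C runtime. glibc on Linux defines FILE differently (struct _IO_FILE), but the idea is the same: a buffer plus the file descriptor of the underlying open file.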

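#ifndef _FILE_DEFINED
struct _iobuf {
        char *_ptr;      // next character position in the buffer
        int   _cnt;      // number of characters remaining in the buffer
        char *_base;     // base address of the buffer
        int   _flag;     // stream state flags (mode, EOF, error)
        int   _file;     // file descriptor of the underlying open file
        int   _charbuf;  // tiny fallback buffer used when no buffer is allocated
        int   _bufsiz;   // size of the buffer
        char *_tmpfname; // name of the temporary file, if any
        };
typedef struct _iobuf FILE;
#define _FILE_DEFINED
#endif  /* _FILE_DEFINED */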

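How Redirection Is Implemented

As mentioned above, every C program has three I/O streams. Each stream corresponds to an open file in the Linux kernel, identified by a file descriptor (a small integer).

The file descriptors of stdin, stdout and stderr are 0, 1 and 2 respectively.

When a process forks a child, the child inherits all of the parent's open files. The child therefore has the same file streams and file descriptor numbers as the parent.

This last point does not mean that the kernel has only three I/O files. It means that descriptors 0, 1 and 2 in the child refer to the same open files as in the parent (normally the terminal), so all processes started from the same shell share them until one of them redirects a descriptor.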

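scanf and printf

Both scanf and printf ultimately do their I/O through the standard streams described above. The details are as follows.

scanf

scanf works like this:

scanf first checks whether the buffer of the FILE structure that stdin points to contains data.
If the buffer is empty, scanf issues a read system call, using the file descriptor stored in the FILE structure, to read from file 0 (standard input).
The data read is stored in the buffer, and scanf then takes its input from the buffer.

The relationship can be pictured as follows:

[Figure: relationship between scanf, FILE and the input file]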

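printf

printf works much like scanf, but in the opposite direction:

printf first formats the format string and the arguments you supply into a complete string.
That string is placed into the buffer of the FILE structure that stdout points to.
When the buffer fills up, when a newline is written while stdout is line-buffered (as it is when connected to a terminal), or when you call fflush, the library issues a write system call, using the file descriptor stored in the FILE structure, to write the buffer's contents to file 1 (standard output).

As shown in the figure:
[Figure: relationship between printf, FILE and the output file]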

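Redirecting Standard Input

So how is redirection done?

From the above we know that printf and scanf both read and write files through file descriptors. There is a buffer in between, but in essence they operate on whatever file the descriptor refers to. So can we change which file the descriptor used by a FILE refers to?

We can. C on Linux provides the function dup, which duplicates fd into the lowest-numbered unused file descriptor and returns that new descriptor. It is used like this:

#include <unistd.h>
int newfd = dup(fd);

Here fd is a file descriptor.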

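Redirection Example Code

First we create a file named input that holds the input and takes the place of the standard input file (fd = 0):

1 2 3 4 5

There is no trailing newline or space.

Next comes the program that performs the redirection: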

/*************************************************************************
	> File Name: io.c
	> Author:Royi 
	> Mail:royi990001@gmail.com 
	> Created Time: Thu 01 Feb 2024 05:45:49 PM CST
	> Describe: 
 ************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>


int main() {
    int fd = open("./input", O_RDONLY);
    int ch;                 // int rather than char, so EOF is distinct from data
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    close(0);               // free descriptor 0 (standard input)
    if (dup(fd) < 0) {      // the lowest unused descriptor is now 0
        perror("dup");
        exit(1);
    }
    close(fd);              // the original descriptor is no longer needed
    while ((ch = getchar()) != '\n' && ch != EOF) {
        putchar(ch);
    }
    return 0;
}

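This program first calls open to open the input file read-only, which returns the file descriptor fd.

It then calls close to close file descriptor 0, the standard input file. At this point the lowest unused file descriptor is 0.

Finally it calls dup(fd), which copies fd into the lowest unused file descriptor.

After these steps, the FILE structure that stdin points to still holds descriptor 0, but descriptor 0 now refers to the file opened by open rather than to the terminal, so scanf and getchar read from that file.

Running the program prints:

1 2 3 4 5

That completes the redirection.

C also offers another function, dup2, which takes two arguments:

#include <unistd.h>

dup2(fd1, fd2);

It duplicates fd1 into fd2. If fd2 is already open, it is closed first and then the copy is made. This way we do not have to call close on the descriptor we want to redirect, and we can also redirect a descriptor that is not the lowest unused one, which makes the program much more flexible. We will not go through it in detail here.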

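Pipes

A pipe is a unidirectional inter-process communication channel that processes use to exchange data. It has a read end and a write end.

There are two kinds of pipes:

Ordinary (anonymous) pipes: used between related processes (parent and child).
Named pipes: used between unrelated processes (not parent and child).

The readers and writers of a pipe are synchronized as follows:

When the pipe has data, a reader reads as much as it needs (never more than the pipe size).
When the pipe has no data but still has a writer, the reader waits for data.
When a writer writes data to the pipe, it wakes up the waiting readers so that they can continue reading.
If the pipe has no data and no writer, the reader's read returns 0 and the reader stops reading.
When a writer writes to the pipe and the pipe has room, it writes as much as it needs or as much as fits.
If the pipe has no room but still has a reader, the writer waits for room.
When a reader reads from the pipe, it wakes up the waiting writers.
If the pipe no longer has a reader, the writer treats this as a broken-pipe error and stops writing.

Summary: a return value of 0 means the pipe has no data and no writer. As long as a writer exists, a reader waits instead of returning 0. If the readers are gone but a writer still writes, the writer gets an error.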

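How to Use Pipes

A process should not use a pipe to pass data to itself, for the following reasons:

If the process reads from the pipe first, it never returns from the read system call: the pipe is empty, so the reader waits for a writer to write. But the writer is the process itself, so it has locked itself up.
Conversely, if it writes to the pipe first, a reader has to drain the data. This works as long as the data fits in the pipe (one 4 KB page on older kernels, 64 KB by default on Linux since 2.6.11), but real data often exceeds that. Once the pipe is full, the writer waits for a reader to make room. But the reader is the process itself, so again it has locked itself up.

Using a pipe therefore takes two processes: one that writes into the pipe and one that reads from it.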

Here is an example program:

/*************************************************************************
	> File Name: pipe.c
	> Author:Royi 
	> Mail:royi990001@gmail.com 
	> Created Time: Thu 01 Feb 2024 09:36:16 PM CST
	> Describe: parent to child with pipe 
 ************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

int pd[2], n, i;
pid_t pid;
char line[256];
char msg[16] = "I'm your PAPA";   // 13 characters, zero-padded to 16 bytes


int main() {
    if (pipe(pd) == -1) {
        perror("pipe");
        exit(1);
    }
    printf("pd = [%d, %d]\n", pd[0], pd[1]);
    fflush(stdout);   // keep buffered output from being copied into the child
    if ((pid = fork()) == -1) {
        perror("fork");
        exit(1);
    }

    // parent: the pipe writer
    if (pid) {
        close(pd[0]);
        printf("I'm parent %d, and I closed pd[0]\n", getpid());
        while (i++ < 10) {
            printf("Parent %d is writing to pipe\n", getpid());
            n = write(pd[1], msg, sizeof(msg));
            printf("Parent %d wrote %d bytes to pipe\n", getpid(), n);
            sleep(1);
        }
        printf("Parent %d exited.\n", getpid());
    } else {
        // child: the pipe reader
        close(pd[1]);
        printf("I'm child %d, and I closed pd[1]\n", getpid());
        while (1) {
            printf("child %d is reading from pipe\n", getpid());
            if ((n = read(pd[0], line, 128)) > 0) {
                line[n] = 0;
                printf("child read %d bytes from pipe: %s\n", n, line);
            } else {
                // pipe has no data and no writer (or the read failed)
                exit(0);
            }
            sleep(1);
        }
    }
    return 0;
}

The pipe() function creates a pipe and returns two file descriptors in the array pd: pd[0] is used to read from the pipe and pd[1] is used to write to it.
To make the two processes take turns, a sleep call is placed in each loop.
Running it several times shows that when the pipe has no data, the reader keeps waiting.

Sample output:

pd = [3, 4]
I'm parent 3030, and I closed pd[0]
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
I'm child 3031, and I closed pd[1]
child 3031 is reading from pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
child 3031 is reading from pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
child 3031 is reading from pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
child 3031 is reading from pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
child 3031 is reading from pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
child 3031 is reading from pipe
Parent 3030 wrote 16 bytes to pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
child 3031 is reading from pipe
Parent 3030 wrote 16 bytes to pipe
child read 16 bytes from pipe: I'm your PAPA
child 3031 is reading from pipe
Parent 3030 is writing to pipe
Parent 3030 wrote 16 bytes to pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
child 3031 is reading from pipe
Parent 3030 wrote 16 bytes to pipe
child read 16 bytes from pipe: I'm your PAPA
Parent 3030 is writing to pipe
child 3031 is reading from pipe
Parent 3030 wrote 16 bytes to pipe
child read 16 bytes from pipe: I'm your PAPA
child 3031 is reading from pipe
Parent 3030 exited.

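By modifying the program so that the parent dies first (the writer disappears), you can see the receiving process return from read. Conversely, if the receiving process reads only a few times and then exits, the writer terminates with status 141 (128 + 13, where 13 is SIGPIPE), the broken-pipe error. It occurs because the reading process has died and the pipe no longer has a reader.

Here is the modified example code: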

/*************************************************************************
	> File Name: pipe.c
	> Author:Royi 
	> Mail:royi990001@gmail.com 
	> Created Time: Thu 01 Feb 2024 09:36:16 PM CST
	> Describe: parent to child with pipe 
 ************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>

int pd[2], n, i;
pid_t pid;
char line[256];
char msg[16] = "I'm your PAPA";   // 13 characters, zero-padded to 16 bytes


int main() {
    if (pipe(pd) == -1) {
        perror("pipe");
        exit(1);
    }
    printf("pd = [%d, %d]\n", pd[0], pd[1]);
    fflush(stdout);   // keep buffered output from being copied into the child
    if ((pid = fork()) == -1) {
        perror("fork");
        exit(1);
    }

    // parent: the pipe writer
    if (pid) {
        signal(SIGPIPE, SIG_IGN);   // let write() fail with EPIPE instead of killing us
        close(pd[0]);
        printf("I'm parent %d, and I closed pd[0]\n", getpid());
        while (i++ < 10) {
            printf("Parent %d is writing to pipe\n", getpid());
            n = write(pd[1], msg, sizeof(msg));
            if (n == -1) {
                perror("write");    // prints "write: Broken pipe" once the reader is gone
                exit(1);
            }
            printf("Parent %d wrote %d bytes to pipe\n", getpid(), n);
            sleep(1);
        }
        printf("Parent %d exited.\n", getpid());
    } else {
        // child: the pipe reader, which reads only three times
        close(pd[1]);
        printf("I'm child %d, and I closed pd[1]\n", getpid());
        i = 0;
        while (i++ < 3) {
            printf("child %d is reading from pipe\n", getpid());
            if ((n = read(pd[0], line, 128)) > 0) {
                line[n] = 0;
                printf("child read %d bytes from pipe: %s\n", n, line);
            } else {
                // pipe has no data and no writer (or the read failed)
                printf("Child found nothing left to read.\n");
                exit(0);
            }
        }
        printf("Child is exiting...\n");
    }
    return 0;
}

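The Broken Pipe Error
In the modified code, write is a system call, and when it runs into the broken-pipe condition the kernel sends the process the SIGPIPE signal. By default this signal terminates the program immediately, so no error message is shown. Adding signal(SIGPIPE, SIG_IGN); at the start of the writer ignores the signal, so write returns -1 with errno set to EPIPE, the program keeps running, and the error message printed by perror becomes visible.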

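Pipe Command Processing

How are pipes used in Linux?

The command line

cmd1 | cmd2

contains a pipe symbol "|".

sh runs cmd1 in one process and cmd2 in another, connected by a pipe, so the output of cmd1 becomes the input of cmd2. The steps below show how a pipe command is processed:

When sh gets the command line cmd1 | cmd2, it forks a child sh and waits for the child sh to terminate as usual.

The child sh scans the command line for a | symbol. Here, cmd1 | cmd2 contains one, so it splits the command line into head = cmd1 and tail = cmd2.

The child sh then executes a code fragment similar to the following: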

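int pd[2];
pipe(pd);                // pd[0] = read end, pd[1] = write end
pid_t pid = fork();
if (pid) {               // parent: the pipe writer, runs head
    close(pd[0]);        // the writer does not read from the pipe
    close(1);            // free descriptor 1 (stdout)
    dup(pd[1]);          // descriptor 1 now refers to the pipe's write end
    close(pd[1]);
    exec(head);          // pseudocode for an exec call such as execvp()
} else {                 // child: the pipe reader, runs tail
    close(pd[1]);        // the reader does not write to the pipe
    close(0);            // free descriptor 0 (stdin)
    dup(pd[0]);          // descriptor 0 now refers to the pipe's read end
    close(pd[0]);
    exec(tail);          // pseudocode for an exec call such as execvp()
}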

The pipe writer redirects its fd 1 to pd[1], and the pipe reader redirects its fd 0 to pd[0]. The two processes are thus connected through the pipe.

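Named Pipes

A named pipe is also called a FIFO. It has a "name" and exists in the file system as a special file.

It persists until it is removed with rm or unlink. It can be used by unrelated processes and is not limited to the children of the process that created the pipe.

Named Pipe Terminal Example

In sh, create a named pipe with the mknod command:

mknod mypipe p

Or issue the mknod() system call from a C program:

int r = mknod("mypipe", S_IFIFO | 0644, 0);

The permission bits (0644 here) have to be or-ed into the mode; with S_IFIFO alone the pipe is created without any access permissions. Either way, a pipe file named mypipe is created in the current directory.

Running

ls -l mypipe

shows the file's attributes.

[Figure: ls -l output after creating mypipe]
In the listing, the number 1 is the link count and 0 is the size.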

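Processes can access a named pipe just like an ordinary file.

Writes to and reads from a named pipe are synchronized by the Linux kernel.

How do we use this pipe? Here is a basic demonstration.

We need two shells. Using my server as an example, open two sh terminals.

In the first terminal, in the directory that contains the pipe, run:

echo "hello" > mypipe

This redirects "hello" into mypipe, as shown below:

[Figure: terminal 1]

The command blocks.

This is because opening a FIFO for writing blocks until some process opens it for reading. No reader exists yet, so the sh process waits.

Next, in the second terminal, run:

cat mypipe

to read the contents of the pipe, as shown below:

[Figure: terminal 2]

Once the contents of the pipe have been read, the first terminal leaves the blocked state:

[Figure: terminal 1]

Readers can try this for themselves.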

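Named Pipe C Program Example

How do we use a named pipe in a C program?

The example consists of two programs, a writer followed by a reader: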

/*************************************************************************
	> File Name: named_pipe.c
	> Author:Royi 
	> Mail:royi990001@gmail.com 
	> Created Time: Fri 02 Feb 2024 03:20:10 PM CST
	> Describe: 
 ************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

char *line = "testing named pipe";

int main() {
    int fd;
    // create the FIFO; the call simply fails if mypipe already exists
    mknod("mypipe", S_IFIFO | 0666, 0);
    // open blocks until a reader opens the pipe
    fd = open("./mypipe", O_WRONLY);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    write(fd, line, strlen(line));
    close(fd);
    return 0;
}

/*************************************************************************
	> File Name: named_pipe.c
	> Author:Royi 
	> Mail:royi990001@gmail.com 
	> Created Time: Fri 02 Feb 2024 03:20:10 PM CST
	> Describe: 
 ************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

char buf[128];

int main() {
    int n;
    // open blocks until a writer opens the pipe
    int fd = open("./mypipe", O_RDONLY);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    n = read(fd, buf, sizeof(buf) - 1);   // leave room for the terminator
    if (n < 0) {
        perror("read");
        exit(1);
    }
    buf[n] = '\0';
    printf("%s\n", buf);
    close(fd);
    return 0;
}