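Linux Notes: Linux Multithreading (Part 1)

Contents

Linux Notes: Linux Multithreading (Part 1)
I. Understanding Threads
- 1. Threads from a resource perspective
- 2. Execution flows
- 3. Multithreaded programming
- 4. Thread resources
- 5. Why thread switches cost less
- 6. Pros and cons of threads
- 7. Thread faults
II. Thread Control
- 1. The clone function
- 2. Thread faults
- 3. Waiting for a thread
- 4. Return value of the thread function
- 5. Thread exit
- 6. Thread cancellation
- 7. Thread IDs
- 8. Thread-local storage
- 9. Program replacement
- 10. Detaching a thread
- 11. The C++ thread library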

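I. Understanding Threads

1. Threads from a resource perspective

Threads in Linux:
The resources of a process are divided, by certain technical means, among several task_struct objects. Each of these task_struct objects can be called a thread, and all of them refer to the same address space and the same page table.

This is a thread implementation specific to Linux.
From the kernel's point of view, the process is the basic entity that owns resources. The OS represents both processes and threads with task_struct, and threads only partition resources that the process already holds, so the process is the basic unit of resource allocation.
A thread runs inside a process and is the basic unit of OS scheduling.
Threads run within the address space of their process. The CPU does not care whether an execution flow is a process or a thread; it only deals with the PCB.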

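2. Execution flows

The programs we have written so far contain a single execution flow; a multithreaded program contains several.
Process = execution flows (at least one) + kernel data structures + the code and data of the process
A task_struct is one execution flow inside a process. From the CPU's point of view there are only task_struct objects, whether they belong to a single-threaded process or to a thread.
Linux has no dedicated thread structure; it models threads with the process PCB. Execution flows on Linux are therefore collectively called lightweight processes (LWPs): the PCB the CPU sees is lighter than a traditional process.
The Linux kernel does not provide thread-specific interfaces directly, only lightweight-process interfaces.
Threads as the user sees them are implemented at user level on top of those interfaces and offered as a library. The pthread library is the native thread library on Linux.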

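3. Multithreaded programming

The interfaces of the pthread library are what we use to write multithreaded programs on Linux. A thread is created with pthread_create:

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

Parameters:
thread: output parameter that receives the ID of the new thread
attr: thread attributes (pass nullptr for the defaults)
start_routine: function pointer; the entry function the new thread executes, i.e. the part of the process's code that it runs
arg: argument passed to start_routine when it is called back
Return value: 0 on success; an error number on failure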

makefile:

mythread:mythread.cpp
	g++ -o $@ $^ -lpthread
.PHONY:clean
clean:
	rm -f mythread

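The -lpthread option links the program against the pthread library.
The thread-creation functions are provided by that library, so it has to be linked in.
Without -lpthread, the link step can fail with an "undefined reference to `pthread_create'" error (glibc 2.34 and later merge libpthread into libc, so the option is no longer strictly required there).
mythread.cpp: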

#include<iostream>
#include<string>
#include<cstdio>
#include<unistd.h>
#include<pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void* threadRun(void* args)
{
    const string name = (char*)args;
    while(true)
    {
        cout << name << ", pid: " << getpid() << "\n" << endl;
        sleep(1);
    }
}


int main()
{
    pthread_t tid[5];
    char name[64];
    for(int i = 0; i < 5; i++)
    {
        snprintf(name, sizeof(name), "%s-%d", "thread", i);
        pthread_create(tid+i, nullptr, threadRun, (void*)name);
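        sleep(1); // workaround: every thread is handed the same name buffer, so wait until it has been copied before the next iteration overwrites it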
    }

    while(true)
    {
        cout << "main thread, pid: " << getpid() << endl;
        sleep(3);
    }
    
    return 0;
}

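The code above creates five threads inside one process; each of them runs threadRun and prints its PID.
When run, all threads print the same PID, which shows that they run inside the same process.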
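
The sleep(1) in the creation loop only hides the fact that all five threads receive a pointer to the same name buffer. Below is a minimal sketch of one common alternative, assuming each thread gets its own heap-allocated argument and frees it itself. The names ThreadArg, worker and run_workers are made up for this illustration and are not part of the original program.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <vector>

// Hypothetical per-thread argument: every thread owns its own copy.
struct ThreadArg {
    std::string name;
    std::string* out; // slot where the thread records the name it received
};

void* worker(void* p) {
    ThreadArg* arg = static_cast<ThreadArg*>(p);
    *arg->out = arg->name; // each thread writes only to its own slot
    delete arg;            // the thread releases its own argument
    return nullptr;
}

// Starts n threads without sleeping between creations and returns
// the names the threads observed, in creation order.
std::vector<std::string> run_workers(int n) {
    std::vector<pthread_t> tids(n);
    std::vector<std::string> seen(n);
    for (int i = 0; i < n; ++i) {
        ThreadArg* arg = new ThreadArg{"thread-" + std::to_string(i), &seen[i]};
        int rc = pthread_create(&tids[i], nullptr, worker, arg);
        assert(rc == 0);
        (void)rc;
    }
    for (pthread_t t : tids)
        pthread_join(t, nullptr); // after the joins, every slot has been written
    return seen;
}
```

Calling run_workers(5) from main should give each thread its own name even though no sleep separates the pthread_create calls, because no two threads share an argument object.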

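From the user's point of view there is only one process.
Threads can be listed with the command ps -aL; the -L option shows lightweight processes.

LWP is the ID of a lightweight process and is different for every thread. The thread whose LWP equals the PID is the main thread.
The OS schedules by LWP.
All threads use the resources of their process; once the process exits, all of its threads exit too.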
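
The PID/LWP relationship can also be observed from inside the program. This is a small sketch, assuming a Linux system where the raw gettid system call is available through syscall(SYS_gettid). The helper names Ids, current_ids, record_ids and ids_of_new_thread are invented for the example.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// PID and LWP (kernel thread ID) of the calling thread.
struct Ids {
    pid_t pid;
    pid_t lwp;
};

Ids current_ids() {
    return {getpid(), static_cast<pid_t>(syscall(SYS_gettid))};
}

void* record_ids(void* out) {
    *static_cast<Ids*>(out) = current_ids();
    return nullptr;
}

// Creates one thread and returns the PID and LWP that thread observed.
Ids ids_of_new_thread() {
    Ids ids{};
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, record_ids, &ids);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);
    return ids;
}
```

Comparing current_ids() in main with ids_of_new_thread() should show the same pid and a different lwp, which matches the two columns printed by ps -aL. syscall(SYS_gettid) is used here instead of a gettid() wrapper because older glibc versions do not provide one.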

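4. Thread resources

The threads of a process share one address space: a function defined in the program can be called by every thread, and a global variable can be accessed by every thread. In addition, the threads share the following resources and environment:
file descriptors;
the disposition of each signal;
the current working directory;
the user ID and group ID.
Threads share the data of the process, but each thread also has some data of its own:
a thread ID;
a set of registers;
a stack;
errno;
a signal mask;
a scheduling priority.
(Of these, the stack and the registers represent the dynamic state of a thread; the registers are the thread's context.)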
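
The split between shared and private data can be illustrated with a short sketch. It assumes that taking the address of errno is a reasonable way to show that each thread has its own copy. The names g_shared, touch and errno_address_in_new_thread are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <pthread.h>

int g_shared = 0; // global variable: one copy, shared by all threads

// Hypothetical thread function: writes the shared global and reports
// where its own errno lives.
void* touch(void* out) {
    g_shared = 42;
    *static_cast<std::uintptr_t*>(out) = reinterpret_cast<std::uintptr_t>(&errno);
    return nullptr;
}

// Returns the address of errno as seen by a freshly created thread.
std::uintptr_t errno_address_in_new_thread() {
    std::uintptr_t addr = 0;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, touch, &addr);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);
    return addr;
}
```

After the call, the main thread sees g_shared == 42 (shared data), while the address returned differs from the main thread's own &errno (private data).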

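5. Why thread switches cost less

Switching between threads of the same process does not require switching the address space or the page table.
The CPU has L1~L3 caches (following the principle of locality, code and data in memory are read ahead into the CPU).
After a process switch, what the CPU has cached for the old process is of little use: the cached address translations belong to a different address space, and the new process has to warm the caches up again. A switch between threads of the same process keeps them usable.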

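6. Pros and cons of threads

Advantages: switching between threads is cheaper than switching between processes (see 5), and threads share the resources of their process (see 4), so sharing data is simple.
Disadvantages: reduced robustness; a fault in one thread brings down the whole process (see 7).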

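7. Thread faults

If a single thread crashes because of a division by zero, a wild pointer, or a similar fault, the whole process crashes with it.
A thread is an execution branch of the process, so a fault in a thread counts as a fault in the process: the signal mechanism is triggered, the process is terminated, and every thread in it exits as well.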

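II. Thread Control

1. The clone function

int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ...);

fork creates a new process by calling clone underneath;
creating a thread also goes through clone.

2. Thread faults

What happens when a thread divides by zero: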

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        int a = 10;
        a /= 0;
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    while (true)
    {
        cout << "main thread: " << " running ..." << endl;
        sleep(1);
    }

    return 0;
}

When run, the faulting thread does not die alone: one thread terminating abnormally takes the whole process down with it.

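3. Waiting for a thread

A thread that has been created and has run must be waited for, normally by the main thread. If nobody waits for it, its resources are not released after it ends, which is a leak similar to the zombie problem with processes.

pthread_join waits for a thread:

int pthread_join(pthread_t thread, void **retval);

Parameters:
thread: ID of the thread to wait for;
retval: of type void**; receives the result of the thread.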

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

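// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    int i = 0;
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        if(i++ == 3) 
            break;
    }
    return nullptr; // a start routine must return a value; falling off the end is undefined behavior
}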

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    pthread_join(tid, nullptr); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";

    // while (true)
    // {
    //     cout << "main thread: " << " running ..." << endl;
    //     sleep(1);
    // }

    return 0;
}

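In the code above the new thread exits once threadRoutine finishes. pthread_join in the main thread blocks until then and afterwards reclaims the new thread's resources.
Monitoring script: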

while :; do ps -aL | head -1 && ps -aL | grep mythread; sleep 1; done

When run, the main thread waits for the new thread to exit, reclaims its resources, and only then exits itself.

4. Return value of the thread function

The second parameter of pthread_join has type void** and is used to fetch the result of the thread.

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    int i = 0;
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        if(i++ == 3) 
            break;
    }
    cout << "new thread quit" << endl;
    return (void*)10;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    void* ret = nullptr;
    pthread_join(tid, &ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";
    cout << "ret: " << (long long)ret << endl;

    return 0;
}

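In the code above:

ret is a pointer variable occupying sizeof(void*) bytes; on 64-bit Linux an address is 8 bytes.
A pointer value is an address; a pointer variable is a variable that stores such a value.
The start routine returns void*.
To receive a void* result through an output parameter, that parameter has to be of type void**.
The return statement casts 10 to void*, so 10 is now treated as a pointer value, i.e. an address. pthread_join stores that value through the void** argument, so afterwards ret holds the "pointer" 10.
Because a pointer is 8 bytes on 64-bit Linux, the value is cast to long long (an integer of the same size) before it is printed.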
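
The cast to long long works on 64-bit Linux; a slightly more portable way to spell the same idea, offered here only as a sketch, is to go through std::intptr_t, an integer type wide enough to hold a pointer. The function names return_code and join_and_decode are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

void* return_code(void*) {
    // Encode a small integer in the pointer value; nothing is ever dereferenced.
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(10));
}

// Runs one thread, joins it, and decodes the integer carried in its return value.
long join_and_decode() {
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, return_code, nullptr);
    assert(rc == 0);
    (void)rc;
    void* ret = nullptr;
    pthread_join(tid, &ret); // ret now holds the value returned by return_code
    return static_cast<long>(reinterpret_cast<std::intptr_t>(ret));
}
```

With this, join_and_decode() evaluates to 10 without assuming a particular pointer width in the cast.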

Running it prints ret: 10 at the end.
The result can also be memory that the new thread allocated on the heap:

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    int i = 0;
    int* data = new int[4]; // allocated by the new thread; the loop below writes data[0] through data[3]
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        data[i] = i;
        if(i++ == 3) 
            break;
    }
    cout << "new thread quit" << endl;
    return (void*)data; // return the address of the allocated memory
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    int* ret = nullptr;
    pthread_join(tid, (void**)&ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";
    for(int i = 0; i < 3; i++)
    {
        cout << ret[i] << endl;
    }
    delete[] ret; // release the array allocated by the new thread
    
    return 0;
}

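When run, the main thread prints the values that the new thread stored in the array.

Note: pthread_join cannot report that a thread terminated abnormally. There is nothing to report: a fault in one thread kills the whole process, so the joining thread never gets to see it.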

5. Thread exit

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

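// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    int i = 0;
    int* data = new int[4]; // allocated by the new thread; the loop below writes data[0] through data[3]
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        data[i] = i;
        if(i++ == 3) 
            break;
    }
    exit(10); // terminates the whole process, not just this thread
    cout << "new thread quit" << endl;
    return (void*)data; // return the address of the allocated memory (never reached)
}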

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    int* ret = nullptr;
    pthread_join(tid, (void**)&ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";
    for(int i = 0; i < 3; i++)
    {
        cout << ret[i] << endl;
    }

    return 0;
}

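Once the new thread calls exit, none of the code after it runs, and the main thread never gets past pthread_join.
Calling exit inside any thread terminates the whole process.

To end only the calling thread, use pthread_exit:

void pthread_exit(void *retval);

Parameter:
retval: a void* value, comparable to a process exit code, that another thread can fetch with pthread_join.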

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

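// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    int i = 0;
    int* data = new int[4]; // allocated by the new thread; the loop below writes data[0] through data[3]
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
        data[i] = i;
        if(i++ == 3) 
            break;
    }
    cout << "new thread quit" << endl;
    delete[] data; // not returned in this example, so release it here
    pthread_exit((void*)11); // exit value 11
}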

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    int* ret = nullptr;
    pthread_join(tid, (void**)&ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";

    cout << "ret: " << (long long)ret << endl;

    return 0;
}

Running it prints ret: 11 at the end.

6. Thread cancellation

pthread_cancel:

int pthread_cancel(pthread_t thread);

Parameter:
thread: ID of the thread to cancel.

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << endl;
        sleep(1);
    }
    cout << "new thread quit" << endl;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    int count = 0;
    while (true)
    {
        cout << "main thread: "
             << " running ..." << endl;
        sleep(1);
        count++;
        if (count == 5)
            break;
    }

    pthread_cancel(tid);
    cout << "pthread cancel: " << tid << endl;

    int *ret = nullptr;
    pthread_join(tid, (void **)&ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";
    sleep(5);
    return 0;
}

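In the code above the new thread runs indefinitely until the main thread cancels it.

Notes:
(1) When a cancelled thread is joined, the value stored through retval is PTHREAD_CANCELED, which is (void*)-1.

(2) A typical use of cancellation: the new thread has been running for a while and the main thread wants to stop it.
(3) Do not cancel the main thread from a new thread, because the main thread is responsible for calling join to reclaim the new threads.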
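
Note (1) can be checked directly. The following sketch assumes a thread that does nothing but sleep (sleep is a cancellation point) and compares the joined value against PTHREAD_CANCELED. The names spin and cancel_and_check are made up for the illustration.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Hypothetical thread function: loops forever; sleep() is a cancellation point,
// so a pending cancellation request is acted on there.
void* spin(void*) {
    for (;;)
        sleep(1);
}

// Creates a thread, cancels it, and reports whether pthread_join
// delivered PTHREAD_CANCELED as the thread's result.
bool cancel_and_check() {
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, spin, nullptr);
    assert(rc == 0);
    (void)rc;
    pthread_cancel(tid);
    void* ret = nullptr;
    pthread_join(tid, &ret);
    return ret == PTHREAD_CANCELED;
}
```

Comparing against the PTHREAD_CANCELED macro rather than a literal -1 keeps the check independent of how the constant is defined.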

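7. Thread IDs

Print the thread ID:

It is different from the LWP:

The reason is that the thread ID is really an address.
We are not using a kernel interface to create threads; we are using the interfaces of the pthread library.
Every thread needs its own stack. To guarantee that each thread has a stack to itself, the stack is provided at user level.

The pthread library manages and maintains the private data of each thread.
The stacks of new threads are provided by the library and live in the shared (memory-mapped) region of the address space.
The thread ID is therefore the start address of that thread's attribute data inside the library.
The main thread uses the ordinary stack region of the process's address space; new threads use library-provided stacks in the shared region.

Underneath, a new thread gets its stack through clone: the child_stack parameter is the address of the stack.

A thread can obtain its own ID with pthread_self: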

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    while (true)
    {
        cout << "new thread: " << (char *)args << " running ..." << pthread_self() << endl;
        sleep(1);
    }
    cout << "new thread quit" << endl;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    while (true)
    {
        cout << "main thread: "
             << " running ..." << endl;
        sleep(1);
    }

    int *ret = nullptr;
    pthread_join(tid, (void **)&ret); // blocks until the new thread exits
    cout << "main thread wait done ... main quit\n ";
    return 0;
}

When run, every line printed by the new thread ends with its own thread ID.
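
That pthread_self returns the same ID that pthread_create handed to the creator can be confirmed with pthread_equal. This is only a sketch; the names self_id and self_matches_create are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical thread function: stores its own ID where the creator can read it.
void* self_id(void* out) {
    *static_cast<pthread_t*>(out) = pthread_self();
    return nullptr;
}

// Returns true if the ID seen inside the thread equals the ID that
// pthread_create returned to the creating thread.
bool self_matches_create() {
    pthread_t tid;
    pthread_t seen;
    int rc = pthread_create(&tid, nullptr, self_id, &seen);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr); // after the join, seen has been written
    return pthread_equal(tid, seen) != 0;
}
```

pthread_equal is used instead of == because pthread_t is an opaque type as far as POSIX is concerned, even though the text above explains what it holds on Linux.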

8. Thread-local storage

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

int g_val = 0;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    while (true)
    {
        cout << (char *)args << " : " << g_val << " &: " << &g_val << endl;
        sleep(1);
        g_val++;
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    while (true)
    {
        cout << "main thread: " << " : " << g_val << " &: " << &g_val << endl;
        sleep(1);
    }
    
    return 0;
}

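The code above defines a global variable that the main thread and the new thread access at the same time.
All threads share the global variable: both threads print the same address, and the main thread sees the increments made by the new thread.

__thread makes a global variable private to each thread. Put __thread in front of the definition:

__thread int g_val = 0;

The two threads now each access their own copy of the variable; every thread has a global variable of its own.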
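
The effect of __thread can be checked with a small sketch in which one thread increments its copy and the main thread's copy stays untouched. __thread is a GCC/Clang extension; standard C++11 spells the same idea thread_local. The names t_val, bump and thread_copy_after_bump are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

__thread int t_val = 0; // one independent copy per thread

// Hypothetical thread function: increments its own copy five times
// and returns the final value encoded in the pointer.
void* bump(void*) {
    for (int i = 0; i < 5; ++i)
        ++t_val;
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(t_val));
}

// Runs bump in a new thread and returns the value of that thread's copy.
int thread_copy_after_bump() {
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, bump, nullptr);
    assert(rc == 0);
    (void)rc;
    void* ret = nullptr;
    pthread_join(tid, &ret);
    return static_cast<int>(reinterpret_cast<std::intptr_t>(ret));
}
```

The new thread's copy ends at 5, while t_val read from the calling thread is still 0 afterwards, since the increments never touched that copy.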

9. Program replacement

What happens if a thread calls one of the exec functions:

#include <iostream>
#include <string>
#include <cstdio>
#include <unistd.h>
#include <pthread.h>

using namespace std;

__thread int g_val = 0;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    sleep(5);
    execl("/bin/ls", "ls", (char *)nullptr); // the argument list must end with a null pointer
    
    while (true)
    {
        cout << (char *)args << " : " << g_val << " &: " << &g_val << endl;
        sleep(1);
        g_val++;
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    while (true)
    {
        cout << "main thread: " << " : " << g_val << " &: "<< &g_val << endl;
        sleep(1);
        //count++;
        // if (count == 5)
        //     break;
    }

    return 0;
}

A program replacement performed inside a thread replaces the whole process: after five seconds the process stops printing and runs ls instead.

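10. Detaching a thread

By default a newly created thread is joinable: after it exits, pthread_join must be called on it, otherwise its resources are not released.
If the result of the thread is of no interest, joining is a burden. In that case we can tell the system to release the thread's resources automatically when it exits. This is thread detaching.
pthread_detach is the interface for detaching a thread.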

#include <iostream>
#include <string>
#include <cstdio>
#include <cstring>
#include <cerrno>
#include <unistd.h>
#include <pthread.h>

using namespace std;

__thread int g_val = 0;

// The start routine passed to pthread_create takes a void* argument and returns void*
void *threadRoutine(void *args)
{
    pthread_detach(pthread_self());
    while (true)
    {
        cout << (char *)args << " : " << g_val << " &: " << &g_val << endl;
        sleep(1);
        g_val++;
        // data[i] = i;
        // if(i++ == 3)
        break;
    }

    pthread_exit((void*)11); // exit value 11
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread 1");

    while (true)
    {
        cout << "main thread: " << " : " << g_val << " &: "<< &g_val << endl;
        sleep(1);
        break;
    }

    int n = pthread_join(tid, nullptr);
    cout << "errstring: " << strerror(n) << endl;

    return 0;
}