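1. Creating Threads

1.1 The pthread_create function

1.2 What a thread id really is

2. Exceptions and Program Replacement in Multithreaded Programs

2.1 Exceptions in multithreaded programs

2.2 Program replacement in multithreaded programs

3. Waiting for Threads

4. Thread Termination and Detachment

4.1 Returning from the thread function

4.2 Thread cancellation: pthread_cancel

4.3 Thread exit: pthread_exit

4.4 Thread detachment: pthread_detach

5. Summary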

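1. Creating Threads

1.1 The pthread_create function

Prototype: int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *args);

Purpose: creates a new thread.

Parameters:

thread -- output parameter that receives the id of the new thread

attr -- thread attributes; usually nullptr, which selects the default attributes

start_routine -- entry function of the new thread

args -- argument passed to start_routine

Return value: 0 on success, the corresponding error number on failure.

Error checking with the pthread family of functions:

Most Linux system call wrappers return 0 (or another non-negative value) on success and -1 on failure, with the cause stored in the global errno.
The pthread functions behave differently: they return 0 on success and the error number itself on failure, and they do not set errno.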

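Code 1.1 shows how to create a thread with pthread_create. The main function prints the child thread's id both in decimal and in hexadecimal; Figure 1.1 shows the output.

Code 1.1: creating a thread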

#include <iostream>
#include <cstdio>
#include <cstdlib>   // exit
#include <cstring>
#include <pthread.h>
#include <unistd.h>

// Entry function of the new thread
void *threadRoutine(void *args)
{
    while(true)
    {
        std::cout << (char*)args << std::endl;
        sleep(1);
    }

    return nullptr;
}

int main()
{
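    pthread_t tid;   // output parameter that receives the child thread's id

    // Create the thread with pthread_create:
    // tid receives the new thread's id, nullptr selects the default attributes,
    // threadRoutine is the entry function and "thread 1" is its argument
    int n = pthread_create(&tid, nullptr, threadRoutine, (void*)"thread 1");

    if(n != 0)  // check whether the thread was created
    {
        std::cout << "error:" << strerror(n) << std::endl;
        exit(1);
    }

    while(true)
    {
        // pthread_t is an unsigned long with glibc, so cast and print with %lu / %lx
        printf("main thread, tid = %lu 0x%lx\n", (unsigned long)tid, (unsigned long)tid);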
        sleep(1);
    }

    return 0;
}

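1.2 What a thread id really is

As Figure 1.2 shows, the Linux thread library pthread keeps a block of bookkeeping data for every thread in user space: the struct pthread that describes the thread, the thread-local storage, and the thread stack.

The start address of this block is the thread's id. In other words, a thread id is an address in the shared region of the process address space, the same region where dynamic libraries are mapped. On 64-bit Linux such an address is a large number, which is why the id printed by Code 1.1 is so big.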

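To keep every thread's stack independent, Linux provides thread stacks at user level: the thread library gives each thread it creates its own "stack region", while the main thread keeps using the stack region of the process address space itself.

How Linux keeps thread stacks independent:

The stacks of child threads are provided at user level.
The main thread uses the stack region of the address space itself.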

What a thread id really is: an address in the shared region of the address space.

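2. Exceptions and Program Replacement in Multithreaded Programs

2.1 Exceptions in multithreaded programs

If any thread of a multithreaded program hits an exception while it runs, the whole process may exit. The exception is reported as a signal, and the default action of such a signal terminates the process, so the impact is never limited to the faulting thread.

Code 2.1 creates two child threads. threadRoutine2 deliberately divides by zero to trigger an exception, and the whole process exits; it never happens that only this one thread terminates.

Conclusion: an exception in any thread affects the whole process and makes the whole process exit.

Code 2.1: exception in a multithreaded program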

#include <iostream>
#include <pthread.h>
#include <unistd.h>

void *threadRoutine1(void *args)
{
    while(true)
    {
        std::cout << (char*)args << std::endl;
        sleep(1);
    }
    return nullptr;
}

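void *threadRoutine2(void *args)
{
    while(true)
    {
        std::cout << "thread 2, division by zero!" << std::endl;
        // volatile keeps the compiler from optimizing the division away
        volatile int zero = 0;
        volatile int a = 10;
        a = a / zero;   // integer division by zero raises SIGFPE on x86
    }
    return nullptr;
}

int main()
{
    pthread_t tid1, tid2;

    // Create thread 1, then thread 2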
    pthread_create(&tid1, nullptr, threadRoutine1, (void*)"thread 1");
    sleep(1);
    pthread_create(&tid2, nullptr, threadRoutine2, (void*)"thread 2");

    while(true)
    {
        std::cout << "main thread ... ... " << std::endl;
        sleep(1);
    }
    
    return 0;
}

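2.2 Program replacement in multithreaded programs

Program replacement behaves like an exception in this respect. If one thread performs a program replacement, it is not the case that this thread runs the new program while the other threads carry on with their work. The whole process is replaced and runs the new program.

In Code 2.2, threadRoutine1 calls execl to run the system command ls. Running the code shows that once the child thread has performed the replacement, the main thread stops running as well; the process executes ls and then terminates.

Conclusion: program replacement in a multithreaded program replaces the whole process, not just one thread.

Code 2.2: program replacement in a multithreaded program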

#include <iostream>
#include <cstdio>
#include <cstdlib>   // exit
#include <cstring>
#include <pthread.h>
#include <unistd.h>

void *threadRoutine1(void *args)
{
    while(true)
    {
        std::cout << (char*)args << std::endl;
        execl("/bin/ls", "ls", nullptr);   // program replacement inside the child thread
        exit(1);                           // reached only if execl fails
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    
    // Create the thread
    int n = pthread_create(&tid, nullptr, threadRoutine1, (void*)"thread 1");
    if(n != 0)  
    {
        // Thread creation failed
        std::cout << strerror(n) << std::endl;
        exit(1);
    }

    while(true)
    {
        std::cout << "main thread" << std::endl;
        sleep(1);
    }

    return 0;
}

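3. Waiting for Threads

Waiting for a thread is similar to waiting for a process: the main thread waits for a child thread to exit in order to collect its return value. If a joinable child thread exits and is never waited for while the process keeps running, the resources kept for it (such as its stack and thread descriptor) are not released. This is the thread counterpart of the "zombie state", and it leaks memory.

The pthread_join function waits for a thread.
pthread_join always blocks; POSIX offers no non-blocking way to wait for a thread (glibc provides the non-standard pthread_tryjoin_np as an extension).

pthread_join -- wait for a thread

Prototype: int pthread_join(pthread_t thread, void **ret);

Parameters:

thread -- id of the thread to wait for

ret -- output parameter that receives the return value of the thread function

Return value: 0 on success, an error number on failure.

In Code 3.1, the thread function threadRoutine allocates space for five ints on the heap with new, fills it with the values 1 to 5, and returns a pointer to this memory. The main thread waits for the child thread to exit and can then access the memory. Note that the thread function returns void*, so the value has to be cast before it is used.

Code 3.1: waiting for a thread with pthread_join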

#include <iostream>
#include <cstdio>
#include <cstring>
#include <pthread.h>
#include <unistd.h>

void *threadRoutine(void *args)
{
    std::cout << (char*)args << std::endl;
    int *pa = new int[5];
    for(int i = 0; i < 5; ++i)
    {
        pa[i] = i + 1;
    }
    return (void*)pa;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void*)"thread 1");

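    void *ret = nullptr;
    // Wait for the thread to exit; ret receives the thread function's return value
    pthread_join(tid, &ret);
    int *pa = static_cast<int*>(ret);   // cast the void* back to the real type

    // Read the data that the returned pointer refers to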
    std::cout << "thread exit" << std::endl;
    for(int i = 0; i < 5; ++i)
    {
        printf("pa[%d] = %d\n", i, pa[i]);
    }

    delete[] pa;
    
    return 0;
}

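4. Thread Termination and Detachment

A thread can be terminated in the following ways:

The thread function returns.
Another thread cancels it with pthread_cancel.
The thread exits with pthread_exit.

4.1 Returning from the thread function

The third parameter of pthread_create, start_routine, is a pointer to the thread function that the new thread executes. When this function finishes and returns, the thread exits.

This does not work for the main thread: when the main thread returns from main, the process terminates and all threads exit with it.

Conclusion: when a thread function returns, its thread exits; when the main thread returns, the process exits, so return is not a way to end only the main thread.

A thread function takes a void* argument and returns a void*. Once it reaches return, the thread exits; threadRoutine in Code 3.1 terminates its thread this way.

Code 4.1 checks what happens when the main thread returns. The thread function prints in an endless loop, but the main thread returns after sleep(2) following the thread creation. The thread function stops running at that point, which shows that return cannot be used to end just the main thread.

Code 4.1: the main thread cannot exit with return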

#include <iostream>
#include <pthread.h>
#include <unistd.h>

// Thread function that loops forever
void *threadRoutine1(void *args)
{
    while(true)
    {
        std::cout << (char*)args << std::endl;
        sleep(1);
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    
    // Create the thread
    pthread_create(&tid, nullptr, threadRoutine1, (void*)"thread 1");
    std::cout << "main thread" << std::endl;
    sleep(2);   // the main thread sleeps 2s and then returns

    return 5;
}

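4.2 Thread cancellation: pthread_cancel

The pthread_cancel function cancels the thread with the given id.

pthread_cancel -- cancel a thread

Prototype: int pthread_cancel(pthread_t thread);

Parameter: thread -- id of the thread to cancel

Return value: 0 on success, a nonzero error number on failure.

Usually the main thread cancels a child thread. A thread can also cancel itself with pthread_cancel(pthread_self()), where pthread_self returns the calling thread's own id, but this is rarely done.

pthread_self -- get the id of the calling thread.

A cancelled thread still has to be waited for with pthread_join (unless it has been detached), otherwise its resources are not released. When a cancelled thread is joined, the second, output parameter of pthread_join does not receive a value returned by the thread function; it receives PTHREAD_CANCELED, which glibc defines as (void*)-1.

Conclusion: if a thread was cancelled with pthread_cancel, pthread_join reports (void*)-1 as its return value.

In Code 4.2, the main thread cancels the child thread with pthread_cancel and then waits for it with pthread_join. Printing ret cast to long long shows the value -1.

Code 4.2: cancelling a child thread and joining the cancelled thread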

#include <iostream>
#include <pthread.h>
#include <unistd.h>

// Thread function
void *threadRoutine1(void *args)
{
    while(true)
    {
        std::cout << (char*)args << std::endl;
        sleep(1);
    }
    return (void*)10;
}

int main()
{
    pthread_t tid;
    
    // Create the thread
    pthread_create(&tid, nullptr, threadRoutine1, (void*)"thread 1");
    std::cout << "main thread" << std::endl;
    sleep(2);   

    pthread_cancel(tid);   // cancel the child thread whose id is tid

    void *ret = nullptr;
    int n = pthread_join(tid, &ret);    // wait for the cancelled thread to exit
    
    std::cout << "ret : " << (long long)ret << std::endl;

    return 0;  
}

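4.3 Thread exit: pthread_exit

Called inside a thread function, pthread_exit sets the thread's return value and terminates the thread, which is essentially what return does. Note that exit must not be used to end a thread: calling exit in any thread makes the whole process exit.

pthread_exit -- terminate the calling thread

Prototype: void pthread_exit(void *ret);

Parameter: ret -- exit value (return value) of the thread

Code 4.3 calls pthread_exit in the thread function to terminate the thread with the return value (void*)111. The main thread waits for the child thread, stores the return value in ret and prints (long long)ret, which confirms that the child thread returned (void*)111.

Code 4.3: terminating a thread with pthread_exit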

#include <iostream>
#include <cstdio>
#include <cstring>
#include <pthread.h>
#include <unistd.h>

// Thread function
void *threadRoutine1(void *args)
{
    int count = 0;
    while(true)
    {
        std::cout << (char*)args << ", count:" << ++count << std::endl;
        if(count == 3) pthread_exit((void*)111);
        sleep(1);
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    
    // Create the thread
    pthread_create(&tid, nullptr, threadRoutine1, (void*)"thread 1");
    std::cout << "main thread" << std::endl;
    sleep(5);   

    void *ret = nullptr;
    pthread_join(tid, &ret);   
    std::cout << "[main thread] child thread exit, ret:" << (long long)ret << std::endl;

    return 0;  
}

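4.4 Thread detachment: pthread_detach

Strictly speaking, pthread_detach does not terminate a thread. Even if a thread function detaches itself with pthread_detach(pthread_self()), the code after the call still runs normally.

pthread_detach is meant for threads whose exit status nobody cares about. A detached child thread releases its resources by itself when it exits, so it causes no zombie problem even if the main thread never waits for it.

Typically a thread detaches itself. The main thread can also detach a child thread, although this is not recommended.

A thread that has been detached with pthread_detach should not be waited for with pthread_join. If a detached thread is joined anyway, pthread_join returns an error number.

Conclusion: (1) pthread_detach detaches child threads whose exit status is of no interest. (2) A detached thread should not be joined; if it is, pthread_join returns a nonzero error number.

Code 4.4 shows that the thread function keeps running after pthread_detach and that waiting for the detached thread fails.

Code 4.4: detaching a thread and joining the detached thread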

#include <iostream>
#include <cstdio>
#include <cstring>
#include <pthread.h>
#include <unistd.h>

// Thread function
void *threadRoutine1(void *args)
{
    // The child thread detaches itself
    pthread_detach(pthread_self());

    int count = 0;
    while(true)
    {
        std::cout << (char*)args << ", count:" << ++count << std::endl;
        if(count == 3) pthread_exit((void*)111);
        sleep(1);
    }

    return (void*)10;
}

int main()
{
    pthread_t tid;
    
    // Create the thread
    pthread_create(&tid, nullptr, threadRoutine1, (void*)"thread 1");
    std::cout << "main thread" << std::endl;
    sleep(5);   

    void *ret = nullptr;
    int n = pthread_join(tid, &ret);    // try to wait for the detached thread; expected to fail

    if(n != 0)  // 检验是否等待成功
    {
        std::cout << "wait thread error -> " << strerror(n) << std::endl;
    }

    return 0;  
}

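5. Summary

pthread_create creates a child thread. The data and attribute fields used to manage threads are kept by the dynamic thread library, and a thread id is essentially an address in the shared region of the address space.
Linux does not strictly distinguish between processes and threads at the system level, and CPU scheduling only deals with PCBs (task_struct). To keep each thread's stack independent, the stacks of child threads are provided at user level (by the dynamic library), while the main thread uses the stack region of the address space.
In a multithreaded program, an exception in any thread affects the whole process, and if any thread calls one of the exec functions to replace the program, the whole process is replaced.
pthread_join lets the main thread wait for a child thread. A child thread that is neither detached nor joined leaves a zombie problem behind.
There are three ways to terminate a thread: (1) Returning from the thread function; this does not apply to the main thread. (2) Calling pthread_exit in the thread. (3) Cancelling the thread with pthread_cancel; when a cancelled thread is joined, pthread_join reports (void*)-1 as its return value.
If the exit status of a child thread is of no interest, the thread can be detached with pthread_detach. A detached thread should not be joined; if it is, pthread_join returns a nonzero error number.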