Chapter 27 Interlude: The Threading API

An introduction to the thread API and how to use it.

1. Thread Creation

#include <pthread.h>

int pthread_create(
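    pthread_t *thread,              // handle used to interact with the thread; passed in so it can be initialized
    const pthread_attr_t *attr,     // thread attributes (stack size, scheduling priority, ...); may be NULL for defaults
    void *(*start_routine)(void *), // function the thread starts running
    void *arg                       // argument passed to that function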
);

2. Thread Completion

You might notice that the use of pthread_create() to create a thread, followed by an immediate call to pthread_join(), is a pretty strange way to create a thread. In fact, there is an easier way to accomplish this exact task; it’s called a procedure call. Clearly, we’ll usually be creating more than just one thread and waiting for it to complete, otherwise there is not much purpose to using threads at all.

#include <stdio.h>
#include <pthread.h>
#include <assert.h>
#include <stdlib.h>
#include "common.h"
#include "common_threads.h"

typedef struct myarg_t
{
  int a;
  int b;
} myarg_t;

typedef struct myret_t
{
  int x;
  int y;
} myret_t;

void *mythread(void *arg)
{
  myarg_t *m = (myarg_t *)arg;
  printf("%d %d\n", m->a, m->b);
  myret_t *r = Malloc(sizeof(myret_t)); // heap-allocated so the result outlives this thread's stack
  r->x = 1;
  r->y = 2;
  return (void *)r;
}

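int main(int argc, char *argv[])
{
  pthread_t p;
  myret_t *m; // receives the pointer returned by mythread
  myarg_t args;
  args.a = 10;
  args.b = 20;
  // args lives on main's stack; that is safe here only because main joins before returning
  Pthread_create(&p, NULL, mythread, (void *)&args);
  Pthread_join(p, (void **)&m);
  printf("%d %d\n", m->x, m->y);
  free(m); // the thread allocated the result; the joiner releases it
  return 0;
}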

We should note that not all code that is multi-threaded uses the join routine. For example, a multi-threaded web server might create a number of worker threads, and then use the main thread to accept requests and pass them to the workers, indefinitely. Such long-lived programs thus may not need to join.

However, a parallel program that creates threads to execute a particular task (in parallel) will likely use join to make sure all such work completes before exiting or moving onto the next stage of computation.
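
As a rough sketch of that create-then-join-all pattern (not code from the book): the helper below splits an array across a few threads and joins every one of them before combining the partial sums. The names parallel_sum, sum_slice, slice_t and the constant NUM_WORKERS are made up for illustration, and the code calls the raw pthread functions with assert instead of the book's wrappers so that it compiles without common_threads.h.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4 /* hypothetical worker count */

/* Per-thread input and output, owned by the caller of parallel_sum(). */
typedef struct slice_t
{
  const int *data; /* start of this worker's slice */
  int len;         /* number of elements in the slice */
  long sum;        /* filled in by the worker */
} slice_t;

static void *sum_slice(void *arg)
{
  slice_t *s = (slice_t *)arg;
  s->sum = 0;
  for (int i = 0; i < s->len; i++)
    s->sum += s->data[i];
  return NULL;
}

/* Returns the sum of data[0..n-1], computed by NUM_WORKERS threads. */
long parallel_sum(const int *data, int n)
{
  pthread_t tid[NUM_WORKERS];
  slice_t slice[NUM_WORKERS];
  int chunk = n / NUM_WORKERS;

  for (int i = 0; i < NUM_WORKERS; i++)
  {
    slice[i].data = data + i * chunk;
    /* the last worker also takes any remainder */
    slice[i].len = (i == NUM_WORKERS - 1) ? n - i * chunk : chunk;
    int rc = pthread_create(&tid[i], NULL, sum_slice, &slice[i]);
    assert(rc == 0);
  }

  long total = 0;
  for (int i = 0; i < NUM_WORKERS; i++)
  {
    int rc = pthread_join(tid[i], NULL); /* wait before reading the result */
    assert(rc == 0);
    total += slice[i].sum;
  }
  return total;
}
```

The slice_t entries live on parallel_sum's stack, which is safe only because the function joins every thread before it returns.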

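3. Locks

The most useful set of functions the POSIX threads library provides: locks.

These are the functions that provide mutual exclusion around a critical section.

The most basic pair is pthread_mutex_lock() and pthread_mutex_unlock().

3.1. When a region of code is a critical section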

The intent of wrapping a critical section in pthread_mutex_lock() and pthread_mutex_unlock() is as follows:

  • If no other thread holds the lock when pthread_mutex_lock() is called, the thread will acquire the lock and enter the critical section.

  • If another thread does indeed hold the lock, the thread trying to grab the lock will not return from the call until it has acquired the lock (implying that the thread holding the lock has released it via the unlock call).

  • Of course, many threads may be stuck waiting inside the lock acquisition function at a given time; only the thread with the lock acquired, however, should call unlock.

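Unfortunately, code that just declares a pthread_mutex_t and calls lock and unlock on it has two important problems:

  1. It lacks proper initialization. Every lock must be initialized before use, in one of two ways:

    1. Statically, with PTHREAD_MUTEX_INITIALIZER.

    2. Dynamically, by calling pthread_mutex_init(&lock, NULL) and checking that it returns 0.

  2. It does not check the error codes returned when acquiring and releasing the lock.

    1. Wrapper functions (or macros) can do the check in one place.

    2. #define Pthread_mutex_lock(m) assert(pthread_mutex_lock(m) == 0)

    3. Caveat: if the program is compiled with NDEBUG, assert() disappears and takes the lock call with it, so a real wrapper function is the safer choice outside of small test programs.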
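
A minimal sketch of both fixes together, assuming nothing beyond the standard pthread calls: one lock is initialized statically, another dynamically, and every return code goes through a small checking helper. The names check, guarded_increment and init_demo are hypothetical, and the example is single-threaded on purpose so that it isolates initialization and error checking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Abort with a message if a pthread call reports an error. */
static void check(int rc, const char *what)
{
  if (rc != 0)
  {
    fprintf(stderr, "%s failed with error %d\n", what, rc);
    abort();
  }
}

/* Option 1: static initialization with default attributes. */
static pthread_mutex_t static_lock = PTHREAD_MUTEX_INITIALIZER;

static int x = 0; /* state that the critical section updates */

/* Increments x inside a critical section and returns the new value. */
static int guarded_increment(pthread_mutex_t *lock)
{
  assert(lock != NULL);
  check(pthread_mutex_lock(lock), "pthread_mutex_lock");
  int v = ++x;
  check(pthread_mutex_unlock(lock), "pthread_mutex_unlock");
  return v;
}

/* Uses both kinds of lock once and returns the final value of x. */
int init_demo(void)
{
  /* Option 2: dynamic initialization, paired with a destroy. */
  pthread_mutex_t dynamic_lock;
  check(pthread_mutex_init(&dynamic_lock, NULL), "pthread_mutex_init");

  guarded_increment(&static_lock);
  int v = guarded_increment(&dynamic_lock);

  check(pthread_mutex_destroy(&dynamic_lock), "pthread_mutex_destroy");
  return v;
}
```

Unlike the assert-based macro, check() keeps calling the pthread function even when the program is built with NDEBUG.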

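Other lock-related routines exist as well.

3.2. Other routines for acquiring a lock

If the lock is already held:

  • pthread_mutex_trylock() fails immediately instead of blocking

  • pthread_mutex_timedlock() returns after the timeout expires or after it acquires the lock,

    • whichever happens first

Both become useful later as ways to avoid getting stuck forever in a lock acquisition, for example when we discuss deadlock.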
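
The behavior of both calls can be observed with a small experiment. This is only a sketch: the helper names contender and trylock_demo and the 50 ms timeout are invented for illustration, and it assumes a platform that implements pthread_mutex_timedlock() (Linux with glibc does; some systems do not provide it). Note that timedlock takes an absolute deadline, not a duration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in a second thread while the caller of trylock_demo() holds the lock. */
static void *contender(void *arg)
{
  int *results = (int *)arg;

  /* trylock never blocks: it returns EBUSY if the lock is held. */
  results[0] = pthread_mutex_trylock(&lock);

  /* Build an absolute CLOCK_REALTIME deadline 50 ms from now. */
  struct timespec deadline;
  clock_gettime(CLOCK_REALTIME, &deadline);
  deadline.tv_nsec += 50 * 1000 * 1000;
  if (deadline.tv_nsec >= 1000000000L)
  {
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }

  /* timedlock blocks until the deadline, then returns ETIMEDOUT. */
  results[1] = pthread_mutex_timedlock(&lock, &deadline);
  return NULL;
}

/* Fills results[0] (trylock) and results[1] (timedlock) as seen by the
   contender; returns the result of a final trylock on the free lock. */
int trylock_demo(int results[2])
{
  pthread_t t;
  int rc = pthread_mutex_lock(&lock);
  assert(rc == 0);
  rc = pthread_create(&t, NULL, contender, results);
  assert(rc == 0);
  rc = pthread_join(t, NULL); /* the lock stays held the whole time */
  assert(rc == 0);
  rc = pthread_mutex_unlock(&lock);
  assert(rc == 0);

  /* With the lock free, trylock succeeds and we own the lock. */
  rc = pthread_mutex_trylock(&lock);
  if (rc == 0)
    pthread_mutex_unlock(&lock);
  return rc;
}
```

Calling trylock_demo() leaves EBUSY in results[0] and ETIMEDOUT in results[1], and returns 0 because the final trylock finds the lock free.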

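4. Condition Variables

Condition variables are useful when some kind of signaling must take place between threads, for example when one thread is waiting for another to do something before it can continue.

4.1. Calling the routines

A thread must hold the associated lock when it calls either condition-variable routine, pthread_cond_wait() or pthread_cond_signal().

Do not replace this with a plain flag that the waiter spins on, while (ready == 0);, and the signaler simply sets, ready = 1;.

It makes the waiter spin on the CPU, so performance gets worse, not better.

It invites race conditions and is easy to get wrong.

Also, do not test the condition with if (ready == 0) in place of while. A thread can wake up without the condition having become true, so failing to recheck it can have unpredictable results.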
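
The rules above can be put together as follows. This is a sketch under the assumption of a single waiter and a single signaler; the function names signaler and wait_until_ready are invented for illustration, and return codes of the lock and condition calls are left unchecked to keep the shape of the pattern visible.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int ready = 0; /* the condition; only touched with lock held */

/* Signaler: change the state and signal while holding the lock. */
static void *signaler(void *arg)
{
  (void)arg;
  pthread_mutex_lock(&lock);
  ready = 1;
  pthread_cond_signal(&cond);
  pthread_mutex_unlock(&lock);
  return NULL;
}

/* Waiter: starts the signaler, sleeps until ready is set, and returns
   the value of ready it observed after waking up. */
int wait_until_ready(void)
{
  pthread_t t;
  int rc = pthread_create(&t, NULL, signaler, NULL);
  assert(rc == 0);

  pthread_mutex_lock(&lock);
  while (ready == 0)                 /* while, not if: recheck after every wakeup */
    pthread_cond_wait(&cond, &lock); /* releases lock while asleep, reacquires it before returning */
  int seen = ready;
  pthread_mutex_unlock(&lock);

  rc = pthread_join(t, NULL);
  assert(rc == 0);
  return seen;
}
```

The pattern works whichever thread runs first: if the signaler finishes before the waiter takes the lock, ready is already 1 and the waiter never sleeps.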

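5. Compiling and Running

To use the thread API:

  • include the header pthread.h

  • link against the pthreads library at link time with the -lpthread flag (gcc also accepts -pthread, as in gcc -o main main.c -Wall -pthread)

Whether the program then actually works is an entirely different matter...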

5.1. ASIDE: THREAD API GUIDELINES

There are a number of small but important things to remember when you use the POSIX thread library (or really, any thread library) to build a multi-threaded program. They are:

  • Keep it simple. Above all else, any code to lock or signal between threads should be as simple as possible. Tricky thread interactions lead to bugs.

  • Minimize thread interactions. Try to keep the number of ways in which threads interact to a minimum. Each interaction should be carefully thought out and constructed with tried and true approaches (many of which we will learn about in the coming chapters).

  • Initialize locks and condition variables. Failure to do so will lead to code that sometimes works and sometimes fails in very strange ways.

  • Check your return codes. Of course, in any C and UNIX programming you do, you should be checking each and every return code, and it’s true here as well. Failure to do so will lead to bizarre and hard to understand behavior, making you likely to (a) scream, (b) pull some of your hair out, or (c) both.

  • Be careful with how you pass arguments to, and return values from, threads. In particular, any time you are passing a reference to a variable allocated on the stack, you are probably doing something wrong.

  • Each thread has its own stack. As related to the point above, please remember that each thread has its own stack. Thus, if you have a locally-allocated variable inside of some function a thread is executing, it is essentially private to that thread; no other thread can (easily) access it. To share data between threads, the values must be in the heap or otherwise some locale that is globally accessible.

  • Always use condition variables to signal between threads. While it is often tempting to use a simple flag, don’t do it.

  • Use the manual pages. On Linux, in particular, the pthread man pages are highly informative and discuss many of the nuances presented here, often in even more detail. Read them carefully!

6. Homework (Code)

The Chinese edition left this part out again...

In this section, we’ll write some simple multi-threaded programs and use a specific tool, called helgrind, to find problems in these programs. Read the README in the homework download for details on how to build the programs and run helgrind.

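6.1. You will need to look at several different C programs:

  • main-race.c: a simple race condition

  • main-deadlock.c: a simple deadlock

  • main-deadlock-global.c: a solution to the deadlock problem

  • main-signal.c: a simple child/parent signaling example

  • main-signal-cv.c: more efficient signaling via condition variables

  • common_threads.h: a header file with wrappers that make the code check errors and read more easily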

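6.2. Q1 & Q2

The code: a shared variable accessed by two threads.

For this code, the helgrind report indicates a data race.

After either removing one of the accesses to the shared variable or protecting the accesses with a lock, helgrind no longer reports an error.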
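
A sketch of the locked variant, written in the spirit of main-race.c but not copied from the homework: two threads update one shared variable, and every update goes through the same mutex. The names balance_lock, add_one, run_balance and the constant ITERATIONS are assumptions made for this example.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000 /* hypothetical number of updates per thread */

static int balance = 0;
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Every access to balance goes through the same lock. */
static void add_one(void)
{
  pthread_mutex_lock(&balance_lock);
  balance++;
  pthread_mutex_unlock(&balance_lock);
}

static void *worker(void *arg)
{
  (void)arg;
  for (int i = 0; i < ITERATIONS; i++)
    add_one();
  return NULL;
}

/* The calling thread and one worker both update balance; returns the total. */
int run_balance(void)
{
  pthread_t p;
  balance = 0; /* no other thread exists yet, so no lock is needed here */
  int rc = pthread_create(&p, NULL, worker, NULL);
  assert(rc == 0);
  for (int i = 0; i < ITERATIONS; i++)
    add_one();
  rc = pthread_join(p, NULL);
  assert(rc == 0);
  return balance; /* the worker has been joined, so this read is safe */
}
```

With the lock in place the result is always 2 * ITERATIONS; without it, the two increments can interleave and lose updates.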

6.3. Q3 & Q4 & Q5

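Helgrind identifies potential deadlocks by checking the order in which locks are acquired. It records the acquisition order of each lock and checks for violations of that order.

A typical deadlock scenario:

  • Thread A and thread B acquire two locks in opposite orders, producing a circular wait:

    • Thread A holds lock m1 and waits for lock m2.

    • Thread B holds lock m2 and waits for lock m1.

The global lock...

It does solve the deadlock problem, but helgrind still reports an error!

The global lock g does ensure that a thread must acquire g before it acquires any other lock, but it does not force the threads to acquire those other locks in a consistent order. The threads still take m1 and m2 in opposite orders while holding g, and helgrind, which checks acquisition order, flags exactly that.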
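
One way to remove the order violation itself, instead of covering it with a global lock, is to make every thread take the two locks in the same order. The sketch below is not the homework's solution: it orders the locks by address, and the names lock_both, unlock_both, pair_t, run_ordered and ROUNDS are invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define ROUNDS 10000 /* hypothetical number of iterations per thread */

static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;
static int shared = 0; /* protected by holding both m1 and m2 */

/* Locks two distinct mutexes, lower address first, whatever the argument order. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
  if ((uintptr_t)a > (uintptr_t)b)
  {
    pthread_mutex_t *tmp = a;
    a = b;
    b = tmp;
  }
  pthread_mutex_lock(a);
  pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
  pthread_mutex_unlock(a); /* release order does not matter */
  pthread_mutex_unlock(b);
}

typedef struct pair_t
{
  pthread_mutex_t *first;
  pthread_mutex_t *second;
} pair_t;

static void *worker(void *arg)
{
  pair_t *p = (pair_t *)arg;
  for (int i = 0; i < ROUNDS; i++)
  {
    lock_both(p->first, p->second);
    shared++;
    unlock_both(p->first, p->second);
  }
  return NULL;
}

/* Two threads request the locks in opposite orders; returns the final count. */
int run_ordered(void)
{
  pair_t ab = {&m1, &m2};
  pair_t ba = {&m2, &m1};
  pthread_t t1, t2;
  shared = 0;
  int rc = pthread_create(&t1, NULL, worker, &ab);
  assert(rc == 0);
  rc = pthread_create(&t2, NULL, worker, &ba);
  assert(rc == 0);
  rc = pthread_join(t1, NULL);
  assert(rc == 0);
  rc = pthread_join(t2, NULL);
  assert(rc == 0);
  return shared;
}
```

Because lock_both() normalizes the order, the two threads can never each hold one lock while waiting for the other, so no circular wait can form.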

6.4. Q6 & Q7 & Q8

The report:

With a condition variable added:
