Chapter 26 Concurrency: an introduction

第 26 章并发：介绍

本章介绍单个运行进程提供的新抽象：线程（thread）

线程 && 进程

栈的区别...

简单传统的进程地址空间模型：单线程 - single-threaded...
- 只有一个栈
- 通常位于地址空间底部
多线程的进程中，每个线程独立运行
- 每个线程一个栈，分布在进程地址空间的各个部分...
- 线程的变量，函数参数，返回值存储于此
- thread - local

但又非常类似...

The state of a single thread is thus very similar to that of a process. It has a program counter (PC) that tracks where the program is fetching instructions from. Each thread has its own private set of registers it uses for computation; thus, if there are two threads that are running on a single processor, when switching from running one (T1) to running the other (T2), a context switch must take place. The context switch between threads is quite similar to the context switch between processes, as the register state of T1 must be saved and the register state of T2 restored before running T2. With processes, we saved state to a process control block (PCB); now, we’ll need one or more thread control blocks (TCBs) to store the state of each thread of a process. There is one major difference, though, in the context switch we perform between threads as compared to processes: the address space remains the same (i.e., there is no need to switch which page table we are using).

1. 实例：线程创造

1.1. Code

#include <stdio.h>
#include <assert.h>
#include <pthread.h>

void *mythread(void *arg)
{
  printf("%s\n", (char *)arg);
  return NULL;
}

int main(int argc, char *argv[])
{
  pthread_t p1, p2;
  int rc;
  printf("main: begin\n");
  rc = pthread_create(&p1, NULL, mythread, "A");
  assert(rc == 0);
  rc = pthread_create(&p2, NULL, mythread, "B");
  assert(rc == 0);

  // join waits for the threads to finish
  rc = pthread_join(p1, NULL);
  assert(rc == 0);
  rc = pthread_join(p2, NULL);
  assert(rc == 0);
  printf("main: end\n");
  return 0;
}

What runs next is determined by the OS scheduler, and although the scheduler likely implements some sensible algorithm, it is hard to know what will run at any given moment in time.

1.2. Result

并非先创建的先运行，也并非先调用的先运行...

都有可能发生，取决于操作系统内部的调度算法...

root@LAPTOP-GT06V0GS:/mnt/d/CSLab/osTEP/chapter26# ./t0
main: begin
A
B
main: end



root@LAPTOP-GT06V0GS:/mnt/d/CSLab/osTEP/chapter26# ./t0
main: begin
B
A
main: end

Computers are hard enough to understand without concurrency. Unfortunately, with concurrency, it simply gets worse. Much worse.

2. 为什么更糟糕：共享数据

Best Practice

#include "common.h":

包含着 GetTime() 方法和 Spin() 方法
times 和 stat 系统调用和 assert 方法

#include "common_threads.h"：

封装了常用的 thread 函数...
为一般的 thread 函数添加了 assert 判断

2.1. Code

#include <stdio.h>
#include <pthread.h>
#include "common.h"
#include "common_threads.h"

static volatile int counter = 0;

// mythread()
//
// Simply adds 1 to counter repeatedly, in a loop
// No, this is not how you would add 10,000,000 to
// a counter, but it shows the problem nicely.

void *mythread(void *arg)
{
  printf("%s: begin\n", (char *)arg);
  int i;
  for (i = 0; i < 1000000; i++)
  {
    counter++;
  }
  printf("%s: done\n", (char *)arg);
  return NULL;
}

// main()
//
// Just launches two threads (pthread_create)
// and then waits for them (pthread_join)

int main(int argc, char *argv[])
{
  pthread_t p1, p2;
  printf("main: begin (counter=%d)\n", counter);
  Pthread_create(&p1, NULL, mythread, "A");
  Pthread_create(&p2, NULL, mythread, "B");

  // join() waits for the threads to finish
  Pthread_join(p1, NULL);
  Pthread_join(p2, NULL);

  printf("main: done (counter=%d)\n", counter);
  return 0;
}

2.2. Result

root@LAPTOP-GT06V0GS:/mnt/d/CSLab/osTEP/chapter26# make t1
gcc -Wall -g -I../include t1.c -o t1
root@LAPTOP-GT06V0GS:/mnt/d/CSLab/osTEP/chapter26# ./t1
main: begin (counter=0)
A: begin
B: begin
B: done
A: done
main: done (counter=1074928)
root@LAPTOP-GT06V0GS:/mnt/d/CSLab/osTEP/chapter26# ./t1
main: begin (counter=0)
A: begin
B: begin
A: done
B: done
main: done (counter=1072662)

Not only is each run wrong, but also yields a different result! A big question remains: why does this happen?

the counter expect to be 1e7 + 1e7
and the sequence should be A --- then --- B

3. 核心问题：不可控的调度

What we have demonstrated here is called a race condition (or, more specifically, a data race): the results depend on the timing of the code’s execution. With some bad luck (i.e., context switches that occur at untimely points in the execution), we get the wrong result.

In fact, we may get a different result each time; thus, instead of a nice deterministic computation (which we are used to from computers), we call this result indeterminate, where it is not known what the output will be and it is indeed likely to be different across runs.

Because multiple threads executing this code can result in a race condition, we call this code a critical section. A critical section is a piece of code that accesses a shared variable (or more generally, a shared resource) and must not be concurrently executed by more than one thread.

What we really want for this code is what we call mutual exclusion. This property guarantees that if one thread is executing within the critical section, the others will be prevented from doing so.

4. 原子性愿望

全部或没有...

不会在执行之中被中断

4.1. TIP Atomic operations

Atomic operations are one of the most powerful underlying techniques in building computer systems, from the computer architecture, to concurrent code (what we are studying here), to file systems (which we’ll study soon enough), database management systems, and even distributed systems [L+93].

The idea behind making a series of actions atomic is simply expressed with the phrase “all or nothing”; it should either appear as if all of the actions you wish to group together occurred, or that none of them occurred, with no in-between state visible. Sometimes, the grouping of many actions into a single atomic action is called a transaction, an idea developed in great detail in the world of databases and transaction processing [GR92]. In our theme of exploring concurrency, we’ll be using synchronization primitives to turn short sequences of instructions into atomic blocks of execution, but the idea of atomicity is much bigger than that, as we will see. For example, file systems use techniques such as journaling or copyon-write in order to atomically transition their on-disk state, critical for operating correctly in the face of system failures. If that doesn’t make sense, don’t worry — it will, in some future chapter.

4.2. Synchronization Primitives 同步原语

Thus, what we will instead do is ask the hardware for a few useful instructions upon which we can build a general set of what we call synchronization primitives. By using this hardware support, in combination with some help from the operating system, we will be able to build multi-threaded code that accesses critical sections in a synchronized and controlled manner, and thus reliably produces the correct result despite the challenging nature of concurrent execution. Pretty awesome, right?

4.3. KEY CONCURRENCY TERMS

A critical section 临界区 is a piece of code that accesses a shared resource, usually a variable or data structure.
A race condition 竞态条件 (or data race [NM92]) arises if multiple threads of execution enter the critical section at roughly the same time; both attempt to update the shared data structure, leading to a surprising (and perhaps undesirable) outcome.
An indeterminate 不确定的 program consists of one or more race conditions; the output of the program varies from run to run, depending on which threads ran when. The outcome is thus not deterministic, something we usually expect from computer systems.
To avoid these problems, threads should use some kind of mutual exclusion 互斥 primitives; doing so guarantees that only a single thread ever enters a critical section, thus avoiding races, and resulting in deterministic program outputs.

5. 还有一个问题：等待另一个线程

线程之间还有别的交互，除了访问共享变量...

例如：disk IO 操作...

6. 作业（模拟作业）

欢迎使用这个模拟器！它的目的是通过观察线程如何交错来熟悉它们；模拟器x86.py将帮助你获得这种理解。

模拟器模拟了多线程执行短汇编代码的过程。需要注意的是，实际运行的操作系统代码（例如，执行上下文切换）并未显示；因此，您看到的只是用户代码的交错。

运行的汇编代码基于 x86，但有所简化。该指令集中包含四个通用寄存器（%ax、%bx、%cx、%dx）、一个程序计数器（PC）以及一小组指令，这些指令足以满足我们的目的。

6.1. 汇编代码学习

0 基础汇编速成了也是...之前玩过人力资源管理...然而只是一个玩具游戏...

   dx   >= >  <= <  != ==        Thread 0
    3   0  0  0  0  0  0
    2   0  0  0  0  0  0  1000 sub  $1,%dx
    2   1  1  0  0  1  0  1001 test $0,%dx	// 更新条件码
    2   1  1  0  0  1  0  1002 jgte .top
    1   1  1  0  0  1  0  1000 sub  $1,%dx
    1   1  1  0  0  1  0  1001 test $0,%dx	// 更新条件码
    1   1  1  0  0  1  0  1002 jgte .top
    0   1  1  0  0  1  0  1000 sub  $1,%dx
    0   1  0  1  0  0  1  1001 test $0,%dx	// 更新条件码
    0   1  0  1  0  0  1  1002 jgte .top
   -1   1  0  1  0  0  1  1000 sub  $1,%dx
   -1   0  0  1  1  1  0  1001 test $0,%dx	// 更新条件码
   -1   0  0  1  1  1  0  1002 jgte .top
   -1   0  0  1  1  1  0  1003 halt

sub $1, %dx：将 %dx 的值减1。
test $0, %dx：比较 %dx 和0，设置条件码（ >=, >, <=, <, !=, == ）。
jgte .top：如果 %dx 大于等于0，跳回 .top 继续循环。

模拟的全套资料

mov immediate, register     # moves immediate value to register
mov memory, register        # loads from memory into register
mov register, register      # moves value from one register to other
mov register, memory        # stores register contents in memory
mov immediate, memory       # stores immediate value in memory

add immediate, register     # register  = register  + immediate
add register1, register2    # register2 = register2 + register1
sub immediate, register     # register  = register  - immediate
sub register1, register2    # register2 = register2 - register1

test immediate, register    # compare immediate and register (set condition codes)
test register, immediate    # same but register and immediate
test register, register     # same but register and register

jne                         # jump if test'd values are not equal
je                          #                       ... equal
jlt                         #     ... second is less than first
jlte                        #               ... less than or equal
jgt                         #            ... is greater than
jgte                        #               ... greater than or equal

xchg register, memory       # atomic exchange: 
                            #   put value of register into memory
                            #   return old contents of memory into reg
                            # do both things atomically

nop                         # no op

note：

“immediate” 的形式是 $number
“memory” 的形式为“number”或“(reg)”或“number(reg)”或“number(reg,reg)”（如上所述）
'register' 是 %ax、%bx、%cx、%dx 之一

PreviousPart 2 Concurrency NextChapter 27 Interlude: The Threading API

Last updated 2 months ago