Basics

  • Use std::thread::spawn to create and start a new thread.
  • handle.join() blocks until that thread completes and collects its result.
    • Its return type is Result<_, _>: Err(...) if the thread panicked, Ok(...) otherwise.
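As a quick sketch (a hypothetical main, not from the original), joining a panicking thread yields Err rather than crashing the caller:

```rust
use std::thread;

fn main() {
    // A thread that finishes normally: join() returns Ok(value).
    let ok = thread::spawn(|| 1 + 1).join();
    assert_eq!(ok.unwrap(), 2);

    // A thread that panics: join() returns Err(...) instead of
    // propagating the panic to the caller.
    let err = thread::spawn(|| -> i32 { panic!("boom") }).join();
    assert!(err.is_err());
}
```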

Move Closures

Consider the following example:

Why move for i32, u64?
use std::thread;
use std::time::Duration;

#[allow(unused_variables)]
pub fn named_sleeper(value: i32, ms: u64) -> i32 {
    let builder = thread::Builder::new().name("sleeper".into());
    let handle = builder
        .spawn(move || {
            thread::sleep(Duration::from_millis(ms));
            value
        })
        .unwrap();
    handle.join().unwrap()
}

If you try to remove the move before the closure, you’ll get an error. The reason is that, by default, closures borrow the variables they capture, even when their types implement the Copy trait. So inside the closure, the captures are actually &value and &ms.

Now let’s look at the signature of thread::spawn():

Signature of spawn()
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,

The closure must satisfy the 'static bound, which means its captures cannot be references to data on the caller’s stack. Without the move keyword, however, &value and &ms would point at data on the call stack, violating the signature. This bound prevents the case where the function returns before the thread has completed, which would leave the running thread holding now-dangling references to data that lived on the call stack.
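Note that for Copy types like i32 and u64, move copies the values into the closure rather than invalidating them, so the caller can keep using the originals. A minimal sketch (names are illustrative):

```rust
use std::thread;

fn main() {
    let value: i32 = 7;
    // `move` copies `value` (i32 is Copy) into the closure, so the closure
    // owns its own copy and satisfies the 'static bound on spawn().
    let handle = thread::spawn(move || value * 2);

    // The original is still usable: it was copied into the closure, not moved out.
    assert_eq!(value, 7);
    assert_eq!(handle.join().unwrap(), 14);
}
```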

Some Advanced Topics

Named Threads

Using thread::Builder, you can create named threads, which makes debugging easier; the builder also lets you configure the stack size.

thread::Builder
use std::thread;

fn named_thread_example() {
    let builder = thread::Builder::new()
        .name("my-worker".into())
        .stack_size(32 * 1024); // 32 KiB

    let handle = builder.spawn(|| {
        println!("Hello from thread: {:?}", thread::current().name());
        42
    }).unwrap();

    let result = handle.join().unwrap();
    println!("Thread returned: {}", result);
}

Scoped Threads

Scoped threads allow borrowing stack data without moving ownership. The threads are guaranteed to complete before the scope ends, so the references inside the threads are valid.

First, use thread::scope(|s| { ... }) to create a scope s. In the closure, use s.spawn() to create scoped threads.

thread::scope()
use std::thread;

fn scoped_thread_example() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5, 6];

    let (sum_a, sum_b) = thread::scope(|s| {
        let h1 = s.spawn(|| a.iter().sum::<i32>());
        let h2 = s.spawn(|| b.iter().sum::<i32>());
        (h1.join().unwrap(), h2.join().unwrap())
    });

    // `a` and `b` are still accessible here.
    println!("sum_a = {}, sum_b = {}", sum_a, sum_b);
}

Thread Local Variables

Each thread gets its own independent copy of a thread_local! variable. By wrapping the value in a RefCell<T>, you can mutate it through .borrow_mut().

thread_local! macro
use std::cell::RefCell;

thread_local! {
    static THREAD_COUNT: RefCell<usize> = RefCell::new(0);
}

/// Use thread-local storage to count how many times each thread calls
/// `increment_thread_local`.
///
/// `THREAD_COUNT` is a `thread_local!` static of type `RefCell<usize>` initialized to 0.
/// Each call increases the thread-local count by 1 and returns the new value.
///
/// Hint: Use `THREAD_COUNT.with(|cell| { ... })` to access the thread-local variable.
pub fn increment_thread_local() -> usize {
    THREAD_COUNT.with(|c| {
        let mut count = c.borrow_mut();
        *count += 1;
        *count
    })
}

Arc<Mutex<T>>

By using Arc<Mutex<T>>, data can be shared and mutated across threads. Usually, we clone the Arc into a loop-local variable, move the clone into the thread, and push each handle into a vector for a later batch join().

Inside the closure, use .lock().unwrap() to acquire the lock and access the underlying data.

use std::sync::{Arc, Mutex};
use std::thread;

pub fn concurrent_collect(n_threads: usize) -> Vec<usize> {
    let answer: Arc<Mutex<Vec<usize>>> = Arc::new(Mutex::new(Vec::new()));
    let mut handles = vec![];

    for i in 0..n_threads {
        let copy = Arc::clone(&answer);

        handles.push(thread::spawn(move || {
            let mut vec = copy.lock().unwrap();
            vec.push(i);
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    let mut data = answer.lock().unwrap();
    data.sort();

    data.clone()
}

Multi Producer, Single Consumer Channels

Use std::sync::mpsc::channel() to create a channel, which returns (sender, receiver). The sender can be clone()d so multiple threads can send.

In addition, recv() blocks until a message arrives; it returns Err only once the channel is empty and all senders have been dropped.

mpsc::channel()

use std::sync::mpsc;
use std::thread;

/// Create a producer thread that sends each element from `items` into the channel.
/// The main thread receives all messages and returns them.
pub fn simple_send_recv(items: Vec<String>) -> Vec<String> {
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        for s in items {
            sender.send(s).unwrap();
        }
        // `sender` is dropped here, closing the channel.
    });
    let mut ans = vec![];
    while let Ok(s) = receiver.recv() {
        ans.push(s);
    }
    ans
}

/// Create `n_producers` producer threads, each sending a message in format `"msg from {id}"`.
/// Collect all messages, sort them lexicographically, and return.
///
/// Hint: Use `tx.clone()` to create multiple senders. Note that the original tx must also be dropped.
pub fn multi_producer(n_producers: usize) -> Vec<String> {
    let (sender, receiver) = mpsc::channel();
    let mut handles = vec![];
    let mut ans = vec![];
    for s in 0..n_producers {
        let sdr = sender.clone();
        handles.push(thread::spawn(move || {
            sdr.send(format!("msg from {s}")).unwrap();
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    // The original sender must be dropped too, or recv() would block forever.
    drop(sender);
    while let Ok(s) = receiver.recv() {
        ans.push(s);
    }
    ans.sort();
    ans
}