Class Introduction

Welcome to the KLA Custom Rust Training.

About the Trainer

Herbert has been developing software professionally for more than 20 years. Starting with BASIC, Pascal, and then moving onto C and C++, Herbert has developed custom applications ranging from web-server filters to games. Herbert is the author of Hands-on Rust and Rust Brain Teasers.

Book               | Publisher E-Book | Amazon
Hands-on Rust      | PragProg Page    | Amazon Page
Rust Brain Teasers | PragProg Page    | Amazon Page

Resources

I recommend bookmarking the following resources:

Format

  • We'll take a short break roughly every hour.
  • If you have questions, please let me know.
  • This is your training. Please feel free to talk to me at the breaks to see if we can cover material you need.

The Rust Ecosystem

In this section, we're going to walk through a lot of the Rust ecosystem to help you get a feel for how Rust projects are structured, the tooling that is available, and key differences from C++.

In this section, we'll be covering:

  • A quick introduction to the Rust toolchain and its swiss-army knife, Cargo.
  • Some basic Rust syntax and how it differs from C++.
  • Rust's safety guarantees and what they mean, including:
    • The borrow checker
    • Lifetimes
    • Reference counting
    • Data-race protection
    • Opt-in vs opt-out safety
  • Program layout with workspaces, crates, programs and libraries
  • Unit testing
  • Dependency Management
  • Benchmarking

Rust and C++ Tooling Equivalencies

This is a cheat sheet for you to refer to later.

Using Cargo

The cargo command is a swiss-army knife that handles building projects, testing them, controlling dependencies and more. It is extensible: you can add more features to it, and use it to install programs.

Cargo Command | C++ Equivalent      | Purpose
Package Commands
cargo init    |                     | Creates a new project in the current directory.
Compilation
cargo build   | make                | Builds your project, placing the output in the target directory.
cargo run     | make ; ./my_program | Runs cargo build, and then runs the resulting executable.
cargo check   |                     | Builds only the source, skipping assembly and linking, for a quick check of syntax.
cargo clean   | make clean          | Removes all build artifacts and empties the target directory.
cargo rustc   |                     | Passes extra rustc options to the build process.
Formatting
cargo fmt     |                     | Formats your source code according to the Rust defaults.
Testing
cargo test    | make test           | Executes all unit tests in the current project.
cargo bench   |                     | Executes all benchmarks in the current project.
Linting
cargo clippy  |                     | Runs the Clippy linter.
cargo fix     |                     | Applies all Clippy suggestions.
Documentation
cargo doc     |                     | Builds a documentation website from the current project's source code.
cargo rustdoc |                     | Runs the documentation builder with extra command options.
Dependencies
cargo fetch   |                     | Downloads all dependencies listed in Cargo.toml from the Internet.
cargo add     |                     | Adds a dependency to the current project's Cargo.toml.
cargo remove  |                     | Removes a dependency from the current project's Cargo.toml file.
cargo update  |                     | Updates dependencies to the latest versions permitted by Cargo.toml.
cargo tree    |                     | Draws a tree displaying all dependencies, and each dependency's dependencies.
cargo vendor  |                     | Downloads all dependencies, and provides instructions to modify your Cargo.toml to use the downloaded (vendored) copies.

Rust Syntax 101

We'll take a quick and fast dive through some of Rust's syntax differences from C++. Most of the building blocks are the same, but the conventions are different.

We'll be covering:

  • Hello, World!
    • This is mostly an exercise to make sure that your Rust environments are setup correctly.
    • We'll compare C++ and Rust versions of the same program.
  • Functions
  • Variables
  • Scopes
  • Type Conversion
  • Expressions and Return Values
  • Overflow and Wrapping
  • Structures
  • Associated Functions
  • Methods
  • Arrays
  • Vectors

A Quick Hello World

Let's do a quick exercise. This is very simple, and you've probably already done this---we'll make Hello World and take a quick look at it. This will ensure that you have a working Rust installation. Then we'll compare it to a C++20 equivalent.

Source code for this section is in projects/part2/hello_world.

Step 1: Select a Parent Directory

Create a directory on your computer in which you will be placing projects we work on. We'll be placing projects underneath this directory. For example:

cd /home/herbert/rust
mkdir kla_dec_2023_live
cd kla_dec_2023_live

Step 2: Invoke cargo to Create a New Project

In your project directory, type:

cargo new hello_world

Step 3: Run Your Program!

Cargo creates a "Hello, World!" program by default when you create a new project. It's already written for you! Invoke it by typing:

cargo run

You should see the following output:

   Compiling hello_world v0.1.0 (/home/herbert/Documents/Ardan/KLA Training 3 Day Milipitas/projects/hello_world)
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `/home/herbert/Documents/Ardan/KLA Training 3 Day Milipitas/target/debug/hello_world`
Hello, world!

If This Didn't Work!

If this didn't work, your Rust tooling isn't working. Some troubleshooting steps:

  1. Make sure that you installed Rust from rustup.rs and either followed the "source" instruction or opened a new terminal session.
  2. Make sure that typing rustc --version shows you a version number.
  3. If you received the message linker not found, you need to install build-essential (on Ubuntu/Debian type distributions) or the equivalent for your Linux distribution. If you are using macOS, you need the Xcode command-line build tools. On Windows, Rustup will show you a link to the appropriate runtime.

What Did Cargo Do?

Cargo has created several files in your project folder:

Filename                | Description
hello_world/Cargo.toml  | The build manifest. Equivalent to a CMake file or Makefile.
hello_world/src/        | The source directory.
hello_world/src/main.rs | The main source code file. Every executable needs a main.rs file (libraries have a lib.rs). You can override this, but it's a good default.

Cargo.toml

Rust has created a Cargo.toml file for you:

[package]
name = "hello_world"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
  • The [package] section defines the program itself.
    • The name will be the name of the emitted executable (with .exe on the end for Windows).
    • version uses semantic versioning. Major version "0" is special: every minor bump is treated as a potentially breaking release. Once you hit 1.0, the dependency checker is stricter about what counts as a compatible version. We'll talk about that later.
    • edition tells the Rust compiler which edition of the language you are using. Rust promises not to break language compatibility except when the edition number increases (roughly every three years). rustc retains compatibility with previous editions, unless something has to be removed for a serious security reason. This is designed to avoid the C++ difficulty of "we can never take away features" and "we can never change the interface".
  • The [dependencies] section lists the packages to download. We'll worry about that later.

main.rs

The main.rs file is a basic "Hello, world!" program:

fn main() {
    println!("Hello, world!");
}

If you've never seen Rust before, it might be a little confusing.

  • fn is "function". Unlike C++, it doesn't specify the return type---just that it is a function.
  • main is the name of the function. main is special, just like C++ --- it's the default invocation point for an executable program.
  • println! has an exclamation mark, indicating that it's a macro. Formatting strings is a pretty big job---see the C++20 format system! Rust's formatting system uses the macro system to allow for extreme flexibility in its parameters. It's very powerful, but it's also a poor example for the first thing you see, because macros are not an introductory topic.
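
A couple of format-string examples hint at that flexibility (the values are arbitrary):

fn main() {
    let pi = 3.14159;
    println!("{pi:.2}");    // two decimal places: "3.14"
    println!("{pi:>10.3}"); // right-aligned in a 10-character field: "     3.142"
    println!("{:05}", 42);  // zero-padded: "00042"
}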

Equivalent C++

C++ source code for this project is in cpp/hello_world.

A simple C++ equivalent program is as follows:

#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

Not everyone likes iostream. If you prefer printf or any of the other output systems, that's cool too.

It is accompanied by a CMakeLists.txt file:

cmake_minimum_required(VERSION 3.5)
project(HelloWorld)

add_executable(hello hello.cpp)

And you can build the project and execute it with the following:

# First time only
mkdir build
cd build
cmake ..

# And then
cd build
make
./hello

This will give you the expected output: Hello, World!.

Comparing Cargo.toml and CMake

The Cargo.toml and CMakeLists.txt files are similar: you specify the project details, and CMake builds the builder for you. Rust is doing a few more things:

  • Your executable is statically linked, and includes the Rust standard library (that's why the executable is so much larger).
  • Cargo includes dependency management, so a fair comparison would be a CMake setup that also pulls in vcpkg, Conan, or another package manager.
  • Cargo doesn't offer to create Makefiles, Ninja build systems, etc. --- it's an all-in-one tool.

So in reality, your CMakeLists.txt file would be a bit bigger:

cmake_minimum_required(VERSION 3.5)
project(HelloWorld)

# Example vcpkg (commented out because I don't have it installed)
#set(CMAKE_TOOLCHAIN_FILE ~/vcpkg/scripts/buildsystems/vcpkg.cmake CACHE FILEPATH "Path to toolchain")

# Example static linking
#set(CMAKE_EXE_LINKER_FLAGS "-static-libgcc -static-libstdc++ -static")

add_executable(hello hello.cpp)

Comparing main.rs with hello.cpp

The files are quite similar.

  1. C++ brings in iostream with #include <iostream>. You don't need to do that for println!---Rust makes it available by default.
  2. Both define a main function. fn main() and int main() are almost equivalent---but the Rust version doesn't return anything.
  3. println! and std::cout << "Hello, World!" << std::endl; are equivalent. println! adds a \n to the end for you; because standard output is line-buffered, the newline also flushes the output. If you don't want to emit a \n, you can use print!.
  4. return 0 returns an exit code of 0. Rust programs do that for you by default.

So despite the syntax being different, it's not all that different.

If you really want to be returning an exit code, you can use the following:

use std::process::ExitCode;

fn main() -> ExitCode {
    println!("Hello, world!");
    return ExitCode::from(0);
}

And that concludes our quick "Hello World" tour. We've covered:

  • How to create a Rust program with Cargo.
  • What the resulting files and structure cover.
  • An equivalent C++ and CMake setup.
  • Exit codes.

Functions, Variables and Scopes

Let's keep hacking at our Hello World project, and play around a bit to get used to Rust's handling of language concepts.

Let's take a look at some of the syntax differences you'll encounter.

Functions and Variables

Rust and C++ both use lots of functions! They work in a very similar fashion, but the syntax is quite different.

Here's a really simple Rust function example:

The source code for this is in projects/part2/double_fn.

fn double_it(n: i32) -> i32 {
    n * 2
}

fn main() {
    let i = 5;
    let j = double_it(i);
    println!("{i} * 2 = {j}");
}

Let's go through line-by-line quickly:

  • fn double_it declares a function named double_it
    • (n: i32) declares a function argument/parameter named n, of the type i32. Rust uses explicit sizing for variable types except for usize and isize---which work like size_t and ptrdiff_t in C++ (the size of a pointer on your platform)
    • -> i32 indicates that the function returns an i32.
  • n * 2 returns n, multiplied by 2. Note that there's no ;. If you don't specify a semicolon, you are returning the result of an expression.
  • fn main is the main function.
  • let i = 5; creates a new variable binding named i and assigns the value 5 to it. You don't need to specify a type, because the Rust compiler can infer it from the fact that you used i with a function call. (It will default to i32 if there are no clues).
  • let j = double_it(i) calls double_it with the value of i (we'll talk about copy, reference and move later), and assigns the result to j.
  • println!("{i} is using the Rust format macro to add the value of i into the output string. You can use named variables but not expressions inside the format string. If you prefer, println!("{} * 2 = {}", i, j) is also valid. You can replace the j with a direct call to double_it if you prefer.

Here's some equivalent C++, using modern C++ equivalents:

The source code is in cpp/double_fn

#include <iostream>

int double_it(int x) {
    return x * 2;
}

int main() {
    auto i = 5;
    auto j = double_it(i);
    std::cout << i << " * 2 = " << j << std::endl;
    return 0;
}

It's very similar. Most of the concepts are the same. Variable declarations are the other way around (name first, then type), and C++ puts the return type at the front of a function declaration where Rust puts it at the end. Overall, though---you shouldn't have too much trouble getting used to the new way of arranging things.

Primitive Types

Rust is a lot more strict than C++ defaults about coercing types. Take the following C++ (it's in cpp/type_changes):

#include <iostream>

int double_it(long n) {
    return n * 2;
}

int main() {
    int i = 5;
    int j = double_it(i);
    std::cout << "i = " << i << ", j = " << j << std::endl;
    return 0;
}

The project compiles without warnings or errors, and outputs i = 5, j = 10 as you'd expect.

Let's do a line-by-line conversion to Rust:

fn double_it(n: i64) -> i32 {
    n * 2
}

fn main() {
    let i: i32 = 5;
    let j: i32 = double_it(i);
    println!("i = {i}, j = {j}");
}

The Rust project fails to compile. The error message is:

error[E0308]: mismatched types
 --> src/main.rs:2:5
  |
1 | fn double_it(n: i64) -> i32 {
  |                         --- expected `i32` because of return type
2 |     n * 2
  |     ^^^^^ expected `i32`, found `i64`
  |
help: you can convert an `i64` to an `i32` and panic if the converted value doesn't fit
  |
2 |     (n * 2).try_into().unwrap()
  |     +     +++++++++++++++++++++

error[E0308]: mismatched types
 --> src/main.rs:7:28
  |
7 |     let j: i32 = double_it(i);
  |                  --------- ^ expected `i64`, found `i32`
  |                  |
  |                  arguments to this function are incorrect
  |
note: function defined here
 --> src/main.rs:1:4
  |
1 | fn double_it(n: i64) -> i32 {
  |    ^^^^^^^^^ ------
help: you can convert an `i32` to an `i64`
  |
7 |     let j: i32 = double_it(i.into());
  |                             +++++++

The error message helpfully tells you how to fix the program, but the key here is that i32 and i64 are not the same type, so you can't pass one as the other.

Converting Types

If you see a lot of these error messages, it's a code smell---that is, a sign the code may not be such a great idea! Try to settle on types that are appropriate for what you are doing.

You actually have a few options for type conversion.

Converting with as

The first is as:

fn double_it(n: i64) -> i32 {
    n as i32 * 2
}

fn main() {
    let i: i32 = 5;
    let j: i32 = double_it(i as i64);
    println!("i = {i}, j = {j}");
}

as works, but it is the least safe option. as does a direct conversion, ignoring any overflow, data-loss, or precision loss. It's always safe to go from i32 to i64---you can't lose any data. Going from i64 to i32 may not be what you intended:

fn main() {
    let i: i64 = 2_147_483_648; // One more than i32 can hold
    let j = i as i32;
    println!("{j}");
}

You probably guessed that the result is -2147483648...

Takeaway: you can use as for conversions you know are safe, but it's not always the best idea.

Using into

The compiler error messages suggest using into. into is only provided for conversions where the type-conversion is safe and won't lose your data. We could use it like this:

fn double_it(n: i64) -> i32 {
    n as i32 * 2
}

fn main() {
    let i: i32 = 5;
    let j: i32 = double_it(i.into());
    println!("i = {i}, j = {j}");
}

This works, but we're still using n as i32. Why? An i64 to i32 conversion can lose data---so Rust doesn't implement into() for it. Still, we're halfway there.

Using try_into

For fallible conversions, Rust provides the try_into operation:

use std::convert::TryInto;

fn double_it(n: i64) -> i32 {
    let n: i32 = n.try_into().expect("could not be converted safely into an i32");
    n * 2
}

fn main() {
    let i: i32 = 5;
    let j: i32 = double_it(i.into());
    println!("i = {i}, j = {j}");
}

try_into returns a Result type. We're going to go into those in detail later. For now, just think of it as equivalent to std::expected---it's either the expected result, or an error. You can use unwrap() to crash immediately on an error, or expect to crash with a nicer error message. There are lots of good ways to handle errors, too---but we'll get to that later.
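
If you'd rather not panic at all, here's a minimal sketch of handling a failed conversion yourself---falling back to 0 is an arbitrary choice for illustration:

use std::convert::TryInto;

fn double_it(n: i64) -> i32 {
    // Handle the failure case instead of panicking.
    let small: i32 = match n.try_into() {
        Ok(value) => value,
        Err(_) => {
            eprintln!("{n} does not fit in an i32; using 0 instead");
            0
        }
    };
    small * 2
}

fn main() {
    println!("{}", double_it(5));              // 10
    println!("{}", double_it(10_000_000_000)); // falls back to 0
}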

Yes, that's a lot more pedantic. On the other hand, Rust makes you jump through a few hoops before you are confused by:

std::cout << double_it(2147483648) << std::endl;

It outputs 0

Expressions, Scopes and Return Values

Rust and C++ are both scope-heavy languages. Rust borrows its scope concept from OCaml, and this tends to make idiomatic Rust code a little different. The final expression in a scope, written without a semicolon, is an implicit return value. These are the same:

fn func1(n: i32) -> i32 { n }
fn func2(n: i32) -> i32 {
    return n;
}

fn main() {
    println!("{}, {}", func1(5), func2(5));
}

You can return values out of scopes, too---but not with the return keyword, because return is set up to return from the enclosing function. This works:

fn main() {
    let i = {
        5
    };

    println!("{i}");
}

This doesn't:

fn main() {
    let i = {
        return 5;
    };

    println!("{i}");
}

Even functions that don't return anything actually return the unit type (expressed as ()):

fn main() {
    let i = println!("Hello");
    println!("{i:?}");
}

The :? in the format specifier means "debug format". Any type that implements a trait named Debug---we'll cover those later---can be debug-formatted in this fashion. You can also use :#? to pretty-print.

The result of allowing scopes and expressions to return values is that you can have conditional assignment (there's no ternary operator in Rust):

fn main() {
    const N: i32 = 6;
    let i = if N == 6 {
        5
    } else {
        7
    };
    println!("{i}");
}

Overflow and Wrapping

In C++, unsigned integer overflow wraps, while signed overflow is undefined behavior (which allows for some great compiler optimizations, but is also confusing). Rust considers wrapping and overflow to be well-defined---but it's behavior you have to ask for explicitly.

You're probably used to this behavior from C++ (it's in the cpp/byte_overflow directory):

#include <iostream>
#include <cstdint>

int main() {
    uint8_t j = 0;
    for (int i = 0; i < 512; i++) {
        j++;
        std::cout << i << " : " << unsigned(j) << std::endl;
    }
    return 0;
}

This outputs 0 to 255 twice.

In Rust, the same program:

fn main() {
    let mut j: u8 = 0;
    for i in 0..512 {
        j += 1;
        println!("{i} : {j}");
    }
}

Running the program panics---it crashes with the error message "attempt to add with overflow".

Note that running in release mode (cargo run --release) skips this run-time check for performance. It's a great idea to run in debug mode sometimes.

Opting in to Wrapping

If your algorithm expects wrapping behavior, the easiest option is to use the wrapping_add function. That makes it clear that you expect wrapping, and acts appropriately:

fn main() {
    let mut j: u8 = 0;
    for i in 0..512 {
        j = j.wrapping_add(1);
        println!("{i} : {j}");
    }
}

If you'd just like to detect that wrapping would have occurred, you can use:

fn main() {
    let mut j: u8 = 0;
    for i in 0..512 {
        j = j.checked_add(1).unwrap(); // Returns `None` or `Some(result)`
        println!("{i} : {j}");
    }
}

This program will crash even in release mode, because we've used unwrap---which deliberately panics when checked_add returns None. You could instead detect the problem, choose how to handle it, and not crash.
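
For example, here's a hedged sketch of handling the None case without panicking---resetting the counter to zero is an arbitrary choice for illustration:

fn main() {
    let mut j: u8 = 0;
    for i in 0..512 {
        j = match j.checked_add(1) {
            Some(value) => value,
            None => {
                println!("{i} : overflow detected, resetting to 0");
                0
            }
        };
    }
    println!("final value: {j}");
}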

You can use saturating functions also:

fn main() {
    let mut j: u8 = 0;
    for i in 0..512 {
        j = j.saturating_add(1);
        println!("{i} : {j}");
    }
}

If you don't like the extra typing, and want to associate a behavior with a type, you can use the wrapping and saturating types:

use std::num::Wrapping;

fn main() {
    let mut j: Wrapping<u8> = Wrapping(0);
    for i in 0..512 {
        j += Wrapping(1);
        println!("{i} : {j}");
    }
}

Rust is being very explicit about behavior, because surprises are a bad thing!

Structures

Rust and C++ have similar support for structures (there is no class keyword in Rust). Unlike a C++ struct, whose members are public by default, a Rust struct's fields are private by default.

#[derive(Debug)]
struct MyStruct {
    a: i32,
    b: u32,
    c: usize,
    pub d: String, // the `pub` keyword marks that field as "public"
}

fn main() {
    let val = MyStruct {
        a: 1,
        b: 2,
        c: 3,
        d: String::from("Hello"),
    };
    println!("{val:#?}");
}

#[derive] executes a procedural macro on the type at compilation time. Derive macros automatically implement traits for you---we'll be covering that later. In this case, #[derive(Debug)] feels like magic. It reflects on the structure type, and builds a debug formatter. It works as long as everything in the type also supports Debug.

Associated Functions

There's no direct equivalent of C++ constructors---definitely no rule of 0/5/7. Structures can have functions, and by convention constructors are associated functions. Associated functions use a structure as a namespace, and don't have access to an instance of a type. Here's a constructor:

#[derive(Debug)]
struct MyStruct {
    a: i32
}

impl MyStruct {
    fn new(a: i32) -> Self {
        Self { a }
    }
}

fn main() {
    let my_struct = MyStruct::new(5);
    println!("{my_struct:#?}");
}

We're using the built-in helper Self---with a capital S---that refers to "the type I'm currently implementing". You can put the full type name in if you prefer.

A nice side effect is that you can have as many constructors as you want, and put anything you like in the namespace. Please only put related functions in the namespace---otherwise, finding things later can be really annoying.
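
For illustration, here's a sketch of a second constructor living in the same namespace (the name zeroed is made up for this example):

#[derive(Debug)]
struct MyStruct {
    a: i32
}

impl MyStruct {
    fn new(a: i32) -> Self {
        Self { a }
    }

    // A second constructor---call it whatever makes sense for your type.
    fn zeroed() -> Self {
        Self { a: 0 }
    }
}

fn main() {
    let a = MyStruct::new(5);
    let b = MyStruct::zeroed();
    println!("{a:#?} {b:#?}");
}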

Methods

You can also define methods. For example:

#[derive(Debug)]
struct MyStruct {
    a: i32
}

impl MyStruct {
    fn new(a: i32) -> Self {
        Self { a }
    }

    fn get_a(&self) -> i32 {
        self.a
    }
}

fn main() {
    let my_struct = MyStruct::new(5);
    println!("{}", my_struct.get_a());
}

Note that writing getters/setters isn't required at all. Some organizations like them, some don't.

Arrays

Statically-sized arrays (no VLAs from C) are built-in:

fn main() {
    let array = [1, 2, 3, 4];
    let another_array = [0; 5]; // Five elements, each initialized to 0
    println!("{array:?}");
    println!("{another_array:?}");
}

What happens if you read outside of an array?

fn main() {
    let array = [0; 5];
    println!("{}", array[6]);
}

In this simple example, the compiler actually detects that it will fail at runtime and refuses to compile it! You can't count on it doing that. As soon as you add some complexity, LLVM won't spot the issue:

fn main() {
    let array = [0; 5];
    for i in 0..10 {
        println!("{}", array[i]);
    }
}

Note that the output is:

thread 'main' panicked at src/main.rs:4:24:
index out of bounds: the len is 5 but the index is 5

Rust detected the error at runtime and issued a panic, rather than segfaulting and performing potentially undefined behavior. Contrast this with the code from cpp/array_bounds:

#include <iostream>

int main() {
    int a[3] = {1, 2, 3};
    for (int i = 0; i < 10; i++) {
        std::cout << a[i] << std::endl;
    }
    return 0;
}

On my Linux workstation, it outputs:

1
2
3
-506178560
318461025
1
0
4925130
0
1651076199

That's not a good sign for security and memory safety!

Vectors

Vectors in Rust are a lot like vectors in C++: they store a capacity and a size, and a pointer to an area of contiguous heap memory. When capacity is exceeded, they reallocate---typically doubling in size.
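
A quick way to watch that growth in action (the exact capacity sequence is an implementation detail, so don't rely on it):

fn main() {
    let mut v = Vec::new();
    for i in 0..10 {
        v.push(i);
        println!("len = {}, capacity = {}", v.len(), v.capacity());
    }
}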

Here are some vector examples:

fn main() {
    // The `vec!` macro helps with assignment using array syntax
    let my_vec = vec![1, 2, 3, 4, 5];
    let my_vec = vec![0; 5];

    // `push_back` has become `push`
    let mut my_vec = Vec::new();
    my_vec.push(1);
    my_vec.push(2);

    println!("{my_vec:?}");
    println!("{}", my_vec[1]);
    println!("{:?}", my_vec.get(2)); // Safer access.
    println!("{}", my_vec[2]); // Safe: Panics but doesn't do anything dangerous
}

Again, the default accessor will panic rather than at-best segfaulting.

Now that we've covered a lot of basic Rust syntax, let's dive back into the ecosystem---and the promises of safety.

Safety

Rust advertises heavily that it offers memory safety as a primary feature. Bjarne Stroustrup has repeatedly pointed out that modern C++ has many similar safety options---but they are opt in rather than opt out. Rust defaults to the safe code path; C++ defaults to the blazing fast with no seat-belts path.

This section is focused on Rust's memory safety guarantees from a C++ perspective.

We've seen a few examples of Rust offering memory safety already:

  • You have to explicitly convert between types. i32 and u32 can't be directly assigned without a conversion.
  • Overflow behavior is explicit: you opt in to wrapping, saturation, etc. The checked_* family of functions (checked_add, checked_div, and so on) makes it easy to test for overflow, division by zero and similar problems at runtime.
  • Accessing beyond the bounds of an array or vector panics safely---that is, it doesn't trigger potentially dangerous behavior.

Let's look at another one:

No Null Pointers

Rust does not have a null pointer.* So the common C convention of returning a pointer, or nullptr if the operation failed, doesn't exist. Instead, Rust has the Option type.

    • Rust actually does have several nullable types, but they are all tagged as unsafe!

The Option type is a sum type. Just like a tagged union, it contains exactly one of its variants---and is roughly the size of the largest variant in memory. Options can be either:

  • None - they contain no data.
  • Some(T) - they contain a T.

If you could just use an Option like a pointer, we wouldn't have gained any safety. So instead, you have to explicitly access its contents---the compiler is forcing you to perform the null pointer check.

For example:

This example is in projects/part2/no_null

fn frobnicator() -> Option<i32> {
    Some(3)
}

fn main() {
    // Panic if the option is equal to None
    let a = frobnicator().unwrap();
    
    // Panic with a nice error message if the option is equal to None
    let a = frobnicator().expect("The frobnicator is broken!");

    // Just check to see if we got a value at all
    let a = frobnicator();
    if a.is_some() { 
        // Do something
    }
    if a.is_none() {
        // Do something
    }

    // Use "match" for pattern matching
    let a = frobnicator();
    match a {
        None => {
            // Do Something
        }
        Some(a) => {
            // the inner a now refers to the contents of the Option
            // Do something
        }
    }

    // Use "if let" for single-arm pattern matching
    let a = frobnicator();
    if let Some(a) = a {
        // Do something
    }
}

So yes, that's more typing. On the other hand, C++ doesn't issue any warnings for the following program:

#include <iostream>

struct A {
    int a;
};

A * frobnicator() {
    return nullptr;
}

int main() {
    A * a = frobnicator();
    std::cout << "a is " << a->a << std::endl;
    return 0;
}

The program crashes at runtime with Segmentation Fault.

Summary

So Rust protects you from some of the most common issues. You won't encounter null pointer issues in safe code, and bounds-checking is on by default, closing off a whole class of security problems. Type conversion being explicit makes it hard to accidentally change type and lose data (which can be extra "fun" in old C code with pointers of mixed types!), and overflow behavior being opt-in reduces the risk of accidentally overflowing a type.

The Borrow Checker

The borrow checker gets a bad name from people who run into it and discover "I can't do anything!". The borrow checker does take a bit of getting used to - but in the medium term it really helps.

I went through a cycle going from C++ to Rust, and many people I've talked to went through the same:

  • First week or two: I hate the borrow checker! This is awful! I can't do anything!
  • Next: I see how to work within what it wants, I can live with this
  • Then: Wow, I'm writing Rust-like C++ and Go now - and my code is failing less frequently.

The good news is that if you are familiar with Modern C++, you've run into a lot of the same issues that the borrow checker helps with. Let's work through some examples that show how life with Rust is different.

Immutable by Default

This one trips a few people up when they start with Rust. This won't compile:

fn main() {
    let i = 5;
    i += 1;
}

Variables are immutable by default. In C++ terms, you just tried to write:

int main() {
    const int i = 5;
    i += 1;
    return 0;
}

You can make i mutable and it works as you'd expect:

fn main() {
    let mut i = 5;
    i += 1;
}

In other words: C++ and Rust have exactly the opposite defaults. In C++, everything is mutable unless you const it. In Rust, everything is immutable unless you mut it.

You could simply declare everything to be mutable. The linter will regularly remind you that things can be immutable. It's considered good Rust style to minimize mutability, so you aren't surprised by mutations.

Move by Default

Quick show of hands. Who knows what std::move does? Who really likes std::move?

This one surprises everyone. The following code does what you'd expect:

fn do_it(a: i32) {
    // Do something
}

fn main() {
    let a = 42;
    do_it(a);
    println!("{a}");
}

So why doesn't this work?

fn do_it(a: String) {
    // Do something
}

fn main() {
    let a = String::from("Hello");
    do_it(a);
    println!("{a}");
}

So why did this work with i32? i32 is a primitive, and it implements a trait named Copy. Copy types are duplicated bit-for-bit when you pass them around; for small types like the primitives, that's actually faster than passing a pointer to the value. This is the same as C++ copying primitive types. When you work with a complex type such as String (which is very similar to C++'s std::string: a length, a capacity, and a heap-allocated buffer of characters---UTF-8 in Rust's case), the value is moved instead of copied.
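
As an aside, small custom types can opt into the same copy semantics. This sketch (the Point type is made up for illustration) shows a value that remains usable after being passed to a function:

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn do_it(p: Point) {
    println!("{}, {}", p.x, p.y);
}

fn main() {
    let p = Point { x: 1, y: 2 };
    do_it(p);            // p is copied, not moved
    println!("{}", p.x); // still valid afterwards
}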

The error message---borrow of moved value, with a long explanation---isn't as helpful as you might like.

The key is: Rust is move by default, and Rust is more strict about moving than C++. Here's what you wrote in C++ terms:

#include <string>

void do_it(std::string s) {
    // Do something
}

int main() {
    std::string s = "Hello";
    do_it(std::move(s));
    // s is now in a valid but unspecified state
    return 0;
}

What happens if you use s? Nobody knows; it's undefined behavior. std::move in C++ converts an object to an xvalue---a type that has "been moved out of", and may or may not be in a valid state. Rust takes this to the logical conclusion, and prevents any access to a "moved out of" value.

Moving Values Around

If you want to, you can move variables in and out of functions:

fn do_it(a: String) -> String {
    // Do something
    a
}

fn main() {
    let a = String::from("Hello");
    let a = do_it(a);
    println!("{a}");
}

This code is valid. Moving can generate a memcpy, but it is usually removed by compiler optimizations, and LLVM applies the same return-value optimizations as C++ when returning from a function.

Usually, I recommend moving out of a variable if you are genuinely done with it. Conceptually, you are giving ownership of the object to another function - it's not yours anymore, so you can't do much with it.

This is conceptually very similar to using unique_ptr in C++. The smart pointer owns the contained data. You can move it between functions, but you can't copy it.

Destructors and Moving

In C++, you can have move constructors---and moving structures around can require some thought as move constructors fire. Rust simplifies this. Moving a structure does not fire any sort of constructor. We haven't talked about destructors yet, so let's do that.

In Rust, destructors are implemented by a trait named Drop. You can add Drop to your own types. Let's use this to illustrate the lifetime of a type as we move it around:

The code is in projects/part2/destructors

struct MyStruct {
    s: String
}

impl Drop for MyStruct {
    fn drop(&mut self) {
        println!("Dropping: {}", self.s);
    }
}

fn do_it(a: MyStruct) {
    println!("do_it called");
}

fn move_it(a: MyStruct) -> MyStruct {
    println!("move_it called");
    a
}

fn main() {
    let a = MyStruct { s: "1".to_string() };
    do_it(a);
    // a no longer exists

    let b = MyStruct { s: "2".to_string() };
    let b = move_it(b);
    println!("{}", b.s);
}

As you can see, Drop is called when the structure ceases to be in scope:

  • do_it runs, and receives ownership of the object. The destructor fires as soon as the function exits.
  • move_it runs, and ownership of the object passes back to main. The destructor fires when b goes out of scope at the end of main.

RAII is central to Rust's safety model. It's used everywhere. I try to remember to credit C++ with its invention every time I mention it!

Borrowing (aka References)

So with that in mind, what if you don't want to move your data around a lot (and pray that the optimizer removes as many memcpy calls as possible)? This introduces borrowing. Here's a very simple function that takes a borrowed parameter:

fn do_it(s: &String) {
    println!("{s}");
}

fn main() {
    let s = "42".to_string();
    do_it(&s);
}

Predictably, this prints 42. The semantics are similar to C++: you indicate a borrow/reference with &. Unlike C++, you have to indicate that you are passing a reference at both the call-site and the function signature---there's no ambiguity (which helps to avoid accidental passing by value/copying). This is the same as the following C++:

#include <string>
#include <iostream>

void do_it(const std::string &s) {
    std::cout << s << std::endl;
}

int main() {
    std::string s = "42";
    do_it(s);
    return 0;
}

Once again, notice that the reference is implicitly immutable.

If you want a mutable borrow---permitted to change the borrowed value---you have to indicate so.

fn do_it(s: &mut String) {
    s.push_str("1");
}

fn main() {
    let mut s = String::from("42");
    do_it(&mut s);
    println!("{s}");
}

Notice that you are:

  • Making s mutable in the let mut declaration. You can't mutably lend an immutable variable.
  • Explicitly decorating the lend as &mut at the call-site.
  • Explicitly borrowing as mutable in the parameters ((s: &mut String)).

Rust doesn't leave any room for ambiguity here. You have to mean it when you allow mutation!

Why Mutability Matters

The borrow checker enforces a very strict rule: at any given time, a variable may have either one mutable borrow or any number of immutable borrows---but not both. There is only ever one current effective owner who can change the variable. This can take a little bit of getting used to.

So this is invalid code:

fn main() {
    let mut i: i32 = 1;
    let ref_i = &mut i;
    let second_ref_i = &mut i;
    println!("{i}");
    println!("{ref_i}");
    println!("{second_ref_i}");
}

The print statements are there to keep the borrows alive; without them, the compiler would see that the references are never used, end the borrows early, and happily compile the code.
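
For contrast, any number of immutable borrows at the same time is perfectly fine, as long as nothing is mutated:

fn main() {
    let i: i32 = 1;
    let ref_i = &i;
    let second_ref_i = &i;
    println!("{i}");
    println!("{ref_i}");
    println!("{second_ref_i}");
}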

For example, this is an example of some code that triggers borrow-checker rage:

fn main() {
    let mut data = vec![1,2,3,4,5];
    for (idx, value) in data.iter().enumerate() {
        if *value > 3 {
            data[idx] = 3;
        }
    }
    println!("{data:?}");
}

Look at the error message:

error[E0502]: cannot borrow `data` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:13
  |
3 |     for (idx, value) in data.iter().enumerate() {
  |                         -----------------------
  |                         |
  |                         immutable borrow occurs here
  |                         immutable borrow later used here
4 |         if *value > 3 {
5 |             data[idx] = 3;
  |             ^^^^ mutable borrow occurs here

Using an iterator (with .iter()) immutably borrows each record in the vector in turn. But when we index into data[idx] to change the value, we're mutably borrowing. Since you can't have a mutable borrow at the same time as other borrows, this is invalid.

You have to be careful to limit access. You could rewrite this code a few ways. The most Rustacean way is probably:

This is a good thing. Changing an underlying structure while you iterate it risks iterator invalidation.

Option 1: The Rustacean Iterators Way

fn main() {
    let mut data = vec![1,2,3,4,5];
    data.iter_mut().filter(|d| **d > 3).for_each(|d| *d = 3);
    println!("{data:?}");
}

This is similar to how you'd do it with ranges-v3 or the C++20 ranges feature. You are pipelining:

  • You obtain a mutable iterator (it will pass an &mut reference to each entry in turn).
  • You filter the target records with a predicate. |d| **d > 3 is a closure (lambda function) - d is the parameter, which arrives as a reference to a reference (&&mut i32) because the iterator yields &mut references and filter passes a reference to each of those. (Good news: the compiler cleans that up. I still think it's ugly!)
  • Then you run for_each on the remaining entries.

That's great for problems that naturally fit into an iterator solution.

Option 2: Do the two-step

Another option is to separate the operations:

fn main() {
    let mut data = vec![1,2,3,4,5];
    let mut to_fix = Vec::new();
    for (idx, value) in data.iter().enumerate() {
        if *value > 3 {
            to_fix.push(idx);
        }
    }
    for idx in to_fix { // Note: no .iter(). We're *moving* out of to_fix, consuming the vector.
        data[idx] = 3;
    }
    println!("{data:?}");
}

This is pretty typical: you "beat" the borrow checker by breaking your task down into specific stages. In this case, we avoided a potential iterator invalidation. We also made it a lot easier for the compiler to perform static analysis and prevent data races.

Dangling Pointers

The borrow checker prevents a lot of dangling pointer and reference errors. For example:

fn main() {
    let s = String::from("Hello");
    let s_ref = &s;
    std::mem::drop(s);
    println!("{s_ref}");
}

Dropping s ends its existence early (it's the equivalent of delete---the destructor still runs). Trying to print s_ref after s is dropped is a compile-time error: s no longer exists. Try the same in C++ and you don't get any warning by default (though most static analysis tools will catch this):

#include <iostream>

int main() {
    std::string * s = new std::string("Hello");
    delete s;
    std::cout << *s << std::endl;
}

Summary

The borrow checker does take some getting used to, but it's surprising how long you can go without running into it if you write idiomatic, straightforward code. It's especially hard coming from C++, which lets you get away with a lot.

In this section, we've covered:

  • Move by default, and Rust curing all "use after move" errors.
  • Explicit borrowing, and no more "oops, I copied by value by mistake".
  • Explicit mutability, to avoid surprises.
  • The "one mutable access at a time" rule, which prevents hidden bugs like iterator invalidation.
  • No more dangling pointers/references --- but still no garbage collector.

Now let's look at the second half of the borrow checker, lifetimes.

Lifetimes

The borrow checker not only tracks borrows, it attaches a lifetime to every borrow.

In very early versions of Rust, you had to annotate every reference with a lifetime. Be glad you don't have to do this anymore! Code could look like this:

fn do_it<'a>(s: &'a String) {
    println!("{s}");
}

fn main() {
    let s = String::from("Hello");
    do_it(&s);
}

This is still valid Rust, but in most cases Rust is able to deduce an "anonymous lifetime" for reference usage. Let's look at what the annotations mean:

  • do_it<'a> introduces a new lifetime, named 'a. You can name lifetimes whatever you want, but it's common to use short names.
  • In the arguments, s: &'a String states that the borrowed String adheres to lifetime 'a.

What's really happening here? Rust is tracking that when you call do_it, a borrow with a lifetime is created. The borrow must not outlive the object being pointed at---the object has to live at least as long as the reference. Violating that is a compiler error.

Escaping References

Returning a reference to a local variable is a really common idiom in Go. The Go compiler will detect that you're referencing a local variable (via escape analysis), hoist it to the heap without telling you, and let you keep your reference.

This compiles in C++:

#include <iostream>
using namespace std;

int& bar()
{
    int n = 10;
    return n;
}

int main() {
    int& i = bar();
    cout<<i<<endl;
    return 0;
}

The code does generate a warning, but it actually functioned on 2 of the 3 systems I tried it on! Rust is not so forgiving:

fn do_it() -> &String {
    let s = String::from("Hello");
    &s
}

fn main() {
    let s = do_it();
}

Rust starts by telling you that you need a lifetime specifier, and suggests a special lifetime called 'static. Static is a special lifetime in which you promise that a reference will live for the rest of the program, so Rust doesn't have to worry about it. So let's try that:

fn do_it() -> &'static String {
    let s = String::from("Hello");
    &s
}

fn main() {
    let s = do_it();
}

It still doesn't compile, this time with the correct error: cannot return a reference to local variable.

The borrow checker prevents this problem.

Returning References

What if you actually do want to return a valid reference? This function won't compile without lifetime specifiers.

fn largest<'a>(a: &'a i32, b: &'a i32) -> &'a i32 {
    if a > b {
        &a
    } else {
        &b
    }
}

fn main() {
    let a = 1;
    let b = 2;
    let ref_to_biggest = largest(&a, &b);
    println!("{ref_to_biggest}");
}

You have to clarify to Rust that the function can assume that both references will share a lifetime with the function output. So now for the returned reference to remain valid, both inputs also have to remain valid. (In this example, we're using a type that would be better off being copied anyway!)

Keeping References

Life starts to get complicated when you want to keep references around. Rust has to validate the lifetimes of each of these references.

struct Index {
    selected_string: &String
}

fn main() {
    let strings = vec![
        String::from("A"),
        String::from("B"),
    ];
    let index = Index {
        selected_string: &strings[1]
    };
    println!("{}", index.selected_string);
}

This fails to compile, but the compiler error tells you what needs to be done. So we apply its suggestions:

struct Index<'a> {
    selected_string: &'a String
}

fn main() {
    let strings = vec![
        String::from("A"),
        String::from("B"),
    ];
    let index = Index {
        selected_string: &strings[1]
    };
    println!("{}", index.selected_string);
}

And that works! You've tied the structure to the lifetime of the references it holds. If the strings table goes away, then the Index is invalid. Rust won't let this compile:

struct Index<'a> {
    selected_string: &'a String
}

fn main() {
    let index = {
        let strings = vec![
            String::from("A"),
            String::from("B"),
        ];
        let index = Index {
            selected_string: &strings[1]
        };
        index
    };
    println!("{}", index.selected_string);
}

The error message helpfully explains that strings does not live long enough---which is true. This is the primary purpose of the borrow checker: dangling references become a compile-time error, rather than a long head-scratching session at runtime.

Reference Counting - Borrow Checker Escape Hatch

Now that we've covered a lot of ground with the borrow checker, moving and lifetimes---let's breathe a sigh of relief knowing that there are some escape hatches if you need them. They do come at a cost, but it's a manageable one.

Move-by-default and borrowing assume ownership. This is conceptually similar to a unique_ptr in C++: the unique_ptr has ownership of the data it is holding (and handles clean-up for you). C++ also has shared_ptr to handle those times that ownership is murky, and you just want to be sure that the object goes away when nobody is using it anymore.

Rust has Rc (for "reference counted") as a wrapper type for this. (There's also Arc - atomic reference counted - for multi-threaded situations).

You can turn any variable into a reference-counted variable (on the heap) by wrapping it in Rc:

This is in projects/part2/refcount

use std::rc::Rc;

struct MyStruct {}

impl Drop for MyStruct {
    fn drop(&mut self) {
        println!("Dropping");
    }
}

fn move_it(n: Rc<MyStruct>) {
    println!("Moved");
}

fn ref_it(n: &MyStruct) {
    // Do something
}

fn main() {
    let shared = Rc::new(MyStruct{});
    move_it(shared.clone());
    ref_it(&shared);
}

So we take a reference, move a clone (the Rc type is designed to have clone() called whenever you want a new shared pointer to the original)---and the data is only dropped once. It is shared between all the functions. You can use this to spread data widely between functions.

You can't mutate the contents of an Rc without some additional help. We're going to talk about synchronization protection next, in Data Races.

Data-Race Protection

Rust makes the bold claim that it offers "fearless concurrency" and no more data-races (within a program; it can't do much about remote calls). That's a very bold claim, and one I've found to be true so far---I'm much more likely to contemplate writing multi-threaded (and async) code in Rust now that I understand how it prevents me from shooting myself in the foot.

An Example of a Data Race

Here's a little modern C++ program with a very obvious data-racing problem (it's in the cpp/data_race directory):

#include <thread>
#include <iostream>

int main() {
    int counter = 0;
    std::thread t1([&counter]() {
        for (int i = 0; i < 1000000; ++i) {
            ++counter;
        }
    });
    std::thread t2([&counter]() {
        for (int i = 0; i < 1000000; ++i) {
            ++counter;
        }
    });
    t1.join();
    t2.join();

    std::cout << counter << std::endl;

    return 0;
}

The program compiled and ran without any warnings (although additional static analysis programs would probably flag this).

The program fires up two threads. Each loops, incrementing a counter. It joins the threads, and prints the result. The predictable result is that every time I run it, I get a different result: 1015717, 1028094, 1062030 from my runs.

This happens because incrementing an integer isn't a single-step operation:

  1. The CPU loads the current counter value, into a register.
  2. The CPU increments the counter.
  3. The CPU writes the counter back into memory.

There's no guarantee that one thread won't perform these steps while the other thread is part-way through the same operation. The result is data corruption.

Let's try the same thing in Rust. We'll use "scoped threads" (we'll be covering threading in a later session) to make life easier for ourselves. Don't worry about the semantics yet:

fn main() {
    let mut counter = 0;
    std::thread::scope(|scope| {
        let t1 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                counter += 1;
            }
        });
        let t2 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                counter += 1;
            }
        });
        let _ = t1.join();
        let _ = t2.join(); // let _ means "ignore" - we're ignoring the result type
    });
    println!("{counter}");
}

And now you see the beauty behind the "single mutable access" rule: the borrow checker prevents the program from compiling, because the threads are mutably borrowing the shared variable. No data race here!

Atomics

If you've used std::thread, you've probably also run into atomic types. An atomic operation is guaranteed to complete as a single, indivisible operation, and can optionally be synchronized between cores. The following C++ program makes use of an std::atomic_int to always give the correct result:

#include <thread>
#include <iostream>
#include <atomic>

int main() {
    std::atomic_int counter = 0;
    std::thread t1([&counter]() {
        for (int i = 0; i < 1000000; ++i) {
            ++counter;
        }
    });
    std::thread t2([&counter]() {
        for (int i = 0; i < 1000000; ++i) {
            ++counter;
        }
    });
    t1.join();
    t2.join();

    std::cout << counter << std::endl;

    return 0;
}

Rust gives you a similar option:

This code is in projects/part2/atomics

use std::sync::atomic::Ordering::Relaxed;
use std::sync::atomic::AtomicU32;

fn main() {
    let counter = AtomicU32::new(0);
    std::thread::scope(|scope| {
        let t1 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                counter.fetch_add(1, Relaxed);
            }
        });
        let t2 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                counter.fetch_add(1, Relaxed);
            }
        });
        let _ = t1.join();
        let _ = t2.join(); // let _ means "ignore" - we're ignoring the result type
    });
    println!("{}", counter.load(Relaxed));
}

So Rust and C++ are equivalent in functionality. Rust is a bit more pedantic---making you specify the memory ordering (the orderings are taken from the C++ memory model!). Rust's benefit is that the unsynchronized version is a compile error---otherwise the two are very similar.

Why Does This Work?

So how does Rust know that it isn't safe to share a plain integer---but it is safe to share an atomic? Rust has two marker traits that are implemented automatically (and can be manually implemented in unsafe code): Sync and Send.

  • A Sync type can be safely accessed from multiple threads at the same time---mutating it through a shared reference requires some synchronization primitive.
  • A Send type can be sent between threads---it isn't going to do bizarre things because it is being accessed from multiple places.

A plain integer can't be mutated through a shared reference, so two threads can never update it at once; an atomic integer can be, which is why the threads are allowed to share it.

Rust provides atomics for all of the primitive types, but does not provide a general Atomic wrapper for other types. Rust's atomic primitives are pretty much a 1:1 match with CPU intrinsics, which don't generally offer sync+send atomic protection for complicated types.
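
Here's a minimal sketch of that difference in practice (the helper function names are made up): an atomic can be updated through a shared reference, while a plain integer requires exclusive access---which the borrow checker will never hand to two threads at once:

use std::sync::atomic::{AtomicU32, Ordering};

// A shared (&) reference is enough to update an atomic.
fn bump_atomic(counter: &AtomicU32) {
    counter.fetch_add(1, Ordering::Relaxed);
}

// A plain integer needs an exclusive (&mut) reference.
fn bump_plain(counter: &mut u32) {
    *counter += 1;
}

fn main() {
    let atomic = AtomicU32::new(0);
    let mut plain = 0u32;
    bump_atomic(&atomic);
    bump_plain(&mut plain);
    println!("{} {}", atomic.load(Ordering::Relaxed), plain);
}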

Mutexes

If you want to provide similar thread-safety for complex types, you need a Mutex. Again, this is a familiar concept to C++ users.

Using a Mutex in C++ works like this:

#include <iostream>
#include <thread>
#include <mutex>

int main() {
    std::mutex mutex;
    int counter = 0;
    std::thread t1([&counter, &mutex]() {
        for (int i = 0; i < 1000000; ++i) {
            std::lock_guard<std::mutex> guard(mutex);
            ++counter;
        }
    });
    std::thread t2([&counter, &mutex]() {
        for (int i = 0; i < 1000000; ++i) {
            std::lock_guard<std::mutex> guard(mutex);
            ++counter;
        }
    });
    t1.join();
    t2.join();

    std::cout << counter << std::endl;

    return 0;
}

Notice what is involved in using the mutex:

  1. You declare the mutex as a separate variable to the data you are protecting.
  2. You create a lock_guard by initializing the lock with lock_guard's constructor, taking the mutex as a parameter.
  3. The lock is automatically released when the guard leaves scope, using RAII.

This works, and always gives the correct result. It has one inconvenience that can lead to bugs: there's no enforcement that makes you remember to use the lock. You can get around this by building your own type and enclosing the update inside it---but the compiler won't help you if you forget. For example, commenting out one of the mutex locks won't give any compiler warnings.

Let's build the same thing, in Rust. The Rust version is a bit more complicated:

This code is in projects/part2/mutex

use std::sync::{Arc, Mutex};

fn main() {
    let counter = Arc::new(Mutex::new(0));
    std::thread::scope(|scope| {
        let my_counter = counter.clone();
        let t1 = scope.spawn(move || {
            for _ in 0 .. 1000000 {
                let mut lock = my_counter.lock().unwrap();
                *lock += 1;
            }
        });

        let my_counter = counter.clone();
        let t2 = scope.spawn(move || {
            for _ in 0 .. 1000000 {
                let mut lock = my_counter.lock().unwrap();
                *lock += 1;
            }
        });
        let _ = t1.join();
        let _ = t2.join(); // let _ means "ignore" - we're ignoring the result type
    });
    let lock = counter.lock().unwrap();
    println!("{}", *lock);
}

Let's work through what's going on here:

  1. let counter = Arc::new(Mutex::new(0)); is a little convoluted.
    1. Mutexes in Rust wrap the data they are protecting, rather than being a separate entity. This makes it impossible to forget to lock the data---you don't have access to the interior without obtaining a lock.
    2. Mutex only provides the Sync trait---it can be safely accessed from multiple locations, but it doesn't provide any safety for sending the data between threads.
    3. To gain the Send trait, we also wrap the whole thing in an Arc. Arc is "atomic reference count"---it's just like an Rc, but uses an atomic for the reference counter. Using an Arc ensures that there's only a single counter, with safe access to it from the outside.
    4. Note that counter isn't mutable---despite the fact that it is mutated. This is called interior mutability. The exterior doesn't change, so it doesn't have to be mutable. The interior can be changed via the Arc and the Mutex---which is protected by the Sync+Send requirement.
  2. Before each thread is created, we call let my_counter = counter.clone();. We're making a clone of the Arc, which increments the reference count and returns a shared pointer to the enclosed data. Arc is designed to be cloned every time you want another reference to it.
  3. When we start the thread, we use the let t1 = scope.spawn(move || { pattern. Notice the move. We're telling the closure not to capture references, but instead to move captured variables into the closure. We've made our own clone of the Arc, and it's the only variable we are referencing---so it is moved into the thread's scope. This ensures that the borrow checker doesn't have to worry about trying to track access to the same reference across threads (which won't work). Sync+Send protections remain in place, and it's impossible to use the underlying data without locking the mutex---so all of the protections are in place.
  4. let mut lock = my_counter.lock().unwrap(); locks the mutex. It returns a Result, so we're unwrapping it (we'll talk about why later). The lock itself is mutable, because we'll be changing its contents.
  5. We access the interior variable by dereferencing the lock: *lock += 1;

So C++ wins slightly on ergonomics, and Rust wins on preventing you from making mistakes!

Summary

Rust's data race protection is very thorough. The borrow-checker prevents multiple mutable accesses to a variable, and the Sync+Send system ensures that variables that are accessed in a threaded context can both be sent between threads and safely mutated from multiple locations. It's extremely hard to create a data race in safe Rust (you can use the unsafe tag and turn off protections if you need to)---and if you succeed in making one, the Rust core team will file it as a bug.

All of these safety guarantees add up to create an environment in which common bugs are hard to create. You do have to jump through a few hoops, but once you are used to them---you can fearlessly write concurrent code knowing that Rust will make the majority of multi-threaded bugs a compilation error rather than a difficult debugging session.

Opt-Out vs Opt-In Safety

Sometimes you are writing performance-critical code and need to avoid the overhead of Rust's run-time safety checks. It's also worth noting that C++ has safe equivalents for many of the things that Rust protects against---but they are opt-in, rather than opt-out. Rust reverses the defaults: you have to opt out of the safe path when you need to.

Example: Opting Out of Range Checking

If you are genuinely certain that a vector will always contain the element you are looking for, you can opt out of the range check. It's a relatively tiny performance change, but it can make a difference:

fn main() {
    let my_vec = vec![1, 2, 3, 4, 5];
    let entry = unsafe { my_vec.get_unchecked(2) };
    println!("{entry}");
}

Before you start using unchecked functions everywhere, do some profiling to make sure that this really is the bottleneck! Wherever you can, prefer safe code.

Notice that you had to use the unsafe scope tag---otherwise the program won't compile. This is common in Rust: code that may violate the safety guarantees has to be wrapped in unsafe. This acts as a flag for other developers to check that part of the code. There's nothing wrong with having some unsafe code---if you use it responsibly and with appropriate care and attention. Pretty much every Rust program has unsafe in it somewhere; the standard library is full of unsafe code. Once you leave Rust's domain and call into the operating system, you are leaving Rust's area of control---so the exterior call is inherently "unsafe". That doesn't mean it's bad; it just means that you have to be careful.

So what happens if we access a non-existent part of the vector?

fn main() {
    let my_vec = vec![1, 2, 3, 4, 5];
    let entry = unsafe { my_vec.get_unchecked(5) };
    println!("{entry}");
}

In my case, I see the output 0. There isn't a zero in the vector. We're reading past the end of the vector, just like we did in C++. The unsafe tag has let you bypass Rust's memory safety guarantees.
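
For contrast, here's a minimal sketch of the safe alternatives---plain indexing panics with a clear error, and get returns an Option you can check:

fn main() {
    let my_vec = vec![1, 2, 3, 4, 5];

    // Safe, non-panicking access: returns Option<&T>.
    match my_vec.get(5) {
        Some(entry) => println!("{entry}"),
        None => println!("Index 5 is out of bounds"),
    }

    // Safe indexing: my_vec[5] would panic with "index out of bounds"
    // rather than silently reading past the end of the allocation.
}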

Example of What Not to Do: Turn off safety and enjoy a data race

On the other end of the spectrum, the unsafe tag really does let you do bad things. You can recreate the counter example with unsafe code and have exactly the same problem you started with:

The code for this is in projects/part2/unsafe_threading

fn main() {
    static mut COUNTER: u32 = 0;
    std::thread::scope(|scope| {
        let t1 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                unsafe {
                    COUNTER += 1;
                }
            }
        });
        let t2 = scope.spawn(|| {
            for _ in 0 .. 1000000 {
                unsafe {
                    COUNTER += 1;
                }
            }
        });
        let _ = t1.join();
        let _ = t2.join(); // let _ means "ignore" - we're ignoring the result type
    });
    unsafe {
        println!("{COUNTER}");
    }
}

Please don't do this. This is just for illustration purposes!

Safety Wrap-Up

So we've taken a deep-dive into Rust's safety promises. Rust offers a significantly safer experience than other systems languages, while remaining deterministic and without a garbage collector:

  • Type conversion is explicit, and offers protections against inadvertent data loss when you cast an i32 into an i16.
  • Overflow behavior is explicit, and trapped at runtime in Debug mode.
  • Checked arithmetic lets you easily act on common overflow, division by zero and similar issues rather than crashing.
  • Out-of-bounds access is checked, preventing common bugs and security vulnerabilities.
  • No null pointers, and therefore no null pointer dereferences.
  • Move, reference and copy are explicit. You can't accidentally pass-by-value.
  • Immutable by default makes it harder to accidentally apply side-effects in your code.
  • RAII makes it difficult to accidentally leak memory or other resources.
  • The borrow checker prevents aliasing and potentially confusing results.
  • The borrow checker prevents subtle bugs like iterator invalidation, albeit occasionally at the cost of making an operation difficult.
  • Lifetime protection prevents dangling pointers, use-after-free and use-after-move.
  • Explicit Send protection ensures that data you move between threads is actually safe to send to another thread context.
  • Explicit Sync protection makes data-races practically impossible.
  • You can opt-out of the extra safety checks with the unsafe tag---and should be very careful doing so.

Workspaces, Crates, Programs, Libraries and Modules

Let's talk about some terminology:

  • A crate is a Rust package. It can either be a program or a library---it's a package of code managed by Cargo.
  • A program is an executable. A crate produces a program if it has a main.rs file and, usually, a main function (you can change the entry point's name, but there does need to be an entry point).
  • A library is a crate with a lib.rs file. It compiles as a static library by default; you can override this if you need dynamic libraries (Rust is very much oriented towards self-contained, statically linked systems).
  • A module is a unit-of-work for the compiler. Programs and libraries are divided into modules.
  • A workspace is a Cargo helper that lets you include multiple crates in one environment with a shared compilation target directory and better incremental compilation.

This is quite unlike C++'s system. #include is almost a cut-and-paste; the new C++20 modules system is a bit more similar---but I had trouble getting it to work consistently across platforms.

Workspaces

The example code uses a workspace, and I'd encourage you to do the same. Workspaces are a great mechanism for storing related code together.

Let's create a workspace.

  1. cd to your parent directory.
  2. Create a new Rust project with cargo new my_workspace.
  3. cd into my_workspace.
  4. Edit src/main.rs to change "Hello, World!" to something like "You probably intended to run a workspace member". This is optional, but helps avoid confusion.
  5. While in my_workspace, create a new project. cargo new hello.
  6. Edit my_workspace/Cargo.toml:
[workspace]
members = [ "hello" ]

Now change directory to my_workspace/hello and run the program with cargo run.

Take a look at my_workspace and you will see that a target directory has appeared. Within a workspace, all compiler artifacts are shared. For large projects, this can save a huge amount of disk space. It can also save on re-downloading dependencies, and will only recompile portions of the workspace that have changed.

While working on Hands-on Rust, I initially had 55 projects in separate crates without a workspace. I noticed that my book's code folder was using nearly 6 gigabytes of disk space, which was crazy. So I added a workspace, and that shrunk to a few hundred megabytes. Every single project was downloading all of the dependencies and building them separately.

Workspaces are safe to upload to github or your preferred Git repo. You can even access dependencies within a workspace remotely (we'll cover that in dependencies).

Libraries

Let's work through creating our first library. Keep the my_workspace and hello projects.

Change directory back to the workspace root (my_workspace/). Create a new library project:

cargo new hello_library --lib

Notice the --lib flag. You are creating a library.

Open my_workspace/Cargo.toml and add hello_library as a workspace member:

[workspace]
members = [ "hello", "hello_library" ]

Now open hello_library/src/lib.rs. Notice that Rust has auto-generated an example unit test system. We'll cover that in unit tests shortly. For now, delete it all and replace with the following code:

#![allow(unused)]
fn main() {
pub fn say_hello() {
    println!("Hello, world!");
}
}

The pub marks the function as "public"---available from outside the current module. Since it is in lib.rs, it will be exported in the library.

Now open hello/Cargo.toml and we'll add a dependency:

[dependencies]
hello_library = { path = "../hello_library" }

And open hello/src/main.rs and we'll use the dependency. Replace the default code with:

use hello_library::say_hello;

fn main() {
    say_hello();
}

Congratulations! You've made your first statically linked library.

Modules and Access

Rust can subdivide code into modules. A module can itself be public or private, and it can contain both public and private items (private being the default). Coming from C++, I found this a little confusing. You can also create modules in-place (as namespaces) or in separate files. Let's work through some examples.

Inline Module (Namespace)

Open hello_library/src/lib.rs. Let's add a private module:

#![allow(unused)]
fn main() {
mod private {
    fn hi() {
        println!("Say Hi!");
    }
}

pub fn say_hello() {
    println!("Hello, world!");
}
}

If you try to use private::hi() in your hello/src/main.rs program---it won't work. The module and the function are both private:

use hello_library::say_hello;

fn main() {
    say_hello();
    hello_library::private::hi(); // Will not compile
}

You can fix this by changing the module to be public:

#![allow(unused)]
fn main() {
pub mod private {
    fn hi() {
        println!("Say Hi!");
    }
}

pub fn say_hello() {
    println!("Hello, world!");
}
}

And it still doesn't work! That's because making a module public only exposes the public members of the module. So you also need to decorate the function as public:

#![allow(unused)]
fn main() {
pub mod private {
    pub fn hi() {
        println!("Say Hi!");
    }
}

pub fn say_hello() {
    println!("Hello, world!");
}
}

So that allows you to make a public namespace---and include private parts in the namespace that aren't exposed to the world. What if you want to write a function in a module, and expose it in a different namespace?

#![allow(unused)]
fn main() {
pub mod private {
    pub fn hi() {
        println!("Say Hi!");
    }
}

pub use private::hi;

pub fn say_hello() {
    println!("Hello, world!");
}
}

The use statement---importing something into the current namespace---can also be decorated with pub to re-export that import. You can use this with dependencies or with your own modules. (It's common to make a prelude module and re-export the most useful functions and types from it.) Now your program can refer to hello_library::hi directly.
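
A minimal sketch of what such a prelude might look like inside hello_library/src/lib.rs (the item names are just the ones from this example):

pub mod prelude {
    // Re-export the items most consumers will want, so they can write
    // `use hello_library::prelude::*;` and get everything in one line.
    pub use crate::private::hi;
    pub use crate::say_hello;
}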

File-based modules

If you're working in a team, it's usually a good idea to not all be trying to edit the same file at once. There are other advantages to using multiple files:

  • Rust can compile multiple files at the same time.
  • Organizing your code with files makes it a lot easier to find things.
  • You can use conditional compilation to include different files based on compilation constraints.

Let's make a one-file module. In hello_library/src create a new file named goodbye.rs. In that file, write:

#![allow(unused)]
fn main() {
pub fn bye() {
    println!("Goodbye");
}
}

Simply creating the file doesn't make it do anything---it isn't part of your project yet. In hello_library/src/lib.rs add a line to include the module:

#![allow(unused)]
fn main() {
mod goodbye;
}

The module is now private, even though the bye function is public! You will be able to access bye elsewhere in your library, but not from consumer applications. You can use the same mechanisms as for inline modules to change that. pub mod exports it as hello_library::goodbye (the filename is the namespace). Or you can pub use goodbye::bye.
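
Putting that together, a sketch of the two options in hello_library/src/lib.rs:

// Option 1: expose the whole file as a public module.
// Consumers call hello_library::goodbye::bye().
pub mod goodbye;

// Option 2: keep the module private, but re-export the function.
// Consumers call hello_library::bye().
// mod goodbye;
// pub use goodbye::bye;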

Directory modules

The final type of module places the module in a directory. The directory must contain a mod.rs file to act as the module root---and can include other files or inline modules as above.

Create a new directory, hello_library/src/dirmod. In that directory, create mod.rs:

#![allow(unused)]
fn main() {
pub fn dir_hello() {
    println!("Hello from dir module");
}
}

Now in hello_library/src/lib.rs include the new module:

#![allow(unused)]
fn main() {
pub mod dirmod;
}

You can now access the module in your hello project, with hello_library::dirmod::dir_hello().

Unit Tests

You saw an example unit test when you created a library. Rust/Cargo has a built-in unit testing system. Let's explore it a bit.

Let's build a very simple example, and examine how it works:

The code for this is in projects/part2/unit_test

#![allow(unused)]
fn main() {
fn double(n: i32) -> i32 {
    n * 2
}

#[cfg(test)] // Conditional compilation: only build in `test` mode
mod test { // Create a module to hold the tests
    use super::*; // Include everything from the parent module/namespace

    #[test] // This is a test, we want to include in our unit test runs
    fn two_times() {
        assert_eq!(4, double(2)); // Assert that 2*2 = 4
        assert!(5 != double(2)); // Assert that it doesn't equal 5
    }
}
}

You can run tests for the current project with cargo test. You can append --all to include all projects in the current workspace.

We'll talk about more complicated tests later.

Dependencies

Cargo includes dependency management, as opposed to having to integrate vcpkg, conan, etc.

We've used dependencies already to link a library from the same workspace. Adding other dependencies follows a similar pattern.

Finding Dependencies

You can search the available public crate repo (crates.io) with cargo search <term>. For example, searching for serde (a crate we'll be using later) gives the following result:

$ cargo search serde
serde = "1.0.193"                       # A generic serialization/deserialization framework
sideko_postman_api = "1.0.0"            # Rust API bindings - spostman_api
discord_typed_interactions = "0.1.0"    # suppose you're working with discord slash commands and you want statically typed reques…
serde_json_experimental = "0.0.0"       # A JSON serialization file format
serde_valid = "0.16.3"                  # JSON Schema based validation tool using with serde.
alt_serde_json = "1.0.61"               # A JSON serialization file format
serde_json = "1.0.108"                  # A JSON serialization file format
serde_jsonc = "1.0.108"                 # A JSON serialization file format
serde_partiql = "1.1.65"                # A PartiQL data model serialization file format
deserr = "0.6.1"                        # Deserialization library with focus on error handling
... and 5301 crates more (use --limit N to see more)

It's often more productive to use Google or crates.io directly to see what's available. The number of crates is growing rapidly, and they are of varied quality. It's worth doing a little research before picking one!

Adding Crates to Your Project

You can either use cargo search and find the appropriate information and add a crate by hand to your Cargo.toml:

[dependencies]
serde = "1.0.193"

Or you can use cargo add serde to add the crate to your Cargo.toml.

Feature Flags

Rust crates can have feature flags that enable functionality. For example, when using serde most of the time you will also use the derive feature flag to enable #[derive(Serialize)] type macros that make life much easier.

You'd either edit Cargo.toml to read:

[dependencies]
serde = { version = "1.0.193", features = [ "derive" ] }

Or run cargo add serde -F derive.

Updating Crates

You can update to the latest versions with cargo update.

"Vendoring" Crates

For repeatable builds (or working offline), you can run cargo vendor and follow the instructions. It downloads the source for all of your dependencies, and provides a snippet to add to your Cargo.toml to use the local versions.

Other Crate Sources

You can connect a crate via a Git repo (and optionally add a branch= specifier):

[dependencies]
bracket-lib = { git = "https://github.com/amethyst/bracket-lib.git" }

You can also use a path, like we did for libraries.

Viewing dependencies

cargo tree will show you all of your dependencies for a project, nested with their dependencies. It can get a bit excessive, sometimes!

Cargo.lock

Cargo maintains a file named Cargo.lock. The Rust project recommends not including this in your git repo---but many people recommend the opposite!

Cargo.lock lists exact versions of every crate that was used by the entire build process. If you have the same Cargo.lock, you will download the exact same files (if they haven't been withdrawn).

Benchmarking

Cargo has built-in benchmarking, but using it requires the nightly unstable code channel. I generally don't recommend relying on nightly code! If you are writing performance-critical code, benchmarking is essential. Fortunately, Rust makes it relatively straightforward to include benchmarks with a bit of boilerplate.

Quick and Dirty Benchmarks

This example is in project/simple_bench

A quick and dirty way to benchmark operations is to use Instant and Duration:

use std::time::Instant;

fn main() {
    let now = Instant::now();
    let mut i = 0;
    for j in 0 .. 1_000 {
        i += j*j;
    }
    let elapsed = now.elapsed();
    println!("Time elapsed: {} nanos", elapsed.as_nanos());
    println!("{i}");
}

Criterion

This project is in projects/part2/criterion_bench

In Cargo.toml, add:

[dev-dependencies]
criterion = { version = "0.4", features = [ "html_reports" ] }

[[bench]]
name = "my_benchmark"
harness = false

[dev-dependencies] is new! This is a dependency that is only loaded by development tools, and isn't integrated into your final program. No space is wasted.

Create <project>/benches/my_benchmark.rs:

#![allow(unused)]
fn main() {
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn fibonacci(n: u64) -> u64 {
    match n {
        0 => 1,
        1 => 1,
        n => fibonacci(n-1) + fibonacci(n-2),
    }
}

fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("fib 20", |b| b.iter(|| fibonacci(black_box(20))));
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
}

Run cargo bench and see the result.

Go to target/criterion and you have a full HTML report with statistics.

Flamegraphs

It pretty much requires Linux (and the perf infrastructure), but it's worth looking at Cargo Flamegraphs if you are developing on that platform. It's an easy wrapper around perf for generating flamegraphs to find your hotspots.
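
If you want to try it, the typical workflow looks roughly like this (assuming Linux with perf set up; check the cargo-flamegraph documentation for platform-specific details):

cargo install flamegraph
cargo flamegraph

This runs your program under perf and writes a flamegraph.svg you can open in a browser to explore hotspots.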

Workshop: Build a Basic Rust System

In this workshop, we're going to collaboratively build a login system---gradually introducing Rust concepts. The goal is to create a useful system.

  1. We'll start by setting up a project with a workspace, a library and an executable.
  2. We'll read input from stdin, and wrap it in a convenient function.
  3. That will let us build a basic "if the name matches" style of login system.
  4. We'll dive into Rust enumerations, which are quite unlike enum in other languages.
  5. We'll explore storing login information in structures, arrays and vectors---and dabble with iterator functions.
  6. Serialization and deserialization to load/save password files.
  7. Hashing passwords---or how to use dependencies to make life easier.
  8. We'll use clap, a framework for CLI functions to build a CRUD controller for our password manager.

The code for this is presented as a series of projects for each stage. It is in projects/part3.

Setup

Create a parent project (the example code will use the same one we're already using). Inside the parent project, create a workspace. The workspace will contain two members:

  • login_lib --- a library
  • login --- a console application

The login project needs to depend upon login_lib.

Console Text Input

We're going to start by adding a helper function to our library that reads text input from the console.

Let's create a function that will read a line of text from the console.

#![allow(unused)]
fn main() {
pub fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin().read_line(&mut input).expect("Failed to read line");
    input
}
}

What's the expect? Accessing standard input might fail - so Rust is returning a Result type. We're going to look at those later. For now, we're just going to crash if it fails. You can also use unwrap, but expect lets you specify an error message.
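
If you'd rather not crash, here's a minimal sketch of handling the Result yourself instead of calling expect:

pub fn read_line() -> String {
    let mut input = String::new();
    match std::io::stdin().read_line(&mut input) {
        Ok(_) => input,
        Err(e) => {
            eprintln!("Failed to read line: {e}");
            String::new() // fall back to an empty string
        }
    }
}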

Now let's test it:

fn main() {
    let input = login_lib::read_line();
    println!("You typed: [{input}]");
}

Notice how there's an extra carriage return. Reading input keeps the control characters. This is rarely what you want, so let's trim it:

#![allow(unused)]
fn main() {
pub fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin().read_line(&mut input).expect("Failed to read line");
    input.trim().to_string()
}
}

Now let's test it:

fn main() {
    let input = login_lib::read_line();
    println!("You typed: [{input}]");
}

Bingo - trimmed text input.

World's Simplest Login System

The code for this is in the auth and login projects.

Now that we have a library and an application that uses it, let's build the world's most primitive login system.

In the library:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> bool {
    username == "admin" && password == "password"
}
}

And we'll test it:

#[test]
fn test_login() {
    assert!(login("admin", "password"));
    assert!(!login("admin", "wrong"));
    assert!(!login("wrong", "password"));
}

That looks good. But we haven't checked for case. The password should be case-sensitive, but do we really care if the username is Herbert or herbert?

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> bool {
    username.to_lowercase() == "admin" && password == "password"
}
}

Now, let's go to the application and use it:

fn main() {
    let mut tries = 0;
    loop {
        println!("Enter your username:");
        let username = read_line();
        println!("Enter your password:");
        let password = read_line();
        if login(&username, &password) {
            println!("Welcome, {username}!");
            break;
        } else {
            println!("Login failed.");
            tries += 1;
            if tries >= 3 {
                println!("Too many failed attempts. Exiting.");
                break;
            }
        }
    }
}

Enumerations

You probably want more options than just "you are allowed in" and "you aren't permitted". You want to know why you aren't permitted. You want to know if the user is locked out, or if they have the wrong password, or if they are a new user and need to register. If the login succeeds, you want to know if they are an admin or a regular user.

Enumerations in Rust are very powerful---they are "algebraic data types" that can capture a lot of data. They are also "sum types"---meaning they only contain one of the types they are defined to contain.

Basic Enumerations

The code for this example is in login_lib_enum and login_enum.

Let's start with the most basic enumeration, which should be familiar from other languages:

#![allow(unused)]
fn main() {
pub enum LoginAction {
    Admin,
    User,
    Denied,
}
}

Now we can update the login function to return this enumeration:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> LoginAction {
    let username = username.to_lowercase();
    if username == "admin" && password == "password" {
        LoginAction::Admin
    } else if username == "bob" && password == "password" {
        LoginAction::User
    } else {
        LoginAction::Denied
    }
}
}

And we can update the application to use it:

#![allow(unused)]
fn main() {
let mut tries = 0;
loop {
    println!("Enter your username:");
    let username = read_line();
    println!("Enter your password:");
    let password = read_line();
    match login(&username, &password) {
        LoginAction::Admin => {
            println!("Welcome {username}, you are an admin.");
            break;
        }
        LoginAction::User => {
            println!("Welcome {username}, you are a regular user.");
            break
        }
        LoginAction::Denied => {
            println!("Login failed.");
            tries += 1;
            if tries >= 3 {
                println!("Too many failed attempts. Exiting.");
                break;
            }
        }
    }
}
}

match is exhaustive: not matching a pattern will fail to compile. You can use _ as a catch-all.
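
For example, if you only cared about the admin case, a sketch using the catch-all might look like this:

match login(&username, &password) {
    LoginAction::Admin => println!("Welcome, administrator."),
    _ => println!("Access denied."), // matches User and Denied
}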

Let's add a unit test to the library:

#![allow(unused)]
fn main() {
#[test]
fn test_enums() {
    assert_eq!(login("admin", "password"), LoginAction::Admin);
    assert_eq!(login("bob", "password"), LoginAction::User);
    assert_eq!(login("admin", "wrong"), LoginAction::Denied);
    assert_eq!(login("wrong", "password"), LoginAction::Denied);
}
}

And everything goes red in the IDE! That's because enumerations don't support comparison by default. Fortunately, it's easy to fix. Let's support debug printing while we're at it:

#![allow(unused)]
fn main() {
#[derive(PartialEq, Debug)]
pub enum LoginAction {
    Admin,
    User,
    Denied,
}
}

#[derive] is a procedural macro that writes code for you.

Enumerations with Data

The code for this section is in login_enum_data and login_lib_enum_data.

Let's clean up our enumerations a bit, and store some data in them:

#![allow(unused)]
fn main() {
#[derive(PartialEq, Debug)]
pub enum LoginAction {
    Granted(LoginRole),
    Denied,
}

#[derive(PartialEq, Debug)]
pub enum LoginRole {
    Admin,
    User,
}

pub fn login(username: &str, password: &str) -> LoginAction {
    let username = username.to_lowercase();
    if username == "admin" && password == "password" {
        LoginAction::Granted(LoginRole::Admin)
    } else if username == "bob" && password == "password" {
        LoginAction::Granted(LoginRole::User)
    } else {
        LoginAction::Denied
    }
}
}

Now we can update the application to use the new data:

#![allow(unused)]
fn main() {
match login(&username, &password) {
    LoginAction::Granted(LoginRole::Admin) => {
        println!("Welcome {username}, you are an admin.");
        break;
    }
    LoginAction::Granted(LoginRole::User) => {
        println!("Welcome {username}, you are a regular user.");
        break
    }
    LoginAction::Denied => {
        println!("Login failed.");
        tries += 1;
        if tries >= 3 {
            println!("Too many failed attempts. Exiting.");
            break;
        }
    }
}
}

Notice how match lets you peer inside multiple levels of enumeration. This type of pattern matching is very useful.
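
You can also bind the inner value instead of spelling out each role---a quick sketch (LoginRole derives Debug, so {role:?} prints it):

match login(&username, &password) {
    LoginAction::Granted(role) => {
        println!("Welcome {username}, your role is {role:?}");
    }
    LoginAction::Denied => println!("Login failed."),
}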

Optional Users

The code for this is found in login_lib_enum_option and login_enum_option.

Maybe we want the login system to know that a user doesn't exist. You might want to offer suggestions, or an option to create a new user. You can do this with an Option. Option is an enumeration that contains either Some(data) or None. It uses generics to store whatever type you want inside---but it is a sum type: you store one or the other, never both.

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> Option<LoginAction> {
    let username = username.to_lowercase();

    if username != "admin" && username != "bob" {
        return None;
    }

    if username == "admin" && password == "password" {
        Some(LoginAction::Granted(LoginRole::Admin))
    } else if username == "bob" && password == "password" {
        Some(LoginAction::Granted(LoginRole::User))
    } else {
        Some(LoginAction::Denied)
    }
}
}

Now we can update the login program to know if a user doesn't exist:

#![allow(unused)]
fn main() {
match login(&username, &password) {
    Some(LoginAction::Granted(LoginRole::Admin)) => {
        println!("Welcome {username}, you are an admin.");
        break;
    }
    Some(LoginAction::Granted(LoginRole::User)) => {
        println!("Welcome {username}, you are a regular user.");
        break
    }
    Some(LoginAction::Denied) => {
        println!("Login failed.");
        tries += 1;
        if tries >= 3 {
            println!("Too many failed attempts. Exiting.");
            break;
        }
    }
    None => {
        println!("User does not exist.");
        break;
    }
}
}

match allows for very deep pattern matching. You usually don't need to nest match statements.

Structures

You probably don't really want to have to write an if statement covering every user in your enterprise. Instead, you want to store usernames and passwords in a structure, along with the user's role.

Basic Structures

The code for this is found in login_lib_struct and login_struct.

You can define a simple structure like this:

#![allow(unused)]
fn main() {
pub struct User {
    pub username: String,
    pub password: String,
    pub role: LoginRole,
}
}

Structs aren't object-oriented, but they share some commonality with objects from other languages. You can define methods on them, and you can define associated functions (functions that are called on the type, not on an instance of the type). Let's make a constructor:

#![allow(unused)]
fn main() {
impl User {
    pub fn new(username: &str, password: &str, role: LoginRole) -> User {
        User {
            username: username.to_lowercase(),
            password: password.to_string(),
            role,
        }
    }
}
}
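
Methods work the same way, but take self (or a reference to it) as their first parameter. As a hypothetical example, you could add a helper that checks whether a user is an admin:

impl User {
    pub fn is_admin(&self) -> bool {
        self.role == LoginRole::Admin
    }
}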

Array of Structures

Let's create a function that creates an array of users:

#![allow(unused)]
fn main() {
pub fn get_users() -> [User; 2] {
    [
        User::new("admin", "password", LoginRole::Admin),
        User::new("bob", "password", LoginRole::User),
    ]
}
}

Arrays can never change in size, so with an array you are stuck with two users. Arrays do have the advantage of remaining on the stack, making them very fast to access.

Let's modify the login function to use this array:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> Option<LoginAction> {
    let users = get_users();
    if let Some(user) = users.iter().find(|user| user.username == username) {
        if user.password == password {
            return Some(LoginAction::Granted(user.role));
        } else {
            return Some(LoginAction::Denied);
        }
    }
    None
}
}

if let works just like match, but for a single case. You could also express the whole lookup as a single match over the result of find, if you prefer.

This doesn't compile. Enumerations aren't copyable by default, because there's no guarantee that the contents are copyable. Add #[derive(Clone)] to the LoginRole enumeration to make it clonable, and return a clone of the role:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> Option<LoginAction> {
    let users = get_users();
    if let Some(user) = users.iter().find(|user| user.username == username) {
        if user.password == password {
            return Some(LoginAction::Granted(user.role.clone()));
        } else {
            return Some(LoginAction::Denied);
        }
    }
    None
}
}

We can test this with the login program, which hasn't changed.

Vectors

Vectors are like arrays, but can change size. They are stored on the heap, so they are slower than arrays, but they are still fast. They are the most common collection type in Rust. Vectors guarantee that everything will be stored contiguously. They also allocate spare space - so you are using more memory than you need, but you don't have to reallocate as often. Vectors double in size every time you run out of capacity---which makes them fast, but you can waste a lot of memory if you don't need it.

The code for this example is in login_lib_vec and login_vec.

Let's create a vector of users:

#![allow(unused)]
fn main() {
pub fn get_users() -> Vec<User> {
    vec![
        User::new("admin", "password", LoginRole::Admin),
        User::new("bob", "password", LoginRole::User),
    ]
}
}

The vec! macro is a helper that moves a list of entries in array format into a vector. You can also do:

#![allow(unused)]
fn main() {
pub fn get_users() -> Vec<User> {
    let mut users = Vec::new();
    users.push(User::new("admin", "password", LoginRole::Admin));
    users.push(User::new("bob", "password", LoginRole::User));
    users
}
}

Now the great part is that the login function doesn't need to change. Iterators are standardized across most collection types, so you can use the same code for arrays and vectors.

Vector Growth

Tip: if you know how big a vector should be, you can create it with Vec::with_capacity(n) to avoid reallocation.

The code for this section is in vector_growth.

Let's create a quick side project to see how vectors grow. Create a new project with cargo new vector_growth. Don't forget to update the workspace! Add this to src/main.rs:

fn main() {
    let mut my_vector = Vec::new();
    for _ in 0..20 {
        my_vector.push(0);
        println!("Size: {}, Capacity: {}", my_vector.len(), my_vector.capacity());
    }
}

This shows:

Size: 1, Capacity: 4
Size: 2, Capacity: 4
Size: 3, Capacity: 4
Size: 4, Capacity: 4
Size: 5, Capacity: 8
Size: 6, Capacity: 8
Size: 7, Capacity: 8
Size: 8, Capacity: 8
Size: 9, Capacity: 16
Size: 10, Capacity: 16
Size: 11, Capacity: 16
Size: 12, Capacity: 16
Size: 13, Capacity: 16
Size: 14, Capacity: 16
Size: 15, Capacity: 16
Size: 16, Capacity: 16
Size: 17, Capacity: 32
Size: 18, Capacity: 32
Size: 19, Capacity: 32
Size: 20, Capacity: 32

Now imagine that you are downloading 1,000,000 items from a database. You want to be careful that you aren't using 2,000,000 capacity slots when you only need 1,000,000. You can use Vec::shrink_to_fit() to release the unused capacity (the allocator may keep a little slack, but the capacity will drop to roughly the vector's length). You can use Vec::reserve(n) to make room for at least n additional elements before you insert them, and Vec::with_capacity(n) to pre-allocate when you create the vector---avoiding repeated reallocation as it grows.
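
A minimal sketch of those capacity-management calls:

fn main() {
    // Pre-allocate when you know roughly how many items are coming.
    let mut v: Vec<u32> = Vec::with_capacity(1_000_000);
    println!("Capacity after with_capacity: {}", v.capacity());

    for i in 0..500_000 {
        v.push(i);
    }

    // Release the unused capacity (the allocator may keep a little slack).
    v.shrink_to_fit();
    println!("Capacity after shrink_to_fit: {}", v.capacity());

    // Make room for at least 250,000 *additional* elements.
    v.reserve(250_000);
    println!("Capacity after reserve: {}", v.capacity());
}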

Collecting from Iterators

You can collect from an iterator into a vector. This is useful if you want to filter or map a vector. For example, let's say that you want to get all of the users with a role of User. You can do this:

#![allow(unused)]
fn main() {
let users: Vec<User> = get_users().into_iter().filter(|u| u.role == LoginRole::User).collect();
}

Deleting from a Vector---using Retain

You can delete vector entries with retain. This will delete all users except for "kent". Retain takes a function---closure---that returns true if an entry should be kept.

#![allow(unused)]
fn main() {
users.retain(|u| u.username == "kent");
}

Deleting from a Vector---using Remove

You can delete vector entries with remove. This will delete the first user. Remove takes an index.

#![allow(unused)]
fn main() {
users.remove(0);
}

Deleting from a Vector---using Drain

Drain is a special type of delete. It will delete everything, and give it to you as an iterator on the way out. This is useful if you want to delete everything, but you want to do something with the data before you delete it.

#![allow(unused)]
fn main() {
let deleted_users: Vec<User> = users.drain(..).collect();
}

Or more usefully:

#![allow(unused)]
fn main() {
users.drain(..).for_each(|user| println!("Deleting {user:?}"));
}

Vectors really are a swiss-army knife: they can do almost anything. They are fast, and they are easy to use. They are the most common collection type in Rust.

HashMaps (aka Dictionaries or Maps)

The code for this is in login_lib_hashmap and login_hashmap.

Vectors are great, but they order data exactly as it was inserted. Using find requires that Rust read each record in turn and check to see if it's the record you were looking for. That's really, really fast---often faster than other techniques thanks to read-ahead caching in modern CPUs---but it can become slow when searching large data sets. Vectors also allow duplicate entries.

If you've used Dictionary types in other languages, this is the same thing.

First of all, HashMap isn't a type included in the default namespace. You have to use it. At the top of your library's lib.rs, add:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
}

For convenience later, let's decorate the User structure with Clone and Debug:

#![allow(unused)]
fn main() {
#[derive(Clone, Debug)]
pub struct User {
    pub username: String,
    pub password: String,
    pub role: LoginRole,
}
}

Now let's change get_users to create a HashMap. We'll use the username as the key:

#![allow(unused)]
fn main() {
pub fn get_users() -> HashMap<String, User> {
    let mut users = HashMap::new();
    users.insert("admin".to_string(), User::new("admin", "password", LoginRole::Admin));
    users.insert("bob".to_string(), User::new("bob", "password", LoginRole::User));
    users
}
}

We also need to change the login function. We can take advantage of HashMap's fast search by using get:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> Option<LoginAction> {
    let users = get_users();

    if let Some(user) = users.get(username) {
        if user.password == password {
            Some(LoginAction::Granted(user.role.clone()))
        } else {
            Some(LoginAction::Denied)
        }
    } else {
        None
    }
}
}

The rest of the program operates the same way. We can run it with cargo run and see that it works the same way as the vector version.

HashMap versus Vector

HashMap is fast, but in a lot of cases it isn't as fast as a vector. When inserting into a vector, the following occurs:

graph TD
    A[Insert] --> B[Check Capacity]
    B --> C(Expand Capacity)
    B --> D
    C --> D(Append Item)

Compare this with a HashMap insert:

graph TD
    A[Insert] --> B[Generate Hash Key]
    B --> C[Check Capacity]
    C --> D(Expand Capacity)
    D --> E
    C --> E(Append Item)

That's a whole additional operation, and generating a hash can be a slow process---especially if you are using a cryptographically sound hashing algorithm.

Let's do a quick benchmark program to see the difference:

This is available as hash_vec_bench.

use std::collections::HashMap;
const ELEMENTS: usize = 1_000_000;

fn main() {
    let mut my_vector = Vec::new();
    let now = std::time::Instant::now();
    for i in 0..ELEMENTS {
        my_vector.push(i);
    }
    let elapsed = now.elapsed();
    println!("Inserting {ELEMENTS} elements into a vector  took {} usecs", elapsed.as_micros());
    
    let mut my_hashmap = HashMap::new();
    let now = std::time::Instant::now();
    for i in 0..ELEMENTS {
        my_hashmap.insert(i, i);
    }
    let elapsed = now.elapsed();
    println!("Inserting {ELEMENTS} elements into a HashMap took {} usecs", elapsed.as_micros());
}

Running this in regular compile (debug), I get:

Inserting 1000000 elements into a vector  took 19066 usecs
Inserting 1000000 elements into a HashMap took 447122 usecs

Running in release mode with cargo run --release enables optimizations. This gets rid of some of the error-checking code, and makes the code run faster. I get:

Inserting 1000000 elements into a vector  took 5632 usecs
Inserting 1000000 elements into a HashMap took 68942 usecs

So you can see that inserting into a HashMap is a lot slower. But what about searching? Let's add a search to the benchmark:

This is found in the hash_vec_search project.

use std::collections::HashMap;
const ELEMENTS: usize = 1_000_000;

fn main() {
    let mut my_vector = Vec::new();
    for i in 0..ELEMENTS {
        my_vector.push(i);
    }

    let mut my_hashmap = HashMap::new();
    for i in 0..ELEMENTS {
        my_hashmap.insert(i, i);
    }

    // Nearly the worst case
    let element_to_find = ELEMENTS - 2;

    let now = std::time::Instant::now();
    let result = my_vector.iter().find(|n| **n == element_to_find);
    println!("{result:?}");
    let elapsed = now.elapsed();
    println!("Vector search took {} usecs", elapsed.as_micros());
    
    let now = std::time::Instant::now();
    let result = my_hashmap.get(&element_to_find);
    println!("{result:?}");
    let elapsed = now.elapsed();
    println!("HashMap search took {} usecs", elapsed.as_micros());
}

Running in regular/debug mode:

Some(999998)
Vector search took 9413 usecs
Some(999998)
HashMap search took 110 usecs

In release mode (cargo run --release):

Some(999998)
Vector search took 1054 usecs
Some(999998)
HashMap search took 107 usecs

So release mode massively improves vector performance, and only slightly improves HashMap performance. But the HashMap is still much faster for searching.

Takeaway: Use HashMap when you are searching larger amounts of data, and Vec when searching isn't your primary task.

Serialization / Deserialization

You probably don't want to hand-type your list of users and recompile every time users change! You might use a local passwords file, or even a database. In this section, we'll look at how to serialize and deserialize data to and from a file.

The code for this is in login_lib_json and login_json.

Dependencies

Serde is the de-facto standard serialization/deserialization library. It's very flexible, and can be used to serialize to and from JSON, XML, YAML, and more. We'll use JSON here.

The first thing to do is to add some dependencies to your auth project.

You need the serde crate, with the feature derive. Run:

cargo add serde -F derive

You also need serde_json:

cargo add serde_json

These commands make your Cargo.toml file look like this:

[package]
name = "login_lib_json"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { version = "1.0.163", features = ["derive"] }
serde_json = "1.0.96"

Making Data Serializable

Import the Serialize and Deserialize macros:

#![allow(unused)]
fn main() {
use serde::{Serialize, Deserialize};
}

Then decorate your types with #[derive(Serialize, Deserialize)]:

#![allow(unused)]
fn main() {
#[derive(PartialEq, Debug, Serialize, Deserialize)]
pub enum LoginAction {
    Granted(LoginRole),
    Denied,
}

#[derive(PartialEq, Debug, Clone, Serialize, Deserialize)]
pub enum LoginRole {
    Admin,
    User,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct User {
    pub username: String,
    pub password: String,
    pub role: LoginRole,
}
}

The macros write all the hard code for you. The only requirement is that every type you are including must also support Serialize and Deserialize. You can implement traits and write the serialization by hand if you prefer - but it's very verbose.

Now let's change our "get_users" system to work with a JSON file.

Serializing to JSON

First, we add a get_default_users function. If there isn't a users file, we'll use this to make one:

#![allow(unused)]
fn main() {
fn get_default_users() -> HashMap<String, User> {
    let mut users = HashMap::new();
    users.insert("admin".to_string(), User::new("admin", "password", LoginRole::Admin));
    users.insert("bob".to_string(), User::new("bob", "password", LoginRole::User));
    users
}
}

Next, let's change the get_users function to look for a users.json file and see if it exists (you'll also need use std::path::Path; at the top of lib.rs):

#![allow(unused)]
fn main() {
pub fn get_users() -> HashMap<String, User> {
    let users_path = Path::new("users.json");
    if users_path.exists() {
        // Load the file
        HashMap::new()
    } else {
        // Create a file and return it
        let users = get_default_users();
        let users_json = serde_json::to_string(&users).unwrap();
        std::fs::write(users_path, users_json).unwrap();
        users
    }
}
}

That's all there is to creating a JSON file! We use serde_json::to_string to convert our users HashMap into a JSON string, and then write it to the file. Run the program, and users.json will appear:

{"bob":{"username":"bob","password":"password","role":"User"},"admin":{"username":"admin","password":"password","role":"Admin"}}

Deserializing from JSON

Let's extend the get_users function to read from users.json if it exists:

#![allow(unused)]
fn main() {
pub fn get_users() -> HashMap<String, User> {
    let users_path = Path::new("users.json");
    if users_path.exists() {
        // Load the file
        let users_json = std::fs::read_to_string(users_path).unwrap();
        let users: HashMap<String, User> = serde_json::from_str(&users_json).unwrap();
        users
    } else {
        // Create a file and return it
        let users = get_default_users();
        let users_json = serde_json::to_string(&users).unwrap();
        std::fs::write(users_path, users_json).unwrap();
        users
    }
}
}

Equally simple - you load the file, deserialize it with serde_json::from_str, and you're done! You can now edit the JSON file, and your changes will be loaded when a user tries to login.
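
If you plan to edit the file by hand, you may prefer human-readable output. serde_json also provides a pretty-printer---a sketch of the change in the "create the file" branch:

let users_json = serde_json::to_string_pretty(&users).unwrap();
std::fs::write(users_path, users_json).unwrap();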

Let's change admin's password to password2 and test it.

Hashing Passwords

The code for this section is in login_lib_hash and login_hash.

You probably don't want to be saving passwords in plain text. It's high on the list of "programming oopsies" that lead to security issues.

Instead, you should hash the password. Hashing is a one-way function that takes a string and returns a fixed-length string. It's not possible to reverse the process, so you can't get the original password back from the hash.

Dependencies

We want to add another dependency, this time on the sha2 crate. You can either run cargo add sha2 or edit the Cargo.toml file yourself:

[package]
name = "login_lib_hash"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { version = "1.0.163", features = ["derive"] }
serde_json = "1.0.96"
sha2 = "0"

Now open lib.rs and add a new function to actually hash passwords:

#![allow(unused)]
fn main() {
pub fn hash_password(password: &str) -> String {
    use sha2::Digest;
    let mut hasher = sha2::Sha256::new();
    hasher.update(password);
    format!("{:X}", hasher.finalize())
}
}

Hashing Passwords as Users are Added

We're creating the users.json file as needed, but we're not hashing the passwords. Let's fix that. If you have a users.json file, delete it - so we can start afresh.

Change the User constructor to automatically hash the password it is given:

#![allow(unused)]
fn main() {
impl User {
    pub fn new(username: &str, password: &str, role: LoginRole) -> User {
        User {
            username: username.to_lowercase(),
            password: hash_password(password),
            role,
        }
    }
}
}

If you run the program, your login will fail - but you'll see that the users.json file now has a hashed password:

{
    "admin": {
        "username": "admin",
        "password": "5E884898DA28047151D0E56F8DC6292773603D0D6AABBDD62A11EF721D1542D8",
        "role": "Admin"
    },
    "bob": {
        "username": "bob",
        "password": "5E884898DA28047151D0E56F8DC6292773603D0D6AABBDD62A11EF721D1542D8",
        "role": "User"
    }
}

That's definitely harder to guess! Now we can update the login function to hash the incoming password and compare it to the stored hash:

#![allow(unused)]
fn main() {
pub fn login(username: &str, password: &str) -> Option<LoginAction> {
    let users = get_users();
    let password = hash_password(password);

    if let Some(user) = users.get(username) {
        if user.password == password {
            Some(LoginAction::Granted(user.role.clone()))
        } else {
            Some(LoginAction::Denied)
        }
    } else {
        None
    }
}
}

We've added one line - replacing the password with a hashed version. Run the login program now, and it should work.

Building a Login Manager App

We've already built a moderately useful login system: it can read users from a JSON file, creating a default if necessary. Logins are checked, passwords are hashed, and different login roles work. Let's spend the rest of our time together building a login_manager application that provides a command-line interface to our login system.

Creating a New Project

Create a new login_manager project:

cargo new login_manager

Open the parent Cargo.toml and add login_manager to the workspace.

Now add the auth library to your login_manager's Cargo.toml file:

[dependencies]
auth = { path = "../auth" }

Creating a CLI

The de-facto standard approach to building CLI applications is provided by a crate named clap. Add it with:

cargo add clap -F derive

Clap does a lot, and the "derive" feature adds some useful macros to reduce the amount of typing we need to do.

Let's create a minimal example and have a look at what Clap is doing for us:

use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command()]
struct Args {
  #[command(subcommand)]
  command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
  /// List all users.
  List,
}


fn main() {
    let cli = Args::parse();
    match cli.command {
        Some(Commands::List) => {
            println!("All Users Goes Here\n");
        }
        None => {
            println!("Run with --help to see instructions");
            std::process::exit(0);
        }
    }    
}

This has added a surprising amount of functionality!

cargo run on its own emits Run with --help to see instructions. Clap has added --help for us.

Running cargo and then passing command-line arguments through uses some slightly strange syntax. Let's give --help a go:

cargo run -- --help

Usage: login_manager.exe [COMMAND]

Commands:
  list  List all users
  help  Print this message or the help of the given subcommand(s)

Options:
  -h, --help  Print help

You can even ask it for help about the list feature:

List all users

Usage: login_manager.exe list

Options:
  -h, --help  Print help

Now, let's implement the list command.

fn list_users() {
    println!("{:<20}{:<20}", "Username", "Login Action");
    println!("{:-<40}", "");

    let users = get_users();
    users
        .iter()
        .for_each(|(_, user)| {
            println!("{:<20}{:<20?}", user.username, user.role);
        });
}

fn main() {
    let cli = Args::parse();
    match cli.command {
        Some(Commands::List) => list_users(),
        None => {
            println!("Run with --help to see instructions");
            std::process::exit(0);
        }
    }    
}

Now running cargo run -- list gives us:

Username            Login Action        
----------------------------------------
admin               Admin
bob                 User

Adding Users

We're going to need a way to save the users list, so in the auth library let's add a function:

#![allow(unused)]
fn main() {
pub fn save_users(users: &HashMap<String, User>) {
    let users_path = Path::new("users.json");
    let users_json = serde_json::to_string(&users).unwrap();
    std::fs::write(users_path, users_json).unwrap();
}
}

This is the same as what we did before---but exposed as a function.

Let's add an "add" option. It will take parameters: you need to provide a username and a password, and indicate whether the user is an administrator:

#![allow(unused)]
fn main() {
#[derive(Subcommand)]
enum Commands {
  /// List all users.
  List,
  /// Add a user.
  Add {
    /// Username
    username: String,

    /// Password
    password: String,

    /// Optional - mark as an admin
    #[arg(long)]
    admin: Option<bool>,
  }
}
}

Add a dummy entry to the match statement:

#![allow(unused)]
fn main() {
Some(Commands::Add { username, password, admin }) => {},
}

And run cargo run -- add --help to see what Clap has done for us:

Add a user

Usage: login_manager.exe add [OPTIONS] <USERNAME> <PASSWORD>

Arguments:
  <USERNAME>  Username
  <PASSWORD>  Password

Options:
      --admin <ADMIN>  Optional - mark as an admin [possible values: true, false]
  -h, --help           Print help

Now we can implement the add command:

fn add_user(username: String, password: String, admin: bool) {
    let mut users = get_users();
    let role = if admin {
        LoginRole::Admin
    } else {
        LoginRole::User
    };
    let user = User::new(&username, &password, role);
    users.insert(username, user);
    save_users(&users);
}

fn main() {
    let cli = Args::parse();
    match cli.command {
        Some(Commands::List) => list_users(),
        Some(Commands::Add { username, password, admin }) => 
            add_user(username, password, admin.unwrap_or(false)),
        None => {
            println!("Run with --help to see instructions");
            std::process::exit(0);
        }
    }    
}

And now you can run cargo run -- add fred password and see the new user in the list.

{
    "fred": {
        "username": "fred",
        "password": "5E884898DA28047151D0E56F8DC6292773603D0D6AABBDD62A11EF721D1542D8",
        "role": "User"
    },
    "admin": {
        "username": "admin",
        "password": "5E884898DA28047151D0E56F8DC6292773603D0D6AABBDD62A11EF721D1542D8",
        "role": "Admin"
    },
    "bob": {
        "username": "bob",
        "password": "5E884898DA28047151D0E56F8DC6292773603D0D6AABBDD62A11EF721D1542D8",
        "role": "User"
    }
}

Let's add one more thing. Warn the user if a duplicate occurs:

#![allow(unused)]
fn main() {
fn add_user(username: String, password: String, admin: bool) {
    let mut users = get_users();
    if users.contains_key(&username) {
        println!("{username} already exists");
        return;
    }
    // ... then create, insert and save the user as before
}
}

Deleting Users

Let's add a delete command. This will take a username and remove it from the list:

#![allow(unused)]
fn main() {
#[derive(Subcommand)]
enum Commands {
    /// List all users.
    List,
    /// Add a user.
    Add {
        /// Username
        username: String,

        /// Password
        password: String,

        /// Optional - mark as an admin
        #[arg(long)]
        admin: Option<bool>,
    },
    /// Delete a user
    Delete {
        /// Username
        username: String,
    },
}
}

As expected, --help and cargo run -- delete --help have been updated.

Now let's implement the deletion:

#![allow(unused)]
fn main() {
fn delete_user(username: &str) {
    let mut users = get_users();
    if users.contains_key(username) {
        users.remove(username);
        save_users(&users);
    } else {
        println!("{username} does not exist");
    }
}
}

And add it to the command matcher:

#![allow(unused)]
fn main() {
Some(Commands::Delete { username }) => delete_user(&username),
}

You can now remove fred from the list with cargo run -- delete fred. Check that he's gone with cargo run -- list:

Username            Login Action
----------------------------------------
bob                 User
admin               Admin

Changing Passwords

You've got the Create, Read and Delete of "CRUD" - let's add some updating!

A command to change the user's password is in order. This will take a username and a new password:

#![allow(unused)]
fn main() {
enum Commands {
    /// List all users.
    List,
    /// Add a user.
    Add {
        /// Username
        username: String,

        /// Password
        password: String,

        /// Optional - mark as an admin
        #[arg(long)]
        admin: Option<bool>,
    },
    /// Delete a user
    Delete {
        /// Username
        username: String,
    },
    /// Change a password
    ChangePassword {
        /// Username
        username: String,

        /// New Password
        new_password: String,
    },
}
}

And let's implement it:

#![allow(unused)]
fn main() {
fn change_password(username: &str, password: &str) {
    let mut users = get_users();
    if let Some(user) = users.get_mut(username) {
        user.password = auth::hash_password(password);
        save_users(&users);
    } else {
        println!("{username} does not exist");
    }
}
}

And add it to the match:

#![allow(unused)]
fn main() {
Some(Commands::ChangePassword { username, new_password }) => {
    change_password(&username, &new_password)
}
}

Go ahead and test changing a password.

Threaded Concurrency

System threads are the most basic unit of concurrency. Rust has excellent threading support---and makes it really difficult to corrupt your data with a race condition.

Create Your First Thread

This uses the first_thread code, in code/02_threads.

Create a new project - with a workspace

Looking back at the workspaces class from last week, it's a great idea to have a workspace. Let's create one:

cargo new LiveWeek2

Now edit Cargo.toml to include a workspace:

[workspace]
members = []

Now change directory to the LiveWeek2 directory and create a new project named FirstThread:

cd LiveWeek2
cargo new FirstThread

And add the project to the workspace:

[workspace]
members = [
    "FirstThread"
]

Your First Thread

In main.rs, replace the contents with the following:

fn hello_thread() {
    println!("Hello from thread!");
}

fn main() {
    println!("Hello from main thread!");

    let thread_handle = std::thread::spawn(hello_thread);
    thread_handle.join().unwrap();
}

Now run the program:

Hello from main thread!
Hello from thread!

So what's going on here? Let's break it down:

  1. The program starts in the main thread.
  2. The main thread prints a message.
  3. We create a thread using std::thread::spawn and tell it to run the function hello_thread.
  4. The return value is a "thread handle". You can use these to "join" threads---wait for them to finish.
  5. We call join on the thread handle, which waits for the thread to finish.

What happens if we don't join the thread?

Run the program a few times. Sometimes the secondary thread finishes, sometimes it doesn't. Threads don't outlive the main program, so if the main program exits before the thread finishes, the thread is killed.
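
A minimal sketch of the same program without the join call shows why it matters:

fn hello_thread() {
    println!("Hello from thread!");
}

fn main() {
    println!("Hello from main thread!");
    let _handle = std::thread::spawn(hello_thread);
    // No join: main may exit before the spawned thread runs,
    // so the second message may or may not appear.
}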

Spawning Threads with Parameters

This uses the thread_closures code, in code/02_threads.

The spawn function takes a function without parameters. What if we want to pass parameters to the thread? We can use a closure:

fn hello_thread(n: u32) {
    println!("Hello from thread {n}!");
}

fn main() {
    let mut thread_handles = Vec::new();
    for i in 0 .. 5 {
        let thread_handle = std::thread::spawn(move || hello_thread(i));
        thread_handles.push(thread_handle);
    }
    thread_handles.into_iter().for_each(|h| h.join().unwrap());
}

Notice three things:

  • We're using a closure---an inline function that can capture variables from the surrounding scope.
  • We've used the shorthand format for closures: || code - parameters live in the || (there aren't any), and a single statement goes after the ||. You can use complex closures with a scope: |x,y| { code block }.
  • The closure says move. Remember when we talked about ownership? You have to move variables into the closure, so the closure gains ownership of them. The ownership is then passed to the thread. Otherwise, you have to use some form of synchronization to ensure that data is independently accessed---to avoid race conditions.

The output will look something like this (the order of the threads will vary):

Hello from thread 0!
Hello from thread 2!
Hello from thread 1!
Hello from thread 4!
Hello from thread 3!

In this case, as we talked about last week in Rust Fundamentals, integers are copyable. So you don't have to do anything fancy to share them.

Returning Data from Threads

See the code thread_return in code/02_threads.

The thread handle will return any value returned by the thread. It's generic, so it can be of any type (any type that implements Send; we'll cover Send and Sync later). Each thread has its own stack, and can make normal variables inside the thread---and they won't be affected by other threads.

Let's build an example:

fn do_math(i: u32) -> u32 {
    let mut n = i+1;
    for _ in 0 .. 10 {
        n *= 2;
    }
    n
}

fn main() {
    let mut thread_handles = Vec::new();
    for i in 0..10 {
        thread_handles.push(std::thread::spawn(move || {
            do_math(i)
        }));
    }

    for handle in thread_handles {
        println!("Thread returned: {}", handle.join().unwrap());
    }
}

This returns:

Thread returned: 1024
Thread returned: 2048
Thread returned: 3072
Thread returned: 4096
Thread returned: 5120
Thread returned: 6144
Thread returned: 7168
Thread returned: 8192
Thread returned: 9216
Thread returned: 10240

Notice that each thread is doing its own math, and returning its own value. The join function waits for the thread to finish, and returns the value from the thread.

Dividing Workloads

The code for this is in divide_workload, in the code/02_threads folder.

We can use threads to divide up a workload. Let's say we have a vector of numbers, and we want to add them all up. We can divide the vector into chunks, and have each thread add up its own chunk. Then we can add up the results from each thread.

fn main() {
    const N_THREADS: usize = 8;

    let to_add: Vec<u32> = (0..5000).collect(); // Shorthand for building a vector [0,1,2 .. 4999]
    let mut thread_handles = Vec::new();
    // chunks() takes a chunk *size*, so divide the length by the thread count to get N_THREADS chunks.
    let chunks = to_add.chunks(to_add.len() / N_THREADS);

    // Notice that each chunk is a *slice* - a reference - to part of the array.    
    for chunk in chunks {
        // So we *move* the chunk into its own vector, taking ownership and
        // passing that ownership to the thread. This adds a `memcpy` call
        // to your code, but avoids ownership issues.
        let my_chunk = chunk.to_owned();

        // Each thread sums its own chunk. You could use .sum() for this!
        thread_handles.push(std::thread::spawn(move || {
            let mut sum = 0;
            for i in my_chunk {
                sum += i;
            }
            sum
        }));
    }

    // Sum the sums from each thread.
    let mut sum = 0;
    for handle in thread_handles {
        sum += handle.join().unwrap();
    }
    println!("Sum is {sum}");
}

There's a lot to unpack here, so I've added comments:

  1. We use a constant to define how many threads we want to use. This is a good idea, because it makes it easy to change the number of threads later. We'll use 8 threads, because my laptop happens to have 8 cores.
  2. We create a vector of numbers to add up. We use the collect function to build a vector from an iterator. We'll cover iterators later, but for now, just know that collect builds a vector from a range. This is a handy shorthand for turning any range into a vector.
  3. We create a vector of thread handles. We'll use this to join the threads later.
  4. We use the chunks function to divide the vector into chunks. This returns an iterator, so we can use it in a for loop. Every chunk has the requested size except possibly the last one, which may be smaller if the numbers don't divide evenly.
  5. Now we hit a problem:
    • to_add is a vector owned by the main thread, and chunks borrows from it.
    • Each chunk is a slice --- a borrowed reference --- to part of the vector.
    • We can't pass a borrowed reference to a thread, because the thread might outlive the main thread. There's no guarantee that the order of execution will ensure that the data is destroyed in a safe order.
    • Instead, we use to_owned which creates an owned copy of each chunk. This is a memcpy operation, so it's not free, but it's safe.

This is a common pattern when working with threads. You'll often need to move data into the thread, rather than passing references.

Moving chunks like this works fine, but if you are using threads to divide up a heavy workload with a single answer --- there's an easier way!

The ThreadBuilder Pattern

The code for this is in the thread_builder example, in the code/02_threads directory.

Sometimes, you want more control over the creation of a thread. Rust implements a builder pattern to help you with this.

Let's build a quick example:

use std::thread;

fn my_thread() {
    println!("Hello from a thread named {}", thread::current().name().unwrap());
}

fn main() {
    thread::Builder::new()
        .name("Named Thread".to_string())
        .stack_size(std::mem::size_of::<usize>() * 4)
        .spawn(my_thread).unwrap();
}

We've named the thread. This doesn't actually do much, but it's nice to have a name for the thread when debugging. You can reference the current thread name in log messages to help you figure out what's going on, and some debuggers will display the thread name.

We've also set the stack size. This is the amount of memory that the thread will have available for its stack. The default is 2MB, but you can set it to whatever you want. In this case, we've set it to 4 times the size of a pointer (16 bytes on a 32-bit system, 32 bytes on a 64-bit system). This is a tiny stack, but it's enough for this example.

Setting the stack size is useful if you are running a lot of threads and want to reduce the memory overhead of each one. If you have a lot of threads and they're not doing much, you can reduce the stack size to save memory. If you don't allocate enough stack, your thread will crash when it tries to use more stack than you've allocated. Most of the time---you don't need to set this!

Scoped Threads

The code for this is in scoped_threads, in the code/02_threads folder.

In the previous example we divided our workload into chunks and then took a copy of each chunk. That works, but it adds some overhead. Rust has a mechanism to assist with this pattern (it's a very common pattern): scoped threads.

Let's build an example:

use std::thread;

fn main() {
    const N_THREADS: usize = 8;

    let to_add: Vec<u32> = (0..5000).collect();
    // As before, divide the length by the thread count so we get N_THREADS chunks.
    let chunks = to_add.chunks(to_add.len() / N_THREADS);
    let sum = thread::scope(|s| {
        let mut thread_handles = Vec::new();

        for chunk in chunks {
            let thread_handle = s.spawn(move || {
                let mut sum = 0;
                for i in chunk {
                    sum += i;
                }
                sum
            });
            thread_handles.push(thread_handle);
        }

        thread_handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .sum::<u32>()
    });
    println!("Sum is {sum}");
}

This is quite similar to the previous example, but we're using scoped threads. When you use thread::scope you are creating a thread scope. Any threads you spawn with the s parameter are guaranteed to end when the scope ends, and you can still treat each one just like a normal thread.

Because the threads are guaranteed to terminate, you can safely borrow data from the parent scope. This is a lifetime issue: a normal thread could keep running for a long time, past the time the scope that launched it ends---so borrowing data from that scope would be a bug (and a common cause of crashes and data corruption in other languages). Rust won't let you do that. But since you have the guarantee of lifetime, you can borrow data from the parent scope without having to worry about it.

This pattern is perfect for when you want to fan out a workload to a set of calculation threads, and wait to combine them into an answer.

Sharing Data with Read/Write Locks

It's a really common pattern to have some data that changes infrequently, mostly accessed by worker threads---but occasionally, you need to change it.

The code for this is in rwlock in the code/02_threads directory.

We're going to use once_cell in this example, so add it with cargo add once_cell. once_cell is on its way into the standard library.

Let's build a simple example of this in action:

use std::sync::RwLock;
use once_cell::sync::Lazy;

static USERS: Lazy<RwLock<Vec<String>>> = Lazy::new(|| RwLock::new(build_users()));

fn build_users() -> Vec<String> {
    vec!["Alice".to_string(), "Bob".to_string()]
}

// Borrowed from last week!
pub fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("Failed to read line");
    input.trim().to_string()
}

fn main() {
    std::thread::spawn(|| {
        loop {
            println!("Current users (in a thread)");
            let users = USERS.read().unwrap();
            println!("{users:?}");
            std::thread::sleep(std::time::Duration::from_secs(3));
        }
    });

    loop {
        println!("Enter a name to add to the list (or 'q' to quit):");
        let input = read_line();
        if input == "q" {
            break;
        }
        let mut users = USERS.write().unwrap();
        users.push(input);
    }
}

Notice that we've used the Lazy pattern we talked about last week: the static variable is only initialized when someone looks at it.

We've wrapped the list of names in an RwLock. This is like a Mutex, but you can either read or write to it. You can have multiple readers, but only one writer.

Uncontested reads are very fast. Acquiring a write lock is slightly slower than locking a Mutex, but not by much.

Deadlocks

One particularly nasty side-effect of locking your data is a deadlock. A Mutex or Read/Write lock blocks until access is obtained. If you try to lock a mutex twice (or obtain write access to an RwLock twice)---even in the same thread---you'll deadlock. The thread will block forever, waiting for itself to release the lock.

Rust can't prevent deadlocks. It provides mechanisms to help avoid them, but you have to use them correctly.

Deadlocking Example

The code for this section is in the deadlocks directory in the code/02_threads directory.

Here's a simple way to completely lock up your program:

use std::sync::Mutex;

fn main() {
    let my_shared = Mutex::new(0);

    let lock = my_shared.lock().unwrap();
    let lock = my_shared.lock().unwrap();
}

The program never stops! By trying to acquire the Mutex twice, the thread deadlocks: it's waiting for itself to release the lock.

Try Locking

One way to avoid deadlocks is to use try_lock instead of lock:

use std::sync::Mutex;

static MY_SHARED : Mutex<u32> = Mutex::new(0);

fn main() {

    if let Ok(_lock) = MY_SHARED.try_lock() {
        println!("I got the lock!");

        // Try again, but this time, the lock is already taken
        if let Ok(_lock) = MY_SHARED.try_lock() {
            println!("I got the lock!");
        } else {
            println!("I couldn't get the lock!");
        }

    } else {
        println!("I couldn't get the lock!");
    }
}

The downside here is that try_lock doesn't wait. It either succeeds or it fails. If it fails, you can try again later, but you have to be careful not to try too often. If you try too often, you'll end up with a busy loop---hitting the CPU over and over again.
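
If you do need to retry, sleeping between attempts keeps the loop from spinning. Here's a minimal sketch of that idea (the 10 ms delay is an arbitrary choice):

use std::sync::Mutex;
use std::time::Duration;

static MY_SHARED: Mutex<u32> = Mutex::new(0);

fn main() {
    loop {
        if let Ok(mut lock) = MY_SHARED.try_lock() {
            *lock += 1;
            println!("Got the lock; the value is now {}", *lock);
            break;
        }
        // Give other threads a chance rather than hammering the CPU.
        std::thread::sleep(Duration::from_millis(10));
    }
}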

It's better to not write any deadlocks!

Explicitly Dropping Locks

Locks are "scope guarded". They implement Drop, which we'll talk about in the memory management class. If you're used to C++, it's the RAII---Resource Acquisition Is Initialization---pattern.

When a lock goes out of scope, it is automatically released. So you don't need to worry about releasing a lock in normal code---it's done for you.

Sometimes, you want to explicitly get rid of a lock. Maybe you're in a function that locks to check something, and later locks again to do something else. You can explicitly drop the lock to release it early. You may want to do this if you're going to be doing a lot of work between locks, and you don't want to hold the lock for that long.

#![allow(unused)]
fn main() {
let lock = MY_SHARED.lock().unwrap();
std::mem::drop(lock);
let lock = MY_SHARED.lock().unwrap();
}

That's ugly, but it works. Calling "drop" invalidates the first lock---you can no longer use that variable.

Cleaning Up Locks with Scopes

A prettier way to do the same thing is to manually introduce a scope. (If you find yourself adding scopes like this, it may read better as a separate function!) This is the same as the previous example, but the "drop" is implicit because of the scope:

#![allow(unused)]
fn main() {
// Using a scope to drop a lock
{
    let _lock = MY_SHARED.lock().unwrap();
    println!("I got the lock!");
}
let _lock = MY_SHARED.lock().unwrap();
println!("I got the lock again!");
}

Mutex Poisoning

The code for this is in the mutex_poisoning directory in the code/02_threads directory.

If a thread crashes/panics while holding a lock, the lock becomes poisoned. This is a safety feature of Rust: since the thread crashed, you can't be sure that the contents of the lock is safe. So the lock is poisoned, and any attempt to lock it will fail.

Let's have a look:

use std::sync::Mutex;

static DATA: Mutex<u32> = Mutex::new(0);

fn poisoner() {
    let mut lock = DATA.lock().unwrap();
    *lock += 1;
    panic!("And poisoner crashed horribly");
}

fn main() {
    let handle = std::thread::spawn(poisoner);
    println!("Trying to return from the thread:");
    println!("{:?}", handle.join());
    println!("Locking the Mutex after the crash:");
    let lock = DATA.lock();
    println!("{lock:?}");
}

This gives the following output:

Trying to return from the thread:
thread '<unnamed>' panicked at 'And poisoner crashed horribly', 02_threads\mutex_poisoning\src\main.rs:8:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Err(Any { .. })
Locking the Mutex after the crash:
Err(PoisonError { .. })

So what's happening here?

  1. The thread runs and explicitly panics.
  2. Panicking in a thread doesn't crash the whole program (unless you've setup a panic handler to do so).
  3. Joining a thread that crashed returns an error, telling you that the thread crashed.
  4. Since the thread crashed while it had the Mutex locked, the Mutex is poisoned.
  5. Any attempt to lock the Mutex will fail.

So the glib answer to avoiding this is "don't crash". Handle errors, be a good program. That's great, but the real world doesn't always work that way. Sometimes, you have to deal with crashes.

Recovering from Poisoning

In a lot of cases, since you've experienced a failure, the correct thing to do is crash! It's often safer to crash than to propagate bad data.

This is one time that unwrap() is great---your Mutex is dangerous, so unwrap it and crash:

#![allow(unused)]
fn main() {
let lock = DATA.lock().unwrap();
}

Again, in most cases I don't really recommend it, but the PoisonError type contains the locked mutex, and you can use it to obtain the data. There's absolutely no guarantee that the data is in good shape; it depends on why the thread crashed. So be careful!

#![allow(unused)]
fn main() {
// Let's try to save the day by recovering the data from the Mutex
let recovered_data = lock.unwrap_or_else(|poisoned| {
    println!("Mutex was poisoned, recovering data...");
    poisoned.into_inner()
});
println!("Recovered data: {recovered_data:?}");
}

Sharing Data with Lock-Free Structures

We've covered using locking to safely share data, and atomics to safely share some data types without locks---but there's a third choice. Some data structures are "lock free", and can be shared between threads without locks or atomics. This is a very advanced topic---we'll touch on more of it in a couple of weeks. For now, let's look at a couple of pre-made crates that can help us share data without locks.

DashMap and DashSet

If you have data that fits well into a HashMap or HashSet, DashMap is a great choice. It's a concurrent hash map that can be shared between threads: it manages its own fine-grained interior locking, so from your code's perspective it behaves like a lock-free structure. It's a great choice for a lot of use cases.

See the lockfree_map project in the code/02_threads directory.

Let's add dashmap to our Cargo.toml with cargo add dashmap. We'll use once_cell again for initialization (cargo add once_cell).

Then we'll write a program to use it:

use std::time::Duration;
use dashmap::DashMap;
use once_cell::sync::Lazy;

static SHARED_MAP: Lazy<DashMap<u32, u32>> = Lazy::new(DashMap::new);

fn main() {
    for n in 0..100 {
        std::thread::spawn(move || {
            loop {
                if let Some(mut v) = SHARED_MAP.get_mut(&n) {
                    *v += 1;
                } else {
                    SHARED_MAP.insert(n, n);
                }
            }
        });
    }

    std::thread::sleep(Duration::from_secs(5));
    println!("{SHARED_MAP:#?}");
}

This sleeps for 5 seconds while 100 threads insert or update data. There are no explicit locks or atomics in our code; it's all handled by the concurrent data structure.

You can use DashMap and DashSet just like you would a regular HashMap and HashSet, with the exception that iterators are a little different. Instead of iterating on a tuple of (key, value), you access just values and call the key() function to obtain the key.

Let's extend the example to show this:

#![allow(unused)]
fn main() {
for v in SHARED_MAP.iter() {
    println!("{}: {}", v.key(), v.value());
}
}

Parking Threads

The code for this is in parking, in the code/02_threads directory.

We've talked about how advantageous it can be to create a thread and reuse it. Sometimes, you want to run a threaded task occasionally---but still in the background---and with as little latency as possible. One way to do this is to "park" the thread, which means to put it to sleep until it's needed again. Parked threads consume almost no CPU time, and can be woken up very quickly.

Let's build an example of parking threads:

fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("Failed to read line");
    input.trim().to_string()
}

fn parkable_thread(n: u32) {
    loop {
        std::thread::park();
        println!("Thread {n} is awake - briefly!");
    }
}

fn main() {
    let mut threads = Vec::new();
    for i in 0..10 {
        let thread = std::thread::spawn(move || parkable_thread(i));
        threads.push(thread);
    }

    loop {
        println!("Enter a thread number to awaken, or q to quit");
        let input = read_line();
        if input == "q" {
            break;
        }
        if let Ok(number) = input.parse::<u32>() {
            if number < threads.len() as u32 {
                threads[number as usize].thread().unpark();
            }
        }
    }
}

Notice that a thread parks itself: calling park suspends the thread that calls it, and some other thread wakes it up later with unpark. In this example, we park 10 threads and let the user choose which one to wake up from the keyboard.

This can be very useful if you have a monitor thread that detects an event and wakes up the relevant worker when it's needed. It has the downside that you aren't sending any data to the thread. We'll talk about that next.

Sending Data Between Threads with Channels

Parking a thread is great, but you often need to tell a thread why you woke it up, or give it some data to work with. This is where channels come in.

If you're used to Go, channels should sound familiar. They are very similar to Go's channels. A few differences:

  • Rust Channels are strongly typed. So you can use a sum type/enum to act like a command pattern.
  • Rust Channels can be bounded or unbounded. The default mpsc::channel is unbounded; mpsc::sync_channel is bounded, and sending blocks when the channel is full.
  • Rust Channels are unidirectional. You can't send data back to the sender. (You can make another channel)
  • You can't forget to close a channel. Once a channel is out of scope, the "drop" system (we'll talk about that in a couple of weeks) will close the channel for you.

Multi-Producer, Single Consumer Channels

See the mpsc project in the code/02_threads directory.

The most basic type of channel is the MPSC channel: any number of producers can send a message to a single consumer. Let's build a simple example:

use std::sync::mpsc;

enum Command {
    SayHello, Quit
}

fn main() {
    let (tx, rx) = mpsc::channel::<Command>();

    let handle = std::thread::spawn(move || {
        while let Ok(command) = rx.recv() {
            match command {
                Command::SayHello => println!("Hello"),
                Command::Quit => {
                    println!("Quitting now");
                    break;
                }
            }
        }
    });

    for _ in 0 .. 10 {
        tx.send(Command::SayHello).unwrap();
    }
    println!("Sending quit");
    tx.send(Command::Quit).unwrap();
    handle.join().unwrap();
}

This is a relatively simple example. We're only sending messages to one thread, and not trying to send anything back. We're also not trying to send anything beyond a simple command. But this is a great pattern---you can extend the Command to include lots of operations, and you can send data along with the command. Threads can send to other threads, and you can clone the tx handle to have as many writers as you want.
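
Here's a minimal sketch of cloning the transmitter so several producer threads share one receiver (the thread count is arbitrary):

use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // Each producer thread gets its own clone of the sender.
    let mut handles = Vec::new();
    for i in 0..3 {
        let my_tx = tx.clone();
        handles.push(std::thread::spawn(move || {
            my_tx.send(format!("Hello from producer {i}")).unwrap();
        }));
    }
    // Drop the original sender so the receive loop ends once all clones are gone.
    drop(tx);

    while let Ok(message) = rx.recv() {
        println!("{message}");
    }
    handles.into_iter().for_each(|h| h.join().unwrap());
}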

We're going to build on the channel system after the break.

Channels and Ownership

Channels are an easy way to send data between threads, but ownership becomes a question.

Trying to pass a reference into a channel becomes problematic fast. Unless you can guarantee that the calling thread will outlive the data---and retain the data in a valid state---you can't pass a reference. The "lifetime checker" part of the borrow checker will complain.

The easiest approach is to move the data. The data arrives in one thread, which owns it. Rather than cloning, we move the data into the channel. The channel then owns the data, and can move it to another thread if need be. There's never any question of ownership; it's always clear who owns the data.

Let's look at an example:

use std::sync::mpsc;

// Not copyable or clone-able
struct MyData {
    data: String,
    n: u32,
}

pub fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("Failed to read line");
    input.trim().to_string()
}

fn main() {
    let (tx, rx) = mpsc::channel::<MyData>();

    std::thread::spawn(move || {
        while let Ok(data) = rx.recv() {
            println!("--- IN THE THREAD ---");
            println!("Message number {}", data.n);
            println!("Received: {}", data.data);
        }
    });

    let mut n = 0;
    loop {
        println!("Enter a string");
        let input = read_line();
        let data_to_move = MyData {
            data: input,
            n,
        };
        n += 1;

        tx.send(data_to_move).unwrap();
    }
}

This pattern is also fast. Moving data generates a memcpy command behind the scenes, but most of the time the optimizer is able to remove it.

Let's benchmark it:

use std::sync::mpsc;

// Not copyable or clone-able
struct MyData {
    start: std::time::Instant,
}

pub fn read_line() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("Failed to read line");
    input.trim().to_string()
}

fn main() {
    let (tx, rx) = mpsc::channel::<MyData>();

    std::thread::spawn(move || {
        while let Ok(data) = rx.recv() {
            let elapsed = data.start.elapsed();
            println!("--- IN THE THREAD ---");
            println!("Message passed in {} us", elapsed.as_micros());
        }
    });

    loop {
        println!("Enter a string");
        let _ = read_line();
        let data_to_move = MyData {
            start: std::time::Instant::now(),
        };

        tx.send(data_to_move).unwrap();
    }
}

On my development box, it averages 17 us per message. That's pretty fast. Definitely enough that if you are doing some serious work, you can afford to move the data.

Sending Functions to Threads

We've focused on sending commands indicating that there's work to do. But what about sending whole functions? We can do that too!

The code for this is in sending_functions in the code/02_threads folder.

use std::sync::mpsc;

type Job = Box<dyn FnOnce() + Send + 'static>;

fn hi_there() {
    println!("Hi there!");
}

fn main() {
    let (tx, rx) = mpsc::channel::<Job>();
    let handle = std::thread::spawn(move || {
        while let Ok(job) = rx.recv() {
            job();
        }
    });

    let job = || println!("Hello from the thread!");
    let job2 = || {
        for i in 0..10 {
            println!("i = {i}");
        }
    };
    tx.send(Box::new(job)).unwrap();
    tx.send(Box::new(job2)).unwrap();
    tx.send(Box::new(hi_there)).unwrap();
    tx.send(Box::new(|| println!("Inline!"))).unwrap();
    handle.join().unwrap();
}

There's a bit to unpack here:

  • What is a Box? A Box is a smart pointer to an area of the heap. So the function pointer is placed inside a smart pointer, and then sent to the thread. The thread then takes ownership of the smart pointer, and when it's done with it, the smart pointer is dropped, and the function pointer is dropped with it. Without a box, you run into lifetime issues. You'll learn all about Boxes in a couple of weeks.
  • What about dyn? dyn is a special marker indicating that the content is dynamic: the Box doesn't hold one fixed function type, so the exact function being called is resolved dynamically.
  • How about FnOnce? This trait marks a callable that can be called once, and that may consume the variables it captured. That's exactly what we need here, because capturing from the surrounding scope quickly gets tricky when you pass closures around.
  • What about Send? This indicates that the function pointer can be sent to another thread. This is a special marker trait that indicates that the function pointer is safe to send to another thread.

Send and Sync

Types in Rust can implement the Sync and Send traits. Sync means that a type can safely be accessed from multiple threads at the same time (it can be shared by reference). Mutex is a good example of a Sync type. Send means that a type can be sent (moved) to another thread. In this case, we're requiring that the function can be safely sent between threads.

Most of the time, Sync and Send are figured out for you. If anything in a structure isn't Sync or Send, the structure won't be. (You can override it if you really need to!)

If you're moving data along channels and run into a Sync or Send error, it's usually a clue that you need to add protection---like a Mutex---around the data.
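
Here's a minimal sketch of that distinction: Rc is neither Send nor Sync, but wrapping the data in Arc<Mutex<...>> makes it safe to share (the commented-out lines show what the compiler rejects):

use std::sync::{Arc, Mutex};

fn main() {
    // Rc<T> is neither Send nor Sync, so this version would not compile:
    // let data = std::rc::Rc::new(5);
    // std::thread::spawn(move || println!("{data}")); // error: `Rc` cannot be sent between threads

    // Arc<Mutex<T>> is Send + Sync, so it can cross the thread boundary safely.
    let data = Arc::new(Mutex::new(5));
    let data_clone = Arc::clone(&data);
    let handle = std::thread::spawn(move || {
        *data_clone.lock().unwrap() += 1;
    });
    handle.join().unwrap();
    println!("{}", *data.lock().unwrap());
}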

We'll look at Send and Sync later.

Sending Commands and Functions

You can mix and match commands and functions using the same channel. Rust Enumerations can hold functions, just like any other type. Let's fix the issue of the program never quitting:

This is the sending_commands_and_functions example in code/02_threads.

use std::sync::mpsc;

type Job = Box<dyn FnOnce() + Send + 'static>;

enum Command {
    Run(Job),
    Quit,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Command>();
    let handle = std::thread::spawn(move || {
        while let Ok(command) = rx.recv() {
            match command {
                Command::Run(job) => job(),
                Command::Quit => break,
            }
        }
    });

    let job = || println!("Hello from the thread!");
    tx.send(Command::Run(Box::new(job))).unwrap();
    tx.send(Command::Quit).unwrap();
    handle.join().unwrap();
}

Setting Thread Affinity

This is the thread_affinity example in code/02_threads.

Warning: It's hotly debated whether you should do this! Common wisdom today is that it's usually better to let the OS scheduler determine where to run threads. Sometimes, though, you need to control where a thread runs. For some high-performance code, it can help---you can avoid delays while data travels between CPUs. For other code, it can hurt---you can cause delays while data travels between CPUs. It's a trade-off, and you need to understand your code and your hardware to make the right decision.

Rust's standard library doesn't include support for setting a thread's CPU affinity. The mechanism varies by platform, and it's not a common task.

The core_affinity crate provides a relatively simple mechanism to set thread affinity. Let's add it:

cargo add core_affinity

Now, let's use it:

fn main() {
    let core_ids = core_affinity::get_core_ids().unwrap();

    let handles = core_ids.into_iter().map(|id| {
        std::thread::spawn(move || {
            // Pin this thread to a single CPU core.
            let success = core_affinity::set_for_current(id);
            if success {
                println!("Hello from a thread on core {id:?}");
            } else {
                println!("Setting thread affinity for core {id:?} failed");
            }
        })
    }).collect::<Vec<_>>();
    
    for handle in handles.into_iter() {
        handle.join().unwrap();
    }
}

Thread Priority

The code for this is in the thread_priorities directory in code/02_threads.

This is another controversial topic! Most of the time, leave thread priority alone. The OS scheduler is pretty good at figuring out what to do. Sometimes, though, you need to control thread priority. A thread that always has a lot of work to do can benefit from being given a high priority. A thread that does a lot of waiting can benefit from being given a low priority.

Conversely, a thread that doesn't do much---but has a high priority---will waste lots of CPU time checking to see if it is still idle!

Pitfalls

  • You can wind up with priority inversion by mistake. If a high-priority task depends in some way on a low-priority task, then despite its priority, it is effectively limited by the lower-priority task.
  • You can generate "starvation"---a high priority thread that activates regularly with nothing to do. This wastes CPU time.
  • If you set everything to high priority, everything is effectively normal priority!

Setting Thread Priority

Thread priority is not included in the Rust standard library. It's platform-specific, right down to the priority names! Add a crate to help you:

cargo add thread_priority

Here's an example:

use std::{sync::atomic::AtomicU32, time::Duration};
use thread_priority::*;

static LOW_COUNT: AtomicU32 = AtomicU32::new(0);
static MEDIUM_COUNT: AtomicU32 = AtomicU32::new(0);

fn low_prio() {
    set_current_thread_priority(ThreadPriority::Min).unwrap();
    loop {
        LOW_COUNT.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
        std::thread::yield_now();
    }
}

fn regular_prio() {
    loop {
        MEDIUM_COUNT.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
        std::thread::yield_now();
    }
}

fn main() {
    std::thread::spawn(low_prio);
    std::thread::spawn(regular_prio);

    std::thread::sleep(Duration::from_secs(10));

    println!("Low    : {:>10}", LOW_COUNT.load(std::sync::atomic::Ordering::Relaxed));
    println!("Medium : {:>10}", MEDIUM_COUNT.load(std::sync::atomic::Ordering::Relaxed));
}

On my system this prints:

Low    :   99406604
Medium :   99572604

The differences are very small. They become a little more pronounced when you do a lot of work in your thread. It's not going to make a massive difference; modern OS schedulers do a lot of work to maintain fairness.

Combining CPU Priority With Affinity

In my experience, this is most useful when combined with affinity. A high priority thread on a core (not core 0!) is likely to keep that core mostly to itself.
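
Here's a minimal sketch combining the two crates from this section (core_affinity and thread_priority). It assumes you have at least two cores, and raising the priority may require elevated privileges on some platforms:

use thread_priority::{set_current_thread_priority, ThreadPriority};

fn main() {
    let core_ids = core_affinity::get_core_ids().unwrap();
    // Pick a core other than core 0 for the hot thread (assumes 2+ cores).
    let hot_core = core_ids[1];

    let handle = std::thread::spawn(move || {
        // Pin this thread to the chosen core, then raise its priority.
        core_affinity::set_for_current(hot_core);
        if set_current_thread_priority(ThreadPriority::Max).is_err() {
            println!("Could not raise priority (may need elevated privileges)");
        }

        // CPU-heavy work would go here.
        println!("High-priority worker pinned to {hot_core:?}");
    });
    handle.join().unwrap();
}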

Making it Easy with Rayon

A library named "Rayon" is the gold-standard for easy thread-based concurrency in Rust. It actually uses another crate (crossbeam) under the hood, but it provides a much simpler interface for the most common use cases. Rayon can help you with a lot of tasks. Let's work through using it.

Parallel Iterators

This example is in the rayon_par_iter directory, in the 02_threads directory.

Let's start by adding Rayon to the project:

cargo add rayon

Probably the nicest addition Rayon brings is par_iter. The majority of things you can do with an iterator, you can auto-parallelize with par_iter. For example:

use rayon::prelude::*;

fn main() {
    let numbers: Vec<u64> = (0 .. 1_000_000).collect();
    let sum = numbers.par_iter().sum::<u64>();
    println!("{sum}");
}

Rayon creates a thread-pool (1 thread per CPU), with a job queue. The queue implements work-stealing (no idle threads), and supports "sub-tasks" - a task can wait for another task to complete. It really is as simple as using par_iter() (for an iterator of references), par_iter_mut() (for an iterator of mutable references), or into_par_iter() (for an iterator of values that moves the values).
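
For example, a quick sketch of the mutable and consuming variants:

use rayon::prelude::*;

fn main() {
    let mut numbers: Vec<u64> = (0..10).collect();

    // par_iter_mut(): modify every element in place, in parallel.
    numbers.par_iter_mut().for_each(|n| *n *= 2);
    println!("{numbers:?}");

    // into_par_iter(): consume the vector, moving each value into the iterator.
    let total: u64 = numbers.into_par_iter().sum();
    println!("Total: {total}");
}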

Let's do another test, this time with nested tasks. We'll use a really inefficient function for finding prime numbers:

#![allow(unused)]
fn main() {
fn is_prime(n: u32) -> bool {
    (2 ..= n/2).into_par_iter().all(|i| n % i != 0 )
}
}

And we can parallelize a search of a million numbers as follows:

#![allow(unused)]
fn main() {
// Find primes below 1,000,000
let now = Instant::now();
let numbers: Vec<u64> = (2 .. 1_000_000).collect();
let mut primes: Vec<&u64> = numbers.par_iter().filter(|&n| is_prime(*n as u32)).collect();
let elapsed = now.elapsed();
primes.sort();
println!("{primes:?}");
println!("It took {} ms to find {} primes", elapsed.as_millis(), primes.len());
}

The result on my PC is: It took 4 ms to find 78498 primes. In other words, it took longer to sort and print the result than it did to calculate it.

Fortunately, Rayon can parallelize the sort, too:

#![allow(unused)]
fn main() {
primes.par_sort_unstable();
}

That knocked a few more milliseconds off the time.

Async/Await Overview

This is mostly theory, but we'll be writing some code in the next section.

Last week, we talked about System Threads. System threads are managed by the operating system and are preemptively multi-tasked. What does that really mean?

  • A thread can be interrupted at any time by the OS scheduler.
  • The OS scheduler is relatively heavy-weight: it has to save the state of the current thread, load the state of the next thread, and then resume execution.
  • You have limited control over when tasks are scheduled.

An async model is cooperatively multi-tasked, and may run in just one thread---or it may split tasks across many threads. What does that really mean?

  • A task can only be interrupted when it yields control. (The executor process might still be interrupted by the OS scheduler.)
  • Tasks are really light-weight: they just contain the execution stack (local variables and function calls), and any data indicating how to resume the task (for example---when a network operation completes).

When to use Async or System Threads?

| System Threads | Async |
|---|---|
| Long-running tasks | Short-running tasks |
| CPU-bound tasks | I/O-bound tasks |
| Tasks that need to run in parallel | Tasks that need to run concurrently |
| Tasks that need minimal, predictable latency | Tasks that can take advantage of latency to do other things, improving throughput |

As Bill puts it: "Async takes advantage of latency". When you have a lot to do, and much of it involves waiting for something else to finish---a database operation, the network, a file system operation---then async is a great fit. It's also a great fit for short-running tasks that need to run concurrently.

Don't use Async when your task will consume lots of CPU and there may be a long pause between logical points to yield to other async tasks. You'll slow everything down.

Don't use System Threads to spawn one per network client and spend most of the time waiting for I/O. You'll run out of memory or thread allocations.

Do mix and match the two to really unlock the power of Rust's concurrency model.

Rust and Async/Await

NodeJS, Go, C#, Python and others all implement an opinionated and batteries included approach to Async/Await. You await a "future" or "promise"---depending upon the language---and the language framework handles the rest.

C++ and Rust both took a more agnostic approach. They provide the building blocks, but leave it up to you to assemble them into a framework. This is a good thing, because it means you can build a framework that fits your needs, and not the needs of the language designers. It also allows for competition between frameworks, driving innovation and fitness for purpose.

The downside is that it's a bit more work to get started. But that's what this course is for!

Hello Async/Await

There's a simple rule to remember in async/await land:

  • Async functions can execute non-async functions (and do all the time).
  • Non-async functions cannot execute async functions, except with the help of an executor.

Futures

The code for this is in code/03_async/hello_async_futures.

Let's build a really simple example:

async fn say_hello() {
    println!("Hello, world!");
}

fn main() {
    let x = say_hello();
}

This doesn't do anything. Even though say_hello() looks like it's calling the "say_hello" function---it's not. The type hint in Visual Studio Code gives it away: impl Future<Output = ()>. This is a future. It represents a task that hasn't been executed yet. You can pass it to an executor to run it, but you can't run it directly.

So let's add an executor. We'll start by using one of the simplest executors out there---a proof of concept more than a real executor. It's called block_on and it's in the futures crate. We'll start by adding the crate:

cargo add futures

Now, we'll use the simplest bridge between synchronous and asynchronous code: block_on. This is a function that takes a future and runs it to completion. It's not a real executor, but it's good enough for our purposes:

use futures::executor::block_on;

async fn say_hello() {
    println!("Hello, world!");
}

fn main() {
    let _x = say_hello();
    block_on(say_hello());
}

The futures crate has implemented a simple executor, which provides the ability to "block" on an async function. It runs the function---and any async functions it calls---to completion.

Let's add a second async function, and call it from the first:

use futures::executor::block_on;

async fn say_hello() {
    println!("Hello, world!");
    second_fn().await;
}

async fn second_fn() {
    println!("Second function");
}

fn main() {
    let _x = say_hello();
    block_on(say_hello());
}

Notice that once you are inside an async context, it's easier to call the next async function. You just call it and add .await at the end. No need to block again. The "await" keyword tells the executor to run an async task (returned as a future) and wait until it's done.

This is the building block of async/await. You can call async functions from other async functions, and the executor will run them to completion.

What's Actually Happening Here?

When you call block_on, the futures crate sets up an execution context. It's basically a list of tasks. The first async function is added to the list and runs until it awaits. Then it moves to the back of the list, and a new task is added to the list. Once the second function completes, it is removed from the task list---and execution returns to the first task. Once there are no more tasks, this simple executor exits.

In other words, you have cooperative multitasking. You can await as many things as you want. This particular executor doesn't implement a threaded task pool (unless you ask for it)---it's a single threaded job.

Other ways to run tasks

The code for this is in code/03_async/hello_async_spawn_futures.

Join is used to launch multiple async functions at once:

use futures::executor::block_on;
use futures::join;

async fn do_work() {
    // Start two tasks at once
    join!(say_hello(), say_goodbye());
}

async fn say_hello() {
    println!("Hello world!");
}

async fn say_goodbye() {
    println!("Goodbye world!");
}

fn main() {
    block_on(do_work());
}

You can return data from async functions:

use futures::executor::block_on;
use futures::join;

async fn do_work() {
    // Start two tasks at once
    join!(say_hello(), say_goodbye());

    // Return data from an await
    println!("2 * 5 = {}", double(5).await);

    // Return data from a join
    let (a, b) = join!(double(2), double(4));
    println!("2*2 = {a}, 2*4 = {b}");
}

async fn say_hello() {
    println!("Hello world!");
}

async fn say_goodbye() {
    println!("Goodbye world!");
}

async fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    block_on(do_work());
}

You can even join a vector of futures:

#![allow(unused)]
fn main() {
let futures = vec![double(1), double(2), double(3), double(4)];
let results = join_all(futures).await;
println!("Results: {results:?}");
}

Async functions can call non-async functions

Add a function:

#![allow(unused)]
fn main() {
fn not_async() {
    println!("This is not async");
}
}

You can call it in do_work just like a normal function: not_async();.

That's a lot of the basics of using async/await in Rust. Everything we've done is single-threaded, and isn't super-useful---but with the basics in place, we can start to build more complex applications.

Different Executors

There's a lot of different executors available, so you can choose the one that fits what you need.

| Executor | Description |
|---|---|
| Futures | Core abstractions and a proof-of-concept executor. |
| Executor | A minimal executor in 100 lines of code. Works with everything, but has very few features. Can run on embedded targets and WebAssembly, without the standard library. A great starting point if you need to implement your own for an embedded or WASM project. |
| Async-Std | A standard library for async Rust. As async abstractions are finished, they are implemented in this library. Also includes a decently performing executor. Seeks maximum compatibility. |
| Tokio | Performance-oriented async executor and library. Continually updated, forms the core of Axum, Tungstenite (for web sockets) and other libraries. Includes tracing and telemetry support. The typical choice for enterprise applications. |

For the rest of this class, we'll use Tokio.

Getting Started with Tokio

Tokio is an async executor for Rust, that implements just about everything you need for Enterprise usage. It is a bit of a kitchen-sink project, tamed by using multiple optional dependencies. It is also the most popular async executor for Rust.

Single-Threaded Usage

Tokio supports multiple threads, but can be used as a single-threaded executor. In some cases, this is what you want---if the bulk of your system is taken up by CPU intensive tasks, you may only want to dedicate some resources to running async tasks. You may also want to use Tokio as a single-threaded executor for testing purposes, or for tools that need to be kept small.

See the 03_async\tokio_single_thread_manual code example.

To use the Tokio executor, you need to add it to your project. It supports a lot of options; for now, we'll use the "full" feature to get access to everything:

cargo add tokio -F full

Let's build a very simple single-threaded async app:

use tokio::runtime;

async fn hello() {
    println!("Hello from async");
}

fn main() {
    let rt = runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();

    rt.block_on(hello());
}

Notice that this is just like the code we created for futures---only using a different runtime. Under the hood, async programs always have a top-level block_on or equivalent to host the async session.

You don't have to only use Tokio! You can spawn threads before you block on the async session, and use them for CPU intensive tasks. You can even run executors in threads and have multiple async sessions running independently (or communicating through channels).
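
Here's a minimal sketch of that mix, with a plain system thread standing in for CPU-heavy work alongside a single-threaded async session:

use tokio::runtime;

async fn hello() {
    println!("Hello from async");
}

fn main() {
    // A plain system thread for CPU-heavy work, started before the async session.
    let worker = std::thread::spawn(|| {
        let sum: u64 = (0..1_000_000).sum();
        println!("Worker thread finished: {sum}");
    });

    // A single-threaded Tokio runtime for the async portion.
    let rt = runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();
    rt.block_on(hello());

    worker.join().unwrap();
}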

Let's Make It Easier

We're switching to the 03_async/tokio_single_thread_macro example code.

Tokio includes some helper macros to avoid typing the boilerplate every time. If you don't need very specific control over the executor, you can use the #[tokio::main] macro to create a single-threaded executor:

async fn hello() {
    println!("Hello from async");
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    hello().await;
}

That's reduced your code size down to 8 lines! The #[tokio::main] macro will create a Tokio runtime, and block on the async session you provide. We've added a flavor = "current_thread" parameter to ask Tokio to run in a single thread.

Multi-threaded Tokio - the long form with options

Tokio can also run in multi-threaded mode. It's very sophisticated:

  • It spawns one thread per CPU by default---you can control this.
  • Each thread has its own "task list".
  • Each thread has its own "reactor" (event loop).
  • Each thread supports "work stealing"---if the thread has nothing to do, and other threads are blocking on a task, they can "steal" tasks from other threads. This makes it harder to stall your program.
  • You can configure the number of threads, and the number of "reactors" (event loops) per thread.

You usually don't need to set all the options, but here's a set of what you can change if you need to:

use std::sync::atomic::{AtomicUsize, Ordering};
use tokio::runtime;

async fn hello() {
    println!("Hello from async");
}

fn thread_namer() -> String {
    static ATOMIC_ID: AtomicUsize = AtomicUsize::new(0);
    let id = ATOMIC_ID.fetch_add(1, Ordering::SeqCst);
    format!("my-pool-{id}")
}

fn main() {
    let rt = runtime::Builder::new_multi_thread()
        // YOU DON'T HAVE TO SPECIFY ANY OF THESE
        .worker_threads(4)  // 4 threads in the pool
        .thread_name_fn(thread_namer) // Name the threads. 
                                     // This helper names them "my-pool-#" for debugging assistance.
        .thread_stack_size(3 * 1024 * 1024) // You can set the stack size
        .event_interval(61) // You can increase the I/O polling frequency
        .global_queue_interval(61) // You can change how often the global work thread is checked
        .max_blocking_threads(512) // You can limit the number of "blocking" tasks
        .max_io_events_per_tick(1024) // You can limit the number of I/O events per tick
        // YOU CAN REPLACE THIS WITH INDIVIDUAL ENABLES PER FEATURE
        .enable_all()
        // Build the runtime
        .build()
        .unwrap();

    rt.block_on(hello());
}

In other words, if you need the control, it's there. Most of the time, you won't need to change any of it. Just like single-threaded mode, you can mix and match with system threads and multiple executors if you need to (multiple executors can get messy!).

Tokio Multi-Threaded - Macro Style

That's a lot of boilerplate. If you don't need to reconfigure how everything works, the macro form makes for a very simple, readable main.rs:

async fn hello() {
    println!("Hello from async");
}

#[tokio::main]
async fn main() {
    hello().await;
}

Tokio Futures

Most of the async/await code you've created works exactly the same in Tokio. However, Tokio tends to "take over" a bit and has its own syntax for a few things. It also offers a few options that you won't necessarily find elsewhere.

See the 03_async/tokio_await code.

Await

The .await system is completely unchanged:

async fn hello() {
    println!("Hello from async");
}

#[tokio::main]
async fn main() {
    hello().await;
}

Joining

Tokio provides a join! macro that you can use just like you did with futures::join!:

#![allow(unused)]
fn main() {
let result = tokio::join!(double(2), double(3));
println!("{result:?}");
}

If you have a vector of futures, Tokio does not provide a join_all macro! You can import the futures crate and use that one (it will still run inside Tokio---it uses whatever the current executor is).

Add the futures crate as well with cargo add futures. Then:

#![allow(unused)]
fn main() {
// You can still use futures join_all
let futures = vec![double(2), double(3)];
let result = futures::future::join_all(futures).await;
println!("{result:?}");
}

Alternatively, you can use Tokio's native JoinSet type. The code looks like this:

#![allow(unused)]
fn main() {
// Using Tokio JoinSet
let mut set = JoinSet::new();
for i in 0..10 {
    set.spawn(double(i));
}
while let Some(res) = set.join_next().await {
    println!("{res:?}");
}
}

Notice that every res returned is a Result. Even though your function didn't return a result, it wrapped everything in Ok.

You can also drop the join set and it will automatically cancel any pending futures for you.

I personally tend to use the futures version unless I really need the extra control.

Spawning

What if you want to start an async task, and not wait for it to complete? Tokio provides the spawn function for this purpose. It's just like a thread spawn, but it adds an async task to the task pool (which may or may not run on another actual thread). Let's try this example:

async fn ticker() {
    for i in 0..10 {
        println!("tick {i}");
    }
}

#[tokio::main]
async fn main() {
    tokio::spawn(ticker());
    hello().await;
}

(We're keeping the hello() function from before). Run it, and notice that the answers appear in a different order---the threading system is distributing work. It's also not guaranteed that ticker will complete, because the program isn't waiting for it to finish. Let's try another approach:

#[tokio::main]
async fn main() {
    let _ = tokio::join!(
        tokio::spawn(hello()), 
        tokio::spawn(ticker()),
    );
    println!("Finished");
}

Now you are sending hello and ticker into separate tasks, and waiting for all of them. The program will wait for everything to finish.

Threading

Notice that the previous program gives different results each time. That's because you are in the multi-threaded executor. Let's try the single-threaded mode:

#![allow(unused)]
fn main() {
#[tokio::main(flavor = "current_thread")]
}

You always get the same result:

Hello from async
(8, 12)
[8, 12]
tick 0
tick 1
tick 2
tick 3
tick 4
tick 5
tick 6
tick 7
tick 8
tick 9
Ok(0)
Ok(4)
Ok(8)
Ok(12)
Ok(16)
Ok(20)
Ok(24)
Ok(28)
Ok(32)
Ok(36)
Finished

What's going on here?

The hello task starts first. Printing doesn't yield---there's no await---so it runs straight through. Calling double does yield, so those calls go into the task pool. The double commands run, which also allows the next joined task to arrive in the task pool. Then the ticker runs; it doesn't await anywhere, so it runs as one large blob. Finally the JoinSet runs, yielding on each call.

That's a pitfall of async programming. If ticker were doing something complicated, in a single-threaded environment---or a really busy task pool with threads---you are effectively "locking up" the other tasks until it completes.

Fortunately, you can also explicitly "yield" control:

#![allow(unused)]
fn main() {
async fn ticker() {
    for i in 0..10 {
        println!("tick {i}");
        tokio::task::yield_now().await;
    }
}
}

yield_now is telling Tokio that you are done for now; allow other tasks to run. Just like a thread, when your task resumes, its stack will be restored, and it will continue as before. This is a good way to make sure that you don't lock up the task pool. It also slows your computation down!

Run it now. Notice that the ticker now prints one tick at a time and finishes last: yield_now moves the task to the back of the queue, so it runs again once the other tasks have had a turn.

yield_now is useful if you must do something CPU intensive in your async task. If possible, send your big task over to a thread. We'll look at that in a bit.

Blocking Tasks

We saw that you can stall the thread pool by doing computation---other tasks only get to run when you yield, await or otherwise give up control (by awaiting an IO task, for example).

Heavy CPU usage is "blocking" the task from yielding to other tasks. This is a problem if you want to do CPU intensive work in an async context.

Sleeping

The code for this example is in the 03_async\tokio_thread_sleep directory.

Let's look at another way to wreak havoc! Add tokio (cargo add tokio -F full) and futures (cargo add futures) and run the following code:

use std::time::Duration;

async fn hello_delay(task: u64, time: u64) {
    println!("Task {task} has started");
    std::thread::sleep(Duration::from_millis(time));
    println!("Task {task} is done.");
}

#[tokio::main]
async fn main() {
    let mut futures = Vec::new();
    for i in 0..5 {
        futures.push(hello_delay(i, 500 * i));
    }
    futures::future::join_all(futures).await;
}

Despite using a multi-threaded runtime, and even though we've used join_all---the tasks run in-order with no sharing:

Task 0 has started
Task 0 is done.
Task 1 has started
Task 1 is done.
Task 2 has started
Task 2 is done.
Task 3 has started
Task 3 is done.
Task 4 has started
Task 4 is done.

This is happening because std::thread::sleep is blocking the whole thread, not just the task. It may even put the executor to sleep, if the task fires on the same thread as the executor---which is likely, because tasks never get the chance to be moved.

This is worse than high CPU usage blocking the task, because it can block the whole thread pool! With high CPU usage and a work-stealing pool, you'd normally expect the other tasks to be able to run on other threads.

Sleeping is a common requirement, and Tokio includes an async timing system for this reason. You can use Tokio's sleep instead to get what you expect:

use std::time::Duration;

async fn hello_delay(task: u64, time: u64) {
    println!("Task {task} has started");
    //std::thread::sleep(Duration::from_millis(time));
    tokio::time::sleep(Duration::from_millis(time)).await;
    println!("Task {task} is done.");
}

#[tokio::main]
async fn main() {
    let mut futures = Vec::new();
    for i in 0..5 {
        futures.push(hello_delay(i, 500 * i));
    }
    futures::future::join_all(futures).await;
}

Now the output looks reasonable:

Task 0 has started
Task 1 has started
Task 2 has started
Task 3 has started
Task 4 has started
Task 0 is done.
Task 1 is done.
Task 2 is done.
Task 3 is done.
Task 4 is done.

But What if I Need to Block?

You might actually want to perform a blocking operation---I/O to a device that doesn't have an async interface, a CPU intensive task, or something else that can't be done asynchronously. Tokio implements spawn_blocking for this.

The code for this is in 03_async\tokio_spawn_blocking.

use std::time::Duration;
use tokio::task::spawn_blocking;

async fn hello_delay(task: u64, time: u64) {
    println!("Task {task} has started");
    let result = spawn_blocking(move || {
        std::thread::sleep(Duration::from_millis(time));
    }).await;
    println!("Task {task} result {result:?}");
    println!("Task {task} is done.");
}

#[tokio::main]
async fn main() {
    let mut futures = Vec::new();
    for i in 0..5 {
        futures.push(hello_delay(i, 500 * i));
    }
    futures::future::join_all(futures).await;
}

Notice that spawn_blocking returns a Result, containing whatever is returned from your blocking task.

Blocking tasks create a thread, and run the task on that thread. If you await it, your task will be suspended---and control given to other tasks---until it's done. This way, you aren't blocking the task queue and can do your heavy lifting in a thread. The downside is that you have created a system thread, which is more expensive than a lightweight task.

If you specified a maximum number of blocking threads in your runtime builder, blocking tasks will wait until one of those threads is free. If you didn't, Tokio spawns additional blocking threads on demand (up to a generous default limit), reusing idle ones.
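Here's a minimal sketch (the thread counts are arbitrary, chosen for illustration) of building a runtime by hand with a capped blocking pool:

use tokio::runtime::Builder;

fn main() {
    // Cap the async workers at 4 and the blocking pool at 8 (illustrative numbers).
    let rt = Builder::new_multi_thread()
        .worker_threads(4)
        .max_blocking_threads(8)
        .enable_all()
        .build()
        .unwrap();

    rt.block_on(async {
        // spawn_blocking calls beyond 8 concurrent blocking tasks will queue here.
    });
}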

What Happens if I Don't Await a Blocking Task?

If you just want your blocking task to "detach" (run independently) and neither block the current task nor return a result, you can use spawn_blocking without the await:

use std::time::Duration;
use tokio::task::spawn_blocking;

async fn hello_delay(task: u64, time: u64) {
    println!("Task {task} has started");
    spawn_blocking(move || {
        std::thread::sleep(Duration::from_millis(time));
    });
    println!("Task {task} is done.");
}

#[tokio::main]
async fn main() {
    let mut futures = Vec::new();
    for i in 0..5 {
        futures.push(hello_delay(i, 500 * i));
    }
    futures::future::join_all(futures).await;
}

This will lead to an instant output of:

Task 0 has started
Task 0 is done.
Task 1 has started
Task 1 is done.
Task 2 has started
Task 2 is done.
Task 3 has started
Task 3 is done.
Task 4 has started
Task 4 is done.

Followed by a delay while the threads finish.

This is useful if you want to do something in the background and don't need the result immediately. You can always store the result in a shared data structure or send it over a channel if you need it later.
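For example, here's a minimal sketch (the channel, the fake workload and the value 42 are all made up for illustration) of a detached blocking task reporting back over a Tokio channel:

use std::time::Duration;
use tokio::{sync::mpsc, task::spawn_blocking};

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u64>(8);

    // Detach the blocking task---we never await its JoinHandle.
    spawn_blocking(move || {
        std::thread::sleep(Duration::from_millis(500)); // pretend to work hard
        // blocking_send is the synchronous way to send from non-async code.
        tx.blocking_send(42).unwrap();
    });

    // The async side keeps running and picks the result up later.
    if let Some(result) = rx.recv().await {
        println!("Background work finished with: {result}");
    }
}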

Unit Testing with Tokio

You've seen before how easy Rust makes it to include unit tests in your project. Tokio makes it just as easy to test asynchronous code.

As a reminder, here's a regular---synchronous---unit test:

fn main() {
}

#[cfg(test)]
mod test {
    #[test]
    fn simple_test() {
        assert_eq!(2 + 2, 4);
    }
}

Run the test with cargo test, and you prove that your computer can add 2 and 2:

running 1 test
test test::simple_test ... ok

The Problem with Async Tests

The problem with the regular #[test] syntax is that you can't await async functions from a synchronous context, so there's no way to get a result out of them. This won't compile:

fn main() {
}

async fn double(n: i32) -> i32 {
    n * 2
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn simple_test() {
        assert_eq!(2 + 2, 4);
    }

    #[test]
    fn will_not_work() {
        // This will not work because we are not in an async context
        let result = double(2);
        assert_eq!(result, 4);
    }
}

Option 1: Build a context in each test

The long-form solution is to build an async context in each test. For example:

#![allow(unused)]
fn main() {
#[test]
fn the_hard_way() {
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();

    assert_eq!(rt.block_on(double(2)), 4);
}
}

When you start doing this in every test, you wind up with a huge set of unit tests---and a lot of boilerplate. Fortunately, Tokio provides a better way.

Option 2: Use tokio-test

Tokio provides an alternative test macro---much like the tokio::main macro---for your tests. It adds the boilerplate to build an async context for you:

#![allow(unused)]
fn main() {
#[tokio::test]
async fn the_easy_way() {
    assert_eq!(double(2).await, 4);
}
}

The tokio::test macro builds a Tokio runtime for your test. It's a great way to get started with Tokio testing. You can use it as a full async context---awaiting, joining, spawning.

Option 3: Single-threaded Tokio Test

If your code runs on a single-threaded runtime, you'll also want to test single-threaded. Testing single-threaded can also be a good way to catch the times you accidentally blocked.

To test single-threaded, use the tokio::test macro with the current_thread flavor:

#![allow(unused)]
fn main() {
#[tokio::test(flavor = "current_thread")]
async fn single_thread_tokio() {
    assert_eq!(double(2).await, 4);
}
}

Taming Multi-Threaded Tokio

Rust unit tests already run in a threaded context (multiple tests execute at once)---creating a thread pool encompassing every CPU your system has for a test is probably overkill. You can also decorate the tokio::test macro with the multi_thread feature to create a multi-threaded Tokio runtime with a limited number of threads:

#![allow(unused)]
fn main() {
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn tamed_multi_thread() {
    assert_eq!(double(2).await, 4);
}
}

Async File I/O

So you've come a long way: you can make a multi-threaded executor, spawn thousands of tasks and have them all perform asynchronously without context switches. You can even write unit tests for your async code and handle errors. That's great, but it hasn't gained you a lot: you can do all of that in synchronous code.

Where async really shines is handling I/O. There will inevitably be pauses while interacting with external devices; async lets a task wait out those pauses while other tasks run. That's why async is such a good fit for things like web servers.

For this example, grab the code/03_async/buffered_reader/warandpeace.txt file from the main GitHub repo. This is the entirety of War and Peace (Tolstoy), in a text file. It's a great example of a huge file!

Buffered File I/O

Let's pretend that War and Peace is some corporate data that we actually want to work on. It's a big text file and we want to read it in and do some processing on it. We'll start with a synchronous version of the code:

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

// Taken from: https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

fn main() {
    let now = std::time::Instant::now();
    let mut line_count = 0;
    if let Ok(lines) = read_lines("warandpeace.txt") {
        lines.for_each(|line| {
            if let Ok(line) = line {
                if !line.trim().is_empty() {
                    line_count += 1;
                }
            }
        });
    }
    println!("Read {} lines in {:.3} seconds", line_count, now.elapsed().as_secs_f32());
}

This is using some code from Rust By Example that I use a lot. The read_lines function is just too convenient to not copy! It uses a BufReader - so it's not loading the entire file into memory (which is a bad idea in many server contexts; read_to_string is actually substantially faster if you don't mind the memory overhead). It then uses the lines helper to iterate through each line.

On my computer (with a fast SSD), it completes in 0.031 seconds. That's fast, but it can be an eternity if you are blocking a thread---there's no yielding, and the BufReader isn't asynchronous. Imagine your server is processing hundreds of these files at once!

Now let's try that in an async context. Don't forget to include Tokio and Anyhow:

cargo add tokio -F full
cargo add anyhow

Now let's change things up a little. We'll pass a filename into the function, and use Tokio:

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

// Taken from: https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

async fn line_count(filename: String) -> anyhow::Result<usize> {
    println!("Reading {filename}...");
    let now = std::time::Instant::now();
    let mut line_count = 0;
    if let Ok(lines) = read_lines(filename) {
        lines.for_each(|line| {
            if let Ok(line) = line {
                if !line.trim().is_empty() {
                    line_count += 1;
                }
            }
        });
    }
    println!("Read {} lines in {:.3} seconds", line_count, now.elapsed().as_secs_f32());
    Ok(line_count)
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    println!("Reading {filename}...");
    let now = std::time::Instant::now();
    let (c1, c2) = tokio::join!(
        line_count("warandpeace.txt".to_string()),
        line_count("warandpeace.txt".to_string())
    );
    println!("Total lines: {}", c1? + c2?);
    println!("In {:.3} seconds", now.elapsed().as_secs_f32());
    Ok(())
}

On my PC, I get the following output:

Reading warandpeace.txt...
Read 52162 lines in 0.032 seconds
Reading warandpeace.txt...
Read 52162 lines in 0.032 seconds
Total lines: 104324
In 0.066 seconds

Going async hasn't done us any good at all: each task blocks while the file I/O occurs, so there's no performance improvement. The second task can't even start until the first one finishes, because the BufReader is not asynchronous.

Async File I/O

Tokio provides a fair portion of the standard library in asynchronous form. Let's build an async version of the same function:

#![allow(unused)]
fn main() {
async fn async_line_count(filename: String) -> anyhow::Result<usize> {
    use tokio::io::AsyncBufReadExt;
    use tokio::io::BufReader;
    use tokio::fs::File;
}

So we've started by declaring an async function, and using the Tokio versions of the same types we were using before. Now let's open the file:

#![allow(unused)]
fn main() {
    println!("Reading {filename}...");
    let now = std::time::Instant::now();
    let mut line_count = 0;

    let file = File::open(filename).await?;
}

So now we do the same as before - indicate that we're doing something, initialize the timer and start counting lines. Then we open the file. The Tokio version is async - so you have to await the result (or spawn it, or join it, etc.). We're also using ? to propagate errors, just like the synchronous version.

Now let's implement the buffered reader in async code. We'll keep it in one function because it works a little differently:

#![allow(unused)]
fn main() {
    let reader = BufReader::new(file);
    let mut lines = reader.lines(); // Create a stream of lines
}

Instead of lines returning an iterator (like calling .iter() on vectors), it returns a stream. A stream is basically an async iterator---you await each iteration. Reading the next buffered section is implemented as an async operation, so the executor can yield if there's work to do while it's reading.

Let's read each entry and count it if the line isn't empty:

#![allow(unused)]
fn main() {
    while let Some(line) = lines.next_line().await? {
        if !line.trim().is_empty() {
            line_count += 1;
        }
    }

    println!("Read {} lines in {:.3} seconds", line_count, now.elapsed().as_secs_f32());
    Ok(line_count)
}
}

Now in main we can call the async version:

#![allow(unused)]
fn main() {
let now = std::time::Instant::now();
let (c1, c2) = tokio::join!(
    async_line_count("warandpeace.txt".to_string()),
    async_line_count("warandpeace.txt".to_string())
);
println!("Total lines: {}", c1? + c2?);
println!("In {:.3} seconds", now.elapsed().as_secs_f32());
}

We could keep adding readers, use different files, etc. and it would all work. The output is:

Reading warandpeace.txt...
Reading warandpeace.txt...
Read 52162 lines in 0.154 seconds
Read 52162 lines in 0.154 seconds
Total lines: 104324
In 0.155 seconds

The two readers have run concurrently, so you've read War And Peace twice without blocking!

In all honesty, I haven't personally managed to read it once yet.

So when you are using async code, it's a good idea to use the async versions of the operations you are performing. You could also have used spawn_blocking to wrap the synchronous code in a thread.

Basic Network IO

There are synchronous versions of most network calls in the Rust Standard Library, but networking really lends itself to async: there's always going to be latency between calls (even if you have an enormous fiber feed!).

The code for the first section is in 03_async/weather.

Making a REST call

Add three crates:

cargo add tokio -F full
cargo add reqwest -F json
cargo add anyhow

There are two popular crates to use for making HTTP calls: reqwest and hyper. reqwest is a higher-level crate that uses hyper under the hood. We'll use reqwest here.

Let's perform a very basic request to look up the weather around my house (that's the lat/lon for the city center):

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    const URL: &str = "https://api.open-meteo.com/v1/forecast?latitude=38.9517&longitude=-92.3341&current_weather=true";
    let response = reqwest::get(URL).await?;
    println!("{}", response.text().await?);

    Ok(())
}

Notice that getting the response is an async call---we have to await it. Getting the body is also an async call. It may not be completely ready by the time we call text(), so we have to await that too.

The result is JSON:

{"latitude":38.95111,"longitude":-92.335205,"generationtime_ms":0.16498565673828125,"utc_offset_seconds":0,"timezone":"GMT","timezone_abbreviation":"GMT","elevation":216.0,"current_weather":{"temperature":30.3,"windspeed":7.2,"winddirection":162.0,"weathercode":1,"is_day":1,"time":"2023-05-30T18:00"}}

You should remember how to parse JSON from the first class. Let's add serde support to our project:

cargo add serde -F derive

Let's build some strong types to represent the result:

#![allow(unused)]
fn main() {
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Weather {
    latitude: f64,
    longitude: f64,
    current_weather: CurrentWeather,
}

#[derive(Deserialize, Debug)]
struct CurrentWeather {
    temperature: f64,
    windspeed: f64,
}
}

Notice that we're just ignoring some fields altogether. That's OK. You can also make a field an Option<f64> (or another type) if it may or may not be present.
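For example, here's a sketch of an optional field (windgusts is a made-up field name used purely for illustration):

use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct CurrentWeather {
    temperature: f64,
    windspeed: f64,
    // Hypothetical field: deserializes to None if the key is missing or null.
    windgusts: Option<f64>,
}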

Now we can use Reqwest's json feature to give us a strongly typed result:

#![allow(unused)]
fn main() {
const URL: &str = "https://api.open-meteo.com/v1/forecast?latitude=38.9517&longitude=-92.3341&current_weather=true";
let response = reqwest::get(URL).await?;
let weather: Weather = response.json().await?;
println!("{weather:#?}");
}

Right now, my weather looks like this:

Weather {
    latitude: 38.95111,
    longitude: -92.335205,
    current_weather: CurrentWeather {
        temperature: 30.3,
        windspeed: 7.2,
    },
}

That's all there is to making a basic HTTP(s) REST request! It's async, so it won't block your program.

If you find yourself dealing with less-structured JSON that doesn't readily lend itself to a strong type, Serde has your back. You can deserialize to a serde_json::Value type:

Run cargo add serde_json and then change the deserializer to:

#![allow(unused)]
fn main() {
let weather: serde_json::Value = response.json().await?;
}

This gives you a tree of serde_json::Value entries that you can parse with iteration and matching:

Object {
    "current_weather": Object {
        "is_day": Number(1),
        "temperature": Number(30.3),
        "time": String("2023-05-30T18:00"),
        "weathercode": Number(1),
        "winddirection": Number(162.0),
        "windspeed": Number(7.2),
    },
    "elevation": Number(216.0),
    "generationtime_ms": Number(0.16701221466064453),
    "latitude": Number(38.95111),
    "longitude": Number(-92.335205),
    "timezone": String("GMT"),
    "timezone_abbreviation": String("GMT"),
    "utc_offset_seconds": Number(0),
}

Making a Simple TCP Server

The code for this is in 03_async/tcp_echo.

(Once again, don't forget to add Tokio and Anyhow to your project!)

Let's create a very simple TCP server that simply echoes back anything you type. We'll use the tokio::net::TcpListener type to listen for connections, and tokio::net::TcpStream to handle the connection.

We'll start by using some of the types we need:

#![allow(unused)]
fn main() {
use tokio::{net::TcpListener, spawn, io::{AsyncReadExt, AsyncWriteExt}};
}

Then we'll build a main function that creates a "TCP Listener" to listen for new connections:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8123").await?;

That's enough to listen for new connections on localhost, port 8123. You could use an IPv6 address (such as "[::1]:8123") instead.

Now we're going to loop forever and accept new connections:

#![allow(unused)]
fn main() {
    loop {
        let (mut socket, address) = listener.accept().await?;
        spawn(async move {
            println!("Connection from {address:?}");
            // We're in a new task now. The task is connected
            // to the incoming connection. We moved socket (to talk to the other end)
            // and address (to print its address) into this task.
            //
            // Because we used `spawn`, the task is added to the tasks pool and the
            // outer function continues to listen for new connections.
        });
    }
}

Now we'll fill in the blanks:

use tokio::{
    io::{AsyncReadExt, AsyncWriteExt},
    net::TcpListener,
    spawn,
};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8123").await?;

    loop {
        let (mut socket, address) = listener.accept().await?;
        spawn(async move {
            println!("Connection from {address:?}");
            let mut buf = vec![0; 1024];
            loop {
                let n = socket
                    .read(&mut buf)
                    .await
                    .expect("failed to read data from socket");

                if n == 0 {
                    return;
                }

                socket
                    .write_all(&buf[0..n])
                    .await
                    .expect("failed to write data to socket");
            }
        });
    }
    //Ok(())
}

We then:

  1. Initialize a buffer.
  2. Loop forever.
  3. Read from the socket into the buffer.
  4. If we read 0 bytes, the connection is closed, so we return.
  5. Otherwise, we write the buffer back to the socket.

If you telnet to localhost:8123 and type some text, you'll see it echoed back to you.

That's not very useful, but it gives you one of the basic structures for accepting TCP connections yourself. We'll build a much better example later, and the final class of this course will build something useful!

Making a Simple TCP Client

Let's build a client that connects to this server and verifies that it receives what it sent.

The code for this is in 03_async/tcp_echo_client.

Once again, add tokio and anyhow crates!

We'll start by creating a main function that connects to the server:

use tokio::{net::TcpStream, io::{AsyncWriteExt, AsyncReadExt}};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut stream = TcpStream::connect("127.0.0.1:8123").await?;
    println!("Connected to the server!");

The TcpStream provides an async low-level interface to a TCP stream. You can read and write bytes to it; managing buffers is up to you.

Now that we're connected, let's send "Hello World" to the server:

#![allow(unused)]
fn main() {
    // Send "Hello World"
    stream.write_all(b"Hello World!").await?;
}

Notice the b"Hello World!". The b prefix means "this string is actually an array of bytes" - it's a handy bit of syntax sugar. it accepts a slice (reference to an array or vector) of bytes from anything.

Now let's read the response:

#![allow(unused)]
fn main() {
    // Read the response
    let mut buf = vec![0; 1024];
    let bytes_read = stream.read(&mut buf).await?;
    println!("Response: {}", String::from_utf8_lossy(&buf[..bytes_read]));

    Ok(())
}
}

Notice that we're using from_utf8_lossy to build the string. Rust strings aren't just byte streams like C strings---they are full UTF-8 Unicode (a char can be more than one byte). If you try to build a string from a byte stream that isn't valid UTF-8, you'll get an error. from_utf8_lossy will replace invalid UTF-8 with the � replacement character (U+FFFD).
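A quick sketch of the difference between the strict and lossy conversions (the byte values are arbitrary examples):

fn main() {
    let crab = vec![0xf0, 0x9f, 0xa6, 0x80]; // valid UTF-8 (a crab emoji)
    let broken = vec![b'h', b'i', 0xff];     // 0xff is never valid UTF-8

    assert!(String::from_utf8(broken.clone()).is_err());       // strict: an error
    assert_eq!(String::from_utf8_lossy(&broken), "hi\u{FFFD}"); // lossy: replacement char
    assert_eq!(String::from_utf8_lossy(&crab), "🦀");
    println!("All assertions passed");
}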

And when we run it (remember to start the server!):

Connected to the server!
Response: Hello World!

So that gives you the most basic scenarios to start working quickly: you can call REST services (reqwest also supports the other HTTP verbs), and you can build TCP servers and clients. We'll do more later!

Async Channels

You should remember from last week that we talked about MPSC channels for communicating between threads. If not, I've linked to the curriculum and you have the code we created.

Sync Channels

You can still use threaded channels. A common approach for batch processing is to use an async system to receive network requests, and then send them to dedicated system threads for heavy processing.

The code for this section is in 03_async/sync_channel.

Don't forget to add Tokio!

This provides a very convenient way to send data into a system that is using system threads:

use std::{time::Duration, sync::mpsc};

enum Command {
    Print(String),
}

#[tokio::main]
async fn main() {
    // Spawn a command thread for "heavy lifting"
    let (tx, rx) = mpsc::channel::<Command>();

    std::thread::spawn(move || {
        while let Ok(command) = rx.recv() {
            match command {
                Command::Print(s) => println!("{s}"),
            }
        }
    });

    // Launch the async sender
    let mut counter = 0;
    loop {
        tokio::time::sleep(Duration::from_secs(1)).await;
        tx.send(Command::Print(format!("Hello {counter}"))).unwrap();
        counter += 1;
    }
}

You are creating a threaded process that waits for commands, and prints them out. You also create an async runtime, and repeatedly send commands to the threaded process.

This isn't very useful---but it can be connected to a network server that has to do heavy batch processing, and suddenly you have the best of both worlds: your threaded task is doing the heavy lifting. You could even use Rayon with a limited-size thread pool to control how many cores you are using, and reserve some (even one) for Tokio.
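As a sketch of that idea (assuming rayon has been added with cargo add rayon; the sizing is arbitrary), you can configure Rayon's global pool to leave a core free for Tokio:

fn main() {
    let cores = std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(4);
    // Leave one core free for the async runtime (sizing chosen for illustration).
    rayon::ThreadPoolBuilder::new()
        .num_threads(cores.saturating_sub(1).max(1))
        .build_global()
        .expect("failed to build Rayon's global thread pool");

    // ... heavy Rayon work (par_iter, scope, etc.) happens elsewhere ...
}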

Replying to Sync Channels

But what about getting the result back into the Async world?

The code for this is in 03_async/sync_channel_reply.

Tokio also implements MPSC channels. They behave a lot like their threaded brethren, but they are async: sending requires an await, and so does receiving. They are very efficient on the async side of the fence.

Now here's a neat trick.

Once you have a Tokio runtime, you can get a handle to it at any time and use that inside synchronous code to launch async tasks inside the executor!

This lets you bridge the divide between the threaded world and the async world. You can indeed have your cake and eat it.

Starting with the previous example (copied into a new entry in the work repo), we add a Tokio channel for replies back into the async world:

#![allow(unused)]
fn main() {
// Spawn a TOKIO Async channel for replies
let (tx_reply, mut rx_reply) = tokio::sync::mpsc::channel::<String>(10);
}

This is exactly like creating a threaded version, but we're using the Tokio variant. Tokio also requires that channels be bounded---the number of messages that can sit in the queue awaiting processing. That's the 10.

Now we can modify our system thread to obtain a handle to Tokio, and use it to spawn an async reply:

#![allow(unused)]
fn main() {
let handle = tokio::runtime::Handle::current();
std::thread::spawn(move || {
    while let Ok(command) = rx.recv() {
        match command {
            Command::Print(s) => {
                // Make our very own copy of the transmitter
                let tx_reply = tx_reply.clone();
                handle.spawn(async move {
                    tx_reply.send(s).await.unwrap();
                });
            },
        }
    }
});
}

Lastly, we add an async process (running in the background) to receive the replies:

#![allow(unused)]
fn main() {
// Launch a Tokio process to receive replies from thread-land
tokio::spawn(async move {
    while let Some(reply) = rx_reply.recv().await {
        println!("{reply}");
    }
});
}

Here's the full code:

use std::{time::Duration, sync::mpsc};

enum Command {
    Print(String),
}

#[tokio::main]
async fn main() {
    // Spawn a command thread for "heavy lifting"
    let (tx, rx) = mpsc::channel::<Command>();

    // Spawn a TOKIO Async channel for replies
    let (tx_reply, mut rx_reply) = tokio::sync::mpsc::channel::<String>(10);

    let handle = tokio::runtime::Handle::current();
    std::thread::spawn(move || {
        while let Ok(command) = rx.recv() {
            match command {
                Command::Print(s) => {
                    // Make our very own copy of the transmitter
                    let tx_reply = tx_reply.clone();
                    handle.spawn(async move {
                        tx_reply.send(s).await.unwrap();
                    });
                },
            }
        }
    });

    // Launch a Tokio process to receive replies from thread-land
    tokio::spawn(async move {
        while let Some(reply) = rx_reply.recv().await {
            println!("{reply}");
        }
    });

    // Launch the async sender
    let mut counter = 0;
    loop {
        tokio::time::sleep(Duration::from_secs(1)).await;
        tx.send(Command::Print(format!("Hello {counter}"))).unwrap();
        counter += 1;
    }
}

It runs as before, but you've got a really good template here:

  • You spawn system threads, using everything you learned last week.
  • Since system threads are perfect for CPU-bound workloads, you don't have to worry about yielding, spawning blocking tasks, or anything like that. You just receive a message telling you to do something, and you hit it as hard as you can.
  • Meanwhile, Tokio remains entirely async---giving fast network or other IO access.

This is a popular pattern for batch processing. Another service tells your program (often over the network, but it could be a channel or anything else) that there's some heavy processing ready to do. You send the CPU-bound workload off into a thread pool (often using Rayon) and send a message back when it is done.

Tokio Broadcast Channels

The code for this is in 03_async/broadcast.

Tokio provides a type of channel that regular Rust doesn't have: the broadcast channel. This is a channel that can have multiple receivers. It's a bit like a Vec of channels, but it's more efficient. It's relatively easy to use:

#[tokio::main]
async fn main() {
    let (tx, mut rx) = tokio::sync::broadcast::channel::<String>(16);

    for n in 0..16 {
        let mut messages = tx.subscribe();
        tokio::spawn(async move {
            while let Ok(msg) = messages.recv().await {
                println!("{n}: {msg}");
            }
        });
    }

    tx.send("hello".to_string()).unwrap();
    while let Ok(msg) = rx.recv().await {
        println!("main: {msg}");
    }
}

This example will never terminate! But if you need to send a message to a lot of tasks at once, this is a great way to do it.

Shared State (Tokio)

You may remember dealing with global variables last week. There are async versions of the same primitives, but it's not always clear which you should use when.

Atomic Variables

Atomic variables are completely untouched by async. So everything you learned in last week's class on atomics applies to async land, too. They are still high-performance, great ways to share data when you can.
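For example, here's a minimal sketch of an atomic counter shared across async tasks:

use std::sync::atomic::{AtomicU32, Ordering};

static COUNTER: AtomicU32 = AtomicU32::new(0);

async fn increment() {
    // No await required---atomics work exactly the same in async code.
    COUNTER.fetch_add(1, Ordering::Relaxed);
}

#[tokio::main]
async fn main() {
    tokio::join!(increment(), increment(), increment());
    println!("COUNTER = {}", COUNTER.load(Ordering::Relaxed));
}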

Mutexes

The code for this is in 03_async/async_mutex.

(Add once_cell to your project with cargo add)

You can still use a system mutex in async land:

use std::sync::Mutex;

static COUNTER: Mutex<u32> = Mutex::new(0);

async fn increment() {
    let mut counter = COUNTER.lock().unwrap();
    *counter += 1;
}

#[tokio::main]
async fn main() {
    tokio::join!(increment(), increment(), increment());
    println!("COUNTER = {}", *COUNTER.lock().unwrap());
}

If you don't have much contention, this is still a high-performance option. The Tokio documentation even recommends it in many cases. BUT---and there's always a but---it has two issues in async land:

  • If the mutex is contested, you can block a whole thread while you wait.
  • You can't pass a standard-library mutex between async tasks.

Let's look at the second problem:

use std::sync::Mutex;

static COUNTER: Mutex<u32> = Mutex::new(0);

async fn add_one(n: u32) -> u32 {
    n + 1
}

async fn increment() {
    let mut counter = COUNTER.lock().unwrap();
    *counter = add_one(*counter).await;
}

#[tokio::main]
async fn main() {
    tokio::join!(increment(), increment(), increment());
    println!("COUNTER = {}", *COUNTER.lock().unwrap());
}

Notice that this compiles and runs. Clippy gives a very serious sounding warning, though:

cargo clippy
warning: this `MutexGuard` is held across an `await` point
  --> 03_async\async_mutex\src\main.rs:10:9
   |
10 |     let mut counter = COUNTER.lock().unwrap();
   |         ^^^^^^^^^^^
   |
   = help: consider using an async-aware `Mutex` type or ensuring the `MutexGuard` is dropped before calling await
note: these are all the `await` points this lock is held through
  --> 03_async\async_mutex\src\main.rs:10:5
   |
10 | /     let mut counter = COUNTER.lock().unwrap();
11 | |     *counter = add_one(*counter).await;
12 | | }
   | |_^
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock
   = note: `#[warn(clippy::await_holding_lock)]` on by default

warning: `async_mutex` (bin "async_mutex") generated 1 warning

What this means is that the regular MutexGuard type you get from calling lock() assumes that you are in a threaded world. It is unaware of a serious danger: the mutex remains locked while you await, so another task that tries to lock COUNTER in the meantime can deadlock.

So Tokio provides an async version that you can use instead when you need to use Mutexes inside an async context. It works very similarly, but you have to await your locks:

//use std::sync::Mutex;
use tokio::sync::Mutex;
use once_cell::sync::Lazy;

//static COUNTER: Mutex<u32> = Mutex::new(0);
static COUNTER: Lazy<Mutex<u32>> = Lazy::new(|| Mutex::new(0));

async fn add_one(n: u32) -> u32 {
    n + 1
}

async fn increment() {
    //let mut counter = COUNTER.lock().unwrap();
    let mut counter = COUNTER.lock().await;
    *counter = add_one(*counter).await;
}

#[tokio::main]
async fn main() {
    tokio::join!(increment(), increment(), increment());
    //println!("COUNTER = {}", *COUNTER.lock().unwrap());
    println!("COUNTER = {}", *COUNTER.lock().await);
}

Aside: What's with the Lazy?

This is a great opportunity to expand a bit on what we talked about before with initializing statics.

This is valid:

#![allow(unused)]
fn main() {
use std::sync::Mutex;
static COUNTER: Mutex<u32> = Mutex::new(0);
}

This isn't:

#![allow(unused)]
fn main() {
use tokio::sync::Mutex;
static COUNTER: Mutex<u32> = Mutex::new(0);
}

Why would that be? Statics can only be initialized with a constant expression---for example, a call to a function marked const.

You can provide any const function for initialization:

#![allow(unused)]
fn main() {
use std::sync::Mutex;
static CONST_MUTEX : Mutex<i32> = Mutex::new(new_mutex());

const fn new_mutex() -> i32 {
    5 * 12
}
}

But you can't use a function that isn't constant. Tokio's mutex new isn't constant, so you can't use it directly. But you can use Lazy (from once_cell and very soon the standard library) to add a layer of indirection that calls a non-const function through a closure:

#![allow(unused)]
fn main() {
static CONST_MUTEX : Lazy<Mutex<String>> = Lazy::new(|| Mutex::new("Hello".to_string()));
}

RwLock

Read/write locks work exactly the same way. You can use the Tokio version just like a standard-library RwLock, but you have to await your read() and write() calls.
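Here's a minimal sketch (the USERS name and example data are made up; it reuses the Lazy trick from above):

use once_cell::sync::Lazy;
use tokio::sync::RwLock;

static USERS: Lazy<RwLock<Vec<String>>> = Lazy::new(|| RwLock::new(Vec::new()));

async fn add_user(name: &str) {
    // write() is async---awaiting it yields instead of blocking the thread.
    USERS.write().await.push(name.to_string());
}

async fn print_users() {
    // Many readers can hold the lock at the same time.
    let users = USERS.read().await;
    println!("{users:?}");
}

#[tokio::main]
async fn main() {
    tokio::join!(add_user("Alice"), add_user("Bob"));
    print_users().await;
}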

Selecting Futures

There's one more type of spawning option to consider. It's a complicated one, so it got its own section. It's called select!. It's a macro that lets you wait for the first of several futures to complete---and then automatically cancels the other futures.

Implementing Timeouts

The code for this is in 03_async/select_timeout.

Let's start with a simple example. We'll spawn two futures. One will sleep for 1 second, and the other will sleep for 2 seconds. We'll use select! to wait for the first one to complete. Then we'll print a message and exit.

use std::time::Duration;
use tokio::time::sleep;

async fn do_work() {
    // Pretend to do some work that takes longer than expected
    sleep(Duration::from_secs(2)).await;
}

async fn timeout(seconds: f32) {
    // Wait for the specified number of seconds
    sleep(Duration::from_secs_f32(seconds)).await;
}

#[tokio::main]
async fn main() {
    tokio::select! {
        _ = do_work() => println!("do_work() completed first"),
        _ = timeout(1.0) => println!("timeout() completed first"),
    }
}

The syntax is based on a match statement, but with an additional step. The format is: (pattern) = (future) => (what to do if that future finishes first)

You can change the timeout to determine which will finish first. If you set it to 3 seconds, then do_work() will finish first. If you set it to 0.5 seconds, then timeout() will finish first.

Note that the other future is cancelled---but if you've done work that has side-effects (say saving to a file) the work that has already been performed will not be undone.
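For the specific case of timeouts, Tokio also ships a ready-made helper, tokio::time::timeout, that wraps this pattern for you. A minimal sketch:

use std::time::Duration;
use tokio::time::{sleep, timeout};

async fn do_work() {
    // Pretend to do some work that takes longer than expected.
    sleep(Duration::from_secs(2)).await;
}

#[tokio::main]
async fn main() {
    // If do_work() doesn't finish within 1 second, it is cancelled and we get an Err.
    match timeout(Duration::from_secs(1), do_work()).await {
        Ok(()) => println!("do_work() completed in time"),
        Err(_) => println!("do_work() timed out and was cancelled"),
    }
}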

Receiving from Multiple Channels

The code for this is in 03_async/select_channels.

An easy way for an async function to be subscribed to multiple channels is to obtain the receivers, and then select! whichever one has data. Here's an example:

use tokio::sync::{mpsc, broadcast};

async fn receiver(mut rx: mpsc::Receiver<u32>, mut broadcast_rx: broadcast::Receiver<u32>) {
    loop {
        tokio::select! {
            Some(n) = rx.recv() => println!("Received message {n} on the mpsc channel"),
            Ok(n) = broadcast_rx.recv() => println!("Received message {n} on the broadcast channel"),
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel::<u32>(1);
    let (broadcast_tx, broadcast_rx) = broadcast::channel::<u32>(1);

    tokio::spawn(receiver(rx, broadcast_rx));

    for count in 0..10 {
        if count % 2 == 0 {
            tx.send(count).await.unwrap();
        } else {
            broadcast_tx.send(count).unwrap();
        }
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    }
}

Note that if you have a continuous stream in the MPSC channel, the broadcast channel may take a while to fire! This pattern is good for sending "quit" messages and other control data---but only if it doesn't have to be instantaneous.

Pinning, Boxing and Recursion

Terminology

"Pinning" means storing an async data type in a fixed position in system memory, so that it can safely be used by multiple async tasks. Let's start with a an example that doesn't work.

"Boxing" means putting a variable onto the heap inside a smart pointer (that cleans it up when it drops out of scope).

Recursion means calling a function from within itself. This is a common way to implement a Fibonacci Sequence, for example.

Async Recursion

The code for this is in 03_async/recursion.

Wikipedia tells me that a Fibonacci Sequence is characterized by the fact that every number after the first two is the sum of the two preceding ones. Let's build a simple synchronous Fibonacci function:

fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n-1) + fibonacci(n-2)
    }
}

#[tokio::main]
async fn main() {
    println!("fibonacci(10) = {}", fibonacci(10));
}

This works fine. What if we want to make it asynchronous?

#![allow(unused)]
fn main() {
async fn async_fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => async_fibonacci(n-1).await + async_fibonacci(n-2).await
    }
}
}

That looks great, right? Oh dear - it won't compile! The error message is:

error[E0733]: recursion in an `async fn` requires boxing
 --> 03_async\recursion\src\main.rs:9:37
  |
9 | async fn async_fibonacci(n: u32) -> u32 {
  |                                     ^^^ recursive `async fn`
  |
  = note: a recursive `async fn` must be rewritten to return a boxed `dyn Future`
  = note: consider using the `async_recursion` crate: https://crates.io/crates/async_recursion

If you go digging into async_recursion, you'll learn that under the hood it boxes and pins your recursive future, returning a type like this:

pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a, Global>>;

The BoxFuture type itself comes from the futures crate, which we've used plenty of times. Let's add it:

cargo add futures

You can then use BoxFuture for recursion. The syntax isn't pleasant:

/// https://rust-lang.github.io/async-book/07_workarounds/04_recursion.html
use futures::future::{BoxFuture, FutureExt};
fn async_fibonacci(n: u64) -> BoxFuture<'static, u64> {
    async move {
        match n {
            0 => 0,
            1 => 1,
            _ => async_fibonacci(n - 1).await + async_fibonacci(n - 2).await,
        }
    }.boxed()
}

#[tokio::main]
async fn main() {
    println!("fibonacci(10) = {}", async_fibonacci(10).await);
}

I included a link to the Rust "async book" which includes this workaround. I would literally have never figured that out on my own!

Let's do it the easy way

Fortunately, the async_recursion crate can save you from getting a terrible headache when you need to use recursion in async land. Let's add it:

cargo add async_recursion

Then we can use it, and it provides procedural macros to remove the pain:

#![allow(unused)]
fn main() {
use async_recursion::async_recursion;

#[async_recursion]
async fn async_fibonacci_easier(n: u64) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => async_fibonacci_easier(n - 1).await + async_fibonacci_easier(n - 2).await,
    }
}
}

Pinning

The code for this is in 03_async/pinning.

Let's start with something that works:

#[tokio::main]
async fn main() {
    let future = async {
        println!("Hello, world!");
    };
    future.await;
}

Now suppose that you are receiving your future from another source, and you receive a reference to it (maybe because you have futures returning futures!)

This doesn't work:

#[tokio::main]
async fn main() {
    let mut future = async {
        println!("Hello, world!");
    };
    (&mut future).await;
}

The error message is: [async block@03_async\pinning\src\main.rs:3:18: 5:6] cannot be unpinned consider using Box::pin

So let's pin it using Tokio's pin! macro:

#[tokio::main]
async fn main() {
    let future = async {
        println!("Hello, world!");
    };
    tokio::pin!(future);
    (&mut future).await;
}

That does work! But WHY?

  • Pinning the future guarantees that it won't move in memory for as long as the pin exists.
  • Without a pin, the future could move in memory, and that would be bad---and violate Rust's memory safety guarantees.
  • With a pin, the future can't move---so all is well.

Another Variant

Suppose you really like the old "Command Pattern", in which data results in function pointers determining what to do next. It can be handy in some instances, such as a game server or making your very own Turing machine.

You can return a Future from a function, but what if you want to return different futures (with the same function signature)? The syntax is a little scary, but this is one to keep around for reference:

use std::future::Future;
use std::pin::Pin;

async fn one() {
    println!("One");
}

async fn two() {
    println!("Two");
}

async fn call_one(n: u32) -> Pin<Box<dyn Future<Output = ()>>> {
    match n {
        1 => Box::pin(one()),
        2 => Box::pin(two()),
        _ => panic!("invalid"),
    }
}

#[tokio::main]
async fn main() {
    let boxed_future = call_one(1).await;
    boxed_future.await;
}

So why does this work? Working outwards from the middle:

  • Future<Output = ()> means a future that returns nothing.
  • dyn Future means a future that implements the Future trait, rather than a concrete signature. It's dynamic.
  • Box wraps the type in a smart pointer (a pointer that will automatically clean up after itself). You have to use a pointer here, because the contents are dynamic---the compiler can't know ahead of time how large the future will be, so it can't allocate it on the stack.
  • Pin means that the future is guaranteed not to move in memory. This matters because futures can be self-referential---they may hold pointers into their own state, which would be invalidated if the future moved.

The function Box::pin is a special type of pointer initialization that not only makes a smart pointer, but pins it in memory so that it won't move.

Phew! That's a lot of stuff to remember. Fortunately, you can just copy and paste this code when you need it.

Traits and Generics

Traits and generics form the core of advanced Rust. They are similar to C++ templates (and concepts, if you are lucky enough to have a compiler that supports them).

Traits are basically an interface---a promise to expose an API.

Generics are structures and functions that can be templated to adapt to other types that implement traits.

Traits

You've used traits a lot---they are an important part of Rust. But we haven't really talked about them.

Implementing Traits

Whenever you've used #[derive(Debug, Clone, Serialize)] and similar---you are using procedural macros to implement traits. We're not going to dig into procedural macros---they are worthy of their own class---but we will look at what they are doing.

Debug is a trait. The derive macro is implementing the trait for you (including identifying all of the fields to output). You can implement it yourself:

#![allow(unused)]
fn main() {
use std::fmt;

struct Point {
    x: i32,
    y: i32,
}

impl fmt::Debug for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Point")
         .field("x", &self.x)
         .field("y", &self.y)
         .finish()
    }
}
}

Traits are an interface. Each trait defines functions that must be implemented to apply the trait to a type. Once you implement the trait, you can use the trait's functions on the type---and you can also use the trait as a type.

Making a Trait

The code for this is in code/04_mem/make_trait.

Let's create a very simple trait:

#![allow(unused)]
fn main() {
trait Animal {
    fn speak(&self);
}
}

This trait has one function: speak. It takes a reference to self (the type implementing the trait) and returns nothing.

Note: trait parameters are also part of the interface, so if a trait entry needs &self---all implementations of it will need &self.

Now we can make a cat:

#![allow(unused)]
fn main() {
struct Cat;

impl Animal for Cat {
    fn speak(&self) {
        println!("Meow");
    }
}
}

Now you can run speak() on any Cat:

fn main() {
    let cat = Cat;
    cat.speak();
}

You could go on and implement as many speaking animals as you like.

Traits as Function Parameters

You can also create functions that require that a parameter implement a trait:

#![allow(unused)]
fn main() {
fn speak_twice(animal: &impl Animal) {
    animal.speak();
    animal.speak();
}
}

You can call it with speak_twice(&cat)---and it runs the trait's function twice.

Traits as Return Types

You can also return a trait from a function:

#![allow(unused)]
fn main() {
fn get_animal() -> impl Animal {
    Cat
}
}

The fun part here is that you no longer know the concrete type of the returned value---you only know that it implements Animal. So you can call speak on it, but if Cat implements other traits or inherent functions, you can't call those.

Traits that Require Other Traits

You could require that all Animal types also implement Debug:

#![allow(unused)]
fn main() {
trait Animal: Debug {
    fn speak(&self);
}
}

Now Cat won't compile until you derive (or implement) Debug.

You can keep piling on the requirements:

#![allow(unused)]
fn main() {
trait DebuggableClonableAnimal: Animal+Debug+Clone {}
}

Let's make a Dog that complies with these rules:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct Dog;

impl Animal for Dog {
    fn speak(&self) {
        println!("Woof");
    }
}

impl DebuggableClonableAnimal for Dog {}
}

Now you can make a dog and call speak on it. You can also use DebuggableClonableAnimal as a parameter or return type, and be sure that all of the trait functions are available.
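For example, here's a sketch of a function that leans on the combined bound (the function name is made up; it builds on the definitions above):

#![allow(unused)]
fn main() {
fn debug_and_speak(animal: &impl DebuggableClonableAnimal) {
    println!("{animal:?}"); // available because of the Debug requirement
    animal.speak();         // available because of the Animal requirement
}
}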

Dynamic Dispatch

All of the examples above can be resolved at compile time. The compiler knows the concrete type of the trait, and can generate the code for it. But what if you want to store a bunch of different types in a collection, and call a trait function on all of them?

You might want to try this:

#![allow(unused)]
fn main() {
let animals: Vec<impl Animal> = vec![Cat, Dog];
}

And it won't work. The reason is that Vec stores every entry with the same, known size. Since cats and dogs might be different sizes, Vec can't store them directly.

You can get around this with dynamic dispatch. You've seen this once before, with type GenericResult<T> = std::result::Result<T, Box<dyn std::error::Error>>;. The dyn keyword means that the type is dynamic---it can be different sizes.

Now think back to boxes. Boxes are smart pointers: they occupy the size of a pointer in memory, and that pointer tells you where the data actually lives on the heap. So you can make a vector of dynamic, boxed traits:

#![allow(unused)]
fn main() {
let animals: Vec<Box<dyn Animal>> = vec![Box::new(Cat), Box::new(Dog)];
}

Each vector entry is a pointer (with a type hint) to a trait object; the concrete value itself lives on the heap. Accessing each entry requires a pointer dereference and a virtual function call. (A vtable is generated, but LLVM is very good at devirtualizing calls when it can prove the concrete type.)

In the threads class, someone asked if you could "send interfaces to channels". And yes, you can---you have to use dynamic dispatch to do it. This is valid:

#![allow(unused)]
fn main() {
let (tx, rx) = std::sync::mpsc::channel::<Box<dyn Animal>>();
}

This works with other pointer types like Rc, and Arc, too. You can have a reference-counted, dynamic dispatch pointer to a trait.
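For example, a sketch using Arc (building on the Cat type above):

#![allow(unused)]
fn main() {
use std::sync::Arc;

let animal: Arc<dyn Animal> = Arc::new(Cat);
let same_animal = animal.clone(); // cheap reference-count bump, no deep copy
same_animal.speak();
}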

Using dynamic dispatch won't perform as well as static dispatch, because of pointer chasing (which reduces the likelihood of a memory cache hit).

The Any Type

If you really, really need to find out the concrete type of a dynamically dispatched trait, you can use the std::any::Any trait. It's not the most efficient design, but it's there if you really need it.

The easiest way to "downcast" is to require Any in your type and an as_any function:

#![allow(unused)]
fn main() {
use std::any::Any;

// The supertrait (its definition is inferred from the usage below) pairs
// Animal with an as_any accessor:
trait DowncastableAnimal: Animal {
    fn as_any(&self) -> &dyn Any;
}

struct Tortoise;

impl Animal for Tortoise {
    fn speak(&self) {
        println!("What noise does a tortoise make anyway?");
    }
}

impl DowncastableAnimal for Tortoise {
    fn as_any(&self) -> &dyn Any {
        self
    }
}
}

Then you can "downcast" to the concrete type:

#![allow(unused)]
fn main() {
let more_animals : Vec<Box<dyn DowncastableAnimal>> = vec![Box::new(Tortoise)];
for animal in more_animals.iter() {
    if let Some(cat) = animal.as_any().downcast_ref::<Tortoise>() {
        println!("We have access to the tortoise");
    }
    animal.speak();
}
}

If you can avoid this pattern, you should. It's not very Rusty---it's pretending to be an object-oriented language. But it's there if you need it.

Implementing Operators

"Operator overloading" got a bad name from C++. You can abuse it, and decide that operators do bizarre things. Please don't. If you allow two types to be added together, please use an operation that makes sense to the code reader!

See the 04_mem/operator_overload project.

You can implement operators for your types. Let's make a Point type that can be added together:

use std::ops::Add;

struct Point {
    x: f32,
    y: f32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, rhs: Self) -> Self::Output {
        Point {
            x: self.x + rhs.x, 
            y: self.y + rhs.y
        }
    }
}

fn main() {
    let a = Point { x: 1.0, y: 2.0 };
    let b = Point { x: 3.0, y: 4.0 };
    let c = a + b;
    println!("c.x = {}, c.y = {}", c.x, c.y);
}

There's a full range of operators you can overload. You can also overload the +=, /, * operators, and so on. This is very powerful for letting you express functions (rather than remembering to add x and y each time)---but it can be abused horribly if you decide that + should mean "subtract" or something. Don't do that. Please.
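For example, here's a sketch of also overloading += for the same Point type (it extends the program above):

use std::ops::AddAssign;

impl AddAssign for Point {
    fn add_assign(&mut self, rhs: Self) {
        self.x += rhs.x;
        self.y += rhs.y;
    }
}

With that in place, a += b works on two Points just like a + b did.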

Generics

Generics are very closely tied to traits. "Generics" are meta-programming: a way to write "generic" code that works for multiple types. Traits are a way to specify the requirements for a generic type.

The simplest generic is a function that takes a generic type. Who's sick of typing to_string() all the time? I am! You can write a generic function that accepts any type that implements ToString---even &str (bare strings) implements ToString:

#![allow(unused)]
fn main() {
fn print_it<T: ToString>(x: T) {
    println!("{}", x.to_string());
}
}

So now you can call print_it with print_it("Hello"), print_it(my_string) or even print_it(42) (because integers implement ToString).

There's a second format for generics that's a bit longer but more readable when you start piling on the requirements:

#![allow(unused)]
fn main() {
fn print_it<T>(x: T)
where
    T: ToString,
{
    println!("{}", x.to_string());
}
}

You can combine requirements with +:

#![allow(unused)]
fn main() {
fn print_it<T>(x: T)
where
    T: ToString + Debug,
{
    println!("{:?}", x);
    println!("{}", x.to_string());
}
}

You can have multiple generic types:

#![allow(unused)]
fn main() {
fn print_it<T, U>(x: T, y: U)
where
    T: ToString + Debug,
    U: ToString + Debug,
{
    println!("{:?}", x);
    println!("{}", x.to_string());
    println!("{:?}", y);
    println!("{}", y.to_string());
}
}

The generics system is almost a programming language in and of itself---you really can build most things with it.

Traits with Generics

See the 04_mem/trait_generic project.

Some traits use generics in their implementation. The From trait is particularly useful, so let's take a look at it:

#![allow(unused)]
fn main() {
struct Degrees(f32);
struct Radians(f32);

impl From<Radians> for Degrees {
    fn from(rad: Radians) -> Self {
        Degrees(rad.0 * 180.0 / std::f32::consts::PI)
    }
}

impl From<Degrees> for Radians {
    fn from(deg: Degrees) -> Self {
        Radians(deg.0 * std::f32::consts::PI / 180.0)
    }
}
}

Here we've defined a type for Degrees, and a type for Radians. Then we've implemented From for each of them, allowing them to be converted from the other. This is a very common pattern in Rust. From is also one of the few surprises in Rust, because it also implements Into for you. So you can use any of the following:

#![allow(unused)]
fn main() {
let behind_you = Degrees(180.0);
let behind_you_radians = Radians::from(behind_you);
let behind_you_radians2: Radians = Degrees(180.0).into();
}

You can even define a function that requires that an argument be convertible to a type:

#![allow(unused)]
fn main() {
fn sin(angle: impl Into<Radians>) -> f32 {
    let angle: Radians = angle.into();
    angle.0.sin()
}
}

And you've just made it impossible to accidentally use degrees for a calculation that requires Radians. This is called a "new type" pattern, and it's a great way to add constraints to prevent bugs.

You can also make the sin function with generics:

#![allow(unused)]
fn main() {
fn sin<T: Into<Radians>>(angle: T) -> f32 {
    let angle: Radians = angle.into();
    angle.0.sin()
}
}

The impl syntax is a bit newer, so you'll see the generic syntax more often.

Generics and Structs

You can make generic structs and enums, too. In fact, you've seen lots of generic enum types already: Option<T>, Result<T, E>. You've seen plenty of generic structs, too: Vec<T>, HashMap<K,V> etc.

Let's build a useful example. How often have you wanted to add entries to a HashMap and, instead of replacing whatever was there, keep a list of all of the values provided for that key?

The code for this is in 04_mem/hashmap_bucket.

Let's start by defining the basic type:

#![allow(unused)]
fn main() {
use std::collections::HashMap;

struct HashMapBucket<K,V>
{
    map: HashMap<K, Vec<V>>
}
}

The type contains a HashMap, each key (of type K) referencing a vector of values (of type V). Let's make a constructor:

#![allow(unused)]
fn main() {
impl <K,V> HashMapBucket<K,V> 
{
    fn new() -> Self {
        HashMapBucket {
            map: HashMap::new()
        }
    }
}

So far, so good. Let's add an `insert` function (inside the implementation block):

fn insert(&mut self, key: K, value: V) {
    let values = self.map.entry(key).or_insert(Vec::new());
    values.push(value);
}
}

Uh oh, that shows us an error. Fortunately, the error tells us exactly what to do---the key has to support Eq (for comparison) and Hash (for hashing). Let's add those requirements to the struct:

#![allow(unused)]
fn main() {
impl <K,V> HashMapBucket<K,V> 
where K: Eq + std::hash::Hash
{
    fn new() -> Self {
        HashMapBucket {
            map: HashMap::new()
        }
    }

    fn insert(&mut self, key: K, value: V) {
        let values = self.map.entry(key).or_insert(Vec::new());
        values.push(value);
    }
}
}

So now we can insert into the map and print the results:

fn main() {
    let mut my_buckets = HashMapBucket::new();
    my_buckets.insert("hello", 1);
    my_buckets.insert("hello", 2);
    my_buckets.insert("goodbye", 3);
    println!("{:#?}", my_buckets.map);
}

In 21 lines of code, you've implemented a type that can store multiple values for a single key. That's pretty cool. Generics are a little tricky to get used to, but they can really supercharge your productivity.

Amazing Complexity

If you look at the Bevy game engine, or the Axum webserver, you'll find the most mind-boggling combinations of generics and traits---whole screens of trait bounds on a single type or function.

Remember how in Axum you could do dependency injection by adding a layer containing a connection pool, and then every route could magically obtain one by supporting it as a parameter? That's generics and traits at work.

In both cases:

  • A function accepts a type that meets certain criteria. Axum layers are cloneable, and can be sent between threads.
  • The function stores the layers as a generic type.
  • Routes are also generic, and parameters match against a generic+trait requirement. The route is then stored as a generic function pointer.

There's even code that handles <T1>, <T1, T2> and other lists of parameters (up to 16) with separate implementations to handle whatever you may have put in there!

It's beyond the scope of a foundations class to really dig into how that works---but you have the fundamentals.

Iterators

You've used iterators---and their async cousin streams---a lot:

  • Calling iter has created an iterator over a collection. The iterator returns borrowed references to each element of the collection.
  • Calling iter_mut does the same, but gains a mutable reference to each element.
  • Calling into_iter does the same, but gains ownership of each element (returning it out of the collection).
  • Drain is a special iterator that returns owned elements from a collection, but also removes them from the collection---leaving the collection empty but usable.

Once you have an iterator, you have a lot of iterator functions---map, filter, reduce, fold, etc.
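Here's a quick sketch of a few of these in action (the data is arbitrary):

#![allow(unused)]
fn main() {
let mut values = vec![1, 2, 3, 4, 5];

// iter(): borrow each element
let evens: Vec<&i32> = values.iter().filter(|n| **n % 2 == 0).collect();
println!("{evens:?}");

// iter_mut(): mutate each element in place
values.iter_mut().for_each(|n| *n *= 10);

// into_iter(): take ownership (values can't be used afterwards)
let total: i32 = values.into_iter().filter(|n| *n > 20).sum();
println!("{total}");

// drain(..): remove and yield owned elements, leaving the vec empty but usable
let mut names = vec!["a".to_string(), "b".to_string()];
let owned: Vec<String> = names.drain(..).collect();
assert!(names.is_empty());
}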

Memory and Iterators

When you use iter() and iter_mut(), you have very little memory impact: a single reference to each element is created. Using into_iter() generates a move for each item (which is sometimes optimized away). So in general, iterators are a very lightweight way to operate on a collection.

As soon as you collect, you are using memory for every collected item. If you're collecting references, that's one pointer per collected item. If you're cloning---you just doubled your memory usage. So be careful!

Creating Your Own Iterator

The code for this is in code/04_mem/iterator_generator.

An iterator is a type that implements the Iterator trait. The trait has one required function: next(). It returns an Option<T>, where T is the type of the iterator's elements. If the iterator is done, it returns None. Otherwise, it returns Some(element). The type itself can store any state required for its task.

You don't have to iterate over anything---you can use an iterator as a generator. Let's make a simple iterator that counts up from 0 to a "max" number. We'll start with a simple structure and constructor:

#![allow(unused)]
fn main() {
struct Counter {
    count: u32,
    max: u32,
}

impl Counter {
    fn new(max: u32) -> Counter {
        Counter { count: 0, max }
    }
}
}

We can implement Iterator for Counter:

#![allow(unused)]
fn main() {
impl Iterator for Counter {
    type Item = u32;
    fn next(&mut self) -> Option<Self::Item> {
        if self.count < self.max {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}
}

So the iterator adds one to its stored value and returns it. That will give you a sequence of numbers from 1 to max. We can use it like this:

fn main() {
    let numbers: Vec<u32> = Counter::new(10).collect();
    println!("{numbers:?}");
}

This will print [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].

If you'd rather return and then add, you can tweak the iterator:

#![allow(unused)]
fn main() {
impl Iterator for Counter {
    type Item = u32;
    fn next(&mut self) -> Option<Self::Item> {
        if self.count < self.max {
            let result = Some(self.count);
            self.count += 1;
            result
        } else {
            None
        }
    }
}
}

This will print [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].

Optimization---Exact Size Iterators

You can achieve a small speed improvement by also indicating that the iterator returns a fixed size:

#![allow(unused)]
fn main() {
impl ExactSizeIterator for Counter {
    fn len(&self) -> usize {
        // ExactSizeIterator reports the *remaining* length, not the total.
        (self.max - self.count) as usize
    }
}
}

Knowing that the iterator will always return a fixed size allows the compiler to perform some optimizations.

To really optimize it, you can make MAX a compile-time constant. This gets a bit messy (there are const-generic markers everywhere)---but it works. This is a good example of code that belongs in a library:

struct Counter<const MAX: u32> {
    count: u32,
}

impl <const MAX:u32> Counter<MAX> {
    fn new() -> Self {
        Self { count: 0 }
    }
}

impl <const MAX:u32> Iterator for Counter<MAX> {
    type Item = u32;
    fn next(&mut self) -> Option<Self::Item> {
        if self.count < MAX {
            let result = Some(self.count);
            self.count += 1;
            result
        } else {
            None
        }
    }
}

impl <const MAX:u32> ExactSizeIterator for Counter<MAX> {
    fn len(&self) -> usize {
        // Again, report the remaining length rather than the total.
        (MAX - self.count) as usize
    }
}

fn main() {
    let numbers: Vec<u32> = Counter::<10>::new().collect();
    println!("{numbers:?}");
}

This helps because the optimizer can see exactly how many iterations will occur, and can reduce the number of bounds-checking calls that are required.

Iterating Over a Collection

The code for this is in code/04_mem/iterator_hashbucket.

Remember we created a HashMapBucket type? Let's extend it to provide an iterator.

We'll start by making an empty iterator type:

#![allow(unused)]
fn main() {
struct HashMapBucketIter;

impl Iterator for HashMapBucketIter {
    type Item = ();

    fn next(&mut self) -> Option<Self::Item> {
        None
    }
}
}

Now, let's add some generic properties---including a lifetime. You have to be sure that the structure you are iterating over will last longer than the iterator itself:

#![allow(unused)]
fn main() {
struct HashMapBucketIter<'a, K, V> {
    key_iter: std::collections::hash_map::Iter<'a, K, Vec<V>>,
    current_map_entry: Option<(&'a K, &'a Vec<V>)>,
    current_vec_index: usize,
}
}

You've added:

  • key_iter - an iterator over the map itself. This will return a reference to each key, and a reference to the vector of values. It's exactly the same as calling iter() on the stored map.
  • current_map_entry - the current key and vector of values. This is an Option because the iterator might be complete. It's the same as calling next() on key_iter.
  • current_vec_index - since we're returning each (K,V) entry separately, we need to track our progress through each key's vector of values.

Now we can add an iter() function to the HashMapBucket itself:

#![allow(unused)]
fn main() {
impl <K,V> HashMapBucket<K, V> {
    fn iter(&self) -> HashMapBucketIter<K, V> {
        let mut key_iter = self.map.iter();
        let current_map_entry = key_iter.next();
        HashMapBucketIter {
            key_iter,
            current_map_entry,
            current_vec_index: 0,
        }
    }
}
}

See how this aligns with the iterator type we created? We create an iterator over the map and store it inside our iterator type. This is why you had to specify a lifetime: you have to convince Rust that the map MUST out-live the iterator that borrows from it. Then we call next on the key iterator to obtain the first key and vector of values (a reference - we're not copying or cloning).

With this data, we can build the iterator itself:

#![allow(unused)]
fn main() {
// Specify 'a - the lifetime - and K, V on both sides.
// If you wanted to change how the iterator acts for a given type of key or
// value, you could change the left-hand side.
impl <'a, K, V> Iterator for HashMapBucketIter<'a, K, V> {
    // Define `Item` - a type used inside the trait - to be a reference to a key and value.
    // This specifies the type that the iterator will return.
    type Item = (&'a K, &'a V);

    // You use Item to specify the type returned by `Next`. It's always an option of the type.
    fn next(&mut self) -> Option<Self::Item> {
        // If there is a current map entry, and a current vec index
        if let Some((key, values)) = self.current_map_entry {
            // If the index is less than the length of the vector
            if self.current_vec_index < values.len() {
                // Get the value at the current index
                let value = &values[self.current_vec_index];
                // Increment the index
                self.current_vec_index += 1;
                // Return the key and value
                return Some((key, value));
            } else {
                // We're past the end of the vector, next key
                self.current_map_entry = self.key_iter.next();
                self.current_vec_index = 0;

                if let Some((key, values)) = self.current_map_entry {
                    // If the index is less than the length of the vector
                    if self.current_vec_index < values.len() {
                        // Get the value at the current index
                        let value = &values[self.current_vec_index];
                        // Increment the index
                        self.current_vec_index += 1;
                        // Return the key and value
                        return Some((key, value));
                    }
                }
            }
        }

        None
    }
}
}
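Here's a quick sketch of using it. This assumes the new() constructor from the earlier HashMapBucket example:

fn main() {
    // Assumes the HashMapBucket::new() constructor from the earlier example.
    let mut my_buckets = HashMapBucket::new();
    my_buckets.insert("hello", 1);
    my_buckets.insert("hello", 2);
    my_buckets.insert("goodbye", 3);

    // Each (key, value) pair comes out individually, even though the values
    // are stored in per-key vectors.
    for (key, value) in my_buckets.iter() {
        println!("{key} = {value}");
    }
}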

Web Service in 30 Minutes

Let's take what we've learned, and build a real network service. Along the way, we're going to learn about working with databases and a webserver named Axum.

Working With Databases

If there's one task that's common to Enterprise systems, it's waiting for the database. A lot of programs spend time waiting for the database---adding data, reading it, updating it, deleting it. Database interaction is one area in which async really shines---and Rust has tools to help with it.

Setup

I don't want to ask everyone to install a local copy of PostgreSQL or similar just for this class, that'd be excessive. Instead, we'll use sqlite---a tiny self-contained database. It's not very powerful, but it gets the job done.

The code for this example is in 03_async/database.

Let's start by adding some crates to our program:

cargo add tokio -F full
cargo add sqlx -F runtime-tokio-native-tls -F sqlite
cargo add anyhow
cargo add dotenv
cargo add futures

We'll also install the sqlx command-line tool with:

cargo install sqlx-cli

Lastly, we need to tell sqlx where to find the database we'll be using. In the top-level of your project (next to Cargo.toml and the src directory) create a file named .env. This is a helper for setting environment variables.

In .env, add the following line:

DATABASE_URL="sqlite:hello_db.db"

Create the Database

You can tell sqlx to create an empty database by typing:

sqlx database create

Notice that "hello_db.db" has appeared! This is the database file. You can open it with a SQLite client if you want to poke around.

Create a Migration

Migrations are a common process in applications. You define an initial migration to build tables and add any initial data you require. Then you add migrations to update the database as your application evolves. sqlx supports migrations natively, and can build them into your program.

Let's create a migration.

sqlx migrate add initial

initial is just the name of the migration. If you look in the source folder, a "migrations" folder has appeared. A .sql file containing the migration has been created. It's largely empty.

Let's add some SQL to create a table:

-- Create a messages table
CREATE TABLE IF NOT EXISTS messages
(
    id          INTEGER PRIMARY KEY NOT NULL,
    message     TEXT                NOT NULL
);

-- Insert some test messages
INSERT INTO messages (id, message) VALUES (1, 'Hello World!');
INSERT INTO messages (id, message) VALUES (2, 'Hello Galaxy!');
INSERT INTO messages (id, message) VALUES (3, 'Hello Universe!');

You can run the migrations with:

sqlx migrate run

An extra table is created storing migration status in the database. Migrations won't be run twice.

Accessing the Database via Async Rust

Now that we have a database, let's wire it up with some Rust.

use sqlx::Row;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Read the .env file and obtain the database URL
    dotenv::dotenv()?;
    let db_url = std::env::var("DATABASE_URL")?;

    // Get a database connection pool
    let pool = sqlx::SqlitePool::connect(&db_url).await?;

    // Fetch the messages from the database
    let messages = sqlx::query("SELECT id, message FROM messages")
        .map(|row: sqlx::sqlite::SqliteRow| {
            let id: i64 = row.get(0);
            let message: String = row.get(1);
            (id, message)
        })
        .fetch_all(&pool)
        .await?;

    // Print the messages
    for (id, message) in messages {
        println!("{id}: {message}");
    }

    Ok(())
}

The program outputs the data we inserted:

1: Hello World!
2: Hello Galaxy!
3: Hello Universe!

Let's Make this a Bit Easier

Mapping each row and parsing with get is messy---and you don't have to do it! sqlx supports a FromRow derive that can automatically convert rows into Rust structs.

Start by making a structure to hold the data:

#![allow(unused)]
fn main() {
use sqlx::FromRow;

#[derive(Debug, FromRow)]
struct Message {
    id: i64,
    message: String,
}
}

Then you can update the query to be much simpler:

#![allow(unused)]
fn main() {
let messages = sqlx::query_as::<_, Message>("SELECT id, message FROM messages")
    .fetch_all(&pool)
    .await?;

// Print the messages
for message in messages.into_iter() {
    println!("{message:?}");
}
}

sqlx is NOT an ORM (Object-Relational Mapper). It won't handle updating your structures or building SQL for you. There are options, including SeaORM and Diesel, if you need that.

How About Streaming?

Retrieving every single record with fetch_all is fine for small queries, but what if you are retrieving a million records? You could cause all manner of performance problems.

Aside: If you actually need to query a million records at once, that's often a sign of an architectural issue. Consider working in smaller chunks or using cursors/pagination, and check whether you really need all million rows or whether a filter would do.
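For example, a simple chunked query (a sketch reusing the Message struct from above, with made-up page_size and offset values):

#![allow(unused)]
fn main() {
// A sketch of paging through the table in chunks rather than fetching everything.
let page_size = 100i64; // made-up values for illustration
let offset = 0i64;
let page = sqlx::query_as::<_, Message>(
        "SELECT id, message FROM messages ORDER BY id LIMIT ? OFFSET ?",
    )
    .bind(page_size)
    .bind(offset)
    .fetch_all(&pool)
    .await?;
}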

We talked about streams a bit before. A stream is like an iterator, but accessing the next entry is an async operation. This has two advantages:

  • You are no longer blocking while you retrieve each row.
  • The database driver can receive rows one at a time, reducing overall load.

Conversely---it's not as fast, because you are waiting on each row.

Let's try it out:

#![allow(unused)]
fn main() {
println!("--- stream ---");
use futures::TryStreamExt;
let mut message_stream = sqlx::query_as::<_, Message>("SELECT id, message FROM messages")
    .fetch(&pool);
while let Some(message) = message_stream.try_next().await? {
    println!("{message:?}");
}
}

Let's Automate our Migrations

Having to run the migrations tool by hand each time is cumbersome. We can automate that, too.

This is pretty straightforward. Add the following:

#![allow(unused)]
fn main() {
// Get a database connection pool
// (This snippet goes right after the connection pool is created in main)

// Run Migrations
sqlx::migrate!("./migrations")
    .run(&pool)
    .await?;
}

Now let's make another migration that adds a bit more data to the database:

sqlx migrate add more_messages

And we'll set the migration contents to:

INSERT INTO messages (id, message) VALUES (4, 'Another Message');
INSERT INTO messages (id, message) VALUES (5, 'Yet Another Message');
INSERT INTO messages (id, message) VALUES (6, 'Messages Never End');

Now don't run the sqlx migration command. Instead, run your program.

The migration ran, and you see your new data:

--- stream ---
Message { id: 1, message: "Hello World!" }
Message { id: 2, message: "Hello Galaxy!" }
Message { id: 3, message: "Hello Universe!" }
Message { id: 4, message: "Another Message" }
Message { id: 5, message: "Yet Another Message" }
Message { id: 6, message: "Messages Never End" }

Run it again. You don't get even more data appearing (or errors about duplicate keys). The migrations table ensures that each migration only runs once.

Updating Data

Running update and delete queries uses slightly different syntax, but it's basically the same. Let's update the first message:

First, we'll create a function.

#![allow(unused)]
fn main() {
async fn update_message(id: i64, message: &str, pool: &sqlx::SqlitePool) -> anyhow::Result<()> {
    sqlx::query("UPDATE messages SET message = ? WHERE id = ?")
        .bind(message)
        .bind(id)
        .execute(pool)
        .await?;
    Ok(())
}
}

Note:

  • .bind replaces placeholders in the query with the values you provide. This is a good way to avoid SQL injection attacks.
  • .execute runs a query that isn't expecting an answer other than success or failure.

And then in main we call it:

#![allow(unused)]
fn main() {
// Update message 1
update_message(1, "First Message", &pool).await?;
}
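A delete follows exactly the same pattern. A sketch (this helper isn't in the example code):

#![allow(unused)]
fn main() {
async fn delete_message(id: i64, pool: &sqlx::SqlitePool) -> anyhow::Result<()> {
    // Same pattern as the update: bind the placeholder and execute.
    sqlx::query("DELETE FROM messages WHERE id = ?")
        .bind(id)
        .execute(pool)
        .await?;
    Ok(())
}
}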

Let's Add Tracing

sqlx supports tracing, so you can see what's going on under the hood. Let's add it to our program.

Start by adding the tracing subscriber to your Cargo.toml:

cargo add tracing
cargo add tracing-subscriber

Add a tracing subscriber initialization to your main function:

#![allow(unused)]
fn main() {
// Enable tracing
tracing_subscriber::fmt::init();
}

Now run the program unchanged. You will see lots of extra information:

2023-05-31T15:11:57.330979Z  INFO sqlx::query: SELECT id, message FROM …; rows affected: 1, rows returned: 6, elapsed: 94.900µs

SELECT
  id,
  message
FROM
  messages

If you didn't see anything, set the environment variable RUST_LOG=info. On *NIX, you can run RUST_LOG=info cargo run. On Windows (PowerShell), $Env:RUST_LOG="info" sets the variable.

Axum - Tokio's Web Framework

Axum is a web framework built on top of Tokio. It is inspired by the likes of Rocket, Actix and Warp. It is lightweight and relatively easy to use. It also includes a number of features that make it a good choice for building enterprise web services.

Hello Web

This example is in 03_async/hello_web

Let's make a really trivial webserver that just returns "Hello World" with no formatting, nor even HTML decoration.

To start, you need two services: tokio and axum. Add them to your Cargo.toml:

cargo add tokio -F full
cargo add axum

Now in our main.rs file, we can build a minimal webserver quite easily:

use axum::{routing::get, Router};
use std::net::SocketAddr;

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(say_hello_text));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

async fn say_hello_text() -> &'static str {
    "Hello, world!"
}

Let's unpack this:

  • Router is an Axum service that matches URLs to services. In this case, we're matching the root URL (/) to the say_hello_text service.
  • addr is set with a SocketAddr to bind to localhost on port 3000.
  • axum::Server uses the builder pattern
    • bind sets the address to bind to.
    • serve accepts the Router you created, and launches a webserver.
    • You have to await the server, because it's an asynchronous task.
    • unwrap is used to handle any errors that might occur.

The say_hello_text function is relatively straightforward. Don't worry about 'static --- we'll talk about it in next week's talk about memory management, resource management and variables.

Run the program with cargo run. It won't output anything. Point a browser at http://localhost:3000 and be amazed by the Hello, world! message.

You've made a basic webserver in 16 lines of code. It's also very fast---which is extra easy since it doesn't really do anything yet.

Let's return "Hello World" as HTML

A webserver that doesn't return HTML is a bit odd, so let's turn "Hello World" into a proper HTML page.

#![allow(unused)]
fn main() {
use axum::response::Html;

async fn say_hello_html() -> Html<&'static str> {
    Html("<h1>Hello, world!</h1>")
}
}

And change your route to call the new function. Run the program and go to http://localhost:3000 again. You should see a big, bold "Hello, world!".

HTML in Static Files

It's easier to write large amounts of HTML in a separate file. You can then import the file into your program. Let's do that.

First, in your src directory create a file named hello.html:

<html>
<head>
    <title>Hello World</title>
</head>
<body>
    <p>Greetings, oh lovely world.</p>
</body>
</html>

I'm not great at HTML!

Now, in your main.rs file, you can import the file and return it as HTML:

#![allow(unused)]
fn main() {
async fn say_hello_html_included() -> Html<&'static str> {
    const HTML: &str = include_str!("hello.html");
    Html(HTML)
}
}

Change the route again, and your file will be included when you run the webserver.

HTML in Dynamic Files

There's a very real performance benefit to statically loading your pages, but it makes editing them a pain. Let's make a dynamic page that loads the HTML from a file.

This only requires a small change:

#![allow(unused)]
fn main() {
use std::path::Path;

async fn say_hello_file() -> Html<String> {
    let path = Path::new("src/hello.html");
    let content = tokio::fs::read_to_string(path).await.unwrap();
    Html(content)
}
}

Now run your webserver. Change the HTML file and reload the page. You should see the changes.

Note: You probably want some cache in the real world---but this is great for rapid development.

Add a JSON Get Service

Let's add serde to our project with cargo add serde -F derive.

Now we'll add a structure and make it serializable:

#![allow(unused)]
fn main() {
use serde::Serialize;

#[derive(Serialize)]
struct HelloJson {
    message: String,
}
}

And we can add a handler function to return some JSON:

#![allow(unused)]
fn main() {
async fn say_hello_json() -> axum::Json<HelloJson> {
    axum::Json(HelloJson {
        message: "Hello, World!".to_string(),
    })
}
}

Lastly, we need to add a route to use it:

#![allow(unused)]
fn main() {
let app = Router::new()
    .route("/", get(say_hello_file))
    .route("/json", get(say_hello_json));
}

Now run the server, and connect to http://localhost:3000/json. You'll see a JSON response.

Responding to Other Verbs

Axum supports get, post, put, delete, head, options, trace, connect and patch. Let's add a post route.

#![allow(unused)]
fn main() {
async fn say_hello_post() -> &'static str {
    "Hello, POST!"
}
}

Now add it to your routes:

#![allow(unused)]
fn main() {
let app = Router::new()
    .route("/", get(say_hello_file))
    .route("/json", get(say_hello_json))
    .route("/post", post(say_hello_post));
}

Let's update the HTML page to perform the POST for us:

<html>

<head>
    <title>Hello World</title>
</head>

<body>
    <p>Greetings, oh lovely world.</p>
    <p id="result"></p>
</body>

<script>
    function doPost() {
        fetch('/post', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: ''
        })
            .then(response => response.text())
            .then(result => {
                document.getElementById('result').innerHTML = result;
            })
            .catch(error => {
                console.error('Error:', error);
            });

    }

    doPost();
</script>

</html>

As you can see, I'm not a JavaScript programmer either!

The same techniques work for all of the HTTP verbs.

Let's Build a Thumbnail Server

The last class in this series will build a fully functional server with all the bells and whistles.

We've covered a lot of ground in the classes so far:

  • We've learned to make basic Rust programs.
  • We can serialize and de-serialize data.
  • We've learned all about system threads, and using Rayon to make them easy to use.
  • We've covered async/await for high-performance servers.
  • We've talked a lot about Tokio and what it can do for you.
  • We've connected to databases.
  • We've built a mini web server using Axum.

That's a lot of ground to cover in just a few hours. To help it "click", let's build a server that draws together some of these concepts. We're squeezing this into the end of the class, so it will be a bit bare-bones.

The Design Idea

We want to create a simple web server that displays thumbnails of images. It will need the following endpoints:

  • / - Display thumbnails of all images. Includes a form for adding an image.
  • /images - JSON list of all uploaded images.
  • (post) /upload - Upload a new image and create a thumbnail.
  • /image/<id> - Display a single image.
  • /thumb/<id> - Display a single thumbnail.
  • (post) /search - Find images by tag.

The code for this is in 03_async/thumbnail_server.

Add Dependencies

We're going to be pulling together much of what we've already learned, so we have quite a few dependencies:

cargo add tokio -F full
cargo add serde -F derive
cargo add axum -F multipart
cargo add sqlx -F runtime-tokio-native-tls -F sqlite
cargo add anyhow
cargo add dotenv
cargo add futures
cargo add tokio-util -F io
cargo add image

Create the Database

Create a .env file in your project containing:

DATABASE_URL="sqlite:images.db"

Then create the database:

sqlx database create

Let's also create a migration to make our initial database:

sqlx migrate add initial

A file has appeared in the migrations directory. Let's flesh out a minimal images database:

-- Create images table
CREATE TABLE IF NOT EXISTS images
(
    id          INTEGER PRIMARY KEY NOT NULL,
    tags        TEXT                NOT NULL
);

Now we'll build our main.rs file to run with Tokio, read the .env file, connect to the database and run any migrations:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Read the .env file and obtain the database URL
    dotenv::dotenv()?;
    let db_url = std::env::var("DATABASE_URL")?;

    // Get a database connection pool
    let pool = sqlx::SqlitePool::connect(&db_url).await?;

    // Run Migrations
    sqlx::migrate!("./migrations")
        .run(&pool)
        .await?;

    Ok(())
}

This is a good time to test that everything is working. We can run the server with cargo run and see that it compiles and runs.

Setup Axum with a Layer to Offer Dependency Injection for the Database

Axum can help with global state and dependency injection. We'll use this to inject the database connection pool into our handlers.

First, let's create the Axum application:

#![allow(unused)]
fn main() {
// Build Axum with an "extension" to hold the database connection pool
let app = Router::new()
    .route("/", get(test))
    .layer(Extension(pool));
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
axum::Server::bind(&addr)
    .serve(app.into_make_service())
    .await
    .unwrap();
}

We'll also make a handler named test that just returns a string:

#![allow(unused)]
fn main() {
async fn test(Extension(pool): Extension<sqlx::SqlitePool>) -> String {
    let result = sqlx::query("SELECT COUNT(id) FROM images")
        .fetch_one(&pool)
        .await
        .unwrap();
    let count = result.get::<i64, _>(0);
    format!("{count} images in the database")
}
}

Now run the program. Go to http://localhost:3000 and you will see "0 images in the database". We've injected our database pool, successfully queried the database, and returned dynamic data.

Create a Basic Homepage

Let's create the beginnings of a web page that will display our thumbnails. Create a new file named index.html in your src directory:

<!DOCTYPE html>
<html>
<head>
    <title>My Awesome Thumbnail Server</title>
</head>
<body>
    <h1>Welcome to the thumbnail server</h1>
    <div id="thumbnails"></div>
    <hr />
    <h2>Add an Image</h2>
    <form method="post" action="/upload" enctype="multipart/form-data">
        <input type="text" name="tags" value="" placeholder="Tags" /> <br />
        <input type="file" name="image" /> <br />
        <input type="submit" value="Upload New Image" />
    </form>
</body>
</html>

Now comment out the test function, and add a replacement for loading the index page from disk:

#![allow(unused)]
fn main() {
async fn index_page() -> Html<String> {
    let path = Path::new("src/index.html");
    let content = tokio::fs::read_to_string(path).await.unwrap();
    Html(content)
}
}

Adjust the route to point to the new page: .route("/", get(index_page)).

Run the program now (cargo run) and you can see the HTML form. We're making progress!

Uploading Images

Axum has great support for forms built-in, but files are always a little more complicated. The easiest way is to use a multipart form, which we've done in the HTML file. (You can normally use the Form type to automatically deserialize forms; we'll do that in the search system).

Create a new handler named uploader:

#![allow(unused)]
fn main() {
async fn uploader(mut multipart: Multipart) -> String {
    while let Some(field) = multipart.next_field().await.unwrap() {
        let name = field.name().unwrap().to_string();
        let data = field.bytes().await.unwrap();

        println!("{name} is {} bytes", data.len());
    }
    "Ok".to_string()
}
}

And add it to your routes:

#![allow(unused)]
fn main() {
.route("/upload", post(uploader))
}

Now run the program and submit an image and some tags. You'll see something like this on the server console:

tags is 11 bytes
image is 1143067 bytes

The data made it! Now we need to turn it into useful data. Let's extract the fields we want:

#![allow(unused)]
fn main() {
async fn uploader(mut multipart: Multipart) -> String {
    let mut tags = None; // "None" means "no tags yet"
    let mut image = None;
    while let Some(field) = multipart.next_field().await.unwrap() {
        let name = field.name().unwrap().to_string();
        let data = field.bytes().await.unwrap();

        match name.as_str() {
            "tags" => tags = Some(String::from_utf8(data.to_vec()).unwrap()), // Using Some means we can check we received it
            "image" => image = Some(data.to_vec()),
            _ => panic!("Unknown field: {name}"),
        }
    }

    if let (Some(tags), Some(image)) = (tags, image) { // Destructuring both Options at once

    } else {
        panic!("Missing field");
    }

    "Ok".to_string()
}
}

This gives a relatively robust extractor---we can be sure that we've received both fields we need, and the handler will panic if we haven't.

You can run the program now and resubmit the form---if it doesn't error out, it's all good.

Saving the Image

The first thing to do is to add the image with tags to the database, and obtain the new primary key. Let's create a function to do this:

#![allow(unused)]
fn main() {
async fn insert_image_into_database(pool: &Pool<Sqlite>, tags: &str) -> anyhow::Result<i64> {
    let row = sqlx::query("INSERT INTO images (tags) VALUES (?) RETURNING id")
        .bind(tags)
        .fetch_one(pool)
        .await?;

    Ok(row.get(0))
}
}

The function simply inserts the tags and returns the new id. We've used anyhow to simplify error handling.

Now let's call it:

#![allow(unused)]
fn main() {
if let (Some(tags), Some(image)) = (tags, image) {
    let new_image_id = insert_image_into_database(&pool, &tags).await.unwrap();
} else {
    panic!("Missing field");
}
}

We need to save the image to disk. Let's create a function to do this:

#![allow(unused)]
fn main() {
async fn save_image(id: i64, bytes: &[u8]) -> anyhow::Result<()> {
    // Check that the images folder exists and is a directory
    // If it doesn't, create it.
    let base_path = Path::new("images");
    if !base_path.exists() || !base_path.is_dir() {
        tokio::fs::create_dir_all(base_path).await?;
    }

    // Use "join" to create a path to the image file. Join is platform aware,
    // it will handle the differences between Windows and Linux.
    let image_path = base_path.join(format!("{id}.jpg"));
    if image_path.exists() {
        // The file exists. That shouldn't happen.
        anyhow::bail!("File already exists");
    }

    // Write the image to the file
    tokio::fs::write(image_path, bytes).await?;
    Ok(())
}
}

And let's call it from the uploader:

#![allow(unused)]
fn main() {
if let (Some(tags), Some(image)) = (tags, image) {
    let new_image_id = insert_image_into_database(&pool, &tags).await.unwrap();
    save_image(new_image_id, &image).await.unwrap();
} else {
    panic!("Missing field");
}
}

We're not making any thumbnails yet, but we should test our progress so far. Run the program, upload an image and check that it appears in the images folder.

Displaying all the Images

Now that we have an image in both the database and the filesystem, let's display it. We'll need to create a new handler:

#![allow(unused)]
fn main() {
async fn get_image(Path(id): Path<i64>) -> impl IntoResponse {
    let filename = format!("images/{id}.jpg");
    let attachment = format!("filename={filename}");
    // Build the response by hand: set the content type and disposition headers,
    // then stream the file contents back to the client.
    let file = tokio::fs::File::open(&filename).await.unwrap();
    axum::response::Response::builder()
        .header(header::CONTENT_TYPE, header::HeaderValue::from_static("image/jpeg"))
        .header(header::CONTENT_DISPOSITION, header::HeaderValue::from_str(&attachment).unwrap())
        .body(StreamBody::new(ReaderStream::new(file)))
        .unwrap()
}
}

This is a bit boilerplatey, but it shows you the options you have. In this case, we build the appropriate HTTP headers and use StreamBody to stream the contents of the image file to the client.

The Path(id) is a very handy Axum extractor. You can specify placeholders in the URL, and use Path to fill variables.

We'll also need to add the route:

#![allow(unused)]
fn main() {
.route("/image/:id", get(get_image))
}

Now run the program and go to http://localhost:3000/image/1 to see the image we just uploaded.

Making a thumbnail

The Rust image crate makes creating thumbnails easy. It also uses Rayon under the hood, so it's fast. Let's create a function to make a thumbnail:

#![allow(unused)]
fn main() {
fn make_thumbnail(id: i64) -> anyhow::Result<()> {
    let image_path = format!("images/{id}.jpg");
    let thumbnail_path = format!("images/{id}_thumb.jpg");
    let image_bytes: Vec<u8> = std::fs::read(image_path)?;
    let image = if let Ok(format) = image::guess_format(&image_bytes) {
        image::load_from_memory_with_format(&image_bytes, format)?
    } else {
        image::load_from_memory(&image_bytes)?
    };
    let thumbnail = image.thumbnail(100, 100);
    thumbnail.save(thumbnail_path)?;
    Ok(())
}
}

We're doing a little dance here because we don't trust that the uploaded file is actually a good JPEG. I kept uploading a PNG by mistake. So we load the image as bytes, and then use guess_format to let the image crate figure out the format for us!

Since our first image doesn't have a thumbnail yet, let's use it as test data. We'll build a function that grabs a list of all of the images in our database, checks to see if a thumbnail exists and makes one if it doesn't:

#![allow(unused)]
fn main() {
async fn fill_missing_thumbnails(pool: &Pool<Sqlite>) -> anyhow::Result<()> {
    let mut rows = sqlx::query("SELECT id FROM images")
        .fetch(pool);

    while let Some(row) = rows.try_next().await? {
        let id = row.get::<i64, _>(0);
        let thumbnail_path = format!("images/{id}_thumb.jpg");
        if !std::path::Path::new(&thumbnail_path).exists() {
            spawn_blocking(move || {
                make_thumbnail(id)
            }).await??;
        }
    }

    Ok(())
}
}

Now let's add this to the beginning of the program (before we start Axum):

#![allow(unused)]
fn main() {
// Check for any missing thumbnails before starting the server
fill_missing_thumbnails(&pool).await?;
}

You could easily run this in the background by spawning it separately. You could also not await on the spawn_blocking call to have the background threads all run concurrently.
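For example, a sketch of kicking the back-fill off in the background instead of blocking startup (cloning an sqlx pool is cheap; it's just a handle to the same underlying pool):

#![allow(unused)]
fn main() {
// Spawn the back-fill as a background task so startup isn't blocked.
let pool_clone = pool.clone();
tokio::spawn(async move {
    if let Err(e) = fill_missing_thumbnails(&pool_clone).await {
        eprintln!("thumbnail back-fill failed: {e}");
    }
});
}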

Now run the program and check that the thumbnail exists.

Finally, let's also make the thumbnail on upload:

#![allow(unused)]
fn main() {
if let (Some(tags), Some(image)) = (tags, image) {
    let new_image_id = insert_image_into_database(&pool, &tags).await.unwrap();
    save_image(new_image_id, &image).await.unwrap();
    spawn_blocking(move || {
        make_thumbnail(new_image_id).unwrap();
    });
} else {
    panic!("Missing field");
}
}

Listing Available Images

Let's be lazy since it's the end of the day and copy/paste get_image and make a thumbnail fetching version.

#![allow(unused)]
fn main() {
async fn get_thumbnail(Path(id): Path<i64>) -> impl IntoResponse {
    let filename = format!("images/{id}_thumb.jpg");
    // ...
}
}

(The rest of the function is unchanged from get_image.)

Now let's add the route:

#![allow(unused)]
fn main() {
.route("/thumb/:id", get(get_thumbnail))
}

You can test this if you like. Go to http://localhost:3000/thumb/1 and you should see the thumbnail.

Now let's build a JSON service that returns all of the stored images:

#![allow(unused)]
fn main() {
#[derive(Deserialize, Serialize, FromRow, Debug)]
struct ImageRecord {
    id: i64,
    tags: String,
}

async fn list_images(Extension(pool): Extension<sqlx::SqlitePool>) -> Json<Vec<ImageRecord>> {
    sqlx::query_as::<_, ImageRecord>("SELECT id, tags FROM images ORDER BY id")
        .fetch_all(&pool)
        .await
        .unwrap()
        .into()
}
}

And of course, we add a route for it:

#![allow(unused)]
fn main() {
.route("/images", get(list_images))
}

Let's test that real quick. Go to http://localhost:3000/images and you should see a JSON list of all of the images.

Client Side Thumbnail Display

Now let's modify the HTML. You can leave the server running for live testing. We need to call the /images endpoint and display the thumbnails.

<script>
    async function getImages() {
        const response = await fetch('/images');
        const images = await response.json();

        let html = "";
        for (let i=0; i<images.length; i++) {
            html += "<div>" + images[i].tags + "<br />";
            html += "<a href='/image/" + images[i].id + "'>";
            html += "<img src='/thumb/" + images[i].id + "' />";
            html += "</a></div>";
            
        }
        document.getElementById("thumbnails").innerHTML = html;
    }

    getImages();
</script>

Redirect on POST

Rather than just saying "Ok" when someone uploads an image, let's redirect back to the list of images.

We'll make a simple file, src/redirect.html:

<html>
    <body>
        Image Uploaded!

        <script>
            function redirect() {
                window.location.href="/";
            }
            setTimeout(redirect, 1000);
        </script>
    </body>
</html>

Now change the uploader function signature to return Html<String>:

#![allow(unused)]
fn main() {
async fn uploader(
    Extension(pool): Extension<sqlx::SqlitePool>,
    mut multipart: Multipart,
) -> Html<String> {
}

And change the bottom to load redirect.html and return it as HTML:

#![allow(unused)]
fn main() {
    let path = std::path::Path::new("src/redirect.html");
    let content = tokio::fs::read_to_string(path).await.unwrap();
    Html(content)
}
}

Now run the program and add a file. You should go back to the list of images, with your new image displayed.

Search

Finally, I promised a search function.

Let's start by adding another form to index.html:

<body>
    <h1>Welcome to the thumbnail server</h1>
    <div id="thumbnails"></div>
    <hr />
    <form method="post" action="/search">
        <input type="text" name="tags" value="" placeholder="Tags" /> <br />
        <input type="submit" value="Search" />
    </form>
    <hr />
    <h2>Add an Image</h2>

Next, we create a Rust structure to receive the contents of the form post:

#![allow(unused)]
fn main() {
#[derive(Deserialize)]
struct Search {
    tags: String
}
}

Now we can build a handler to server-side render the results:

#![allow(unused)]
fn main() {
async fn search_images(Extension(pool): Extension<sqlx::SqlitePool>, Form(form): Form<Search>) -> Html<String> {
    let tag = format!("%{}%", form.tags);

    let rows = sqlx::query_as::<_, ImageRecord>("SELECT id, tags FROM images WHERE tags LIKE ? ORDER BY id")
        .bind(tag)
        .fetch_all(&pool)
        .await
        .unwrap();

    let mut results = String::new();
    for row in rows {
        results.push_str(&format!("<a href=\"/image/{}\"><img src='/thumb/{}' /></a><br />", row.id, row.id));
    }

    let path = std::path::Path::new("src/search.html");
    let mut content = tokio::fs::read_to_string(path).await.unwrap();
    content = content.replace("{results}", &results);

    Html(content)
}
}

It is referring to src/search.html, so let's make that too. Notice the {results} placeholder.

Tip: There are some great templating libraries for this sort of placeholder substitution!

<!DOCTYPE html>
<html>

<head>
    <title>My Awesome Thumbnail Server</title>
</head>

<body>
    <h1>Welcome to the thumbnail server</h1>
    <div id="thumbnails">{results}</div>
    <hr />
    <form method="post" action="/search">
        <input type="text" name="tags" value="" placeholder="Tags" /> <br />
        <input type="submit" value="Search" />
    </form>

</body>

</html>

Run the program, and you can search!

233 lines of code in the last coding session of the day. You've:

  • Set up a database with migrations.
  • Auto-run the migrations on start.
  • Made a thumbnail of any image that has been uploaded and doesn't have one.
  • Set up a multi-part form post that saves image information to the database and saves the file.
  • Called into system thread land to generate an image thumbnail.
  • Added a redirect to handle a POSTed form.
  • Added a simple search function.

Not bad - Rust is very productive once you get going.

Writing Rustacean Code

Best Practices: Tooling

Use cargo fmt

cargo fmt is Rust's built-in code formatter. You can configure a style guide by setting up rustfmt.toml and following this guide. Staying close to the standard format is recommended.

Format early, format often

You can dig yourself into a bit of a hole by only occasionally remembering to format your code. Take the following scenario:

  1. Developer A writes some code and doesn't format it.
  2. Developer B goes to fix a bug in the code, and does format it.
  3. Developer B's patch now contains a lot of formatting changes, which makes it hard to review.

So: run cargo fmt often, before each commit.

Checking that everyone remembered to format their code

You can use cargo fmt -- --check to check that all code is formatted. This is a good thing to add to your CI pipeline. In a workspace, use cargo fmt --all -- --check to check the entire workspace.

You can also add it to your git hooks (assuming you are using git). Add to or create .git/hooks/pre-commit:

#!/bin/bash

diff=$(cargo fmt -- --check)
result=$?

if [[ ${result} -ne 0 ]] ; then
    cat <<\EOF
There are some code style issues, run `cargo fmt` first.
EOF
    exit 1
fi

exit 0

Don't forget to make it executable with chmod u+x .git/hooks/pre-commit!

Excluding Some Code from Formatting

Sometimes, you like the way something is formatted---even though it's not standard. You can exclude it from formatting by adding a #[rustfmt::skip] attribute to the item. For example:

#![allow(unused)]
fn main() {
#[rustfmt::skip]
mod unformatted {
    pub fn add(a : i32, b : i32) -> i32 { a + b }
    pub fn sub(a : i32, b : i32) -> i32 { a - b }
}
}

This is particularly handy when you are working with a table and you like the tabular formatting, or generated code.

Excluding Whole Files from Formatting

If you have generated code that winds up being committed (particularly common if you are generating bindings for FFI), you can add ignore to your rustfmt.toml file. For example:

ignore = [
    "src/my_c_bindings.rs", # Ignore a file
    "src/bindgen", # Ignore a directory
]

It uses the same rules as .gitignore.

Automatically Formatting in VSCode

If you're using my favorite editor (VS Code), you can set a hook to run code formatting when you save a file. Edit settings.json and add the following:

{
    "[rust]": {
        "editor.defaultFormatter": "rust-lang.rust", // Makes the magic
        "editor.formatOnSave": true // Optional
    },
}

Using Cargo Watch

You can install a tool named cargo-watch with cargo install cargo-watch. Once cargo-watch is installed, you can run:

cargo watch -x 'fmt'

Leave it running. Open a file, make a change, and save it. You'll see the formatting run.

Cargo Watch is a very powerful command. You can use it to re-run servers on changes, run tests (this can be slow), etc. See the documentation for more information.
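For example, you can chain several commands. This sketch checks, tests, and then re-runs your program on every change (adjust to taste for your project):

cargo watch -x check -x test -x run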

Clippy - the Linter

Clippy isn't actually an annoying paperclip in Rust.

The clippy linter is installed by default with Rust. It's a great tool that will help you find common mistakes and bad practices in your code. It's also a great way to learn more about Rust.

Clippy Basics

Let's start with some innocuous code:

fn main() {
    let numbers = (0..100).collect::<Vec<i32>>();
    for i in 0 .. numbers.len() {
        println!("{}", numbers[i]);
    }
}

The code works. Clippy will happily tell you why it's not idiomatic Rust:

cargo clippy
warning: the loop variable `i` is only used to index `numbers`
 --> clippy_test\src\main.rs:3:14
  |
3 |     for i in 0 .. numbers.len() {
  |              ^^^^^^^^^^^^^^^^^^
  |
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_range_loop
  = note: `#[warn(clippy::needless_range_loop)]` on by default
help: consider using an iterator
  |
3 |     for <item> in &numbers {
  |         ~~~~~~    ~~~~~~~~

The code for i in 0 .. numbers.len() is a common pattern in other languages, but it's not idiomatic Rust. The idiomatic way to do this is to use an iterator:

#![allow(unused)]
fn main() {
for i in &numbers {
    println!("{i}");
}
}

We'll talk about iterators in a minute, let's stay focused on Clippy for now!

Pedantic Clippy

Clippy has a "pedantic mode". You probably don't want to use it all the time---it will drive you nuts! It is worth periodically enabling it, and looking for the deeper problems it finds. Pedantic Clippy is slower than regular Clippy.

Add the following to your top-level file:

#![allow(unused)]
#![warn(clippy::pedantic)]
fn main() {
}

The ! means the attribute applies to the whole enclosing scope; placed at the top of your crate, it covers the entire crate. You'll see a lot of warnings on a large code-base. They may not all be correct (pedantic mode contains a few work-in-progress items), but they often give great hints as to where you can improve your code.

Ignoring Warnings

Sometimes, you want to ignore a warning. You can do this with an #[allow(...)] attribute on the item, naming the lint you want to silence. For example, to silence the dead-code warning on a function that is never called:

#![allow(unused)]
fn main() {
#[allow(dead_code)]
fn nobody_called_me() {
    // Code
}
}
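The same attribute works for specific Clippy lints. For example, if you genuinely want the index-based loop from earlier, you can silence just that lint (a small sketch):

#![allow(unused)]
fn main() {
#[allow(clippy::needless_range_loop)]
fn print_all(numbers: &[i32]) {
    // Deliberately index-based; Clippy won't warn about this function.
    for i in 0..numbers.len() {
        println!("{}", numbers[i]);
    }
}
}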

Warnings exist for a reason: they often indicate a problem. If you are going to ignore a warning, make sure you understand why it's there, and that you are ignoring it for a good reason. It's often a good idea to put a comment in place to explain why you are ignoring it.

Replace cargo check with cargo clippy in VSCode

  1. Open your settings (ctrl + comma)
  2. Search for "cargo check"
  3. Change "Rust Analyzer > Check Command" to "clippy"

This is a bit slower, but will run the linter every time you save a file. It will also show you the linter results in the "Problems" tab.
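If you prefer editing settings.json directly, the equivalent entry looks roughly like this (the key name assumes a recent rust-analyzer; older versions used rust-analyzer.checkOnSave.command):

{
    "rust-analyzer.check.command": "clippy"
}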

Document Your Code

If you're working with anyone else---or even your future self---documentation is a good idea. Rust has a built-in documentation system that is very easy to use. It's also very easy to publish your documentation to the web.

When you're writing a library, you should document every public function and structure. You should also provide module documentation. This is a good habit to get into, even if you're not writing a library.

If you're using an IDE that integrates with a Rust Language Server, documentation will automatically appear in your tooltips. This makes it easy for others to learn to use your code.

Documented Library

A good way to make sure you don't miss some documentation is to add the warning to the top of your crate:

#![allow(unused)]
#![warn(missing_docs)]
fn main() {
}

There's an ongoing debate as to whether this should be enabled by default!

Scope-Level Documentation

The top of your library (and each public module) should have a documentation comment. The //! syntax indicates a scope-level comment. This is where you should describe the purpose of the library, and any other information that is relevant to the entire library.

#![allow(unused)]
fn main() {
//! # Example Module
//! 
//! This module serves to demonstrate Rust's documentation system. It doesn't do anything useful.
//!
//! ## Examples
//! 
//! ```
//! use documented::example_function;
//! example_function();
//! ```
}

Ideally, you want to include:

  • What the library does and why you want to use it
  • Some examples of how to use it

Now when you use the library, your IDE shows a nice tooltip describing how to use it.

You've also included your documentation in your tests (like we talked about in Unit Tests). This means that your documentation is tested every time you run your tests with cargo test. This makes it harder to forget to update the documentation!

Function-Level Documentation

You should also document every public function and structure. This is done with the /// syntax. You should include:

  • What the function does
  • What the parameters are
  • What the return value is (if any)
  • Any errors that can be returned
  • Examples of how to use the function

Here's a fully documented---but quite useless---function:

#![allow(unused)]
fn main() {
/// This is an example function. It prints a message including the value
/// of `n`, and returns `n * 2`.
/// 
/// # Arguments
/// 
/// * `n` - The number to multiply by 2
/// 
/// # Returns
/// 
/// The result of `n * 2`
/// 
/// # Examples
/// 
/// ```
/// assert_eq!(documented::example_function(2), 4);
/// ```
pub fn example_function(n: i32) -> i32 {
    println!("This is an example function. n = {n}");
    n * 2
}
}

Once again, you've baked a unit test into your documentation. You've also got a nice tooltip for consumers in their IDE.

If you write a public unsafe fn, Clippy will warn you (missing_safety_doc) if the documentation doesn't include a # Safety section. Even when the unsafety is internal, as in this example, it's good practice to document it.

#![allow(unused)]
fn main() {
/// Example of an unsafe function
/// 
/// # Safety
/// 
/// This function uses `get_unchecked` for fast access to a vector. This is ok, because the 
/// bounds of the vector are known ahead of time.
pub fn example_unsafe() -> i32 {
    let n = vec![1, 2, 3, 4, 5];
    unsafe {
        *n.get_unchecked(3)
    }
}
}

If you have a function that can panic, it's a good idea to warn the user (it's an even better idea to use a Result type!).

#![allow(unused)]
fn main() {
/// An example of a panicking function
/// 
/// # Panics
/// 
/// This function panics if the option is `None`.
pub fn panic_example(n: Option<i32>) -> i32 {
    n.unwrap() * 2
}
}

Modules

Modules require their own block-level documentation if they are public:

#![allow(unused)]
fn main() {
pub mod frobnicator {
    //! The Frobnicator!

    /// Frobnicates the input
    pub fn do_it() {
        println!("Frobnicating!");
    }
}
}

Hyperlinking

You can include hyperlinks in your documentation. These can link to other parts of the documentation:

#![allow(unused)]
fn main() {
/// Wraps [example_function](#example_function)
pub fn example_function_wrapper(n: i32) -> i32 {
    example_function(n)
}
}

Gives you a clickable link that takes you to the wrapped function. You can also use regular markdown syntax for links to HTML.

Official Guidelines

There's a complete documentation guide for the Rust API here. There's an even-more in-depth description in Rust RFC 1574!

Making a Documentation Site

The docs.rs site uses cargo doc to build its documentation. You can do the same! If you just run cargo doc, you'll often get a huge site---it includes all of your dependencies.

As an example, let's build a documentation site for the code we've been working on in this example:

cargo doc --open --no-deps

--open tells Cargo to open your browser to the docs when they are ready. --no-deps skips building documentation for all of your dependencies.

Your browser will open, giving you a searchable documentation site.

The actual site will be in target/doc. You can include this in your CI to build a running documentation site for coworkers (or clients who depend upon your code).

Spellcheck!

My IDE is set to spellcheck for me. If you need it, you can cargo install cargo-spellcheck and run cargo spellcheck to run a spelling checker against your documentation.

Finally... Don't Leave Docs to Last

It's really easy to ignore writing documentation, and leave it to last. Then you have an exhausting day of going through every function, remembering exactly how they work. That's no fun. Try to write your documentation as you go. If you are creating a library that will be shared, add the missing_docs warning early and have it nag you. It's much easier to write documentation as you go, than to try to remember what you did later.

Understanding Dependencies

Rust has a great dependency management system in Cargo. It's easy to use, and it's easy to find and use libraries. On the other hand, it leads to a JavaScript/NPM like situation in which you can have a lot of dependencies. This can lead to difficulties auditing your code, accidental licensing issues (see Denying Dependencies by License), and other problems. You can also run into the age-old problem of depending upon a crate, that crate updating, and your program breaking!

Listing Your Dependencies

You can list your dependencies with:

cargo tree

Our count-lines-mmap project is nice and straightforward:

count-lines-mmap v0.1.0 (C:\Users\Herbert\Documents\Ardan\RustNR-2023-07\code\count-lines-mmap)
└── memmap2 v0.7.1

Our axum_sqlx project has a lot of dependencies:

axum_sqlx v0.1.0 (C:\Users\Herbert\Documents\Ardan\RustNR-2023-07\code\axum_sqlx)
├── axum v0.6.19
│   ├── async-trait v0.1.72 (proc-macro)
│   │   ├── proc-macro2 v1.0.66
│   │   │   └── unicode-ident v1.0.11
│   │   ├── quote v1.0.31
│   │   │   └── proc-macro2 v1.0.66 (*)
│   │   └── syn v2.0.27
│   │       ├── proc-macro2 v1.0.66 (*)
│   │       ├── quote v1.0.31 (*)
│   │       └── unicode-ident v1.0.11
│   ├── axum-core v0.3.4
│   │   ├── async-trait v0.1.72 (proc-macro) (*)
│   │   ├── bytes v1.4.0
│   │   ├── futures-util v0.3.28
│   │   │   ├── futures-core v0.3.28
│   │   │   ├── futures-io v0.3.28
│   │   │   ├── futures-sink v0.3.28
│   │   │   ├── futures-task v0.3.28
│   │   │   ├── memchr v2.5.0
│   │   │   ├── pin-project-lite v0.2.10
│   │   │   ├── pin-utils v0.1.0
│   │   │   └── slab v0.4.8
│   │   │       [build-dependencies]
│   │   │       └── autocfg v1.1.0
│   │   ├── http v0.2.9
│   │   │   ├── bytes v1.4.0
│   │   │   ├── fnv v1.0.7
│   │   │   └── itoa v1.0.9
│   │   ├── http-body v0.4.5
│   │   │   ├── bytes v1.4.0
│   │   │   ├── http v0.2.9 (*)
│   │   │   └── pin-project-lite v0.2.10
│   │   ├── mime v0.3.17
│   │   ├── tower-layer v0.3.2
│   │   └── tower-service v0.3.2
│   │   [build-dependencies]
│   │   └── rustversion v1.0.14 (proc-macro)
│   ├── bitflags v1.3.2
│   ├── bytes v1.4.0
│   ├── futures-util v0.3.28 (*)
│   ├── http v0.2.9 (*)
│   ├── http-body v0.4.5 (*)
│   ├── hyper v0.14.27
│   │   ├── bytes v1.4.0
│   │   ├── futures-channel v0.3.28
│   │   │   ├── futures-core v0.3.28
│   │   │   └── futures-sink v0.3.28
│   │   ├── futures-core v0.3.28
│   │   ├── futures-util v0.3.28 (*)
│   │   ├── http v0.2.9 (*)
│   │   ├── http-body v0.4.5 (*)
│   │   ├── httparse v1.8.0
│   │   ├── httpdate v1.0.2
│   │   ├── itoa v1.0.9
│   │   ├── pin-project-lite v0.2.10
│   │   ├── socket2 v0.4.9
│   │   │   └── winapi v0.3.9
│   │   ├── tokio v1.29.1
│   │   │   ├── bytes v1.4.0
│   │   │   ├── mio v0.8.8
│   │   │   │   └── windows-sys v0.48.0
│   │   │   │       └── windows-targets v0.48.1
│   │   │   │           └── windows_x86_64_msvc v0.48.0
│   │   │   ├── num_cpus v1.16.0
│   │   │   ├── parking_lot v0.12.1
│   │   │   │   ├── lock_api v0.4.10
│   │   │   │   │   └── scopeguard v1.2.0
│   │   │   │   │   [build-dependencies]
│   │   │   │   │   └── autocfg v1.1.0
│   │   │   │   └── parking_lot_core v0.9.8
│   │   │   │       ├── cfg-if v1.0.0
│   │   │   │       ├── smallvec v1.11.0
│   │   │   │       └── windows-targets v0.48.1 (*)
│   │   │   ├── pin-project-lite v0.2.10
│   │   │   ├── socket2 v0.4.9 (*)
│   │   │   ├── tokio-macros v2.1.0 (proc-macro)
│   │   │   │   ├── proc-macro2 v1.0.66 (*)
│   │   │   │   ├── quote v1.0.31 (*)
│   │   │   │   └── syn v2.0.27 (*)
│   │   │   └── windows-sys v0.48.0 (*)
│   │   │   [build-dependencies]
│   │   │   └── autocfg v1.1.0
│   │   ├── tower-service v0.3.2
│   │   ├── tracing v0.1.37
│   │   │   ├── cfg-if v1.0.0
│   │   │   ├── log v0.4.19
│   │   │   ├── pin-project-lite v0.2.10
│   │   │   ├── tracing-attributes v0.1.26 (proc-macro)
│   │   │   │   ├── proc-macro2 v1.0.66 (*)
│   │   │   │   ├── quote v1.0.31 (*)
│   │   │   │   └── syn v2.0.27 (*)
│   │   │   └── tracing-core v0.1.31
│   │   │       └── once_cell v1.18.0
│   │   └── want v0.3.1
│   │       └── try-lock v0.2.4
│   ├── itoa v1.0.9
│   ├── matchit v0.7.0
│   ├── memchr v2.5.0
│   ├── mime v0.3.17
│   ├── percent-encoding v2.3.0
│   ├── pin-project-lite v0.2.10
│   ├── serde v1.0.174
│   │   └── serde_derive v1.0.174 (proc-macro)
│   │       ├── proc-macro2 v1.0.66 (*)
│   │       ├── quote v1.0.31 (*)
│   │       └── syn v2.0.27 (*)
│   ├── serde_json v1.0.103
│   │   ├── itoa v1.0.9
│   │   ├── ryu v1.0.15
│   │   └── serde v1.0.174 (*)
│   ├── serde_path_to_error v0.1.14
│   │   ├── itoa v1.0.9
│   │   └── serde v1.0.174 (*)
│   ├── serde_urlencoded v0.7.1
│   │   ├── form_urlencoded v1.2.0
│   │   │   └── percent-encoding v2.3.0
│   │   ├── itoa v1.0.9
│   │   ├── ryu v1.0.15
│   │   └── serde v1.0.174 (*)
│   ├── sync_wrapper v0.1.2
│   ├── tokio v1.29.1 (*)
│   ├── tower v0.4.13
│   │   ├── futures-core v0.3.28
│   │   ├── futures-util v0.3.28 (*)
│   │   ├── pin-project v1.1.2
│   │   │   └── pin-project-internal v1.1.2 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.66 (*)
│   │   │       ├── quote v1.0.31 (*)
│   │   │       └── syn v2.0.27 (*)
│   │   ├── pin-project-lite v0.2.10
│   │   ├── tokio v1.29.1 (*)
│   │   ├── tower-layer v0.3.2
│   │   ├── tower-service v0.3.2
│   │   └── tracing v0.1.37 (*)
│   ├── tower-layer v0.3.2
│   └── tower-service v0.3.2
│   [build-dependencies]
│   └── rustversion v1.0.14 (proc-macro)
├── serde v1.0.174 (*)
├── sqlx v0.7.1
│   ├── sqlx-core v0.7.1
│   │   ├── ahash v0.8.3
│   │   │   ├── cfg-if v1.0.0
│   │   │   ├── getrandom v0.2.10
│   │   │   │   └── cfg-if v1.0.0
│   │   │   └── once_cell v1.18.0
│   │   │   [build-dependencies]
│   │   │   └── version_check v0.9.4
│   │   ├── atoi v2.0.0
│   │   │   └── num-traits v0.2.16
│   │   │       [build-dependencies]
│   │   │       └── autocfg v1.1.0
│   │   ├── byteorder v1.4.3
│   │   ├── bytes v1.4.0
│   │   ├── crc v3.0.1
│   │   │   └── crc-catalog v2.2.0
│   │   ├── crossbeam-queue v0.3.8
│   │   │   ├── cfg-if v1.0.0
│   │   │   └── crossbeam-utils v0.8.16
│   │   │       └── cfg-if v1.0.0
│   │   ├── dotenvy v0.15.7
│   │   ├── either v1.8.1
│   │   │   └── serde v1.0.174 (*)
│   │   ├── event-listener v2.5.3
│   │   ├── futures-channel v0.3.28 (*)
│   │   ├── futures-core v0.3.28
│   │   ├── futures-intrusive v0.5.0
│   │   │   ├── futures-core v0.3.28
│   │   │   ├── lock_api v0.4.10 (*)
│   │   │   └── parking_lot v0.12.1 (*)
│   │   ├── futures-io v0.3.28
│   │   ├── futures-util v0.3.28 (*)
│   │   ├── hashlink v0.8.3
│   │   │   └── hashbrown v0.14.0
│   │   │       ├── ahash v0.8.3 (*)
│   │   │       └── allocator-api2 v0.2.16
│   │   ├── hex v0.4.3
│   │   ├── indexmap v2.0.0
│   │   │   ├── equivalent v1.0.1
│   │   │   └── hashbrown v0.14.0 (*)
│   │   ├── log v0.4.19
│   │   ├── memchr v2.5.0
│   │   ├── once_cell v1.18.0
│   │   ├── paste v1.0.14 (proc-macro)
│   │   ├── percent-encoding v2.3.0
│   │   ├── serde v1.0.174 (*)
│   │   ├── serde_json v1.0.103 (*)
│   │   ├── sha2 v0.10.7
│   │   │   ├── cfg-if v1.0.0
│   │   │   ├── cpufeatures v0.2.9
│   │   │   └── digest v0.10.7
│   │   │       ├── block-buffer v0.10.4
│   │   │       │   └── generic-array v0.14.7
│   │   │       │       └── typenum v1.16.0
│   │   │       │       [build-dependencies]
│   │   │       │       └── version_check v0.9.4
│   │   │       └── crypto-common v0.1.6
│   │   │           ├── generic-array v0.14.7 (*)
│   │   │           └── typenum v1.16.0
│   │   ├── smallvec v1.11.0
│   │   ├── sqlformat v0.2.1
│   │   │   ├── itertools v0.10.5
│   │   │   │   └── either v1.8.1 (*)
│   │   │   ├── nom v7.1.3
│   │   │   │   ├── memchr v2.5.0
│   │   │   │   └── minimal-lexical v0.2.1
│   │   │   └── unicode_categories v0.1.1
│   │   ├── thiserror v1.0.44
│   │   │   └── thiserror-impl v1.0.44 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.66 (*)
│   │   │       ├── quote v1.0.31 (*)
│   │   │       └── syn v2.0.27 (*)
│   │   ├── tokio v1.29.1 (*)
│   │   ├── tokio-stream v0.1.14
│   │   │   ├── futures-core v0.3.28
│   │   │   ├── pin-project-lite v0.2.10
│   │   │   └── tokio v1.29.1 (*)
│   │   ├── tracing v0.1.37 (*)
│   │   └── url v2.4.0
│   │       ├── form_urlencoded v1.2.0 (*)
│   │       ├── idna v0.4.0
│   │       │   ├── unicode-bidi v0.3.13
│   │       │   └── unicode-normalization v0.1.22
│   │       │       └── tinyvec v1.6.0
│   │       │           └── tinyvec_macros v0.1.1
│   │       └── percent-encoding v2.3.0
│   ├── sqlx-macros v0.7.1 (proc-macro)
│   │   ├── proc-macro2 v1.0.66 (*)
│   │   ├── quote v1.0.31 (*)
│   │   ├── sqlx-core v0.7.1 (*)
│   │   ├── sqlx-macros-core v0.7.1
│   │   │   ├── dotenvy v0.15.7
│   │   │   ├── either v1.8.1 (*)
│   │   │   ├── heck v0.4.1
│   │   │   │   └── unicode-segmentation v1.10.1
│   │   │   ├── hex v0.4.3
│   │   │   ├── once_cell v1.18.0
│   │   │   ├── proc-macro2 v1.0.66 (*)
│   │   │   ├── quote v1.0.31 (*)
│   │   │   ├── serde v1.0.174 (*)
│   │   │   ├── serde_json v1.0.103 (*)
│   │   │   ├── sha2 v0.10.7
│   │   │   │   ├── cfg-if v1.0.0
│   │   │   │   ├── cpufeatures v0.2.9
│   │   │   │   └── digest v0.10.7
│   │   │   │       ├── block-buffer v0.10.4 (*)
│   │   │   │       └── crypto-common v0.1.6
│   │   │   │           ├── generic-array v0.14.7 (*)
│   │   │   │           └── typenum v1.16.0
│   │   │   ├── sqlx-core v0.7.1 (*)
│   │   │   ├── sqlx-sqlite v0.7.1
│   │   │   │   ├── atoi v2.0.0 (*)
│   │   │   │   ├── flume v0.10.14
│   │   │   │   │   ├── futures-core v0.3.28
│   │   │   │   │   ├── futures-sink v0.3.28
│   │   │   │   │   ├── pin-project v1.1.2 (*)
│   │   │   │   │   └── spin v0.9.8
│   │   │   │   │       └── lock_api v0.4.10 (*)
│   │   │   │   ├── futures-channel v0.3.28
│   │   │   │   │   ├── futures-core v0.3.28
│   │   │   │   │   └── futures-sink v0.3.28
│   │   │   │   ├── futures-core v0.3.28
│   │   │   │   ├── futures-executor v0.3.28
│   │   │   │   │   ├── futures-core v0.3.28
│   │   │   │   │   ├── futures-task v0.3.28
│   │   │   │   │   └── futures-util v0.3.28 (*)
│   │   │   │   ├── futures-intrusive v0.5.0 (*)
│   │   │   │   ├── futures-util v0.3.28 (*)
│   │   │   │   ├── libsqlite3-sys v0.26.0
│   │   │   │   │   [build-dependencies]
│   │   │   │   │   ├── cc v1.0.79
│   │   │   │   │   ├── pkg-config v0.3.27
│   │   │   │   │   └── vcpkg v0.2.15
│   │   │   │   ├── log v0.4.19
│   │   │   │   ├── percent-encoding v2.3.0
│   │   │   │   ├── serde v1.0.174 (*)
│   │   │   │   ├── sqlx-core v0.7.1 (*)
│   │   │   │   ├── tracing v0.1.37 (*)
│   │   │   │   └── url v2.4.0 (*)
│   │   │   ├── syn v1.0.109
│   │   │   │   ├── proc-macro2 v1.0.66 (*)
│   │   │   │   ├── quote v1.0.31 (*)
│   │   │   │   └── unicode-ident v1.0.11
│   │   │   ├── tempfile v3.7.0
│   │   │   │   ├── cfg-if v1.0.0
│   │   │   │   ├── fastrand v2.0.0
│   │   │   │   └── windows-sys v0.48.0
│   │   │   │       └── windows-targets v0.48.1 (*)
│   │   │   ├── tokio v1.29.1
│   │   │   │   ├── bytes v1.4.0
│   │   │   │   ├── mio v0.8.8 (*)
│   │   │   │   ├── pin-project-lite v0.2.10
│   │   │   │   ├── socket2 v0.4.9 (*)
│   │   │   │   └── windows-sys v0.48.0 (*)
│   │   │   │   [build-dependencies]
│   │   │   │   └── autocfg v1.1.0
│   │   │   └── url v2.4.0 (*)
│   │   └── syn v1.0.109 (*)
│   └── sqlx-sqlite v0.7.1
│       ├── atoi v2.0.0 (*)
│       ├── flume v0.10.14 (*)
│       ├── futures-channel v0.3.28 (*)
│       ├── futures-core v0.3.28
│       ├── futures-executor v0.3.28 (*)
│       ├── futures-intrusive v0.5.0 (*)
│       ├── futures-util v0.3.28 (*)
│       ├── libsqlite3-sys v0.26.0 (*)
│       ├── log v0.4.19
│       ├── percent-encoding v2.3.0
│       ├── serde v1.0.174 (*)
│       ├── sqlx-core v0.7.1 (*)
│       ├── tracing v0.1.37 (*)
│       └── url v2.4.0 (*)
└── tokio v1.29.1 (*)

In most cases, using full feature flags adds the kitchen sink to your project. If you're size conscious, or worried about dependencies, try to trim your feature flag usage.

In a highly secure environment, it's pretty unlikely that you can audit all of those. Ultimately, it's your decision as to how much you want to trust dependencies, versus bringing code "in house".
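
When you do decide to trim, the usual approach is to turn off a crate's default features and opt in to only what you need. A hedged sketch of what that looks like in Cargo.toml; the feature names below are illustrative, so check each crate's documentation for its real flags:

[dependencies]
# Opt out of the defaults and list just the features you actually use.
tokio = { version = "1", default-features = false, features = ["rt-multi-thread", "macros", "net"] }
axum = { version = "0.6.19", default-features = false, features = ["json"] }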

Finding The Bloat

The cargo bloat command (installed with cargo install cargo-bloat) can "weigh" each of your dependencies and show how much space each one adds to your binary. For example, for the axum-sqlx project:

    Analyzing target\debug\axum_sqlx.exe

 File  .text    Size        Crate Name
 0.7%   0.9% 45.9KiB              sqlite3VdbeExec
 0.5%   0.7% 34.1KiB    sqlformat sqlformat::tokenizer::get_plain_reserved_token
 0.5%   0.6% 31.9KiB  sqlx_sqlite sqlx_sqlite::connection::explain::explain
 0.4%   0.5% 28.6KiB              yy_reduce
 0.3%   0.4% 22.3KiB    sqlx_core sqlx_core::logger::QueryLogger::finish
 0.3%   0.4% 22.1KiB              sqlite3Pragma
 0.3%   0.4% 20.0KiB enum2$<hyper enum2$<hyper::proto::h1::role::Server>::encode_headers<hyper::proto::h1::role::impl$1::encode_headers_with_orig...
 0.3%   0.4% 20.0KiB enum2$<hyper enum2$<hyper::proto::h1::role::Server>::encode_headers<hyper::proto::h1::role::impl$1::encode_headers_with_lowe...
 0.3%   0.3% 18.1KiB         http http::header::name::StandardHeader::from_bytes
 0.3%   0.3% 16.9KiB        hyper <hyper::proto::h1::role::Server as hyper::proto::h1::Http1Transaction>::parse
 0.2%   0.3% 15.7KiB  sqlx_sqlite sqlx_sqlite::logger::QueryPlanLogger<O,R,P>::finish
 0.2%   0.3% 14.0KiB              sqlite3WhereCodeOneLoopStart
 0.2%   0.2% 11.6KiB        hyper hyper::server::tcp::AddrIncoming::poll_next_
 0.2%   0.2% 11.6KiB              sqlite3Select
 0.2%   0.2% 11.0KiB        hyper hyper::proto::h1::dispatch::Dispatcher<hyper::proto::h1::dispatch::Server<axum::routing::Router<tuple$<>,hyper:...
 0.2%   0.2% 10.6KiB        hyper hyper::proto::h1::conn::Conn<I,B,T>::poll_read_body
 0.2%   0.2% 10.2KiB    sqlformat <(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U) as nom::branch::Alt<Input,Output,Error>>::choice
 0.2%   0.2% 10.2KiB    sqlformat <(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U) as nom::branch::Alt<Input,Output,Error>>::choice
 0.2%   0.2% 10.2KiB    sqlformat <(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U) as nom::branch::Alt<Input,Output,Error>>::choice
 0.2%   0.2% 10.2KiB    sqlformat <(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U) as nom::branch::Alt<Input,Output,Error>>::choice
69.6%  88.6%  4.5MiB              And 21717 smaller methods. Use -n N to show more.
78.6% 100.0%  5.1MiB              .text section size, the file size is 6.5MiB

If you're size conscious, you can use this to find the biggest offenders and either look for an alternative, or see which feature flag you can omit.

Vendoring Dependencies

A consistent build is very important to shipping applications. You can run cargo vendor to download the current versions of all of your dependencies to a vendor/ directory and build with those. This can take a while to run and use a lot of disk space.
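
When it finishes, cargo vendor prints the configuration it wants you to add (usually to .cargo/config.toml). It typically looks something like the snippet below, but copy the exact output it gives you:

[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"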

As you approach release, you should pin your dependencies. So instead of:

[dependencies]
axum = "0.6.19"

Specify the EXACT version:

axum = "=0.6.19"

Managing Your Own Dependencies

Cargo is pretty smart, and gives you some options for how you want dependencies to work within your own repo(s).

Option 1: Publish Everything!

If you really want to, you can use cargo publish (after filling out all of the crate metadata) to make your crates public. You should only do this if:

  • You actually want your crate to be public. The source code will be published, too.
  • You have a license policy ready to go.
  • Your crate offers something useful to the community. Let's not get into a left-pad situation.

For your company's pride and joy---a product you make money from---this is often not the best choice. In fact, your manager may be quite upset if you accidentally MIT license your core product.

Option 2: Separate Git Repos Per Project

Cargo can handle dependencies directly from git. For example, you can use bracket-lib (a library I created for Hands-on Rust) from GitHub directly:

bracket_lib = { git = "https://github.com/amethyst/bracket-lib.git" }

You can tell it to use a specific branch:

bracket_lib = { git = "https://github.com/amethyst/bracket-lib.git", branch = "bevy" }

Now, when you run cargo update---Cargo pulls the latest version from the repo and uses it.

Paths within Git Repos

Cargo supports committing a workspace or a series of projects to the same repo. Projects inside the repo can refer to one another with relative paths:

bracket_lib = { path = "../bracket_lib" }

If you then link to that git repo, dependencies within the repo are pulled correctly.

Option 3: Mono-Repos

Using path = dependencies within the mono-repo works great. You can mix and match a bit, having some git dependencies, some crates.io dependencies, and some path dependencies.

Option 4: File Server

If everyone mounts a shared directory, you can share code that way. The downside is that git ownership becomes quite confusing.

Overriding Dependencies

Let's say that Team B has created a crate, and you are happily depending upon it. Down the line, Team A wants to test Team B's crate, but with one of its dependencies replaced throughout the project by a new version (maybe to see if a patch works).

You can do this with cargo patches. For example, if you decide to use your own uuid crate---you can add this to Cargo.toml:

[patch.crates-io]
uuid = { path = "../path/to/uuid" }

Be warned that you can't publish (to crates.io) any crates that specify a patch.

Checking for Vulnerabilities

Install the cargo audit tool with cargo install cargo-audit.

Now, at any time you can run cargo audit to check for vulnerabilities in your dependencies. This is a good thing to do periodically, and before you publish a crate. GitHub includes tooling for building this into your CI pipeline. Run it at the top-level of your workspace---it works by reading Cargo.lock.

For example, when I wrote this it warned me that the memmap crate we used is currently without a maintainer:

    Fetching advisory database from `https://github.com/RustSec/advisory-db.git`
      Loaded 554 security advisories (from C:\Users\Herbert\.cargo\advisory-db)
    Updating crates.io index
    Scanning Cargo.lock for vulnerabilities (284 crate dependencies)
Crate:     memmap
Version:   0.7.0
Warning:   unmaintained
Title:     memmap is unmaintained
Date:      2020-12-02
ID:        RUSTSEC-2020-0077
URL:       https://rustsec.org/advisories/RUSTSEC-2020-0077
Dependency tree:
memmap 0.7.0
└── count-lines-mmap 0.1.0

warning: 1 allowed warning found

This is an easy fix (memmap was replaced by memmap2 which is almost identical).

This is a good tool to include in your CI pipeline. You may find that it's irritating---sometimes vulnerabilities don't affect what you're doing, sometimes it takes a little while for a fix to become available. This way, at least you know that there's action required!

Checking for Outdated Dependencies

You can install the cargo-outdated tool with:

cargo install cargo-outdated

Then you can run it with:

cargo outdated

Add the -w flag to check the whole workspace.

Hopefully, you'll see All dependencies are up to date, yay!

Sometimes, you'll see a list of dependencies that are out of date. You can usually update them with:

cargo update

When you're depending on a crate that in turn depends upon other crates, it's quite possible that the version number has been pinned somewhere down the chain (or you pinned it yourself).
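
If only a single crate is being held back, you can ask Cargo to update just that package (and, optionally, request a specific version with --precise):

cargo update -p <crate-name>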

I do not recommend putting this into your CI pipeline. Rust crates update a LOT.

Denying Dependencies

Sometimes, you really don't want to accidentally include some GPL code (and suffer the viral consequences of the license). Or you may want to apply other licensing restrictions---depending upon your company's legal department.

The cargo deny tool can help with this. You can install it with cargo install cargo-deny.

Initial Setup

Setup with cargo deny init. This will make a (huge) deny.toml file that you can use to set your policies. Let's allow crates that didn't remember to specify a license (not a great idea), and specifically deny GPL:

[licenses]
# The lint level for crates which do not have a detectable license
unlicensed = "allow"
# List of explicitly allowed licenses
# See https://spdx.org/licenses/ for list of possible licenses
# [possible values: any SPDX 3.11 short identifier (+ optional exception)].
allow = [
    #"MIT",
    #"Apache-2.0",
    #"Apache-2.0 WITH LLVM-exception",
]
# List of explicitly disallowed licenses
# See https://spdx.org/licenses/ for list of possible licenses
# [possible values: any SPDX 3.11 short identifier (+ optional exception)].
deny = [
    #"Nokia",
    "GPL-1.0",
    "GPL-2.0",
]
# Lint level for licenses considered copyleft
copyleft = "warn"

Checking Licenses

cargo deny check licenses will scan your entire workspace for licenses. With any luck, you'll see licenses ok. Change unlicensed to deny, and you'll discover that rustnr forgot to specify a license. Wait! That's me! I didn't specify a license at the top level of this project. Generally, you want to include a license = clause in each crate, unless you deliberately leave it unlicensed.

Other Checks

Cargo Deny can also check:

  • cargo deny check bans checks for features or dependencies you've decided to ban.
  • cargo deny check advisories duplicates the functionality of cargo audit and checks for CVEs. I still recommend cargo audit for CI use; it's a lot slimmer.
  • cargo deny check sources lets you ban importing code from specific sources.

Build Profiles

You should already know that you can run your program in debug mode with cargo run, and release mode with cargo run --release. The latter builds at optimization level 3 (roughly equivalent to -O3) and does a pretty decent job. You can customize the build optimization process.

Faster Debugging

By default, the debug profile bounds checks everything and produces completely unoptimized code. That's generally a good thing---but in some cases it can make the debug profile too slow to run. You can customize the debug profile to be faster, at the expense of debuggers occasionally "skipping" optimized code. In your Cargo.toml file (the parent, for a workspace---all workspace elements share the same profile), you can add:

[profile.dev]
opt-level = 1

You can customize this further. If integer overflow checking is causing you problems, you can disable it:

[profile.dev]
overflow-checks = false

I don't recommend this, most of the time!

If you don't mind slower compiles for release mode, you can enable LTO. LTO extends "inlining" checks across crate boundaries, and can often produce smaller and faster code. You can enable it with:

[profile.release]
lto = true

The downside is that LTO can be quite slow. If you're doing a lot of development, you may want to disable it.

Making Your Binaries Smaller

Sometimes, you need a smaller binary. You might be deploying to an embedded target and not have much space. You might be shipping a binary to customers and want to minimize download size (and the footprint on their systems). You might just be picky about saving space (I am!). Rust has a lot of options for this.

Let's establish a baseline by compiling the axum_sqlx binary in release mode with cargo build --release from the appropriate directory.

The optimized binary is 4,080,128 bytes. It's large because Rust has statically linked every dependency. This is a good thing, because it means that you can ship a single binary to your customers. But it's also a bad thing, because it means that you're shipping a large binary to your customers.

LTO

Let's start by enabling LTO by adding this to Cargo.toml:

[profile.release]
lto = true

Now we rebuild with cargo build --release. It takes quite a bit longer, especially the link step. The binary has shrunk to 3,464,704 bytes. That's a decent improvement---but we can do better!

Optimize for Size

We'll add another line to Cargo.toml:

[profile.release]
lto = true
opt-level = "s"

This tells the underlying compiler to optimize for a small binary---but not to great extremes. Now we rebuild it again. It actually compiles slightly faster, because it's skipping some of the more expensive optimizations. We're down to 2,277,888 bytes!

We can go a step further and replace the opt-level with z:

[profile.release]
lto = true
opt-level = "z"

"z" tells the compiler to optimize for size, still applying performance optimizations but always preferring the smaller result. Compile again, and we're now at 2,227,712 bytes. A very small improvement over "s", but a lot smaller than our original 4 MB.

Now we're going to "strip" the binary. Remove all debug symbols, nice naming hints and similar. This also anonymizes the binary a bit if you like to hide your secret sauce. Add this to Cargo.toml:

[profile.release]
lto = true
opt-level = "z"
strip = "debuginfo"

Rebuilding once again, we're at 2,226,176 bytes. A truly tiny improvement, because release builds already remove a lot of information. You've also just lost the ability to trace crashes to a line number. If you want to go further, strip = "symbols" removes the symbol table as well as the debug information.

Since we've killed debug information, we don't really need such nice handling of panics. Displaying nice panic handler messages is surprisingly expensive! You can turn it off as follows:

[profile.release]
lto = true
opt-level = "z"
strip = "debuginfo"
panic = "abort"

Rebuilding this way, we're down to 1,674,752 bytes. If the program crashes, the error message won't help you find the issue---you're relying on having properly handled errors and tracing. You should be doing that anyway!

When Rust compiles, it uses as many CPU cores as possible. This actually removes some optimization opportunities! If you don't mind a really slow compile time, you can disable this:

[profile.release]
lto = true
opt-level = "z"
strip = "debuginfo"
panic = "abort"
codegen-units = 1

Code generation now happens in a single unit, so optimizations aren't skipped because the build was divided between cores. This results in a relatively tiny improvement: our binary is now 1,614,336 bytes (about 1.5 MB).

That's not bad for an application with full database support, migrations and a web server!

More extreme optimizations are possible if you use no_std mode---but then you're writing without the standard library, and will be doing a lot of things from scratch. That's a topic for another day!

Code Best Practices

Favor Iterators

You often find yourself using for loops to iterate over collections. Rust's iterator system can transform many of these operations into a functional style. The functional style is often more concise, and can result in faster code---the optimizer handles iterators really well. It also opens up some possibilities for parallelization.

Here's a type and function to generate some test data:

#![allow(unused)]
fn main() {
struct Row {
    language: String,
    message: String,
}

fn get_rows() -> Vec<Row> {
    vec![
        Row { language : "English".to_string(), message : "Hello".to_string() },
        Row { language : "French".to_string(), message : "Bonjour".to_string() },
        Row { language : "Spanish".to_string(), message : "Hola".to_string() },
        Row { language : "Russian".to_string(), message : "Zdravstvuyte".to_string() },
        Row { language : "Chinese".to_string(), message : "Nǐn hǎo".to_string() },
        Row { language : "Italian".to_string(), message : "Salve".to_string() },
        Row { language : "Japanese".to_string(), message : "Konnichiwa".to_string() },
        Row { language : "German".to_string(), message : "Guten Tag".to_string() },
        Row { language : "Portuguese".to_string(), message : "Olá".to_string() },
        Row { language : "Korean".to_string(), message : "Anyoung haseyo".to_string() },
        Row { language : "Arabic".to_string(), message : "Asalaam alaikum".to_string() },
        Row { language : "Danish".to_string(), message : "Goddag".to_string() },
        Row { language : "Swahili".to_string(), message : "Shikamoo".to_string() },
        Row { language : "Dutch".to_string(), message : "Goedendag".to_string() },
        Row { language : "Greek".to_string(), message : "Yassas".to_string() },
        Row { language : "Polish".to_string(), message : "Dzień dobry".to_string() },
        Row { language : "Indonesian".to_string(), message : "Selamat siang".to_string() },
        Row { language : "Hindi".to_string(), message : "Namaste, Namaskar".to_string() },
        Row { language : "Norwegian".to_string(), message : "God dag".to_string() },
        Row { language : "Turkish".to_string(), message : "Merhaba".to_string() },
        Row { language : "Hebrew".to_string(), message : "Shalom".to_string() },
        Row { language : "Swedish".to_string(), message : "God dag".to_string() },
                
    ]
}
}

A naive for loop to find a language looks like this:

fn main() {
    let rows = get_rows();
    for row in rows.iter() {
        if row.language == "French" {
            println!("{}", row.message);
            break; // Stop looping
        }
    }
}

That works just fine, and isn't too bad. .iter() transforms the rows into an iterator (receiving references to each entry, no copying). You can do the same thing with the following:

#![allow(unused)]
fn main() {
rows.iter()
    .filter(|r| r.language == "French")
    .for_each(|r| println!("{}", r.message));
}

We can add some timing to the code:

fn main() {
    let now = std::time::Instant::now();
    let rows = get_rows();
    for row in rows.iter() {
        if row.language == "French" {
            println!("{}", row.message);
            break;
        }
    }
    println!("Elapsed: {} nanos", now.elapsed().as_nanos());

    let now = std::time::Instant::now();
    rows.iter()
        .filter(|r| r.language == "French")
        .for_each(|r| println!("{}", r.message));
    println!("Elapsed: {} nanos", now.elapsed().as_nanos());
}

In debug mode, I get:

Bonjour
Elapsed: 187500 nanos
Bonjour
Elapsed: 62200 nanos

In release mode, I get:

Bonjour
Elapsed: 132200 nanos
Bonjour
Elapsed: 57900 nanos

This isn't a great benchmark, but the iterator version came out faster here. Iterators let the compiler elide bounds checks, because the iterator tracks its own position and never indexes out of range.
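
As an aside: the filter/for_each version walks every row, while the loop stopped at the first match thanks to break. If you only want the first match, find gives you the same early exit in iterator form (this sketch reuses the Row type and get_rows from above):

fn main() {
    let rows = get_rows();
    // find() stops at the first element for which the predicate is true,
    // mirroring the `break` in the loop version.
    if let Some(row) = rows.iter().find(|r| r.language == "French") {
        println!("{}", row.message);
    }
}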

Working with Data

Iterators could be a class unto themselves. It's always worth looking at the operations offered by iterators. map can be used to transform data on its way through the pipeline. filter_map can combine filtering and mapping into a single operation. all, any can be used to see if a predicate matches all or any element. skip and nth let you navigate within the iterator. fold can apply an accumulator, reduce can shrink your data. With chain and zip you can combine iterators.
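
Here's a quick sketch of a few of those combinators chained together; the numbers are arbitrary:

fn main() {
    let values = vec![1, 2, 3, 4, 5, 6];

    // map transforms each element on its way through the pipeline.
    let doubled: Vec<i32> = values.iter().map(|n| n * 2).collect();

    // filter_map filters and transforms in a single pass.
    let even_squares: Vec<i32> = values
        .iter()
        .filter_map(|n| if n % 2 == 0 { Some(n * n) } else { None })
        .collect();

    // any/all test predicates; fold applies an accumulator.
    let any_big = values.iter().any(|n| *n > 5);
    let sum = values.iter().fold(0, |acc, n| acc + n);

    // zip pairs two iterators together, element by element.
    let pairs: Vec<(i32, i32)> = values.iter().copied().zip(doubled.iter().copied()).collect();

    println!("{doubled:?} {even_squares:?} {any_big} {sum} {pairs:?}");
}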

In some cases, it's worth learning to make your own iterators. It's relatively simple (very similar to the stream we made).
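
As a minimal sketch (not from the course code), implementing Iterator only requires a next method; once it's in place, every combinator above works on your type for free:

struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // Return Some(value) until we run out, then None to end iteration.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    // Yields 5, 4, 3, 2, 1; sum() works because Iterator is implemented.
    let total: u32 = Countdown { remaining: 5 }.sum();
    println!("{total}"); // 15
}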

Remember, iterators don't yield to the async runtime. If you do need to yield at each step in an async program, you can turn an iterator into a stream with a helper function from tokio-stream (futures offers one too).
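
For example, with the tokio-stream crate (cargo add tokio-stream) you can wrap an ordinary iterator in a stream. A rough sketch, assuming tokio is present with its macros and rt-multi-thread features:

use tokio_stream::StreamExt; // provides .next() on streams

#[tokio::main]
async fn main() {
    // Each .await gives the scheduler a chance to run other tasks.
    let mut stream = tokio_stream::iter(1..=3);
    while let Some(n) = stream.next().await {
        println!("{n}");
    }
}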

Let's transform a program into iterators.

We'll use a really inefficient primality test:

#![allow(unused)]
fn main() {
fn is_prime(n: u32) -> bool {
    (2 ..= n/2).all(|i| n % i != 0 )
}
}

And some code to iterate through a range of numbers and count the primes we find:

#![allow(unused)]
fn main() {
let now = std::time::Instant::now();
const MAX:u32 = 200000;
let mut count = 0;
for n in 2 .. MAX {
    if is_prime(n) {
        count+=1;
    }
}
println!("Found {count} primes in {:.2} seconds", now.elapsed().as_secs_f32());
}

On my development workstation, I found 17,984 primes in 1.09 seconds.

Let's write the same code, as an iterator:

#![allow(unused)]
fn main() {
let now = std::time::Instant::now();
let count = (2..MAX)
    .filter(|n| is_prime(*n))
    .count();
println!("Found {count} primes in {:.2} seconds", now.elapsed().as_secs_f32());
}

There's no speedup, but we have less code --- making it easier to read. We've also opened ourselves up for a really easy parallelization. Add rayon to the crate (cargo add rayon) and we can use all of our CPU cores with just two lines of code:

#![allow(unused)]
fn main() {
use rayon::prelude::{IntoParallelIterator, ParallelIterator};
let now = std::time::Instant::now();
let count = (2..MAX)
    .into_par_iter()
    .filter(|n| is_prime(*n))
    .count();
println!("Found {count} primes in {:.2} seconds", now.elapsed().as_secs_f32());
}

The result I get shows that we found the same number of primes in 0.10 seconds.

So not only are iterators more idiomatic, they open up a world of possibilities.

Understanding .iter() vs .into_iter()

Confusing these two is a common mistake when you're getting started, and understanding the distinction can sometimes have a real performance impact.

.iter() returns an iterator that yields references to the data. .into_iter() returns an iterator that takes ownership and yields the data itself. It's a subtle distinction, but it matters.

Take the following code:

#![allow(unused)]
fn main() {
let mut v = vec!["one".to_string(), "two".to_string()];
v.iter().for_each(|v| do_something(v));
println!("{v:?}");
}

v---your vector---is still valid after the iter() call. Each iteration receives a reference to the original data. If you collect it into another vector, you get a vector of &String types.

However:

#![allow(unused)]
fn main() {
let mut v = vec!["one".to_string(), "two".to_string()];
v.into_iter().for_each(|v| do_something(v));
println!("{v:?}");
}

Won't compile! v is destroyed by the conversion into an iterator---and each pass is receiving the actual String, not a reference. If you collect it into another vector, you get a vector of String types.

  • Use iter() when you are just referencing the data, and want to retain ownership of it.
  • Use into_iter() when you will never use the data again. You move the data out of the vector, and send it to its new owner.
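
Collecting makes the difference visible. A small sketch of the two behaviours described above:

fn main() {
    let v = vec!["one".to_string(), "two".to_string()];

    // iter() borrows: we collect references, and v is still usable afterwards.
    let borrowed: Vec<&String> = v.iter().collect();
    println!("{borrowed:?} / original still available: {v:?}");

    // into_iter() moves: we collect owned Strings, and v is gone afterwards.
    let owned: Vec<String> = v.into_iter().collect();
    println!("{owned:?}");
}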

Minimize Cloning

When you're first getting into Rust, it's really easy to abuse clone(). It's pretty fast (though it slows down as your structures get more complex). With move semantics and the borrow checker, it's very tempting to clone a LOT. The optimizer will minimize the overhead quite a bit, but when you can avoid cloning, it's worth it.

The exception is types that are designed to be cloned, such as Rc or connection pools!

If you find yourself cloning things a lot so you can fan data out in lots of directions, it's usually a sign that your design needs some work. Should you be destructuring and moving the relevant data? Should you be using a shared type (like an Rc/Arc) and sharing the data? Maybe you should reduce the number of &mut borrows and rely on shared references?
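
Before reaching for clone, check whether a borrow will do. A tiny illustration (the function name is made up); the first call clones an entire String just to read it:

fn shout(message: &str) -> String {
    message.to_uppercase()
}

fn main() {
    let greeting = String::from("hello");

    // Works, but allocates a copy of the String that is thrown away immediately.
    let a = shout(&greeting.clone());

    // Borrowing does the same job with no extra allocation.
    let b = shout(&greeting);

    println!("{a} {b} {greeting}");
}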

Don't Emulate Object Oriented Programming

With traits, it's easy to think of Rust as an object-oriented programming language. Traits share some characteristics - they are basically interfaces. But there's no inheritance. At all.

If you're used to some OOP-style systems, it might be tempting to do something like this:

#![allow(unused)]
fn main() {
struct Employee;

impl Name for Employee { .. }
impl Address for Employee { .. }
impl Salary for Employee { .. }
}

You wind up with an employee object, and with the Any system you can cast it into the type you need. Doing so will make your life miserable.

You'll find yourself with an abundance of "if person has a Name trait", "if person has an address trait" - and then if you need to alter BOTH, the borrow checker makes your life painful. You can't mutably borrow the same base object for each trait, so you wind up writing a cycle of "look up data, note the new data, apply each in turn". That works, but it's big and messy.

Instead, favor composition:

#![allow(unused)]
fn main() {
struct Name;
struct Address;
struct Salary;

struct Employee {
    name: Option<Name>,
    address: Option<Address>,
    salary: Option<Salary>,
}
}

Now a function at Employee level can gain mutable access to each of the properties.
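
For example (a sketch with simplified field types), one &mut self borrow is enough to update several of the composed parts at once, which is exactly what the trait-casting approach makes painful:

struct Name(String);
struct Address(String);

struct Employee {
    name: Option<Name>,
    address: Option<Address>,
}

impl Employee {
    // A single mutable borrow of the Employee covers every field.
    fn relocate_and_rename(&mut self, name: &str, address: &str) {
        self.name = Some(Name(name.to_string()));
        self.address = Some(Address(address.to_string()));
    }
}

fn main() {
    let mut employee = Employee { name: None, address: None };
    employee.relocate_and_rename("Alice", "123 Main St");
    if let (Some(name), Some(address)) = (&employee.name, &employee.address) {
        println!("{} lives at {}", name.0, address.0);
    }
}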

Think in terms of Ownership

On top of that, it's beneficial to think in terms of ownership from Rust's perspective. If you have a big list of employees, and want to transfer a red swingline stapler from one person to another - you need to find both people, find out if they have the stapler, and then move it. In a textbook OOP setup, you'd have methods at each level - and probably pass one person to another for the transfer. Rust will be much happier if you implement the operation at the top level, take the time to check each precondition (and return an appropriate error).

Your code will be easier to test, too.
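
A sketch of that stapler transfer with the operation living at the collection level (all names here are illustrative, not from the course code):

#[derive(Debug)]
enum TransferError {
    NoSuchEmployee,
    NoStapler,
}

struct Employee {
    name: String,
    has_stapler: bool,
}

// The whole list is owned here, so one function can check every
// precondition and then perform the move in one place.
fn transfer_stapler(staff: &mut [Employee], from: &str, to: &str) -> Result<(), TransferError> {
    let from_idx = staff.iter().position(|e| e.name == from).ok_or(TransferError::NoSuchEmployee)?;
    let to_idx = staff.iter().position(|e| e.name == to).ok_or(TransferError::NoSuchEmployee)?;
    if !staff[from_idx].has_stapler {
        return Err(TransferError::NoStapler);
    }
    staff[from_idx].has_stapler = false;
    staff[to_idx].has_stapler = true;
    Ok(())
}

fn main() {
    let mut staff = vec![
        Employee { name: "Milton".to_string(), has_stapler: true },
        Employee { name: "Peter".to_string(), has_stapler: false },
    ];
    transfer_stapler(&mut staff, "Milton", "Peter").unwrap();
    println!("Peter has the stapler: {}", staff[1].has_stapler);
}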

Don't Reference Count Everything

When you discover that Rc and Arc can give you a simplified form of garbage collection at very little cost, it's really tempting to make everything reference-counted and let Rust sort it out. It'll probably even work.

BUT - you're throwing away some potential performance, and complicating things.

Instead:

  • Think about ownership.
    • If data is taking a clear path, move it along each element of the path.
    • If data is predominantly owned inside a function and "fans out" to functions that operate on it before coming back, then references make sense.
    • If data is genuinely shared between entities, then reference counting makes good sense.

Favor Small Functions

Rust is really good at inlining---eliminating the cost of calling a function by embedding it in the caller. This gives you no real downside to writing small functions.

Small functions:

  • Are easier to read---if you can see the whole function at once, it's easier to understand.
  • Are easier to test---you can test each function in isolation.
  • Are easier to compose in functional/iterator pipelines.
  • Are easier to optimize---the compiler can inline them, and optimize them individually.

Along the same lines, "pure" functions have a performance and clarity advantage. A "pure" function doesn't mutate any external state, but operates entirely on its parameters. This makes it easier to reason about the function, and easier to optimize.

Name your functions well, and you'll have a program that's easy to understand.

Sometimes your function is long because you're doing something complicated. That's fine! But if you can break it down into smaller functions, you should.

Clever Code

The Go language likes to say "Don't be Clever". It's a good rule of thumb, but can be better expressed as "Write Code you can Explain". If you can't explain it, you probably shouldn't be writing it.

Rust can be very clever, very expressive, and very terse. It can also be very confusing. Since Rust gives you the power to be really confusing, it's up to you to tame the complexity.

In some cases, you need clever code---you want to take full advantage of Rust's performance capabilities! I recommend taking the same approach as the Rust standard library: have some "internal" types that contain the cleverness, and wrap them in a user-friendly facade that is easy to understand.

How This Works in Practice

Let's say that you are creating a system that memory maps files as needed, based on the requested data. Multiple requests may be coming in from other systems, and you want to make sure that you don't map the same file twice. You also want to make sure that you don't map too many files at once, and run out of paged memory. The system has a relatively simple interface to the outside world: given a set of coordinates, the system transforms the coordinates into the associated filename, accesses it and returns data about those coordinates.

Start by creating a directory module (a directory containing a mod.rs file). In that module, create some stub functions (or types) describing the interface you wish to offer. This might be as simple as:

#![allow(unused)]
fn main() {
fn get_coordinate_data(position: LatLon) -> Option<MyLidarData> {
    todo!("Implement this");
}
}

Then, for each part of the complicated system, make another file in the directory (I prefer one per type) and link it into the module with mod my_module;. No pub mod: you're hiding the internals.

So for this example, you might have:

  • A type representing each file (containing a memory map structure, and access logic)
  • A type that converts LatLon into the required filename
  • A cache type that keeps a least-recently-used cache (with a maximum size) of the file types.
  • Lots of tests!

You can then link those to the facade functions. From the outside, you're offering just the one function. Internally, you're doing a lot of clever things. Other teams don't need to see the cleverness, they just need to see the interface.

When to be Clever

Knuth famously said that premature optimization is the root of all evil. That doesn't mean you should write really terrible code; if there's an obvious performance "win", take it---but not at the expense of readability. Then, profile or trace your code with real data and see where the bottlenecks are. The bottlenecks are where it's worth being clever.

Let the Type System Help You

Rust's type system is very powerful, and can help you write better code.

Avoid Ambiguity with New Types

Don't go too crazy with this. If it's obvious that number_of_threads is a usize, and what the parameter does, it doesn't need its own type!

Ambiguous Units

We talked about generic conversion between units in the Traits section. This is one of the easiest ways to avoid introducing bugs into your system. In that example, we created Radians and Degrees types and set up Into for converting between them. Now the user has to specify the units, and automatic conversion means that passing degrees into a Radians-based function won't cause a bug.

This applies to almost any unit of measure, and is a great place for "new types"---a type that wraps a value, specifying the type and optionally provides unit conversions.

For example, it's pretty common to count bytes. A Bytes type makes it obvious that you aren't actually expecting kilobytes, megabytes, etc.---but for output, you probably want those types, too. You can create a Bytes type with From conversions for Kilobytes, Megabytes, and so on, and then use Bytes internally. You could even provide output/formatting options that check the size of the contained value and return an appropriately scaled result.

For example:

struct Bytes(usize);
struct Kilobytes(usize);
struct MegaBytes(usize);

impl From<Kilobytes> for Bytes {
    fn from(kb: Kilobytes) -> Self {
        Self(kb.0 * 1024)
    }
}

impl From<MegaBytes> for Bytes {
    fn from(mb: MegaBytes) -> Self {
        Self(mb.0 * 1024 * 1024)
    }
}

impl std::fmt::Display for Bytes {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let bytes = self.0;
        let kb = bytes / 1024;
        let mb = kb / 1024;
        if mb > 0 {
            write!(f, "{} MB", mb)
        } else if kb > 0 {
            write!(f, "{} KB", kb)
        } else {
            write!(f, "{} B", bytes)
        }
    }
}

fn main() {
    let bytes: Bytes = MegaBytes(8).into();
    println!("{bytes}");
}

Long Parameter Lists

It's a shame that Rust doesn't have a named parameter system. When you have a large number of parameters, it becomes very easy to get them in the wrong order. This is especially true if you have a lot of parameters of the same type.

For example, let's say that you have a function that takes a lot of parameters:

#![allow(unused)]
fn main() {
fn do_something(
    a: usize,
    b: usize,
    c: usize,
    d: usize,
    e: usize,
    // etc
) {
    todo!("Implement this");
}
}

Obviously, you can help the situation by not naming them with one-letter names! Your IDE will show you the parameter list, making it easier. But what if you change a parameter? You have to change it everywhere, and if you happen to still have the same number of parameters of a similar type---you might not notice the error.

There are a couple of solutions here:

  • Use a simple new type for some parameters. If a is actually always a count of rows you could create a RowCount(pub usize) type to make it obvious. That way, even though you are passing usize, you have to specify your intent. You've almost got named parameters that way!
  • Create a structure containing the parameters and pass that. Now you have to name your parameters, and it's much harder to get it wrong.
  • And if your structure is large, use a builder pattern.

Builder Pattern

You've used the builder pattern---it's very common in Rust. It provides a great way to set defaults at the beginning, specify only the parameters you want to change, and then build the final structure.

For example:

struct ThingConfig {
    do_a: bool,
    do_b: bool,
    setting: usize,
    another_setting: usize,
}

impl ThingConfig {
    fn new() -> Self {
        ThingConfig {
            do_a: false,
            do_b: false,
            setting: 0,
            another_setting: 0,
        }
    }

    fn do_a(mut self) -> Self {
        self.do_a = true;
        self
    }

    fn do_b(mut self) -> Self {
        self.do_b = true;
        self
    }

    fn with_setting(mut self, setting: usize) -> Self {
        self.setting = setting;
        self
    }

    fn with_another_setting(mut self, setting: usize) -> Self {
        self.another_setting = setting;
        self
    }

    fn execute(&self) {
        if self.do_a {
            println!("Doing A");
        }
        if self.do_b {
            println!("Doing B");
        }
        println!("Setting: {}", self.setting);
        println!("Another Setting: {}", self.another_setting);
    }
}

fn main() {
    ThingConfig::new()
        .do_a()
        .with_setting(3)
        .execute();
}

Now you've tucked away the complexity, and made it much harder to get the parameters wrong. You can also add validation to the builder, and make sure that the parameters are valid before you execute the function.

You can combine this with the Error Handling system to chain validation calls, and return an error if the parameters are invalid. For example:

use thiserror::Error;

#[derive(Error, Debug)]
enum ThingError {
    #[error("Setting must be between 0 and 10")]
    SettingOutOfRange,
}

type ThingResult<T> = Result<T, ThingError>;

struct ThingConfig {
    do_a: bool,
    do_b: bool,
    setting: usize,
    another_setting: usize,
}

#[allow(dead_code)]
impl ThingConfig {
    fn new() -> Self {
        ThingConfig {
            do_a: false,
            do_b: false,
            setting: 0,
            another_setting: 0,
        }
    }

    fn do_a(mut self) -> ThingResult<Self> {
        self.do_a = true;
        Ok(self)
    }

    fn do_b(mut self) -> ThingResult<Self> {
        self.do_b = true;
        Ok(self)
    }

    fn with_setting(mut self, setting: usize) -> ThingResult<Self> {
        if setting > 10 {
            Err(ThingError::SettingOutOfRange)
        } else {
            self.setting = setting;
            Ok(self)
        }
    }

    fn with_another_setting(mut self, setting: usize) -> ThingResult<Self> {
        self.another_setting = setting;
        Ok(self)
    }

    fn execute(&self) -> ThingResult<()> {
        if self.do_a {
            println!("Doing A");
        }
        if self.do_b {
            println!("Doing B");
        }
        println!("Setting: {}", self.setting);
        println!("Another Setting: {}", self.another_setting);
        Ok(())
    }
}

fn main() -> ThingResult<()> {
    ThingConfig::new()
        .do_a()?
        .with_setting(3)?
        .execute()?;

    Ok(())
}

Defaults

Complex types should implement Default. This allows you to create a default instance of the type, and then override the parameters you want to change. For example:

#![allow(unused)]
fn main() {
pub struct MyType {
    pub a: usize,
    pub b: usize,
    pub c: usize,
}

impl Default for MyType {
    fn default() -> Self {
        Self {
            a: 0,
            b: 0,
            c: 0,
        }
    }
}
}

You can use the shorthand:

#![allow(unused)]
fn main() {
#[derive(Default)]
pub struct MyType {
    pub a: usize,
    pub b: usize,
    pub c: usize,
}
}

You can now instantiate the structure as MyType::default() or use a partial initialization:

fn main() {
    let t = MyType {
        a: 2,
        ..Default::default()
    };
}

You can set default values for enums with Default, too:

#![allow(unused)]
fn main() {
#[derive(Default)]
enum MyType {
    One,
    #[default]
    Two,
    Three,
}
}

Partial Structure Assignment

Don't forget partial structure assignment. It's very helpful when you need to create a new value based mostly on a previous one:

struct MyType {
    a: i32,
    b: i32,
}

fn main() {
    let one = MyType { a: 3, b: 4 };
    let two = MyType {
        a: 4,
        ..one
    };
}

Prefer Enums

Whenever possible, don't store a String with some preference in it, or an opaque integer where 3 means "do this". Use an enumeration. Rust's enumerations are very powerful, and you can attach data to variants as needed. They also work really well with match, and there's no room for typos.
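
For example (a made-up configuration type), an enum with optional payload data replaces both the stringly-typed preference and the magic number, and match forces every variant to be handled:

enum OutputMode {
    Quiet,
    Normal,
    // Variants can carry data where it's needed.
    LogToFile { path: String },
}

fn describe(mode: &OutputMode) -> String {
    // No typos possible: the compiler checks every variant name.
    match mode {
        OutputMode::Quiet => "no output".to_string(),
        OutputMode::Normal => "writing to stdout".to_string(),
        OutputMode::LogToFile { path } => format!("logging to {path}"),
    }
}

fn main() {
    let mode = OutputMode::LogToFile { path: "run.log".to_string() };
    println!("{}", describe(&mode));
}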

New Types as Traits

Another way to represent unit types is with a trait. This has certain advantages - the trait defines the possible output types, and you are using a named function to retrieve what you want (no more .0 and tuple syntax). You can also implement the trait for any type, allowing you to arbitrarily create a temperature. Here's an example of creating a temperature conversion with a trait, and then using that trait with an enum implementation as a means of applying user output preferences:

trait TemperatureConversion {
    fn as_celsius(&self) -> f32;
    fn as_farenheit(&self) -> f32;
}

struct Temperature {
    kelvin: f32
}

impl Temperature {
    fn with_celsius(celsius: f32) -> Self {
        Self { kelvin: celsius + 273.15 }
    }
    
    fn with_farenheit(farenheit: f32) -> Self {
        Self { kelvin: ((farenheit - 32.0) * 5.0 / 9.0) + 273.15 }
    }
}

impl TemperatureConversion for Temperature {
    fn as_celsius(&self) -> f32 {
        self.kelvin - 273.15
    }
    
    fn as_farenheit(&self) -> f32 {
        ((self.kelvin - 273.15) * 9.0/5.0) + 32.0
    }
}

enum TemperaturePreference {
    Celsius,
    Farenheit,
}

impl TemperaturePreference {
    fn display(&self, temperature: impl TemperatureConversion) -> String {
        match self {
            Self::Celsius => format!("{:.0}°C", temperature.as_celsius()),
            Self::Farenheit => format!("{:.0}°F", temperature.as_farenheit()),
        }
    }
}

fn main() {
    let temperature = Temperature::with_celsius(100.0);
    let preference = TemperaturePreference::Farenheit;
    println!("{}", preference.display(temperature));
}

Floating Point Numbers

Rust's floating point types are full IEEE 754 floating-point representations---including inaccuracy. It's very easy to get a nearly-right but not accurate number. Checking equality is dangerous!

In particular, don't ever store money in a floating point number. Your accountants won't like it.
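
A quick demonstration of why naive equality checks bite:

fn main() {
    let a = 0.1_f64 + 0.2_f64;
    // Prints false: a is 0.30000000000000004, not exactly 0.3.
    println!("{}", a == 0.3);
    // Comparing against a small tolerance is the usual workaround.
    println!("{}", (a - 0.3).abs() < 1e-10);
}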

If this matters to your code, you have a few options:

  • You can use the approx crate for some helpful macros to help check approximate equality.
  • You can use the bigdecimal crate to store arbitrary-precision decimals without the inaccuracy. It's quite fast (though not as fast as hardware floating point), and is supported by Serde, Postgres and other crates.

Platform and Feature Specific Code

If you're shipping a binary, it's quite common to run into a situation where you need to do something differently on different platforms. For example, you might want to use a different library on Windows than you do on Linux. You might want to use a different file path separator. You might want to use a different system call.

Use Library Code

A lot of platform abstraction is done for you by the standard library. For example, prefer using File over the nix crate to obtain a file handle. For memory-mapping, the memmap2 crate handles many of the platform issues for you.

Use Config Directives

You can make blocks of code conditionally compile based on feature flags or platform. For example:

#![allow(unused)]
fn main() {
#[cfg(all(not(feature = "opengl"), feature = "cross_term"))]
}

Will only compile if the opengl feature is disabled, and the cross_term feature is enabled. You'll often need blocks combining several feature and platform conditions to determine what to compile. It gets messy fast.

To minimize the mess, define a common interface. It could be a trait, or it could be a structure that will always offer the same methods (the trait is cleaner). Put each platform/feature implementation in a separate module and make the compilation decision at module inclusion time. For example:

#![allow(unused)]
fn main() {
#[cfg(all(feature = "opengl", not(target_arch = "wasm32")))]
mod native;

#[cfg(all(feature = "opengl", not(target_arch = "wasm32")))]
pub use native::*;

#[cfg(all(feature = "opengl", target_arch = "wasm32"))]
mod wasm;

#[cfg(all(feature = "opengl", target_arch = "wasm32"))]
pub use wasm::*;
}

Now when you compile, it only includes the appropriate module and shares the common type defined in each of the modules. That's a great way to share functionality between platform-specific implementations (which can be managed by different teams, even) without resorting to dynamic dispatch.
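
A minimal sketch of that idea: each platform module exposes the same type name, so the rest of the crate never needs to know which implementation was compiled in. The module and type names here are illustrative:

#[cfg(not(target_arch = "wasm32"))]
mod native {
    pub struct Backend;
    impl Backend {
        pub fn draw_frame(&mut self) {
            println!("drawing with the native renderer");
        }
    }
}

#[cfg(not(target_arch = "wasm32"))]
pub use native::Backend;

#[cfg(target_arch = "wasm32")]
mod wasm {
    pub struct Backend;
    impl Backend {
        pub fn draw_frame(&mut self) {
            // The WebGL path would live here.
        }
    }
}

#[cfg(target_arch = "wasm32")]
pub use wasm::Backend;

fn main() {
    // Callers use Backend without caring which module provided it.
    let mut backend = Backend;
    backend.draw_frame();
}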

General Best Practices

TANSTAAFL

There Ain't No Such Thing As A Free Lunch

This is something I like to remind everyone of, no matter what language you are using.

Rust is fast, and you often get performance benefits relative to other languages just for using it. But everything is a trade-off:

  • You can trade developer time optimizing vs. execution time and server cost.
  • You can trade code readability vs. performance in some cases.
  • You can trade compilation time vs. execution time.

Generics

Every time you make a generic type or function, the compiler generates a concrete implementation for each combination of types you actually use (monomorphization). This adds to your compile times, and can make for bigger binaries.

Macros

Macros are slow to compile, but can make coding much more pleasant. Use macros sparingly. Don't redefine large swathes of syntax because you can---do it when it makes the interface objectively better.

YAGNI

You Ain't Gonna Need It

This is another phrase I use a lot. It's always tempting to gold-plate your interfaces until you have something that supports everything you might ever need. Then, years later, you look through and notice that nobody ever used some of the options.

Try to specify a narrow target that does what's needed. If there are obvious extensions, document them and bear them in mind with your initial design. If there are obscure extensions, note them down. But don't spend hours and hours adding "gold plated" features that may never be used.

The inverse of this is that if you do provide a kitchen-sink interface, you (or your successor) have to support it. The more options you offer, the more support work is in your future.

Look at Windows. Once something hits the Windows API, it never goes away. So you wind up with CreateWindow, CreateWindowEx, and a giant combinatorial explosion of the API surface. That's something to avoid if at all possible.

Domain Boundaries and Network Calls

Whether you have a monolith or a host of small services (I prefer starting with a modular monolith and scaling out services if demand requires it), there's a tendency to compartmentalize code. Overall, this is a good thing. Teams or individuals can take ownership of sections of the program, and with sufficient testing to ensure that they do what they say they will do, you have a working system and can still scale up the developer pool as you grow.

Problems always crop up at domain boundaries.

Defensive Programming

You can avoid a LOT of bugs if you:

  • Minimize the publicly accessible portion of your code. Expose high-level functionality that "fans out" into implementation.
  • At the interface, check your preconditions. Not just user input! If your system only works with a range of 0 to 10, check that! Make sure your unit tests check that trying it produces an error, too. This can be a great use-case for the builder pattern with ?.
  • Use strong typing for inputs to ensure that external callers can't give you completely inappropriate input by accident.
  • Use debug and info tracing liberally. It'll make tracking down what went wrong a lot easier.

FFI

When you bind some external code, there's a few things to consider:

  • You need to validate the code's inputs, unless you really trust the code you are wrapping.
  • When you call into Go or Python code, the other language's FFI interface includes some marshaling that is MUCH slower than a pure C call (which is effectively free). So always make sure that the Go code you are calling does enough in one call to warrant the delay. Favor calls that do a lot, over lots of tiny calls---because you are paying for each call.

Network Calls

Aside: I once worked with a team that had a distributed architecture (this was back before the word "microservices" was popular). A request came in, which triggered a call to the authenticator and the service locator. The call was then passed to the appropriate service---which in turn called the authenticator again, and used the service locator to locate the calls it required. In one case, retrieving a balance required over 30 different network calls. Latency was, to say the least, painful.

It's really common to call services, which may call other services. It's very popular to do this with HTTP(s) calls. If you trace performance on these calls, the single largest time consumer is often... opening the network TCP connection. After that, TCP "slow start" prevents the first few bytes from arriving instantly. Even a call to localhost over TCP can take microseconds to open the connection---even if the call resolves in nanoseconds.

You can mitigate this a bit:

  • If you can, run calls concurrently.
  • Where possible, open a TCP connection and use a binary protocol. KEEP the connection open and stream requests/replies through it.
  • Ask the hard question: does the service actually need to be running remotely?

Error-Handling and Debugging

In this section, we will cover:

  • Logging, Tracing and Telemetry
  • Error Handling in Rust
  • Debugging

Error Handling

Much of this section applies to both async and non-async code. Async code has a few extra considerations: you are probably managing large amounts of IO, and really don't want to stop the world when an error occurs!

Rust Error Handling

In previous examples, we've used unwrap() or expect("my message") to get the value out of a Result. If an error occurred, your program (or thread) crashes. That's not great for production code!

Aside: Sometimes, crashing is the right thing to do. If you can't recover from an error, crashing is preferable to trying to continue and potentially corrupting data.

So what is a Result?

A Result is an enum, just like we covered in week 1. It's a "sum type"---it can be one of two things---and never both. A Result is either Ok(T) or Err(E). It's deliberately hard to ignore errors!

This differs from other languages:

  • C (error type: int): Errors are returned as a number, or even NULL. It's up to you to decipher what the library author meant. Convention indicates that returning <0 is an error, and >=0 is success.
  • C++ (error types: std::exception, expected, int, anything you like!): Exceptions, which are thrown and "bubble up the stack" until they are caught in a catch block. If an exception is uncaught, the program crashes. Exceptions can have performance problems. Many older C++ programs use the C style of returning an error code. Some newer C++ programs use std::expected and std::unexpected to make it easier to handle errors without exceptions.
  • Java (error types: Exception, Optional): Checked exceptions, which are like exceptions, but handling them is mandatory. Every function must declare what exceptions it can throw, and every caller must handle them. This is a great way to make sure you don't ignore errors, but it's also a great way to make sure you have a lot of boilerplate code. This can get a little silly, so you find yourself re-throwing exceptions to turn them into types you can handle. Java is also adding the Optional type to make it easier to handle errors without exceptions.
  • Go (error type: error): Functions can return both an error type and a value. The compiler won't let you forget to check for errors, but it's up to you to handle them. In-memory, you are often returning both the value and an empty error structure.
  • Rust (error type: Result<T, E>): Functions return an enum that is either Ok(T) or Err(E). The compiler won't let you forget to check for errors, and it's up to you to handle them. Result is not an exception type, so it doesn't incur the overhead of throwing. You're always returning a value or an error, never both.

So there's a wide range of ways to handle errors across the language spectrum. Rust's goal is to make it easy to work with errors, and hard to ignore them - without incurring the overhead of exceptions. However (there's always a however!), default standard-library Rust makes it harder than it should be.

Strongly Typed Errors: A Blessing and a Curse!

The code for this is in the 03_async/rust_errors1 directory.

Rust's errors are very specific, and can leave you with a lot of things to match. Let's look at a simple example:

use std::path::Path;

fn main() {
    let my_file = Path::new("mytile.txt");
    // This yields a Result type of String or an error
    let contents = std::fs::read_to_string(my_file);
    // Let's just handle the error by printing it out
    match contents {
        Ok(contents) => println!("File contents: {contents}"),        
        Err(e) => println!("ERROR: {e:#?}"),
    }
}

This prints out the details of the error:

ERROR: Os {
    code: 2,
    kind: NotFound,
    message: "The system cannot find the file specified.",
}

That's great, but what if we want to do something different for different errors? We can match on the error type:

#![allow(unused)]
fn main() {
match contents {
    Ok(contents) => println!("File contents: {contents}"),
    Err(e) => match e.kind() {
        std::io::ErrorKind::NotFound => println!("File not found"),
        std::io::ErrorKind::PermissionDenied => println!("Permission denied"),
        _ => println!("ERROR: {e:#?}"),
    },
}
}

The _ is there because otherwise you end up with a remarkably exhaustive list:

#![allow(unused)]
fn main() {
match contents {
    Ok(contents) => println!("File contents: {contents}"),
    Err(e) => match e.kind() {
        std::io::ErrorKind::NotFound => println!("File not found"),
        std::io::ErrorKind::PermissionDenied => println!("Permission denied"),
        std::io::ErrorKind::ConnectionRefused => todo!(),
        std::io::ErrorKind::ConnectionReset => todo!(),
        std::io::ErrorKind::ConnectionAborted => todo!(),
        std::io::ErrorKind::NotConnected => todo!(),
        std::io::ErrorKind::AddrInUse => todo!(),
        std::io::ErrorKind::AddrNotAvailable => todo!(),
        std::io::ErrorKind::BrokenPipe => todo!(),
        std::io::ErrorKind::AlreadyExists => todo!(),
        std::io::ErrorKind::WouldBlock => todo!(),
        std::io::ErrorKind::InvalidInput => todo!(),
        std::io::ErrorKind::InvalidData => todo!(),
        std::io::ErrorKind::TimedOut => todo!(),
        std::io::ErrorKind::WriteZero => todo!(),
        std::io::ErrorKind::Interrupted => todo!(),
        std::io::ErrorKind::Unsupported => todo!(),
        std::io::ErrorKind::UnexpectedEof => todo!(),
        std::io::ErrorKind::OutOfMemory => todo!(),
        std::io::ErrorKind::Other => todo!(),
        _ => todo!(),            
    },
}
}

Many of those errors aren't even relevant to opening a file! Worse, as the Rust standard library grows, more errors can appear---meaning a rustup update run could break your program. That's not great! So when you are handling individual errors, you should always use the _ to catch any new errors that might be added in the future.

Pass-Through Errors

The code for this is in the 03_async/rust_errors2 directory.

If you are just wrapping some very simple functionality, you can make your function signature match the function you are wrapping:

use std::path::Path;

fn maybe_read_a_file() -> Result<String, std::io::Error> {
    let my_file = Path::new("mytile.txt");
    std::fs::read_to_string(my_file)
}

fn main() {
    match maybe_read_a_file() {
        Ok(text) => println!("File contents: {text}"),
        Err(e) => println!("An error occurred: {e:?}"),
    }
}

No need to worry about re-throwing, you can just return the result of the function you are wrapping.

The ? Operator

We mentioned earlier that Rust doesn't have exceptions. It does have the ability to pass errors up the call stack---but because they are handled explicitly in return statements, they don't have the overhead of exceptions. This is done with the ? operator.

Let's look at an example:

#![allow(unused)]
fn main() {
fn file_to_uppercase() -> Result<String, std::io::Error> {
    let contents = maybe_read_a_file()?;
    Ok(contents.to_uppercase())
}
}

This calls our maybe_read_a_file function and adds a ? to the end. What does the ? do?

  • If the Result is Ok, it extracts the wrapped value and hands it to you---in this case it ends up in contents.
  • If an error occurred, it returns the error to the caller.

This is great for function readability---you don't lose the "flow" of the function amidst a mass of error handling. It's also good for performance, and if you prefer the "top down" error handling approach it's nice and clean---the error gets passed up to the caller, and they can handle it.

What if I just want to ignore the error?

You must handle the error in some way. You can just call the function:

#![allow(unused)]
fn main() {
file_to_uppercase();
}

This will generate a compiler warning that there's a Result type that must be used. You can silence the warning with an underscore:

#![allow(unused)]
fn main() {
let _ = file_to_uppercase();
}

_ is the placeholder symbol - you are telling Rust that you don't care. But you are explicitly not caring---you've told the compiler that ignoring the error is a conscious decision!

You can also use the if let pattern and simply not add an error handler:

#![allow(unused)]
fn main() {
if let Ok(contents) = file_to_uppercase() {
    println!("File contents: {contents}");
}
}

What About Different Errors?

The ? operator is great, but it requires that your function's declared error type match the error you are passing upwards. Otherwise, the compiler can't guarantee---in a strongly typed way---that every error is accounted for.

Let's take an example that draws a bit from our code on day 1.

The code for this is in the 03_async/rust_errors3 directory.

Let's add serde and serde_json to our project:

cargo add serde -F derive
cargo add serde_json

And we'll quickly define a deserializable struct:

#![allow(unused)]
fn main() {
use std::path::Path;
use serde::Deserialize;

#[derive(Deserialize)]
struct User {
    name: String,
    password: String,
}

fn load_users() {
    let my_file = Path::new("users.json");
    let raw_text = std::fs::read_to_string(my_file)?;
    let users: Vec<User> = serde_json::from_str(&raw_text)?;
    Ok(users)
}
}

This isn't going to compile yet, because we aren't returning a type from the function. So we add a Result:

#![allow(unused)]
fn main() {
fn load_users() -> Result<Vec<User>, Error> {
    // ...
}
}

Oh no! What do we put for Error? We have a problem! read_to_string returns an std::io::Error type, and serde_json::from_str returns a serde_json::Error type. We can't return both!

Boxing Errors

You'll learn about the Box type and what dyn means next week. For now, Box is a pointer---and the dyn keyword indicates that its contents are dynamic---it can hold any type that implements the Error trait. You'll learn about traits next week, too!

There's a lot of typing for a generic error type, but it works:

#![allow(unused)]
fn main() {
type GenericResult<T> = std::result::Result<T, Box<dyn std::error::Error>>;

fn load_users() -> GenericResult<Vec<User>> {
    let my_file = Path::new("users.json");
    let raw_text = std::fs::read_to_string(my_file)?;
    let users: Vec<User> = serde_json::from_str(&raw_text)?;
    Ok(users)
}
}

This works with every possible type of error. Let's add a main function and see what happens:

fn main() {
    let users = load_users();
    match users {
        Ok(users) => {
            for user in users {
                println!("User: {}, {}", user.name, user.password);
            }
        },
        Err(err) => {
            println!("Error: {err}");
        }
    }
}

The result prints:

Error: The system cannot find the file specified. (os error 2)

You have the exact error message, but you really don't have any way to tell what went wrong programmatically. That may be ok for a simple program.
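
If you do need to react to the error programmatically, you can try to downcast the boxed error back to a concrete type. Here's a minimal sketch, reusing load_users from above:

fn main() {
    if let Err(e) = load_users() {
        // Attempt to recover the concrete error type from the Box<dyn Error>
        if let Some(io_err) = e.downcast_ref::<std::io::Error>() {
            println!("I/O problem: {io_err}");
        } else if e.downcast_ref::<serde_json::Error>().is_some() {
            println!("The file exists, but it isn't valid JSON");
        } else {
            println!("Some other error: {e}");
        }
    }
}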

Easy Boxing with Anyhow

There's a crate named anyhow that makes it easy to box errors. Let's add it to our project:

cargo add anyhow

Then you can replace the Box definition with anyhow::Error:

#![allow(unused)]
fn main() {
fn anyhow_load_users() -> anyhow::Result<Vec<User>> {
    let my_file = Path::new("users.json");
    let raw_text = std::fs::read_to_string(my_file)?;
    let users: Vec<User> = serde_json::from_str(&raw_text)?;
    Ok(users)
}
}

It still functions the same way:

Error: The system cannot find the file specified. (os error 2)

In fact, anyhow is mostly just a convenience wrapper around Box and dyn. But it's a very convenient wrapper!

Anyhow does make it a little easier to return your own error:

#![allow(unused)]
fn main() {
#[allow(dead_code)]
fn anyhow_load_users2() -> anyhow::Result<Vec<User>> {
    let my_file = Path::new("users.json");
    let raw_text = std::fs::read_to_string(my_file)?;
    let users: Vec<User> = serde_json::from_str(&raw_text)?;
    if users.is_empty() {
        anyhow::bail!("No users found");
    }
    if users.len() > 10 {
        return Err(anyhow::Error::msg("Too many users"));
    }
    Ok(users)
}
}

I've included the short way and the long way---they do the same thing. bail! is a handy macro for "error out with this message". If you miss Go-like "send any error you like", anyhow has your back!

As a rule of thumb: anyhow is great in client code, or code where you don't really care what went wrong---you care that an error occurred and should be reported.
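
anyhow also lets you attach context to an error as it bubbles up, via its Context trait---handy when you mostly want a readable report. Here's a small sketch, reusing the User type and users.json from above:

use anyhow::Context;

fn load_users_with_context() -> anyhow::Result<Vec<User>> {
    // context() wraps the underlying error with a human-friendly message
    let raw_text = std::fs::read_to_string("users.json")
        .context("unable to read users.json")?;
    let users: Vec<User> = serde_json::from_str(&raw_text)
        .context("users.json is not valid JSON")?;
    Ok(users)
}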

Writing Your Own Error Types

Defining a full error type in Rust is a bit of a pain. You need to define a struct, implement the Error trait, and then implement the Display trait. You'll learn about traits next week, but for now you can think of them as "interfaces" that define what a type can do.

This is included in the rust_errors3 project. We're just going to look at it, because the Rust community as a whole has decided that this is overly painful and does it an easier way!

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
enum UsersError {
    NoUsers, TooManyUsers
}

use std::fmt;

impl fmt::Display for UsersError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            UsersError::NoUsers => write!(f, "no users found"),
            UsersError::TooManyUsers => write!(f, "too many users found"),
        }
    }
}
}

That's quite a lot of typing for an error! Pretty much nobody in the Rust world does this, unless you are in an environment in which you can't rely on external crates. Let's do the same thing with the thiserror crate:

cargo add thiserror

And then:

#![allow(unused)]
fn main() {
use thiserror::Error;

#[derive(Debug, Error)]
enum UsersError {
    #[error("No users found")]
    NoUsers, 
    #[error("Too many users were found")]
    TooManyUsers
}
}

That's much easier!


Mapping Errors

So let's use the new error type (UsersError):

#![allow(unused)]
fn main() {
fn work_with_my_error() -> Result<Vec<User>, UsersError> {
    let my_file = Path::new("users.json");
    let raw_text = std::fs::read_to_string(my_file)?;
    let users: Vec<User> = serde_json::from_str(&raw_text)?;
    if users.is_empty() {
        Err(UsersError::NoUsers)
    } else if users.len() > 10 {
        Err(UsersError::TooManyUsers)
    } else {
        Ok(users)
    }
}
}

Oh dear - that doesn't compile! Why? Because read_to_string and from_str return errors that aren't your UsersError.

We're trying to make a production library here, and having well-defined errors makes for clearer control flow for our users. So we need to map the errors to a type we can handle. Let's add two more error types to our enumeration:

#![allow(unused)]
fn main() {
#[derive(Debug, Error)]
enum UsersError {
    #[error("No users found")]
    NoUsers, 
    #[error("Too many users were found")]
    TooManyUsers,
    #[error("Unable to open users file")]
    FileError,
    #[error("Unable to deserialize json")]
    JsonError(serde_json::Error),
}
}

Notice that we've added a tuple member to JsonError containing the actual underlying error. You might want to use it later, since it tells you why the file couldn't be deserialized.

Let's tackle our first ?: reading the file:

#![allow(unused)]
fn main() {
let raw_text = std::fs::read_to_string(my_file).map_err(|_| UsersError::FileError)?;
}

We're using map_err on the function. It calls a function that receives the actual error as a parameter, and then returns a different type of error---the one we've created.

You can do the same for deserializing:

#![allow(unused)]
fn main() {
let users: Vec<User> = serde_json::from_str(&raw_text).map_err(UsersError::JsonError)?;
}

In this case, a Rust shorthand kicks in. This is the same as map_err(|e| UsersError::JsonError(e)) - but because you're just passing the parameter in, a bit of syntax sugar lets you shorten it. Use the long-form if that's confusing (and Clippy - the linter - will suggest the short version).

So what have you gained here?

  • You are now clearly defining the errors that come out of your library or program---so you can handle them explicitly.
  • You've retained the inner error that might be useful, which might be handy for logging.
  • You aren't messing with dynamic types and boxing, you are just mapping to an error type.
  • You've regained control: YOU decide what's really an error, and how much depth you need to handle it.
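
As an aside: when you're happy to expose the underlying error, thiserror can generate the conversions for you with #[from], so the ? operator works without map_err. Here's a sketch of that alternative definition (load_users_from is just an illustrative name):

#[derive(Debug, Error)]
enum UsersError {
    #[error("No users found")]
    NoUsers,
    #[error("Too many users were found")]
    TooManyUsers,
    #[error("Unable to open users file")]
    FileError(#[from] std::io::Error),
    #[error("Unable to deserialize json")]
    JsonError(#[from] serde_json::Error),
}

fn load_users_from() -> Result<Vec<User>, UsersError> {
    let raw_text = std::fs::read_to_string("users.json")?;   // io::Error -> UsersError
    let users: Vec<User> = serde_json::from_str(&raw_text)?; // serde_json::Error -> UsersError
    Ok(users)
}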

Back to Async!

The error handling so far has been generic and applies to everything you might write in Rust. Once you get into async land, handling errors becomes even more important. If you've written a network service, there might be hundreds or even thousands of transactions flying around---and you want to handle errors cleanly, without bringing down your enterprise service.

The code for this is in 03_async/rust_errors_async

We're going to make use of a few crates for this project:

cargo add tokio -F full
cargo add futures
cargo add anyhow
cargo add thiserror

Top-Level Error Handling

You can make use of the ? operator in main by returning a Result from main. You also need to return Ok(()) at the end:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    Ok(())
}

This can let you write pretty clean-looking code and still cause the program to stop with an explicit error message:

async fn divide(number: u32, divisor: u32) -> anyhow::Result<u32> {
    if divisor == 0 {
        anyhow::bail!("Dividing by zero is a bad idea")
    } else {
        Ok(number / divisor)
    }
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    divide(5, 0).await?;
    Ok(())
}

Note: In real code it's easier to use the checked_div function, which returns an Option you can convert into an error---this version is for illustration. (There's a sketch after the output below.)

Running this yields:

Error: Dividing by zero is a bad idea
error: process didn't exit successfully: `C:\Users\Herbert\Documents\Ardan\5x1 Day Ultimate Rust\code\target\debug\rust_errors_async.exe` (exit code: 1)
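
For comparison, here's a sketch of the checked_div approach mentioned in the note above (divide_checked is just an illustrative name):

async fn divide_checked(number: u32, divisor: u32) -> anyhow::Result<u32> {
    // checked_div returns None on division by zero; ok_or_else turns that into an error
    number
        .checked_div(divisor)
        .ok_or_else(|| anyhow::anyhow!("Dividing by zero is a bad idea"))
}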

Joining Fallible Futures

You can use the above pattern to simplify your error handling. What if you want to run lots of async operations, any of which may fail?

Let's try this:

#![allow(unused)]
fn main() {
let mut futures = Vec::new();
for n in 0..5 {
    futures.push(divide(20, n));
}
let results = futures::future::join_all(futures).await;
println!("{results:#?}");
}

The program doesn't crash, but the result from join_all is an array of Result types. You could iterate the array and keep the ones that worked, decide to fail because something failed, etc.

What if you'd like to transform a list of Results into a single Result containing a list of values? If anything failed, it all failed. collect can do that for you!

#![allow(unused)]
fn main() {
// Condense the results! ANY error makes the whole thing an error.
let overall_result: anyhow::Result<Vec<u32>> = results.into_iter().collect();
println!("{overall_result:?}");
}

So now you can turn any failure into a program failure if you like:

#![allow(unused)]
fn main() {
let values = overall_result?; // Propagates the error out of main, ending the program
}

Or what if you'd like to just keep the good ones (and log the errors!):

#![allow(unused)]
fn main() {
// Separate the errors and the results
let mut errors = Vec::new();
let good: Vec<_> = results
    .into_iter()
    .filter_map(|r| r.map_err(|e| errors.push(e)).ok())
    .collect();
println!("{good:?}");
println!("{errors:?}");
Ok(())
}

You can do whatever you like with the errors. Logging them is a good idea (you could even replace the errors.push with a log call); we'll cover that when we get to tracing.

So that was a larger section, but you now have the basics you need to write fallible---it can fail---but reliable code. We'll use these techniques from now on.

Tracing (and logging)

Add the following dependencies:

cargo add tokio -F full
cargo add tracing
cargo add tracing-subscriber

Tracing works without async and Tokio. We'll be using it with async later, so we're adding those dependencies now.

Subscribing to Trace Events and Logging

The most basic usage is for logging. The following program gets you a minimal logging framework:

#[tokio::main]
async fn main() {
    let subscriber = tracing_subscriber::FmtSubscriber::new();
    tracing::subscriber::set_global_default(subscriber).unwrap();

    tracing::info!("Hello World!");
    tracing::warn!("Hello World!");
    tracing::error!("Hello World!");
}

This will output:

2023-07-20T17:50:03.067756Z  INFO tokio_tracing: Hello World!
2023-07-20T17:50:03.067920Z  WARN tokio_tracing: Hello World!
2023-07-20T17:50:03.068006Z ERROR tokio_tracing: Hello World!

That's nice (and colorful in a console that supports it), but it would be nice to have some more information. We can replace FmtSubscriber::new() with the builder pattern to heavily customize the output:

#![allow(unused)]
fn main() {
let subscriber = tracing_subscriber::fmt()
    // Use a more compact, abbreviated log format
    .compact()
    // Display source code file paths
    .with_file(true)
    // Display source code line numbers
    .with_line_number(true)
    // Display the thread ID an event was recorded on
    .with_thread_ids(true)
    // Don't display the event's target (module path)
    .with_target(false)
    // Build the subscriber
    .finish();
}

Now you get a bit more insight into where the log messages are coming from:

2023-07-20T17:53:42.283322Z  INFO ThreadId(01) tokio_tracing\src\main.rs:22: Hello World!
2023-07-20T17:53:42.283502Z  WARN ThreadId(01) tokio_tracing\src\main.rs:23: Hello World!
2023-07-20T17:53:42.283699Z ERROR ThreadId(01) tokio_tracing\src\main.rs:24: Hello World!

Tracing Spans

That's a great start, but wouldn't it be nice to get some performance indications?

Include this in your use statements:

#![allow(unused)]
fn main() {
use tracing_subscriber::fmt::format::FmtSpan;
}

And add another entry to the subscriber builder:

#![allow(unused)]
fn main() {
// Add span events
.with_span_events(FmtSpan::ENTER | FmtSpan::CLOSE)
}

Now we'll make a function, and decorate it with #[instrument]:

#![allow(unused)]
fn main() {
#[tracing::instrument]
fn do_something() {
    tracing::info!("Doing something");
}
}

Finally, we'll add a call to the instrumented function:

#![allow(unused)]
fn main() {
do_something();
}

Running the program now gives us some performance data:

2023-07-20T17:57:14.201166Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:33: enter
2023-07-20T17:57:14.201334Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:35: Doing something
2023-07-20T17:57:14.201467Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:33: close time.busy=286µs time.idle=37.0µs

You can keep adding functions. For example:

#![allow(unused)]
fn main() {
#[tracing::instrument]
fn do_something() {
    tracing::info!("Doing something");
    for n in 0..3 {
        do_something_else(n);
    }
}

#[tracing::instrument]
fn do_something_else(n: i32) {
    tracing::info!("Doing something else: {n}");
}
}

Gives you detailed output regarding each child call, as well as the parent call:

2023-07-20T18:00:46.121875Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:33: enter
2023-07-20T18:00:46.122054Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:35: Doing something
2023-07-20T18:00:46.122227Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: enter n=0
2023-07-20T18:00:46.122375Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:43: Doing something else: 0 n=0
2023-07-20T18:00:46.122540Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: close time.busy=311µs time.idle=5.20µs n=0
2023-07-20T18:00:46.122790Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: enter n=1
2023-07-20T18:00:46.122916Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:43: Doing something else: 1 n=1
2023-07-20T18:00:46.123041Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: close time.busy=250µs time.idle=3.90µs n=1
2023-07-20T18:00:46.123244Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: enter n=2
2023-07-20T18:00:46.123361Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:43: Doing something else: 2 n=2
2023-07-20T18:00:46.123473Z  INFO ThreadId(01) do_something:do_something_else: tokio_tracing\src\main.rs:41: close time.busy=229µs time.idle=3.20µs n=2
2023-07-20T18:00:46.123822Z  INFO ThreadId(01) do_something: tokio_tracing\src\main.rs:33: close time.busy=1.94ms time.idle=23.0µs

It's not a formal benchmark, but it's great for quick performance checks.

Async Spans

Let's add an async function:

#![allow(unused)]

fn main() {
#[tracing::instrument]
async fn do_something_async() {
    tracing::info!("We're in an async context");
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    tracing::info!("Finished waiting");
}
}

And call it from main:

#![allow(unused)]
fn main() {
do_something_async().await;
}

You get the following output:

2023-07-20T18:04:44.150288Z  INFO ThreadId(01) do_something_async: tokio_tracing\src\main.rs:47: enter
2023-07-20T18:04:44.150405Z  INFO ThreadId(01) do_something_async: tokio_tracing\src\main.rs:49: We're in an async context
2023-07-20T18:04:45.153037Z  INFO ThreadId(01) do_something_async: tokio_tracing\src\main.rs:47: enter
2023-07-20T18:04:45.153248Z  INFO ThreadId(01) do_something_async: tokio_tracing\src\main.rs:51: Finished waiting
2023-07-20T18:04:45.153378Z  INFO ThreadId(01) do_something_async: tokio_tracing\src\main.rs:47: close time.busy=630µs time.idle=1.00s

Notice how it lists idle=1.00s? Tracing is smart enough to list "idle" time as time awaiting something else. So you can get a good gauge of how much time you are spending waiting on async processes.

Axum Integration

We'll switch to a new project, axum_tracing. Let's add some dependencies:

cargo add tokio -F full
cargo add axum
cargo add tower-http -F full

Note that tower-http's full feature set includes its trace feature.

And we'll build the basic hello world service again:

use axum::{routing::get, Router};
use std::net::SocketAddr;

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(say_hello_text));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

async fn say_hello_text() -> &'static str {
    "Hello, world!"
}

As before, it displays "Hello, world!" in a boring page. Now, we'll add a tracing subscriber:

cargo add tracing
cargo add tracing-subscriber

And initialize our subscriber at the beginning of the program:

#![allow(unused)]
fn main() {
use tracing_subscriber::fmt::format::FmtSpan;
let subscriber = tracing_subscriber::fmt()
    // Use a more compact, abbreviated log format
    .compact()
    // Display source code file paths
    .with_file(true)
    // Display source code line numbers
    .with_line_number(true)
    // Display the thread ID an event was recorded on
    .with_thread_ids(true)
    // Don't display the event's target (module path)
    .with_target(false)
    // Add span events
    .with_span_events(FmtSpan::ENTER | FmtSpan::CLOSE)
    // Display debug-level info
    .with_max_level(tracing_subscriber::filter::LevelFilter::DEBUG)
    // Build the subscriber
    .finish();

tracing::subscriber::set_global_default(subscriber).unwrap();
}

Notice that we've added with_max_level to display "debug" level events. There are a LOT of those!

Finally, we add a layer to our router handler:

#![allow(unused)]
fn main() {
let app = Router::new()
    .route("/", get(say_hello_text))
    .layer(TraceLayer::new_for_http());
}

Using the "tower" service manager, we are now subscribed to all of its tracing events. You get a LOT of output:

2023-07-20T18:38:03.384687Z DEBUG ThreadId(21) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\proto\h1\io.rs:207: parsed 12 headers
2023-07-20T18:38:03.385045Z DEBUG ThreadId(21) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\proto\h1\conn.rs:222: incoming body is empty
2023-07-20T18:38:03.385607Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\make_span.rs:109: enter method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.385852Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\on_request.rs:80: started processing request method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.386088Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\make_span.rs:109: enter method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.386356Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\on_response.rs:114: finished processing request latency=0 ms status=200 method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.386663Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\make_span.rs:109: enter method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.386992Z DEBUG ThreadId(21) request: C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\make_span.rs:109: close time.busy=1.21ms time.idle=223µs method=GET uri=/ version=HTTP/1.1
2023-07-20T18:38:03.387581Z DEBUG ThreadId(21) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\proto\h1\io.rs:320: flushed 130 bytes
2023-07-20T18:38:03.429555Z DEBUG ThreadId(04) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\proto\h1\io.rs:207: parsed 11 headers
2023-07-20T18:38:03.429995Z DEBUG ThreadId(04) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\proto\h1\conn.rs:222: incoming body is empty
2023-07-20T18:38:03.430240Z DEBUG ThreadId(04) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hyper-0.14.27\src\server\server.rs:765: connection error: connection closed before message completed

If you need to debug your webserver, this gives you a lot of information. Let's change output level to info:

#![allow(unused)]
fn main() {
.with_max_level(tracing_subscriber::filter::LevelFilter::INFO)
}

Running the program gives us no output at all! Fortunately, you can do quite fine-grained configuration of the Tower HTTP trace output as follows:

#![allow(unused)]
fn main() {
// Axum App
use tower_http::trace::{self, TraceLayer};
let app = Router::new().route("/", get(say_hello_text)).layer(
    TraceLayer::new_for_http()
        .on_response(trace::DefaultOnResponse::new().level(tracing::Level::INFO)),
);
}

Now running your program gives you some useful information, but not a flood:

2023-07-20T18:48:23.845380Z  INFO ThreadId(21) C:\Users\Herbert\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tower-http-0.4.3\src\trace\on_response.rs:114: finished processing request latency=0 ms status=200

Logging Targets/Format

For automated log ingestion, you can change the output format. Update your subscriber dependency to include the json feature:

tracing-subscriber = { version = "0.3.17", features = [ "json" ] }

And you can initialize a logger with JSON format:

#![allow(unused)]
fn main() {
let subscriber = tracing_subscriber::fmt()
        .json()
        // Display source code file paths
        .with_file(true)
        // Display source code line numbers
        .with_line_number(true)
        // Display the thread ID an event was recorded on
        .with_thread_ids(true)
        // Don't display the event's target (module path)
        .with_target(false)
        // Add span events
        .with_span_events(FmtSpan::ENTER | FmtSpan::CLOSE)
        // Build the subscriber
        .finish();
}

You can use the tracing-appender crate to write to log files (including with rollover).
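
After cargo add tracing-appender, a minimal sketch looks like this (the logs directory and file name are just examples):

fn main() {
    // Roll the log file daily; hourly and minutely rotation are also available
    let file_appender = tracing_appender::rolling::daily("logs", "app.log");
    // The guard must stay alive for the life of the program, or buffered lines may be lost
    let (non_blocking, _guard) = tracing_appender::non_blocking(file_appender);

    let subscriber = tracing_subscriber::fmt()
        .with_writer(non_blocking)
        // No ANSI color codes in the log file
        .with_ansi(false)
        .finish();
    tracing::subscriber::set_global_default(subscriber).unwrap();

    tracing::info!("This line goes to the rolling log file");
}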

It's still in progress, but you can link to OpenTelemetry with this crate.

Note that many crates implement this tracing system. SQLX will provide you with query timings, for example.

Tokio Console

If you'd like an htop style view of what your async application is doing in real-time, tokio-console can provide it.

You can install tokio-console with cargo install tokio-console. Then you have to add a build configuration option to .cargo/config.toml:

[build]
rustflags = ["--cfg", "tokio_unstable"]

If you're using a workspace, the .cargo/config.toml at the workspace root controls all builds, so set it there. You can also set the environment variable RUSTFLAGS to --cfg tokio_unstable when you run the build.

In your application, add a dependency on console_subscriber with cargo add console_subscriber. Finally, while you set up your application, add a call to:

#![allow(unused)]
fn main() {
console_subscriber::init();
}

Now you can run tokio-console from the command line, and it will show you a real-time view of your async application's execution.

The top-level tasks view shows you all of the async tasks in your application. The resources view shows you shared resources that are being polled, and you can select an individual task for specific information.

Debugging

Good news! Rust emits platform-standard debug information in binaries (unless you turn it off), so your existing debugging solution will work.

For Rust-specific debugging, Rust Rover from JetBrains is the nicest I've found so far. It sets everything up nicely for you, and seamlessly handles stepping into non-Rust code.

On Visual Studio Code, you need the CodeLLDB extension.

Confession: I don't actually do a lot of single-step, breakpoint debugging. I tend to emit tracing messages and use those for debugging unless I'm really, really stuck!

Quick walkthrough of using both debuggers.

Avoiding Bugs

Much of this was covered in the Best Practices section, so we won't belabor it. In general, Rust can help you stay productive if you embrace its rules:

  • Minimize unsafe, and wrap it in safe interfaces. Document the need for the lack of safety.
  • Do run in debug mode periodically, to catch overflows and out-of-bounds accesses.
  • Embrace well-contained code with readable functions.
  • Embrace the Result type, and check your preconditions! If this is too slow in production, wrap your checks in conditional compilation and make sure that you test them (there's a small sketch after this list).
  • Unit test everything that makes sense to unit test.
  • Don't opt out of safety unless you really need to.

These are similar to C++ guidelines, with which you should be familiar.
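
For the precondition point above, here's a small sketch---the function and thresholds are made up for illustration---of checks that cost nothing in release builds:

fn apply_discount(price: u32, discount_percent: u32) -> u32 {
    // debug_assert! runs in debug builds and compiles away in release builds
    debug_assert!(discount_percent <= 100, "discount must be 0-100");

    // Heavier validation can be gated on debug_assertions explicitly
    #[cfg(debug_assertions)]
    {
        assert!(price < 1_000_000, "suspiciously large price");
    }

    price - (price * discount_percent / 100)
}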

FFI: Linking Rust and C or C++

Rust behaves very well when talking to other languages---both as a library for other languages to consume, and as a consumer of other languages' libraries.

We'll refer to "C Libraries"---but we really mean any language that compiles to a C-friendly library format. C, C++, Go, Fortran, Haskell, and many others can all be consumed by Rust.

Consuming C Libraries

The code for this is in 04_mem/c_rust (C Rust)

Let's start with a tiny C library:

// A simple function that doubles a number
int double_it(int x) {
    return x * 2;
}

We'd like to compile this and include it in a Rust program. We can automate compilation by including the ability to compile C (and C++) libraries as part of our build process with the cc crate. Rather than adding it with cargo add, we want to add it as a build dependency. It won't be included in the final program, it's just used during compilation. Open Cargo.toml:

[package]
name = "c_rust"
version = "0.1.0"
edition = "2021"

[dependencies]

[build-dependencies]
cc = "1"

Now we can create a build.rs file in the root of our project (not the src directory). This file will be run as part of the build process, and can be used to compile C libraries. We'll use the cc crate to do this:

fn main() {
    cc::Build::new()
        .file("src/crust.c")
        .compile("crust");
}

build.rs is automatically compiled and executed when your Rust program builds. You can use it to automate any build-time tasks you want. The cc calls will build the listed files and include the linked result in your final program as a static library.
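
Build scripts talk back to Cargo by printing specially formatted lines. For example---optional here, but often useful---a rerun-if-changed line tells Cargo to re-run the script only when the C source changes:

fn main() {
    // Only re-run this build script when the C source file changes
    println!("cargo:rerun-if-changed=src/crust.c");

    cc::Build::new()
        .file("src/crust.c")
        .compile("crust");
}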

Lastly, let's create some Rust to call the C:

#![allow(unused)]
fn main() {
// Do it by hand
extern "C" {
    fn double_it(x: i32) -> i32;
}

mod rust {
    pub fn double_it(x: i32) -> i32 {
        x * 2
    }
}
}

We've used an extern "C" to specify linkage to an external C library. We've also created a Rust version of the same function, so we can compare the two.

Now let's use some unit tests to prove that it works:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_double_it() {
        assert_eq!(unsafe { double_it(2) }, 4);
    }

    #[test]
    fn test_c_rust() {
        assert_eq!(unsafe { double_it(2) }, rust::double_it(2));
    }
}
}

And it works when we run cargo test.

Header files and BindGen

You need LLVM installed (clang 5 or greater) to use this. On Windows, winget install LLVM.LLVM will work. Also set an environment variable LIBCLANG_PATH to the location of the Clang install. On Windows, $Env:LIBCLANG_PATH="C:\Program Files\LLVM\bin"

Larger C examples will include header files. Let's add crust.h:

int double_it(int x);

And include it from the C source:

#include "crust.h"

// A simple function that doubles a number
int double_it(int x) {
    return x * 2;
}

We can add it to the build.rs file, but it will be ignored (it's just a forward declaration).

Writing the extern "C" for a large library could be time consuming. Let's use bindgen to do it for us.

Add another build-dependency:

[build-dependencies]
cc = "1"
bindgen = "0"

Now in build.rs we'll add some calls to use it:

#![allow(unused)]
fn main() {
use std::env;
use std::path::PathBuf;

let bindings = bindgen::Builder::default()
    .header("src/crust.h")
    .parse_callbacks(Box::new(bindgen::CargoCallbacks))
    .generate()
    .expect("Unable to generate bindings");

// Write the bindings to the $OUT_DIR/bindings.rs file.
let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
bindings
    .write_to_file(out_path.join("bindings.rs"))
    .expect("Couldn't write bindings!");
}

See the bindgen documentation for details.

This is pretty much standard boilerplate, but there are a lot of options available.

Now run cargo build. You'll see a new file in target/debug/build/c_rust-*/out/bindings.rs. This is the automatically generated bindings file. Let's use it:

#![allow(unused)]
fn main() {
include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
}

Your compile time has suffered, but now the header is parsed and Rust bindings are generated automatically. The unit tests should still work.

Calling Rust from Other Languages

The code for this is in 04_mem/rust_c (Rust C)

You can also set up Rust functions and structures for export via a C API. You lose some of the richness of the Rust language---everything has to be C compatible---but you can still use Rust's safety and performance.

Start with some Cargo.toml entries:

[package]
name = "rust_c"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["staticlib"]

[dependencies]
libc = "0.2"

Providing a lib and crate-type section lets you change compilation behavior. We're instructing Rust to build a C-compatible static library (use the cdylib crate type instead if you want a C-compatible dynamic library).

Next, we'll build a single Rust function to export:

#![allow(unused)]
fn main() {
use std::ffi::CStr;

/// # Safety
/// Use a valid C-String!
#[no_mangle]
pub unsafe extern "C" fn hello(name: *const libc::c_char) {
    let name_cstr = unsafe { CStr::from_ptr(name) };
    let name = name_cstr.to_str().unwrap();
    println!("Hello {name}");
}
}

Notice that we're receiving a raw pointer to c_char---just like the C ABI. CStr and CString provide Rust-friendly layers between string types, allowing you to convert back and forth. C strings will never be as safe as Rust strings, but this is a good compromise.

We've turned off name mangling, making it easy for the linker to find the function.

The function is also "unsafe"---because it receives an unsafe C string type.
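
Going the other way---handing a Rust-allocated string back to C---usually pairs CString::into_raw with a matching free function. Here's a sketch of that pattern (the function names are just examples, not part of this project):

use std::ffi::CString;

// Returns a heap-allocated, NUL-terminated string that the caller now owns.
#[no_mangle]
pub extern "C" fn make_greeting() -> *mut libc::c_char {
    CString::new("Hello from Rust").unwrap().into_raw()
}

/// # Safety
/// Only pass pointers that came from make_greeting, and only once.
#[no_mangle]
pub unsafe extern "C" fn free_rust_string(ptr: *mut libc::c_char) {
    if !ptr.is_null() {
        // Re-take ownership so the CString is dropped and the memory freed
        drop(unsafe { CString::from_raw(ptr) });
    }
}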

Build the project with cargo build, and you'll see that target/debug/rust_c.lib has been created on Windows (librust_c.a on Linux). This is the static library that we can link to from C.

Linkage via C requires a header file. In this case, it's pretty easy to just write one:

void hello(char *name);

You can now use this in C or another language. In Go, it looks like this:

package main

/*
#cgo LDFLAGS: ./rust_c.a -ldl
#include "./lib/rust_c.h"
*/
import "C"

import "fmt"
import "time"

func main() {
	start := time.Now()
    fmt.Println("Hello from GoLang!")
	duration := time.Since(start)
	fmt.Println(duration)
	start2 := time.Now()
	C.hello(C.CString("from Rust!"))
	duration2 := time.Since(start2)
	fmt.Println(duration2)
}

(There's a few microseconds of delay in the Rust call, but it's pretty fast! Marshaling the C string in Go is the slowest part.)

Using CBindGen to Write the Header For You

Setup cbindgen as a build dependency:

[package]
name = "rust_c"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["staticlib"]

[dependencies]
libc = "0.2"

[build-dependencies]
cbindgen = "0.24"

And once again, add a build.rs file:

use std::env;
use std::path::PathBuf;
use cbindgen::Config;


fn main() {
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();

    let package_name = env::var("CARGO_PKG_NAME").unwrap();
    let output_file = target_dir()
        .join(format!("{}.hpp", package_name))
        .display()
        .to_string();

    let config = Config {
        //namespace: Some(String::from("ffi")),
        ..Default::default()
    };

    cbindgen::generate_with_config(&crate_dir, config)
      .unwrap()
      .write_to_file(&output_file);
}

/// Find the location of the `target/` directory. Note that this may be 
/// overridden by `cmake`, so we also need to check the `CARGO_TARGET_DIR` 
/// variable.
fn target_dir() -> PathBuf {
    if let Ok(target) = env::var("CARGO_TARGET_DIR") {
        PathBuf::from(target)
    } else {
        PathBuf::from(env::var("CARGO_MANIFEST_DIR").unwrap()).join("target")
    }
}

This is standard cbindgen boilerplate.

Now run cargo build and a target directory appears - with a header file.

Packing, Re-ordering and Mangling

Packing

Let's take a simple program and guess what it does:

struct OneByte {
    a: u8,
}

struct TwoBytes {
    a: u16,
}

struct ThreeBytes {
    a: u16,
    b: u8,
}

struct FourBytes {
    a: u32,
}

fn main() {
    println!("{}", std::mem::size_of::<OneByte>());
    println!("{}", std::mem::size_of::<TwoBytes>());
    println!("{}", std::mem::size_of::<ThreeBytes>());
    println!("{}", std::mem::size_of::<FourBytes>());
}

The result may surprise you:

1
2
4
4

Rust has aligned your 24-bit structure to a 32-bit boundary in memory. It's allowed to do this with the default "packing" scheme. In general, this speeds up execution---it's easier for the CPU to access memory on word-size boundaries. It does mean we're wasting 1 byte of memory, though. If you are working with a lot of data, this can add up---more realistically, if you are parsing bit data to and from a file or network, you may be surprised when your data is not the size you expect.

You can tell Rust that you care about exact byte alignment by adding the decoration:

#![allow(unused)]
fn main() {
#[repr(packed)]
struct ThreeBytes {
    a: u16,
    b: u8,
}
}

Running the program again, we get the more expected "1, 2, 3, 4".

Re-Ordering

On top of changing your structure size, Rust reserves the right to rearrange your structure! This is called "re-ordering". Rust is allowed to do this because it doesn't change the semantics of your program---you access fields by name. But if you are doing binary serialization, you may be surprised when your data is not in the order you expect.

You can tell Rust that you need your structure to be in the order you defined it by adding the decoration:

#![allow(unused)]
fn main() {
#[repr(C)]
}

You can combine these decorations, e.g. #[repr(packed, C)].

Mangling

Rust has a concept of "mangling" names. This is a way of encoding the type information into the name of the function. The linker will use these "internal" names to resolve symbols. This is a way of avoiding name collisions. It also means that you can't call a Rust function from C without some extra work.

You can tell Rust to not mangle a name by adding the decoration:

#![allow(unused)]
fn main() {
#[no_mangle]
}

If you are working with other languages (via the foreign function interface), you may need to use this decoration. Otherwise, the other language will not be able to find your function.

So on the boundaries of your program, where you are dealing with binary data and/or other languages, you may need to remember #[repr(C)] and #[no_mangle]. You may need #[repr(packed)]---but most other languages also pack. Be aware of packing for serialization!

Rust as a Service

Every organization I've ever worked with has had slightly different preferences for containerization, deployment and managing live services. That makes it tricky to teach a "one size fits all" solution! In some cases, you are best off using bare metal or close to it for performance---abstracting your hardware does carry a performance cost. If you're using Rust for massive per-server performance, being close to the metal has its advantages.

Using containers can also be advantageous: it becomes easier to manage a large set of containers, deployment can be automated, and you can slide Rust in alongside your existing setup.

Building for Docker

Docker has great built-in support for Rust. Docker defaults to using a multi-stage builder for Rust:

  • The first stage uses the officially provided (by the Rust Foundation) docker build platform for Rust.
  • The second stage assembles the resultant binaries into a small debian:bullseye-slim container.

We'll use the projects/docker/hello_web_docker project in the example code. Notice that in the parent Cargo.toml, I've excluded it from the workspace with exclude = [...]. This is to avoid confusion that can lead to Dockerizing the whole workspace---which is often what you want, but not when teaching a small section!

Build an Application

We'll start very simple, with a "hello web"---just like we did before:

The Cargo.toml file:

[package]
name = "hello_web_docker"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1.32.0", features = ["full"] }
axum = "0.6.20"
anyhow = "1.0.75"

The main.rs file:

use axum::{routing::get, Router};
use std::{net::SocketAddr, path::Path};
use axum::response::Html;

#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/", get(say_hello_html));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3001));    
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

async fn say_hello_html() -> Html<&'static str> {
    Html("<h1>Hello, world!</h1>")
}

This is nothing too impressive: it listens on port 3001 and serves Hello World to your browser.

Let's Dockerize it

In your project's parent folder, run docker init and:

  • Select Rust from the platform list.
  • Accept the default (the current version) unless you need a different one.
  • We're listening on port 3001 (which is suggested!), so accept that.

With any luck, you'll see the following:

✔ Your Docker files are ready!

Take a moment to review them and tailor them to your application.

When you're ready, start your application by running: docker compose up --build

Your application will be available at http://localhost:3001

Consult README.Docker.md for more information about using the generated files.

Run:

docker compose up --build

The first time, you'll have to wait while the base images are downloaded and the build environment is set up. Once they are cached, it's very fast.

You can now connect to your Docker process---it's running! You can stop it with ctrl-c.

Go to http://localhost:3001. It doesn't work! You get "connection was reset".

You need to make one small change to your program. Your containerized app is only listening to localhost, meaning it isn't available outside of its container:

#![allow(unused)]
fn main() {
let addr = SocketAddr::from(([0, 0, 0, 0], 3001));
}

Now rerun docker compose up --build and try again. You'll see "Hello world" served from http://localhost:3001.

Note that docker init has created README.Docker.md with instructions for supporting other platforms, and a Dockerfile.

Including Other Files

It's relatively unlikely that your program is completely self-contained. Because of the two-step build process, you need to edit your Dockerfile to include any files that are part of your program in the build, and in the final version.

For example, to include a migrations folder for SQLX you need to find the following section:

RUN --mount=type=bind,source=src,target=src \
    --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
    --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
    --mount=type=cache,target=/app/target/,id=rust-cache-${APP_NAME}-${TARGETPLATFORM} \
    --mount=type=cache,target=/usr/local/cargo/git/db \
    --mount=type=cache,target=/usr/local/cargo/registry/ \
    <<EOF

And add a line to include your migrations scripts:

--mount=type=bind,source=migrations,target=migrations \

You can also include environment variables (such as DATABASE_URL):

# Set the DB URL
ENV DATABASE_URL="sqlite::memory:"

You probably have a "secrets" setup for your container solution. Use it as normal. There are too many possible choices to reasonably try and teach that here.

Embedded Challenges

There's a lot of different types of embedded out there. "Embedded" can mean a full-featured Raspberry Pi 4---or a tiny microcontroller. Different platforms will have differing levels of support for embedded Rust. LLVM currently bounds which platforms you can target; Rust on GCC is advancing rapidly but isn't ready for production yet.

Minimizing Binary Size

For size-constrained builds, Rust has a lot of options:

Optimize for Size

In Cargo.toml, you can specify optimization levels by profile. Add this to the Cargo.toml file:

[profile.release]
opt-level = "s"

Run cargo build --release. It'll take a moment: it has to recompile every dependency and optimize them for size, too.

On Windows, the resulting binary is now: 510,976 bytes (499 kb). A small improvement.

There's also an optimization level named "z". Let's see if it does any better?

[profile.release]
opt-level = "z"

It weighs in at 509,440 bytes (497.5 kb). A very tiny improvement.

Strip the binary

In Cargo.toml, let's also strip the binary of symbols.

[profile.release]
opt-level = "z"
strip = true # Automatically strip symbols

Compiling again, this reduces the binary to 508,928 (497 kb).

Enable LTO

In Cargo.toml, let's enable link-time optimization. This optimizes across crate boundaries, at the expense of a SLOW compile.

[profile.release]
opt-level = "z"
strip = true # Automatically strip symbols
lto = true

We're down to 438,272 bytes (428 kb). Getting better!

Reduce Codegen Units

By default, Rust parallelizes builds across all of your CPUs - which can prevent some optimizations. Let's make our compilation even slower in the name of a small binary:

[profile.release]
opt-level = "z"
strip = true # Automatically strip symbols
lto = true
codegen-units = 1

You may have to run cargo clean before building this.

Our binary is now 425,472 bytes (415 kb). Another small improvement.

Abort on Panic

A surprising amount of a Rust binary is the "panic handler". Similar to an exception handler in C++, it adds some hefty code to unwind the stack and provide detailed traces on crashes. We can turn this behavior off:

[profile.release]
opt-level = "z"
strip = true # Automatically strip symbols
lto = true
codegen-units = 1
panic = "abort"

This reduces my binary to 336,896 bytes (329 kb). That's a big improvement! The downside is that if your program panics, it won't be able to tell you all the details about how it died.

Heavy Measures: Optimize the Standard Library for Size

If you don't have nightly installed, you will need it:

rustup toolchain install nightly
rustup component add rust-src --toolchain nightly

Then find out your current build target:

rustc -vV

And use that target triple (the host line in the output) to issue a build that also rebuilds the standard library---substitute your own triple for x86_64-apple-darwin:

cargo +nightly build -Z build-std=std,panic_abort -Z build-std-features=panic_immediate_abort --target x86_64-apple-darwin --release

The binary goes into target/(platform)/release. There's a pretty substantial size improvement: 177,152 bytes (173 kb)

That's about as small as it gets without using a different standard library. Let's see what we can do about the dependencies.

Using Cargo Bloat

Install the tool with cargo install cargo-bloat, then run cargo bloat to see exactly where your binary size is going.

Building Without the Standard Library

If you are on a platform without standard library support (or for really small builds), you can combine these steps with adding #![no_std] to your binary. You can still opt in to parts of the library with core::---depending upon what is available. This can also be useful for WASM builds in the browser. You can also use extern crate alloc to opt in to a Rust-provided allocator:

#![allow(unused)]
#![no_std]
fn main() {
extern crate alloc;
}

This allows you to use Vec and similar in your code. You don't have the full standard library, but it's a pretty pleasant environment.
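
Here's a minimal library-crate sketch of that setup (the final binary still needs a global allocator and panic handler from somewhere):

#![no_std]

extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;

// Vec and String come from the alloc crate rather than std
pub fn greetings() -> Vec<String> {
    let mut names = Vec::new();
    names.push(String::from("hello"));
    names.push(String::from("world"));
    names
}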

Using a Different Allocator

Rust defaults to using your platform's allocator. It used to use jemalloc, but that didn't work properly on all platforms. Jemalloc is amazing---it offers memory usage profiling, a pool-based system that minimizes the penalty for reallocation, and can improve the performance of real-time systems significantly. The LibreQoS project adopted it for real-time packet analysis, and saw runtime performance improvements up to 15%.

To opt-in to jemalloc, add the following to Cargo.toml:

[target.'cfg(any(target_arch = "x86", target_arch = "x86_64"))'.dependencies]
jemallocator = "0.5"

And add this to your main.rs file (outside of any functions):

#![allow(unused)]
fn main() {
// Use JemAllocator only on supported platforms
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
use jemallocator::Jemalloc;

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;
}

The rest of the Rust system will pick up on these changes and use Jemalloc. There are quite a few other allocation systems available.

Optimizing Rust

A lot of optimization techniques from other languages apply. In particular, C++ optimizations carry over directly---in most cases, LLVM is applying the same optimizations. For example, "moving" out of a function triggers the same return-value optimizations as C++. Accessing the heap always requires a pointer-chase---with the same cache implications---so you want to avoid pointers to pointers.

The golden rule also applies: profile before you micro-optimize. Compilers are amazingly smart.

A few of the Best Practices also help: Favor Iterators, Minimize Cloning, Don't Emulate OOP, Don't Reference Count Everything. Iterators in particular tend to optimize better than equivalent loops. And it's very easy to try to "beat" the borrow checker by cloning/reference counting---but doing so comes at a cost. Deciding where to pay the cost is up to you.

Memory Fragmentation

If your program is going to be running for a long time (maybe it's a server), memory fragmentation can become a hard-to-diagnose problem. It can also be a problem if you're working with a lot of data and running out of memory.

The heap is allocated by the operating system. Every time you request a chunk of data, the allocator searches the heap for a contiguous region big enough to hold the requested memory. If there isn't one, the heap is expanded---and the operating system has to find more memory to back the larger heap. Over time, this cycle of searching and expanding can leave a lot of memory fragmentation.

So now imagine that you are allocating and de-allocating repeatedly---with differing size chunks. You can end up using more memory than you need to, because the heap is fragmented.

This is the same as the old defrag program on disks!

Most of the time, you don't need to worry. Allocators are pretty smart and will try to avoid this problem. But, I'd like you to think about a few ways to not run into this problem:

  • If you're going to be storing a large amount of data over time, consider using a Vec, HashMap or other structure with pre-allocated capacity. Don't "shrink to fit" when you clear it or remove items. Let the vector act as a big buffer, expanding within its allocated size. This will reduce the number of times the heap has to be expanded, and remove the need to find a contiguous chunk of memory for the new heap (see the sketch after this list).
  • If you're on a platform where expanding the heap is expensive, consider using an Arena---we'll talk about that in a bit.
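
Here's a tiny sketch of the "reusable buffer" idea from the first bullet (the sizes are arbitrary):

fn main() {
    // Allocate once, up front, with room for the worst case
    let mut buffer: Vec<u64> = Vec::with_capacity(100_000);

    for batch in 0..10u64 {
        // Fill the buffer with this batch's data
        for i in 0..50_000u64 {
            buffer.push(i * batch);
        }
        println!("batch {batch}: {} items", buffer.len());

        // clear() keeps the allocation; don't call shrink_to_fit()
        buffer.clear();
    }
}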

Changing Allocator

We're only going to touch on this---it's quite advanced. It's also a useful tool to have in your kit.

By default, Rust uses the allocator that comes with your platform. malloc on UNIX-likes, and HeapAlloc on Windows. You can opt-in to using a different allocator. JemAlloc is a popular allocator, optimized for repeated allocation, reallocation and removal of memory. It's also optimized for multi-threaded use. You can use it by adding this to your Cargo.toml:

[dependencies]
jemallocator = "0.5"

Then in your program you can use it like this:

#![allow(unused)]
fn main() {
use jemallocator::Jemalloc;

#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;
}

Jemalloc isn't a great choice on Windows---but it's a great choice for Linux and Mac. It carries some downsides:

  • It's not as well tested as the default allocator---but it's very stable in practice. FreeBSD still uses it, Rust used to use it by default, and lots of projects use it. My own LibreQoS used it in the main monitor daemon, and it's been both very stable and very fast. (LibreQoS tracks a LOT of statistics, causing quite a bit of memory churn.)
  • Some tools such as valgrind assume that you are going to be using the default allocator. It won't give you as much useful information if you use it on your program.

If you check out the jemalloc website there are a lot of really helpful tools included. You can instrument memory allocation and build detailed reports as to what you are doing. It's sometimes worth switching to Jemalloc just to run the tools and see what's going on---and then switch back if needs-be.

See this gist for a guide to using Jemalloc for heap profiling.

Arenas

An "Arena" is a pre-allocated area of memory that you use over and over to store data. A vector with a preset capacity that will never grow is the simplest form of arena. Arenas are used:

  • When you need to avoid memory fragmentation.
  • When allocation is expensive; you allocate your arena up-front and then use it.
  • When you absolutely can't afford to fail to allocate memory. Arenas are pre-allocated, so you can't fail to allocate memory once they are started.
  • On some platforms (real-time), you can't allocate memory after the program starts. You have to pre-allocate everything you need.
  • Some arenas are used to store data that is never de-allocated. This is common in games, where you might have a "level" arena, a "game" arena, and a "global" arena. You can de-allocate the level arena when you move to a new level, but the game and global arenas are never de-allocated.
  • In turn, this can allow you to fix the pain of "cyclic references"---references that refer to one another. If you have a cyclic reference, you can't de-allocate the memory. If you use an arena, you can de-allocate the entire arena at once, and the cyclic references are no longer a problem.

Bump Style Arenas

A "bump" style arena is the simplest form of arena. You allocate a chunk of memory, and then you just keep allocating from it. You keep a pointer to the end of the allocated memory, and when you run out, you allocate another chunk. You can't de-allocate memory, but you can de-allocate the entire arena at once.

This allows you to solve cyclic references, and by pre-allocating the arena, you can avoid the problem of running out of memory.

See code/04_mem/arena_bump for code.

We'll test out Bumpalo. Bumpalo is pretty easy to use:

use bumpalo::Bump;

struct MyData {
    a: i32,
}

fn main() {
    let arena = Bump::new();
    arena.set_allocation_limit(Some(8192)); // Limit the size of the arena to 8 KiB
    let x = arena.alloc(MyData { a: 123 });
}

You can also enable the collections feature and use bumpalo::collections::Vec and bumpalo::collections::String to store data in the arena:

use bumpalo::Bump;
use bumpalo::collections::String;
use bumpalo::collections::Vec;

struct MyData {
    a: i32,
}

fn main() {
    let arena = Bump::new();
    arena.set_allocation_limit(Some(8192)); // Limit the size of the arena to 8 KiB
    let x = arena.alloc(MyData { a: 123 });

    // With collections enabled
    let mut my_string = String::from_str_in("Hello, world!", &arena);
    let mut vec = Vec::new_in(&arena);
    vec.push(12);
}

Downside: Drop will never be called in a Bump arena. You can enable unstable compiler features and make it work, but for now---you're not dropping anything in the arena!

Use a bump arena to allocate memory up-front (or in chunks) and store data inside the arena. You can't de-allocate individual items, but for something like a data-collector that must not suddenly fail to allocate memory or expand its heap, it's a great choice.

Slab Arenas

A "slab arena" pre-allocates space for a uniform type, indexing each entry by key. This is similar to a pre-allocated Vec, but you don't have to keep usize around for entries---and the slab keeps track of vacant entries for you. It's also similar to a HashMap, but you don't have to hash keys. Slabs are great for pre-allocating a big pool of resources and then using them as needed.

See code/04_mem/arena_slab for code.

use slab::Slab;

fn main() {
    let mut slab = Slab::with_capacity(10);
    let hello = slab.insert("hello");
    let world = slab.insert("world");

    assert_eq!(slab[hello], "hello");
    assert_eq!(slab[world], "world");

    slab.remove(hello);
}

Note that you can remove items! Slabs work like a "slot map" - entries are either Vacant or filled with your data type. Slabs won't ever fragment, and entries will be stored in contiguous memory. This makes them very fast to iterate over. If you can preallocate a slab of data, it's a great choice for high-performance and not fragmenting memory.
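
A quick sketch of the slot re-use behavior (the connections pool and its names are purely illustrative)---removing an entry marks its slot vacant, and the slab hands that slot back out on a later insert:

use slab::Slab;

fn main() {
    // Pre-allocate a small pool (illustrative size).
    let mut connections: Slab<String> = Slab::with_capacity(4);

    let first = connections.insert("alice".to_string());
    let second = connections.insert("bob".to_string());

    // Removing an entry marks its slot as vacant...
    connections.remove(first);

    // ...and the next insert re-uses the vacant slot rather than growing.
    let third = connections.insert("carol".to_string());
    assert_eq!(first, third);
    assert_ne!(second, third);
}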

From Bytes to Types

You often need to convert a binary format---bytes---into a Rust type (and vice versa). You might be reading a file, parsing a network packet, or interacting with a system written in another programming language. Rust has a few tools to help you with this.

Note: If you have a specific format in mind, you can use Serde. The "bincode" crate provides a binary format for Serde that is basically a memory dump.
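
As a hedged sketch of that route (assuming serde with the derive feature and the bincode 1.x API), a round-trip looks like this:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct OurData {
    number: u16,
    tag: String,
}

fn main() {
    let original = OurData { number: 12, tag: "Hello".to_string() };

    // Serialize to a Vec<u8>---close to a raw dump of the fields.
    let bytes: Vec<u8> = bincode::serialize(&original).unwrap();

    // Deserialize straight back into the typed structure.
    let round_trip: OurData = bincode::deserialize(&bytes).unwrap();
    assert_eq!(original, round_trip);
}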

Saving Bytes to a File

The code for this is in code/04_mem/save_bytes

There are unsafe code options to directly transform a structure into an array of bytes, but let's stick with safe code. The bytemuck crate is a safe wrapper around unsafe code that does this for you.

Add bytemuck to your project:

cargo add bytemuck -F derive

Then derive the bytemuck traits on a fixed-size, #[repr(C)] type:

#[repr(C)]
#[derive(bytemuck::Zeroable, bytemuck::Pod, Clone, Copy, Debug)]
struct OurData {
    number: u16,
    tag: [u8; 8],
}

fn main() {
    let some_data = vec![
        OurData {
            number: 1,
            tag: *b"hello   ",
        },
        OurData {
            number: 2,
            tag: *b"world   ",
        }
    ];

    let bytes: &[u8] = bytemuck::cast_slice(&some_data);
    std::fs::write("data.bin", bytes).unwrap();

    // Read the data back
    let bytes = std::fs::read("data.bin").unwrap();
    let data: &[OurData] = bytemuck::cast_slice(&bytes);

    // Debug print the data to show the round-trip worked
    println!("{data:?}");
}

We define a type containing only fixed-size data. We then derive bytemuck's Pod trait (marking it as Plain Old Data) and Zeroable (required by Pod---the type can be safely zero-initialized in memory). Then we can use cast_slice to view our data as a slice of bytes, and write it to a file.

Note that the bytes type is a slice of bytes, not a vector. It's a reference to the data in some_data, not a copy of it. This is a zero-copy operation. Zero-copy is very, very fast.

Reading Bytes from a File - and Casting to a Type

Reading the data back is equally straightforward once you have the data:

// Read the data back
let bytes = std::fs::read("data.bin").unwrap();
let data: &[OurData] = bytemuck::cast_slice(&bytes);

// Debug print the data to show the round-trip worked
println!("{data:?}");

bytes now contains the concrete data, and data is a zero-copied reference to it.

This is a great pattern for fixed-size records in binary data. I've used it to parse netlink data from the Linux kernel in mere nanoseconds.

Converting Bytes to a String

Strings work differently. In the example above, we used a fixed-size array of bytes with spaces in the gaps. That's very convenient for fixed-size records (and is common in many file formats), but you'd probably rather further transform the data into a string.

// Print the first record's tag as a string
println!("{}", std::str::from_utf8(&data[0].tag).unwrap());

Fortunately, str includes a conversion from a slice of bytes to a string. It can fail if the bytes aren't valid UTF-8. You may also want to trim out the padding characters---which makes converting back a little trickier.
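
For example, here's a small sketch of cleaning up the space-padded tag from the earlier example, and re-padding it if you need to write it back out:

fn main() {
    // The fixed-width, space-padded tag from the earlier example.
    let tag: [u8; 8] = *b"hello   ";

    // Convert to &str (fails if the bytes aren't valid UTF-8), then trim the padding.
    let trimmed = std::str::from_utf8(&tag).unwrap().trim_end();
    assert_eq!(trimmed, "hello");

    // Going back the other way means re-padding to the fixed width.
    let padded = format!("{trimmed:<8}");
    assert_eq!(padded.as_bytes(), &tag);
}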

Writing a Protocol

The code for this is in code/04_mem/save_dynamic_bytes.

You quite often want to read/write data as a protocol---a stream. This allows you to account for variable sizing.

If you are in async, Tokio has a "framing" feature to help with this.
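
As a hedged sketch (assuming the tokio, tokio-util with its codec feature, and futures crates; the address is illustrative), length-prefixed framing over a TCP stream looks something like this---each read yields one complete frame rather than an arbitrary chunk of bytes:

use futures::StreamExt;
use tokio::net::TcpListener;
use tokio_util::codec::{FramedRead, LengthDelimitedCodec};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let listener = TcpListener::bind("127.0.0.1:9000").await?;
    let (socket, _addr) = listener.accept().await?;

    // Wrap the socket so each read yields one length-prefixed frame.
    let mut frames = FramedRead::new(socket, LengthDelimitedCodec::new());

    while let Some(frame) = frames.next().await {
        let bytes = frame?; // One complete message, as a BytesMut buffer
        println!("Received a {} byte frame", bytes.len());
    }
    Ok(())
}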

Let's write some data to a file:

use std::{fs::File, io::Write};

#[derive(Debug)]
struct OurData {
    number: u16,
    tag: String,
}

fn main() {
    let a = OurData {
        number: 12,
        tag: "Hello World".to_string(),
    };

    // Write the record in parts
    let mut file = File::create("bytes.bin").unwrap();

    // Write the number and check that 2 bytes were written
    assert_eq!(file.write(&a.number.to_le_bytes()).unwrap(), 2);

    // Write the string length IN BYTES and check that 8 bytes were written
    let len = a.tag.as_bytes().len();
    assert_eq!(file.write(&(len as u64).to_le_bytes()).unwrap(), 8);

    // Write the string and check that the correct number of bytes were written
    assert_eq!(file.write(a.tag.as_bytes()).unwrap(), len);
}

So we're defining some data, and creating a file. Then we write the number field as two bytes in little-endian order. Then we write the length of the string in bytes as a u64 (8 bytes), and finally the string itself.

Reading it back is mostly reversing the process:

///// READ THE DATA BACK
// Read the whole file as bytes.
let bytes = std::fs::read("bytes.bin").unwrap();

// Read the number
let number = u16::from_le_bytes(bytes[0..2].try_into().unwrap());

// Read the string length
let length = u64::from_le_bytes(bytes[2..10].try_into().unwrap());

// Decode the string
let tag = std::str::from_utf8(&bytes[10..(10 + length as usize)]).unwrap();

let a = OurData {
    number,
    tag: tag.to_string(),
};
println!("{a:?}");

Notice that this isn't zero-copy. In an ideal world, we'd do a bit of both: read the descriptors, and use them to cast ranges of bytes to types.

You may also want to read the file with a buffered reader, a few bytes at a time if you have memory constraints (or a HUGE file).
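
Here's a minimal sketch of that approach for the same file, using a BufReader and read_exact to pull in just the header fields before sizing a buffer for the variable-length string:

use std::fs::File;
use std::io::{BufReader, Read};

fn main() -> std::io::Result<()> {
    let mut reader = BufReader::new(File::open("bytes.bin")?);

    // Read the fixed-size header fields a few bytes at a time.
    let mut number_bytes = [0u8; 2];
    reader.read_exact(&mut number_bytes)?;
    let number = u16::from_le_bytes(number_bytes);

    let mut len_bytes = [0u8; 8];
    reader.read_exact(&mut len_bytes)?;
    let length = u64::from_le_bytes(len_bytes);

    // Only now allocate a buffer sized for the variable-length string.
    let mut tag_bytes = vec![0u8; length as usize];
    reader.read_exact(&mut tag_bytes)?;
    let tag = String::from_utf8(tag_bytes).expect("tag should be valid UTF-8");

    println!("number = {number}, tag = {tag}");
    Ok(())
}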

WASM for the Browser

The oldest use case for WASM is running in the browser. Emscripten (C++) was the first toolchain to popularize it. Browser WASM can be written as regular Rust, with a few exceptions---notably, threads don't work in current browser setups.

I recommend keeping this reference handy: https://rustwasm.github.io/wasm-bindgen/introduction.html

To work with Rust in the browser, you need two components:

Installing Required Components

  • WASM compiler toolchain. You can add it with rustup target add wasm32-unknown-unknown.
  • WASM Bindgen, which generates JavaScript/Typescript bindings connecting Rust to the browser. You can install it with cargo install wasm-bindgen-cli.

Your project will also need to include wasm-bindgen in its dependencies. Note that when you upgrade wasm-bindgen, you need to also update wasm-bindgen-cli to the matching version.

Testbed Server

Browsers won't run WASM loaded straight from the local filesystem---it violates the sandbox rules. So you typically need a webserver from which to test your code. I often keep a small server such as nginx around while I'm developing WASM for the browser, for a quick turnaround.

In this case, let's build ourselves a mini Axum server that serves a directory. You can serve a folder named web with this short program:

use axum::Router;
use std::net::SocketAddr;
use tower_http::services::ServeDir;

#[tokio::main]
async fn main() {
    let app = Router::new()
        .fallback_service(ServeDir::new("web"));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3001));
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

And the Cargo.toml:

[package]
name = "wasm_web_server"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.6.18"
tokio = { version = "1.28.2", features = ["full"] }
tower-http = { version = "0.4.0", features = ["fs", "trace", "cors"] }

Using fallback_service with ServeDir serves a file by name whenever a request doesn't match any route. Since we didn't define any routes, it'll serve any file with a matching name from the web directory.

Let's add a file, web/index.html:

<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <p>Hello, World!</p>
    </body>
</html>

Run the project with cargo run, and visit http://localhost:3001 to verify that the server works.

Creating a Rust Function to Call From JavaScript

Let's create a project with cargo new --lib wasm_lib.

Our Cargo.toml file will need a wasm-bindgen dependency:

[package]
name = "wasm_lib"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wasm-bindgen = "0.2.86"

Note that we have to build a cdylib - a C compatible dynamic library. Otherwise, we'll get a statically linkable rlib (Rust library format) and no .wasm file will be created.

In our lib.rs, we'll start with the following:

use wasm_bindgen::prelude::*;

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = console)]
    fn log(s: &str);
}

#[wasm_bindgen]
pub fn hello_js() {
    log("Hello from Rust!");
}

There's a few parts here:

  • We're importing the prelude of wasm_bindgen - useful imports.
  • We have an extern block decorated with wasm_bindgen - the bindings generator will use this to map calls.
  • We defined a log function, and indicated that it's in the JavaScript namespace console. This adds a Rust function named log, which is equivalent to calling console.log in JavaScript.
  • Then we build a regular Rust function that calls it. Decorating the function with #[wasm_bindgen] instructs the wasm_bindgen system to generate a matching call within the generated WebAssembly wrapper, allowing JavaScript to call it.

Now we have to build it. We can instruct Cargo to use the correct output with the target flag:

cargo build --release --target wasm32-unknown-unknown

In your target/wasm32-unknown-unknown/release directory, you will see wasm_lib.wasm. This is raw WASM without any browser glue (you can't really run it yet). You have to use wasm-bindgen to read it and create the JavaScript for the browser. By default, it will also generate TypeScript and use modern JS modules. We're going to keep it simple today.

mkdir -p out
wasm-bindgen target/wasm32-unknown-unknown/release/wasm_lib.wasm --out-dir out --no-modules --no-typescript

In your out folder, you will see two files: wasm_lib_bg.wasm (a processed .wasm file) and wasm_lib.js (a JavaScript binding library to use it).

Now, in our webserver project, we'll make a quick placeholder page (hello_wasm.html) to use it:

<html>
  <head>
    <meta content="text/html;charset=utf-8" http-equiv="Content-Type" />
  </head>
  <body>
    <script src="./wasm_lib.js"></script>
    <script>
      window.addEventListener("load", async () => {
        await wasm_bindgen("./wasm_lib_bg.wasm");
        wasm_bindgen.hello_js();
      });
    </script>
  </body>
</html>

Put this file along with the two generated files into the web directory. Open http://localhost:3001/hello_wasm.html and check the web console - you will see the message Hello from Rust!. That worked, you've called a Rust function from JavaScript --- which in turn has called a JavaScript function. That gives you the basis of calling functions back and forth.

Passing Types

Now let's add a simple math function:

#[wasm_bindgen]
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

Modify index.html to also call:

console.log(wasm_bindgen.add(5, 10));

Go through the same build setup:

cargo build --release --target wasm32-unknown-unknown
mkdir -p out
wasm-bindgen target/wasm32-unknown-unknown/release/wasm_lib.wasm --out-dir out --no-modules --no-typescript
cp out/* ../wasm_web_server/web/

And sure enough, your math function outputs 15. So primitive types work fine. How about strings?

Add another function:

#[wasm_bindgen]
pub fn greet(s: String) -> String {
    format!("Hello {s}")
}

And add a line of JavaScript:

console.log(wasm_bindgen.greet("Herbert"));

How about vectors?

#[wasm_bindgen]
pub fn sum(arr: &[i32]) -> i32 {
    arr.iter().sum()
}

And call it from JavaScript:

console.log(wasm_bindgen.sum([1, 2, 3, 4]));

Custom Types

In other words, normal Rust works very smoothly. What if you want to define a type? That starts to get more complicated. The JS browser environment only has a handful of types to work with: objects/classes, strings, 64-bit floating-point numbers (plus BigInt and a few typed memory buffers). Rust has lots of types. So when you pass data between the two contexts, you find yourself needing some conversion code.

Classes

If you'd like to represent struct + implementations as JavaScript classes, wasm-bindgen can help you. For example:

#[wasm_bindgen]
pub struct Person {
    name: String,
    age: u8,
}

#[wasm_bindgen]
impl Person {
    #[wasm_bindgen(constructor)]
    pub fn new(name: String, age: u8) -> Self {
        Self { name, age }
    }

    pub fn greet(&self) -> String {
        format!("Hello, my name is {} and I am {} years old", self.name, self.age)
    }

    pub fn set_age(&mut self, age: u8) {
        self.age = age;
    }

    pub fn get_age(&self) -> u8 {
        self.age
    }
}

Note that you're marking wasm_bindgen on both the structure and its implementation, and have to tag the constructor. Now let's take a look at this from the JavaScript side:

let person = new wasm_bindgen.Person("Herbert", 48);
console.log(person.greet());
console.log(person.age);
console.log(person.get_age());        

Creating the Person works, and calling greet and get_age works. But referencing person.age does not! Fields aren't automatically bridged to JavaScript properties (wasm-bindgen only generates accessors for public fields with simple, copyable types), so in practice you're back to writing lots of getters and setters.
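
For reference, here's a hedged sketch of what explicit accessors look like using wasm-bindgen's getter and setter attributes (this would replace the plain get_age/set_age methods above). They expose person.name and person.age as JavaScript properties, at the cost of writing each accessor by hand:

#[wasm_bindgen]
impl Person {
    // Exposed to JavaScript as the read-only `person.name` property.
    #[wasm_bindgen(getter)]
    pub fn name(&self) -> String {
        self.name.clone()
    }

    // Exposed to JavaScript as the `person.age` property...
    #[wasm_bindgen(getter)]
    pub fn age(&self) -> u8 {
        self.age
    }

    // ...and its write side, so `person.age = 50;` works from JavaScript.
    #[wasm_bindgen(setter)]
    pub fn set_age(&mut self, age: u8) {
        self.age = age;
    }
}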

Arbitrary Data with Serde

You can work around this by using Serde, and Serde JSON to build a bridge between the systems. Add serde and serde_json to your project:

cargo add serde -F derive
cargo add serde_json

And now we can serialize our person and return JSON:

use serde::Serialize;

#[derive(Serialize)]
#[wasm_bindgen]
pub struct Person {
    name: String,
    age: u8,
}

#[wasm_bindgen]
impl Person {
    #[wasm_bindgen(constructor)]
    pub fn new(name: String, age: u8) -> Self {
        Self { name, age }
    }

    pub fn greet(&self) -> String {
        format!("Hello, my name is {} and I am {} years old", self.name, self.age)
    }

    pub fn set_age(&mut self, age: u8) {
        self.age = age;
    }

    pub fn get_age(&self) -> u8 {
        self.age
    }
}

#[wasm_bindgen]
pub fn serialize_person(person: &Person) -> String {
    serde_json::to_string(person).unwrap()
}

Now in your JavaScript you can use JSON to fetch the person without having to worry about getters/setters:

let person_json = wasm_bindgen.serialize_person(person);
let person_deserialized = JSON.parse(person_json);
console.log(person_deserialized);

You can use this to handle passing complicated types to and from JavaScript via the built-in JSON system. serde_json is really fast, but there is a performance penalty to transitioning data between the WASM sandbox and the browser.
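
Going the other way is symmetrical. As a hedged sketch (assuming Person also derives serde's Deserialize), a hypothetical deserialize_person function can accept a JSON string from JavaScript and rebuild the typed value inside the WASM module:

// Hypothetical counterpart to serialize_person. Assumes Person also has
// #[derive(Deserialize)] from serde alongside Serialize.
#[wasm_bindgen]
pub fn deserialize_person(json: &str) -> Person {
    serde_json::from_str(json).expect("JSON should describe a valid Person")
}

From JavaScript, you'd call wasm_bindgen.deserialize_person(JSON.stringify(someObject)) to hand an object back to Rust.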

Communicating with Servers via REST

You can handle the API side of things directly from the WASM part of the browser.

Let's add a little more functionality to our webserver. In our wasm_web_server project, let's add Serde with cargo add serde -F derive. Then we'll add a simple JSON API.

use axum::{Router, routing::get};
use std::net::SocketAddr;
use tower_http::services::ServeDir;
use serde::Serialize;

#[derive(Serialize)]
struct HelloJson {
    message: String,
}

async fn say_hello_json() -> axum::Json<HelloJson> {
    axum::Json(HelloJson {
        message: "Hello, World!".to_string(),
    })
}

#[tokio::main]
async fn main() {
    let app = Router::new()
        .route("/json", get(say_hello_json))
        .fallback_service(ServeDir::new("web"));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3001));    
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

Our WASM library needs the web-sys bindings to use the browser's fetch API, plus wasm-bindgen-futures to convert JavaScript promises into Rust futures. In Cargo.toml add:

[dependencies]
serde = { version = "1.0.193", features = ["derive"] }
serde_json = "1.0.108"
wasm-bindgen = "0.2.89"
wasm-bindgen-futures = "0.4.39"

[dependencies.web-sys]
version = "0.3.4"
features = [
  'Headers',
  'Request',
  'RequestInit',
  'RequestMode',
  'Response',
  'Window',
]

In our WASM library, we can now add the following to call the JSON API:

use wasm_bindgen::JsCast;
use wasm_bindgen_futures::JsFuture;
use web_sys::{Request, RequestInit, RequestMode, Response};

#[wasm_bindgen]
pub async fn fetch_hello_json() -> Result<JsValue, JsValue> {
    let mut opts = RequestInit::new();
    opts.method("GET");
    opts.mode(RequestMode::Cors);

    let url = "/json";

    let request = Request::new_with_str_and_init(url, &opts)?;

    request
        .headers()
        .set("Accept", "application/json")?;

    let window = web_sys::window().unwrap();
    let resp_value = JsFuture::from(window.fetch_with_request(&request)).await?;

    // `resp_value` is a `Response` object.
    assert!(resp_value.is_instance_of::<Response>());
    let resp: Response = resp_value.dyn_into().unwrap();

    // Convert the `json()` promise into a Rust `Future` and await it.
    let json = JsFuture::from(resp.json()?).await?;

    // Send the JSON response back to JS.
    Ok(json)
}

Shrinking Your WASM

WASM with WASI

WASI offers a way to use Web Assembly as a container, for secure remote deployment.

Building a WASI project is quite familiar:

cargo new wasm_hello_world

Then edit the main.rs file:

// Import rust's io and filesystem module
use std::io::prelude::*;
use std::fs;

// Entry point to our WASI applications
fn main() {

  // Print out hello world!
  println!("Hello world!");

  // Create a file
  // We are creating a `helloworld.txt` file in the `/helloworld` directory
  // This code requires the Wasi host to provide a `/helloworld` directory on the guest.
  // If the `/helloworld` directory is not available, the unwrap() will cause this program to panic.
  // For example, in Wasmtime, if you want to map the current directory to `/helloworld`,
  // invoke the runtime with the flag/argument: `--mapdir /helloworld::.`
  // This will map the `/helloworld` directory on the guest, to  the current directory (`.`) on the host
  let mut file = fs::File::create("/helloworld/helloworld.txt").unwrap();

  // Write the text to the file we created
  write!(file, "Hello world!\n").unwrap();
}

To actually build the project you need to install the WASI target:

rustup target add wasm32-wasi

If you later want to embed a WASM runtime inside a Rust host program, you can add the wasmtime crate with cargo add wasmtime---but the guest program in this example needs no extra dependencies.

You can then build it with:

cargo build --target wasm32-wasi

To execute WASI projects locally, use wasmtime. You can install Wasmtime by going to https://wasmtime.dev/ and following the instructions there. On Linux or macOS, you typically just run:

curl https://wasmtime.dev/install.sh -sSf | bash

Finally, you can run the WASI program inside the wasmtime host:

wasmtime --mapdir /helloworld::. target/wasm32-wasi/debug/wasm_hello_world.wasm

If all went well, helloworld.txt was created in your current directory. You are running a WASM binary in a container!