Using Web Assembly in the Browser — Luke Jones

Note: some things were a little hard for me to discover correctly, so if I've made a mistake, I'd welcome an email.

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine.

This essentially means that it is fast, because the program is compiled to a much more compact format, making it faster to parse.

Wasm can be written by hand if you're looking for a challenge, but it is primarily meant to be written in another language and then compiled to Wasm. You may know a little about Assembly language and how it works - here's a quick refresher in case you're rusty:

A very brief introduction

A small function in C:

int add(int num1, int num2) {
    return num1 + num2;
}

Likewise in Rust:

pub fn add(num1: &i32, num2: &i32) -> i32 {
    num1 + num2
}

Both of these examples compile down to the following (intermediary) step in Assembly language:

example::add:
  push rbp
  mov rbp, rsp
  mov eax, dword ptr [rsi]
  add eax, dword ptr [rdi]
  pop rbp
  ret

As you can see both languages use the same base instructions for this very simple function (the actual Rust example is slightly more optimised to Assembly than the C example). But how would this look in Wasm? Something like this if written by hand:

(module
  (func (param $num1 i32) (param $num2 i32) (result i32)
    get_local $num1
    get_local $num2
    i32.add))
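To make the "stack-based" part a bit more concrete, here is a toy interpreter in Rust for just the two instructions used above. It is only a sketch: the `Instr` enum and `run` function are names I've made up for illustration, and this is not how a real Wasm engine is implemented.

```rust
// Toy model of a stack machine. NOT the real Wasm instruction set or
// binary format; `Instr` and `run` are hypothetical names for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    GetLocal(usize), // push the value of local number `n` onto the stack
    I32Add,          // pop two values, push their (wrapping) sum
}

fn run(code: &[Instr], locals: &[i32]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    for instr in code {
        match *instr {
            Instr::GetLocal(index) => stack.push(*locals.get(index)?),
            Instr::I32Add => {
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                stack.push(lhs.wrapping_add(rhs));
            }
        }
    }
    // Whatever is left on top of the stack is the result
    stack.pop()
}

// The hand-written module above, expressed with the toy instructions
fn add(num1: i32, num2: i32) -> Option<i32> {
    let code = [Instr::GetLocal(0), Instr::GetLocal(1), Instr::I32Add];
    run(&code, &[num1, num2])
}
```

Calling `add(2, 3)` pushes both locals and then replaces them with their sum. A malformed program (an add with nothing on the stack) returns `None` here rather than crashing.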

When using a compiler to go from Rust to Wasm, the WebAssembly output looks like this (some parts removed to keep it clean):

add:
    .param      i32, i32
    .result     i32
    get_local    $push3=, 1
    i32.load    $push1=, 0($pop3)
    get_local    $push4=, 0
    i32.load    $push0=, 0($pop4)
    i32.add     $push2=, $pop1, $pop0
    end_function

Not too far off what we saw with the Rust/C to Assembly example, and you can also see that it is a little more complex than the hand-written example while sharing the same basic layout and instruction use.
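The extra i32.load instructions are there because the Rust add takes its arguments by reference, so each argument is an address that has to be read from memory first. As a sketch (the two function names are mine, and I haven't reproduced the compiled output for the second one), a by-value version would be expected to compile to something closer to the hand-written Wasm:

```rust
// Takes references, as in the example above: each argument is an
// address, so the compiled code has to load the values first.
pub fn add_by_ref(num1: &i32, num2: &i32) -> i32 {
    num1 + num2
}

// Hypothetical by-value variant: the arguments are the values
// themselves, so no loads from memory should be needed.
pub fn add_by_value(num1: i32, num2: i32) -> i32 {
    num1 + num2
}
```

Both return the same result; the difference is only in how the arguments arrive.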

The above has a one-to-one relationship with the actual final binary output, which looks like the following:

One small thing to note is that Emscripten, which is an LLVM-to-JavaScript compiler, can also compile to Wasm, but the way it does it is a little different to the Rust->Wasm example above.

Rust->Emscripten->Wasm. This is the Emscripten step:

function _add($0,$1) {
 $0 = $0|0;
 $1 = $1|0;
 var $2 = 0, $3 = 0, $4 = 0, label = 0, sp = 0;
 sp = STACKTOP;
 $2 = HEAP32[$0>>2]|0;
 $3 = HEAP32[$1>>2]|0;
 $4 = (($3) + ($2))|0;
 return ($4|0);
}

Emscripten also performs many additional steps such as outputting JS wrapper functions and memory allocators, helper functions, and more. But the final result is similar to the straight Rust->Wasm example above.
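The HEAP32[$0>>2] lines in the Emscripten step deserve a quick explanation: HEAP32 is a view of the memory in 32-bit slots, so a byte address is shifted right by 2 (divided by 4) to get the slot index. Here is a rough Rust model of that, assuming little-endian byte order. The function names are made up for this sketch and are not part of Emscripten.

```rust
/// Read an i32 the way `HEAP32[addr >> 2]` would: find the 32-bit slot
/// that contains byte address `addr`, then read its 4 bytes.
fn heap32_read(memory: &[u8], addr: usize) -> Option<i32> {
    let start = (addr >> 2) * 4; // slot index, back to a byte offset
    let end = start.checked_add(4)?;
    let b = memory.get(start..end)?;
    Some(i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Write an i32 into the slot that contains byte address `addr`.
fn heap32_write(memory: &mut [u8], addr: usize, value: i32) -> Option<()> {
    let start = (addr >> 2) * 4;
    let end = start.checked_add(4)?;
    memory.get_mut(start..end)?.copy_from_slice(&value.to_le_bytes());
    Some(())
}

/// Rough equivalent of the `_add` function above: both arguments are
/// addresses, and the values are loaded from memory before adding.
fn add_via_heap(memory: &[u8], ptr0: usize, ptr1: usize) -> Option<i32> {
    let a = heap32_read(memory, ptr0)?;
    let b = heap32_read(memory, ptr1)?;
    Some(a.wrapping_add(b))
}
```

Note that an unaligned address such as 6 ends up in the same slot as address 4, which is one reason the generated code only uses this for aligned pointers.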

What this all adds up to is that Wasm has potential for many things such as pre-optimisation in the compilation step, and in the case of languages like Rust you also get that language's benefits such as static typing, data lifetime checks, no dangling pointers, no use-after-free etc.

The final compiled Wasm binary is run inside a virtual machine, much like JavaScript (or Java bytecode in the JVM). The binary is stateless, meaning it can be run across many instances without issue.

The Rust

Prerequisite: You should have the Rust target wasm32-unknown-emscripten installed, along with the emsdk for emscripten.

The example we're going to use here will implement a couple of methods that do the same thing, but using a different argument format; one will be passing pointers to Rust allocated memory via out-args, and the other will be writing into a supplied buffer that is allocated outside of Rust.

Start with cargo new wasm-example --lib

Open Cargo.toml and add to it:

[[bin]]
name = "example"
path = "src/lib.rs"

The Rust Emscripten target won't compile a library crate and so requires a binary compilation and a main to execute once the Wasm is loaded. main doesn't need to do anything at all and can be blank, so, replace src/lib.rs with the following:

#[allow(dead_code)] // required since main is never used
fn main() {}

#[derive(Debug)]
pub struct World {
    w: String,
}

#[no_mangle]
pub unsafe extern "C" fn world_ptr(out_ptr: *mut usize) -> i8 {
    if out_ptr.is_null() {
        return -1
    }
    // Must allocate on the heap or it will be lost to the void
    let world_in_a_box = Box::new(World {
        w: String::from("World"),
    });
    // turn it in to a raw pointer and cast to usize (pointer size)
    *out_ptr = Box::into_raw(world_in_a_box) as usize;
    0
}

What's going on here is that the function we've created takes a pointer to a usize as its argument. A usize is always guaranteed to be the size of a pointer, so we can cast a raw pointer to a usize to make passing arguments a little easier. Since this is going across the FFI border, we can't really enforce the type by using out_ptr: *mut *mut World as the arg either.

Next we allocate a World struct on the heap and store it as world_in_a_box. Then we dereference out_ptr to get at the location it points to, and write a raw pointer to the World there, cast to a usize.

There are two reasons to do this:

Arguments passed to functions are passed on the stack, so they are local to that function only,
Because of how the above affects the way we wrote the function, you also want to return (or use another output pointer) a result for success or failure of the function so the caller won't try using invalid data.

If you were to pass a single-level pointer on the stack, this would work fine if it is just a pointer to data: we can modify the data, which is on the heap, and the changes are not temporary. But if we modify that pointer to point to newly allocated data, that change is local to the function only; the caller will still be referring to the previous location and you'll have leaked memory. So what we want to do is allocate some memory on the heap large enough to contain a pointer (usize here), and pass a pointer to that on the stack. This works much like the single-level pointer example above, except the data we're changing is now the address of another location.
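Here's a small sketch of that difference in plain Rust. It uses *mut *mut i32 instead of the usize trick so the types are easier to follow, and all of the names are made up for illustration.

```rust
/// Single-level pointer: reassigning `ptr` only changes this function's
/// own copy. The new address is returned purely so the sketch can free it;
/// without that, the allocation would be leaked.
fn replace_single(mut ptr: *mut i32) -> *mut i32 {
    if ptr.is_null() {
        return ptr;
    }
    ptr = Box::into_raw(Box::new(99)); // the caller's pointer is unchanged
    ptr
}

/// Double-level pointer: the new address is written to a location the
/// caller owns, so the caller can see it afterwards.
unsafe fn replace_double(out: *mut *mut i32) -> i8 {
    if out.is_null() {
        return -1;
    }
    *out = Box::into_raw(Box::new(99));
    0
}

fn demo() -> (i32, i32) {
    let mut original = 1;
    let single: *mut i32 = &mut original;
    let lost = replace_single(single);

    let mut slot: *mut i32 = std::ptr::null_mut();
    let status = unsafe { replace_double(&mut slot) };
    assert_eq!(status, 0);

    unsafe {
        drop(Box::from_raw(lost)); // clean up the allocation we never "saw"
        let seen_single = *single; // still the original value
        let boxed = Box::from_raw(slot); // reclaim the new allocation
        (seen_single, *boxed)
    }
}
```

After both calls, `single` still points at the original value while `slot` holds the address of the new allocation.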

A visualisation of the arg we passed would be:

And when we finish with it, it would look something like:

Testing

It's always wise to write unit tests to check that things work as they should, so at the bottom of src/lib.rs add:

#[cfg(test)]
mod tests {
    use super::{world_ptr, World};
    #[test]
    fn test_world_ptr() {
        // create a raw pointer to a usize on the heap
        let ptr_to_heap = Box::into_raw(Box::new(0usize));
        unsafe {
            assert_eq!(world_ptr(ptr_to_heap), 0);
        }
        unsafe {
            // retakes ownership so we can check it (also stops leaking memory)
            let w: Box<World> = Box::from_raw(*ptr_to_heap as *mut World);
            assert_eq!(w.w, "World");
        }
    }
}

Run with cargo test.

With Box::new(0usize) we're allocating memory on the heap large enough to hold a usize, and assigning a 0 to it (0usize). Then with Box::into_raw() we get the raw pointer - this also conveniently gives up ownership of that allocated memory, which means it will leak unless ownership is reclaimed with Box::from_raw(). (The test does this for the World in the last unsafe block; the small usize allocation itself is left to leak, which is harmless in a short test.)

In that last unsafe block we dereference the ptr_to_heap to get at the usize stored there, and cast that to a pointer-to-World type so we can reclaim ownership and Rust will automatically drop it (free the memory).

In between all this we call world_ptr(ptr_to_heap) of course, which allocates the World and writes the pointer to it as a usize in the location we allocated above.
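If you want to convince yourself that Box::into_raw() really does hand over responsibility and Box::from_raw() takes it back, a small sketch like this can help. The Tracked type and the drop counter are my own additions for illustration; they are not part of the example crate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Tracked` values have been dropped
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked(u32);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns (drops after into_raw, value read back, drops after from_raw)
fn round_trip() -> (usize, u32, usize) {
    let before = DROPS.load(Ordering::SeqCst);

    // Give up ownership and keep only the address, as a usize
    let addr = Box::into_raw(Box::new(Tracked(7))) as usize;
    let after_into_raw = DROPS.load(Ordering::SeqCst) - before;

    let value = {
        // Reclaim ownership; the Box is dropped at the end of this block
        let boxed: Box<Tracked> = unsafe { Box::from_raw(addr as *mut Tracked) };
        boxed.0
    };
    let after_from_raw = DROPS.load(Ordering::SeqCst) - before;

    (after_into_raw, value, after_from_raw)
}
```

Nothing is dropped while only the raw address exists; the drop happens once the reclaimed Box goes out of scope.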

Returning the pointer to free memory

All good so far. But let's say you compiled this to Wasm (which we will later) and have called world_ptr(), great! But... How does it get freed? There's actually a way to do so using some methods in Wasm/Emscripten but we'll get to those later on. For now we're going to write a function to take that World pointer and take ownership of it so it can be freed in much the same way as the test does.

In src/lib.rs add:

#[no_mangle]
pub unsafe extern "C" fn free_world(ptr: *mut usize) -> i8 {
    if ptr.is_null() {
        return -1;
    }
    // dereference the ptr to get the usize, then cast as a pointer to World
    let w: Box<World> = Box::from_raw(*ptr as *mut World);
    // You could also add some struct checks here if desired
    // dropping the Box frees the World
    drop(w);
    0
}

and in the test replace the last unsafe block with:

        unsafe {
            assert_eq!(free_world(ptr_to_heap), 0);
        }

You'll need to add that function name to the use statement also.
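One possible extra check, as a sketch only: zero the slot after freeing so that a second call with the same slot is rejected instead of freeing twice. The name free_world_checked and the World::new helper are hypothetical, and #[no_mangle] is left off here to keep the snippet standalone.

```rust
pub struct World {
    w: String,
}

impl World {
    // Hypothetical helper, only here so the sketch is self-contained
    pub fn new(w: &str) -> Self {
        World { w: String::from(w) }
    }
}

/// Hypothetical variant of free_world: refuses an empty slot, and marks
/// the slot as empty (0) once the World has been freed.
pub unsafe extern "C" fn free_world_checked(ptr: *mut usize) -> i8 {
    if ptr.is_null() || *ptr == 0 {
        return -1;
    }
    // reclaim ownership and drop, which frees the World
    drop(Box::from_raw(*ptr as *mut World));
    // a later call with this same slot now fails the check above
    *ptr = 0;
    0
}
```

This only protects against reusing the same slot. If the address was copied somewhere else before freeing, that copy still dangles.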

One more function

One last function, just to add a little more glitz - this one is similar to the above, but prints the World to the console so you can see it.

#[no_mangle]
pub unsafe extern "C" fn print_world(ptr: *const usize) -> i8 {
    if ptr.is_null() {
        return -1;
    }
    let world = &*(*ptr as *const World);
    println!("Hello, {}", (*world).w);
    0
}

Here world is:

(*ptr as *const World), dereference to get the usize then cast that as a pointer to World,
*(*ptr as *const World), dereference that to get World,
&*(*ptr as *const World), borrow, because we can't move it out of the raw pointer.
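The same three steps can be written out one per line, which I find easier to read while learning. This is just a sketch: greeting() is a made-up function that returns the text instead of printing it, and it adds a check for an empty slot that print_world() doesn't have.

```rust
pub struct World {
    w: String,
}

impl World {
    // Hypothetical helper, only here so the sketch is self-contained
    pub fn new(w: &str) -> Self {
        World { w: String::from(w) }
    }
}

/// Hypothetical step-by-step version of the expression in print_world()
pub unsafe fn greeting(ptr: *const usize) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // dereference to get the usize
    let address: usize = *ptr;
    if address == 0 {
        return None;
    }
    // cast that as a pointer to World
    let raw: *const World = address as *const World;
    // dereference and borrow in one go; we can't move out of a raw pointer
    let world: &World = &*raw;
    Some(format!("Hello, {}", world.w))
}
```

The World is only borrowed, so whoever owns it is still responsible for freeing it afterwards.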

Now we're ready to start playing with Wasm, and add new functions as things progress such as one to write to an external buffer.
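To give a rough idea of what "write to an external buffer" could look like, here is a minimal sketch. The function name, signature and return convention are all my assumptions for illustration, not the function from the repo.

```rust
/// Hypothetical: copy the text "World" into a buffer supplied by the
/// caller. Returns the number of bytes written, or -1 if the buffer is
/// null or too small.
pub unsafe extern "C" fn write_world(buf: *mut u8, len: usize) -> isize {
    const TEXT: &[u8] = b"World";
    if buf.is_null() || len < TEXT.len() {
        return -1;
    }
    // View the caller's memory as a slice of `len` bytes
    let out = std::slice::from_raw_parts_mut(buf, len);
    out[..TEXT.len()].copy_from_slice(TEXT);
    TEXT.len() as isize
}
```

The difference from world_ptr() is that the memory belongs to the caller throughout, so nothing allocated by Rust has to be handed back for freeing.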

How to Compile

Use the command:

cargo rustc --target wasm32-unknown-emscripten --bin=example -- \
-Clink-arg='-s' -Clink-arg='EXTRA_EXPORTED_RUNTIME_METHODS=["setValue","getValue"]'

The setValue and getValue here are extra JavaScript wrappers for easier memory manipulation provided by emscripten. The resulting JS + Wasm will be in target/wasm32-unknown-emscripten/debug/

To optimise, add -Cdebuginfo=0 -Copt-level=3 to the end of the args for rustc.

The Memory Space

When the Wasm binary is loaded, a block of memory is also allocated for its use. This is basically an array of n bytes, allocated in the VM memory space.

When the Wasm program/library needs to allocate memory, it does so within the bounds of the array above. For example if a string were being allocated it would look something like the following figure:

As a general rule, such things as integers, short static arrays, or single chars will be allocated on the stack rather than on the heap (unless explicitly allocated that way). Almost anything using a buffer such as a vector (dynamic array), hash map, strings and more will be allocated on the heap by default.
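As a mental model only, you can picture the memory block as a byte array with an allocator handing out offsets into it. The sketch below uses the simplest possible allocator (it never frees anything); LinearMemory and its methods are made-up names, and the real allocator is a lot smarter.

```rust
/// Toy model of the memory block: a fixed array of bytes plus a
/// marker for where the next allocation starts.
struct LinearMemory {
    bytes: Vec<u8>,
    next_free: usize,
}

impl LinearMemory {
    fn new(size: usize) -> Self {
        LinearMemory { bytes: vec![0; size], next_free: 0 }
    }

    /// Reserve `len` bytes and return the offset ("address") of the first
    fn alloc(&mut self, len: usize) -> Option<usize> {
        let start = self.next_free;
        let end = start.checked_add(len)?;
        if end > self.bytes.len() {
            return None; // out of memory, stay within the array bounds
        }
        self.next_free = end;
        Some(start)
    }

    /// Copy a string into the array, returning (address, length)
    fn store_str(&mut self, s: &str) -> Option<(usize, usize)> {
        let addr = self.alloc(s.len())?;
        self.bytes[addr..addr + s.len()].copy_from_slice(s.as_bytes());
        Some((addr, s.len()))
    }

    /// Read a string back out, given its address and length
    fn load_str(&self, addr: usize, len: usize) -> Option<&str> {
        let end = addr.checked_add(len)?;
        std::str::from_utf8(self.bytes.get(addr..end)?).ok()
    }
}
```

A "pointer" in this model is nothing more than an index into the array, which is why it shows up as a plain number on the JavaScript side.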

Accessing Wasm Memory

Accessing, reading, and writing Wasm memory from outside of the Wasm module itself isn't easy, so for this reason we will use emscripten, which provides many very helpful features and functions such as cwrap(), _malloc(), and _free(). Safe Rust code compiled to Wasm is guaranteed to be memory safe, but only within that Wasm module. Once you start exporting functions for use externally and requiring some manual memory management, that's when some unsafety can arise.

_malloc() is a memory allocation function from emscripten which wraps some lower level Wasm instructions in a more user-friendly way, and for every malloc there must also be a _free() or you will leak memory - you also need to be aware of the dangers of trying to read/write freed memory too. If you've used C or C++ much, you'll know what to expect here.
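If you haven't done manual memory management before, the same pairing can be tried out in Rust with the allocator functions in std::alloc. This is only an analogy to _malloc() and _free(), not what emscripten does internally, and with_raw_u32 is a made-up name.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Allocate room for one u32, write to it, read it back, then free it.
fn with_raw_u32(value: u32) -> Option<u32> {
    // Size and alignment of the allocation; must match on free
    let layout = Layout::new::<u32>();
    unsafe {
        // the "malloc" half
        let ptr = alloc(layout) as *mut u32;
        if ptr.is_null() {
            return None; // allocation failed
        }
        ptr.write(value);
        let read_back = ptr.read();
        // the "free" half; skipping this would leak the allocation
        dealloc(ptr as *mut u8, layout);
        // `ptr` dangles from here on: reading or writing through it
        // would be undefined behaviour
        Some(read_back)
    }
}
```

Every allocation has exactly one matching free, and the value is copied out before the memory is released.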

The emscripten preamble is what contains many of the helpful functions we will need, a handful of which will be covered in the following sections.

Read

First things first; you need to know where in memory you want to read. You could start anywhere if you wanted to, there's not much to stop you except for being able to make sense of what you're reading.

We're going to start by writing some wrappers in JavaScript for the functions we've written in Rust to make things easier to manage on the JS side. Because we can't be sure the Wasm has been loaded and compiled before the functions are run we need to create js/wrapped.js with the following at the start of it.

var wasmLoaded = false;

// Standard way to load wasm using emscripten
var Module = {
  wasmBinaryFile: "./example.wasm",
  onRuntimeInitialized: function() {
    wasmLoaded = true;
    console.log("Wasm loaded");
  }
};

async function wasmLoadWait() {
  // Early exit
  if (wasmLoaded === true) {
    return true
  }
  let check = function() {
    return new Promise( load => {
      setTimeout(() => {
        load(wasmLoaded);
      }, 10);
    });
  };
  // Require an async check because the function captures the global
  // as it is when the function is called - meaning infinite loops
  while (!await check()) {}
}

You could load Wasm another way and have all of the functions called in a single block within onRuntimeInitialized: but that limits the capabilities of what we're doing somewhat. The purpose of wasmLoadWait() is so we can use an await wasmLoadWait() call at the start of each function wrapper to hold off executing the full function until the Wasm has fully loaded. Without this the function will cause an error when it gets to the point in the body where an exported Wasm function is called. And with that let's create the first wrapper function.

async function world() {
  // prevent body progress unless the Wasm module is loaded
  await wasmLoadWait();

  // emscripten provides a way to allocate memory; a pointer in
  // wasm32 is 4 bytes, so that is the size we ask for
  let out_ptr = Module._malloc(4);
  // out_ptr now points at space for a pointer; zero it so that
  // the Rust lib can write the new pointer into it
  Module.setValue(out_ptr, 0, '*');

  // finally, call the exported function
  if (Module._world_ptr(out_ptr) === 0) {
    return [true, out_ptr];
  }
  return [false, out_ptr];
}

Two key parts in this block are the _malloc() call, and the setValue() call on the pointer it returns. We set the value to 0, and the type to * which is a pointer.

If you run world().then(x => console.log(x)); you will see output similar to [true, 5314256] - this is the result of the function call, and the address in Wasm memory space where the allocation is. (world() is async, so logging its return value directly would only show a Promise.) A visual representation is similar to this (not highly accurate - the language is also aware of the layout of the struct of course):

Pointers

HTML

Example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>A Wasm example</title>
    <script type='text/javascript' src="wrapped.js"></script>
    <script type="text/javascript" src="example.js"></script>
    <script>
      world().then( x => {
        if (x[0]) {
          let world_ptr = x[1];
          document.getElementById("trial").innerHTML += world_ptr;
          document.getElementById("alloc1").innerHTML += "success";

          if (print_world(world_ptr)) {
            document.getElementById("world1").innerHTML += "success";
          }

          if (free_world(world_ptr)) {
            document.getElementById("free1").innerHTML += "success";
          }
        }
      });
    </script>
</head>
<body>
    <h1>Displaying results</h1>
    <p id="trial">Pointer to World struct = </p>
    <p id="alloc1">Result of allocation of World struct = </p>
    <p id="world1">Result of print_world() = </p>
    <p id="free1">Result of free_world() = </p>
</body>
</html>

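Note that the emscripten generated example.js must be loaded after wrapped.js. This is because we defined the onRuntimeInitialized: member in the global Module - emscripten picks this up and uses it when loading the Wasm. The same Module object is also where we tell it where the Wasm is located (wasmBinaryFile). The print_world() and free_world() calls in the page assume JS wrappers written in the same way as world() and returning true on success; they aren't shown in this post.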

Signing off...

I've had this blog post series sitting in my backlog for the last 4 months unfortunately, I just couldn't find the time to finish it off. Life and work conspired against me to make me super busy, but also gave me a wee little boy to teach the joys of low-level programming to.

I will finish this series soonish. The next part will be about writing to Wasm memory space, along with a few other things I've learned over the last few months such as using the same wrapper for both Node.js and HTML.

By now you may be looking at this post and thinking "But wait, aren't there tools to do all this?".

Yep. There certainly are. But where's the fun in that? Hopefully you've learned along the way, as I have.

PS: Check out the new book, "Programming WebAssembly with Rust" by Kevin Hoffman. It's very well written, and Kevin takes you right into the deep end of the pool first before wading us into the shallow end with a collection of new skills. Wish it had been available a few months ago - then again maybe I wouldn't have written this post (the overall theme of this post I wrote for Sphere Identity internal documentation).

Code

The repo for the code in this post is here