Note: some things were a little hard for me to discover correctly, so if I've made a mistake, I'd welcome an email.
WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine.
This essentially means that it is fast, because the program is compiled to a much more compact format, which makes it faster to parse.
Wasm can be written by hand if you're looking for a challenge, but it is primarily meant to be written in another language and then compiled to Wasm. You may know a little about Assembly language and how it works - here's a quick refresher in case you're rusty:
A very brief introduction
A small function in C:
int add(int num1, int num2) { return num1 + num2; }
Likewise in Rust:
pub fn add(num1: &i32, num2: &i32) -> i32 { num1 + num2 }
Both of these examples compile down to something very similar at the (intermediate) Assembly step. The listing below is from the Rust version, which receives references and therefore loads its operands through the pointers in rdi and rsi:
example::add:
        push    rbp
        mov     rbp, rsp
        mov     eax, dword ptr [rsi]
        add     eax, dword ptr [rdi]
        pop     rbp
        ret
As you can see, both languages use the same basic instructions for this very simple function (the actual Rust output is slightly more optimised than the C output). But how would this look in Wasm? Something like this if written by hand:
(module
  (func (param $num1 i32) (param $num2 i32) (result i32)
    get_local $num1
    get_local $num2
    i32.add))
If you use a compiler to go from Rust to Wasm instead, the WebAssembly output looks like this (some parts removed to keep it clean):
add:
    .param i32, i32
    .result i32
    get_local $push3=, 1
    i32.load $push1=, 0($pop3)
    get_local $push4=, 0
    i32.load $push0=, 0($pop4)
    i32.add $push2=, $pop1, $pop0
    end_function
Not too far off what we saw with the Rust/C to Assembly example, and you can also see that it is a little more complex than the hand-written example while sharing the same basic layout and instruction use.
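To make "stack-based" a little more concrete, here's a minimal sketch in Rust of how a stack machine could evaluate the three instructions from the hand-written example. The Instr enum and the run function are names I've made up for illustration; they aren't part of any Wasm toolchain, and a real engine does far more than this (validation and compilation, for a start).

```rust
// A toy stack machine: just enough to evaluate the hand-written `add`.
// `Instr` and `run` are illustrative names only.
#[derive(Debug, Clone, Copy)]
enum Instr {
    GetLocal(usize), // get_local (newer toolchains spell it local.get): push a local
    I32Add,          // i32.add: pop two values, push their wrapping sum
}

fn run(code: &[Instr], locals: &[i32]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    for instr in code {
        match *instr {
            Instr::GetLocal(idx) => stack.push(*locals.get(idx)?),
            Instr::I32Add => {
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                stack.push(lhs.wrapping_add(rhs));
            }
        }
    }
    // Whatever is left on top of the stack is the function's result.
    stack.pop()
}

fn main() {
    // get_local $num1, get_local $num2, i32.add
    let add = [Instr::GetLocal(0), Instr::GetLocal(1), Instr::I32Add];
    println!("{:?}", run(&add, &[2, 3])); // prints Some(5)
}
```

Each get_local pushes a parameter, and i32.add consumes the top two entries and leaves the sum behind, which is why the hand-written version needs no explicit return.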
The compiler output above has a one-to-one relationship with the actual final binary output, which looks like the following:
One small thing to note is that Emscripten, which is an LLVM-to-JavaScript compiler, can also compile to Wasm, but the way it does it is a little different to the Rust->Wasm example above.
There the pipeline is Rust->Emscripten->Wasm. This is the Emscripten step:
function _add($0,$1) {
 $0 = $0|0;
 $1 = $1|0;
 var $2 = 0, $3 = 0, $4 = 0, label = 0, sp = 0;
 sp = STACKTOP;
 $2 = HEAP32[$0>>2]|0;
 $3 = HEAP32[$1>>2]|0;
 $4 = (($3) + ($2))|0;
 return ($4|0);
}
Emscripten also performs many additional steps such as outputting JS wrapper functions and memory allocators, helper functions, and more. But the final result is similar to the straight Rust->Wasm example above.
What this all adds up to is that Wasm has potential for many things, such as optimisation ahead of time in the compilation step. In the case of languages like Rust you also get that language's benefits, such as static typing, data lifetime checks, no dangling pointers, no use-after-free, etc.
The final compiled Wasm binary is run inside a virtual machine, much like JavaScript (or like Java bytecode in the JVM). The binary is stateless, meaning it can be run across many instances without issue.
The Rust
Prerequisite: You should have the Rust target wasm32-unknown-emscripten installed, along with the emsdk for emscripten.
The example we're going to use here will implement a couple of methods that do the same thing, but using a different argument format; one will be passing pointers to Rust allocated memory via out-args, and the other will be writing in to a supplied buffer that is allocated outside of Rust.
Start with cargo new wasm-example --lib
Open Cargo.toml and add to it:
[[bin]]
name = "example"
path = "src/lib.rs"
The Rust Emscripten target won't compile a library crate, so it requires a binary compilation and a main to execute once the Wasm is loaded. main doesn't need to do anything at all and can be blank, so replace src/lib.rs with the following:
#[allow(dead_code)] // required since main is never used
fn main() {}

#[derive(Debug)]
pub struct World {
    w: String,
}

#[no_mangle]
pub unsafe extern "C" fn world_ptr(out_ptr: *mut usize) -> i8 {
    if out_ptr.is_null() {
        return -1;
    }
    // Must allocate on the heap or it will be lost to the void
    let world_in_a_box = Box::new(World {
        w: String::from("World"),
    });
    // turn it in to a raw pointer and cast to usize (pointer size)
    *out_ptr = Box::into_raw(world_in_a_box) as usize;
    0
}
What's going on here is that the function we've created takes a pointer to a usize as its argument. The usize type is always guaranteed to be the size of a pointer, so we can cast a raw pointer to a usize to make passing arguments a little easier. Since this is going across the FFI border, we can't really enforce the type by using out_ptr: *mut *mut World as the arg either.
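As a quick sanity check of the usize trick, here's a small sketch that casts a heap pointer to a usize and back again. The round_trip function is a name I've made up for this illustration, and the World struct is a local copy of the one above so the block stands on its own.

```rust
use std::mem::size_of;

struct World {
    w: String,
}

// Cast a heap pointer to a usize and back again, then reclaim ownership.
fn round_trip() -> String {
    let raw: *mut World = Box::into_raw(Box::new(World {
        w: String::from("World"),
    }));
    let as_number: usize = raw as usize;
    let back: *mut World = as_number as *mut World;
    // Safety: `back` is the address Box::into_raw just gave us,
    // and ownership is reclaimed exactly once.
    let world: Box<World> = unsafe { Box::from_raw(back) };
    world.w.clone()
}

fn main() {
    // A usize is the same size as a pointer on the target being compiled for.
    assert_eq!(size_of::<usize>(), size_of::<*mut World>());
    println!("{}", round_trip()); // prints World
}
```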
Next we create an allocation of a World struct on the heap and store the address of it as world_in_a_box. Then we dereference out_ptr to get at that location on the heap, and write a raw pointer to the World into it, cast to a usize.
There are two reasons to do this:
- Arguments passed to functions are passed on the stack, so they are local to that function only.
- Because of how the above affects the way we wrote the function, you also want to return (or use another output pointer for) a result for success or failure of the function, so the caller won't try using invalid data.
If you were to pass a single-layered pointer on the stack, this would work fine if it is just a pointer to data - we can modify the data, which is on the heap, and the changes are not temporary. But if we modify that pointer to point to newly allocated data, that change is local only to that function - the caller will still be referring to the previous location and you'll have leaked memory. So what we want to do is allocate some memory on the heap large enough to contain a pointer (a usize here), and pass a pointer to that on the stack. This way it kind of works like the single-layered pointer example above, except the data we're changing now is an address to another location.
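The difference between the two cases can be sketched in plain Rust. Both function names below are hypothetical and exist only for this illustration; an i32 stands in for the World struct to keep things short.

```rust
// Takes the pointer by value: reassigning it only changes this function's copy.
fn reassign_by_value(mut ptr: *mut i32) {
    if ptr.is_null() {
        return;
    }
    ptr = Box::into_raw(Box::new(42));
    // The caller can never see the new allocation, so free it here
    // rather than leak it.
    unsafe { drop(Box::from_raw(ptr)) };
}

// Takes a pointer to the caller's pointer: writing through it changes
// what the caller sees. The caller must pass a valid, writable slot.
unsafe fn write_through_out_ptr(out_ptr: *mut *mut i32) {
    unsafe {
        *out_ptr = Box::into_raw(Box::new(42));
    }
}

fn main() {
    let mut original = 1;
    let start: *mut i32 = &mut original;

    let by_value = start;
    reassign_by_value(by_value);
    assert_eq!(by_value, start); // the caller's copy is untouched

    let mut slot: *mut i32 = start;
    unsafe { write_through_out_ptr(&mut slot) };
    assert_ne!(slot, start); // the caller's slot now holds a new address

    // Reclaim the allocation so the sketch doesn't leak.
    let value = unsafe { *Box::from_raw(slot) };
    println!("{}", value); // prints 42
}
```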
A visualisation of the arg we passed would be:
And when we finish with it, it would look something like:
Testing
It's always wise to write unit tests to check that things work as they should, so at the bottom of src/lib.rs add:
#[cfg(test)]
mod tests {
    use super::{world_ptr, World};

    #[test]
    fn test_world_ptr() {
        // create a raw pointer to a usize on the heap
        let ptr_to_heap = Box::into_raw(Box::new(0usize));
        unsafe {
            assert_eq!(world_ptr(ptr_to_heap), 0);
        }
        unsafe {
            // retakes ownership so we can check it (also stops leaking memory)
            let w: Box<World> = Box::from_raw(*ptr_to_heap as *mut World);
            assert_eq!(w.w, "World");
        }
    }
}
Run with cargo test.
With Box::new(0usize) we're allocating memory on the heap large enough to hold a usize, and assigning a 0 to it (0usize). Then with Box::into_raw() we get the raw pointer. This also conveniently gives up Rust's ownership of that allocated memory, which becomes a memory leak if ownership is never reclaimed; the last unsafe block does that reclaiming for the World.
In that last unsafe block we dereference ptr_to_heap to get at the usize stored there, and cast that to a pointer-to-World type so we can reclaim ownership and Rust will automatically drop it (free the memory).
In between all this we call world_ptr(ptr_to_heap) of course, which allocates the World and writes the pointer to it as a usize in the location we allocated above.
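If you want to convince yourself that Box::into_raw() really does stop Rust from freeing the value, and that Box::from_raw() hands the job back, a drop counter makes it visible. This is only a sketch: Tracked, DROPS, and leak_then_reclaim are names made up for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Tracked values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked {
    _payload: u32,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns the number of drops seen (after into_raw, after from_raw).
fn leak_then_reclaim() -> (usize, usize) {
    let start = DROPS.load(Ordering::SeqCst);
    // After into_raw nobody owns the value, so nothing is dropped.
    let raw: *mut Tracked = Box::into_raw(Box::new(Tracked { _payload: 0 }));
    let after_into_raw = DROPS.load(Ordering::SeqCst) - start;
    // from_raw rebuilds the Box; dropping it runs Drop and frees the memory.
    unsafe { drop(Box::from_raw(raw)) };
    let after_from_raw = DROPS.load(Ordering::SeqCst) - start;
    (after_into_raw, after_from_raw)
}

fn main() {
    println!("{:?}", leak_then_reclaim()); // prints (0, 1)
}
```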
Returning the pointer to free memory
All good so far. But let's say you compiled this to Wasm (which we will later) and have called world_ptr(), great! But... how does it get freed? There's actually a way to do so using some methods in Wasm/Emscripten, but we'll get to those later on. For now we're going to write a function to take that World pointer and take ownership of it, so it can be freed in much the same way as the test does.
In src/lib.rs add:
#[no_mangle]
pub unsafe extern "C" fn free_world(ptr: *mut usize) -> i8 {
    if ptr.is_null() {
        return -1;
    }
    // dereference the ptr to get the usize, then cast as a pointer to World;
    // the Box is dropped (and the memory freed) at the end of the function
    let _w: Box<World> = Box::from_raw(*ptr as *mut World);
    // You could also add some struct checks here if desired
    0
}
and in the test replace the last unsafe block with:
unsafe { assert_eq!(free_world(ptr_to_heap), 0); }
You'll need to add that function name to the use statement also.
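One of the checks you might add is a guard against freeing the same World twice. The sketch below is a hypothetical variant, not the function used in the rest of this post: free_world_once is a name I've made up, it zeroes the slot after freeing, and #[no_mangle] is left off so the block builds as an ordinary binary.

```rust
#[allow(dead_code)]
struct World {
    w: String,
}

// Hypothetical variant of free_world: refuses a null slot or a slot that
// already holds 0, and zeroes the slot once the World has been freed.
pub unsafe extern "C" fn free_world_once(ptr: *mut usize) -> i8 {
    unsafe {
        if ptr.is_null() || *ptr == 0 {
            return -1;
        }
        drop(Box::from_raw(*ptr as *mut World));
        *ptr = 0;
    }
    0
}

fn main() {
    let world = Box::new(World {
        w: String::from("World"),
    });
    let mut slot: usize = Box::into_raw(world) as usize;
    let first = unsafe { free_world_once(&mut slot) };
    let second = unsafe { free_world_once(&mut slot) };
    println!("{} {}", first, second); // prints 0 -1
}
```

This only protects callers that keep using the same slot; a copy of the old address held elsewhere would still dangle.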
One more function
One last function, just to add a little more glitz - this one is similar to the above, but prints the World to the console so you can see it.
#[no_mangle]
pub unsafe extern "C" fn print_world(ptr: *const usize) -> i8 {
    if ptr.is_null() {
        return -1;
    }
    let world = &*(*ptr as *const World);
    println!("Hello, {}", (*world).w);
    0
}
Here world is built up in three steps:
- (*ptr as *const World): dereference to get the usize, then cast that as a pointer to World,
- *(*ptr as *const World): dereference that to get the World,
- &*(*ptr as *const World): borrow, because we can't move it out of the raw pointer.
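The same three steps can be written out with one intermediate variable each, which some people find easier to read. This is just a sketch: greeting is a made-up helper that returns the string instead of printing it, and it adds a zero-address check that print_world doesn't have.

```rust
struct World {
    w: String,
}

// Step-by-step version of `&*(*ptr as *const World)`.
fn greeting(ptr: *const usize) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    unsafe {
        // 1. dereference to get the usize stored in the slot
        let address: usize = *ptr;
        if address == 0 {
            return None;
        }
        // 2. cast that number to a pointer to World
        let world_ptr: *const World = address as *const World;
        // 3. dereference and immediately borrow, since the World
        //    can't be moved out from behind a raw pointer
        let world: &World = &*world_ptr;
        Some(format!("Hello, {}", world.w))
    }
}

fn main() {
    let raw = Box::into_raw(Box::new(World {
        w: String::from("World"),
    }));
    let slot: usize = raw as usize;
    println!("{:?}", greeting(&slot)); // prints Some("Hello, World")
    // Reclaim ownership so the sketch doesn't leak.
    unsafe { drop(Box::from_raw(raw)) };
}
```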
Now we're ready to start playing with Wasm, and to add new functions as things progress, such as one to write to an external buffer.
How to Compile
Use the command:
cargo rustc --target wasm32-unknown-emscripten --bin=example -- \
    -Clink-arg='-s' -Clink-arg='EXTRA_EXPORTED_RUNTIME_METHODS=["setValue","getValue"]'
The setValue and getValue here are extra JavaScript wrappers for easier memory manipulation provided by emscripten. The resulting JS + Wasm will be in target/wasm32-unknown-emscripten/debug/
To optimise, add -Cdebuginfo=0 -Copt-level=3 to the end of the args for rustc.
The Memory Space
When the Wasm binary is loaded, a block of memory is also allocated for its use. This is basically an array of bytes, n bytes long, allocated in the VM's memory space.
When the Wasm program/library needs to allocate memory, it does so within the bounds of the array above. For example if a string were being allocated it would look something like the following figure:
As a general rule, such things as integers, short static arrays, or single chars will be allocated on the stack rather than on the heap (unless explicitly allocated that way). Almost anything using a buffer, such as a vector (dynamic array), hash map, string and more, will be allocated on the heap by default.
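One way to picture the memory space is as a plain byte array where a "pointer" is nothing more than an index. The sketch below simulates that with a Vec of bytes in ordinary Rust; write_str and read_str are made-up helpers, and the offset 1024 is an arbitrary choice, not something an allocator would necessarily hand out.

```rust
// Copy `text` into `memory` at `offset` and return (pointer, length).
fn write_str(memory: &mut [u8], offset: usize, text: &str) -> Option<(usize, usize)> {
    let end = offset.checked_add(text.len())?;
    memory.get_mut(offset..end)?.copy_from_slice(text.as_bytes());
    Some((offset, text.len()))
}

// Read `len` bytes starting at `ptr` back out as a string.
fn read_str(memory: &[u8], ptr: usize, len: usize) -> Option<&str> {
    let end = ptr.checked_add(len)?;
    std::str::from_utf8(memory.get(ptr..end)?).ok()
}

fn main() {
    // One Wasm page is 64 KiB.
    let mut memory = vec![0u8; 65536];
    let (ptr, len) = write_str(&mut memory, 1024, "World").expect("fits in memory");
    println!("{} {} {:?}", ptr, len, read_str(&memory, ptr, len));
    // prints 1024 5 Some("World")
}
```

Anything outside the array simply can't be addressed, which is why both helpers return None rather than reading or writing out of bounds.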
Accessing Wasm Memory
Accessing, reading, and writing Wasm memory from outside of the Wasm module itself isn't easy, so for this reason we will use emscripten, which provides many very helpful features and functions such as cwrap(), _malloc(), and _free(). When using Rust to write code for compilation to Wasm, it is guaranteed to be memory safe, but only within that Wasm module. Once you start exporting functions for use externally and requiring some manual memory management, that's when some unsafety can arise.
_malloc() is a memory allocation function from emscripten which wraps some lower-level Wasm instructions in a more user-friendly way. For every _malloc() there must also be a _free() or you will leak memory, and you also need to be aware of the dangers of trying to read or write freed memory. If you've used C or C++ much, you'll know what to expect here.
The emscripten preamble is what contains many of the helpful functions we will need, a handful of which will be covered in the following sections.
Read
First things first; you need to know where in memory you want to read. You could start anywhere if you wanted to, there's not much to stop you except for being able to make sense of what you're reading.
We're going to start by writing some wrappers in JavaScript for the functions we've written in Rust, to make things easier to manage on the JS side. Because we can't be sure the Wasm has been loaded and compiled before the functions are run, we need to create js/wrapped.js with the following at the start of it.
var wasmLoaded = false;

// Standard way to load wasm using emscripten
var Module = {
    wasmBinaryFile: "./example.wasm",
    onRuntimeInitialized: function() {
        wasmLoaded = true;
        console.log("Wasm loaded");
    }
};

async function wasmLoadWait() {
    // Early exit
    if (wasmLoaded === true) { return true }
    let check = function() {
        return new Promise( load => {
            setTimeout(() => { load(wasmLoaded); }, 10);
        });
    };
    // Require an async check because the function captures the global
    // as it is when the function is called - meaning infinite loops
    while (!await check()) {}
}
You could load Wasm another way and have all of the functions called in a single block within onRuntimeInitialized, but that limits the capabilities of what we're doing somewhat. The purpose of wasmLoadWait() is so we can use an await wasmLoadWait() call at the start of each function wrapper to hold off executing the full function until the Wasm has fully loaded. Without this the function will cause an error when it gets to the point in the body where an exported Wasm function is called. And with that, let's create the first wrapper function.
async function world() {
    // prevent body progress unless the Wasm module is loaded
    await wasmLoadWait();
    // emscripten provides a way to allocate memory;
    // a pointer in wasm32 is 4 bytes
    let out_ptr = Module._malloc(4);
    // create a pointer to a pointer, this will be passed to
    // the Rust lib so that a new pointer can be written to it
    Module.setValue(out_ptr, 0, '*');
    // finally, call the exported function
    if (Module._world_ptr(out_ptr) === 0) {
        return [true, out_ptr];
    }
    return [false, out_ptr];
}
Two key parts in this block are the _malloc() call, and the setValue() call on the result of _malloc(). We set the value to 0, and the type to *, which is a pointer.
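Conceptually, writing a pointer-typed value comes down to storing a 4-byte little-endian integer at an index in the memory array, since wasm32 pointers are 32 bits wide and Wasm memory is little-endian. The sketch below simulates that in plain Rust. set_ptr and get_ptr are made-up stand-ins rather than emscripten's implementation, and the addresses 16 and 0x1234 are arbitrary.

```rust
// Store a 32-bit pointer value at `addr`, little-endian.
fn set_ptr(memory: &mut [u8], addr: usize, value: u32) -> Option<()> {
    let end = addr.checked_add(4)?;
    memory.get_mut(addr..end)?.copy_from_slice(&value.to_le_bytes());
    Some(())
}

// Load the 32-bit pointer value stored at `addr`.
fn get_ptr(memory: &[u8], addr: usize) -> Option<u32> {
    let end = addr.checked_add(4)?;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(memory.get(addr..end)?);
    Some(u32::from_le_bytes(bytes))
}

fn main() {
    // Freshly allocated memory can contain anything, so fill with 0xff.
    let mut memory = vec![0xffu8; 64];
    let out_ptr = 16; // pretend this came from an allocator

    // Roughly what zeroing the slot does before the call.
    set_ptr(&mut memory, out_ptr, 0).expect("in bounds");
    println!("{:?}", get_ptr(&memory, out_ptr));

    // Roughly what world_ptr() does: write the World's address into the slot.
    set_ptr(&mut memory, out_ptr, 0x1234).expect("in bounds");
    println!("{:?}", get_ptr(&memory, out_ptr));
}
```

Zeroing the slot first means that if the Rust call fails, the slot holds a recognisable 0 rather than whatever happened to be in memory.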
Since world() is async, log its resolved value: if you run world().then(x => console.log(x)); you will see output similar to [true, 5314256] - this is the result of the function call, and the address in Wasm memory space where the allocation is. A visual representation is similar to this (not highly accurate - the language is also aware of the layout of the struct of course):
HTML
Example:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>A Wasm example</title>
    <script type='text/javascript' src="wrapped.js"></script>
    <script type="text/javascript" src="example.js"></script>
    <script>
        world().then( x => {
            if (x[0]) {
                let world_ptr = x[1];
                document.getElementById("trial").innerHTML += world_ptr;
                document.getElementById("alloc1").innerHTML += "success";
                if (print_world(world_ptr)) {
                    document.getElementById("world1").innerHTML += "success";
                }
                if (free_world(world_ptr)) {
                    document.getElementById("free1").innerHTML += "success";
                }
            }
        });
    </script>
</head>
<body>
    <h1>Displaying results</h1>
    <p id="trial">Pointer to World struct = </p>
    <p id="alloc1">Result of allocation of World struct = </p>
    <p id="world1">Result of print_world() = </p>
    <p id="free1">Result of free_world() = </p>
</body>
</html>
Note that the emscripten-generated example.js must be loaded after wrapped.js. This is because we defined the onRuntimeInitialized member in the global Module - emscripten picks this up and uses it when loading the Wasm. We can also tell it where the Wasm is located.
Signing off...
Unfortunately I've had this blog post series sitting in my backlog for the last 4 months; I just couldn't find the time to finish it off. Life and work conspired to keep me super busy, but also gave me a wee little boy to teach the joys of low-level programming to.
I will finish this series soonish. The next part will be about writing to Wasm memory space, along with a few other things I've learned over the last few months such as using the same wrapper for both Node.JS and HTML.
By now you may be looking at this post and thinking "But wait, aren't there tools to do all this?".
Yep. There certainly are. But where's the fun in that? Hopefully you've learned along the way, as I have.
PS: Check out the new book, "Programming WebAssembly with Rust" by Kevin Hoffman. It's very well written, and Kevin takes you right in to the deep end of the pool first before wading us in to the shallow end with a collection of new skills. Wish it had been available a few months ago - then again maybe I wouldn't have written this post (the overall theme of this post I wrote for Sphere Identity internal documentation).
Code
The repo for the code in this post is here